dianfu commented on a change in pull request #9653: [FLINK-14014][python] Introduce PythonScalarFunctionRunner to handle the communication with Python worker for Python ScalarFunction execution URL: https://github.com/apache/flink/pull/9653#discussion_r324529935
########## File path: flink-python/pyflink/proto/flink-fn-execution.proto ########## @@ -0,0 +1,96 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +syntax = "proto3"; + +package org.apache.flink.fn_execution.v1; + +option java_package = "org.apache.flink.fnexecution.v1"; +option java_outer_classname = "FlinkFnApi"; + +// User-defined function definition. It supports chaining functions, that's, the execution +// result of one user-defined function as the input of another user-defined function. +message UserDefinedFunction { + message Input { + oneof input { + UserDefinedFunction udf = 1; + int32 inputOffset = 2; + } + } + + // The serialized representation of the user-defined function + bytes payload = 1; + + // The input arguments of the user-defined function, it could be one of the following: + // 1. A column from the input row + // 2. The result of another user-defined function + repeated Input inputs = 2; +} + +// A list of user-defined functions to be executed in a batch. +message UserDefinedFunctions { + repeated UserDefinedFunction udfs = 1; +} + +// A representation of the data schema. +message Schema { + enum TypeName { + TINYINT = 0; + SMALLINT = 1; + INT = 2; + BIGINT = 3; + DECIMAL = 4; + FLOAT = 5; + DOUBLE = 6; + DATE = 7; + TIME = 8; + DATETIME = 9; + BOOLEAN = 10; + BINARY = 11; + VARBINARY = 12; + CHAR = 13; + VARCHAR = 14; + ARRAY = 15; + MAP = 16; + MULTISET = 17; + ROW = 18; Review comment: Good catch for the performance issue. Make sense to me. +1 to put it at the first of the list. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services
