[GitHub] [flink] WeiZhong94 commented on issue #9865: [FLINK-14212][python] Support no-argument Python UDFs.

2019-10-12 Thread GitBox
WeiZhong94 commented on issue #9865: [FLINK-14212][python] Support no-argument 
Python UDFs.
URL: https://github.com/apache/flink/pull/9865#issuecomment-541303687
 
 
   > > @JingsongLi I noticed that the "getScalarFunction" method in 
"ScalarSqlFunction" of blink planner was replaced by "makeFunction" method. I 
think using "getScalarFunction" to get the initial ScalarFunction object and 
check its language type makes more sense here, so I re-added the method. Please 
take a look at this PR to make sure it does not cause any side effects to blink 
planner :).
   > 
   > LGTM, it is reasonable. Can you just make scalarFunction a Scala val, just 
like `TableSqlFunction`?
   
   Thanks for your reply! That makes sense to me. I have updated the code in 
the latest commit.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] WeiZhong94 commented on issue #9865: [FLINK-14212][python] Support no-argument Python UDFs.

2019-10-11 Thread GitBox
WeiZhong94 commented on issue #9865: [FLINK-14212][python] Support no-argument 
Python UDFs.
URL: https://github.com/apache/flink/pull/9865#issuecomment-540973875
 
 
   @hequn8128 Thanks for your review! I have addressed your comments in the 
latest commit.




[GitHub] [flink] WeiZhong94 commented on issue #9865: [FLINK-14212][python] Support no-argument Python UDFs.

2019-10-11 Thread GitBox
WeiZhong94 commented on issue #9865: [FLINK-14212][python] Support no-argument 
Python UDFs.
URL: https://github.com/apache/flink/pull/9865#issuecomment-540956430
 
 
   @hequn8128 Thanks for your feedback! Dian Fu and I have discussed these 
two approaches and agreed to skip the optimization in `ExpressionReducer` and 
optimize the UDFs at runtime instead. Our thoughts are as follows:
   1. How to optimize Python UDFs during runtime?
   Once constant parameters are supported in Python UDFs (see [this 
PR](https://github.com/apache/flink/pull/9858)), we can do this optimization 
when evaluating the chained Python UDFs in the Python worker: 
   If the evaluated Python UDF is deterministic and either has no arguments or 
has only constant arguments, it will always return a constant value, so we 
replace it with that constant result.
   This rule can be applied recursively until all the deterministic UDFs with 
constant inputs have been replaced. If the root nodes of the chained Python 
UDFs become constant values, we can transmit them only once and replace them 
with None in subsequent transmissions to save IO resources. The Java operator 
also knows which fields of the evaluated result should be constant values 
rather than None, because the same reduce rule can be applied on the Java side 
too. No additional interaction between the Java operator and the Python worker 
is needed in this design.
   2. Why not optimize Python UDFs during the optimization phase?
   Reducing Python UDFs in the optimization phase is not easy in the current 
design. It requires that the generated Java wrappers of Python UDFs can be 
evaluated and return the correct result. In other words, we would need to run 
Python UDFs on the client side, but Python UDFs are designed to run on the 
cluster, whose Python environment may differ from the client's once we 
introduce environment and dependency management in the future. To solve the 
environment problem, we would need to prepare a Python environment identical 
to the one on the cluster before reducing Python UDFs. To evaluate the Python 
UDF, we would also need to implement a new UDF runner which does not depend on 
Apache Beam (the client side of the Flink Python API does not depend on Apache 
Beam). We know that completing all of this would give a perfect solution to 
this problem, but it is too expensive compared to runtime optimization.
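
To make the runtime rule in point 1 concrete, here is a minimal sketch in plain Python. It is not the code in this PR and does not use any Flink or Beam API; `Const`, `InputRef`, `UdfCall`, `reduce_constants`, `evaluate` and `ResultSender` are hypothetical names made up for illustration. It assumes the chained UDFs form a simple expression tree and that determinism is known per UDF.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Union


@dataclass
class Const:
    """A literal value in the (hypothetical) expression tree."""
    value: Any


@dataclass
class InputRef:
    """A reference to a column of the input row."""
    index: int


@dataclass
class UdfCall:
    """A call to a Python UDF whose arguments are child expressions."""
    func: Callable[..., Any]
    args: List["Expr"] = field(default_factory=list)
    deterministic: bool = True


Expr = Union[Const, InputRef, UdfCall]


def reduce_constants(expr: Expr) -> Expr:
    """Replace deterministic UDF calls with constant inputs, bottom-up."""
    if not isinstance(expr, UdfCall):
        return expr
    reduced_args = [reduce_constants(arg) for arg in expr.args]
    # all() over an empty list is True, so no-argument UDFs are covered too.
    if expr.deterministic and all(isinstance(a, Const) for a in reduced_args):
        return Const(expr.func(*(a.value for a in reduced_args)))
    return UdfCall(expr.func, reduced_args, expr.deterministic)


def evaluate(expr: Expr, row: List[Any]) -> Any:
    """Evaluate an expression against one input row."""
    if isinstance(expr, Const):
        return expr.value
    if isinstance(expr, InputRef):
        return row[expr.index]
    return expr.func(*(evaluate(arg, row) for arg in expr.args))


class ResultSender:
    """Emits constant root fields once, then None for every later row."""

    def __init__(self, roots: List[Expr]):
        self.roots = [reduce_constants(root) for root in roots]
        self.first_row = True

    def process(self, row: List[Any]) -> List[Any]:
        out = []
        for root in self.roots:
            if isinstance(root, Const) and not self.first_row:
                out.append(None)  # the receiver already holds the constant
            else:
                out.append(evaluate(root, row))
        self.first_row = False
        return out
```

In this sketch the receiving side would run the same `reduce_constants` over the same tree to learn which result fields are constant, so it can substitute the value it received with the first row whenever it sees None in such a field. A non-deterministic UDF is never folded, even when it has no arguments.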




[GitHub] [flink] WeiZhong94 commented on issue #9865: [FLINK-14212][python] Support no-argument Python UDFs.

2019-10-10 Thread GitBox
WeiZhong94 commented on issue #9865: [FLINK-14212][python] Support no-argument 
Python UDFs.
URL: https://github.com/apache/flink/pull/9865#issuecomment-540463353
 
 
   @dianfu Thanks for your review! I have addressed the comment in the latest 
commit.




[GitHub] [flink] WeiZhong94 commented on issue #9865: [FLINK-14212][python] Support no-argument Python UDFs.

2019-10-10 Thread GitBox
WeiZhong94 commented on issue #9865: [FLINK-14212][python] Support no-argument 
Python UDFs.
URL: https://github.com/apache/flink/pull/9865#issuecomment-540452708
 
 
   @dianfu Thanks for your review again! I have addressed your comments in the 
latest commit.




[GitHub] [flink] WeiZhong94 commented on issue #9865: [FLINK-14212][python] Support no-argument Python UDFs.

2019-10-10 Thread GitBox
WeiZhong94 commented on issue #9865: [FLINK-14212][python] Support no-argument 
Python UDFs.
URL: https://github.com/apache/flink/pull/9865#issuecomment-540423623
 
 
   @JingsongLi I noticed that the "getScalarFunction" method in 
"ScalarSqlFunction" of blink planner was replaced by "makeFunction" method. I 
think using "getScalarFunction" to get the initial ScalarFunction object and 
check its language type makes more sense here, so I re-added the method. Please 
take a look at this PR to make sure it does not cause any side effects to blink 
planner :).




[GitHub] [flink] WeiZhong94 commented on issue #9865: [FLINK-14212][python] Support no-argument Python UDFs.

2019-10-10 Thread GitBox
WeiZhong94 commented on issue #9865: [FLINK-14212][python] Support no-argument 
Python UDFs.
URL: https://github.com/apache/flink/pull/9865#issuecomment-540416014
 
 
   @dianfu Thanks for your comments! I have addressed them in the latest 
commit, please take a look.

