Table API jobs go through a series of rule-based optimizations before execution, so in the final execution plan a single UDF may be invoked multiple times. You can print the execution plan to check (TableEnvironment#explain).

The exact cause depends on the job logic. Could you share your job? A minimal reproducible example is enough.
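Independently of the plan, you can confirm from the UDF side whether the body really runs twice per input. A minimal pure-Python sketch (the `ad` function and `count_calls` wrapper below are hypothetical stand-ins for your UDF, not PyFlink API):

```python
import functools

def count_calls(func):
    """Wrap a UDF body to count invocations, to confirm whether the
    plan really calls it more than once for the same input."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper

@count_calls
def ad(value):
    # stand-in for the real UDF body; the real one sends a REST request
    return value

# simulate the plan invoking the UDF twice on the same row
ad("x")
ad("x")
print(ad.calls)  # 2 -> the function body ran twice
```

If the counter matches the duplicate log lines, the REST call is simply a side effect executed once per invocation, and the fix is to make the plan call the UDF once (or make the side effect idempotent), not to change the requests library.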

> On 9 Jul 2020, at 17:50, lgs <[email protected]> wrote:
> 
> Hi,
> 
> I observed the following: I defined a tumble window that calls a Python UDF, and inside that UDF I use requests to send a REST API call.
> The log shows the UDF is called twice, less than a second apart. What is the cause? Is the requests library conflicting with Beam?
> 
> 2020-07-09 17:44:17,501 INFO  flink_test_stream_time_kafka.py:22 [] - start to ad
> 2020-07-09 17:44:17,530 INFO  flink_test_stream_time_kafka.py:63 [] - start to send rest api.
> 2020-07-09 17:44:17,532 INFO  flink_test_stream_time_kafka.py:69 [] - Receive: {"Received": "successful"}
> 2020-07-09 17:44:17,579 INFO  /home/sysadmin/miniconda3/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py:564 [] - Creating insecure state channel for localhost:57954.
> 2020-07-09 17:44:17,580 INFO  /home/sysadmin/miniconda3/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py:571 [] - State channel established.
> 2020-07-09 17:44:17,584 INFO  /home/sysadmin/miniconda3/lib/python3.7/site-packages/apache_beam/runners/worker/data_plane.py:526 [] - Creating client data channel for localhost:60902
> 2020-07-09 17:44:17,591 INFO  org.apache.beam.runners.fnexecution.data.GrpcDataService [] - Beam Fn Data client connected.
> 2020-07-09 17:44:17,761 INFO  flink_test_stream_time_kafka.py:22 [] - start to ad
> 2020-07-09 17:44:17,810 INFO  flink_test_stream_time_kafka.py:63 [] - start to send rest api.
> 2020-07-09 17:44:17,812 INFO  flink_test_stream_time_kafka.py:69 [] - Receive: {"Received": "successful"}
> 
> --
> Sent from: http://apache-flink.147419.n8.nabble.com/
