A Table API job goes through a series of rule-based optimizations before it is executed, and in the final execution plan a UDF may end up being invoked multiple times. You can print the execution plan and check (TableEnvironment#explain).
The exact cause depends on the job logic. Could you share your job? A minimal reproducible example is enough.

> On Jul 9, 2020, at 5:50 PM, lgs <[email protected]> wrote:
>
> Hi,
>
> I noticed the following behavior: I defined a tumble window that calls a Python UDF, and inside the UDF I use requests to send a REST API call.
> The log shows the UDF is invoked twice, less than a second apart. What could be the reason? Does the requests library conflict with Beam?
>
> 2020-07-09 17:44:17,501 INFO flink_test_stream_time_kafka.py:22
> [] - start to ad
> 2020-07-09 17:44:17,530 INFO flink_test_stream_time_kafka.py:63
> [] - start to send rest api.
> 2020-07-09 17:44:17,532 INFO flink_test_stream_time_kafka.py:69
> [] - Receive: {"Received": "successful"}
> 2020-07-09 17:44:17,579 INFO
> /home/sysadmin/miniconda3/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py:564
> [] - Creating insecure state channel for localhost:57954.
> 2020-07-09 17:44:17,580 INFO
> /home/sysadmin/miniconda3/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py:571
> [] - State channel established.
> 2020-07-09 17:44:17,584 INFO
> /home/sysadmin/miniconda3/lib/python3.7/site-packages/apache_beam/runners/worker/data_plane.py:526
> [] - Creating client data channel for localhost:60902
> 2020-07-09 17:44:17,591 INFO
> org.apache.beam.runners.fnexecution.data.GrpcDataService [] - Beam Fn
> Data client connected.
> 2020-07-09 17:44:17,761 INFO flink_test_stream_time_kafka.py:22
> [] - start to ad
> 2020-07-09 17:44:17,810 INFO flink_test_stream_time_kafka.py:63
> [] - start to send rest api.
> 2020-07-09 17:44:17,812 INFO flink_test_stream_time_kafka.py:69
> [] - Receive: {"Received": "successful"}
>
> --
> Sent from: http://apache-flink.147419.n8.nabble.com/
