Hi, community!

I am working on optimizing our data processing structure by moving from a full 
data pipeline to an incremental data pipeline. We currently use PySpark with 
Python code, and I am considering the two options below: 


1. PyFlink 1.13 + Python 2.7
2. JavaFlink 1.13 + JPython + Python 2.7 


As far as I know, the Python APIs provide only a subset (about two thirds) of 
what is available in the Java APIs, PyFlink performs worse than Java Flink, and 
some features contributed after 1.10 are not yet implemented in PyFlink. 


Python code can also be compiled to Java bytecode by the ASM carrier and loaded 
into the JVM, so can I argue that such Python code is not much less efficient 
than Java code? 


So I prefer the second option. 
Thanks for any suggestions or replies. 


Best Regards!
