Hi Thomas,
Thanks for the confirmation. I will now start a vote.
Best,
Xingbo
Thomas Weise wrote on Wednesday, January 12, 2022 at 02:20:
> Hi Xingbo,
>
> +1 from my side
Hi Xingbo,
+1 from my side
Thanks for the clarification. For your use case the parameter size and
therefore serialization overhead was the limiting factor. I have seen
use cases where that is not the concern, because the Python logic
itself is heavy and dwarfs the protocol overhead (for example
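Thomas's point can be illustrated with a small self-contained timing sketch (plain Python, not Flink code; `heavy_udf` is a made-up stand-in for a workload such as model inference, and pickle stands in for the per-record protocol round-trip):

```python
# Sketch: compare per-record serialization cost (a stand-in for Process
# Mode's protocol round-trip) against the cost of the UDF itself.
import pickle
import timeit

def cheap_udf(s):
    # Trivial logic: the serialization round-trip dominates.
    return s.upper()

def heavy_udf(s):
    # Heavier (hypothetical) logic that dwarfs the round-trip.
    total = 0
    for ch in s:
        for _ in range(100):
            total += ord(ch)
    return total

record = "hello world" * 10  # a 110-character record

ser_cost = timeit.timeit(lambda: pickle.loads(pickle.dumps(record)), number=500)
cheap_cost = timeit.timeit(lambda: cheap_udf(record), number=500)
heavy_cost = timeit.timeit(lambda: heavy_udf(record), number=500)

# Serialization's share of the total per-record cost:
print(f"cheap UDF: {ser_cost / (ser_cost + cheap_cost):.0%} of time is serialization")
print(f"heavy UDF: {ser_cost / (ser_cost + heavy_cost):.0%} of time is serialization")
```

For a trivial UDF like string upper, the round-trip is a large fraction of the total; for a heavy UDF it is a rounding error, which is why the two kinds of workload lead to different conclusions about the protocol overhead.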
Hi everyone,
Thanks to all of you for the discussion.
If there are no objections, I would like to start a vote thread tomorrow.
Best,
Xingbo
Xingbo Huang wrote on Friday, January 7, 2022 at 16:18:
> Hi Till,
>
Hi Till,
I have written a more complicated PyFlink job. Compared with the previous
single Python UDF job, there is an extra stage of converting between Table
and DataStream. Besides, I added a Python map function to the job. Because
the Python DataStream API has not yet implemented Thread Mode, the
Thanks for the detailed answer, Xingbo. Quick question on the last figure in
the FLIP. You said that this is a real-world Flink stream SQL job. The
title of the graph says UDF (String Upper). So do I understand correctly
that string upper is the real-world use case you have measured? What I
wanted
Hi Till and Thomas,
Thanks a lot for joining the discussion.
For Till:
>>> Is the slower performance currently the biggest pain point for our
Python users? What else are our Python users mainly complaining about?
PyFlink users are mainly concerned with two things: one is better usability,
the
Interesting discussion. It caught my attention because I was also
interested in the Beam fn execution overhead a few years ago.
We found back then that while in theory the fn protocol overhead is
very significant, for realistic function workloads that overhead was
negligible. And of course it all
One more question that came to my mind: How much performance improvement do
we gain on a real-world Python use case? Were the measurements more like
micro benchmarks where the Python UDF was called w/o the overhead of Flink?
I would just be curious how much the Python component contributes to the
Hi Xingbo,
Thanks for creating this FLIP. I have two general questions about the
motivation for this FLIP because I have only very little exposure to our
Python users:
Is the slower performance currently the biggest pain point for our Python
users?
What else are our Python users mainly
Hi Wei,
Thanks a lot for your feedback. Very good questions!
>>> 1. It seems that we dynamically load an embedded Python interpreter and
user dependencies in the TM process. Can they be uninstalled cleanly after
the task finishes? i.e., can we use Thread Mode in session mode and the
PyFlink shell?
I
Hi Xingbo,
Thanks for creating this FLIP. Big +1 for it!
I have some questions about Thread Mode:
1. It seems that we dynamically load an embedded Python interpreter and user
dependencies in the TM process. Can they be uninstalled cleanly after the task
finishes? i.e., can we use Thread Mode in
Hi everyone,
I would like to start a discussion thread on "Support PyFlink Runtime
Execution in Thread Mode"
We have provided the PyFlink Runtime framework to support Python user-defined
functions since Flink 1.10. The PyFlink Runtime framework is called Process
Mode, which depends on an
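The contrast between the existing Process Mode and the proposed Thread Mode can be sketched in plain Python (illustrative only, not PyFlink internals): in Process Mode every record crosses a process boundary to a separate Python worker, while in Thread Mode the function is invoked directly in the current process.

```python
# Illustrative sketch only (not PyFlink's actual implementation):
# Process Mode ships each record to a separate Python worker over IPC;
# Thread Mode calls the function directly in the same process.
import multiprocessing as mp

def upper_udf(s):
    return s.upper()

def worker_loop(conn):
    # Hypothetical stand-in for the Process Mode worker: receive a record,
    # apply the UDF, send the result back. `None` terminates the loop.
    while True:
        record = conn.recv()
        if record is None:
            break
        conn.send(upper_udf(record))
    conn.close()

if __name__ == "__main__":
    parent_conn, child_conn = mp.Pipe()
    proc = mp.Process(target=worker_loop, args=(child_conn,))
    proc.start()

    # "Process Mode": record and result cross a process boundary.
    parent_conn.send("hello")
    process_mode_result = parent_conn.recv()

    # "Thread Mode": a direct in-process call, no IPC or serialization.
    thread_mode_result = upper_udf("hello")

    parent_conn.send(None)
    proc.join()
    assert process_mode_result == thread_mode_result == "HELLO"
```

Both paths compute the same result; the difference the FLIP targets is the per-record IPC and serialization cost that the first path pays on every record.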