Re: [DISCUSS] PyFlink User-Defined Function Resource Management

Dian Fu Tue, 03 Dec 2019 02:08:50 -0800

Hi Jingsong,

Thanks for your valuable feedback. I have updated the "Example" section 
describing how to use these options in a Python Table API program.


Thanks,
Dian

> 在 2019年12月2日，下午6:12，Jingsong Lee <[email protected]> 写道：
> 
> Hi Dian:
> 
> Thanks for you explanation.
> If you can update the document to add explanation for the changes to the
> table layer,
> it might be better. (it's just a suggestion, it depends on you)
> About forwardedInputQueue in AbstractPythonScalarFunctionOperator,
> Will this queue take up a lot of memory?
> Can it also occupy memory as large as buffer.memory?
> If so, what we're dealing with now is the silent use of heap memory?
> I feel a little strange, because the memory on the python side will reserve,
> but the memory on the JVM side is used silently.
> 
> After carefully seeing your comments on Google doc:
>> The memory used by the Java operator is currently accounted as the task
> on-heap memory. We can revisit this if we find it's a problem in the future.
> I agree that we can ignore it now, But we can add some content to the
> document to remind the user, What do you think?
> 
> Best,
> Jingsong Lee
> 
> On Mon, Dec 2, 2019 at 5:17 PM Dian Fu <[email protected]> wrote:
> 
>> Hi Jingsong,
>> 
>> Thanks a lot for your comments. Please see my reply inlined below.
>> 
>>> 在 2019年12月2日，下午3:47，Jingsong Lee <[email protected]> 写道：
>>> 
>>> Hi Dian:
>>> 
>>> 
>>> Thanks for your driving. I have some questions:
>>> 
>>> 
>>> - Where should these configurations belong? You have mentioned
>> tableApi/SQL,
>>> so should in TableConfig?
>> 
>> All Python related configurations are defined in PythonOptions. User could
>> configure these configurations via TableConfig.getConfiguration.setXXX for
>> Python Table API programs.
>> 
>>> 
>>> - If just in table/sql, whether it should be called: table.python.****,
>>> because in table, all config options are called table.***.
>> 
>> These configurations are not table specific. They will be used for both
>> Python Table API programs and Python DataStream API programs (which is
>> planned to be supported in the future). So python.xxx seems more
>> appropriate, what do you think?
>> 
>>> - What should table module do? So in CommonPythonCalc, we should read
>>> options from table config, and set resources to OneInputTransformation?
>> 
>> As described in the design doc, in compilation phase, for batch jobs, the
>> required memory of the Python worker will be calculated according to the
>> configuration and set as the managed memory for the operator. For stream
>> jobs, the resource spec will be unknown(The reason is that currently the
>> resources for all the operators in stream jobs are unknown and it doesn’t
>> support to configure both known and unknown resources in a single job).
>> 
>>> - Are all buffer.memory off-heap memory? I took a look
>>> to AbstractPythonScalarFunctionOperator, there is a forwardedInputQueue,
>> is
>>> this one a heap queue? So we need heap memory too?
>> 
>> Yes, they are all off-heap memory which is supposed to be used by the
>> Python process. The forwardedInputQueue is a buffer used in the Java
>> operator and its memory is accounted as the on-heap memory.
>> 
>> Regards,
>> Dian
>> 
>>> 
>>> Hope to get your reply.
>>> 
>>> 
>>> Best,
>>> 
>>> Jingsong Lee
>>> 
>>> On Tue, Nov 26, 2019 at 12:17 PM Dian Fu <[email protected]> wrote:
>>> 
>>>> Thanks for your votes and feedbacks. I have discussed with @Zhu Zhu
>>>> offline and also on the design doc.
>>>> 
>>>> It seems that we have reached consensus on the design. I would bring up
>>>> the VOTE if there is no other feedbacks.
>>>> 
>>>> Thanks,
>>>> Dian
>>>> 
>>>>> 在 2019年11月22日，下午2:51，Hequn Cheng <[email protected]> 写道：
>>>>> 
>>>>> Thanks a lot for putting this together, Dian! Definitely +1 for this!
>>>>> It is great to make sure that the resources used by the Python process
>>>> are
>>>>> managed properly by Flink’s resource management framework.
>>>>> 
>>>>> Also, thanks to the guys that working on the unified memory management
>>>>> framework.
>>>>> 
>>>>> Best, Hequn
>>>>> 
>>>>> 
>>>>> On Mon, Nov 18, 2019 at 5:23 PM Yangze Guo <[email protected]> wrote:
>>>>> 
>>>>>> Thanks for driving this discussion, Dian!
>>>>>> 
>>>>>> +1 for this proposal. It will help to reduce container failure due to
>>>>>> the memory overuse.
>>>>>> Some comments left in the design doc.
>>>>>> 
>>>>>> Best,
>>>>>> Yangze Guo
>>>>>> 
>>>>>> On Mon, Nov 18, 2019 at 4:06 PM Xintong Song <[email protected]>
>>>>>> wrote:
>>>>>>> 
>>>>>>> Sorry for the late reply.
>>>>>>> 
>>>>>>> +1 for the general proposal.
>>>>>>> 
>>>>>>> And one remainder, to use UNKNOWN resource requirement, we need to
>> make
>>>>>>> sure optimizer knowns which operators use off-heap managed memory,
>> and
>>>>>>> compute and set a fraction to the operators. See FLIP-53[1] for more
>>>>>>> details, and I would suggest you to double check with @Zhu Zhu who
>>>> works
>>>>>> on
>>>>>>> this part.
>>>>>>> 
>>>>>>> Thank you~
>>>>>>> 
>>>>>>> Xintong Song
>>>>>>> 
>>>>>>> 
>>>>>>> [1]
>>>>>>> 
>>>>>> 
>>>> 
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-53%3A+Fine+Grained+Operator+Resource+Management
>>>>>>> 
>>>>>>> On Tue, Nov 12, 2019 at 11:53 AM Dian Fu <[email protected]>
>>>> wrote:
>>>>>>> 
>>>>>>>> Hi Jincheng,
>>>>>>>> 
>>>>>>>> Thanks for the reply and also looking forward to the feedback from
>> the
>>>>>>>> community.
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Dian
>>>>>>>> 
>>>>>>>>> 在 2019年11月11日，下午2:34，jincheng sun <[email protected]> 写道：
>>>>>>>>> 
>>>>>>>>> Hi all,
>>>>>>>>> 
>>>>>>>>> +1， Thanks for bring up this discussion Dian!
>>>>>>>>> 
>>>>>>>>> The Resource Management is very important for PyFlink UDF. So, It's
>>>>>> great
>>>>>>>>> if anyone can add more comments or inputs in the design doc or
>>>>>> feedback
>>>>>>>> in
>>>>>>>>> ML. :)
>>>>>>>>> 
>>>>>>>>> Best,
>>>>>>>>> Jincheng
>>>>>>>>> 
>>>>>>>>> Dian Fu <[email protected]> 于2019年11月5日周二 上午11:32写道：
>>>>>>>>> 
>>>>>>>>>> Hi everyone,
>>>>>>>>>> 
>>>>>>>>>> In FLIP-58[1] it will add the support of Python user-defined
>>>>>> stateless
>>>>>>>>>> function for Python Table API. It will launch a separate Python
>>>>>> process
>>>>>>>> for
>>>>>>>>>> Python user-defined function execution. The resources used by the
>>>>>> Python
>>>>>>>>>> process should be managed properly by Flink’s resource management
>>>>>>>>>> framework. FLIP-49[2] has proposed a unified memory management
>>>>>> framework
>>>>>>>>>> and PyFlink user-defined function resource management should be
>>>>>> based on
>>>>>>>>>> it. Jincheng, Hequn, Xintong, GuoWei and I discussed offline about
>>>>>>>> this. I
>>>>>>>>>> draft a design doc[3] and want to start a discussion about PyFlink
>>>>>>>>>> user-defined function resource management.
>>>>>>>>>> 
>>>>>>>>>> Welcome any comments on the design doc or giving us feedback on
>> the
>>>>>> ML
>>>>>>>>>> directly.
>>>>>>>>>> 
>>>>>>>>>> Regards,
>>>>>>>>>> Dian
>>>>>>>>>> 
>>>>>>>>>> [1]
>>>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>>> 
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-58%3A+Flink+Python+User-Defined+Stateless+Function+for+Table
>>>>>>>>>> [2]
>>>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>>> 
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors
>>>>>>>>>> [3]
>>>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>>> 
>> https://docs.google.com/document/d/1LQP8L66Thu2yVv6YRSfmF9EkkMnwhBHGjcTQ11GUmFc/edit#heading=h.4q4ggaftf78m
>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>>> 
>>>> 
>>> 
>>> --
>>> Best, Jingsong Lee
>> 
>> 
> 
> -- 
> Best, Jingsong Lee

Re: [DISCUSS] PyFlink User-Defined Function Resource Management

Reply via email to