[
https://issues.apache.org/jira/browse/FLINK-18738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175553#comment-17175553
]
Till Rohrmann commented on FLINK-18738:
---------------------------------------
Thanks for starting the discussion about this problem [~xintongsong]. I think
it would be nice to solve this problem not only for Python but also for other
languages we might want to support in the future (if possible and if it does
not broaden the scope too massively).
I agree that the processing model is probably the most important aspect of the
discussion since the configuration somewhat follows this decision.
The first question which comes to my mind is whether the {{TaskExecutor}}
should be responsible for managing the component running the Python (or any
other language) code or not. If there is no good reason for the
{{TaskExecutor}} to manage this component, it could make the {{TaskExecutor}}
simpler by not aggregating too many responsibilities. If, on the other hand,
the language execution service benefitted from a tight integration with the
execution of a job (e.g. by allocating and releasing resources swiftly), then a
tighter integration could make sense.
If the {{TaskExecutor}} is not responsible for managing the language execution
component, then it could be the responsibility of the {{ActiveResourceManager}}
or of the user to start the language execution service (e.g. in the case of a
standalone deployment).
Whether a language supports code isolation should not necessarily determine
which process model to choose. For example, a separate language execution
service for Python (option 3.) could spawn individual Python processes for
different jobs, slots or tasks to achieve isolation. Hence, I believe this is
more of an implementation detail of the language execution service than a
strict argument for any of the proposed processing models.
Concerning the proposed memory management: Does option 2. entail that one
enables the cluster to run with Python at deployment time of the
{{TaskExecutor}}? Asked differently, will we reserve a fixed amount of memory
for the Python process independent of whether we actually run a Python
workload or not?
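To make the question concrete, here is a sketch of what such a deployment-time
reservation could look like in flink-conf.yaml, assuming option 2 maps the
Python memory onto the existing task off-heap accounting (the key below is the
existing Flink option used by the FLINK-17923 workaround; the sizing is purely
illustrative):

```yaml
# Illustrative sketch only, not a proposed configuration.
# If option 2 reserves Python memory statically, the TaskExecutor would
# carve it out of its memory budget at deployment time, e.g. via the
# task off-heap pool (the mechanism used by the FLINK-17923 workaround):
taskmanager.memory.task.off-heap.size: 512m
# The 512m reservation would then be held for the lifetime of the
# TaskExecutor, whether or not a Python workload is ever scheduled on it.
```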
> Revisit resource management model for python processes.
> -------------------------------------------------------
>
> Key: FLINK-18738
> URL: https://issues.apache.org/jira/browse/FLINK-18738
> Project: Flink
> Issue Type: Task
> Components: API / Python, Runtime / Coordination
> Reporter: Xintong Song
> Assignee: Xintong Song
> Priority: Major
> Fix For: 1.12.0
>
>
> This ticket is for tracking the effort towards a proper long-term resource
> management model for python processes.
> In FLINK-17923, we ran into problems because Python processes are not well
> integrated with the task manager resource management mechanism. A temporary
> workaround has been merged for release-1.11.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)