[
https://issues.apache.org/jira/browse/BEAM-10200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Kenneth Knowles updated BEAM-10200:
-----------------------------------
This Jira ticket has a pull request attached to it, but is still open. Did the
pull request resolve the issue? If so, could you please mark it resolved? This
will help the project have a clear view of its open issues.
> Improve memory profiling for users of Portable Beam Python
> ----------------------------------------------------------
>
> Key: BEAM-10200
> URL: https://issues.apache.org/jira/browse/BEAM-10200
> Project: Beam
> Issue Type: Improvement
> Components: sdk-py-harness
> Reporter: Valentyn Tymofieiev
> Priority: P3
> Labels: starter
> Time Spent: 4h 50m
> Remaining Estimate: 0h
>
> We have a Profiler[1] that is integrated with the SDK worker[1a]; however, it
> only saves CPU metrics[1b].
> We have a MemoryReporter util[2] which can log heap dumps; however, it is not
> documented on the Beam website and does not respect the --profile_memory and
> --profile_location options[3]. The profile_memory flag currently works only
> for Dataflow Runner users who run non-portable batch pipelines; profiles
> are saved only if memory usage between samples exceeds 1000M.
> We should improve the memory profiling experience for Portable Python users
> and consider adding a guide to the Beam website on how users can investigate
> OOMing pipelines.
>
> [1] https://github.com/apache/beam/blob/095589c28f5c427bf99fc0330af91c859bb2ad6b/sdks/python/apache_beam/utils/profiler.py#L46
> [1a] https://github.com/apache/beam/blob/095589c28f5c427bf99fc0330af91c859bb2ad6b/sdks/python/apache_beam/runners/worker/sdk_worker_main.py#L157
> [1b] https://github.com/apache/beam/blob/095589c28f5c427bf99fc0330af91c859bb2ad6b/sdks/python/apache_beam/utils/profiler.py#L112
> [2] https://github.com/apache/beam/blob/095589c28f5c427bf99fc0330af91c859bb2ad6b/sdks/python/apache_beam/utils/profiler.py#L124
> [3] https://github.com/apache/beam/blob/095589c28f5c427bf99fc0330af91c859bb2ad6b/sdks/python/apache_beam/options/pipeline_options.py#L846
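
The issue says profiles are saved only if memory usage between samples exceeds a threshold. As a rough illustration of that sampling idea, here is a minimal sketch using only the standard-library tracemalloc module. It is not Beam's MemoryReporter or Profiler code: the function names should_save_profile and sample_and_report, the byte-based threshold, and the top_n parameter are all hypothetical, and tracemalloc only sees Python-level allocations rather than total process memory.

```python
import tracemalloc


def should_save_profile(previous_bytes, current_bytes, threshold_bytes):
    """Return True when traced memory grew by more than threshold_bytes."""
    return current_bytes - previous_bytes > threshold_bytes


def sample_and_report(previous_bytes, threshold_bytes, top_n=3):
    """Take one sample; return (current_bytes, report lines or None).

    Hypothetical helper: a report is produced only when growth since the
    previous sample exceeds the threshold, mirroring the behaviour the
    issue describes for profile_memory.
    """
    current_bytes, _peak = tracemalloc.get_traced_memory()
    if not should_save_profile(previous_bytes, current_bytes, threshold_bytes):
        return current_bytes, None
    # Group live allocations by source line and keep the largest ones.
    snapshot = tracemalloc.take_snapshot()
    stats = snapshot.statistics("lineno")[:top_n]
    return current_bytes, [str(stat) for stat in stats]


if __name__ == "__main__":
    tracemalloc.start()
    baseline, _ = tracemalloc.get_traced_memory()
    hog = [bytes(1024) for _ in range(2048)]  # roughly 2 MiB of allocations
    _, report = sample_and_report(baseline, threshold_bytes=1024 * 1024)
    print("\n".join(report) if report else "growth below threshold")
    tracemalloc.stop()
```

In a real worker the report would presumably be written to the location given by --profile_location rather than printed; how Beam wires that up is not shown here.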