[ 
https://issues.apache.org/jira/browse/BEAM-10200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-10200:
-----------------------------------

This Jira ticket has a pull request attached to it, but is still open. Did the 
pull request resolve the issue? If so, could you please mark it resolved? This 
will help the project have a clear view of its open issues.

> Improve memory profiling for users of Portable Beam Python
> ----------------------------------------------------------
>
>                 Key: BEAM-10200
>                 URL: https://issues.apache.org/jira/browse/BEAM-10200
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-py-harness
>            Reporter: Valentyn Tymofieiev
>            Priority: P3
>              Labels: starter
>          Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> We have a Profiler[1] that is integrated with the SDK worker[1a], but it only 
> saves CPU metrics[1b].
> We have a MemoryReporter util[2] that can log heap dumps, but it is not 
> documented on the Beam website and does not respect the --profile_memory and 
> --profile_location options[3]. The --profile_memory flag currently works only 
> for Dataflow Runner users who run non-portable batch pipelines; profiles 
> are saved only if memory usage between samples exceeds 1000M. 
> We should improve the memory profiling experience for Portable Python users 
> and consider writing a guide on the Beam website that shows users how to 
> investigate OOMing pipelines.
>  
> [1] 
> https://github.com/apache/beam/blob/095589c28f5c427bf99fc0330af91c859bb2ad6b/sdks/python/apache_beam/utils/profiler.py#L46
> [1a] 
> https://github.com/apache/beam/blob/095589c28f5c427bf99fc0330af91c859bb2ad6b/sdks/python/apache_beam/runners/worker/sdk_worker_main.py#L157
> [1b] 
> https://github.com/apache/beam/blob/095589c28f5c427bf99fc0330af91c859bb2ad6b/sdks/python/apache_beam/utils/profiler.py#L112
> [2] 
> https://github.com/apache/beam/blob/095589c28f5c427bf99fc0330af91c859bb2ad6b/sdks/python/apache_beam/utils/profiler.py#L124
> [3] 
> https://github.com/apache/beam/blob/095589c28f5c427bf99fc0330af91c859bb2ad6b/sdks/python/apache_beam/options/pipeline_options.py#L846
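
For readers unfamiliar with threshold-gated heap reporting, the behaviour the
description mentions (a profile is saved only when memory usage between samples
crosses a limit) can be sketched with Python's standard tracemalloc module.
This is only an illustration under stated assumptions, not Beam's
implementation: ThresholdHeapSampler and demo are hypothetical names, the
threshold is read as growth since the previous sample, and the limit is set to
1 MB instead of 1000M so the demo stays small. Writing reports to a
--profile_location is deliberately left out.

```python
import tracemalloc


class ThresholdHeapSampler:
    """Hypothetical helper (not part of Apache Beam).

    Records a heap report only when traced memory grew by more than
    `threshold_bytes` since the previous sample.
    """

    def __init__(self, threshold_bytes):
        self.threshold_bytes = threshold_bytes
        # Baseline: bytes currently traced by tracemalloc (0 if not tracing).
        self._last_bytes = tracemalloc.get_traced_memory()[0]
        self.reports = []

    def sample(self):
        """Take one sample; return True if a report was recorded."""
        current_bytes, _peak = tracemalloc.get_traced_memory()
        growth = current_bytes - self._last_bytes
        self._last_bytes = current_bytes
        if growth <= self.threshold_bytes:
            return False
        # Growth exceeded the threshold: keep the top allocation sites.
        snapshot = tracemalloc.take_snapshot()
        top_stats = snapshot.statistics("lineno")[:5]
        self.reports.append([str(stat) for stat in top_stats])
        return True


def demo():
    """Allocate a small and a large buffer and sample after each."""
    tracemalloc.start()
    try:
        sampler = ThresholdHeapSampler(threshold_bytes=1_000_000)
        small = bytearray(10_000)       # well below the threshold
        saved_small = sampler.sample()
        big = bytearray(5_000_000)      # well above the threshold
        saved_big = sampler.sample()
        return saved_small, saved_big, len(sampler.reports)
    finally:
        tracemalloc.stop()
```

Calling demo() samples twice; with the sizes above, only the sample taken after
the large allocation should record a report. A real worker would presumably
call sample() periodically from a background thread and persist the reports,
but that wiring is outside what the ticket specifies.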



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
