+user <[email protected]>

On Thu, Aug 20, 2020 at 9:47 AM Luke Cwik <[email protected]> wrote:

> Are you using Dataflow runner v2[1]?
>
> If so, then you can use the following pipeline option[2]:
> --number_of_worker_harness_threads=X
>
> Do you know where/why the OOM is occurring?
>
> 1:
> https://cloud.google.com/dataflow/docs/guides/deploying-a-pipeline#dataflow-runner-v2
> 2:
> https://github.com/apache/beam/blob/017936f637b119f0b0c0279a226c9f92a2cf4f15/sdks/python/apache_beam/options/pipeline_options.py#L834
>
> On Thu, Aug 20, 2020 at 7:33 AM Kamil Wasilewski <
> [email protected]> wrote:
>
>> Hi all,
>>
>> As I stated in the title, is there an equivalent for
>> --numberOfWorkerHarnessThreads in Python SDK? I've got a streaming pipeline
>> in Python which suffers from OutOfMemory exceptions (I'm using Dataflow).
>> Switching to highmem workers solved the issue, but I wonder if I can set a
>> limit of threads that will be used in a single worker to decrease memory
>> usage.
>>
>> Regards,
>> Kamil
>>
>>
