tvalentyn commented on code in PR #28781:
URL: https://github.com/apache/beam/pull/28781#discussion_r1378059211
##########
sdks/python/apache_beam/options/pipeline_options.py:
##########
@@ -1135,6 +1135,20 @@ def _add_argparse_args(cls, parser):
dest='min_cpu_platform',
type=str,
help='GCE minimum CPU platform. Default is determined by GCP.')
+ parser.add_argument(
+ '--max_cache_memory_usage_mb',
+ dest='max_cache_memory_usage_mb',
+ type=int,
+ default=100,
+ help=(
+ 'Size of the SdkHarness cache to store user state and side inputs '
Review Comment:
nit: consider the following wording:
```
'Size of the SDK Harness cache to store user state and side inputs '
'in MB. Default is 100MB. If the cache is full, least recently '
'used elements will be evicted. This cache is per '
'each SDK Harness instance. SDK Harness is a component responsible '
'for executing the user code and communicating with the runner. '
'Depending on the runner, '
'there may be more than one SDK Harness process running on the same worker node. '
'Increasing cache size might improve performance of some pipelines, but can lead to an increase '
'in memory consumption and OOM errors if workers are not appropriately provisioned.'
```
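For context, here is a minimal sketch of how the flag would behave with this help text attached. It assumes a plain `argparse.ArgumentParser` as a stand-in for the parser that Beam passes to `_add_argparse_args`, so the `parser`, `default_args`, and `custom_args` names are illustrative and not part of the PR.

```python
import argparse

# Hypothetical stand-in for the parser Beam hands to _add_argparse_args;
# this sketch does not import or exercise apache_beam itself.
parser = argparse.ArgumentParser()
parser.add_argument(
    '--max_cache_memory_usage_mb',
    dest='max_cache_memory_usage_mb',
    type=int,
    default=100,
    help=(
        'Size of the SDK Harness cache to store user state and side inputs '
        'in MB. Default is 100MB. If the cache is full, least recently '
        'used elements will be evicted. This cache is per '
        'each SDK Harness instance.'))

# With no flag on the command line, the default of 100 MB applies.
default_args = parser.parse_args([])

# An explicit value is parsed as an int because of type=int.
custom_args = parser.parse_args(['--max_cache_memory_usage_mb=512'])
```

Because the help text says the cache is per SDK Harness instance, a worker running several harness processes could use up to the configured value in each of them, which is why the wording warns about OOM errors.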