Re: [PR] [Python]Enable state cache to 100 MB [beam]

via GitHub Tue, 31 Oct 2023 12:11:09 -0700


tvalentyn commented on code in PR #28781:
URL: https://github.com/apache/beam/pull/28781#discussion_r1378059211



##########
sdks/python/apache_beam/options/pipeline_options.py:
##########
@@ -1135,6 +1135,20 @@ def _add_argparse_args(cls, parser):
         dest='min_cpu_platform',
         type=str,
         help='GCE minimum CPU platform. Default is determined by GCP.')
+    parser.add_argument(
+        '--max_cache_memory_usage_mb',
+        dest='max_cache_memory_usage_mb',
+        type=int,
+        default=100,
+        help=(
+            'Size of the SdkHarness cache to store user state and side inputs '

Review Comment:
   nit: consider the following wording:
   
   ```
               'Size of the SDK Harness cache to store user state and side '
               'inputs in MB. Default is 100MB. If the cache is full, the '
               'least recently used elements will be evicted. This cache is '
               'per SDK Harness instance. The SDK Harness is a component '
               'responsible for executing the user code and communicating '
               'with the runner. Depending on the runner, there may be more '
               'than one SDK Harness process running on the same worker node. '
               'Increasing the cache size might improve the performance of '
               'some pipelines, but can lead to an increase in memory '
               'consumption and OOM errors if workers are not appropriately '
               'provisioned.'
   ```
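
For illustration only, here is a rough sketch of the behavior the help text describes: an integer flag defaulting to 100, and a size-bounded cache that evicts the least recently used entries once the budget is exceeded. The flag name, type, and default come from the diff above; `parse_cache_size_mb` and `LruByteCache` are hypothetical names made up for this sketch, the 1 MB = 2**20 bytes conversion is an assumption, and none of this is meant to reflect how the SDK Harness cache is actually implemented.

```python
import argparse
from collections import OrderedDict


def parse_cache_size_mb(argv):
    """Parse the flag using the name, type, and default shown in the diff."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        '--max_cache_memory_usage_mb',
        dest='max_cache_memory_usage_mb',
        type=int,
        default=100)
    # Ignore unrelated flags so the sketch works with arbitrary argv.
    known, _ = parser.parse_known_args(argv)
    return known.max_cache_memory_usage_mb


class LruByteCache:
    """Hypothetical size-bounded LRU cache; not Beam's implementation."""

    def __init__(self, max_mb):
        # Assumption: 1 MB is treated as 2**20 bytes.
        self._max_bytes = max_mb * 1024 * 1024
        self._used = 0
        self._items = OrderedDict()  # key -> bytes, least recent first

    def __contains__(self, key):
        return key in self._items

    def get(self, key):
        if key not in self._items:
            return None
        # A read marks the entry as most recently used.
        self._items.move_to_end(key)
        return self._items[key]

    def put(self, key, value):
        if key in self._items:
            self._used -= len(self._items.pop(key))
        self._items[key] = value
        self._used += len(value)
        # Evict least recently used entries until back under budget.
        while self._used > self._max_bytes and self._items:
            _, evicted = self._items.popitem(last=False)
            self._used -= len(evicted)
```

With a 1 MB budget and three 512 KB values, reading the first key before inserting the third causes the second key to be evicted. If a runner starts several SDK Harness processes on one worker, each would hold its own cache of this size, which is why the total memory footprint can be a multiple of the flag value.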



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
