KevinGG commented on a change in pull request #16555:
URL: https://github.com/apache/beam/pull/16555#discussion_r788143922



##########
File path: sdks/python/apache_beam/runners/interactive/interactive_environment.py
##########
@@ -359,10 +359,14 @@ def get_cache_manager(self, pipeline, create_if_absent=False):
     manager for the pipeline."""
     cache_manager = self._cache_managers.get(str(id(pipeline)), None)
     if not cache_manager and create_if_absent:
-      cache_dir = tempfile.mkdtemp(
-          suffix=str(id(pipeline)),
-          prefix='it-',
-          dir=os.environ.get('TEST_TMPDIR', None))
+      from apache_beam.runners.interactive import interactive_beam as ib
+      if ib.options.specified_cache_dir:
+        cache_dir = ib.options.specified_cache_dir

Review comment:
       Should we add a prefix check here?
   
   For example:
   
   - If the path starts with "gs://", treat it as a GCS bucket (we may add a GCS file check later and log a warning if something isn't right).
   - Otherwise, treat it as a local path, create a temp dir under that path, and log a warning if something goes wrong:
   `cache_dir = tempfile.mkdtemp(dir=cache_dir)`
   
   Ideally, pipelines should not share a common directory for cache files, so `ib.options.specified_cache_dir` should serve only as the parent path under which each cache manager stores its own files. We also need to explain this in the setter in the interactive_beam module.
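
   A minimal sketch of the check suggested above, under a few assumptions: `resolve_cache_dir` is a hypothetical helper, not an existing Beam function; the per-pipeline GCS sub-path and the fallback to the default temp location on failure are illustrative choices, and the comment itself only asks for a warning to be logged.

```python
import logging
import os
import tempfile

_LOGGER = logging.getLogger(__name__)


def resolve_cache_dir(specified_cache_dir, pipeline_id):
  """Hypothetical helper: derive a per-pipeline cache dir from a parent path.

  `specified_cache_dir` is only the parent location; every pipeline gets its
  own child directory so cache files are never shared between pipelines.
  """
  suffix = str(pipeline_id)
  if specified_cache_dir.startswith('gs://'):
    # Assumption: no GCS validation yet; just build a per-pipeline sub-path.
    return specified_cache_dir.rstrip('/') + '/it-' + suffix
  try:
    # Local path: make sure the parent exists, then create a unique child.
    os.makedirs(specified_cache_dir, exist_ok=True)
    return tempfile.mkdtemp(
        suffix=suffix, prefix='it-', dir=specified_cache_dir)
  except OSError as e:
    # Assumption: warn and fall back to the previous default behavior.
    _LOGGER.warning(
        'Cannot use %s as a cache directory (%s); falling back to a '
        'default temp directory.', specified_cache_dir, e)
    return tempfile.mkdtemp(
        suffix=suffix, prefix='it-', dir=os.environ.get('TEST_TMPDIR', None))
```

   With something like this, `get_cache_manager` could call `resolve_cache_dir(ib.options.specified_cache_dir, id(pipeline))` when the option is set and keep the existing `tempfile.mkdtemp` call otherwise.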
   



