KevinGG commented on a change in pull request #16555:
URL: https://github.com/apache/beam/pull/16555#discussion_r788143922
##########
File path:
sdks/python/apache_beam/runners/interactive/interactive_environment.py
##########
@@ -359,10 +359,14 @@ def get_cache_manager(self, pipeline, create_if_absent=False):
     manager for the pipeline."""
     cache_manager = self._cache_managers.get(str(id(pipeline)), None)
     if not cache_manager and create_if_absent:
-      cache_dir = tempfile.mkdtemp(
-          suffix=str(id(pipeline)),
-          prefix='it-',
-          dir=os.environ.get('TEST_TMPDIR', None))
+      from apache_beam.runners.interactive import interactive_beam as ib
+      if ib.options.specified_cache_dir:
+        cache_dir = ib.options.specified_cache_dir
Review comment:
Should we add a prefix check here?
For example:
- if the path starts with "gs://", treat it as a GCS bucket (we may add a
GCS file check later and log warnings if something isn't right).
- otherwise, treat it as a local path, create a temp dir under that path, and
log a warning if something goes wrong:
`cache_dir = tempfile.mkdtemp(dir=cache_dir)`
Ideally, pipelines should not share a common directory to store cache files, so
ib.options.specified_cache_dir should only serve as a parent path for each
cache manager's own files.
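A rough sketch of the check, to make the suggestion concrete. The helper name
`resolve_cache_dir` and its parameters are hypothetical and not part of this
PR. The per-pipeline suffix on the GCS path and the fallback to the default
temp dir are my assumptions about what we would want:

```python
import logging
import os
import tempfile

_LOGGER = logging.getLogger(__name__)


def resolve_cache_dir(specified_cache_dir, pipeline_id):
  """Hypothetical helper: returns a cache location owned by one pipeline."""

  def _default_dir():
    # Same call as the code removed in this diff.
    return tempfile.mkdtemp(
        suffix=str(pipeline_id),
        prefix='it-',
        dir=os.environ.get('TEST_TMPDIR', None))

  if not specified_cache_dir:
    return _default_dir()

  if specified_cache_dir.startswith('gs://'):
    # Treated as a GCS location. Nothing is validated here; a GCS file
    # check with warnings could be added later. The per-pipeline suffix is
    # an assumption so that pipelines do not share one directory.
    return '%s/it-%s' % (specified_cache_dir.rstrip('/'), pipeline_id)

  try:
    # Local path: use it only as the parent of a fresh temp dir.
    os.makedirs(specified_cache_dir, exist_ok=True)
    return tempfile.mkdtemp(
        suffix=str(pipeline_id), prefix='it-', dir=specified_cache_dir)
  except OSError as e:
    _LOGGER.warning(
        'Could not use %s as the cache dir parent (%s); falling back to '
        'the default temp dir.', specified_cache_dir, e)
    return _default_dir()
```

With something like this, `get_cache_manager` could call
`resolve_cache_dir(ib.options.specified_cache_dir, id(pipeline))`, and two
pipelines given the same parent path would still get separate directories.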
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]