gemini-code-assist[bot] commented on code in PR #38547:
URL: https://github.com/apache/beam/pull/38547#discussion_r3277076535
##########
sdks/python/apache_beam/yaml/yaml_transform.py:
##########
@@ -1391,19 +1391,61 @@ def validate_transform_references(spec):
return spec
+def strip_leading_comments(source: str) -> str:
+ lines = source.splitlines(keepends=True)
+ stripped_lines = []
+ in_leading_comments = True
+ for line in lines:
+ stripped_line = line.lstrip()
+ if in_leading_comments:
+ if stripped_line.startswith('#') or not stripped_line:
+ continue
+ else:
+ in_leading_comments = False
+ stripped_lines.append(line)
+ return "".join(stripped_lines)
+
+
class _BeamFileIOLoader(jinja2.BaseLoader):
+ def __init__(self, search_paths=()):
+ self.search_paths = list(search_paths)
+
def get_source(self, environment, path):
- with FileSystems.open(path) as fin:
- source = fin.read().decode()
- return source, path, lambda: True
+ candidates = [path]
+ if FileSystems.get_scheme(path) is None and not path.startswith('/'):
+ for search_path in self.search_paths:
+ candidates.append(FileSystems.join(search_path, path))
+
+ for candidate in candidates:
+ try:
+ if FileSystems.exists(candidate):
+ with FileSystems.open(candidate) as fin:
+ source = fin.read().decode()
+ return strip_leading_comments(source), candidate, lambda: True
+ except Exception:
+ pass
Review Comment:

Catching all exceptions and silently passing can hide important issues such
as permission errors or corrupted files. Since `FileSystems.exists` is checked
before opening, any exception here is likely an unexpected error that should at
least be logged or allowed to propagate if it's not a 'file not found' error.
##########
sdks/python/apache_beam/yaml/yaml_transform.py:
##########
@@ -1391,19 +1391,61 @@ def validate_transform_references(spec):
return spec
+def strip_leading_comments(source: str) -> str:
+ lines = source.splitlines(keepends=True)
+ stripped_lines = []
+ in_leading_comments = True
+ for line in lines:
+ stripped_line = line.lstrip()
+ if in_leading_comments:
+ if stripped_line.startswith('#') or not stripped_line:
+ continue
+ else:
+ in_leading_comments = False
+ stripped_lines.append(line)
+ return "".join(stripped_lines)
+
+
class _BeamFileIOLoader(jinja2.BaseLoader):
+ def __init__(self, search_paths=()):
+ self.search_paths = list(search_paths)
+
def get_source(self, environment, path):
- with FileSystems.open(path) as fin:
- source = fin.read().decode()
- return source, path, lambda: True
+ candidates = [path]
+ if FileSystems.get_scheme(path) is None and not path.startswith('/'):
Review Comment:

The check `not path.startswith('/')` is Unix-centric and may not correctly
identify absolute paths on other platforms like Windows. Since this block is
only executed when no scheme is present (local paths), using `os.path.isabs()`
would be more robust.
```suggestion
if FileSystems.get_scheme(path) is None and not os.path.isabs(path):
```
##########
sdks/python/apache_beam/yaml/yaml_transform_test.py:
##########
@@ -23,12 +23,16 @@
import tempfile
import unittest
+import yaml
Review Comment:

The `shutil` module is used in the `tearDown` method of
`TestYamlExpandJinja` (line 1584) but it is not imported in this file. This
will cause a `NameError` during test cleanup.
```suggestion
import shutil
import yaml
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]