gemini-code-assist[bot] commented on code in PR #38547:
URL: https://github.com/apache/beam/pull/38547#discussion_r3277076535


##########
sdks/python/apache_beam/yaml/yaml_transform.py:
##########
@@ -1391,19 +1391,61 @@ def validate_transform_references(spec):
   return spec
 
 
+def strip_leading_comments(source: str) -> str:
+  lines = source.splitlines(keepends=True)
+  stripped_lines = []
+  in_leading_comments = True
+  for line in lines:
+    stripped_line = line.lstrip()
+    if in_leading_comments:
+      if stripped_line.startswith('#') or not stripped_line:
+        continue
+      else:
+        in_leading_comments = False
+    stripped_lines.append(line)
+  return "".join(stripped_lines)
+
+
 class _BeamFileIOLoader(jinja2.BaseLoader):
+  def __init__(self, search_paths=()):
+    self.search_paths = list(search_paths)
+
   def get_source(self, environment, path):
-    with FileSystems.open(path) as fin:
-      source = fin.read().decode()
-    return source, path, lambda: True
+    candidates = [path]
+    if FileSystems.get_scheme(path) is None and not path.startswith('/'):
+      for search_path in self.search_paths:
+        candidates.append(FileSystems.join(search_path, path))
+
+    for candidate in candidates:
+      try:
+        if FileSystems.exists(candidate):
+          with FileSystems.open(candidate) as fin:
+            source = fin.read().decode()
+          return strip_leading_comments(source), candidate, lambda: True
+      except Exception:
+        pass

Review Comment:
   ![medium](https://www.gstatic.com/codereviewagent/medium-priority.svg)
   
   Catching all exceptions and silently passing can hide important issues such 
as permission errors or corrupted files. Since `FileSystems.exists` is checked 
before opening, any exception here is likely an unexpected error that should at 
least be logged or allowed to propagate if it's not a 'file not found' error.



##########
sdks/python/apache_beam/yaml/yaml_transform.py:
##########
@@ -1391,19 +1391,61 @@ def validate_transform_references(spec):
   return spec
 
 
+def strip_leading_comments(source: str) -> str:
+  lines = source.splitlines(keepends=True)
+  stripped_lines = []
+  in_leading_comments = True
+  for line in lines:
+    stripped_line = line.lstrip()
+    if in_leading_comments:
+      if stripped_line.startswith('#') or not stripped_line:
+        continue
+      else:
+        in_leading_comments = False
+    stripped_lines.append(line)
+  return "".join(stripped_lines)
+
+
 class _BeamFileIOLoader(jinja2.BaseLoader):
+  def __init__(self, search_paths=()):
+    self.search_paths = list(search_paths)
+
   def get_source(self, environment, path):
-    with FileSystems.open(path) as fin:
-      source = fin.read().decode()
-    return source, path, lambda: True
+    candidates = [path]
+    if FileSystems.get_scheme(path) is None and not path.startswith('/'):

Review Comment:
   ![medium](https://www.gstatic.com/codereviewagent/medium-priority.svg)
   
   The check `not path.startswith('/')` is Unix-centric and may not correctly 
identify absolute paths on other platforms like Windows. Since this block is 
only executed when no scheme is present (local paths), using `os.path.isabs()` 
would be more robust.
   
   ```suggestion
       if FileSystems.get_scheme(path) is None and not os.path.isabs(path):
   ```



##########
sdks/python/apache_beam/yaml/yaml_transform_test.py:
##########
@@ -23,12 +23,16 @@
 import tempfile
 import unittest
 
+import yaml

Review Comment:
   ![high](https://www.gstatic.com/codereviewagent/high-priority.svg)
   
   The `shutil` module is used in the `tearDown` method of 
`TestYamlExpandJinja` (line 1584) but it is not imported in this file. This 
will cause a `NameError` during test cleanup.
   
   ```suggestion
   import shutil
   import yaml
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to