charlespnh commented on code in PR #35715:
URL: https://github.com/apache/beam/pull/35715#discussion_r2241302621


##########
sdks/python/apache_beam/yaml/yaml_ml.py:
##########
@@ -29,14 +32,36 @@
 from apache_beam.yaml import options
 from apache_beam.yaml.yaml_utils import SafeLineLoader
 
+
+def list_submodules(package):
+  """
+    Lists all submodules within a given package.
+    """
+  submodules = []
+  for _, module_name, _ in pkgutil.walk_packages(
+      package.__path__, package.__name__ + '.'):
+    if 'test' in module_name:
+      continue
+    submodules.append(module_name)
+  return submodules
+
+
 try:
   from apache_beam.ml.transforms import tft
   from apache_beam.ml.transforms.base import MLTransform
   # TODO(robertwb): Is this all of them?
-  _transform_constructors = tft.__dict__
+  _transform_constructors = {}
 except ImportError:
   tft = None  # type: ignore
 
+# Load all available ML Transform modules
+for module_name in list_submodules(beam.ml.transforms):
+  try:
+    module = import_module(module_name)
+    _transform_constructors |= module.__dict__
+  except ImportError as e:
+    logging.warning('Could not load ML transform module %s: %s', module_name, e)

Review Comment:
   I'm personally +1 on option 2: I shouldn't have to install everything if 
I'm only using a subset of these transforms, and there should be a well-defined 
error message when the pipeline uses a transform whose dependencies aren't 
installed properly.
   
   CC @chamikaramj and @liferoad. Not sure how our user base is using 
MLTransform.
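
   To make the idea concrete, here is a minimal sketch of what I have in mind, 
assuming option 2 means importing a transform's module only when the pipeline 
asks for that transform. The registry, its entries, and `resolve_transform` 
are hypothetical names for illustration, not existing Beam APIs; the stdlib 
`json` module stands in for a transform module whose dependencies are installed.

```python
import importlib

# Hypothetical registry: transform name -> (module path, attribute name).
# Nothing listed here is imported until a pipeline requests it.
_LAZY_TRANSFORMS = {
    # Stdlib stand-in for a transform whose dependencies are installed.
    'Decoder': ('json', 'JSONDecoder'),
    # Stand-in for a transform whose optional dependency is not installed.
    'MissingTransform': (
        'hypothetical_missing_pkg.transforms', 'MissingTransform'),
}


def resolve_transform(name, registry=_LAZY_TRANSFORMS):
  """Imports the module backing `name` only when the transform is used."""
  if name not in registry:
    raise ValueError(
        f'Unknown transform {name!r}; known transforms: {sorted(registry)}')
  module_name, attr = registry[name]
  try:
    module = importlib.import_module(module_name)
  except ImportError as e:
    # Surface a targeted error instead of a warning at import time.
    raise ValueError(
        f'Transform {name!r} requires module {module_name!r}, which could '
        'not be imported. Install its dependencies to use this transform.'
    ) from e
  return getattr(module, attr)
```

   With something like this, `resolve_transform('Decoder')` succeeds without 
touching any other module, while `resolve_transform('MissingTransform')` fails 
with a message naming the transform and the module that could not be imported.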


