charlespnh commented on code in PR #35715:
URL: https://github.com/apache/beam/pull/35715#discussion_r2240952446
##########
sdks/python/apache_beam/yaml/yaml_ml.py:
##########
@@ -29,14 +32,36 @@
from apache_beam.yaml import options
from apache_beam.yaml.yaml_utils import SafeLineLoader
+
+def list_submodules(package):
+ """
+ Lists all submodules within a given package.
+ """
+ submodules = []
+ for _, module_name, _ in pkgutil.walk_packages(
+ package.__path__, package.__name__ + '.'):
+ if 'test' in module_name:
+ continue
+ submodules.append(module_name)
+ return submodules
+
+
try:
from apache_beam.ml.transforms import tft
from apache_beam.ml.transforms.base import MLTransform
# TODO(robertwb): Is this all of them?
- _transform_constructors = tft.__dict__
+ _transform_constructors = {}
except ImportError:
tft = None # type: ignore
+# Load all available ML Transform modules
+for module_name in list_submodules(beam.ml.transforms):
+ try:
+ module = import_module(module_name)
+ _transform_constructors |= module.__dict__
+ except ImportError as e:
+ logging.warning('Could not load ML transform module %s: %s', module_name,
e)
Review Comment:
Sorry if I'm being pedantic, but suppose a pipeline uses one of these
embedding transforms and its dependencies are not installed (e.g. the pipeline
uses `TensorflowHubTextEmbeddings` but `tensorflow_hub` is not installed).
What error would the user see when the pipeline fails later on? The only
signal here is a warning log, so I'm assuming the pipeline would fail later in
some unspecified way, rather than with a clear error about the missing
dependency?