Good Morning/Afternoon/Evening folks,

The current support for beam plugins is experimental, and we would like to make it a first-class part of the Beam library for Python Runner v2. This would let us load plugins into the runtime before the SdkHarness starts. I created https://github.com/apache/beam/pull/16920 for this purpose and would like to gather some thoughts on the approach so that it can be standardized.

The current implementation of beam plugins lets users subclass BeamPlugin, and each subclass is automatically added to the --beam_plugin PipelineOption, e.g.: FileSystem <https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/filesystem.py#L475>. This produces the pipeline option:

--beam_plugin=[
  'apache_beam.io.aws.s3filesystem.S3FileSystem',
  'apache_beam.io.filesystem.FileSystem',
  'apache_beam.io.hadoopfilesystem.HadoopFileSystem',
  'apache_beam.io.localfilesystem.LocalFileSystem',
  'apache_beam.io.gcp.gcsfilesystem.GCSFileSystem',
  'apache_beam.io.azure.blobstoragefilesystem.BlobStorageFileSystem'
]

Another way is to pass a module directly via the --beam_plugin PipelineOption, e.g.:

--beam_plugin='twitter.beam.rule_the_world'

The current implementation in the PR supports both of these approaches, but we would love to settle on a standardized way forward and have it documented.

Would love to hear your thoughts about this.

Thanks & Regards,
Rahul Iyer
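
P.S. For anyone less familiar with the two forms described above, here is a rough, self-contained sketch of the idea. It is not the code from the PR or from Beam itself: PluginBase, LocalStore, CloudStore, and import_plugins are hypothetical names invented for illustration, and the fallback from 'pkg.module.ClassName' to 'pkg.module' is just one plausible way to accept both a class path and a plain module path.

```python
import importlib


class PluginBase:
    """Hypothetical stand-in for a plugin base class (not Beam's BeamPlugin)."""

    @classmethod
    def all_subclasses(cls):
        # Walk the subclass tree depth-first so indirect subclasses are found too.
        found = []
        for sub in cls.__subclasses__():
            found.append(sub)
            found.extend(sub.all_subclasses())
        return found

    @classmethod
    def all_plugin_paths(cls):
        # Fully qualified 'module.ClassName' strings, suitable for a list option.
        return [f"{sub.__module__}.{sub.__qualname__}" for sub in cls.all_subclasses()]


class LocalStore(PluginBase):
    """Illustrative direct subclass."""


class CloudStore(LocalStore):
    """Illustrative indirect subclass."""


def import_plugins(paths):
    """Import each entry, accepting 'pkg.module' or 'pkg.module.ClassName'."""
    imported = []
    for path in paths:
        try:
            # Form 2: the entry is itself an importable module.
            module = importlib.import_module(path)
        except ImportError:
            # Form 1: assume the last component is a class name and import
            # the module that contains it.
            module_name, _, _ = path.rpartition(".")
            if not module_name:
                raise
            module = importlib.import_module(module_name)
        imported.append(module.__name__)
    return imported
```

Calling PluginBase.all_plugin_paths() yields one entry per subclass, and import_plugins(["json", "json.decoder.JSONDecoder"]) imports the json module for the first entry and json.decoder for the second. Importing the containing module is enough to make a plugin class register itself as a subclass, which is why the sketch does not look up the class attribute.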
