Polber commented on PR #30864:
URL: https://github.com/apache/beam/pull/30864#issuecomment-2040321400

   @robertwb After further investigation, it appears this occurs whenever two 
consecutive ExternalTransforms are declared that use the same gradle target and 
the first is given multiple inputs (a PCollectionTuple). I haven't been able to 
reproduce it unless the input to the first ExternalTransform is a dict of PCollections.
   
   Python example:
   ```
   import apache_beam as beam
   from apache_beam.transforms import external
   import logging
   
   from apache_beam.utils import subprocess_server
   
   logging.getLogger().setLevel(logging.INFO)
   
   with beam.Pipeline('DirectRunner') as p:
     i1 = p | "i1" >> beam.Create([beam.Row(name='john', id=1)])
     i2 = p | "i2" >> beam.Create([beam.Row(name='jane', id=1)])
     # First ExternalTransform: its input is a dict of PCollections.
     result = {'i1': i1, 'i2': i2} | 'Sql1' >> external.ExternalTransform(
         'beam:external:java:sql:v1',
         external.ImplicitSchemaPayloadBuilder(
             {'query': 'SELECT * FROM i1 INNER JOIN i2 ON i1.id = i2.id'}
         ).payload(),
         external.JavaJarExpansionService(
             subprocess_server.JavaJarServer.path_to_beam_jar(
                 gradle_target='sdks:java:extensions:sql:expansion-service:shadowJar',
                 artifact_id=None
             )
         )
     ) | 'LogForTesting' >> external.SchemaAwareExternalTransform(
         # Second ExternalTransform: uses the same gradle target as the first.
         'beam:schematransform:org.apache.beam:yaml:log_for_testing:v1',
         external.JavaJarExpansionService(
             subprocess_server.JavaJarServer.path_to_beam_jar(
                 gradle_target='sdks:java:extensions:sql:expansion-service:shadowJar',
                 artifact_id=None
             )
         ),
         rearrange_based_on_discovery=True
     )
   ```
   
   YAML example:
   ```
   pipeline:
     transforms:
       - type: Create
         name: table1
         config:
           elements:
             - name: "john"
               id: 1
       - type: Create
         name: table2
         config:
           elements:
             - name: "jane"
               id: 1
       - type: Sql
         name: Join
         input:
           i1: table1
           i2: table2
         config:
           query: "SELECT * FROM i1 INNER JOIN i2 ON i1.id = i2.id"
       - type: LogForTestingJava
         input: Sql
   
   # Force same gradle target variant of LogForTesting
   providers:
     - type: 'beamJar'
       config:
         gradle_target: 'sdks:java:extensions:sql:expansion-service:shadowJar'
       transforms:
         LogForTestingJava: 'beam:schematransform:org.apache.beam:yaml:log_for_testing:v1'
   ```

