pskevin commented on pull request #12445:
URL: https://github.com/apache/beam/pull/12445#issuecomment-669026718


   I've addressed the initial comments with the exception of ones suggesting 
organizational changes. In general this discussion will dictate where the final 
(and cleaner) API will reside and how the user will make use of it. Since I 
won't be the stakeholder for this feature after a few weeks, I thought it'd be 
worthwhile bringing in more opinions – @lostluck @robertwb @youngoli 
   
   The challenges and possible solutions are as follows:
   
   - Where should all xlang code be?\
   There are multiple moving parts related to the execution of cross-language 
transforms which are now condensed into one `TryCrossLanguage` function in 
`external.go`  
   
        - Allowing various combinations of inputs/outputs (n:n, 1:0, 0:1)
        - Correctly adding `ExternalTransform` to `Graph` representing the 
unexpanded cross-language transform within the pipeline
        - Building the `ExpansionRequest`
        - Querying the expansion service for the corresponding 
`ExpansionResponse`
        - Querying the expansion service for Artifact Resolution and Staging
        - Handling the expanded components of the `ExternalTransform`
   
        Decomposing these functions correctly within package(s) dedicated to 
`ExternalTransform` whether as `beam/.../xlangx` or `beam/.../xlang` or both 
will be ideal. It helps in segregation of responsibilities so that each 
function can be developed to maturity independent of each other. I can then 
make incremental progress on different fronts without compromising overall 
functionality that this PR has already achieved.  
        Futhermore, `external.go` can then use the package(s) to expose generic 
APIs such as `beam.CrossLanguage` (for n:n inputs/outputs), 
`beam.CrossLanguageAsSource` (for 0:1 inputs/outputs) and 
`beam.CrossLanguageAsSink` (for 1:0 inputs/outputs).  
        In the future, when the existing API is to be deprecated, these 
functions could then be renamed to their `beam.ExternalTransform`... 
counterparts. The exisiting API could be renamed to 
`beam.LegacyExternalTransform` and implemented using the API that exists as the 
outcome of this PR (since it is already backwards compatible for the most 
part).  
   
        @lostluck already pointed to a possible segregation by removing 
pipeline handling code within `universal.go` into `graphx/xlang.go` to be used 
in `graphx/translate.go`. I could explore similar options in alignment to the 
results of this discussion.
   
   - How is `ExternalTransform` associated to pipeline construction?
        - Adding to `Graph` and/or `MultiEdge`   
        Currently, `graph.Opcode` for external transforms already exists (with 
value as `"External"`) and is used in conjunction with `graph.Payload` by the 
existing external transform API. Adding `ExternalTransform` to `Graph` will 
pull in proto and grpc dependencies within `beam/core/graph` regardless of 
whether the pipeline contains any cross-language transform or not. Adding 
`ExternalTransform` to `MultiEdge` will need additional logic to distinguish 
between an external transform using the proposed v/s current API.  
        One scanario in which  I believe this method could be useful is if 
correct pipeline construction required explicit handling of `ExternalTransform` 
when each `MultiEdge` is being added during the `graphx.Marshal` call. As of 
now I don't feel that is the case.
   
        - Augmenting `Pipeline` and `Scope`  
        As is implemented in this PR, `ExpandedTransforms 
map[string]ExternalTransform` field is added to `Pipeline`. The key is the 
`MultiEdge`'s ID that represents an external transform within `Graph`. Using 
this, during pipeline construction, the Go proto representing this transform is 
swapped with the expanded `ExternalTransform`. All of the components associated 
with the expanded `ExternalTransform` are also added to the pipeline proto. 
This method does not conflict with any pipeline representation/construction 
code for strictly Go pipelines and may be more suitable for the future.  
   
                An existing problem worth calling out is the necessity to pass 
in the `Pipeline` reference to `beam.CrossLanguage` to add the new 
`ExternalTransform` in the `ExpandedTransforms` map. A possible solution to 
this would be:
                1. Adding `func (p *Pipeline) RegisterExternalTransform(k 
string, e *ExternalTransform)` function to `Pipeline`
                2. Adding `AddMapEntry func(string, *ExternalTransform)` field 
to `Scope`
                3. Changing `func (p *Pipeline) Root() Scope` to ` return 
Scope{scope: p.real.Root(), real: p.real, AddMapEntry: 
p.RegisterExternalTransform}`  
   
                `beam.CrossLanguage` should then be able to use 
`s.AddMapEntry(key, externalTransform)` instead of the current implementation.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to