lostluck commented on PR #25568:
URL: https://github.com/apache/beam/pull/25568#issuecomment-1438978316

   Ready for another look!
   
   > These may be my personal preferences, but I think the code would be easier 
to follow if `RunPipeline`, as the main function in the internal package, was 
defined in the beginning of execute.go, followed by the functions it uses 
(`runEnvironment`, `executePipeline`, and so on).
   > 
   
   Fair! I re-arranged a little before sending this out. That's the reverse of 
my usual preference (having things defined before they're used). But I agree 
that ordering is more approachable for reading how the runner works in this 
case. 
   
   > It might also make sense to move `stage` from execute.go to preprocess.go, 
or to its own file if you would prefer that. As it is now, `executePipeline()` 
in execute.go initializes `preprocessor` and invokes 
`preprocessor.preProcessGraph()`, which is defined in preprocess.go. Whereas 
preprocessor.preProcessGraph() initializes stage, which is defined in 
execute.go.
   
   This one is trickier. Preprocess.go only uses a single field in stage: the 
slice of transform IDs, everything else is used around this file, and is more 
central to execution. So it shouldn't live in preprocess.go IMO.
   
   Preprocess used to live in execute.go but the preprocessor, while simple 
now, will eventually handle stage fusion, and become more complicated, so it's 
worth splitting to it's own file for testing. A bunch of stuff should probably 
moved to the pre-processor too.
   
   But I've ended up moving it to its own file in the end, because it turns out 
it's most of the content? Some of it will need to become its own "handler", for 
handling messages to the FnAPI. I had clearly lost scope about how much code 
was unique to building and managing stages. Ultimately, it'll be a candidate to 
move to it's own package along with associated unit testing and documentation.
    


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to