je-ik commented on issue #18479:
URL: https://github.com/apache/beam/issues/18479#issuecomment-3864109181

   Hi @junaiddshaukat, great to hear about your interest. It's also cool that 
you have such deep insight into the Beam ecosystem. Maybe we can draft the 
design document for the GSoC together? Generally speaking, my view is that:
    1) the skeleton should target the FnAPI so that it fully supports all SDKs 
(it should be possible to optimize the translation for the Java SDK later)
    2) the skeleton should be "useful", i.e. it should be possible to run at 
least some basic Pipelines, which means we should aim to implement at least:
     - Read
     - stateless ParDo
     - GBK
     - CBK
     - Window
     - optional: stateful ParDo, later splittable DoFn
    3) this is a crucial question that needs deeper analysis to make sure that 
the DSL aligns correctly with the Apache Beam model. My intuition is that the 
Processor API would be the better choice, because it is flexible enough to 
support the model, yet it should not force us to manage state manually, etc.
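
   To make point 2 concrete, here is a rough, dependency-free sketch of what 
such a basic Pipeline computes: a stateless ParDo, a fixed Window assignment, 
and a GBK followed by a per-key combine (a sum). It uses no Beam API; the names 
`par_do`, `fixed_window_start` and `windowed_sum` are made up for illustration, 
and timestamps are assumed to be integer seconds. It could serve as a reference 
result when checking the first translated Pipelines.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

# (key, value, event-time timestamp in seconds); this layout is an assumption.
Element = Tuple[str, int, int]


def par_do(elements: Iterable[Element],
           fn: Callable[[Element], Iterable[Element]]) -> List[Element]:
    """Stateless ParDo: each input element yields zero or more outputs."""
    out: List[Element] = []
    for element in elements:
        out.extend(fn(element))
    return out


def fixed_window_start(ts: int, size: int) -> int:
    """Start of the fixed window of length `size` that contains `ts`."""
    return ts - ts % size


def windowed_sum(elements: Iterable[Element],
                 size: int) -> Dict[Tuple[str, int], int]:
    """Window, then group by (key, window), then combine the values by sum."""
    grouped: Dict[Tuple[str, int], List[int]] = defaultdict(list)
    for key, value, ts in elements:
        grouped[(key, fixed_window_start(ts, size))].append(value)
    return {key_window: sum(values) for key_window, values in grouped.items()}


if __name__ == "__main__":
    source = [("a", 1, 0), ("a", 2, 5), ("a", 3, 12), ("b", 4, 3)]
    # ParDo that doubles every value and keeps key and timestamp unchanged.
    doubled = par_do(source, lambda e: [(e[0], e[1] * 2, e[2])])
    print(windowed_sum(doubled, size=10))
```

   A real runner would of course execute the ParDo through the FnAPI and keep 
the grouped values in a state store; the sketch only pins down the expected 
output.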
   
   We would also need to design the handling of watermarks, bundles (we must 
pay attention to Beam's guarantees and make the bundling compatible), and 
timers (if we support stateful ParDo).
   
   Would you like to try sketching a design document we could iterate on? Once 
we have a basic doc, we can share it on the dev@ list to get more feedback and 
make sure we don't hit a wall during implementation.

