aljoscha opened a new pull request #14312:
URL: https://github.com/apache/flink/pull/14312


   Right now, we don't support `DataStream.connect(BroadcastStream)` in `BATCH` 
execution mode. I believe we can add support for this with not too much work.
   
   The key insight is that we can process the broadcast side before the 
non-broadcast side. Initially, we were shying away from this because of 
concerns about `ctx.applyToKeyedState()` which allows the broadcast side of the 
user function to access/iterate over state from the keyed side. We thought that 
we couldn't support this. However, since we know that we process the broadcast 
side first we know that the keyed side will always be empty when doing so. We 
can thus just make this "keyed iteration" call a no-op, instead of throwing an 
exception as we do now.
   
   This is work-in-progress. I have yet to add tests.
   
   ## Brief change log
   
   * turn broadcast operation into a logical `Transformation`
   * add `BATCH` operators for broadcast
   * add translators
   
   ## Verifying this change
   
   To be done.
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): no
     - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
     - The serializers: no
     - The runtime per-record code paths (performance sensitive): no
     - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: no
     - The S3 file system connector: no
   
   ## Documentation
   
     - Does this pull request introduce a new feature? maybe
   
   We need to remove the caveat from the documentation. 🎊 
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to