StephanEwen opened a new pull request #13400:
URL: https://github.com/apache/flink/pull/13400


   ## What is the purpose of the change
   
   Currently, the method `SplitReader.handleSplitsChanges()` gets passed a 
queue of split changes to handle. The method may decide to handle all of them, 
or only a subset of them. If it handles only a subset, the method is later 
invoked again with the remaining changes.
   
   In practice, this ends up being confusing and problematic:
   
     - It is important to remove the elements from the queue. If you just 
iterate over the queue without removing the changes, the splits will get 
handled multiple times.
     - If the queue is not left empty, the task to handle the changes is 
immediately re-enqueued and the method immediately re-invoked. No other 
operation can happen before all split changes from the queue are handled. That 
means you still have to handle all splits immediately; the work is merely 
spread out over multiple method invocations.
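
The two pitfalls above can be reproduced with a small simulation. This is only a sketch: `QueueContractDemo`, `drive`, `iterateOnly`, and `drainAll` are hypothetical names, the changes are plain strings, and the loop merely imitates the "re-invoke while the queue is non-empty" behavior described here. It is not the actual fetcher code, and the `maxCalls` cap exists only so the demo terminates.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Consumer;

// Hypothetical simulation of the queue-based contract; not Flink code.
final class QueueContractDemo {

    // Re-invokes the handler for as long as the queue is non-empty.
    // maxCalls is an assumption added so a non-draining handler cannot loop forever.
    static int drive(Queue<String> changes, Consumer<Queue<String>> handler, int maxCalls) {
        int calls = 0;
        while (!changes.isEmpty() && calls < maxCalls) {
            handler.accept(changes);
            calls++;
        }
        return calls;
    }

    // Buggy handler: iterates over the queue but never removes anything.
    static Consumer<Queue<String>> iterateOnly(List<String> handled) {
        return queue -> {
            for (String change : queue) {
                handled.add(change);
            }
        };
    }

    // Correct handler under the old contract: polls every element off the queue.
    static Consumer<Queue<String>> drainAll(List<String> handled) {
        return queue -> {
            String change;
            while ((change = queue.poll()) != null) {
                handled.add(change);
            }
        };
    }

    public static void main(String[] args) {
        List<String> seen = new ArrayList<>();
        int calls = drive(new ArrayDeque<>(List.of("add-A", "add-B")), iterateOnly(seen), 3);
        System.out.println(calls + " calls, handled " + seen.size()); // 3 calls, handled 6

        seen.clear();
        calls = drive(new ArrayDeque<>(List.of("add-A", "add-B")), drainAll(seen), 3);
        System.out.println(calls + " calls, handled " + seen.size()); // 1 calls, handled 2
    }
}
```

With the non-draining handler, every re-invocation sees the same two changes again, so they are handled once per call until the cap is hit.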
   
   A simpler contract would be to pass the split changes (list of splits) 
directly, and only once.
   The fetcher would pick those changes up and can internally stash them if it 
wants to process them later.
   For all source implementations we have written so far, this was sufficient and easier.
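
A minimal sketch of how a reader might stash changes under the simpler contract. The types here are simplified stand-ins, not the Flink classes: `StashDemo`, `StashingReader`, `nextSplit()`, and `pendingCount()` are hypothetical, and this `SplitsChange` only models "a list of splits to add".

```java
import java.util.ArrayDeque;
import java.util.List;
import java.util.Queue;

// Hypothetical stand-ins for illustration; these are NOT the Flink classes.
final class StashDemo {

    // Simplified stand-in for a split change: just the splits to add.
    static final class SplitsChange<SplitT> {
        private final List<SplitT> splits;

        SplitsChange(List<SplitT> splits) {
            this.splits = List.copyOf(splits);
        }

        List<SplitT> splits() {
            return splits;
        }
    }

    // Reader that receives each change exactly once and stashes its splits.
    static final class StashingReader<SplitT> {
        private final Queue<SplitT> pending = new ArrayDeque<>();

        // New-style contract: called once per change, no queue to drain.
        void handleSplitsChanges(SplitsChange<SplitT> splitsChanges) {
            pending.addAll(splitsChanges.splits());
        }

        // Hypothetical helper: take the next stashed split, or null if none is left.
        SplitT nextSplit() {
            return pending.poll();
        }

        int pendingCount() {
            return pending.size();
        }
    }

    public static void main(String[] args) {
        StashingReader<String> reader = new StashingReader<>();
        reader.handleSplitsChanges(new SplitsChange<>(List.of("split-0", "split-1")));
        // Prints "split-0, pending=1": one split taken now, one kept for later.
        System.out.println(reader.nextSplit() + ", pending=" + reader.pendingCount());
    }
}
```

The reader decides when to process what it stashed; the caller no longer needs to re-invoke the method until a shared queue is empty.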
   
   ## Brief change log
   
     - Change signature of `void 
handleSplitsChanges(Queue<SplitsChange<SplitT>> splitsChanges)`
       to `void handleSplitsChanges(SplitsChange<SplitT> splitsChanges);`
     - Adjust invoking and testing code
   
   ## Verifying this change
   
   This change is a simple rework already covered by existing tests.
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): **no**
     - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: **no**
     - The serializers: **no**
     - The runtime per-record code paths (performance sensitive): **no**
     - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: **no**
     - The S3 file system connector: **no**
   
   ## Documentation
   
     - Does this pull request introduce a new feature? **no**
     - If yes, how is the feature documented? **not applicable**
   



