becketqin opened a new pull request, #25569:
URL: https://github.com/apache/flink/pull/25569

   ## What is the purpose of the change
   This patch fixes an issue introduced in #25130. In 
`SplitFetcherManager.close()`, the element queue draining thread was chaining 
runnables to the element queue availability future in a tight loop. This 
causes problems (e.g. OOM, high CPU utilization) when the fetcher threads do 
not shut down quickly.
   
   This patch changes the tight async loop to a blocking loop.
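
The difference between the two loop styles can be illustrated with a minimal, self-contained sketch. It is not the Flink code: `DrainLoopSketch`, `chainInTightLoop`, and `drainBlocking` are hypothetical names, and a plain `BlockingQueue` stands in for the element queue. The first method shows how every `thenRun` call on a not-yet-completed `CompletableFuture` registers another pending callback; the second shows a loop that blocks with a timeout instead.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class DrainLoopSketch {

    /**
     * Anti-pattern (hypothetical reproduction): chain a runnable onto the same
     * incomplete future on every iteration. Each call registers one more
     * pending callback, so memory grows for as long as the loop spins.
     */
    static int chainInTightLoop(CompletableFuture<Void> availability, int iterations) {
        for (int i = 0; i < iterations; i++) {
            availability.thenRun(() -> { /* would drain the queue here */ });
        }
        // Estimated number of callbacks still waiting on the future.
        return availability.getNumberOfDependents();
    }

    /**
     * Blocking alternative (assumption: a BlockingQueue stands in for the
     * element queue). The thread parks in poll() for up to 10 ms per
     * iteration, so it neither spins nor accumulates callbacks.
     */
    static int drainBlocking(BlockingQueue<Integer> elements, AtomicBoolean fetchersRunning) {
        int drained = 0;
        try {
            while (fetchersRunning.get() || !elements.isEmpty()) {
                Integer next = elements.poll(10, TimeUnit.MILLISECONDS);
                if (next != null) {
                    drained++;
                }
            }
        } catch (InterruptedException e) {
            // Restore the interrupt flag and stop draining.
            Thread.currentThread().interrupt();
        }
        return drained;
    }

    public static void main(String[] args) throws Exception {
        CompletableFuture<Void> neverCompleted = new CompletableFuture<>();
        System.out.println("pending callbacks: " + chainInTightLoop(neverCompleted, 1000));

        BlockingQueue<Integer> elements = new LinkedBlockingQueue<>();
        AtomicBoolean fetchersRunning = new AtomicBoolean(true);
        // Simulated fetcher: emits a few elements, then reports shutdown.
        Thread fetcher = new Thread(() -> {
            for (int i = 0; i < 5; i++) {
                elements.add(i);
            }
            fetchersRunning.set(false);
        });
        fetcher.start();
        System.out.println("drained: " + drainBlocking(elements, fetchersRunning));
        fetcher.join();
    }
}
```

In the sketch, the blocking loop re-checks the shutdown flag at most once per timeout interval, so its cost stays bounded no matter how long the simulated fetcher takes to stop.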
   
   ## Brief change log
   This patch changes the tight async loop to a blocking loop.
   
   ## Verifying this change
   This change adds a unit test. However, that unit test tests the 
implementation rather than the behavior. Given that we are fixing an 
internal implementation issue, this might be necessary.
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): (no)
     - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
     - The serializers: (no)
     - The runtime per-record code paths (performance sensitive): (no)
     - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
     - The S3 file system connector: (no)
   
   ## Documentation
   
     - Does this pull request introduce a new feature? (no)
     - If yes, how is the feature documented? (not applicable)
   

