wangshuo128 commented on issue #23355: [SPARK-26418][SHUFFLE] Only OpenBlocks without any ChunkFetch for one stream will cause memory leak in ExternalShuffleService
URL: https://github.com/apache/spark/pull/23355#issuecomment-448986131

Let me explain my problem in detail.

We use `YarnShuffleService` as an auxiliary service of the NodeManager in our cluster. Some NodeManagers were hitting full GCs. We dumped the heap and found that the map holding `StreamState` objects in `OneForOneStreamManager` took up 3 GB, almost 80% of the heap. Some applications had already finished, but their `StreamState`s were still in `OneForOneStreamManager`.

In the current code, the server creates a `StreamState` when it handles an `OpenBlocks` request, and associates that `StreamState` with a channel only when it handles the subsequent `ChunkFetchRequest`s.

I think two scenarios can cause this:

1. The `OpenBlocks` request is received and the `StreamState` is initialized on the server side. The transport-layer client is then lost, or even the whole executor is lost, so no `ChunkFetchRequest` is ever sent to the server for the stream.
2. The `OpenBlocks` request is received and the `StreamState` is initialized on the server side. `ChunkFetchRequest`s for the stream are sent to the server, but the server is under heavy load and cannot handle them before the timeout. The client then closes its connection in `TransportChannelHandler.userEventTriggered`.

Currently, the `OpenBlocks` request and the following `ChunkFetchRequest`s for a given stream are sent through the same `TransportClient` in `OneForOneBlockFetcher`. So I think associating the `StreamState` with the channel when handling the `OpenBlocks` request will be fine.
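To make the difference concrete, here is a minimal, self-contained sketch of the two behaviours. It is a simplified model, not Spark's actual `OneForOneStreamManager`: the class `StreamRegistrySketch`, its method names, and the use of plain string channel IDs in place of Netty channels are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical, simplified model of a stream registry; NOT Spark's real class. */
class StreamRegistrySketch {

    /** Per-stream state. The channel is null until the stream is bound to one. */
    static final class StreamState {
        final String appId;
        volatile String associatedChannel; // stand-in for a Netty Channel

        StreamState(String appId, String associatedChannel) {
            this.appId = appId;
            this.associatedChannel = associatedChannel;
        }
    }

    private final AtomicLong nextStreamId = new AtomicLong();
    private final Map<Long, StreamState> streams = new ConcurrentHashMap<>();

    /** Models the described current behaviour: OpenBlocks creates unbound state. */
    long registerStreamUnbound(String appId) {
        long id = nextStreamId.getAndIncrement();
        streams.put(id, new StreamState(appId, null));
        return id;
    }

    /** Models late binding: the channel is recorded only when a chunk fetch arrives. */
    void bindOnChunkFetch(long streamId, String channel) {
        StreamState state = streams.get(streamId);
        if (state != null) {
            state.associatedChannel = channel;
        }
    }

    /** Models the proposal: bind the channel as soon as OpenBlocks is handled. */
    long registerStream(String appId, String channel) {
        long id = nextStreamId.getAndIncrement();
        streams.put(id, new StreamState(appId, channel));
        return id;
    }

    /** On connection close, drop every stream bound to that channel. */
    void connectionTerminated(String channel) {
        streams.entrySet().removeIf(e -> channel.equals(e.getValue().associatedChannel));
    }

    int size() {
        return streams.size();
    }

    public static void main(String[] args) {
        StreamRegistrySketch registry = new StreamRegistrySketch();

        // Scenario 1 or 2: OpenBlocks handled, no chunk fetch ever processed.
        registry.registerStreamUnbound("app-1");
        registry.connectionTerminated("channel-A");
        System.out.println("late binding, after close: " + registry.size());

        // Proposed: the state is bound at OpenBlocks time, so close cleans it up.
        registry.registerStream("app-2", "channel-B");
        registry.connectionTerminated("channel-B");
        System.out.println("early binding, after close: " + registry.size());
    }
}
```

In this model the unbound entry survives `connectionTerminated` because there is nothing linking it to the closed channel, while the entry registered with its channel is removed. That is the leak and the fix described above, under the stated simplifications.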
