jianghuazhu commented on PR #3806: URL: https://github.com/apache/hadoop/pull/3806#issuecomment-1136659630
Thanks @ZanderXu for following up. Here are some explanations:

1. The main job of FsDatasetAsyncDiskService is to delete replica files, either synchronously or asynchronously. The replica files to be deleted are all on the local DataNode, so their number is limited. Although the thread pool uses an unbounded queue, tasks do not pile up indefinitely, because the queue is continuously consumed. These replicas have also already been loaded into memory while the DataNode is running, so the probability of an OOM here is very low.
2. When a replica is deleted asynchronously, the thread pool is started. Before this change, each disk has its own thread pool, and each pool runs at most 4 threads; that limit is hard-coded. In our cluster, DataNodes have different numbers of disks: 12, 36, and 60. Take a DataNode with 36 or 60 disks as an example: during peak hours it has to start a large number of threads. Making the number of threads adjustable will reduce the load on the DataNode.

-- 
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
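The per-disk thread pool arrangement described in the comment can be illustrated with a minimal Java sketch. This is not the actual FsDatasetAsyncDiskService code: the class `PerVolumeDeletePools`, its methods, and the choice to set the core pool size equal to the maximum are all assumptions made for illustration. It shows one pool per volume with an unbounded queue and a configurable thread limit, plus the simple upper bound on deletion threads (disks × threads per disk) that motivates making the limit adjustable.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch (not the Hadoop class): one deletion pool per volume. */
class PerVolumeDeletePools {
  private final int maxThreadsPerVolume;
  private final Map<String, ThreadPoolExecutor> pools = new ConcurrentHashMap<>();

  PerVolumeDeletePools(int maxThreadsPerVolume) {
    if (maxThreadsPerVolume < 1) {
      throw new IllegalArgumentException("maxThreadsPerVolume must be >= 1");
    }
    this.maxThreadsPerVolume = maxThreadsPerVolume;
  }

  /** Upper bound on deletion threads if every per-disk pool is fully busy. */
  static int worstCaseThreads(int disks, int threadsPerDisk) {
    return disks * threadsPerDisk;
  }

  /** Lazily creates the pool for a volume; the queue is unbounded. */
  private ThreadPoolExecutor poolFor(String volume) {
    return pools.computeIfAbsent(volume, v -> {
      // Assumption: core size == max size, so the pool can actually reach the limit
      // even though an unbounded queue never rejects a task.
      ThreadPoolExecutor p = new ThreadPoolExecutor(
          maxThreadsPerVolume, maxThreadsPerVolume,
          60L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
      p.allowCoreThreadTimeOut(true); // idle threads exit after 60 seconds
      return p;
    });
  }

  /** Queues a deletion task on the pool that belongs to the given volume. */
  void deleteAsync(String volume, Runnable deleteTask) {
    poolFor(volume).execute(deleteTask);
  }

  int volumeCount() {
    return pools.size();
  }

  /** Stops accepting tasks and waits for queued ones; false on timeout or interrupt. */
  boolean shutdownAndWait(long timeoutSeconds) {
    pools.values().forEach(ThreadPoolExecutor::shutdown);
    long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(timeoutSeconds);
    try {
      for (ThreadPoolExecutor p : pools.values()) {
        long remaining = Math.max(0L, deadline - System.nanoTime());
        if (!p.awaitTermination(remaining, TimeUnit.NANOSECONDS)) {
          return false;
        }
      }
      return true;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public static void main(String[] args) {
    // Disk counts from the comment, with the fixed limit of 4 threads per disk.
    for (int disks : new int[] {12, 36, 60}) {
      System.out.println(disks + " disks -> up to " + worstCaseThreads(disks, 4) + " threads");
    }
  }
}
```

With the fixed limit of 4, the bound is 48, 144, and 240 threads for 12, 36, and 60 disks respectively. Lowering the per-volume limit on the larger machines shrinks that bound proportionally, which is the kind of adjustment the comment argues for.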
