Shangshu Qian created HDFS-17660:
------------------------------------

             Summary: HDFS cache commands should be throttled to avoid contention with the write pipeline
                 Key: HDFS-17660
                 URL: https://issues.apache.org/jira/browse/HDFS-17660
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: caching
    Affects Versions: 2.10.2, 3.4.0
            Reporter: Shangshu Qian
We found a potential feedback loop between the HDFS write pipeline and the block caching commands. Currently, there is no throttling on the number of cache commands generated for each heartbeat (HB) reply, unlike the block replication commands, which are throttled by `dfs.namenode.replication.work.multiplier.per.iteration`.

The positive feedback loop can be described as follows:

# When there is a high write workload on a DN, IOExceptions may be thrown in the write pipeline, causing more IncrementalBlockReports (IBRs) to be sent to the NN.
# The IBRs contend with HB handling and cache command generation on the NN, because they are all part of the HB handling logic.
# When a DN's heartbeat is delayed, `CacheReplicationMonitor.chooseDatanodesForCaching` may take longer to iterate through the DNs, because some DNs are temporarily unavailable due to the HB delays. Some cached blocks can also become temporarily unavailable, and the NN must generate commands for those blocks again, which further slows cache command generation for each HB.
# The extra cache commands generated place extra load on the DNs, making them more vulnerable to IOExceptions in the write pipeline.

Adding throttling similar to the one in `BlockManager.computeDatanodeWork` would make this feedback loop less likely to occur.
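To illustrate the idea, here is a minimal sketch of a per-heartbeat cap on cache commands, modeled loosely on the replication-work multiplier. The class, method names, and the multiplier semantics are hypothetical for illustration; they are not the actual HDFS implementation or a proposed patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: cap how many cache commands go into one HB reply.
// Commands beyond the cap stay pending and would be re-issued on a later
// heartbeat, so bursty cache work cannot monopolize heartbeat handling.
public class CacheCommandThrottle {
    // Hypothetical analogue of
    // dfs.namenode.replication.work.multiplier.per.iteration,
    // interpreted here as a per-DN, per-heartbeat command budget.
    private final int maxCommandsPerHeartbeat;

    public CacheCommandThrottle(int maxCommandsPerHeartbeat) {
        this.maxCommandsPerHeartbeat = maxCommandsPerHeartbeat;
    }

    /** Return at most maxCommandsPerHeartbeat commands; the rest remain pending. */
    public <T> List<T> throttle(List<T> pendingCommands) {
        if (pendingCommands.size() <= maxCommandsPerHeartbeat) {
            return new ArrayList<>(pendingCommands);
        }
        return new ArrayList<>(pendingCommands.subList(0, maxCommandsPerHeartbeat));
    }

    public static void main(String[] args) {
        CacheCommandThrottle throttle = new CacheCommandThrottle(3);
        List<Integer> pending = List.of(1, 2, 3, 4, 5);
        List<Integer> sent = throttle.throttle(pending);
        // Only 3 of the 5 pending commands are sent this heartbeat.
        System.out.println("sent=" + sent.size());
    }
}
```

The point of the cap is that cache command generation becomes bounded per heartbeat regardless of backlog size, breaking step 4 of the loop above.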