Fenghua Hu created HDFS-10690: --------------------------------- Summary: Optimize insertion/removal of replica in ShortCircuitCache.java Key: HDFS-10690 URL: https://issues.apache.org/jira/browse/HDFS-10690 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Reporter: Fenghua Hu
Currently in ShortCircuitCache, two TreeMap objects are used to track the cached replicas. private final TreeMap<Long, ShortCircuitReplica> evictable = new TreeMap<>(); private final TreeMap<Long, ShortCircuitReplica> evictableMmapped = new TreeMap<>(); TreeMap employs Red-Black tree for sorting. This isn't an issue when using traditional HDD. But when using high-performance SSD/PCIe Flash, the cost inserting/removing an entry becomes considerable. To mitigate it, we designed a new list-based for replica tracking. The list is a double-linked FIFO. FIFO is time-based, thus insertion is a very low cost operation. On the other hand, list is not lookup-friendly. To address this issue, we introduce two references into ShortCircuitReplica object. ShortCircuitReplica next = null; ShortCircuitReplica prev = null; In this way, lookup is not needed when removing a replica from the list. We only need to modify its predecessor's and successor's references in the lists. Our tests showed up to 15-50% performance improvement when using PCIe flash as storage media. The original patch is against 2.6.4, now I am porting to Hadoop trunk, and patch will be posted soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332) --------------------------------------------------------------------- To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org