Fenghua Hu created HDFS-10690:
---------------------------------

             Summary: Optimize insertion/removal of replica in 
ShortCircuitCache.java
                 Key: HDFS-10690
                 URL: https://issues.apache.org/jira/browse/HDFS-10690
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: hdfs-client
            Reporter: Fenghua Hu


Currently in ShortCircuitCache, two TreeMap objects are used to track the 
cached replicas.

private final TreeMap<Long, ShortCircuitReplica> evictable = new TreeMap<>();
private final TreeMap<Long, ShortCircuitReplica> evictableMmapped = new 
TreeMap<>();

TreeMap employs Red-Black tree for sorting. This isn't an issue when using 
traditional HDD. But when using high-performance SSD/PCIe Flash, the cost 
inserting/removing an entry  becomes considerable.

To mitigate it, we designed a new list-based for replica tracking.

The list is a double-linked FIFO. FIFO is time-based, thus insertion is a very 
low cost operation. On the other hand, list is not lookup-friendly. To address 
this issue, we introduce two references into ShortCircuitReplica object.

ShortCircuitReplica next = null;
ShortCircuitReplica prev = null;

In this way, lookup is not needed when removing a replica from the list. We 
only need to modify its predecessor's and successor's references in the lists.

Our tests showed up to 15-50% performance improvement when using PCIe flash as 
storage media.

The original patch is against 2.6.4, now I am porting to Hadoop trunk, and 
patch will be posted soon.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

Reply via email to