[jira] [Updated] (HDFS-10690) Optimize insertion/removal of replica in ShortCircuitCache
[ https://issues.apache.org/jira/browse/HDFS-10690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Wang updated HDFS-10690:
-------------------------------
    Fix Version/s: 3.0.0-alpha2

> Optimize insertion/removal of replica in ShortCircuitCache
> ----------------------------------------------------------
>
>                 Key: HDFS-10690
>                 URL: https://issues.apache.org/jira/browse/HDFS-10690
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>    Affects Versions: 3.0.0-alpha2
>            Reporter: Fenghua Hu
>            Assignee: Fenghua Hu
>             Fix For: 2.8.0, 3.0.0-alpha2
>
>         Attachments: HDFS-10690.001.patch, HDFS-10690.002.patch,
> HDFS-10690.003.patch, HDFS-10690.004.patch, HDFS-10690.005.patch,
> HDFS-10690.006.patch, HDFS-10690.007.patch, HDFS-10690.008.patch,
> ShortCircuitCache_LinkedMap.patch
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> Currently in ShortCircuitCache, two TreeMap objects are used to track the
> cached replicas:
>     private final TreeMap evictable = new TreeMap<>();
>     private final TreeMap evictableMmapped = new TreeMap<>();
> TreeMap uses a red-black tree to keep its entries sorted, so every insertion
> or removal costs O(log n). This isn't an issue with traditional HDDs, but
> with high-performance SSD/PCIe flash the cost of inserting/removing an entry
> becomes considerable.
> To mitigate it, we designed a new list-based structure for replica tracking.
> The list is a doubly linked FIFO. Because the FIFO is ordered by insertion
> time, a new entry is simply appended, which is a very low-cost operation.
> On the other hand, a list is not lookup-friendly. To address this, we
> introduce two references into the ShortCircuitReplica object:
>     ShortCircuitReplica next = null;
>     ShortCircuitReplica prev = null;
> This way, no lookup is needed when removing a replica from the list. We only
> need to update the references held by its predecessor and successor.
> Our tests showed a 15-50% performance improvement when using PCIe flash as
> the storage medium.
> The original patch is against 2.6.4; I am now porting it to Hadoop trunk and
> will post the patch soon.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

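The list-based tracking described in the issue above can be illustrated with a minimal sketch. This is not the committed patch: `Replica` and `EvictionList` are hypothetical stand-ins, and only the `next`/`prev` fields are taken from the issue text. It shows why storing the links inside the tracked object lets both append and removal run in constant time without any lookup.

```java
import java.util.ArrayList;
import java.util.List;

class IntrusiveFifoSketch {
  /** Hypothetical stand-in for ShortCircuitReplica; only next/prev come from the issue. */
  static final class Replica {
    final String name;
    Replica next = null;
    Replica prev = null;

    Replica(String name) {
      this.name = name;
    }
  }

  /** Hypothetical doubly linked FIFO: oldest entry at the head, newest at the tail. */
  static final class EvictionList {
    private Replica head;
    private Replica tail;
    private int size;

    /** Appends at the tail in O(1); insertion order doubles as time order. */
    void addLast(Replica r) {
      r.prev = tail;
      r.next = null;
      if (tail == null) {
        head = r;
      } else {
        tail.next = r;
      }
      tail = r;
      size++;
    }

    /** Unlinks r in O(1). The caller must ensure r is currently in this list. */
    void remove(Replica r) {
      if (r.prev == null) {
        head = r.next;
      } else {
        r.prev.next = r.next;
      }
      if (r.next == null) {
        tail = r.prev;
      } else {
        r.next.prev = r.prev;
      }
      r.prev = null;
      r.next = null;
      size--;
    }

    /** Removes and returns the oldest entry, or null if the list is empty. */
    Replica pollFirst() {
      Replica r = head;
      if (r != null) {
        remove(r);
      }
      return r;
    }

    int size() {
      return size;
    }

    /** Returns the names from oldest to newest, for inspection only. */
    List<String> names() {
      List<String> out = new ArrayList<>();
      for (Replica r = head; r != null; r = r.next) {
        out.add(r.name);
      }
      return out;
    }
  }

  public static void main(String[] args) {
    EvictionList list = new EvictionList();
    Replica a = new Replica("a");
    Replica b = new Replica("b");
    Replica c = new Replica("c");
    list.addLast(a);
    list.addLast(b);
    list.addLast(c);

    list.remove(b); // no search: b already knows its neighbours
    System.out.println(list.names()); // prints [a, c]
    System.out.println(list.pollFirst().name); // prints a
    System.out.println(list.size()); // prints 1
  }
}
```

The trade-off in this sketch is that each object can sit in only one such list at a time, and removing an object that is not in the list would corrupt it, so membership has to be tracked by the caller.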
[jira] [Updated] (HDFS-10690) Optimize insertion/removal of replica in ShortCircuitCache
[ https://issues.apache.org/jira/browse/HDFS-10690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiaoyu Yao updated HDFS-10690:
------------------------------
       Resolution: Fixed
     Hadoop Flags: Reviewed
    Fix Version/s: 2.8.0
           Status: Resolved  (was: Patch Available)

Thanks [~fenghua_hu] for the contribution and all the reviewers for the discussion. I've committed the patch to trunk, branch-2 and branch-2.8.

[jira] [Updated] (HDFS-10690) Optimize insertion/removal of replica in ShortCircuitCache
[ https://issues.apache.org/jira/browse/HDFS-10690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiaoyu Yao updated HDFS-10690:
------------------------------
    Summary: Optimize insertion/removal of replica in ShortCircuitCache  (was: Optimize insertion/removal of replica in ShortCircuitCache.java)