[
https://issues.apache.org/jira/browse/HDFS-15202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17059985#comment-17059985
]
Danil Lipovoy commented on HDFS-15202:
--------------------------------------
[~quanli]
I ran 2 tests:
1. Via HBase. It was a massive read of huge tables:
8 tables by 64 regions by 1.88 Gb data in each = 900 Gb total
Random read in 100, 200, ... 500 threads via YCSB (a good utility made specifically
for DB performance testing)
It showed a +25% performance increase. Not that much, because the bottleneck
here is HBase itself.
2. Direct reading by function (I provided more details above):
FSDataInputStream in = fileSystem.open(path);
int res = in.read(position, byteBuffer, 0, 65536);
This function runs in a separate thread, and each thread reads a separate file of
~1 Gb. I ran 10, 20, ... 200 threads, which means 10, 20, ... 200 files were being
read. An important note: this performance is possible only with a fast SSD or when
the data is cached by the Linux buff/cache (my case). In this scenario the
performance increase is about +500%.
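The read pattern of test 2 can be sketched as a self-contained simulation. This is a minimal sketch, not the actual test harness: a byte[] stands in for each per-thread ~1 Gb HDFS file, and the class name, thread count, and sizes are illustrative assumptions. With a real cluster, each task would instead call fileSystem.open(path) and in.read(position, buf, 0, 65536) as shown above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch of the test-2 pattern: N threads, each issuing 64 KB positional
// reads against its own data source until EOF.
public class ParallelReadSketch {
  static final int CHUNK = 65536;

  // Positional read analogous to FSDataInputStream.read(pos, buf, off, len):
  // copies up to CHUNK bytes starting at position, returns -1 at EOF.
  static int pread(byte[] file, long position, byte[] buffer) {
    int len = (int) Math.min(CHUNK, file.length - position);
    if (len <= 0) {
      return -1;
    }
    System.arraycopy(file, (int) position, buffer, 0, len);
    return len;
  }

  // Reads the whole "file" in CHUNK-sized positional reads; returns bytes read.
  static long readAll(byte[] file) {
    byte[] buffer = new byte[CHUNK];
    long total = 0;
    long pos = 0;
    int n;
    while ((n = pread(file, pos, buffer)) > 0) {
      total += n;
      pos += n;
    }
    return total;
  }

  public static void main(String[] args) throws Exception {
    int threads = 10;                     // the test ran 10, 20, ... 200 threads
    int fileSize = 1 << 20;               // 1 MB stand-in for the ~1 Gb files
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    List<Future<Long>> results = new ArrayList<>();
    for (int t = 0; t < threads; t++) {
      byte[] file = new byte[fileSize];   // one separate "file" per thread
      results.add(pool.submit(() -> readAll(file)));
    }
    long grandTotal = 0;
    for (Future<Long> f : results) {
      grandTotal += f.get();
    }
    pool.shutdown();
    System.out.println(grandTotal);       // threads * fileSize bytes read in total
  }
}
```

Since every thread opens its own stream, with a single shared ShortCircuitCache all of them contend on the same cache lock, which is exactly the bottleneck the patch below addresses.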
Environment:
4 nodes E5-2698 v4 @ 2.20GHz (40 cores each), 700 Gb Mem.
> HDFS-client: boost ShortCircuit Cache
> -------------------------------------
>
> Key: HDFS-15202
> URL: https://issues.apache.org/jira/browse/HDFS-15202
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: dfsclient
> Environment: 4 nodes E5-2698 v4 @ 2.20GHz, 700 Gb Mem.
> 8 RegionServers (2 by host)
> 8 tables by 64 regions by 1.88 Gb data in each = 900 Gb total
> Random read in 800 threads via YCSB and a little bit updates (10% of reads)
> Reporter: Danil Lipovoy
> Assignee: Danil Lipovoy
> Priority: Minor
> Attachments: HDFS_CPU_full_cycle.png, cpu_SSC.png, cpu_SSC2.png,
> hdfs_cpu.png, hdfs_reads.png, hdfs_scc_3_test.png,
> hdfs_scc_test_full-cycle.png, locks.png, requests_SSC.png
>
>
> I want to propose a way to improve the reading performance of the HDFS client.
> The idea: create several instances of the ShortCircuit cache instead of one.
> The key points:
> 1. Create array of caches (set by
> clientShortCircuitNum=*dfs.client.short.circuit.num*, see in the pull
> requests below):
> {code:java}
> private ClientContext(String name, DfsClientConf conf, Configuration config) {
>   ...
>   shortCircuitCache = new ShortCircuitCache[this.clientShortCircuitNum];
>   for (int i = 0; i < this.clientShortCircuitNum; i++) {
>     this.shortCircuitCache[i] = ShortCircuitCache.fromConf(scConf);
>   }
> {code}
> 2. Then distribute blocks between the caches:
> {code:java}
> public ShortCircuitCache getShortCircuitCache(long idx) {
>   return shortCircuitCache[(int) (idx % clientShortCircuitNum)];
> }
> {code}
> 3. And how to call it:
> {code:java}
> ShortCircuitCache cache =
> clientContext.getShortCircuitCache(block.getBlockId());
> {code}
> The last digit of the block ID is evenly distributed from 0 to 9 - that's why
> all caches will fill approximately evenly.
> It is good for performance. The attachment below shows a load test reading
> HDFS via HBase with clientShortCircuitNum = 1 vs 3. We can see that
> performance grows by ~30%, with CPU usage up by about +15%.
> Hope it is interesting for someone.
> Ready to explain some unobvious things.
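The block-to-cache mapping described in the quoted issue can be sketched standalone. This is a minimal simulation under stated assumptions: the class name and the starting block ID are illustrative (HDFS allocates block IDs sequentially, which is what makes the modulo spread even), and it is not the actual patch code.

```java
// Sketch of the block-to-cache mapping from the proposal: consecutive
// block IDs land round-robin in the cache array via blockId % num.
public class CacheShardSketch {
  // Mirrors getShortCircuitCache(long idx) from the quoted snippet.
  static int cacheIndexFor(long blockId, int clientShortCircuitNum) {
    return (int) (blockId % clientShortCircuitNum);
  }

  public static void main(String[] args) {
    int num = 3;                      // dfs.client.short.circuit.num = 3
    int[] hits = new int[num];
    // Sequentially allocated block IDs rotate through the caches,
    // so each cache receives the same share of blocks.
    long firstBlockId = 1073741825L;  // illustrative starting block ID
    for (long id = firstBlockId; id < firstBlockId + 9000; id++) {
      hits[cacheIndexFor(id, num)]++;
    }
    for (int h : hits) {
      System.out.println(h);          // each cache gets exactly 3000
    }
  }
}
```

Because each cache has its own lock, sharding by block ID lets concurrent readers of different blocks proceed without contending on a single cache-wide lock.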
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]