[jira] [Commented] (HDFS-5276) FileSystem.Statistics got performance issue on multi-thread read/write.

Binglin Chang (JIRA) Sat, 12 Oct 2013 01:29:02 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-5276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13793297#comment-13793297
 ]


Binglin Chang commented on HDFS-5276:
-------------------------------------

I ran a micro-benchmark on FileSystem.Statistics alone. The results seem great: 
it looks like thread-local counters carry very little performance penalty. 

Without patch: 
Thread  1, Time:       1107
Thread  2, Time:      11391
Thread  3, Time:      23813
Thread  4, Time:      37780

With patch:
Thread  1, Time:        901
Thread  2, Time:       1056
Thread  3, Time:       2473
Thread  4, Time:       2525
Thread  5, Time:       2689
Thread  6, Time:       2634
Thread  7, Time:       2938
Thread  8, Time:       3499
Thread  9, Time:       3551

My test environment (i7, 4 cores, 8 hardware threads with hyper-threading) should 
scale linearly up to 4 threads, so I don't know why we still see a 2x slowdown at 
3 and 4 threads. I don't have a test environment with more cores; maybe 
[~chengxiang li] can provide more results?

Test code is attached.
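
For readers without the attachment, here is a minimal sketch of the two counting 
strategies being compared. It is neither the attached HDFSStatisticTest.java nor 
the actual patch: StatCounterSketch, SharedCounter, PerThreadCounter and run() are 
hypothetical names, and the per-thread design (one cell per thread, summed on read) 
is only my assumption of how a thread-local counter could look. There is no JIT 
warm-up, so treat any timings it prints as rough.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicLong;

public class StatCounterSketch {

    interface Counter {
        void increment(long n);
        long total();
    }

    /** Strategy 1: every thread updates one shared AtomicLong (contended). */
    static final class SharedCounter implements Counter {
        private final AtomicLong bytesRead = new AtomicLong();

        public void increment(long n) { bytesRead.getAndAdd(n); }

        public long total() { return bytesRead.get(); }
    }

    /** Strategy 2 (assumed design): one cell per thread, summed on read. */
    static final class PerThreadCounter implements Counter {
        private static final class Cell { volatile long value; }

        // Every cell ever created, so a reader can aggregate across threads.
        private final List<Cell> cells = new CopyOnWriteArrayList<>();

        private final ThreadLocal<Cell> local = ThreadLocal.withInitial(() -> {
            Cell c = new Cell();
            cells.add(c);
            return c;
        });

        public void increment(long n) {
            Cell c = local.get();
            c.value = c.value + n; // safe: each cell has exactly one writer thread
        }

        public long total() {
            long sum = 0;
            for (Cell c : cells) sum += c.value;
            return sum;
        }
    }

    /** Runs `threads` workers, each doing `opsPerThread` increments; returns elapsed ms. */
    static long run(Counter counter, int threads, int opsPerThread) {
        List<Thread> workers = new ArrayList<>();
        long start = System.nanoTime();
        for (int t = 0; t < threads; t++) {
            Thread w = new Thread(() -> {
                for (int i = 0; i < opsPerThread; i++) counter.increment(1);
            });
            workers.add(w);
            w.start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        int ops = 10_000_000;
        for (int threads = 1; threads <= 4; threads++) {
            long sharedMs = run(new SharedCounter(), threads, ops);
            long localMs = run(new PerThreadCounter(), threads, ops);
            System.out.printf("Thread %2d, shared: %6d ms, thread-local: %6d ms%n",
                    threads, sharedMs, localMs);
        }
    }
}
```

The trade-off in this sketch is that increments touch only thread-private state, 
while total() becomes more expensive because it walks every cell. That seems 
acceptable for statistics that are written far more often than they are read.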


> FileSystem.Statistics got performance issue on multi-thread read/write.
> -----------------------------------------------------------------------
>
>                 Key: HDFS-5276
>                 URL: https://issues.apache.org/jira/browse/HDFS-5276
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.0.4-alpha
>            Reporter: Chengxiang Li
>            Assignee: Colin Patrick McCabe
>         Attachments: DisableFSReadWriteBytesStat.patch, HDFS-5276.001.patch, 
> HDFS-5276.002.patch, HDFSStatisticTest.java, hdfs-test.PNG, jstack-trace.PNG, 
> ThreadLocalStat.patch
>
>
> FileSystem.Statistics is a singleton variable for each FS scheme, so each 
> read/write on HDFS leads to an AtomicLong.getAndAdd() call. AtomicLong does 
> not perform well under many threads (say, more than 30), so it may 
> cause a serious performance issue. In our Spark test profile, with 32 threads 
> reading data from HDFS, about 70% of CPU time was spent in 
> FileSystem.Statistics.incrementBytesRead().



--
This message was sent by Atlassian JIRA
(v6.1#6144)
