[ 
https://issues.apache.org/jira/browse/SPARK-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14241973#comment-14241973
 ] 

Saisai Shao edited comment on SPARK-4740 at 12/11/14 1:34 AM:
--------------------------------------------------------------

Hi Reynold, the code I pasted was just an example; I do convert the ByteBuffer 
to a Netty ByteBuf. We will test again with your patch to see the difference.

When we tested on a RAM disk to minimize the disk effect, the performance of 
Netty was similar to that of NIO. Can we say that the current implementation of 
the Netty transfer service is not well tuned for slow random-IO devices such as 
spinning disks? Also, from my understanding, the imbalance might be introduced 
by random disk IO latency: SSDs and RAM disks far outperform spinning disks in 
random seeks, which minimizes the variance in IO latency, and the only 
difference in system configuration between the two runs is the switch from 
spinning disk to RAM disk.

So what is your opinion?


> Netty's network throughput is about 1/2 of NIO's in spark-perf sortByKey
> ------------------------------------------------------------------------
>
>                 Key: SPARK-4740
>                 URL: https://issues.apache.org/jira/browse/SPARK-4740
>             Project: Spark
>          Issue Type: Improvement
>          Components: Shuffle, Spark Core
>    Affects Versions: 1.2.0
>            Reporter: Zhang, Liye
>            Assignee: Reynold Xin
>         Attachments: (rxin patch better executor)TestRunner  sort-by-key - 
> Thread dump for executor 3_files.zip, (rxin patch normal executor)TestRunner  
> sort-by-key - Thread dump for executor 0 _files.zip, Spark-perf Test Report 
> 16 Cores per Executor.pdf, Spark-perf Test Report.pdf, TestRunner  
> sort-by-key - Thread dump for executor 1_files (Netty-48 Cores per node).zip, 
> TestRunner  sort-by-key - Thread dump for executor 1_files (Nio-48 cores per 
> node).zip, rxin_patch-on_4_node_cluster_48CoresPerNode(Unbalance).7z
>
>
> When testing the current Spark master (1.3.0-snapshot) with spark-perf 
> (sort-by-key, aggregate-by-key, etc.), the Netty-based shuffle transferService 
> takes much longer than the NIO-based shuffle transferService. The network 
> throughput of Netty is only about half that of NIO. 
> We tested in standalone mode; the data set used for the test is 20 
> billion records, with a total size of about 400GB. The spark-perf test is 
> running on a 4-node cluster with 10G NICs, 48 CPU cores per node, and 64GB of 
> memory per executor. The number of reduce tasks is set to 1000. 
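
As a rough back-of-envelope check on the workload figures quoted above, the 
sketch below derives the average record size, the share of data per reduce 
task, and a lower bound on shuffle transfer time per node. It assumes decimal 
units (1 GB = 10^9 bytes, 10G = 10^10 bits/s), an evenly distributed data set, 
and a uniform shuffle in which each node sends (nodes - 1)/nodes of its data 
to the other nodes. None of these assumptions is stated in the report, and the 
function name is illustrative only.

```python
# Back-of-envelope sizing for the sort-by-key workload described in the issue.
# Assumptions (NOT from the report): decimal units, evenly distributed data,
# and a uniform shuffle where each node keeps 1/nodes of its data locally.

def workload_summary(records, total_bytes, reduce_tasks, nodes, nic_bits_per_s):
    """Derive average sizes and a line-rate lower bound from reported totals."""
    bytes_per_node = total_bytes / nodes
    remote_fraction = (nodes - 1) / nodes            # share sent to other nodes
    nic_bytes_per_s = nic_bits_per_s / 8
    return {
        "bytes_per_record": total_bytes / records,
        "records_per_reduce_task": records / reduce_tasks,
        "bytes_per_reduce_task": total_bytes / reduce_tasks,
        "bytes_per_node": bytes_per_node,
        # Time to push the remote share out of one NIC at full line rate.
        "min_send_seconds_per_node": bytes_per_node * remote_fraction / nic_bytes_per_s,
    }

summary = workload_summary(
    records=20 * 10**9,        # 20 billion records
    total_bytes=400 * 10**9,   # about 400GB
    reduce_tasks=1000,
    nodes=4,
    nic_bits_per_s=10 * 10**9, # 10G NIC
)

for name, value in summary.items():
    print(f"{name}: {value:,.0f}")
```

Under these assumptions the records average about 20 bytes each and every 
reduce task handles roughly 400 MB, so the shuffle moves a very large number of 
small records. The line-rate figure is only a floor: it ignores disk reads, 
serialization, and protocol overhead.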


