[jira] [Commented] (YARN-938) Hadoop 2 benchmarking

2015-07-16 Thread dhruv kapatel (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14630762#comment-14630762
 ] 

dhruv kapatel commented on YARN-938:


Great work!
Can anyone help me run these benchmarks without the Cloudera VM?
I've already set up a Hadoop cluster on VirtualBox.


 Hadoop 2 benchmarking 
 --

 Key: YARN-938
 URL: https://issues.apache.org/jira/browse/YARN-938
 Project: Hadoop YARN
  Issue Type: Task
Reporter: Mayank Bansal
Assignee: Mayank Bansal
 Attachments: Hadoop-benchmarking-2.x-vs-1.x-1.xls, 
 Hadoop-benchmarking-2.x-vs-1.x.xls, cdh500beta1_cpu_util.jpg, 
 cdh500beta1_mr1_mr2.xlsx


 I am running the benchmarks on Hadoop 2 and will update the results soon.
 Thanks,
 Mayank



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-938) Hadoop 2 benchmarking

2013-12-18 Thread Luke Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13851507#comment-13851507
 ] 

Luke Lu commented on YARN-938:
--

Thanks for the results, Jeff! It's interesting to note that the best terasort 
throughput in your configuration is ~140MB/s (mrv1, 96MB/s for mrv2) per 
physical host for an 8TB data set, compared with ~23MB/s (1.x, 21MB/s for 2.2) 
per physical host in Mayank's results for a 1TB (?) data set. Obviously 10Gb 
networking and 12 15K RPM SAS disks per host helped. OTOH, I'd expect Mayank's 
results to be a lot faster, as the data set fits into the memory (buffer cache) 
of the 260 slave host cluster.

It'd be interesting to see the Apache 1.2.1 results for Jeff's configuration 
as well, so they're more comparable to Mayank's results, as I suspect that CDH 
mrv1 has more optimizations than Apache.
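
For anyone who wants to redo the per-host arithmetic with their own runs, here is a minimal sketch. It assumes decimal megabytes (10^6 bytes) and an even split across physical hosts; the function name and the sample numbers are made up for illustration and are not taken from either spreadsheet.

```python
def per_host_throughput_mb_s(data_bytes, elapsed_seconds, hosts):
    """Aggregate throughput divided evenly across physical hosts, in MB/s."""
    if elapsed_seconds <= 0 or hosts <= 0:
        raise ValueError("elapsed_seconds and hosts must be positive")
    # bytes -> decimal megabytes, then per second, then per host
    return data_bytes / 1e6 / elapsed_seconds / hosts


# Hypothetical run: 1 TB (10**12 bytes) sorted in 100 s on 100 hosts.
example = per_host_throughput_mb_s(10**12, 100, 100)
print(f"{example:.1f} MB/s per host")
```

Plugging in the real data set size, elapsed time, and host count for each cluster should make the two sets of results directly comparable.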



[jira] [Commented] (YARN-938) Hadoop 2 benchmarking

2013-12-18 Thread Jeff Buell (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13852078#comment-13852078
 ] 

Jeff Buell commented on YARN-938:
-

Yes, I spent a lot of time putting together high-performance hardware and 
tuning the software stack.  While out-of-the-box tests have their place, it 
is much easier to analyze performance differences when both configurations are 
pushed to their limits.  Tuning not only improves elapsed time, it almost 
always improves test repeatability and execution uniformity across the cluster 
as well.  The latter allows performance data to be collected on one machine with 
confidence that it represents all machines in the cluster.
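
One possible way to put a number on repeatability is the coefficient of variation of elapsed times over repeated runs (lower means more repeatable). This is only a sketch of that idea, not the method used for the attached results; the run times below are hypothetical.

```python
from statistics import mean, pstdev


def coefficient_of_variation(samples):
    """Population standard deviation divided by the mean of the samples."""
    m = mean(samples)
    if m == 0:
        raise ValueError("mean of samples must be non-zero")
    return pstdev(samples) / m


# Hypothetical elapsed times (seconds) for three runs of the same job.
run_times_s = [100.0, 102.0, 98.0]
cv = coefficient_of_variation(run_times_s)
print(f"coefficient of variation: {cv:.4f}")
```

The same function could be applied to per-node task times within a single run to get a rough measure of execution uniformity across the cluster.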



[jira] [Commented] (YARN-938) Hadoop 2 benchmarking

2013-11-14 Thread kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13823088#comment-13823088
 ] 

kumar commented on YARN-938:


bq. found there were some configs needs to be changed and after that we got some 
better performance.

Is this something that you can share?



[jira] [Commented] (YARN-938) Hadoop 2 benchmarking

2013-11-14 Thread Luke Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13823155#comment-13823155
 ] 

Luke Lu commented on YARN-938:
--

Yes, it'd be great if [~mayank_bansal] can share the configs and command lines 
to run the benchmarks, so others can reproduce the results.



[jira] [Commented] (YARN-938) Hadoop 2 benchmarking

2013-10-01 Thread Luke Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13783446#comment-13783446
 ] 

Luke Lu commented on YARN-938:
--

Is JVM reuse (set to -1) turned on for the Hadoop 1 runs? Unfortunately, container 
reuse is not in MRv2 yet (MAPREDUCE-3902 appears to be stalled). It'd also be 
interesting to see numbers from Tez, which does have container reuse, for 
comparison.
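
For reference, a small sketch of what the JVM reuse setting looks like as a mapred-site.xml property. The property name mapred.job.reuse.jvm.num.tasks is the Hadoop 1.x name as I recall it (it is not quoted anywhere in this thread), and the helper function is hypothetical.

```python
import xml.etree.ElementTree as ET


def property_element(name, value):
    """Build a Hadoop-style <property> element with <name> and <value>."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = str(value)
    return prop


# -1 means a JVM may be reused for an unlimited number of tasks of a job
# (assumed Hadoop 1.x property name).
jvm_reuse = property_element("mapred.job.reuse.jvm.num.tasks", -1)
fragment = ET.tostring(jvm_reuse, encoding="unicode")
print(fragment)
```

The printed fragment would go inside the configuration element of mapred-site.xml for the Hadoop 1 runs.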



[jira] [Commented] (YARN-938) Hadoop 2 benchmarking

2013-09-11 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13764907#comment-13764907
 ] 

Zhijie Shen commented on YARN-938:
--

Great job, Mayank! Just thinking out loud: perhaps it would be good to take one 
more step and run the benchmarks on clusters of different sizes. One goal of 
designing YARN is improving scalability. For example, it would be very 
encouraging if we could demonstrate that on a cluster of 130 nodes, hadoop 1.x 
takes 1 unit of time to run job A while hadoop 2.x takes 0.9, and on a cluster 
of 260 nodes, hadoop 1.x takes 1 unit of time while hadoop 2.x takes 0.8. I'm 
not sure how much additional work this requires, but I think it would be 
useful info.
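
To make the comparison concrete, here is a minimal sketch that turns the illustrative numbers above into a relative improvement per cluster size. The times are the made-up example values from this comment, not measurements, and the function name is hypothetical.

```python
def relative_improvement(t_old, t_new):
    """Fraction of elapsed time saved by the new version versus the old one."""
    if t_old <= 0:
        raise ValueError("t_old must be positive")
    return (t_old - t_new) / t_old


# cluster size -> (hadoop 1.x time, hadoop 2.x time), in arbitrary units
scenario = {130: (1.0, 0.9), 260: (1.0, 0.8)}
gains = {nodes: relative_improvement(*times) for nodes, times in scenario.items()}

for nodes in sorted(gains):
    print(f"{nodes} nodes: {gains[nodes]:.0%} less time on 2.x")
```

A gain that grows with node count, as in this example, is the kind of trend that would support the scalability claim.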


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-938) Hadoop 2 benchmarking

2013-09-10 Thread Mayank Bansal (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13763572#comment-13763572
 ] 

Mayank Bansal commented on YARN-938:


I ran these benchmarks in collaboration with Vinod [~vinodkv].

Thanks Vinod for all your help.

Attaching the results.

Thanks,
Mayank 



[jira] [Commented] (YARN-938) Hadoop 2 benchmarking

2013-09-10 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13763798#comment-13763798
 ] 

Sandy Ryza commented on YARN-938:
-

Thanks for working on these, [~mayank_bansal].  The results are pretty 
consistent with some internal benchmarking we've done at Cloudera.

A few questions:
* In MR1 was io.sort.record.percent tuned to spill the same number of times as 
MR2 does?
* What was slowstart completed maps set to?
* How many slots and MB were the TTs and NMs configured with?
* Any idea what caused the improvement between RC1 and the final release?  I'm 
guessing MAPREDUCE-5399 helped.
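
A rough sketch of how the settings asked about above could be pulled out of a *-site.xml file and reported next to the results. The property names are the Hadoop 1.x and 2.x names as I recall them and are assumptions, not something stated in this thread; the sample XML and its value are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed property names (recalled from Hadoop default config files).
MR1_KEYS = [
    "io.sort.record.percent",
    "mapred.reduce.slowstart.completed.maps",
    "mapred.tasktracker.map.tasks.maximum",
    "mapred.tasktracker.reduce.tasks.maximum",
]
MR2_KEYS = [
    "mapreduce.job.reduce.slowstart.completedmaps",
    "yarn.nodemanager.resource.memory-mb",
]


def read_site_properties(xml_text):
    """Parse a Hadoop *-site.xml document into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is None:
            continue
        props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def settings_report(props, keys):
    """Return the chosen keys, marking the ones the file does not set."""
    return {key: props.get(key, "<not set>") for key in keys}


# Hypothetical mapred-site.xml content, for illustration only.
SAMPLE_SITE_XML = """<configuration>
  <property>
    <name>mapred.reduce.slowstart.completed.maps</name>
    <value>0.80</value>
  </property>
</configuration>"""

report = settings_report(read_site_properties(SAMPLE_SITE_XML), MR1_KEYS)
for key, value in report.items():
    print(f"{key} = {value}")
```

Keys reported as not set fall back to whatever default the Hadoop version in use ships with, so the version should be recorded alongside the report.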




[jira] [Commented] (YARN-938) Hadoop 2 benchmarking

2013-09-10 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13763817#comment-13763817
 ] 

Vinod Kumar Vavilapalli commented on YARN-938:
--

bq. The results are pretty consistent with some internal benchmarking we've 
done at Cloudera.
Interesting, do you mind sharing those results?



[jira] [Commented] (YARN-938) Hadoop 2 benchmarking

2013-09-10 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13763829#comment-13763829
 ] 

Sandy Ryza commented on YARN-938:
-

On vacation now, but I'll try to assemble them into a presentable form when I 
get back.



[jira] [Commented] (YARN-938) Hadoop 2 benchmarking

2013-09-10 Thread Nemon Lou (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13763838#comment-13763838
 ] 

Nemon Lou commented on YARN-938:


Thanks, Mayank Bansal, for your work. Do you mind sharing how much input data 
you ran TeraSort on?



[jira] [Commented] (YARN-938) Hadoop 2 benchmarking

2013-07-18 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13712712#comment-13712712
 ] 

Vinod Kumar Vavilapalli commented on YARN-938:
--

Thanks for doing this Mayank!
