[GitHub] spark pull request: [SPARK-2304] tera sort example program for shu...

2014-12-29 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1242#issuecomment-68296918 The size of the data is 100GB in its uncompressed binary representation. You are probably compressing the data when you save it as a sequence file. When you save it as text

[GitHub] spark pull request: [SPARK-2304] tera sort example program for shu...

2014-12-29 Thread liuqiyun
Github user liuqiyun commented on the pull request: https://github.com/apache/spark/pull/1242#issuecomment-68331374 So how do I save the data in its uncompressed binary representation in the GenSort.scala program? I want to compare it with Hadoop MR, which also uses the uncompressed binary

[GitHub] spark pull request: [SPARK-2304] tera sort example program for shu...

2014-12-27 Thread liuqiyun
Github user liuqiyun commented on the pull request: https://github.com/apache/spark/pull/1242#issuecomment-68199942 @rxin I am confused about the input parameters of GenSort.scala. It requires 3 parameters: [num-parts] [records-per-part] [output-path]. If I want to generate and
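
A rough sizing sketch for the three GenSort.scala parameters mentioned above. It assumes every record is exactly 100 bytes (the record size given in the pull request description), that "100GB" means 100 * 10^9 bytes, and that the total output is simply num-parts times records-per-part times the record size. The function names are hypothetical and are not part of the pull request.

```python
RECORD_SIZE_BYTES = 100  # assumption: fixed 100-byte records, per the PR description


def total_size_bytes(num_parts, records_per_part, record_size=RECORD_SIZE_BYTES):
    """Uncompressed size produced by num_parts partitions of records_per_part records."""
    return num_parts * records_per_part * record_size


def records_per_part_for(target_bytes, num_parts, record_size=RECORD_SIZE_BYTES):
    """Smallest records-per-part value whose total output reaches target_bytes."""
    if num_parts <= 0 or record_size <= 0:
        raise ValueError("num_parts and record_size must be positive")
    total_records = -(-target_bytes // record_size)  # ceiling division
    return -(-total_records // num_parts)            # ceiling division


if __name__ == "__main__":
    target = 100 * 10**9  # assumption: decimal gigabytes
    parts = 1000          # hypothetical partition count
    per_part = records_per_part_for(target, parts)
    print(parts, per_part, total_size_bytes(parts, per_part))
```

Under these assumptions, the on-disk size matches the computed figure only when the output is written uncompressed.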

[GitHub] spark pull request: [SPARK-2304] tera sort example program for shu...

2014-09-11 Thread jerryshao
Github user jerryshao commented on the pull request: https://github.com/apache/spark/pull/1242#issuecomment-55352215 Hi @rxin, sorry to bring this up. Are you planning to merge this terasort example into Spark? I think this would be a good standard to test the performance of

[GitHub] spark pull request: [SPARK-2304] tera sort example program for shu...

2014-09-11 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1242#issuecomment-55364233 I don't think we are going to merge this in Spark, unless there is huge demand from users... --- If your project is set up for it, you can reply to this email and have

[GitHub] spark pull request: [SPARK-2304] tera sort example program for shu...

2014-09-01 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1242#issuecomment-54100933 @rxin can you close this for now? It's been lingering a long time.

[GitHub] spark pull request: [SPARK-2304] tera sort example program for shu...

2014-09-01 Thread rxin
Github user rxin closed the pull request at: https://github.com/apache/spark/pull/1242

[GitHub] spark pull request: [SPARK-2304] tera sort example program for shu...

2014-06-30 Thread tgravescs
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/1242#issuecomment-47550102 The Hadoop code for generating the data is out of date. It might not matter for your purposes, but if you want the up-to-date one, look at sortbenchmark.org. I had

[GitHub] spark pull request: [SPARK-2304] tera sort example program for shu...

2014-06-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1242#issuecomment-47420731 Merged build started.

[GitHub] spark pull request: [SPARK-2304] tera sort example program for shu...

2014-06-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1242#issuecomment-47420727 Merged build triggered.

[GitHub] spark pull request: [SPARK-2304] tera sort example program for shu...

2014-06-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1242#issuecomment-47421446 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16227/

[GitHub] spark pull request: [SPARK-2304] tera sort example program for shu...

2014-06-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1242#issuecomment-47421445 Merged build finished. All automated tests passed.

[GitHub] spark pull request: [SPARK-2304] tera sort example program for shu...

2014-06-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1242#issuecomment-47440103 Merged build started.

[GitHub] spark pull request: [SPARK-2304] tera sort example program for shu...

2014-06-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1242#issuecomment-47440099 Merged build triggered.

[GitHub] spark pull request: [SPARK-2304] tera sort example program for shu...

2014-06-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1242#issuecomment-47441036 Merged build finished. All automated tests passed.

[GitHub] spark pull request: [SPARK-2304] tera sort example program for shu...

2014-06-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1242#issuecomment-47441037 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16231/

[GitHub] spark pull request: [SPARK-2304] tera sort example program for shu...

2014-06-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1242#issuecomment-47310836 Merged build finished. All automated tests passed.

[GitHub] spark pull request: [SPARK-2304] tera sort example program for shu...

2014-06-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1242#issuecomment-47310838 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16196/

[GitHub] spark pull request: [SPARK-2304] tera sort example program for shu...

2014-06-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1242#issuecomment-47415018 Merged build triggered.

[GitHub] spark pull request: [SPARK-2304] tera sort example program for shu...

2014-06-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1242#issuecomment-47415020 Merged build started.

[GitHub] spark pull request: [SPARK-2304] tera sort example program for shu...

2014-06-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1242#issuecomment-47415821 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16224/

[GitHub] spark pull request: [SPARK-2304] tera sort example program for shu...

2014-06-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1242#issuecomment-47415820 Merged build finished.

[GitHub] spark pull request: [SPARK-2304] tera sort example program for shu...

2014-06-26 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/1242 [SPARK-2304] tera sort example program for shuffle benchmarks This pull request adds an example program for benchmarking Spark shuffle. It dynamically generates a set of 100 byte records according to

[GitHub] spark pull request: [SPARK-2304] tera sort example program for shu...

2014-06-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1242#issuecomment-47308594 Merged build started.

[GitHub] spark pull request: [SPARK-2304] tera sort example program for shu...

2014-06-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1242#issuecomment-47308593 Merged build triggered.

[GitHub] spark pull request: [SPARK-2304] tera sort example program for shu...

2014-06-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1242#issuecomment-47308636 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16195/

[GitHub] spark pull request: [SPARK-2304] tera sort example program for shu...

2014-06-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1242#issuecomment-47308635 Merged build finished.

[GitHub] spark pull request: [SPARK-2304] tera sort example program for shu...

2014-06-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1242#issuecomment-47308992 Merged build started.

[GitHub] spark pull request: [SPARK-2304] tera sort example program for shu...

2014-06-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1242#issuecomment-47308985 Merged build triggered.