Github user mateiz commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15603509
--- Diff:
core/src/main/scala/org/apache/spark/shuffle/sort/SortShuffleWriter.scala ---
@@ -0,0 +1,159 @@
+/*
+ * Licensed to the Apache Software
Github user mateiz commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15603895
--- Diff:
core/src/main/scala/org/apache/spark/storage/ShuffleBlockManager.scala ---
@@ -91,6 +97,20 @@ class ShuffleBlockManager(blockManager: BlockManager)
Github user mateiz commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15604053
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala ---
@@ -0,0 +1,667 @@
+/*
+ * Licensed to the Apache Software
Github user mateiz commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15605438
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala ---
@@ -0,0 +1,667 @@
+/*
+ * Licensed to the Apache Software
Github user mateiz commented on the pull request:
https://github.com/apache/spark/pull/1499#issuecomment-50668930
Thanks everyone, I think I addressed all the comments. Anything else before
we merge this? I'd like to merge it fairly soon because there are a few other
issues I'd like
Github user mateiz commented on the pull request:
https://github.com/apache/spark/pull/1499#issuecomment-50669058
Jenkins, test this please
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1499#issuecomment-50669721
QA tests have started for PR 1499. This patch merges cleanly.
View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17472/consoleFull
---
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1499#issuecomment-50678947
QA results for PR 1499:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds the following public classes (experimental):
class
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1499#issuecomment-50679871
QA results for PR 1499:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds the following public classes (experimental):
class
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/1499#issuecomment-50694691
Jenkins, retest this please.
---
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15620594
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala ---
@@ -0,0 +1,662 @@
+/*
+ * Licensed to the Apache Software
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15620640
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala ---
@@ -0,0 +1,662 @@
+/*
+ * Licensed to the Apache Software
Github user mateiz commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15620674
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala ---
@@ -0,0 +1,662 @@
+/*
+ * Licensed to the Apache Software
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15621714
--- Diff:
core/src/test/scala/org/apache/spark/util/collection/ExternalSorterSuite.scala
---
@@ -0,0 +1,566 @@
+/*
+ * Licensed to the Apache
Github user andrewor14 commented on the pull request:
https://github.com/apache/spark/pull/1499#issuecomment-50700437
test this please!
---
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1499#issuecomment-50700807
QA tests have started for PR 1499. This patch merges cleanly.
View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17526/consoleFull
---
Github user andrewor14 commented on the pull request:
https://github.com/apache/spark/pull/1499#issuecomment-50701304
I took another pass over the patch and the changes look ready to me. I also
tested this locally and verified that the shuffle files were actually cleaned
up. There is
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/1499#issuecomment-50701589
Ok I merged this in master.
---
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/1499
---
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1499#issuecomment-50704737
QA results for PR 1499:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds the following public classes (experimental):
class
Github user mateiz commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15508024
--- Diff:
core/src/main/scala/org/apache/spark/rdd/OrderedRDDFunctions.scala ---
@@ -43,10 +44,10 @@ import org.apache.spark.{Logging, RangePartitioner}
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15543949
--- Diff:
core/src/main/scala/org/apache/spark/shuffle/sort/SortShuffleWriter.scala ---
@@ -0,0 +1,159 @@
+/*
+ * Licensed to the Apache Software
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15544043
--- Diff:
core/src/main/scala/org/apache/spark/shuffle/sort/SortShuffleWriter.scala ---
@@ -0,0 +1,159 @@
+/*
+ * Licensed to the Apache Software
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15544196
--- Diff:
core/src/main/scala/org/apache/spark/shuffle/sort/SortShuffleWriter.scala ---
@@ -0,0 +1,159 @@
+/*
+ * Licensed to the Apache Software
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15544259
--- Diff:
core/src/main/scala/org/apache/spark/storage/DiskBlockManager.scala ---
@@ -54,12 +55,16 @@ private[spark] class DiskBlockManager(shuffleManager:
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1499#issuecomment-50522967
@mateiz please refer to the changes here:
https://github.com/apache/spark/pull/1609/files#diff-10
They should be relevant to this PR too
---
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r1237
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala ---
@@ -0,0 +1,667 @@
+/*
+ * Licensed to the Apache Software
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r1233
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala ---
@@ -0,0 +1,390 @@
+/*
+ * Licensed to the Apache Software
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r1382
--- Diff:
core/src/main/scala/org/apache/spark/shuffle/sort/SortShuffleWriter.scala ---
@@ -0,0 +1,159 @@
+/*
+ * Licensed to the Apache Software
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r1675
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala ---
@@ -0,0 +1,667 @@
+/*
+ * Licensed to the Apache Software
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r1693
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala ---
@@ -0,0 +1,667 @@
+/*
+ * Licensed to the Apache Software
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15556144
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala ---
@@ -0,0 +1,667 @@
+/*
+ * Licensed to the Apache Software
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15556374
--- Diff:
core/src/main/scala/org/apache/spark/shuffle/sort/SortShuffleWriter.scala ---
@@ -0,0 +1,159 @@
+/*
+ * Licensed to the Apache Software
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15557253
--- Diff:
core/src/main/scala/org/apache/spark/storage/DiskBlockManager.scala ---
@@ -54,12 +55,16 @@ private[spark] class
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15557939
--- Diff:
core/src/main/scala/org/apache/spark/storage/ShuffleBlockManager.scala ---
@@ -91,6 +97,20 @@ class ShuffleBlockManager(blockManager:
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15560379
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala ---
@@ -0,0 +1,667 @@
+/*
+ * Licensed to the Apache Software
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15560508
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala ---
@@ -0,0 +1,667 @@
+/*
+ * Licensed to the Apache Software
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15560695
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala ---
@@ -0,0 +1,667 @@
+/*
+ * Licensed to the Apache Software
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15560798
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala ---
@@ -0,0 +1,667 @@
+/*
+ * Licensed to the Apache Software
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15560817
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala ---
@@ -0,0 +1,667 @@
+/*
+ * Licensed to the Apache Software
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15560994
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala ---
@@ -0,0 +1,667 @@
+/*
+ * Licensed to the Apache Software
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15562065
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala ---
@@ -0,0 +1,667 @@
+/*
+ * Licensed to the Apache Software
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15562204
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala ---
@@ -0,0 +1,667 @@
+/*
+ * Licensed to the Apache Software
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15562761
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala ---
@@ -0,0 +1,667 @@
+/*
+ * Licensed to the Apache Software
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15562851
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala ---
@@ -0,0 +1,667 @@
+/*
+ * Licensed to the Apache Software
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15562901
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala ---
@@ -0,0 +1,667 @@
+/*
+ * Licensed to the Apache Software
Github user aarondav commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15562939
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala ---
@@ -0,0 +1,667 @@
+/*
+ * Licensed to the Apache Software
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15563349
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala ---
@@ -0,0 +1,667 @@
+/*
+ * Licensed to the Apache Software
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15563638
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala ---
@@ -0,0 +1,667 @@
+/*
+ * Licensed to the Apache Software
Github user mateiz commented on the pull request:
https://github.com/apache/spark/pull/1499#issuecomment-50303404
Jenkins, test this please
---
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1499#issuecomment-50303714
QA tests have started for PR 1499. This patch merges cleanly.
View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17274/consoleFull
---
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1499#issuecomment-50306524
QA results for PR 1499:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds the following public classes (experimental):
class
Github user JoshRosen commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15493355
--- Diff:
core/src/main/scala/org/apache/spark/shuffle/sort/SortShuffleManager.scala ---
@@ -0,0 +1,80 @@
+/*
+ * Licensed to the Apache Software
Github user JoshRosen commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15494109
--- Diff:
core/src/main/scala/org/apache/spark/shuffle/sort/SortShuffleManager.scala ---
@@ -0,0 +1,80 @@
+/*
+ * Licensed to the Apache Software
Github user JoshRosen commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15494358
--- Diff:
core/src/main/scala/org/apache/spark/storage/DiskBlockManager.scala ---
@@ -54,12 +55,16 @@ private[spark] class DiskBlockManager(shuffleManager:
Github user JoshRosen commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15494423
--- Diff:
core/src/main/scala/org/apache/spark/storage/DiskBlockManager.scala ---
@@ -54,12 +55,16 @@ private[spark] class DiskBlockManager(shuffleManager:
Github user mateiz commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15494991
--- Diff:
core/src/main/scala/org/apache/spark/shuffle/sort/SortShuffleManager.scala ---
@@ -0,0 +1,80 @@
+/*
+ * Licensed to the Apache Software
Github user mateiz commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15495007
--- Diff:
core/src/main/scala/org/apache/spark/storage/DiskBlockManager.scala ---
@@ -54,12 +55,16 @@ private[spark] class DiskBlockManager(shuffleManager:
Github user JoshRosen commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15495220
--- Diff:
core/src/main/scala/org/apache/spark/storage/ShuffleBlockManager.scala ---
@@ -91,6 +97,20 @@ class ShuffleBlockManager(blockManager:
Github user JoshRosen commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15497588
--- Diff:
core/src/test/scala/org/apache/spark/util/collection/ExternalSorterSuite.scala
---
@@ -0,0 +1,566 @@
+/*
+ * Licensed to the Apache
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15501936
--- Diff:
core/src/main/scala/org/apache/spark/rdd/OrderedRDDFunctions.scala ---
@@ -43,10 +44,10 @@ import org.apache.spark.{Logging, RangePartitioner}
Github user mateiz commented on the pull request:
https://github.com/apache/spark/pull/1499#issuecomment-50292919
I've now rebased this on top of the SizeTracker class in #1165 -- should be
ready to go in.
There is one issue left with both the ExternalSorter and
Github user mateiz commented on the pull request:
https://github.com/apache/spark/pull/1499#issuecomment-50297730
Are map tasks spilling by any chance? There is one issue in this right now,
which is that if your map task spills to disk, you need to spill multiple times
with the
Github user colorant commented on the pull request:
https://github.com/apache/spark/pull/1499#issuecomment-50301634
@mateiz, yep, the map tasks did spill, and that seems to contribute most to the
increased processing time, though in my case only about 400K of data was spilled
to disk per task.