[GitHub] spark pull request: [CORE][SPARK-14178]DAGScheduler should get map...

2016-04-02 Thread witgo
Github user witgo closed the pull request at: https://github.com/apache/spark/pull/11986 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature

[GitHub] spark pull request: [SPARK-1239] Improve fetching of map output st...

2016-04-03 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/12113#discussion_r58324785 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -428,40 +503,93 @@ private[spark] class MapOutputTrackerMaster(conf: SparkConf

[GitHub] spark pull request: [SPARK-1239] Improve fetching of map output st...

2016-04-03 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/12113#discussion_r58325719 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -296,10 +290,89 @@ private[spark] class MapOutputTrackerMaster(conf: SparkConf

[GitHub] spark pull request: [SPARK-1239] Improve fetching of map output st...

2016-04-05 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/12113#discussion_r58502169 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -296,10 +290,89 @@ private[spark] class MapOutputTrackerMaster(conf: SparkConf

[GitHub] spark pull request: [SPARK-1239] Improve fetching of map output st...

2016-04-05 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/12113#discussion_r58503061 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -428,40 +503,93 @@ private[spark] class MapOutputTrackerMaster(conf: SparkConf

[GitHub] spark pull request: [CORE][SPARK-14178]DAGScheduler should get map...

2016-03-29 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/11986#issuecomment-202778045 retest please.

[GitHub] spark pull request: [SPARK-14667] Remove HashShuffleManager

2016-04-18 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/12423#issuecomment-211219477 I think the following should be deleted 1. [FileShuffleBlockResolver.scala](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/shuffle

[GitHub] spark pull request #14647: [WIP][Test only][SPARK-6235]Address various 2G li...

2016-08-15 Thread witgo
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/14647 [WIP][Test only][SPARK-6235]Address various 2G limits ## What changes were proposed in this pull request? The main changes. 1. Replace DiskStore method `def getBytes (blockId: BlockId

[GitHub] spark pull request #14662: [WIP][SPARK-17082][CORE]Replace ByteBuffer with C...

2016-08-16 Thread witgo
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/14662 [WIP][SPARK-17082][CORE]Replace ByteBuffer with ChunkedByteBuffer ## What changes were proposed in this pull request? Replace `ByteBuffer` with `ChunkedByteBuffer

[GitHub] spark issue #14647: [WIP][Test only][DEMO][SPARK-6235]Address various 2G lim...

2016-08-15 Thread witgo
Github user witgo commented on the issue: https://github.com/apache/spark/pull/14647 @hvanhovell I will submit some small PRs and provide a more high level description of them.

[GitHub] spark pull request #14658: [WIP][SPARK-5928] Remote Shuffle Blocks cannot be...

2016-08-15 Thread witgo
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/14658 [WIP][SPARK-5928] Remote Shuffle Blocks cannot be more than 2 GB ## What changes were proposed in this pull request? Add class `ChunkFetchInputStream`, which has the following effects

[GitHub] spark pull request #14664: Spark 6236 caching blocks 2G

2016-08-16 Thread witgo
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/14664 Spark 6236 caching blocks 2G ## What changes were proposed in this pull request? (Please fill in changes proposed in this fix) ## How was this patch tested? (Please

[GitHub] spark issue #14647: [WIP][Test only][DEMO][SPARK-6235]Address various 2G lim...

2016-08-16 Thread witgo
Github user witgo commented on the issue: https://github.com/apache/spark/pull/14647 @hvanhovell #14658 , #14662 and #14664

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-02-03 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r99458252 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskDescription.scala --- @@ -51,8 +54,39 @@ private[spark] class TaskDescription( val index

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-02-03 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r99457658 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -243,27 +245,42 @@ class

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-02-03 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r99458022 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskDescription.scala --- @@ -66,7 +100,8 @@ private[spark] object TaskDescription

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-02-03 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r99458165 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskDescription.scala --- @@ -51,8 +54,39 @@ private[spark] class TaskDescription( val index

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-02-07 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r99848516 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -243,27 +245,42 @@ class

[GitHub] spark pull request #16806: [WIP][SPARK-18890][CORE] Move task serialization ...

2017-02-07 Thread witgo
Github user witgo closed the pull request at: https://github.com/apache/spark/pull/16806

[GitHub] spark issue #15505: [SPARK-18890][CORE] Move task serialization from the Tas...

2017-02-07 Thread witgo
Github user witgo commented on the issue: https://github.com/apache/spark/pull/15505 Okay, this may take some time.

[GitHub] spark pull request #16806: [WIP][SPARK-18890][CORE] Move task serialization ...

2017-02-04 Thread witgo
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/16806 [WIP][SPARK-18890][CORE] Move task serialization from the TaskSetManager to the CoarseGrainedSchedulerBackend ## What changes were proposed in this pull request? See https

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-02-04 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r99462808 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -243,27 +245,42 @@ class

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-02-04 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r99462431 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskDescription.scala --- @@ -66,7 +100,8 @@ private[spark] object TaskDescription

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-02-04 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r99462387 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -243,27 +245,42 @@ class

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-02-04 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r99463382 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -244,32 +245,45 @@ class

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-01-21 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r97211797 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -602,6 +619,20 @@ class

[GitHub] spark issue #15505: [SPARK-18890][CORE] Move task serialization from the Tas...

2017-01-21 Thread witgo
Github user witgo commented on the issue: https://github.com/apache/spark/pull/15505 @squito My understanding is that the TaskSchedulerImpl class contains many synchronized statements (synchronized methods). If a synchronized statement's execution time is very long

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-01-24 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r97695455 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -602,6 +619,21 @@ class

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistices to ...

2017-02-23 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r102891969 --- Diff: core/src/main/scala/org/apache/spark/scheduler/MapStatus.scala --- @@ -39,16 +40,18 @@ private[spark] sealed trait MapStatus { * necessary

[GitHub] spark issue #15505: [SPARK-18890][CORE] Move task serialization from the Tas...

2017-02-25 Thread witgo
Github user witgo commented on the issue: https://github.com/apache/spark/pull/15505 @kayousterhout I think the latest code is ready to merge into the master branch.

[GitHub] spark pull request #14751: [WIP][SPARK-17184][[CORE]]Replace ByteBuf with In...

2016-08-22 Thread witgo
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/14751 [WIP][SPARK-17184][[CORE]]Replace ByteBuf with InputStream ## What changes were proposed in this pull request? The size of a ByteBuf cannot be greater than 2G, so it should be replaced

[GitHub] spark issue #15505: [SPARK-18890][CORE] Move task serialization from the Tas...

2017-02-27 Thread witgo
Github user witgo commented on the issue: https://github.com/apache/spark/pull/15505 Jenkins, retest this please.

[GitHub] spark issue #15505: [SPARK-18890][CORE] Move task serialization from the Tas...

2017-03-01 Thread witgo
Github user witgo commented on the issue: https://github.com/apache/spark/pull/15505 @kayousterhout It takes some time to update the test report.

[GitHub] spark issue #15505: [SPARK-18890][CORE] Move task serialization from the Tas...

2017-03-02 Thread witgo
Github user witgo commented on the issue: https://github.com/apache/spark/pull/15505 @kayousterhout Test results have been updated: | [SPARK-17931](https://github.com/witgo/spark/tree/SPARK-17931) |https://github.com/apache/spark/commit

[GitHub] spark issue #15505: [SPARK-18890][CORE] Move task serialization from the Tas...

2017-03-02 Thread witgo
Github user witgo commented on the issue: https://github.com/apache/spark/pull/15505 I do not know which PR caused the run time of this test case to drop from 21.764 s to 9.566 s.

[GitHub] spark pull request #17139: [WIP][SPARK-18890][CORE](try 3) Move task seriali...

2017-03-02 Thread witgo
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/17139 [WIP][SPARK-18890][CORE](try 3) Move task serialization from the TaskSetManager to the CoarseGrainedSchedulerBackend ## What changes were proposed in this pull request? See https

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-02-28 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r103622493 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala --- @@ -454,33 +452,15 @@ private[spark] class TaskSetManager

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-02-28 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r103622591 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -621,6 +615,80 @@ class

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-02-28 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r103627459 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -621,6 +615,80 @@ class

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-02-28 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r103627873 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -621,6 +615,80 @@ class

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-03-01 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r103631254 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -621,6 +615,80 @@ class

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-03-01 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r103631300 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -621,6 +615,80 @@ class

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-03-01 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r103631294 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -621,6 +615,80 @@ class

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-03-01 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r103631260 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -621,6 +615,80 @@ class

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-03-01 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r103631249 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskDescription.scala --- @@ -52,8 +55,26 @@ private[spark] class TaskDescription( val

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-03-01 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r103631278 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -621,6 +615,80 @@ class

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-03-01 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r103631373 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -195,6 +197,11 @@ class

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-03-01 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r103631322 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskDescription.scala --- @@ -52,8 +55,26 @@ private[spark] class TaskDescription( val

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-03-01 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r103631360 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -621,6 +615,80 @@ class

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-03-01 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r103631354 --- Diff: core/src/main/scala/org/apache/spark/scheduler/local/LocalSchedulerBackend.scala --- @@ -82,9 +88,15 @@ private[spark] class LocalEndpoint

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-03-01 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r103631341 --- Diff: core/src/test/scala/org/apache/spark/executor/ExecutorSuite.scala --- @@ -164,17 +164,18 @@ class ExecutorSuite extends SparkFunSuite

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-03-01 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r103631404 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskDescription.scala --- @@ -130,7 +152,7 @@ private[spark] object TaskDescription

[GitHub] spark pull request #17116: [SPARK-18890][CORE](try 2) Move task serializatio...

2017-03-01 Thread witgo
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/17116 [SPARK-18890][CORE](try 2) Move task serialization from the TaskSetManager to the CoarseGrainedSchedulerBackend ## What changes were proposed in this pull request? See https

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-03-01 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r103636619 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -621,6 +615,80 @@ class

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-02-27 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r103174824 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskDescription.scala --- @@ -23,7 +23,10 @@ import java.util.Properties import

[GitHub] spark issue #14311: [SPARK-16550] [SPARK-17042] [core] Certain classes fail ...

2016-08-31 Thread witgo
Github user witgo commented on the issue: https://github.com/apache/spark/pull/14311 @rxin @ericl This PR may cause the following code to throw an exception ```scala private def getRemoteValues(blockId: BlockId): Option[BlockResult] = { getRemoteBytes

[GitHub] spark pull request #14977: [Test Only][not ready for review][SPARK-6235][COR...

2016-09-06 Thread witgo
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/14977 [Test Only][not ready for review][SPARK-6235][CORE]Address various 2G limits ## What changes were proposed in this pull request? ### Design Setup for eliminating the various 2G

[GitHub] spark pull request #14995: [Test Only][not ready for review][SPARK-6235][COR...

2016-09-07 Thread witgo
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/14995 [Test Only][not ready for review][SPARK-6235][CORE]Address various 2G limits ## What changes were proposed in this pull request? ### motivation The various 2G limits in Spark

[GitHub] spark pull request #14977: [Test Only][not ready for review][SPARK-6235][COR...

2016-09-07 Thread witgo
Github user witgo closed the pull request at: https://github.com/apache/spark/pull/14977

[GitHub] spark issue #14995: [Test Only][not ready for review][SPARK-6235][CORE]Addre...

2016-09-08 Thread witgo
Github user witgo commented on the issue: https://github.com/apache/spark/pull/14995 retest please.

[GitHub] spark issue #14995: [Test Only][SPARK-6235][CORE]Address various 2G limits

2016-09-08 Thread witgo
Github user witgo commented on the issue: https://github.com/apache/spark/pull/14995 Jenkins, retest this please

[GitHub] spark pull request #14662: [WIP][SPARK-17082][CORE]Replace ByteBuffer with C...

2016-08-23 Thread witgo
Github user witgo closed the pull request at: https://github.com/apache/spark/pull/14662

[GitHub] spark pull request #14751: [WIP][SPARK-17184][[CORE]]Replace ByteBuf with In...

2016-08-23 Thread witgo
Github user witgo closed the pull request at: https://github.com/apache/spark/pull/14751

[GitHub] spark pull request #14664: [WIP][SPARK-6236][SPARK-6237][CORE]Support cachin...

2016-08-23 Thread witgo
Github user witgo closed the pull request at: https://github.com/apache/spark/pull/14664

[GitHub] spark pull request #14647: [WIP][Test only][DEMO][SPARK-6235]Address various...

2016-08-23 Thread witgo
Github user witgo closed the pull request at: https://github.com/apache/spark/pull/14647

[GitHub] spark pull request #14658: [WIP][SPARK-5928][SPARK-6238] Remote Shuffle Bloc...

2016-08-23 Thread witgo
Github user witgo closed the pull request at: https://github.com/apache/spark/pull/14658

[GitHub] spark issue #14995: [Test Only][SPARK-6235][CORE]Address various 2G limits

2016-09-27 Thread witgo
Github user witgo commented on the issue: https://github.com/apache/spark/pull/14995 Jenkins, retest this please

[GitHub] spark pull request #15408: [SPARK-17839][CORE] UnsafeSorterSpillReader shoul...

2016-10-09 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15408#discussion_r82537791 --- Diff: core/src/main/java/org/apache/spark/io/NioBasedBufferedFileInputStream.java --- @@ -0,0 +1,91 @@ +/* + * Licensed under the Apache License

[GitHub] spark pull request #15408: [SPARK-17839][CORE] UnsafeSorterSpillReader shoul...

2016-10-09 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15408#discussion_r82537694 --- Diff: core/src/main/java/org/apache/spark/io/NioBasedBufferedFileInputStream.java --- @@ -0,0 +1,77 @@ +/* + * Licensed under the Apache License

[GitHub] spark pull request #15830: [WIP]Upgrade netty to 4.0.42.Final

2016-11-09 Thread witgo
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/15830 [WIP]Upgrade netty to 4.0.42.Final ## What changes were proposed in this pull request? One of the important changes for 4.0.42.Final is "Support any FileRegion implementation when

[GitHub] spark pull request #15830: [SPARK-18375][SPARK-18383][BUILD][CORE]Upgrade ne...

2016-11-09 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15830#discussion_r87327561 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -,6 +2223,9 @@ private[spark] object Utils extends Logging

[GitHub] spark pull request #15505: [SPARK-17931][CORE] taskScheduler has some unneed...

2016-11-07 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r8669 --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala --- @@ -486,7 +481,7 @@ private[spark] class Executor( * Download any missing

[GitHub] spark issue #15505: [SPARK-17931][CORE] taskScheduler has some unneeded seri...

2016-10-19 Thread witgo
Github user witgo commented on the issue: https://github.com/apache/spark/pull/15505 cc @rxin

[GitHub] spark pull request #15512: The SerializerInstance instance used when deseria...

2016-10-17 Thread witgo
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/15512 The SerializerInstance instance used when deserializing a TaskResult is not reused ## What changes were proposed in this pull request? The following code is called when the DirectTaskResult

[GitHub] spark pull request #15512: [SPARK-17930][CORE]The SerializerInstance instanc...

2016-10-17 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15512#discussion_r83774753 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskResult.scala --- @@ -77,14 +78,12 @@ private[spark] class DirectTaskResult[T

[GitHub] spark pull request #15512: [SPARK-17930][CORE]The SerializerInstance instanc...

2016-10-17 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15512#discussion_r83774768 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskResultGetter.scala --- @@ -84,6 +90,7 @@ private[spark] class TaskResultGetter(sparkEnv: SparkEnv

[GitHub] spark issue #15512: [SPARK-17930][CORE]The SerializerInstance instance used ...

2016-10-17 Thread witgo
Github user witgo commented on the issue: https://github.com/apache/spark/pull/15512 I also think that the time saved comes entirely from the registration that can be skipped, but I have not verified this.

[GitHub] spark issue #15505: [SPARK-17931][CORE] taskScheduler has some unneeded seri...

2016-10-17 Thread witgo
Github user witgo commented on the issue: https://github.com/apache/spark/pull/15505 @wzhfy Ok, the code has been modified

[GitHub] spark pull request #15505: [SPARK-17931][CORE] taskScheduler has some unneed...

2016-11-22 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r89249175 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskDescription.scala --- @@ -17,27 +17,178 @@ package org.apache.spark.scheduler

[GitHub] spark pull request #15505: [SPARK-17931][CORE] taskScheduler has some unneed...

2016-11-22 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r89249162 --- Diff: core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala --- @@ -592,47 +579,6 @@ class TaskSetManagerSuite extends SparkFunSuite

[GitHub] spark issue #14995: [Test Only][SPARK-6235][CORE]Address various 2G limits

2016-11-24 Thread witgo
Github user witgo commented on the issue: https://github.com/apache/spark/pull/14995 Thanks, I am glad to hear this, and I want to solve the issues of reading, storing and transmitting data as much as possible.

[GitHub] spark pull request #15505: [SPARK-17931][CORE] taskScheduler has some unneed...

2016-11-24 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r89483305 --- Diff: core/src/test/scala/org/apache/spark/scheduler/TaskSchedulerImplSuite.scala --- @@ -139,29 +139,6 @@ class TaskSchedulerImplSuite extends

[GitHub] spark pull request #15505: [SPARK-17931][CORE] taskScheduler has some unneed...

2016-11-22 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r89246343 --- Diff: core/src/test/scala/org/apache/spark/scheduler/TaskSchedulerImplSuite.scala --- @@ -139,29 +139,6 @@ class TaskSchedulerImplSuite extends

[GitHub] spark pull request #15505: [SPARK-17931][CORE] taskScheduler has some unneed...

2016-11-22 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r89246494 --- Diff: mesos/src/test/scala/org/apache/spark/scheduler/cluster/mesos/MesosFineGrainedSchedulerBackendSuite.scala --- @@ -345,7 +374,17 @@ class

[GitHub] spark issue #14995: [Test Only][SPARK-6235][CORE]Address various 2G limits

2016-11-21 Thread witgo
Github user witgo commented on the issue: https://github.com/apache/spark/pull/14995 This PR is test-only. It is used to 1. verify the code through CI 2. verify the effectiveness of the solution. It includes two underlying API changes. 1. Replace ByteBuffer

[GitHub] spark issue #15830: [SPARK-18375][SPARK-18383][BUILD][CORE]Upgrade netty to ...

2016-11-12 Thread witgo
Github user witgo commented on the issue: https://github.com/apache/spark/pull/15830 @srowen thank you

[GitHub] spark issue #14995: [Test Only][SPARK-6235][CORE]Address various 2G limits

2016-11-20 Thread witgo
Github user witgo commented on the issue: https://github.com/apache/spark/pull/14995 @srowen This PR is a comprehensive solution. It is meant to address various 2G limits, the RPC memory footprint, and other issues. Users often encounter these problems. Why don't we need to solve

[GitHub] spark issue #15505: [SPARK-17931][CORE] taskScheduler has some unneeded seri...

2016-11-21 Thread witgo
Github user witgo commented on the issue: https://github.com/apache/spark/pull/15505 ping @kayousterhout / @squito / @JoshRosen

[GitHub] spark pull request #15297: [WIP][SPARK-9862]Handling data skew

2016-10-11 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15297#discussion_r82922339 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SkewShuffleRowRDD.scala --- @@ -0,0 +1,147 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #15297: [WIP][SPARK-9862]Handling data skew

2016-10-11 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15297#discussion_r82922520 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SkewShuffleRowRDD.scala --- @@ -0,0 +1,147 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #15297: [WIP][SPARK-9862]Handling data skew

2016-10-11 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15297#discussion_r82742056 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -138,13 +138,16 @@ private[spark] abstract class MapOutputTracker(conf: SparkConf

[GitHub] spark pull request #15297: [WIP][SPARK-9862]Handling data skew

2016-10-11 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15297#discussion_r82743168 --- Diff: core/src/main/scala/org/apache/spark/shuffle/ShuffleManager.scala --- @@ -48,7 +48,8 @@ private[spark] trait ShuffleManager { handle

[GitHub] spark pull request #15297: [WIP][SPARK-9862]Handling data skew

2016-10-11 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15297#discussion_r82742902 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -687,18 +691,21 @@ private[spark] object MapOutputTracker extends Logging

[GitHub] spark pull request #15505: [WIP][SPARK-17931]taskScheduler has some unneeded...

2016-10-16 Thread witgo
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/15505 [WIP][SPARK-17931]taskScheduler has some unneeded serialization ## What changes were proposed in this pull request? When taskScheduler instantiates TaskDescription, it calls

[GitHub] spark issue #15505: [SPARK-17931][CORE] taskScheduler has some unneeded seri...

2016-11-29 Thread witgo
Github user witgo commented on the issue: https://github.com/apache/spark/pull/15505 @kayousterhout Here are my thoughts: move the serialization out of the `TaskSetManager.resourceOffer` method. Split resourceOffer from the serialization process, so that we can make

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-01-11 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r95723533 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskDescription.scala --- @@ -52,7 +55,36 @@ private[spark] class TaskDescription( val

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-01-11 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r95732717 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -245,6 +245,16 @@ class

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-01-11 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r95731120 --- Diff: core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala --- @@ -517,6 +518,32 @@ class DAGSchedulerSuite extends SparkFunSuite

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-01-11 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r95731056 --- Diff: core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala --- @@ -517,6 +518,32 @@ class DAGSchedulerSuite extends SparkFunSuite

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-01-13 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r96104686 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskDescription.scala --- @@ -52,7 +55,43 @@ private[spark] class TaskDescription( val
