[GitHub] spark pull request: SPARK-2638 MapOutputTracker concurrency improv...

2014-08-03 Thread javadba
Github user javadba commented on the pull request: https://github.com/apache/spark/pull/1542#issuecomment-51022369 Thanks for commenting Josh. I will see about putting together something on this including solid testcases. ETA later in the coming week. --- If your project is set up

[GitHub] spark pull request: SPARK-2638 MapOutputTracker concurrency improv...

2014-08-03 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/1542#issuecomment-51020347 Since this same locking pattern occurs at several places in the code, I think it might make sense to abstract it behind a function or macro, which would give us a centr

[GitHub] spark pull request: SPARK-2638 MapOutputTracker concurrency improv...

2014-07-31 Thread javadba
Github user javadba commented on the pull request: https://github.com/apache/spark/pull/1542#issuecomment-50814023 HI Folks, Yes I was on master for this PR . I had realized this was not the correct procedure so had already started to create separate branches on my fork for the

[GitHub] spark pull request: SPARK-2638 MapOutputTracker concurrency improv...

2014-07-29 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/1542#issuecomment-50513840 It looks like this pull request was opened from @javadba's master branch. My hunch is that he force-pushed or otherwise reset that branch to bring it in sync with the

[GitHub] spark pull request: SPARK-2638 MapOutputTracker concurrency improv...

2014-07-29 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/1542#issuecomment-50511022 I dunno, merging a PR with no changed files doesn't sound too scary to me. Something is definitely messed up in this PR, with both `Commits` and `Files change

[GitHub] spark pull request: SPARK-2638 MapOutputTracker concurrency improv...

2014-07-29 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/1542#issuecomment-50509704 That was super scarey ! Thanks for clarifying @aarondav --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. I

[GitHub] spark pull request: SPARK-2638 MapOutputTracker concurrency improv...

2014-07-29 Thread aarondav
Github user aarondav commented on the pull request: https://github.com/apache/spark/pull/1542#issuecomment-50497514 I think asfgit got confused somehow; the linked commit looks unrelated. --- If your project is set up for it, you can reply to this email and have your reply appear on G

[GitHub] spark pull request: SPARK-2638 MapOutputTracker concurrency improv...

2014-07-29 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/1542#issuecomment-50488319 @pwendell @mateiz was this PR really merged into spark ? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. I

[GitHub] spark pull request: SPARK-2638 MapOutputTracker concurrency improv...

2014-07-27 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1542 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enab

[GitHub] spark pull request: SPARK-2638 MapOutputTracker concurrency improv...

2014-07-26 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/1542#issuecomment-50236563 @javadba, sorry that maybe my previous comment is not clear. I'm opposed to use `synchronized` on `val monitor = shuffleId.toString.intern`. I see you have to use

[GitHub] spark pull request: SPARK-2638 MapOutputTracker concurrency improv...

2014-07-25 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/1542#issuecomment-50196891 As I [commented on the JIRA](https://issues.apache.org/jira/browse/SPARK-2638?focusedCommentId=14074710&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabp

[GitHub] spark pull request: SPARK-2638 MapOutputTracker concurrency improv...

2014-07-25 Thread javadba
Github user javadba commented on the pull request: https://github.com/apache/spark/pull/1542#issuecomment-50194492 Hi again. Upon closer inspection of the existing code/ functionality we do have an opportunity here to: (a) reduce the code size / complexity (b) at the same

[GitHub] spark pull request: SPARK-2638 MapOutputTracker concurrency improv...

2014-07-25 Thread javadba
Github user javadba commented on the pull request: https://github.com/apache/spark/pull/1542#issuecomment-50170143 Hi, thanks for taking the time to review this PR again. A couple of things maybe I should have pointed out: a. The strategy here is not to use the container col

[GitHub] spark pull request: SPARK-2638 MapOutputTracker concurrency improv...

2014-07-25 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/1542#discussion_r15392537 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -162,9 +164,9 @@ private[spark] abstract class MapOutputTracker(conf: SparkConf)

[GitHub] spark pull request: SPARK-2638 MapOutputTracker concurrency improv...

2014-07-25 Thread javadba
Github user javadba commented on a diff in the pull request: https://github.com/apache/spark/pull/1542#discussion_r15389491 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -130,7 +130,7 @@ private[spark] abstract class MapOutputTracker(conf: SparkConf)

[GitHub] spark pull request: SPARK-2638 MapOutputTracker concurrency improv...

2014-07-23 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/1542#discussion_r15326142 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -130,7 +130,7 @@ private[spark] abstract class MapOutputTracker(conf: SparkConf)

[GitHub] spark pull request: SPARK-2638 MapOutputTracker concurrency improv...

2014-07-23 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/1542#discussion_r15283935 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -130,7 +130,7 @@ private[spark] abstract class MapOutputTracker(conf: SparkConf)

[GitHub] spark pull request: SPARK-2638 MapOutputTracker concurrency improv...

2014-07-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1542#issuecomment-49832419 Can one of the admins verify this patch? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your pro

[GitHub] spark pull request: SPARK-2638 MapOutputTracker concurrency improv...

2014-07-22 Thread javadba
GitHub user javadba opened a pull request: https://github.com/apache/spark/pull/1542 SPARK-2638 MapOutputTracker concurrency improvement Spark-2638 Improve concurrency of fetching Map outputs This issue was noticed while perusing the MapOutputTracker source code. Notice tha