[GitHub] spark pull request: Spark 1246 add min max to stat counter

2014-03-15 Thread dwmclary
Github user dwmclary commented on the pull request: https://github.com/apache/spark/pull/144#issuecomment-37718637 @mateiz OK, should be good to go now. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/42#issuecomment-37718877 Build finished.

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/42#issuecomment-37718879 One or more automated tests failed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13191/

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/42#issuecomment-37718880 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13192/

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/42#issuecomment-37718878 Build finished.

[GitHub] spark pull request: SPARK-1252. On YARN, use container-log4j.prope...

2014-03-15 Thread sryza
GitHub user sryza opened a pull request: https://github.com/apache/spark/pull/148 SPARK-1252. On YARN, use container-log4j.properties for executors container-log4j.properties is a file that YARN provides so that containers can have log4j.properties distinct from that of the

[GitHub] spark pull request: SPARK-1255: Allow user to pass Serializer obje...

2014-03-15 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/149 SPARK-1255: Allow user to pass Serializer object instead of class name for shuffle. This is more general than simply passing a string name and leaves more room for performance optimizations.

[GitHub] spark pull request: SPARK-1255: Allow user to pass Serializer obje...

2014-03-15 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/149#issuecomment-37720234 @marmbrus this is for you!

[GitHub] spark pull request: SPARK-1255: Allow user to pass Serializer obje...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/149#issuecomment-37720722 Merged build triggered.

[GitHub] spark pull request: SPARK-1255: Allow user to pass Serializer obje...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/149#issuecomment-37720723 Merged build started.

[GitHub] spark pull request: SPARK-1252. On YARN, use container-log4j.prope...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/148#issuecomment-37720725 Merged build started.

[GitHub] spark pull request: SPARK-1252. On YARN, use container-log4j.prope...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/148#issuecomment-37720724 Merged build triggered.

[GitHub] spark pull request: SPARK-1246, added min max API to Double RDDs i...

2014-03-15 Thread ScrapCodes
Github user ScrapCodes closed the pull request at: https://github.com/apache/spark/pull/140

[GitHub] spark pull request: SPARK-1255: Allow user to pass Serializer obje...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/149#issuecomment-37722638 One or more automated tests failed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13193/

[GitHub] spark pull request: SPARK-1252. On YARN, use container-log4j.prope...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/148#issuecomment-37722637 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13194/

[GitHub] spark pull request: SPARK-1255: Allow user to pass Serializer obje...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/149#issuecomment-37722634 Merged build finished.

[GitHub] spark pull request: SPARK-1252. On YARN, use container-log4j.prope...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/148#issuecomment-37722635 Merged build finished.

[GitHub] spark pull request: Fix SPARK-1256: Master web UI and Worker web U...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/150#issuecomment-37723748 Can one of the admins verify this patch?

Re: [re-cont] map and flatMap

2014-03-15 Thread andy petrella
Yep. Regarding flatMap, an implicit parameter might work, as in Scala's Future, for instance: https://github.com/scala/scala/blob/master/src/library/scala/concurrent/Future.scala#L246 Dunno, still waiting for some insights from the team ^^ andy On Wed, Mar 12, 2014 at 3:23 PM, Pascal Voitot

[GitHub] spark pull request: [SPARK-1241] Add sliding to RDD

2014-03-15 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/136#discussion_r10635444 --- Diff: core/src/main/scala/org/apache/spark/rdd/SlidedRDD.scala --- @@ -0,0 +1,102 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[GitHub] spark pull request: [SPARK-1241] Add sliding to RDD

2014-03-15 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/136#discussion_r10635447 --- Diff: core/src/main/scala/org/apache/spark/rdd/SlidedRDD.scala --- @@ -0,0 +1,102 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[GitHub] spark pull request: [SPARK-1241] Add sliding to RDD

2014-03-15 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/136#discussion_r10635557 --- Diff: core/src/main/scala/org/apache/spark/rdd/SlidedRDD.scala --- @@ -0,0 +1,102 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

[GitHub] spark pull request: SPARK-1252. On YARN, use container-log4j.prope...

2014-03-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/148#issuecomment-37733755 Seems reasonable to me. You still working on this or is it good to go?

[GitHub] spark pull request: [SPARK-1241] Add sliding to RDD

2014-03-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/136#issuecomment-37733845 Ah I see - so this isn't going to be externally a user-visible class (I didn't notice it was `private[spark]`)? Would it make sense to throw an assertion error if the

[GitHub] spark pull request: [SPARK-1241] Add sliding to RDD

2014-03-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/136#issuecomment-37733908 Even if it's private we can end up with cases where users have, e.g., a 10,000-partition RDD with only a few items in each partition. Do we know a priori when calling this

[GitHub] spark pull request: [SPARK-1244] Throw exception if map output sta...

2014-03-15 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/147#discussion_r10635964 --- Diff: core/src/test/scala/org/apache/spark/MapOutputTrackerSuite.scala --- @@ -136,4 +142,30 @@ class MapOutputTrackerSuite extends FunSuite with

[GitHub] spark pull request: [SPARK-1244] Throw exception if map output sta...

2014-03-15 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/147#discussion_r10635968 --- Diff: core/src/test/scala/org/apache/spark/MapOutputTrackerSuite.scala --- @@ -136,4 +142,30 @@ class MapOutputTrackerSuite extends FunSuite with

[GitHub] spark pull request: [SPARK-1241] Add sliding to RDD

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/136#issuecomment-37734195 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13195/

[GitHub] spark pull request: [SPARK-1241] Add sliding to RDD

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/136#issuecomment-37734242 Merged build started.

[GitHub] spark pull request: [SPARK-1241] Add sliding to RDD

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/136#issuecomment-37734241 Merged build triggered.

[GitHub] spark pull request: [SPARK-1241] Add sliding to RDD

2014-03-15 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/136#issuecomment-37734634 It is hard to say what threshold to use. I couldn't think of a use case that requires a large window size, but I cannot say there is none. Another possible

[GitHub] spark pull request: [SPARK-1241] Add sliding to RDD

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/136#issuecomment-37735835 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13197/

[GitHub] spark pull request: [SPARK-1241] Add sliding to RDD

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/136#issuecomment-37735834 Merged build finished.

[GitHub] spark pull request: [SPARK-1244] Throw exception if map output sta...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/147#issuecomment-37735832 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13196/

[GitHub] spark pull request: [SPARK-1244] Throw exception if map output sta...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/147#issuecomment-37735830 Merged build finished.

[GitHub] spark pull request: fix compile error for hadoop CDH 4.4+

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/151#issuecomment-37735857 Can one of the admins verify this patch?

[GitHub] spark pull request: SPARK-1144 Added license and RAT to check lice...

2014-03-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/125#issuecomment-37736072 @ScrapCodes this is a good start but right now it doesn't actually fail the build if RAT doesn't succeed. Also, RAT reports a bunch of failures for python files that I

[GitHub] spark pull request: SPARK-1144 Added license and RAT to check lice...

2014-03-15 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/125#discussion_r10636342 --- Diff: dev/rat.bash --- @@ -0,0 +1,49 @@ +#!/usr/bin/env bash --- End diff -- could you remove the `.bash` extension here?

[GitHub] spark pull request: SPARK-1255: Allow user to pass Serializer obje...

2014-03-15 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/149#discussion_r10636356 --- Diff: core/src/main/scala/org/apache/spark/Dependency.scala --- @@ -43,12 +44,13 @@ abstract class NarrowDependency[T](rdd: RDD[T]) extends Dependency(rdd) {

[GitHub] spark pull request: SPARK-1252. On YARN, use container-log4j.prope...

2014-03-15 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/148#issuecomment-37736391 I am not sure what the intent of this PR is. log config for workers should pretty much mirror what is in master. Also, the hardcoding of the config file, root

[GitHub] spark pull request: fix compile error for hadoop CDH 4.4+

2014-03-15 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/151#discussion_r10636404 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaPairDStream.scala --- @@ -736,7 +736,7 @@ class JavaPairDStream[K, V](val dstream:

[GitHub] spark pull request: fix compile error for hadoop CDH 4.4+

2014-03-15 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/151#discussion_r10636411 --- Diff: core/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandlerMacro.scala --- @@ -0,0 +1,46 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: SPARK-1255: Allow user to pass Serializer obje...

2014-03-15 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/149#discussion_r10636424 --- Diff: core/src/main/scala/org/apache/spark/Dependency.scala --- @@ -43,12 +44,13 @@ abstract class NarrowDependency[T](rdd: RDD[T]) extends

[GitHub] spark pull request: [SPARK-1244] Throw exception if map output sta...

2014-03-15 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/147#discussion_r10636463 --- Diff: core/src/main/scala/org/apache/spark/util/AkkaUtils.scala --- @@ -121,4 +121,9 @@ private[spark] object AkkaUtils extends Logging { def

[GitHub] spark pull request: [SPARK-1244] Throw exception if map output sta...

2014-03-15 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/147#discussion_r10636474 --- Diff: core/src/main/scala/org/apache/spark/util/AkkaUtils.scala --- @@ -121,4 +121,9 @@ private[spark] object AkkaUtils extends Logging { def

[GitHub] spark pull request: [SPARK-1244] Throw exception if map output sta...

2014-03-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/147#issuecomment-37736922 LGTM pending a minor comment about unifying the code path with the Executor thing that reads the frame size.

[GitHub] spark pull request: SPARK-1255: Allow user to pass Serializer obje...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/149#issuecomment-37737398 Merged build triggered.

[GitHub] spark pull request: SPARK-1255: Allow user to pass Serializer obje...

2014-03-15 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/149#discussion_r10636602 --- Diff: core/src/main/scala/org/apache/spark/Dependency.scala --- @@ -43,12 +44,13 @@ abstract class NarrowDependency[T](rdd: RDD[T]) extends Dependency(rdd) {

[GitHub] spark pull request: Spark 615 map partitions with index callable f...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/16#issuecomment-37740255 Merged build triggered.

[GitHub] spark pull request: SPARK-1255: Allow user to pass Serializer obje...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/149#issuecomment-37738873 Merged build finished.

[GitHub] spark pull request: SPARK-1255: Allow user to pass Serializer obje...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/149#issuecomment-37738874 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13198/

[GitHub] spark pull request: Spark 615 map partitions with index callable f...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/16#issuecomment-37742137 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13199/

[GitHub] spark pull request: Spark 615 map partitions with index callable f...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/16#issuecomment-37742136 Merged build finished.

[GitHub] spark pull request: SPARK-1254. Consolidate, order, and harmonize ...

2014-03-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/145#issuecomment-37742370 https://github.com/sbt/sbt/blob/0.13/ivy/src/main/scala/sbt/Resolver.scala?source=c#L289

[GitHub] spark pull request: [SPARK-1241] Add sliding to RDD

2014-03-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/136#issuecomment-37742731 @mridulm I think the RDD definition is actually `private[spark]` and it's just intended to be used internally for higher level algorithms.

[GitHub] spark pull request: SPARK-1252. On YARN, use container-log4j.prope...

2014-03-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/148#issuecomment-37742890 @mridulm I think in YARN environments cluster operators can set a logging file on all of the machines to be shared across applications (e.g. Spark, MapReduce, etc). So

[GitHub] spark pull request: Akka frame

2014-03-15 Thread pwendell
GitHub user pwendell opened a pull request: https://github.com/apache/spark/pull/152 Akka frame This is a very small change on top of @andrewor14's patch in #147. You can merge this pull request into a Git repository by running: $ git pull https://github.com/pwendell/spark

[GitHub] spark pull request: [SPARK-1244] Throw exception if map output sta...

2014-03-15 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/147#discussion_r10637306 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -35,13 +35,21 @@ private[spark] case class GetMapOutputStatuses(shuffleId:

[GitHub] spark pull request: [SPARK-1244] Throw exception if map output sta...

2014-03-15 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/147#issuecomment-37743968 This should be ready to merge unless other people have more to add.

[GitHub] spark pull request: [SPARK-1244] Throw exception if map output sta...

2014-03-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/147#issuecomment-37744049 Hey @andrewor14 I submitted some small changes on top of this while you were working on it over at #152.

[GitHub] spark pull request: SPARK-1254. Consolidate, order, and harmonize ...

2014-03-15 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/145

[GitHub] spark pull request: SPARK-1244: Throw exception if map output stat...

2014-03-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/152#issuecomment-37744154 Jenkins, test this please.

[GitHub] spark pull request: [SPARK-1244] Throw exception if map output sta...

2014-03-15 Thread andrewor14
Github user andrewor14 closed the pull request at: https://github.com/apache/spark/pull/147

[GitHub] spark pull request: [SPARK-1244] Throw exception if map output sta...

2014-03-15 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/147#issuecomment-37744167 Continued at #152. Closing.

[GitHub] spark pull request: SPARK-1244: Throw exception if map output stat...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/152#issuecomment-37744244 Merged build triggered.

[GitHub] spark pull request: SPARK-1244: Throw exception if map output stat...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/152#issuecomment-37744245 Merged build started.

[GitHub] spark pull request: SPARK-1244: Throw exception if map output stat...

2014-03-15 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/152#discussion_r10637484 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -35,13 +35,21 @@ private[spark] case class GetMapOutputStatuses(shuffleId:

[GitHub] spark pull request: SPARK-1244: Throw exception if map output stat...

2014-03-15 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/152#discussion_r10637519 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -35,13 +35,21 @@ private[spark] case class GetMapOutputStatuses(shuffleId: Int)

[GitHub] spark pull request: SPARK-1244: Throw exception if map output stat...

2014-03-15 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/152#discussion_r10637523 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -35,13 +35,21 @@ private[spark] case class GetMapOutputStatuses(shuffleId: Int)

Code documentation

2014-03-15 Thread David Thomas
Is there any documentation available that explains the code architecture that can help a new Spark framework developer?

Re: Code documentation

2014-03-15 Thread Reynold Xin
Take a look at https://cwiki.apache.org/confluence/display/SPARK/Spark+Internals On Sat, Mar 15, 2014 at 6:19 PM, David Thomas dt5434...@gmail.com wrote: Is there any documentation available that explains the code architecture that can help a new Spark framework developer?

[GitHub] spark pull request: SPARK-1244: Throw exception if map output stat...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/152#issuecomment-37745299 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13200/

[GitHub] spark pull request: SPARK-1255: Allow user to pass Serializer obje...

2014-03-15 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/149#discussion_r10637809 --- Diff: core/src/main/scala/org/apache/spark/Dependency.scala --- @@ -43,12 +44,13 @@ abstract class NarrowDependency[T](rdd: RDD[T]) extends

[GitHub] spark pull request: fix compile error for hadoop CDH 4.4+

2014-03-15 Thread gzm55
Github user gzm55 commented on a diff in the pull request: https://github.com/apache/spark/pull/151#discussion_r10637844 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaPairDStream.scala --- @@ -736,7 +736,7 @@ class JavaPairDStream[K, V](val dstream:

[GitHub] spark pull request: [SPARK-1241] Add sliding to RDD

2014-03-15 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/136#issuecomment-37748284 @pwendell I was referring not to the actual implementation, but to the expectation when using the exposed API.

[GitHub] spark pull request: fix compile error for hadoop CDH 4.4+

2014-03-15 Thread gzm55
Github user gzm55 commented on a diff in the pull request: https://github.com/apache/spark/pull/151#discussion_r10637918 --- Diff: core/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandlerMacro.scala --- @@ -0,0 +1,46 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: SPARK-1252. On YARN, use container-log4j.prope...

2014-03-15 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/148#issuecomment-37748592 But that would primarily be for debugging YARN/Hadoop APIs, with no easy way to inject Spark-specific logging levels. I am curious why this was required actually.

[GitHub] spark pull request: SPARK-1255: Allow user to pass Serializer obje...

2014-03-15 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/149#discussion_r10637947 --- Diff: core/src/main/scala/org/apache/spark/Dependency.scala --- @@ -43,12 +44,13 @@ abstract class NarrowDependency[T](rdd: RDD[T]) extends

[GitHub] spark pull request: fix compile error of streaming project

2014-03-15 Thread gzm55
GitHub user gzm55 opened a pull request: https://github.com/apache/spark/pull/153 fix compile error of streaming project explicit return type for implicit function You can merge this pull request into a Git repository by running: $ git pull https://github.com/gzm55/spark

[GitHub] spark pull request: fix compile error for hadoop CDH 4.4+

2014-03-15 Thread gzm55
Github user gzm55 commented on a diff in the pull request: https://github.com/apache/spark/pull/151#discussion_r10638000 --- Diff: core/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandlerMacro.scala --- @@ -0,0 +1,46 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: fix compile error for hadoop CDH 4.4+

2014-03-15 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/151#discussion_r10638011 --- Diff: core/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandlerMacro.scala --- @@ -0,0 +1,46 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: fix compile error for hadoop CDH 4.4+

2014-03-15 Thread gzm55
Github user gzm55 commented on a diff in the pull request: https://github.com/apache/spark/pull/151#discussion_r10638121 --- Diff: core/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandlerMacro.scala --- @@ -0,0 +1,46 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: remove staging dir when app quiting for yarn-c...

2014-03-15 Thread gzm55
GitHub user gzm55 opened a pull request: https://github.com/apache/spark/pull/154 remove staging dir when app quiting for yarn-cluster mode In yarn-cluster, the driver is actually running as the 'yarn' user. When posting jobs from other users, we need to give stagingDir a full path, so