Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3512#issuecomment-64971933
This has come up before, and I thought we resolved that the hadoop-2.4
profile would be used for all Hadoop versions above 2.5, until a Hadoop version
changes dependencies
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3484#issuecomment-64699567
If this is behavior that we encourage users to avoid, I would think that
defaulting to failure when System.exit is called would be preferable. Also,
I'm guessing
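A minimal sketch of the fail-on-exit idea, assuming a JVM SecurityManager
trap; this is illustrative only, not the mechanism the PR itself uses:

    import java.security.Permission

    // Sketch only: surface any call to System.exit as a failure by default.
    object ExitTrap {
      def install(): Unit = {
        System.setSecurityManager(new SecurityManager {
          // Permit everything else; we only care about exit attempts.
          override def checkPermission(perm: Permission): Unit = ()
          override def checkExit(status: Int): Unit =
            throw new SecurityException(s"System.exit($status) intercepted")
        })
      }
    }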
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/3485#discussion_r20958933
--- Diff: pom.xml ---
@@ -1201,6 +1196,18 @@
</dependencies>
</profile>
+<!-- External projects are not built unless this flag
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3485#issuecomment-64700676
external/kafka will still end up being built without the flag, right?
Also, will this not make the default build fail, because the examples
depend on the external
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3485#issuecomment-64728995
I missed the context for why we would just put mqtt behind the flag.
Something in its dependency graph is breaking the build?
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/3487#discussion_r20974521
--- Diff: external/mqtt/pom.xml ---
@@ -43,8 +43,8 @@
</dependency>
<dependency>
<groupId>org.eclipse.paho</groupId>
GitHub user sryza opened a pull request:
https://github.com/apache/spark/pull/3471
SPARK-3779. yarn spark.yarn.applicationMaster.waitTries config should be...
... changed to a time period
You can merge this pull request into a Git repository by running:
$ git pull https
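A sketch of the time-period style the title calls for. The property name
spark.yarn.am.waitTime is taken from later Spark releases and is assumed
here to be where this change landed; the value is arbitrary:

    import org.apache.spark.SparkConf

    object WaitTimeExample {
      // Assumed property name and example value: a wait expressed as a time
      // period rather than a number of tries.
      val conf = new SparkConf().set("spark.yarn.am.waitTime", "100s")
    }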
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3471#issuecomment-64516335
@tgravescs
GitHub user sryza opened a pull request:
https://github.com/apache/spark/pull/3426
SPARK-4567. Make SparkJobInfo and SparkStageInfo serializable
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/sryza/spark sandy-spark-4567
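The change the title describes amounts to marking the status-API holder
classes Serializable; an illustrative shape with hypothetical fields, not
the real SparkJobInfo definition:

    // Hypothetical fields; only the `extends Serializable` part mirrors
    // what the title describes.
    class JobInfoSketch(
        val jobId: Int,
        val stageIds: Array[Int],
        val status: String) extends Serializable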
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/3438#discussion_r20845166
--- Diff: core/src/test/scala/org/apache/spark/rdd/RDDSuite.scala ---
@@ -17,6 +17,7 @@
package org.apache.spark.rdd
+import
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3438#issuecomment-64317326
The main changes we implemented here are:
* When a shuffle operation has a key ordering, sort records by key on the
map side in addition to sorting by partition
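A sketch of the first bullet, assuming map-side records are tagged with
their target partition id: compare by partition first, then by key when a
key ordering is defined.

    object ShuffleOrderingSketch {
      // Sketch only: order map-side records by (partition, key); without a
      // key ordering, records within a partition keep an arbitrary order.
      def recordOrdering[K](keyOrdering: Option[Ordering[K]]): Ordering[(Int, K)] =
        new Ordering[(Int, K)] {
          def compare(a: (Int, K), b: (Int, K)): Int = {
            val byPartition = java.lang.Integer.compare(a._1, b._1)
            if (byPartition != 0) byPartition
            else keyOrdering.map(_.compare(a._2, b._2)).getOrElse(0)
          }
        }
    }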
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3403#issuecomment-64018115
preferredNodeLocalityData is currently broken (see SPARK-2089), and we're
discussing changing the API for it. I think it would be best to hold off on
this change until
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/3409#discussion_r20752652
--- Diff: conf/spark-env.sh.template ---
@@ -40,6 +40,7 @@
# - SPARK_WORKER_OPTS, to set config properties only for the worker (e.g.
-Dx=y
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3409#issuecomment-64062102
Adding the ability to specify java options for the ExecutorLauncher AM in
yarn-client mode sounds reasonable. I think we should use a config property
name that's more
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3390#issuecomment-63915538
+1
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/3341#discussion_r20563523
--- Diff:
repl/scala-2.11/src/main/scala/org/apache/spark/repl/SparkILoop.scala ---
@@ -61,11 +61,14 @@ class SparkILoop(in0: Option[BufferedReader], protected
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/3360#discussion_r20563746
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -536,7 +536,7 @@ private[spark] class TaskSetManager
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3360#issuecomment-63609847
Is maxResultSize documented anywhere? I couldn't find it. If not, can we
add it to the configuration page?
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/3341#discussion_r20564849
--- Diff:
repl/scala-2.11/src/main/scala/org/apache/spark/repl/SparkILoop.scala ---
@@ -61,11 +61,14 @@ class SparkILoop(in0: Option[BufferedReader], protected
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3360#issuecomment-63720314
I think just referencing the property by its full name and allowing the
user to look it up should be sufficient. Recommending boosting the limit is not
right in all
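For reference, the property's full name is spark.driver.maxResultSize; a
minimal usage sketch with an arbitrary value (when the limit is exceeded,
the job is aborted rather than the results being truncated):

    import org.apache.spark.SparkConf

    object MaxResultSizeExample {
      // "2g" is an arbitrary example value for the limit discussed above.
      val conf = new SparkConf().set("spark.driver.maxResultSize", "2g")
    }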
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/3372#discussion_r20628831
--- Diff:
core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala ---
@@ -40,41 +40,108 @@ class JobProgressListener(conf: SparkConf) extends
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/3372#discussion_r20628990
--- Diff:
core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala ---
@@ -40,41 +40,108 @@ class JobProgressListener(conf: SparkConf) extends
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3378#issuecomment-63765566
This seems like probably a great idea. Do you know what the overhead of
including a classtag is? Does it mean an extra pointer per object?
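On the overhead question: a ClassTag is typically captured once as an
implicit constructor field of the container, not stored per element. A
sketch with illustrative names:

    import scala.reflect.ClassTag

    // The tag is one extra field on the container (captured via the context
    // bound), which allows a properly-typed backing array; the elements
    // themselves carry no extra pointer.
    class TaggedBuffer[T: ClassTag](capacity: Int = 16) {
      private val buf = new Array[T](capacity)
      def elementType: ClassTag[T] = implicitly[ClassTag[T]]
    }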
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3361#issuecomment-63766299
+1
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/3339#discussion_r20490017
--- Diff: docs/building-spark.md ---
@@ -113,7 +113,17 @@ mvn -Pyarn -Phive -Phive-thriftserver-0.12.0
-Phadoop-2.4 -Dhadoop.version=2.4.0
{% endhighlight
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/3339#discussion_r20490313
--- Diff: docs/building-spark.md ---
@@ -113,7 +113,17 @@ mvn -Pyarn -Phive -Phive-thriftserver-0.12.0
-Phadoop-2.4 -Dhadoop.version=2.4.0
{% endhighlight
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/3339#discussion_r20490479
--- Diff: docs/building-spark.md ---
@@ -113,7 +113,17 @@ mvn -Pyarn -Phive -Phive-thriftserver-0.12.0
-Phadoop-2.4 -Dhadoop.version=2.4.0
{% endhighlight
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/3341#discussion_r20523279
--- Diff:
repl/scala-2.11/src/main/scala/org/apache/spark/repl/SparkILoop.scala ---
@@ -61,11 +61,14 @@ class SparkILoop(in0: Option[BufferedReader], protected
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/3342#discussion_r20523465
--- Diff: project/SparkBuild.scala ---
@@ -101,14 +101,10 @@ object SparkBuild extends PomBuild {
v.split("(\\s+|,)").filterNot(_.isEmpty).map
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/3342#discussion_r20523647
--- Diff: project/SparkBuild.scala ---
@@ -101,14 +101,10 @@ object SparkBuild extends PomBuild {
v.split("(\\s+|,)").filterNot(_.isEmpty).map
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/3337#discussion_r20523815
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -1805,6 +1805,9 @@ object SparkContext extends Logging {
def localCpuCount
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3322#issuecomment-63551793
Sounds good. Updated the patch.
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/3341#discussion_r20557393
--- Diff:
repl/scala-2.11/src/main/scala/org/apache/spark/repl/SparkILoop.scala ---
@@ -61,11 +61,14 @@ class SparkILoop(in0: Option[BufferedReader], protected
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3302#issuecomment-63274293
As far as I can tell, `elementsRead` isn't used for anything. Would we be
able to just remove it entirely?
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3302#issuecomment-63274353
Also, mind filing a JIRA for this, or, if one already exists, including the
name in the title here?
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3215#issuecomment-63275953
The patch does remove the yarn/stable directory.
Updated patch includes the doc fix. Currently testing it on a
pseudo-distributed cluster.
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3215#issuecomment-63339779
Successfully ran spark-shell in yarn-client mode and an app in yarn-cluster
mode.
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/3310#discussion_r20450826
--- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala ---
@@ -1309,7 +1309,7 @@ abstract class RDD[T: ClassTag](
def debugSelf (rdd: RDD
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/3215#discussion_r20454469
--- Diff: docs/building-spark.md ---
@@ -95,8 +74,11 @@ mvn -Pyarn -Phadoop-2.3 -Dhadoop.version=2.3.0
-DskipTests clean package
# Apache Hadoop 2.4.X
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/3310#discussion_r20454820
--- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala ---
@@ -1309,7 +1309,7 @@ abstract class RDD[T: ClassTag](
def debugSelf (rdd: RDD
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3302#issuecomment-63355366
That makes sense; my IDE for some reason didn't show me the usage in
Spillable.scala. In that case, this change makes sense.
Spilling is also based on the amount
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3302#issuecomment-63362140
I don't entirely understand that line of argument. Why would we want to
place a lower bound if the data structure is pushing the memory threshold? I
filed https
GitHub user sryza opened a pull request:
https://github.com/apache/spark/pull/3322
SPARK-4457. Document how to build for Hadoop versions greater than 2.4
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/sryza/spark sandy-spark
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/3215#discussion_r20459377
--- Diff: docs/building-spark.md ---
@@ -95,8 +74,11 @@ mvn -Pyarn -Phadoop-2.3 -Dhadoop.version=2.3.0
-DskipTests clean package
# Apache Hadoop 2.4.X
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3310#issuecomment-63364931
Not a big deal at all, but was there some reason my other nit didn't apply?
One other thing: debugSelf returns s"$rdd [$persistence]". Unless this is
some Scala
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3322#issuecomment-63384470
Updated the patch to warn against Hadoop versions greater than 2.5.
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3327#issuecomment-63392980
`partitionBy` as well?
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3302#issuecomment-63399594
after a while we unconditionally try to spill every 32 elements
regardless of whether the in-memory buffer has exceeded the spill threshold.
The code still
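A simplified sketch of a threshold-gated spill check of the kind under
discussion here; names and values are illustrative, and this is not the
exact Spillable.scala code:

    // Consider spilling only every 32 elements read, and then only when the
    // estimated in-memory size has reached the threshold.
    class SpillCheck(threshold: Long) {
      private var elementsRead = 0L
      def maybeSpill(estimatedSize: Long): Boolean = {
        elementsRead += 1
        elementsRead % 32 == 0 && estimatedSize >= threshold
      }
    }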
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/3239#discussion_r20347403
--- Diff: docs/building-spark.md ---
@@ -113,9 +113,9 @@ mvn -Pyarn -Phive -Phive-thriftserver-0.12.0
-Phadoop-2.4 -Dhadoop.version=2.4.0
{% endhighlight
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/3239#discussion_r20348576
--- Diff: docs/building-spark.md ---
@@ -113,9 +113,9 @@ mvn -Pyarn -Phive -Phive-thriftserver-0.12.0
-Phadoop-2.4 -Dhadoop.version=2.4.0
{% endhighlight
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/3204#discussion_r20390334
--- Diff:
core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala ---
@@ -360,6 +382,10 @@ private[spark] class ExecutorAllocationManager(sc
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3204#issuecomment-63138957
Cool, just added that as well
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/2490#issuecomment-63148833
Thanks for the review @Ishiihara. Updated the PR to clarify these points.
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3239#issuecomment-62869875
I had some additional conversation with @pwendell and we agreed that
SPARK-4376 (putting external modules behind maven profiles) is worthwhile, so
this PR implements both
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3239#issuecomment-62871767
I think Sean is right. Have half a fix for this but am going to go to bed
now.
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3239#issuecomment-62934308
@ScrapCodes @pwendell I just tried mvn package -Pscala-2.11 without my
patch and still got errors:
The following artifacts could not be resolved
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/3239#discussion_r20329299
--- Diff: sql/catalyst/pom.xml ---
@@ -60,6 +60,11 @@
<artifactId>scalacheck_${scala.binary.version}</artifactId>
<scope>test</scope>
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3239#issuecomment-62990268
Here's a patch with a simpler approach that relies on @vanzin's suggestion
of a -Dscala-2.11 property. I still like the idea of putting the external
projects
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3204#issuecomment-63005126
Updated patch addresses review comments
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3239#issuecomment-63020796
Updated the doc - it seems like there's actually not a ton more to say, but
let me know if I missed anything.
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3215#issuecomment-62763548
I have a whole set of simplifying changes that I want to go in (e.g.
YARN-1714), but thought it would probably be good to break things up a bit for
easier review
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3215#issuecomment-62792878
@tgravescs
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/3215#discussion_r20251737
--- Diff:
yarn/src/main/scala/org/apache/spark/deploy/yarn/ClientArguments.scala ---
@@ -178,21 +178,25 @@ private[spark] class ClientArguments(args:
Array
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/3120#discussion_r20262822
--- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala ---
@@ -252,7 +258,7 @@ class HadoopRDD[K, V](
bytesReadCallback.isDefined
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/3120#discussion_r20263076
--- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala ---
@@ -252,7 +258,7 @@ class HadoopRDD[K, V](
bytesReadCallback.isDefined
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3215#issuecomment-62822018
@tgravescs it would delay other PRs, but not a huge deal if you think it's
too soon.
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3228#issuecomment-62850540
That still requires the user to set scala.version, right?
GitHub user sryza opened a pull request:
https://github.com/apache/spark/pull/3239
SPARK-4375. maven rejiggering
It seems like the winds might have moved away from this approach, but I
wanted to post the PR anyway because I got it working and to show what it
would look like.
You
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3239#issuecomment-62852398
Right, this was before we came to that decision. Will update this to just
do Kafka.
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/3204#discussion_r20137693
--- Diff:
core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala ---
@@ -217,14 +223,24 @@ private[spark] class ExecutorAllocationManager(sc
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3000#issuecomment-62588894
Just noticed this. I'd been working on something similar a little while ago
on SPARK-1216 / #304. One difference is that I had aimed to accept categorical
features
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/3204#discussion_r20172344
--- Diff:
core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala ---
@@ -110,6 +110,12 @@ private[spark] class ExecutorAllocationManager(sc
GitHub user sryza opened a pull request:
https://github.com/apache/spark/pull/3215
SPARK-4338. Ditch yarn-alpha.
Sorry if this is a little premature with 1.2 still not out the door, but it
will make other work like SPARK-4136 and SPARK-2089 a lot easier.
You can merge this pull
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/2744#issuecomment-62662666
@jdanbrown that seems reasonable. Mind filing a JIRA for it?
GitHub user sryza opened a pull request:
https://github.com/apache/spark/pull/3198
SPARK-3461. Support external groupByKey using repartitionAndSortWithinPa...
...rtitions
This is a WIP. It still needs tests and probably a better name for the
transformation, but I wanted
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/1977#discussion_r20132241
--- Diff: python/pyspark/shuffle.py ---
@@ -520,6 +505,295 @@ def sorted(self, iterator, key=None, reverse=False):
return heapq.merge(chunks, key
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3198#issuecomment-62502932
Will take a look at #1977.
I believe that the most common uses for groupByKey, like writing out
partitioned tables, involve iterating over each group a single time
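That single-pass pattern can be sketched on top of
repartitionAndSortWithinPartitions: sort by key within partitions, then
walk each partition once, handing each group's values out as a one-shot
iterator. foreachGroup and the drain step are assumptions of this sketch,
not the PR's API:

    import scala.reflect.ClassTag
    import org.apache.spark.HashPartitioner
    import org.apache.spark.SparkContext._ // pair-RDD implicits on older Spark
    import org.apache.spark.rdd.RDD

    object ExternalGroupSketch {
      // Sketch of an "external" group iteration: no group is materialized;
      // each group's values are exposed as a single-pass iterator.
      def foreachGroup[K: Ordering: ClassTag, V: ClassTag](
          rdd: RDD[(K, V)], numPartitions: Int)(f: (K, Iterator[V]) => Unit): Unit = {
        rdd.repartitionAndSortWithinPartitions(new HashPartitioner(numPartitions))
          .foreachPartition { iter =>
            val buffered = iter.buffered
            while (buffered.hasNext) {
              val key = buffered.head._1
              val group = new Iterator[V] {
                def hasNext = buffered.hasNext && buffered.head._1 == key
                def next() = buffered.next()._2
              }
              f(key, group)
              while (group.hasNext) group.next() // drain anything f left unread
            }
          }
      }
    }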
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/1977#discussion_r20132424
--- Diff: python/pyspark/rdd.py ---
@@ -1579,21 +1577,34 @@ def createZero():
return self.combineByKey(lambda v: func(createZero(), v), func
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3198#issuecomment-62508663
All good points. Will close this for now.
Longer term, it worries me that Spark wouldn't be able to provide an
operator that gives comparable performance to what
Github user sryza closed the pull request at:
https://github.com/apache/spark/pull/3198
GitHub user sryza opened a pull request:
https://github.com/apache/spark/pull/3204
SPARK-4214. With dynamic allocation, avoid outstanding requests for more...
... executors than pending tasks need.
WIP. Still need to add and fix tests.
You can merge this pull request
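The bound the title describes reduces to simple arithmetic: keep no more
outstanding requests than the pending tasks could fill. All names here,
including tasksPerExecutor, are assumptions of this sketch rather than the
PR's code:

    object RequestCapSketch {
      // Sketch: cap outstanding executor requests by what pending tasks need.
      def cappedOutstandingRequests(
          pendingTasks: Int,
          runningExecutors: Int,
          outstandingRequests: Int,
          tasksPerExecutor: Int): Int = {
        val executorsNeeded =
          (pendingTasks + tasksPerExecutor - 1) / tasksPerExecutor // ceiling
        math.min(outstandingRequests, math.max(0, executorsNeeded - runningExecutors))
      }
    }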
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/3107#discussion_r20060629
--- Diff: docs/configuration.md ---
@@ -556,6 +556,9 @@ Apart from these, the following properties are also
available, and may be useful
tr
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/3107#discussion_r20060674
--- Diff: docs/configuration.md ---
@@ -563,8 +566,8 @@ Apart from these, the following properties are also
available, and may be useful
/ul
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3107#issuecomment-62339705
Test failure looks unrelated
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/2968#discussion_r20055135
--- Diff:
core/src/test/scala/org/apache/spark/metrics/InputOutputMetricsSuite.scala ---
@@ -73,4 +78,32 @@ class InputMetricsSuite extends FunSuite
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/2968#discussion_r19931259
--- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala ---
@@ -249,7 +248,7 @@ class HadoopRDD[K, V](
bytesReadCallback.isDefined
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/2968#issuecomment-61941116
Thanks for taking a look @kayousterhout . I'll add in an output type.
Github user sryza closed the pull request at:
https://github.com/apache/spark/pull/655
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/304#issuecomment-62084497
Definitely. Have been waiting for the Pipelines and Parameters PR to go in.
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3148#issuecomment-62097234
+1. Was about to suggest such a change.
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3099#issuecomment-61853890
If maxIter is a constant, would it be clearer to use MAX_ITER?
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3099#issuecomment-61858097
Both the reference and the class internals are immutable, no? Typical Java
conventions would put such a variable in all caps, though maybe in Scala it's
different
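For context, the naming split at issue: Java constants are conventionally
ALL_CAPS, while official Scala style favors UpperCamelCase for constants.
Both spellings, with a made-up value:

    object NamingSketch {
      val MAX_ITER = 100 // Java-style constant naming
      val MaxIter = 100  // conventional Scala constant naming
    }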
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/3099#discussion_r19897343
--- Diff: mllib/src/main/scala/org/apache/spark/ml/parameters.scala ---
@@ -0,0 +1,267 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/3120#discussion_r19922230
--- Diff:
core/src/test/scala/org/apache/spark/metrics/InputMetricsSuite.scala ---
@@ -17,8 +17,11 @@
package org.apache.spark.metrics
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/3120#discussion_r19922310
--- Diff: core/src/main/scala/org/apache/spark/CacheManager.scala ---
@@ -44,7 +44,14 @@ private[spark] class CacheManager(blockManager:
BlockManager) extends
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/3120#discussion_r19922345
--- Diff: core/src/main/scala/org/apache/spark/rdd/NewHadoopRDD.scala ---
@@ -173,12 +179,12 @@ class NewHadoopRDD[K, V](
// Update metrics
Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/3120#issuecomment-61917343
Had a few nitpicks. Otherwise, this looks good to me.
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/3082#discussion_r19834936
--- Diff:
core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala ---
@@ -124,6 +125,22 @@ private[spark] class ExecutorAllocationManager(sc
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/3082#discussion_r19836489
--- Diff:
network/yarn/src/main/java/org/apache/spark/network/yarn/YarnShuffleService.java
---
@@ -0,0 +1,112 @@
+/*
+ * Licensed to the Apache
Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/3082#discussion_r19836790
--- Diff:
network/yarn/src/main/java/org/apache/spark/network/yarn/YarnShuffleService.java
---
@@ -0,0 +1,112 @@
+/*
+ * Licensed to the Apache