Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/16189#discussion_r91257339
--- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala ---
@@ -432,6 +435,57 @@ private[spark] class Executor
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/16098
@srowen This is where try-with-resources is really useful in Java.
Utils.tryWithSafeFinallyAndFailureCallbacks is an attempt at doing that in
Scala, with mixed results.
I agree
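For reference, a minimal sketch of the shape of that helper (illustrative
only; the real method is private[spark] in org.apache.spark.util.Utils and
differs in details):

    def tryWithSafeFinallyAndFailureCallbacks[T](block: => T)
        (catchBlock: => Unit = (), finallyBlock: => Unit = ()): T = {
      var original: Throwable = null
      try {
        block
      } catch {
        case t: Throwable =>
          original = t
          try {
            catchBlock                              // e.g. abort the task attempt
          } catch {
            case s: Throwable => t.addSuppressed(s) // keep the original failure visible
          }
          throw t
      } finally {
        try {
          finallyBlock                              // e.g. close the writer; runs on both paths
        } catch {
          case s: Throwable =>
            if (original != null) original.addSuppressed(s) else throw s
        }
      }
    }

This approximates what Java's try-with-resources gives for free: the failure
from the body wins, and later failures are attached as suppressed exceptions.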
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/16098
That looks like a bug in ParquetRecordWriter (the contract of close() is
unambiguous) ... but then, I guess there is no point in fighting against buggy
code: we have to integrate with a lot
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/16098
close is idempotent - simply close it before
'committer.commitTask(taskContext)' and you should be done. The second close
in the finally will become a no-op in case of a successful commit, and will handle
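A sketch of the ordering being suggested, with hypothetical writer/committer
names (not the exact PR code):

    import org.apache.hadoop.mapreduce.{OutputCommitter, RecordWriter, TaskAttemptContext}

    def writeAndCommit[K, V](
        iter: Iterator[(K, V)],
        writer: RecordWriter[K, V],
        committer: OutputCommitter,
        taskContext: TaskAttemptContext): Unit = {
      try {
        iter.foreach { case (k, v) => writer.write(k, v) }
        writer.close(taskContext)       // finalize output *before* committing
        committer.commitTask(taskContext)
      } finally {
        writer.close(taskContext)       // no-op after a successful commit; cleanup on failure
      }
    }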
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15736
(Particularly) as the number of partitions increases, "if (a._1 != b._1)"
might be better for branch-prediction (bpt) reasons.
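An illustrative comparator in that spirit (hypothetical, but close to the
shape used when sorting (partitionId, key) pairs): with many partitions, most
compared pairs differ in partition id, so the inequality branch is highly
predictable.

    import java.util.Comparator

    def partitionKeyComparator[K](keyComparator: Comparator[K]): Comparator[(Int, K)] =
      new Comparator[(Int, K)] {
        override def compare(a: (Int, K), b: (Int, K)): Int = {
          if (a._1 != b._1) {
            Integer.compare(a._1, b._1)       // different partitions: partition id decides
          } else {
            keyComparator.compare(a._2, b._2) // same partition: fall back to the key ordering
          }
        }
      }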
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15861
@jiangxb1987 I did a single-pass review - particularly given the
similarities in both the codepaths and the classnames, I will need to go over
it again to ensure we don't miss anything
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15861#discussion_r87708046
--- Diff:
core/src/main/scala/org/apache/spark/internal/io/SparkHadoopWriter.scala ---
@@ -0,0 +1,408 @@
+/*
+ * Licensed to the Apache Software
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15861#discussion_r90119536
--- Diff:
core/src/main/scala/org/apache/spark/internal/io/SparkHadoopWriter.scala ---
@@ -0,0 +1,408 @@
+/*
+ * Licensed to the Apache Software
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15861#discussion_r90124359
--- Diff:
core/src/main/scala/org/apache/spark/internal/io/SparkHadoopWriter.scala ---
@@ -0,0 +1,408 @@
+/*
+ * Licensed to the Apache Software
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15861#discussion_r90121670
--- Diff:
core/src/main/scala/org/apache/spark/internal/io/SparkHadoopWriter.scala ---
@@ -0,0 +1,408 @@
+/*
+ * Licensed to the Apache Software
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15861#discussion_r90129259
--- Diff:
core/src/test/scala/org/apache/spark/rdd/PairRDDFunctionsSuite.scala ---
@@ -561,7 +561,7 @@ class PairRDDFunctionsSuite extends SparkFunSuite
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15861#discussion_r88077635
--- Diff:
core/src/main/scala/org/apache/spark/internal/io/SparkHadoopWriter.scala ---
@@ -0,0 +1,408 @@
+/*
+ * Licensed to the Apache Software
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15861#discussion_r90122251
--- Diff:
core/src/main/scala/org/apache/spark/internal/io/SparkHadoopWriter.scala ---
@@ -0,0 +1,408 @@
+/*
+ * Licensed to the Apache Software
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15861#discussion_r90121527
--- Diff:
core/src/main/scala/org/apache/spark/internal/io/SparkHadoopWriter.scala ---
@@ -0,0 +1,408 @@
+/*
+ * Licensed to the Apache Software
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15861#discussion_r90120144
--- Diff:
core/src/main/scala/org/apache/spark/internal/io/SparkHadoopWriter.scala ---
@@ -0,0 +1,408 @@
+/*
+ * Licensed to the Apache Software
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15861#discussion_r90127987
--- Diff:
core/src/main/scala/org/apache/spark/internal/io/SparkHadoopWriter.scala ---
@@ -0,0 +1,408 @@
+/*
+ * Licensed to the Apache Software
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15861#discussion_r88075283
--- Diff:
core/src/main/scala/org/apache/spark/internal/io/HadoopMapRedCommitProtocol.scala
---
@@ -0,0 +1,36 @@
+/*
+ * Licensed to the Apache
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15861#discussion_r90129155
--- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala
---
@@ -1089,66 +1064,10 @@ class PairRDDFunctions[K, V](self: RDD[(K, V
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15861#discussion_r90116879
--- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala
---
@@ -1016,11 +1013,6 @@ class PairRDDFunctions[K, V](self: RDD[(K, V
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/16038
Without understanding the specifics of the ML part here - won't the actual
impact of a large dense vector on Task 'bytes' be minimal at best?
We do compress the task binary; and 1B zeros should
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15861
@rxin I did see this PR; unfortunately it is a bit big and I am tied up
with other things - can't get to it for the next few days.
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15563
Merging into master
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15563
Looks good to me
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/14789
Interesting, thanks for clarifying.
The way I was looking at the change was: given we have the ability to plug
in custom credential providers now, we would need support for out-of-band
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15563
Jenkins, test this please
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15823
LGTM. Not at my laptop, would be great if you can merge @rxin, thanks.
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/14789
@vanzin You are right, this does look orthogonal to what you and Tom
described.
'Where' the renewal is happening is the difference - from within Spark or
from outside of Spark (oozie/from
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/14789#discussion_r87116199
--- Diff:
yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnSchedulerBackend.scala
---
@@ -189,6 +190,39 @@ private[spark] abstract class
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/14789
@tgravescs Interesting, so it means that there won't be a
--principal/--keytab, and what is being done in AMCredentialRenewer will be
done in the launcher itself? So the driver behaves just like
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/14789#discussion_r87072908
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
---
@@ -64,7 +64,7 @@ class
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/14789#discussion_r87075067
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala
---
@@ -138,6 +138,14 @@ class SparkHadoopUtil extends Logging
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/14789#discussion_r87078341
--- Diff:
yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnSchedulerBackend.scala
---
@@ -189,6 +190,39 @@ private[spark] abstract class
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/14789
Hi @tgravescs,
This is specifically for the case where new input sources/output
destinations are getting added after the driver has already started. Notebooks,
long-running streaming
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15563#discussion_r86874659
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -2569,7 +2572,8 @@ private[spark] class CallerContext
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15563#discussion_r86874432
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -2591,6 +2595,16 @@ private[spark] class CallerContext
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15563#discussion_r86874386
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -2582,7 +2586,7 @@ private[spark] class CallerContext(
val callerContext
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15563#discussion_r86873923
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -2523,6 +2524,8 @@ private[spark] class CallerContext(
taskId: Option[Long
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15618
LGTM, merging it into master.
Thx @HyukjinKwon
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15769#discussion_r86855791
--- Diff:
core/src/main/scala/org/apache/spark/internal/io/SparkHadoopMapReduceWriter.scala
---
@@ -0,0 +1,243 @@
+/*
+ * Licensed to the Apache
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15769#discussion_r86859239
--- Diff:
core/src/main/scala/org/apache/spark/internal/io/SparkHadoopMapReduceWriter.scala
---
@@ -0,0 +1,243 @@
+/*
+ * Licensed to the Apache
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15769#discussion_r86858760
--- Diff:
core/src/main/scala/org/apache/spark/internal/io/SparkHadoopMapReduceWriter.scala
---
@@ -0,0 +1,243 @@
+/*
+ * Licensed to the Apache
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15769#discussion_r86856937
--- Diff:
core/src/main/scala/org/apache/spark/internal/io/SparkHadoopMapReduceWriter.scala
---
@@ -0,0 +1,243 @@
+/*
+ * Licensed to the Apache
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15743
LGTM
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15743#discussion_r86841779
--- Diff: core/src/main/scala/org/apache/spark/ui/jobs/UIData.scala ---
@@ -197,8 +196,32 @@ private[spark] object UIData {
shuffleWriteMetrics
Github user mridulm closed the pull request at:
https://github.com/apache/spark/pull/15752
GitHub user mridulm opened a pull request:
https://github.com/apache/spark/pull/15752
[SPARK-18250] [SQL] Minor fixes to UTF8String
## What changes were proposed in this pull request?
A few minor changes to UTF8String, for issues observed in the code.
## How
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15743
Looks good to me, some minor queries and comments.
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15743#discussion_r86312868
--- Diff: core/src/main/scala/org/apache/spark/ui/jobs/UIData.scala ---
@@ -197,8 +196,32 @@ private[spark] object UIData {
shuffleWriteMetrics
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15743#discussion_r86312686
--- Diff: core/src/main/scala/org/apache/spark/ui/jobs/UIData.scala ---
@@ -197,8 +196,32 @@ private[spark] object UIData {
shuffleWriteMetrics
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15743#discussion_r86312418
--- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
---
@@ -1089,7 +1089,7 @@ class DAGScheduler(
// To avoid UI cruft
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15743#discussion_r86311221
--- Diff: core/src/main/scala/org/apache/spark/util/JsonProtocol.scala ---
@@ -885,8 +885,8 @@ private[spark] object JsonProtocol {
if (json
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15743#discussion_r86310825
--- Diff: core/src/main/scala/org/apache/spark/util/JsonProtocol.scala ---
@@ -885,8 +885,8 @@ private[spark] object JsonProtocol {
if (json
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15743
+1 on UseStringDeduplication! Though it is JDK 8 specific
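One way to turn it on for executors (the options shown are standard HotSpot
flags; UseStringDeduplication requires G1 GC and JDK 8u20 or later):

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      .set("spark.executor.extraJavaOptions",
        "-XX:+UseG1GC -XX:+UseStringDeduplication")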
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15741#discussion_r86296072
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/OpenHashSet.scala ---
@@ -271,8 +271,12 @@ class OpenHashSet[@specialized(Long, Int) T
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15731
Thx for clarifying @rxin, looks good to me.
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15731
I am not sure I follow - why do we need to redefine OutputCommitter from
mapreduce with a slightly differing API?
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15713
@andrewcraik If we are simply delegating to ArrayBuffer, it makes sense to
remove the class - it is an additional object created for each key being
aggregated.
An alternative would be to implement
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15736#discussion_r86173509
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/PartitionedAppendOnlyMap.scala
---
@@ -30,7 +30,21 @@ private[spark] class
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15722
I agree with @jiexiong's analysis: @davies, to support the scenario you
mentioned, we should not be freeing the pages.
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15722#discussion_r86046017
--- Diff:
core/src/main/java/org/apache/spark/unsafe/map/BytesToBytesMap.java ---
@@ -903,11 +906,12 @@ public void reset() {
numKeys = 0
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15553
Closing the PR given the pushback on committing it.
Github user mridulm closed the pull request at:
https://github.com/apache/spark/pull/15553
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15713
+CC @rxin, who can elaborate better.
The reason, IIRC, why the first two variables are inlined has to do with the
usually low cardinality of the buffer - it effectively comes for "free" due
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/13599#discussion_r85859191
--- Diff:
core/src/main/scala/org/apache/spark/api/python/PythonWorkerFactory.scala ---
@@ -55,6 +65,11 @@ private[spark] class PythonWorkerFactory
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/13599#discussion_r85860115
--- Diff:
core/src/main/scala/org/apache/spark/api/python/PythonWorkerFactory.scala ---
@@ -69,6 +84,66 @@ private[spark] class PythonWorkerFactory
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/13599#discussion_r85872283
--- Diff:
core/src/main/scala/org/apache/spark/api/python/PythonWorkerFactory.scala ---
@@ -307,6 +387,7 @@ private[spark] class PythonWorkerFactory
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/13599#discussion_r85859768
--- Diff:
core/src/main/scala/org/apache/spark/api/python/PythonWorkerFactory.scala ---
@@ -69,6 +84,67 @@ private[spark] class PythonWorkerFactory
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/13599#discussion_r85859164
--- Diff:
core/src/main/scala/org/apache/spark/api/python/PythonWorkerFactory.scala ---
@@ -46,6 +50,12 @@ private[spark] class PythonWorkerFactory
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/13599#discussion_r85859942
--- Diff:
core/src/main/scala/org/apache/spark/api/python/PythonWorkerFactory.scala ---
@@ -69,6 +84,66 @@ private[spark] class PythonWorkerFactory
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/13599#discussion_r85859906
--- Diff:
core/src/main/scala/org/apache/spark/api/python/PythonWorkerFactory.scala ---
@@ -69,6 +84,66 @@ private[spark] class PythonWorkerFactory
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15669
I agree with Tom: for YARN cluster mode we should rely on the distributed cache
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15553
Thanks for the comments.
Specifically, for our use case this is for automated full builds, where
there is a container teardown after each (so that builds don't pollute each
other: new
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15618#discussion_r85436193
--- Diff:
streaming/src/main/scala/org/apache/spark/streaming/scheduler/ReceiverTracker.scala
---
@@ -194,10 +194,13 @@ class ReceiverTracker(ssc
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15618#discussion_r85435956
--- Diff:
core/src/test/scala/org/apache/spark/scheduler/TaskResultGetterSuite.scala ---
@@ -209,7 +209,8 @@ class TaskResultGetterSuite extends
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15553
Since I don't have much experience with the build system, I will defer to
your opinion about it.
Having said that, I want to know what can be done to add support for
avoiding test compilation
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15553
I will leave it open for a couple more days to solicit
suggestions/comments: if there is a way to reduce the build complexity of the
change, please do let me know: would be great
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15632
@davies that check looks wrong - behaviorally we might not be hitting it
due to other preconditions enforced; worrying to see this
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15612
@rxin I am not sure I like this better than the previous version.
We have traded an RDD reference for a boolean, and enforced the constraint
that the 'major' RDD must be the first to get zipped against
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15612#discussion_r85227464
--- Diff:
core/src/main/scala/org/apache/spark/rdd/ZippedPartitionsRDD.scala ---
@@ -45,7 +45,8 @@ private[spark] class ZippedPartitionsPartition
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15618
@HyukjinKwon So the idea is that you acquire the resources required and
don't need to track them, by wrapping them in Utils.tryWithResource (similar
to memory management in the JVM).
As an example
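A generic usage sketch of that pattern, with an illustrative resource
(callable from code inside the org.apache.spark package, since Utils is
private[spark]):

    import java.io.{BufferedInputStream, FileInputStream}
    import org.apache.spark.util.Utils

    // The stream is closed for you, even if the body throws.
    val firstByte = Utils.tryWithResource(
        new BufferedInputStream(new FileInputStream("/tmp/some-file"))) { in =>
      in.read()
    }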
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15632
@srowen The last update to the upstream library was 1.5+ years ago. Is it
still maintained?
You are right, pushing it there and getting it released might be good -
if possible
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15632
Ah, nice catch!
LGTM, can you merge it when the build is done? Thx.
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15618
There are a bunch of methods in Utils which apply nicely to this
PR (see the sketch below):
* Utils.tryWithSafeFinally
* Utils.tryWithResource, etc.
They also handle exception suppression, etc
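For context, a minimal sketch of the Utils.tryWithResource shape
(illustrative; the real helper is private[spark] in
org.apache.spark.util.Utils):

    import java.io.Closeable

    def tryWithResource[R <: Closeable, T](createResource: => R)(f: R => T): T = {
      val resource = createResource
      try {
        f(resource)
      } finally {
        resource.close()   // always released, even if f throws
      }
    }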
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15612#discussion_r84742854
--- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala ---
@@ -848,39 +849,44 @@ abstract class RDD[T: ClassTag](
* of elements in each
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15612#discussion_r84743321
--- Diff:
core/src/main/scala/org/apache/spark/rdd/ZippedPartitionsRDD.scala ---
@@ -58,10 +63,14 @@ private[spark] abstract class
ZippedPartitionsBaseRDD
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15553
Jenkins, test this please
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15553
@srowen This explicitly supports maven.test.skip=true - the rationale
for using it can be project-specific (generating clean builds for various
profile combinations, internal release builds from
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15553
@srowen -Dmaven.test.skip=true is the documented way in maven to skip test
compilation
(http://maven.apache.org/surefire/maven-surefire-plugin/examples/skipping-test.html)
which we
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15553#discussion_r84533967
--- Diff: sql/hive-thriftserver/pom.xml ---
@@ -41,11 +41,8 @@
${project.version}
- org.apache.spark
- spark
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15553
Changes done, please do let me know of any other comments, thanks!
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15579
Btw, curious if you have actually tested this in YARN - I have a feeling it
won't work.
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15553#discussion_r84374076
--- Diff: sql/hive-thriftserver/pom.xml ---
@@ -41,11 +41,8 @@
${project.version}
- org.apache.spark
- spark
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15541#discussion_r84225583
--- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskAssigner.scala
---
@@ -0,0 +1,226 @@
+/*
+ * Licensed to the Apache Software Foundation
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15481
Ah! Apologies, I got confused. Yes, I agree, that is a better approach.
It also means we can get rid of the RemoveExecutor pattern match from
receive, right? As it stands now, that looks
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15481
@zsxwing I think the issue is that case RemoveExecutor() is not identical
to what exists in receiveAndReply.
Any reason
'executorDataMap.get(executorId).foreach(_.executorEndpoint.send
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15481
@zsxwing To minimize the scope of the synchronized block.
The way @scwf has it now, the synchronized block is limited to duplicating
the key and setting some state.
The remaining work can happen outside of the lock
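An illustrative sketch of that pattern (hypothetical state, not the scheduler
code): mutate shared state under the lock, do the slower follow-up work
outside it.

    import scala.collection.mutable

    object Example {
      private val lock = new Object
      private val executors = mutable.Map.empty[String, String]

      def removeExecutor(executorId: String): Unit = {
        // Only the shared-state mutation is synchronized.
        val removed = lock.synchronized {
          executors.remove(executorId)
        }
        // Notifications, RPCs, logging, etc. happen outside the lock.
        removed.foreach(data => println(s"removed executor $executorId: $data"))
      }
    }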
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15541#discussion_r84160226
--- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskAssigner.scala
---
@@ -0,0 +1,233 @@
+/*
+ * Licensed to the Apache Software Foundation
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15541#discussion_r84154979
--- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskAssigner.scala
---
@@ -0,0 +1,233 @@
+/*
+ * Licensed to the Apache Software Foundation
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15553#discussion_r84081078
--- Diff: sql/hive-thriftserver/pom.xml ---
@@ -41,11 +41,8 @@
${project.version}
- org.apache.spark
- spark
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15553#discussion_r84057228
--- Diff: sql/hive-thriftserver/pom.xml ---
@@ -41,11 +41,8 @@
${project.version}
- org.apache.spark
- spark