Github user juanrh commented on the issue:
https://github.com/apache/spark/pull/19267
@tgravescs I was finally able to contribute
https://github.com/apache/hadoop/pull/289 which solves
[YARN-6483](https://issues.apache.org/jira/browse/YARN-6483). With that patch,
and the code in
Github user juanrh commented on the issue:
https://github.com/apache/spark/pull/19267
@tgravescs I have opened https://github.com/apache/hadoop/pull/289 with the
YARN changes to get a notification in the AM when a node transitions to
DECOMMISSIONING. This should already be useful for
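The idea behind that notification can be sketched with a minimal, stdlib-only stand-in. `NodeState` and `NodeReport` below are simplified placeholders for the Hadoop YARN classes of the same names, and `onNodesUpdated` mirrors the shape of the `AMRMClientAsync` callback; this is an illustration of the pattern, not the actual patch.

```java
import java.util.ArrayList;
import java.util.List;

public class DecommissionWatcher {
    // Simplified stand-in for org.apache.hadoop.yarn.api.records.NodeState
    enum NodeState { RUNNING, DECOMMISSIONING, DECOMMISSIONED }

    // Simplified stand-in for a YARN NodeReport
    static class NodeReport {
        final String host;
        final NodeState state;
        NodeReport(String host, NodeState state) { this.host = host; this.state = state; }
    }

    private final List<String> blacklisted = new ArrayList<>();

    // Mirrors the shape of AMRMClientAsync's onNodesUpdated callback:
    // when a node is reported as DECOMMISSIONING, stop scheduling new work on it.
    void onNodesUpdated(List<NodeReport> updated) {
        for (NodeReport r : updated) {
            if (r.state == NodeState.DECOMMISSIONING && !blacklisted.contains(r.host)) {
                blacklisted.add(r.host);
            }
        }
    }

    public static void main(String[] args) {
        DecommissionWatcher w = new DecommissionWatcher();
        w.onNodesUpdated(List.of(
            new NodeReport("node1", NodeState.DECOMMISSIONING),
            new NodeReport("node2", NodeState.RUNNING)));
        System.out.println(w.blacklisted); // only the decommissioning node is tracked
    }
}
```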
Github user juanrh commented on the issue:
https://github.com/apache/spark/pull/19583
Even though we only wait 5 seconds by default between retries, the retries
themselves can take a lot of time. For example, in a simple word count job where
a node is lost during stage 1.0, I have seen
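A back-of-the-envelope calculation shows why the wall-clock cost is dominated by the attempts, not the waits. The sketch below assumes Spark's documented defaults (`spark.shuffle.io.maxRetries` = 3, `spark.shuffle.io.retryWait` = 5s, `spark.network.timeout` = 120s); the numbers are illustrative, not measurements from the job described above.

```java
public class RetryWaitEstimate {
    public static void main(String[] args) {
        int maxRetries = 3;        // spark.shuffle.io.maxRetries default
        int retryWaitSec = 5;      // spark.shuffle.io.retryWait default
        int connTimeoutSec = 120;  // spark.network.timeout default

        // Each failed attempt can block for up to the network timeout
        // before the 5-second retry wait even starts, so the waits are
        // a small fraction of the total stall.
        int worstCaseSec = maxRetries * (connTimeoutSec + retryWaitSec);
        System.out.println("worst case per fetch: " + worstCaseSec + "s");
    }
}
```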
Github user juanrh commented on a diff in the pull request:
https://github.com/apache/spark/pull/19583#discussion_r148079448
--- Diff: core/src/test/scala/org/apache/spark/HeartbeatReceiverSuite.scala
---
@@ -225,6 +270,7 @@ class HeartbeatReceiverSuite
Matchers.eq
Github user juanrh commented on a diff in the pull request:
https://github.com/apache/spark/pull/19583#discussion_r148078847
--- Diff: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala ---
@@ -51,7 +51,26 @@ private case class ExecutorRegistered(executorId: String
Github user juanrh commented on a diff in the pull request:
https://github.com/apache/spark/pull/19583#discussion_r148078337
--- Diff:
core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala
---
@@ -241,6 +243,21 @@ final class ShuffleBlockFetcherIterator
Github user juanrh commented on a diff in the pull request:
https://github.com/apache/spark/pull/19583#discussion_r148075569
--- Diff: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala ---
@@ -51,7 +51,26 @@ private case class ExecutorRegistered(executorId: String
GitHub user juanrh opened a pull request:
https://github.com/apache/spark/pull/19590
[WIP][SPARK-22148][CORE] TaskSetManager.abortIfCompletelyBlacklisted should
not abort when all current executors are blacklisted but dynamic allocation is
enabled
## What changes were proposed in
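The guard described in the PR title can be sketched as a small decision function: when dynamic allocation is on, having every currently registered executor blacklisted need not be fatal, because fresh executors can still be requested. The names below are illustrative, not Spark's actual `TaskSetManager` code.

```java
import java.util.Set;

public class BlacklistAbortCheck {
    static boolean shouldAbort(Set<String> executors, Set<String> blacklisted,
                               boolean dynamicAllocationEnabled) {
        boolean allBlacklisted = !executors.isEmpty() && blacklisted.containsAll(executors);
        // With dynamic allocation, wait for fresh executors instead of aborting.
        return allBlacklisted && !dynamicAllocationEnabled;
    }

    public static void main(String[] args) {
        // All executors blacklisted, but dynamic allocation can replace them:
        System.out.println(shouldAbort(Set.of("e1"), Set.of("e1"), true));
        // Static allocation with everything blacklisted: truly stuck, so abort:
        System.out.println(shouldAbort(Set.of("e1"), Set.of("e1"), false));
    }
}
```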
GitHub user juanrh opened a pull request:
https://github.com/apache/spark/pull/19583
[WIP][SPARK-22339] [CORE] [NETWORK-SHUFFLE]
## What changes were proposed in this pull request?
When a task finishes with error due to a fetch error, then DAGScheduler
unregisters the shuffle
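The bookkeeping the description starts to explain can be sketched with a toy registry: on a fetch failure, the scheduler forgets the map outputs that lived on the failed host, which forces the earlier stage to be re-run to regenerate them. This is a simplified, stdlib-only stand-in for `MapOutputTracker`-style bookkeeping, not Spark's actual code.

```java
import java.util.HashMap;
import java.util.Map;

public class ShuffleRegistry {
    // shuffleId -> (mapId -> host holding that map output)
    private final Map<Integer, Map<Integer, String>> outputs = new HashMap<>();

    void register(int shuffleId, int mapId, String host) {
        outputs.computeIfAbsent(shuffleId, k -> new HashMap<>()).put(mapId, host);
    }

    // Called when a task fails with a fetch error against failedHost.
    void onFetchFailure(int shuffleId, String failedHost) {
        Map<Integer, String> maps = outputs.get(shuffleId);
        if (maps != null) {
            maps.values().removeIf(failedHost::equals); // forget lost outputs
        }
    }

    int available(int shuffleId) {
        return outputs.getOrDefault(shuffleId, Map.of()).size();
    }

    public static void main(String[] args) {
        ShuffleRegistry r = new ShuffleRegistry();
        r.register(0, 0, "hostA");
        r.register(0, 1, "hostB");
        r.onFetchFailure(0, "hostA");
        // hostA's map output is gone; only hostB's remains, so map 0 must be recomputed.
        System.out.println(r.available(0));
    }
}
```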
Github user juanrh commented on the issue:
https://github.com/apache/spark/pull/19267
Hi @tgravescs, thanks again for your feedback. Regarding concrete use
cases, this change might be used to extend the existing graceful decommission
mechanism that has been available in AWS EMR for a while
Github user juanrh commented on the issue:
https://github.com/apache/spark/pull/19267
Hi @vanzin and @tgravescs, do you have any other comments on this proposal?
Thanks,
Juan
Github user juanrh commented on the issue:
https://github.com/apache/spark/pull/19267
Hi Tom, thanks for your answer.
Regarding use cases for the Spark admin command, I think it would be a good
fit for cloud environments, where single-job clusters are common, because
Github user juanrh commented on the issue:
https://github.com/apache/spark/pull/19267
Hi @vanzin, do you have any comments on the design document attached above?
Thanks
Github user juanrh commented on the issue:
https://github.com/apache/spark/pull/19267
Hi @vanzin, thanks for taking a look.
This was part of a discussion with @holdenk about SPARK-20628. I have
attached the document
[Spark_Blacklisting_on_decommissioning-Scope.pdf](https
GitHub user juanrh opened a pull request:
https://github.com/apache/spark/pull/19267
[WIP][SPARK-20628][CORE] Blacklist nodes when they transition to
DECOMMISSIONING state in YARN
## What changes were proposed in this pull request?
Dynamic cluster configurations where cluster
GitHub user juanrh opened a pull request:
https://github.com/apache/spark/pull/17411
logging improvements
## What changes were proposed in this pull request?
Adding additional information to existing logging messages:
- YarnAllocator: log the executor ID together with the
Github user juanrh commented on the pull request:
https://github.com/apache/spark/pull/5367#issuecomment-95575488
The corresponding issue [SPARK-6714] has been closed as Won't Fix.
Github user juanrh closed the pull request at:
https://github.com/apache/spark/pull/5367
GitHub user juanrh opened a pull request:
https://github.com/apache/spark/pull/5367
[SPARK-6714][Streaming][Kafka] additionally overload KafkaUtils.createDirectStream
for using a messageHandler without having to specify the offsets