Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/2933#issuecomment-67102974
What's the status on this PR / JIRA? As far as I know, it seems that
TorrentBroadcast has been more stable lately, so if the only motivation here
was stability then I
Github user davies commented on the pull request:
https://github.com/apache/spark/pull/2933#issuecomment-67107062
Close this now.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
Github user davies closed the pull request at:
https://github.com/apache/spark/pull/2933
Github user squito commented on the pull request:
https://github.com/apache/spark/pull/2933#issuecomment-62330965
I agree with @pwendell . It seems like the right thing to do is just fix
Broadcast ... and if we can't, then wouldn't you also want to turn off
Broadcast even for big
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/2933#issuecomment-61396544
[Test build #22746 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22746/consoleFull)
for PR 2933 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/2933#issuecomment-61396546
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/2933#issuecomment-61395215
[Test build #22746 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22746/consoleFull)
for PR 2933 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/2933#issuecomment-60807456
[Test build #486 has
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/486/consoleFull)
for PR 2933 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/2933#issuecomment-60818616
[Test build #486 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/486/consoleFull)
for PR 2933 at commit
Github user davies commented on the pull request:
https://github.com/apache/spark/pull/2933#issuecomment-60553581
@JoshRosen I think we still have it (in tonight's tests):
```
[info] org.apache.spark.SparkException: Job aborted due to stage failure:
Task 0 in stage 11.0
```
Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/2933#issuecomment-60553702
This is really strange; I thought that the unexpected exception type
would have been addressed by https://github.com/apache/spark/pull/2932
Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/2933#issuecomment-60553953
Can you point me to the commit that produced that stacktrace?
Github user davies commented on the pull request:
https://github.com/apache/spark/pull/2933#issuecomment-60554225
@JoshRosen @pwendell The test branch (internal) did not have that commit.
Github user pwendell commented on the pull request:
https://github.com/apache/spark/pull/2933#issuecomment-60525495
I find this a little bit hacky. If the broadcast implementation has bugs or
performance issues, we should just fix them and it will stabilize over time
like any other
Github user davies commented on the pull request:
https://github.com/apache/spark/pull/2933#issuecomment-60529659
Broadcast (especially TorrentBroadcast) is designed for large objects; using
it to send out small shared variables is just like using a tank to shoot a
mosquito, it's not a
Github user pwendell commented on the pull request:
https://github.com/apache/spark/pull/2933#issuecomment-60530014
I don't see fundamentally why the broadcast mechanism can't be done as
efficiently as task launching itself. Do you have a reproducible workload where
this caused a
Github user davies commented on the pull request:
https://github.com/apache/spark/pull/2933#issuecomment-60531516
The motivation is not about performance, it's about stability.
We've been fighting failures while deserializing a task for
days; they cannot be
Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/2933#issuecomment-60534648
We've been fighting failures while deserializing a task for
days (failed in TorrentBroadcast)
I thought we had fixed this issue; can you point me
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/2933#issuecomment-60473661
[Test build #427 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/427/consoleFull)
for PR 2933 at commit
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/2933#issuecomment-60474042
@JoshRosen
#2846 fixes the scalastyle bug.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/2933#issuecomment-60474692
**[Test build #22196 timed
out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22196/consoleFull)**
for PR 2933 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/2933#issuecomment-60474693
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/2933#issuecomment-60491396
[Test build #450 has
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/450/consoleFull)
for PR 2933 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/2933#issuecomment-60494261
[Test build #450 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/450/consoleFull)
for PR 2933 at commit
Github user aarondav commented on a diff in the pull request:
https://github.com/apache/spark/pull/2933#discussion_r19377173
--- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
---
@@ -124,6 +123,10 @@ class DAGScheduler(
/** If enabled, we may run
Github user davies commented on a diff in the pull request:
https://github.com/apache/spark/pull/2933#discussion_r19377630
--- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
---
@@ -124,6 +123,10 @@ class DAGScheduler(
/** If enabled, we may run
Github user davies commented on a diff in the pull request:
https://github.com/apache/spark/pull/2933#discussion_r19377652
--- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
---
@@ -124,6 +123,10 @@ class DAGScheduler(
/** If enabled, we may run
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/2933#issuecomment-60498227
[Test build #3 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/3/consoleFull)
for PR 2933 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/2933#issuecomment-6041
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/2933#issuecomment-60499988
[Test build #3 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/3/consoleFull)
for PR 2933 at commit
Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/2933#issuecomment-60500716
I've been thinking about this some more and I wonder about the motivation
for this change: how much of a performance benefit does this buy us for typical
workloads?
Github user davies commented on the pull request:
https://github.com/apache/spark/pull/2933#issuecomment-60501858
@JoshRosen The motivation is not about performance, it's about stability.
Sending tasks to executors is the critical part of Spark; it should be as
stable as possible.
Github user davies commented on a diff in the pull request:
https://github.com/apache/spark/pull/2933#discussion_r19369470
--- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
---
@@ -124,6 +123,10 @@ class DAGScheduler(
/** If enabled, we may run
Github user JoshRosen commented on a diff in the pull request:
https://github.com/apache/spark/pull/2933#discussion_r19369647
--- Diff: core/src/main/scala/org/apache/spark/scheduler/Stage.scala ---
@@ -69,6 +70,10 @@ private[spark] class Stage(
var resultOfJob:
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/2933#issuecomment-60472421
[Test build #427 has
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/427/consoleFull)
for PR 2933 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/2933#issuecomment-60472463
[Test build #22196 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22196/consoleFull)
for PR 2933 at commit