Moving user to bcc.
What I found was that the TaskSetManager for my task set of 5 tasks had
preferred locations set for 4 of the 5. Three had localhost and had completed.
The one with no preferred location had also completed. The last one's preferred
location was set by our code to my IP address, and local mode can hang on that.
Of course, breakpointing on every status update and reviveOffers invocation
kept the problem from happening. Where could the race be?
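For illustration, a minimal sketch of the shape of that task set (assuming
Scala and the SparkContext.makeRDD overload that takes location preferences;
the values and app name are made up). One partition prefers the machine's IP,
which the local-mode executor, advertised as "localhost", never offers:

    import org.apache.spark.{SparkConf, SparkContext}

    object PreferredLocationSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setMaster("local[8]").setAppName("preferred-location-sketch"))

        // One element per partition, each with an explicit location preference.
        val myIp = java.net.InetAddress.getLocalHost.getHostAddress
        val rdd = sc.makeRDD(Seq(
          (1, Seq("localhost")),   // matched the local executor and completed
          (2, Seq("localhost")),
          (3, Seq("localhost")),
          (4, Seq.empty[String]),  // no preferred location; also completed
          (5, Seq(myIp))           // preference the local executor never offers
        ))

        println(rdd.map(_ * 2).collect().mkString(","))
        sc.stop()
      }
    }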
On Thu, Feb 26, 2015 at 7:55 PM, Victor Tso-Guillen wrote:
Love to hear some input on this. I did get a standalone cluster up on my
local machine and the problem didn't present itself. I'm pretty confident
that means the problem is in the LocalBackend or something near it.
On Thu, Feb 26, 2015 at 1:37 PM, Victor Tso-Guillen wrote:
Okay, I confirmed my suspicion of a hang. I made a request that stopped
progressing, though the already-scheduled tasks had finished. I made a
separate request that was small enough not to hang, and it kicked the hung
job enough to finish. I think what's happening is that the scheduler or the
local
Thanks for the link. Unfortunately, I turned on RDD compression and nothing
changed. I also tried switching from netty to nio, and there was no change :(
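For reference, a sketch of the two settings tried above (property names as I
understand them for Spark 1.2; the master and app name are placeholders):

    import org.apache.spark.SparkConf

    // RDD block compression, plus falling back from the netty block transfer
    // service to the older nio implementation.
    val conf = new SparkConf()
      .setMaster("local[8]")
      .setAppName("compression-and-transport")
      .set("spark.rdd.compress", "true")
      .set("spark.shuffle.blockTransferService", "nio")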
On Thu, Feb 26, 2015 at 2:01 AM, Akhil Das wrote:
Not many that I know of, but I bumped into this one:
https://issues.apache.org/jira/browse/SPARK-4516
Thanks
Best Regards
On Thu, Feb 26, 2015 at 3:26 PM, Victor Tso-Guillen wrote:
Is there any potential problem going from 1.1.1 to 1.2.1 with shuffle
dependencies that produce no data?
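To make the question concrete, a hypothetical example of such a dependency
(assuming an existing SparkContext named sc): every record is filtered out
before the shuffle, so the downstream stage fetches only empty map outputs.

    import org.apache.spark.SparkContext._  // pair-RDD implicits on 1.2

    // A shuffle dependency that produces no data: nothing survives the filter,
    // so reduceByKey shuffles zero records.
    val empty = sc.parallelize(1 to 1000, 8)
      .filter(_ < 0)
      .map(x => (x % 4, x))
      .reduceByKey(_ + _)
    println(empty.count())  // 0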
On Thu, Feb 26, 2015 at 1:56 AM, Victor Tso-Guillen wrote:
The data is small. The job is composed of many small stages.
* I found that the problem exhibits with fewer than 222 partitions as well.
What will be gained by going higher?
* Pushing up the parallelism only pushes up the boundary at which the system
appears to hang. I'm worried about some sort of message loss or in
What operation are you trying to do, and how big is the data you are
operating on?
Here are a few things you can try (rough sketch below):
- Repartition the RDD to a higher number than 222
- Specify the master as local[*] or local[10]
- Use the Kryo serializer (.set("spark.serializer",
"org.apache.spark.serializer.KryoSerializer"))
I'm getting this really reliably on Spark 1.2.1. Basically I'm in local
mode with parallelism at 8. I have 222 tasks and I never seem to get far
past 40. Usually in the 20s to 30s it will just hang. The last log output is
below, along with a screenshot of the UI.
2015-02-25 20:39:55.779 GMT-0800 INFO [task
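For context, the setup described above (local mode, parallelism 8) amounts to
something like the following; the app name is a placeholder, and
spark.default.parallelism is shown only as the explicit form of the same
setting:

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setMaster("local[8]")                  // local mode, 8 threads
      .setAppName("hang-repro")
      .set("spark.default.parallelism", "8")
    val sc = new SparkContext(conf)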