Re: Spark 2.1.0 and Shapeless

2017-01-31 Thread Phil Wills
Are you not able to shade it when you're building your fat jar with
something like https://github.com/sbt/sbt-assembly#shading? I would have
thought doing the shading at the app level would be a bit less painful than
doing it at the library level.

Phil
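
A minimal sketch of the app-level shading suggested above, written as an sbt-assembly shade rule in the application's build.sbt; the plugin version and the shaded package name are illustrative, not taken from the thread:

// project/plugins.sbt (assumed): addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.3")

// Relocate shapeless inside the fat jar so the 2.3.2 copy pulled in via
// scanamo no longer clashes with the shapeless already on Spark's classpath.
assemblyShadeRules in assembly := Seq(
  ShadeRule.rename("shapeless.**" -> "shaded.shapeless.@1").inAll
)

The @1 keeps the rest of the matched package path, so shapeless.ops.hlist ends up as shaded.shapeless.ops.hlist inside the assembled jar.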

On Tue, 31 Jan 2017, 04:24 Timothy Chan wrote:

> I'm using a library, https://github.com/guardian/scanamo, that uses
> shapeless 2.3.2. What are my options if I want to use this with Spark
> 2.1.0?
>
> Based on this:
> http://apache-spark-developers-list.1001551.n3.nabble.com/shapeless-in-spark-2-1-0-tt20392.html
>
> I'm guessing I would have to release my own version of scanamo with a
> shaded shapeless?
>


Re: Single worker locked at 100% CPU

2014-12-24 Thread Phil Wills
Turns out that I was just being idiotic and had assigned so much memory to
Spark that the O/S was ending up continually swapping.  Apologies for the
noise.

Phil
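
The practical lesson, sketched as configuration: size the executors well below physical RAM so the OS keeps some headroom and never swaps. The figure below is invented for illustration, not taken from this thread.

import org.apache.spark.{SparkConf, SparkContext}

// Leave the OS a few GB of headroom rather than handing Spark nearly all of
// physical RAM; once the executor heap pushes the box into swap, the job can
// stall in the way described in this thread.
val conf = new SparkConf()
  .setAppName("als-job")               // illustrative
  .set("spark.executor.memory", "12g") // e.g. on a 16 GB worker

val sc = new SparkContext(conf)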

On Wed, Dec 24, 2014 at 1:16 AM, Andrew Ash wrote:

> Hi Phil,
>
> This sounds a lot like a deadlock in Hadoop's Configuration object that I
> ran into a while back.  If you jstack the JVM and see a thread that looks
> like the below, it could be
> https://issues.apache.org/jira/browse/SPARK-2546
>
> "Executor task launch worker-6" daemon prio=10 tid=0x7f91f01fe000 
> nid=0x54b1 runnable [0x7f92d74f1000]
>java.lang.Thread.State: RUNNABLE
> at java.util.HashMap.transfer(HashMap.java:601)
> at java.util.HashMap.resize(HashMap.java:581)
> at java.util.HashMap.addEntry(HashMap.java:879)
> at java.util.HashMap.put(HashMap.java:505)
> at org.apache.hadoop.conf.Configuration.set(Configuration.java:803)
> at org.apache.hadoop.conf.Configuration.set(Configuration.java:783)
> at org.apache.hadoop.conf.Configuration.setClass(Configuration.java:1662)
>
>
> The fix for this issue is hidden behind a flag because it might have
> performance implications, but if it is this problem then you can set
> spark.hadoop.cloneConf=true and see if that fixes things.
>
> Good luck!
> Andrew
>
> On Tue, Dec 23, 2014 at 9:40 AM, Phil Wills wrote:
>
>> I've been attempting to run a job based on MLlib's ALS implementation for
>> a while now and have hit an issue I'm having a lot of difficulty getting to
>> the bottom of.
>>
>> On a moderate-sized set of input data it works fine, but against larger
>> (still well short of what I'd think of as big) sets of data, I'll see one
>> or two workers get stuck spinning at 100% CPU, leaving the job unable to
>> recover.
>>
>> I don't believe this is down to memory pressure, as I seem to get the same
>> behaviour at about the same input data size even if the cluster is twice
>> as large. GC logs also suggest things are proceeding reasonably, with some
>> full GCs occurring but no sign of the process being locked up in GC.
>>
>> After rebooting the instance that got into trouble, I can see that the
>> stderr log for the task is truncated in the middle of a log line at the
>> point the CPU shoots up to and sticks at 100%, but there are no other
>> signs of a problem.
>>
>> I've run into the same issue on 1.1.0 and 1.2.0, both in standalone mode
>> and running on YARN.
>>
>> Any suggestions on further steps I could try to get a clearer diagnosis
>> of the issue would be much appreciated.
>>
>> Thanks,
>>
>> Phil
>>
>
>
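
As a footnote to Andrew's suggestion: enabling the workaround might look roughly like this in the application (app name and structure are illustrative); it can equally be passed on the command line as --conf spark.hadoop.cloneConf=true.

import org.apache.spark.{SparkConf, SparkContext}

// With spark.hadoop.cloneConf=true, Spark clones the Hadoop Configuration per
// task instead of sharing one mutable instance, sidestepping the HashMap
// corruption visible in the jstack trace above (SPARK-2546).
val conf = new SparkConf()
  .setAppName("als-job") // illustrative
  .set("spark.hadoop.cloneConf", "true")

val sc = new SparkContext(conf)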


Single worker locked at 100% CPU

2014-12-23 Thread Phil Wills
I've been attempting to run a job based on MLlib's ALS implementation for a
while now and have hit an issue I'm having a lot of difficulty getting to
the bottom of.
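
For context, a job along these lines typically boils down to something like the sketch below; the input path, field layout, rank and iteration counts are purely illustrative, not the actual job in question.

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.recommendation.{ALS, Rating}

object AlsJob {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("als-job"))

    // Assumed input: CSV lines of "user,product,rating".
    val ratings = sc.textFile("hdfs:///path/to/ratings.csv").map { line =>
      val Array(user, product, rating) = line.split(',')
      Rating(user.toInt, product.toInt, rating.toDouble)
    }.cache()

    // Explicit-feedback ALS; arguments are rank, iterations, lambda (placeholders).
    val model = ALS.train(ratings, 10, 10, 0.01)

    // Dump the learned user factors as plain text, staying within the
    // 1.1.x/1.2.x APIs mentioned further down.
    model.userFeatures
      .map { case (id, factors) => id + "," + factors.mkString(" ") }
      .saveAsTextFile("hdfs:///path/to/user-factors")

    sc.stop()
  }
}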

On a moderate-sized set of input data it works fine, but against larger
(still well short of what I'd think of as big) sets of data, I'll see one
or two workers get stuck spinning at 100% CPU, leaving the job unable to
recover.

I don't believe this is down to memory pressure, as I seem to get the same
behaviour at about the same input data size even if the cluster is twice as
large. GC logs also suggest things are proceeding reasonably, with some full
GCs occurring but no sign of the process being locked up in GC.
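
As an aside, one way to gather that kind of GC evidence is to pass verbose-GC flags through to the executors; the flags below are the standard HotSpot ones for the Java 7/8 JVMs of that era, shown only as an illustration.

import org.apache.spark.SparkConf

// Make each executor JVM print GC activity, so it ends up alongside the
// executor logs that Spark already collects.
val conf = new SparkConf()
  .set("spark.executor.extraJavaOptions",
    "-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps")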

After rebooting the instance that got into trouble, I can see that the stderr
log for the task is truncated in the middle of a log line at the point the
CPU shoots up to and sticks at 100%, but there are no other signs of a
problem.

I've run into the same issue on 1.1.0 and 1.2.0, both in standalone mode and
running on YARN.

Any suggestions on further steps I could try to get a clearer diagnosis of
the issue would be much appreciated.

Thanks,

Phil