Re: java.lang.ArrayIndexOutOfBoundsException when attempting broadcastjoin

2016-02-03 Thread Alexandr Dzhagriev
Hi Sebastian, do you have any updates on the issue? I faced pretty much the same problem, and disabling Kryo plus raising spark.network.timeout to 600s helped. For my job it takes about 5 minutes to broadcast the variable (~5GB in my case), but then it's fast. I mean much faster than shufflin
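For reference, the two settings described in this message could be written in spark-defaults.conf roughly as below. This is only a sketch: the property names are standard Spark settings, the 600s value is the one reported above, and setting spark.serializer to the Java serializer explicitly is an assumption about how Kryo was disabled (removing an existing KryoSerializer entry has the same effect, since Java serialization is the default).

```
# Assumed way of "disabling Kryo": fall back to the default Java serializer
spark.serializer        org.apache.spark.serializer.JavaSerializer
# Raise the network timeout (default 120s) so a large broadcast can finish
spark.network.timeout   600s
```

The same values can also be passed per job with --conf on spark-submit instead of editing the defaults file.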

Re: java.lang.ArrayIndexOutOfBoundsException when attempting broadcastjoin

2016-01-21 Thread Sebastian Piu
I'm using Spark 1.6.0. I tried removing Kryo and reverting to Java serialisation, and got a different error, which may point in the right direction... java.lang.AssertionError: assertion failed: No plan for BroadcastHint +- InMemoryRelation [tradeId#30,tradeVersion#31,agreement#49,counterP

Re: java.lang.ArrayIndexOutOfBoundsException when attempting broadcastjoin

2016-01-21 Thread Ted Yu
You were using Kryo serialization? If you switch to Java serialization, your job should run fine. Which Spark release are you using? Thanks On Thu, Jan 21, 2016 at 6:59 AM, sebastian.piu wrote: > Hi all, > > I'm trying to work out a problem when using Spark Streaming, currently I > have the

java.lang.ArrayIndexOutOfBoundsException when attempting broadcastjoin

2016-01-21 Thread sebastian.piu
Hi all, I'm trying to work out a problem when using Spark Streaming. Currently I have the following piece of code inside a foreachRDD call: DataFrame results = ... // some dataframe created from the incoming rdd - moderately big, I don't want this to be shuffled DataFrame t = sqlContext.table("a_t