Re: sparksql - HiveConf not found during task deserialization

2015-04-29 Thread Manku Timma
The issue is solved. There was a problem in my hive codebase. Once that was fixed, -Phive-provided spark is working fine against my hive jars.

On 27 April 2015 at 08:00, Manku Timma wrote:
> Made some progress on this. Adding hive jars to the system classpath is
> needed. But looks l…

Re: sparksql - HiveConf not found during task deserialization

2015-04-26 Thread Manku Timma
…Timma wrote:
> Setting SPARK_CLASSPATH is triggering other errors. Not working.
>
> On 25 April 2015 at 09:16, Manku Timma wrote:
>> Actually found the culprit. The JavaSerializerInstance.deserialize is
>> called with a classloader (of type MutableURLClassLoader) whi…

Re: sparksql - HiveConf not found during task deserialization

2015-04-24 Thread Manku Timma
Setting SPARK_CLASSPATH is triggering other errors. Not working.

On 25 April 2015 at 09:16, Manku Timma wrote:
> Actually found the culprit. The JavaSerializerInstance.deserialize is
> called with a classloader (of type MutableURLClassLoader) which has access
> to all the hive cla…
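A hedged diagnostic sketch (mine, not from the thread) for checking which classloader a task actually sees, assuming a spark-shell `sc`; MutableURLClassLoader is the loader type named above, but the probe itself is only illustrative:

// Illustrative probe: run a one-partition job and report whether the task's
// context classloader can resolve HiveConf on the executor.
sc.parallelize(1 to 1, 1).map { _ =>
  val loader = Thread.currentThread().getContextClassLoader
  val visible =
    try { Class.forName("org.apache.hadoop.hive.conf.HiveConf", false, loader); true }
    catch { case _: ClassNotFoundException => false }
  s"loader=${loader.getClass.getName} hiveConfVisible=$visible"
}.collect().foreach(println)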

Re: sparksql - HiveConf not found during task deserialization

2015-04-24 Thread Manku Timma
> …the hive jar to the
> SPARK_CLASSPATH (in conf/spark-env.sh file on all machines) and make sure
> that jar is available on all the machines in the cluster in the same path.
>
> Thanks
> Best Regards
>
> On Wed, Apr 22, 2015 at 11:24 AM, Manku Timma wrote:
>
>> Akhil, Thanks fo…
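For what it's worth, SPARK_CLASSPATH was deprecated in Spark 1.x in favor of the extraClassPath settings; a minimal sketch of that alternative (the jar path is a placeholder, and as Akhil notes above the jar must exist at the same path on every node):

import org.apache.spark.{SparkConf, SparkContext}

// Sketch: add the hive jar to driver and executor classpaths via config
// instead of SPARK_CLASSPATH. The path below is a placeholder.
val conf = new SparkConf()
  .setAppName("hive-classpath-example")
  .set("spark.driver.extraClassPath", "/opt/jars/hive-exec-0.13.1.jar")
  .set("spark.executor.extraClassPath", "/opt/jars/hive-exec-0.13.1.jar")
val sc = new SparkContext(conf)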

Re: sparksql - HiveConf not found during task deserialization

2015-04-21 Thread Manku Timma
> …it.
>
> Thanks
> Best Regards
>
> On Mon, Apr 20, 2015 at 12:26 PM, Manku Timma wrote:
>
>> Akhil,
>> But the first case of creating HiveConf on the executor works fine (map
>> case). Only the second case fails. I was suspecting some foul play with
>> cl…
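A hedged guess at the two shapes being contrasted (the preview is truncated, so this is illustrative, not the thread's actual code): constructing HiveConf inside the task body defers class resolution to task run time, while anything HiveConf-shaped reachable from the serialized closure must be resolvable during task deserialization.

import org.apache.hadoop.hive.conf.HiveConf

// Shape 1 (reported to work): HiveConf is created inside the task, so the
// class is resolved lazily on the executor at run time.
sc.parallelize(1 to 2).map { _ => new HiveConf().get("hive.metastore.uris") }.collect()

// Shape 2 (roughly the failing pattern): the closure drags in an object
// whose class references HiveConf, so the classloader used while
// deserializing the task must be able to see HiveConf.
class Holder extends Serializable { @transient lazy val conf = new HiveConf() }
val holder = new Holder
sc.parallelize(1 to 2).map { _ => holder.conf.get("hive.metastore.uris") }.collect()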

Re: sparksql - HiveConf not found during task deserialization

2015-04-19 Thread Manku Timma
> …jar is present.
>
> Thanks
> Best Regards
>
> On Mon, Apr 20, 2015 at 11:52 AM, Manku Timma wrote:
>
>> I am using spark-1.3 with hadoop-provided and hive-provided and
>> hive-0.13.1 profiles. I am running a simple spark job on a yarn cluster by
>> ad…

sparksql - HiveConf not found during task deserialization

2015-04-19 Thread Manku Timma
I am using spark-1.3 with hadoop-provided and hive-provided and hive-0.13.1 profiles. I am running a simple spark job on a yarn cluster by adding all hadoop2 and hive13 jars to the spark classpaths. If I remove hive-provided while building spark, I don't face any issue. But with hive-provided I…
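As an illustrative aside (not in the original mail), one way to confirm the hadoop2/hive13 jars actually reached every executor's system classpath, assuming a spark-shell `sc`:

// Illustrative check: print the distinct system classpaths the executor
// JVMs see, to verify the added jars are present on all nodes.
sc.parallelize(0 until 100, 100)
  .map(_ => System.getProperty("java.class.path"))
  .distinct()
  .collect()
  .foreach(println)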

Re: save rdd to ORC file

2015-01-03 Thread Manku Timma
One way is to use sparkSQL.

scala> sqlContext.sql("create table orc_table(key INT, value STRING) stored as orc")
scala> sqlContext.sql("insert into table orc_table select * from schema_rdd_temp_table")
scala> sqlContext.sql("FROM orc_table select *")

On 4 January 2015 at 00:57, SamyaMaiti wrote:
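The snippet above assumes schema_rdd_temp_table already exists; in Spark 1.2-era code it would typically come from a SchemaRDD, roughly like this (the case class and sample data are placeholders, and sqlContext must be a HiveContext for the ORC DDL to work):

import org.apache.spark.sql.hive.HiveContext

case class KV(key: Int, value: String)

val sqlContext = new HiveContext(sc)
import sqlContext.createSchemaRDD // implicit RDD -> SchemaRDD conversion

// Placeholder data; any RDD of a case class works the same way.
sc.parallelize(Seq(KV(1, "a"), KV(2, "b"))).registerTempTable("schema_rdd_temp_table")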

serialization changes -- OOM

2014-09-09 Thread Manku Timma
Has anything changed in the last 30 days w.r.t. serialization? I had 620MB of compressed data which used to get serialized-in-spark-memory with 4GB of executor memory. Now it fails to get serialized in memory even at 10GB of executor memory.

-- Bharath
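For context, "serialized-in-spark-memory" refers to the serialized storage levels; a minimal sketch of the pattern (the input path is a placeholder):

import org.apache.spark.storage.StorageLevel

// Sketch: cache the RDD serialized in memory. The cached footprint depends
// on the serializer, so a serializer change would surface as exactly this
// kind of OOM shift.
val data = sc.textFile("hdfs:///path/to/input") // placeholder path
data.persist(StorageLevel.MEMORY_ONLY_SER)
data.count() // materialize the cache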

sharing off_heap rdds

2014-09-08 Thread Manku Timma
I see that the tachyon url constructed for an RDD partition has the executor id in it. So if the same partition is being processed by a different executor on a re-execution of the same computation, it cannot really use the earlier result. Is this a correct assessment? Will removing the executor id from…
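For reference, this concerns the OFF_HEAP storage level, which was Tachyon-backed in this era of Spark; a minimal sketch of the setup in question (spark.tachyonStore.url is the Spark 1.x config key, the master URL is a placeholder):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

// Sketch: persist an RDD off-heap into Tachyon. Whether a different
// executor can reuse these blocks on re-execution is exactly the question
// raised above.
val conf = new SparkConf()
  .setAppName("offheap-example")
  .set("spark.tachyonStore.url", "tachyon://tachyon-master:19998") // placeholder
val sc = new SparkContext(conf)
val rdd = sc.parallelize(1 to 1000000)
rdd.persist(StorageLevel.OFF_HEAP)
rdd.count()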