The issue is solved. There was a problem in my hive codebase. Once that was
fixed, -Phive-provided spark is working fine against my hive jars.
On 27 April 2015 at 08:00, Manku Timma wrote:
> Made some progress on this. Adding hive jars to the system classpath is
> needed. But looks l[...]

Setting SPARK_CLASSPATH is triggering other errors. Not working.

On 25 April 2015 at 09:16, Manku Timma wrote:
> Actually found the culprit. The JavaSerializerInstance.deserialize is
> called with a classloader (of type MutableURLClassLoader) which has access
> to all the hive classes [...]
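
For reference, a minimal sketch (not from the original thread, and simplified
compared to Spark's actual source) of deserializing with an explicit
classloader, which is roughly the mechanism behind
JavaSerializerInstance.deserialize: if a class referenced in the byte stream
is not visible to the supplied loader, a ClassNotFoundException surfaces here
even though the class may be present on some other classpath.

import java.io.{ByteArrayInputStream, ObjectInputStream, ObjectStreamClass}

// Deserialize using the supplied classloader (e.g. a MutableURLClassLoader)
// instead of the default application classloader.
def deserializeWith(bytes: Array[Byte], loader: ClassLoader): AnyRef = {
  val in = new ObjectInputStream(new ByteArrayInputStream(bytes)) {
    override def resolveClass(desc: ObjectStreamClass): Class[_] =
      Class.forName(desc.getName, false, loader)
  }
  try in.readObject() finally in.close()
}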

> [...] the hive jar to the
> SPARK_CLASSPATH (in conf/spark-env.sh file on all machines) and make sure
> that jar is available on all the machines in the cluster in the same path.
>
> Thanks
> Best Regards
>
> On Wed, Apr 22, 2015 at 11:24 AM, Manku Timma wrote:
>
>> Akhil, Thanks fo[r ...]
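
One quick way to act on the advice quoted above (an illustrative sketch, not
from the original thread; it assumes a spark-shell with sc available) is to
ask every executor whether it can see a representative hive class through its
own classloader, which confirms the jar really is present on every machine in
the cluster.

import java.net.InetAddress

// Probe each executor: can it load a hive class?
val visibility = sc.parallelize(1 to 100, 100).map { _ =>
  val host = InetAddress.getLocalHost.getHostName
  val canSeeHive =
    try { Class.forName("org.apache.hadoop.hive.conf.HiveConf"); true }
    catch { case _: ClassNotFoundException => false }
  (host, canSeeHive)
}.distinct().collect()

visibility.foreach { case (host, ok) => println(s"$host -> hive visible: $ok") }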

> [...]; it.
>
> Thanks
> Best Regards
>
> On Mon, Apr 20, 2015 at 12:26 PM, Manku Timma wrote:
>
>> Akhil,
>> But the first case of creating HiveConf on the executor works fine (map
>> case). Only the second case fails. I was suspecting some foul play with
>> cl[...]
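
For illustration, a hypothetical reconstruction of the working "map case"
mentioned above (the original job is not shown in the thread, so the names
and numbers here are invented): HiveConf is created inside the map closure,
so the hive classes get loaded on the executors themselves.

import org.apache.hadoop.hive.conf.HiveConf

// Each task builds its own HiveConf on the executor and reads a setting,
// which forces the hive classes to be loaded there.
val metastoreUris = sc.parallelize(1 to 4, 4).map { _ =>
  new HiveConf().getVar(HiveConf.ConfVars.METASTOREURIS)
}.collect()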

> [...] jar is present.
>
> Thanks
> Best Regards

On Mon, Apr 20, 2015 at 11:52 AM, Manku Timma wrote:

I am using spark-1.3 with the hadoop-provided, hive-provided and hive-0.13.1
profiles. I am running a simple spark job on a yarn cluster by adding all
hadoop2 and hive13 jars to the spark classpaths.
If I remove the hive-provided profile while building spark, I don't face any
issue. But with hive-provided I [...]
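
One simple diagnostic for this kind of setup (an illustrative sketch, not
something from the original mail) is to print which jar the HiveConf class is
actually loaded from on the driver, to confirm that the externally supplied
hive13 jars are the ones being picked up.

// Where is HiveConf coming from? (getCodeSource can be null for bootstrap classes)
val hiveConfClass = Class.forName("org.apache.hadoop.hive.conf.HiveConf")
val source = hiveConfClass.getProtectionDomain.getCodeSource
println(if (source != null) source.getLocation else "unknown (bootstrap classloader)")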

One way is to use sparkSQL.

scala> sqlContext.sql("create table orc_table(key INT, value STRING) stored as orc")
scala> sqlContext.sql("insert into table orc_table select * from schema_rdd_temp_table")
scala> sqlContext.sql("FROM orc_table select *")
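
The example above assumes a temporary table called schema_rdd_temp_table is
already registered and that sqlContext is a HiveContext (ORC tables go through
hive support in that era). A hypothetical setup for it, using the Spark
1.2-era SchemaRDD API, might look like this; the case class and sample data
are invented.

case class KV(key: Int, value: String)

import sqlContext.createSchemaRDD  // implicit RDD -> SchemaRDD conversion (Spark 1.0-1.2)
val rdd = sc.parallelize(Seq(KV(1, "a"), KV(2, "b")))
rdd.registerTempTable("schema_rdd_temp_table")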

On 4 January 2015 at 00:57, SamyaMaiti wrote:
> [...]

Has anything changed in the last 30 days w.r.t. serialization? I had 620MB
of compressed data which used to get serialized in Spark memory with 4GB of
executor memory. Now it fails to get serialized in memory even at 10GB of
executor memory.

-- Bharath
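
For reference, "serialized in Spark memory" presumably refers to caching with
a serialized storage level; a minimal illustration (the input path is made up)
follows.

import org.apache.spark.storage.StorageLevel

val data = sc.textFile("hdfs:///path/to/compressed/input")  // illustrative path
data.persist(StorageLevel.MEMORY_ONLY_SER)  // keep partitions as serialized bytes
data.count()                                // materialize the cache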

I see that the Tachyon URL constructed for an RDD partition has the executor
id in it. So if the same partition is being processed by a different executor
on a re-execution of the same computation, it cannot really use the earlier
result. Is this a correct assessment? Will removing the executor id from [...]
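
For context, an illustrative sketch (not taken from the original mail) of how
RDD blocks end up in Tachyon in Spark 1.x: persisting at the OFF_HEAP storage
level writes the partition blocks to Tachyon, which is where the per-partition
URLs discussed above come from.

import org.apache.spark.storage.StorageLevel

val rdd = sc.parallelize(1 to 1000000, 8)
rdd.persist(StorageLevel.OFF_HEAP)  // Spark 1.x: OFF_HEAP blocks are stored in Tachyon
rdd.count()                         // first action writes the partitions to Tachyon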