ok solved. Looks like breathing the spark-summit SFO air for 3 days helped
a lot !
Piping the 7 million records to local disk still runs out of memory. So I piped
the results into another Hive table. I can live with that :-)
/opt/cloudera/parcels/CDH/lib/spark/bin/spark-sql -e "use aers; create
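A rough sketch of this kind of workaround, assuming a CREATE TABLE ... AS SELECT. The table names are hypothetical, and this is not the exact command above (which is cut off):

```shell
# Hypothetical table names (results_copy, source_table).
# Writing the result to a Hive table keeps the rows out of the CLI's
# driver JVM, instead of collecting them all for output.
/opt/cloudera/parcels/CDH/lib/spark/bin/spark-sql \
  -e "use aers; create table results_copy as select * from source_table"
```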
have spark on YARN but could never get it to
>> run fairly complex queries and I have no answers from this group or the CDH
>> groups.
>>
>> So my assumption is that it's possibly not solved, else I have always got
>> very quick answers and responses :-) to my questions on all CDH groups,
> Spark, Hive
>
> best regards
>
> sanjay
>
>
>
From: Josh Rosen
To: Sanjay Subramanian
Cc: "user@spark.apache.org"
Sent: Friday, June 12, 2015 7:15 AM
Subject: Re: spark-sql from CLI --->EXCEPTION: java.lang.OutOfMemoryError:
Java heap space
It sounds like this might be caused by a memory configuration problem. In
addition to looking at the executor memory, I'd also bump up the driver memory,
since it appears that your shell is running out of memory when collecting a
large query result.
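For illustration, a minimal sketch of raising both settings from the CLI. The 4g sizes and the query are placeholders, not values from this thread; `--driver-memory` and `--executor-memory` are the standard spark-submit options, which spark-sql also accepts.

```shell
# Placeholder sizes: pick values that fit the machine.
# The query and table name (some_table) are hypothetical, only there
# to make the command complete.
/opt/cloudera/parcels/CDH/lib/spark/bin/spark-sql \
  --driver-memory 4g \
  --executor-memory 4g \
  -e "use aers; select count(*) from some_table"
```

The driver setting is the one that matters when the CLI itself collects a large result, since those rows end up in the driver's heap.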
Sent from my phone
> On Jun 11, 2015, at 8:43 AM, Sanjay Subramanian
> wrote:
>
> hey guys
>
> Using Hive and Impala daily intensively.
> Want to transition to spark-sql in CLI mode
>
> Currently in my sandbox I am using the Spark (standalone mode) in the CDH
> distribution (starving devel