Re: Zeppelin out of memory issue - (GC overhead limit exceeded)

2017-03-25 Thread RUSHIKESH RAUT
Yes, I know it's inevitable if the data is large. What I want to know is how
to increase the interpreter memory so it can handle larger data.

Thanks,
Rushikesh Raut

On Mar 26, 2017 8:56 AM, "Jianfeng (Jeff) Zhang" wrote:

>
> How large is your data? This problem is inevitable if your data is too
> large. You can try to use a Spark DataFrame if that works for you.
>
>
>
>
>
> Best Regard,
> Jeff Zhang
>
>
> From: RUSHIKESH RAUT 
> Reply-To: "users@zeppelin.apache.org" 
> Date: Saturday, March 25, 2017 at 5:06 PM
> To: "users@zeppelin.apache.org" 
> Subject: Zeppelin out of memory issue - (GC overhead limit exceeded)
>
> Hi everyone,
>
> I am trying to load some data from a Hive table into my notebook and then
> convert this DataFrame into an R data frame using the spark.r interpreter.
> This works perfectly for a small amount of data.
> But if the data size increases, it gives me this error:
>
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>
> I have tried increasing ZEPPELIN_MEM and ZEPPELIN_INTP_MEM in
> the zeppelin-env.cmd file, but I am still facing this issue. I have used the
> following configuration:
>
> set ZEPPELIN_MEM="-Xms4096m -Xmx4096m -XX:MaxPermSize=2048m"
> set ZEPPELIN_INTP_MEM="-Xmx4096m -Xms4096m -XX:MaxPermSize=2048m"
>
> I am sure this much memory should be sufficient for my data, but I am still
> getting the same error. Any guidance would be much appreciated.
>
> Thanks,
> Rushikesh Raut
>
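For reference, a minimal sketch of one way to raise the Spark interpreter's
memory, assuming Zeppelin launches the interpreter through spark-submit and
that the Windows scripts honor SPARK_SUBMIT_OPTIONS the same way the Linux
ones do (the 8g values below are arbitrary examples, not recommendations):

set ZEPPELIN_INTP_MEM=-Xms4096m -Xmx4096m
set SPARK_SUBMIT_OPTIONS=--driver-memory 8g --executor-memory 8g

Alternatively, spark.driver.memory and spark.executor.memory can be set in the
Spark interpreter settings (Interpreter menu), followed by an interpreter
restart. Converting a Spark DataFrame into an R data frame collects the whole
dataset into the driver process, so the driver heap is the one that matters
most for this error.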


Setting Zeppelin to work with multiple Hadoop clusters when running Spark.

2017-03-25 Thread Serega Sheypak
Hi, I have three Hadoop clusters. Each cluster has its own NameNode HA
configuration and its own YARN.
I want to allow users to read from any cluster and write to any cluster.
Users should also be able to choose which cluster their Spark job runs on.
What is the right way to configure this in Zeppelin?
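One possible setup, sketched under the assumption that all three clusters run
Spark on YARN and that each cluster's Hadoop client configuration is available
on the Zeppelin host (the interpreter names and paths below are hypothetical):
create one Spark interpreter per cluster in the Interpreter menu, e.g.
spark_c1, spark_c2, spark_c3, and point each one at its own configuration,
roughly:

  master           yarn-client
  SPARK_HOME       /opt/spark
  HADOOP_CONF_DIR  /etc/hadoop/conf.cluster1

A notebook then chooses where its Spark job runs by binding the matching
interpreter. Reads and writes across clusters can use fully qualified URIs
such as hdfs://cluster2-ns/path, provided the nameservices of the other
clusters are also declared in the hdfs-site.xml visible to the chosen
interpreter. Whether HADOOP_CONF_DIR can be set per interpreter rather than
globally in zeppelin-env depends on the Zeppelin version, so treat this as a
sketch rather than a tested recipe.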


Zeppelin out of memory issue - (GC overhead limit exceeded)

2017-03-25 Thread RUSHIKESH RAUT
Hi everyone,

I am trying to load some data from a Hive table into my notebook and then
convert this DataFrame into an R data frame using the spark.r interpreter.
This works perfectly for a small amount of data.
But if the data size increases, it gives me this error:

java.lang.OutOfMemoryError: GC overhead limit exceeded

I have tried increasing ZEPPELIN_MEM and ZEPPELIN_INTP_MEM in
the zeppelin-env.cmd file, but I am still facing this issue. I have used the
following configuration:

set ZEPPELIN_MEM="-Xms4096m -Xmx4096m -XX:MaxPermSize=2048m"
set ZEPPELIN_INTP_MEM="-Xmx4096m -Xms4096m -XX:MaxPermSize=2048m"

I am sure this much memory should be sufficient for my data, but I am still
getting the same error. Any guidance would be much appreciated.

Thanks,
Rushikesh Raut
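
A side note on the configuration above, offered as a sketch rather than a
confirmed fix: with cmd.exe's set, the surrounding double quotes become part
of the variable's value, and on Java 8 the -XX:MaxPermSize option is ignored
(the permanent generation was removed), so a plainer variant may be worth
trying:

set ZEPPELIN_MEM=-Xms4096m -Xmx4096m
set ZEPPELIN_INTP_MEM=-Xms4096m -Xmx4096m

Whether the quotes actually cause a problem here depends on how the Zeppelin
start scripts expand these variables.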