Re: JAVA heap space issue

2016-10-24 Thread Sankar Mittapally
I have a lot of SQL join operations, which were blocking me from writing the
data, and I have unpersisted the data that is no longer needed.


Re: JAVA heap space issue

2016-10-24 Thread Mich Talebzadeh
OK so you are disabling broadcasting although it is not obvious how this
helps in this case!
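
For what it is worth, the stack trace in the original post hints at why this helps: the OutOfMemoryError is thrown in the thread named broadcast-exchange-0 while a HashedRelation is being built, which appears to be the step where Spark materialises the small side of a broadcast join. With spark.sql.autoBroadcastJoinThreshold set to -1, Spark no longer picks broadcast joins automatically, so that step is avoided. Below is a minimal sketch for checking whether a log shows the same signature. The function name oom_in_broadcast and the temporary sample file are illustrative assumptions, not something from the thread.

```shell
# Returns 0 when the given log file contains an OutOfMemoryError that was
# raised in a broadcast-exchange thread, non-zero otherwise.
# (oom_in_broadcast is a hypothetical helper name.)
oom_in_broadcast() {
  grep -q 'Exception in thread "broadcast-exchange-[0-9]*" java\.lang\.OutOfMemoryError' "$1"
}

# Demo on a sample file built from the first lines of the trace in this thread.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
Exception in thread "broadcast-exchange-0" java.lang.OutOfMemoryError: Java heap space
    at org.apache.spark.sql.execution.joins.LongToUnsafeRowMap.append(HashedRelation.scala:539)
EOF

if oom_in_broadcast "$sample"; then
  echo "OOM raised while building a broadcast relation"
fi
rm -f "$sample"
```

In practice the function would be pointed at the driver or notebook log instead of the sample file.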

Dr Mich Talebzadeh



LinkedIn * 
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
*



http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




Re: JAVA heap space issue

2016-10-24 Thread Sankar Mittapally
sc <- sparkR.session(
  master = "spark://ip-172-31-6-116:7077",
  sparkConfig = list(
    spark.executor.memory = "10g",
    spark.app.name = "Testing",
    spark.driver.memory = "14g",
    spark.executor.extraJavaOption = "-Xms2g -Xmx5g -XX:-UseGCOverheadLimit",
    spark.driver.extraJavaOption = "-Xms2g -Xmx5g -XX:-UseGCOverheadLimit",
    spark.cores.max = "2",
    spark.sql.autoBroadcastJoinThreshold = "-1"
  )
)
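
A rough spark-submit equivalent of this session might look like the sketch below. Several things here are assumptions rather than anything stated in the thread: the script name analysis.R is hypothetical, and /opt/spark is only a fallback when SPARK_HOME is unset. spark.cores.max is expressed as --total-executor-cores and spark.app.name as --name. The documented property name is extraJavaOptions (plural), and the Spark configuration documentation says the maximum heap size (-Xmx) must not be set through it, so the sketch passes only -XX:-UseGCOverheadLimit and leaves heap sizing to the memory flags. The command is printed for review, not executed.

```shell
# Values taken from the thread, except where marked hypothetical.
MASTER_URL="spark://ip-172-31-6-116:7077"
APP_SCRIPT="analysis.R"                                # hypothetical script name
SPARK_SUBMIT="${SPARK_HOME:-/opt/spark}/bin/spark-submit"  # fallback path is an assumption

# Build the command as positional parameters so each argument stays intact.
set -- "$SPARK_SUBMIT" \
  --master "$MASTER_URL" \
  --name Testing \
  --driver-memory 14g \
  --executor-memory 10g \
  --total-executor-cores 2 \
  --conf "spark.executor.extraJavaOptions=-XX:-UseGCOverheadLimit" \
  --conf "spark.driver.extraJavaOptions=-XX:-UseGCOverheadLimit" \
  --conf "spark.sql.autoBroadcastJoinThreshold=-1" \
  "$APP_SCRIPT"

# Keep a printable copy and show it; run it with "$@" once the values are right.
LAUNCH_CMD="$*"
echo "$LAUNCH_CMD"
```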


Re: JAVA heap space issue

2016-10-24 Thread Mich Talebzadeh
OK so what is your full launch code now? I mean equivalent to spark-submit



Dr Mich Talebzadeh



Re: JAVA heap space issue

2016-10-24 Thread Sankar Mittapally
Hi Mich,

 I am able to write the files to storage after adding an extra parameter.

FYI..

This one I used.

spark.sql.autoBroadcastJoinThreshold="-1"
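
If the setting should apply to every session rather than to a single sparkR.session call, one option is a properties file. The sketch below assumes a file named ./spark-defaults.conf in the current directory, which is a hypothetical location chosen so the example is harmless to run; spark-submit reads conf/spark-defaults.conf under SPARK_HOME by default, or a file passed with --properties-file. The value is a size in bytes, and -1 disables automatic broadcast joins.

```shell
# Hypothetical file location; point CONF_FILE at the real properties file.
CONF_FILE="${CONF_FILE:-./spark-defaults.conf}"
KEY="spark.sql.autoBroadcastJoinThreshold"

# Append the setting only if the key is not already present in the file.
if ! grep -q "^${KEY}[[:space:]]" "$CONF_FILE" 2>/dev/null; then
  printf '%s  %s\n' "$KEY" "-1" >> "$CONF_FILE"
fi

# Show the line that is now in effect in the file.
grep "^${KEY}" "$CONF_FILE"
```

A less drastic alternative is to lower the threshold rather than disable it, at the cost of having to find a value that suits the data.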



Re: JAVA heap space issue

2016-10-24 Thread Mich Talebzadeh
Rather strange, as you have plenty of free memory there.

Try reducing driver memory to 2GB and executor memory to 2GB and run it
again:

${SPARK_HOME}/bin/spark-submit \
    --driver-memory 2G \
    --num-executors 2 \
    --executor-cores 1 \
    --executor-memory 2G \
    --master spark://IPAddress:7077 \

HTH



Dr Mich Talebzadeh




Re: JAVA heap space issue

2016-10-24 Thread Sankar Mittapally
Hi Mich,

 Yes, I am using a standalone mode cluster. We have two executors with 10G
of memory each, and two workers.

FYI..




Re: JAVA heap space issue

2016-10-24 Thread Mich Talebzadeh
Sounds like you are running in standalone mode.

Have you checked the UI on port 4040 (default) to see where the memory is
going? Why do you need executor memory of 10GB?

How many executors are running, and how many slaves were started?

In standalone mode, executors run on workers (UI on port 8080).
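
The same information can also be pulled from the command line instead of a browser. The sketch below assumes both UIs are reachable on localhost, which is only a placeholder, and uses a placeholder application id. The driver UI on port 4040 serves a REST API under /api/v1; the /json/ path on the standalone master UI is, to my knowledge, also available, but treat it as an assumption to verify on your version. The helper function names are illustrative. The script only prints the curl commands, it does not contact anything.

```shell
DRIVER_HOST="localhost"   # placeholder: host running the driver (UI on 4040)
MASTER_HOST="localhost"   # placeholder: host running the standalone master (UI on 8080)
APP_ID="<app-id>"         # placeholder: take the real id from the applications list

# URL builders (hypothetical helper names).
app_list_url()    { printf 'http://%s:4040/api/v1/applications\n' "$1"; }
executors_url()   { printf 'http://%s:4040/api/v1/applications/%s/executors\n' "$1" "$2"; }
master_json_url() { printf 'http://%s:8080/json/\n' "$1"; }

# Print the commands: list applications, then per-executor memory usage,
# then the master's view of workers and running applications.
echo "curl -s '$(app_list_url "$DRIVER_HOST")'"
echo "curl -s '$(executors_url "$DRIVER_HOST" "$APP_ID")'"
echo "curl -s '$(master_json_url "$MASTER_HOST")'"
```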


HTH

Dr Mich Talebzadeh






On 24 October 2016 at 12:19, sankarmittapally <
sankar.mittapa...@creditvidya.com> wrote:

> Hi,
>
>  I have a three-node cluster with 30G of memory. I am trying to analyze
> 200MB of data and am running out of memory every time. This is the command
> I am using:
>
> Driver Memory = 10G
> Executor memory=10G
>
> sc <- sparkR.session(
>   master = "spark://ip-172-31-6-116:7077",
>   sparkConfig = list(
>     spark.executor.memory = "10g",
>     spark.app.name = "Testing",
>     spark.driver.memory = "14g",
>     spark.executor.extraJavaOption = "-Xms2g -Xmx5g -XX:MaxPermSize=1024M",
>     spark.driver.extraJavaOption = "-Xms2g -Xmx5g -XX:MaxPermSize=1024M",
>     spark.cores.max = "2"
>   )
> )
>
>
> [D 16:43:51.437 NotebookApp] 200 GET /api/contents?type=directory&_=1477289197671 (123.176.38.226) 7.96ms
> Exception in thread "broadcast-exchange-0" java.lang.OutOfMemoryError: Java heap space
>     at org.apache.spark.sql.execution.joins.LongToUnsafeRowMap.append(HashedRelation.scala:539)
>     at org.apache.spark.sql.execution.joins.LongHashedRelation$.apply(HashedRelation.scala:803)
>     at org.apache.spark.sql.execution.joins.HashedRelation$.apply(HashedRelation.scala:105)
>     at org.apache.spark.sql.execution.joins.HashedRelationBroadcastMode.transform(HashedRelation.scala:816)
>     at org.apache.spark.sql.execution.joins.HashedRelationBroadcastMode.transform(HashedRelation.scala:812)
>     at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec$$anonfun$relationFuture$1$$anonfun$apply$1.apply(BroadcastExchangeExec.scala:90)
>     at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec$$anonfun$relationFuture$1$$anonfun$apply$1.apply(BroadcastExchangeExec.scala:72)
>     at org.apache.spark.sql.execution.SQLExecution$.withExecutionId(SQLExecution.scala:94)
>     at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec$$anonfun$relationFuture$1.apply(BroadcastExchangeExec.scala:72)
>     at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec$$anonfun$relationFuture$1.apply(BroadcastExchangeExec.scala:72)
>     at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
>     at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)