[jira] [Commented] (HIVE-8373) OOM for a simple query with spark.master=local [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15782500#comment-15782500 ]

Rui Li commented on HIVE-8373:
------------------------------

Thanks [~asears] for the explanations. [~kellyzly], yes, please go ahead and update the wiki. I think we can put it in the [known issue|https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started#HiveonSpark:GettingStarted-CommonIssues(Greenareresolved,willberemovedfromthislist)] part.

> OOM for a simple query with spark.master=local [Spark Branch]
> -------------------------------------------------------------
>
>                 Key: HIVE-8373
>                 URL: https://issues.apache.org/jira/browse/HIVE-8373
>             Project: Hive
>          Issue Type: Bug
>          Components: Spark
>            Reporter: Xuefu Zhang
>            Assignee: liyunzhang_intel
>
> I have a straightforward query to run in Spark local mode, but get an OOM
> even though the data volume is tiny:
> {code}
> Exception in thread "Spark Context Cleaner"
> Exception: java.lang.OutOfMemoryError thrown from the
> UncaughtExceptionHandler in thread "Spark Context Cleaner"
> Exception in thread "Executor task launch worker-1"
> Exception: java.lang.OutOfMemoryError thrown from the
> UncaughtExceptionHandler in thread "Executor task launch worker-1"
> Exception in thread "Keep-Alive-Timer"
> Exception: java.lang.OutOfMemoryError thrown from the
> UncaughtExceptionHandler in thread "Keep-Alive-Timer"
> Exception in thread "Driver Heartbeater"
> Exception: java.lang.OutOfMemoryError thrown from the
> UncaughtExceptionHandler in thread "Driver Heartbeater"
> {code}
> The query is:
> {code}
> select product_name, avg(item_price) as avg_price from product join item on
> item.product_pk=product.product_pk group by product_name order by avg_price;
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-8373) OOM for a simple query with spark.master=local [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15782254#comment-15782254 ]

liyunzhang_intel commented on HIVE-8373:
----------------------------------------

[~asears]: I have tested the command you provided and it works in a JDK 8 environment.
[~lirui]: It would be better to add this to the Hive on Spark wiki, because users may want to try Spark local mode.
[jira] [Commented] (HIVE-8373) OOM for a simple query with spark.master=local [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15773646#comment-15773646 ]

Andrew Sears commented on HIVE-8373:
------------------------------------

[~lirui] It would be best to set an upper bound, to limit the effect of an OOM on the entire system. There are some memory leaks in Spark that may be caught by setting this to a lower value.
[jira] [Commented] (HIVE-8373) OOM for a simple query with spark.master=local [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15771778#comment-15771778 ]

Rui Li commented on HIVE-8373:
------------------------------

Thanks [~asears] for the input. Just curious: in which cases should MaxMetaspaceSize be set? I think it may be useful for finding leaks in class loading. Otherwise, I guess users don't have to set an upper bound for the Metaspace, right?
[jira] [Commented] (HIVE-8373) OOM for a simple query with spark.master=local [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15769862#comment-15769862 ]

Andrew Sears commented on HIVE-8373:
------------------------------------

[~kellyzly] Java 8 removes PermGen memory and replaces it with Metaspace. See the StackOverflow link above and these additional ones:

http://karunsubramanian.com/websphere/one-important-change-in-memory-management-in-java-8/
https://blogs.oracle.com/poonam/entry/about_g1_garbage_collector_permanent
http://geekswithblogs.net/JoshReuben/archive/2016/04/11/jvm-tuning.aspx

So in JDK 8, it appears that Metaspace can grow until all native memory is used unless it is capped.

For the local Spark master:
    HADOOP_OPTS="$HADOOP_OPTS -XX:MaxMetaspaceSize=512m"

For the Spark History Server:
    SPARK_DAEMON_JAVA_OPTS="$SPARK_DAEMON_JAVA_OPTS -XX:MaxMetaspaceSize=512m"

Let me know if you need anything added to the wiki around this.
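Since the flag that applies depends on the JDK (PermGen up to JDK 7, Metaspace from JDK 8), the choice could be scripted. The following is only a rough sketch: the function names metadata_flag_for and detect_java_version are hypothetical, and the sizes are simply the 128m and 512m values quoted in this thread, not tuned recommendations.

```shell
# Hypothetical helper (not part of Hive or Spark): map a Java version string
# to the flag that caps class-metadata memory on that JDK.
# The sizes are the values quoted in this thread, not tuned recommendations.
metadata_flag_for() {
  version="$1"
  case "$version" in
    # "1.7" or "1.7.0_80" style strings: JDK 7 and earlier still use PermGen.
    1.[0-7]|1.[0-7].*) echo "-XX:MaxPermSize=128m" ;;
    # Everything else (1.8.x, 9, 11, ...): PermGen is gone, cap Metaspace.
    *) echo "-XX:MaxMetaspaceSize=512m" ;;
  esac
}

# Hypothetical helper: print the quoted version from `java -version`,
# which writes its banner to stderr.
detect_java_version() {
  java -version 2>&1 | awk -F '"' '/version/ {print $2; exit}'
}

# Possible use in conf/hive-env.sh (left commented out; needs java on PATH):
# export HADOOP_OPTS="$HADOOP_OPTS $(metadata_flag_for "$(detect_java_version)")"
```

Whether a cap is wanted at all on JDK 8 is the open question in this thread, so the Metaspace branch is best read as optional.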
[jira] [Commented] (HIVE-8373) OOM for a simple query with spark.master=local [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15769353#comment-15769353 ]

liyunzhang_intel commented on HIVE-8373:
----------------------------------------

[~lirui]: Yes, this problem only happens on JDK 7. After upgrading to JDK 8, there is no need to increase the value of MaxPermSize. So my question is: is MaxPermSize increased by default in JDK 8?
[jira] [Commented] (HIVE-8373) OOM for a simple query with spark.master=local [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15769309#comment-15769309 ]

Rui Li commented on HIVE-8373:
------------------------------

Thanks [~kellyzly] for working on this. Have you verified that the OOM is related to perm size? If so, yes, let's document this in our wiki. And I suppose it's only needed for JDK 7, right?
[jira] [Commented] (HIVE-8373) OOM for a simple query with spark.master=local [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15769284#comment-15769284 ]

liyunzhang_intel commented on HIVE-8373:
----------------------------------------

[~xuefuz] and [~lirui]: I have run with spark.master=local successfully, avoiding the OOM exception, by appending the following to conf/hive-env.sh:
{code}
export HADOOP_OPTS="$HADOOP_OPTS -XX:MaxPermSize=128m"
{code}
See the [answer|http://stackoverflow.com/questions/34476195/why-does-creating-hivecontext-fail-with-java-lang-outofmemoryerror-permgen-spa] on StackOverflow for how to solve the OOM exception. Do we need to increase the value of MaxPermSize to make HoS run successfully in local mode? If yes, we should add this to the wiki.
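If conf/hive-env.sh ends up being sourced more than once, the plain append above would repeat the flag in HADOOP_OPTS. A minimal sketch of a guarded append follows; the helper name append_jvm_opt is hypothetical and not something Hive provides.

```shell
# Hypothetical helper (not part of Hive): append a JVM flag to an options
# string only if that exact flag is not already present.
append_jvm_opt() {
  opts="$1"
  flag="$2"
  case " $opts " in
    # Flag already present as a whole word: return the options unchanged.
    *" $flag "*) printf '%s\n' "$opts" ;;
    # Otherwise append it, adding a separating space only when opts is non-empty.
    *) printf '%s\n' "${opts:+$opts }$flag" ;;
  esac
}

# Same effect as the export above, but safe to run repeatedly.
HADOOP_OPTS="$(append_jvm_opt "${HADOOP_OPTS:-}" "-XX:MaxPermSize=128m")"
export HADOOP_OPTS
```

The match is on the exact flag text, so a different size for the same option (for example -XX:MaxPermSize=256m) would still be appended alongside the existing one.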
[jira] [Commented] (HIVE-8373) OOM for a simple query with spark.master=local [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15714427#comment-15714427 ]

liyunzhang_intel commented on HIVE-8373:
----------------------------------------

[~xuefuz] and [~lirui]: I have hit the same OOM error when running HoS with spark.master=local in HOS20. Can HoS work with spark16?