[
https://issues.apache.org/jira/browse/SPARK-23427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16368728#comment-16368728
]
Kazuaki Ishizaki edited comment on SPARK-23427 at 2/19/18 12:49 AM:
--------------------------------------------------------------------
Thank you. I ran this program several times with a 64GB heap size. I saw the
following OOM in both cases, `-1` and the default (`10*1024*1024`). I am running the
program with other heap sizes now.
Is this OOM what you are seeing? If not, I would appreciate it if you could
upload the stack trace from when the OOM occurred.
{code}
[info] org.apache.spark.sql.MyTest *** ABORTED *** (2 hours, 14 minutes, 36 seconds)
[info] java.lang.OutOfMemoryError:
[info] at java.lang.AbstractStringBuilder.hugeCapacity(AbstractStringBuilder.java:161)
[info] at java.lang.AbstractStringBuilder.newCapacity(AbstractStringBuilder.java:155)
[info] at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:125)
[info] at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)
[info] at java.lang.StringBuilder.append(StringBuilder.java:136)
[info] at java.lang.StringBuilder.append(StringBuilder.java:131)
[info] at scala.StringContext.standardInterpolator(StringContext.scala:125)
[info] at scala.StringContext.s(StringContext.scala:95)
[info] at org.apache.spark.sql.execution.QueryExecution.toString(QueryExecution.scala:199)
[info] at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:74)
[info] at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3252)
[info] at org.apache.spark.sql.Dataset.<init>(Dataset.scala:190)
[info] at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:75)
[info] at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$withPlan(Dataset.scala:3295)
[info] at org.apache.spark.sql.Dataset.createOrReplaceTempView(Dataset.scala:3033)
[info] at org.apache.spark.sql.MyTest$$anonfun$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(MyTest.scala:87)
[info] at org.apache.spark.sql.catalyst.plans.PlanTestBase$class.withSQLConf(PlanTest.scala:176)
[info] at org.apache.spark.sql.MyTest.org$apache$spark$sql$test$SQLTestUtilsBase$$super$withSQLConf(MyTest.scala:27)
[info] at org.apache.spark.sql.test.SQLTestUtilsBase$class.withSQLConf(SQLTestUtils.scala:167)
[info] at org.apache.spark.sql.MyTest.withSQLConf(MyTest.scala:27)
[info] at org.apache.spark.sql.MyTest$$anonfun$1.apply$mcV$sp(MyTest.scala:65)
[info] at org.apache.spark.sql.MyTest$$anonfun$1.apply(MyTest.scala:65)
[info] at org.apache.spark.sql.MyTest$$anonfun$1.apply(MyTest.scala:65)
...
{code}
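For reference, a minimal sketch of how the two configurations compare (assuming a local SparkSession; the DataFrame contents here are hypothetical, not from the reported application):
{code}
import org.apache.spark.sql.SparkSession

object BroadcastThresholdSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("autoBroadcastJoinThreshold sketch")
      .master("local[*]")
      // -1 disables broadcast joins entirely; the default is 10 * 1024 * 1024 (10 MB)
      .config("spark.sql.autoBroadcastJoinThreshold", "-1")
      .getOrCreate()
    import spark.implicits._

    val left  = Seq((1, "a"), (2, "b")).toDF("id", "v")
    val right = Seq((1, "x"), (3, "y")).toDF("id", "w")

    // With the threshold disabled this plans a sort-merge join;
    // with the default, small tables would be broadcast to the executors
    // (the broadcast copy is built on the driver, which is where the
    // reported memory growth occurs).
    left.join(right, "id").explain()

    spark.stop()
  }
}
{code}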
> spark.sql.autoBroadcastJoinThreshold causing OOM exception in the driver
> -------------------------------------------------------------------------
>
> Key: SPARK-23427
> URL: https://issues.apache.org/jira/browse/SPARK-23427
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.0.0
> Environment: SPARK 2.0 version
> Reporter: Dhiraj
> Priority: Critical
>
> We are facing an issue around the value of spark.sql.autoBroadcastJoinThreshold.
> With spark.sql.autoBroadcastJoinThreshold set to -1 (disabled), driver
> memory usage stays flat.
> With any other value (10MB, 5MB, 2MB, 1MB, 10K, 1K) driver memory usage
> grows at a rate depending on the size of the autoBroadcastThreshold, and we
> eventually get an OOM exception. The problem is that the memory used by
> autoBroadcast is not being freed in the driver.
> The application imports Oracle tables as master DataFrames, which are persisted.
> Each job applies filters to these tables and then registers them as
> temp view tables. SQL queries are then used to process the data further. At the end,
> all intermediate DataFrames are unpersisted.
>
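The persist / filter / temp-view / unpersist cycle described in the report can be sketched as follows (the source path, view name, and column names are hypothetical; the report reads from Oracle rather than Parquet):
{code}
import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

object WorkflowSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("workflow sketch")
      .master("local[*]")
      .getOrCreate()

    // Master DataFrame imported once and persisted
    // (in the report, imported from Oracle via JDBC).
    val master = spark.read.parquet("/tmp/master_table")
      .persist(StorageLevel.MEMORY_AND_DISK)

    // Each job filters the master data and registers a temp view ...
    val filtered = master.filter("region = 'EMEA'")
    filtered.createOrReplaceTempView("job_input")

    // ... which SQL queries then process further.
    val result = spark.sql(
      "SELECT region, count(*) AS n FROM job_input GROUP BY region")
    result.show()

    // Intermediate DataFrames are unpersisted at the end of the job.
    master.unpersist()
    spark.stop()
  }
}
{code}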
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]