[ https://issues.apache.org/jira/browse/SPARK-12837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15961829#comment-15961829 ]
balaji krishnan commented on SPARK-12837:
-----------------------------------------
Thanks @teobar. I did what you suggested, but I am now hitting other problems,
including a "broken pipe" error:
java.io.IOException: Broken pipe
    at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
    at sun.nio.ch.IOUtil.write(IOUtil.java:65)
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
Remote RPC client disassociated. Likely due to containers exceeding thresholds,
or network issues.
The memory settings were --driver-memory 16g and spark.driver.maxResultSize=0
Thanks
Bala
> Spark driver requires large memory space for serialized results even there
> are no data collected to the driver
> --------------------------------------------------------------------------------------------------------------
>
> Key: SPARK-12837
> URL: https://issues.apache.org/jira/browse/SPARK-12837
> Project: Spark
> Issue Type: Question
> Components: SQL
> Affects Versions: 1.5.2, 1.6.0
> Reporter: Tien-Dung LE
> Assignee: Wenchen Fan
> Priority: Critical
> Fix For: 2.0.0
>
>
> Executing a SQL statement with a large number of partitions requires a large
> amount of driver memory, even when there are no requests to collect data back
> to the driver.
> Here are the steps to reproduce the issue.
> 1. Start spark shell with a spark.driver.maxResultSize setting
> {code:java}
> bin/spark-shell --driver-memory=1g --conf spark.driver.maxResultSize=1m
> {code}
> 2. Execute the code
> {code:java}
> case class Toto( a: Int, b: Int)
> val df = sc.parallelize( 1 to 1e6.toInt).map( i => Toto( i, i)).toDF
> sqlContext.setConf( "spark.sql.shuffle.partitions", "200" )
> df.groupBy("a").count().saveAsParquetFile( "toto1" ) // OK
> sqlContext.setConf( "spark.sql.shuffle.partitions", 1e3.toInt.toString )
> df.repartition(1e3.toInt).groupBy("a").count().repartition(1e3.toInt).saveAsParquetFile(
> "toto2" ) // ERROR
> {code}
> The error message is
> {code:java}
> Caused by: org.apache.spark.SparkException: Job aborted due to stage failure:
> Total size of serialized results of 393 tasks (1025.9 KB) is bigger than
> spark.driver.maxResultSize (1024.0 KB)
> {code}
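
As a rough check on the figures in the quoted error message, the sketch below works out the average serialized result size per task and extrapolates it to the 1000 partitions used in the reproduction. The object and method names are hypothetical, and the extrapolation assumes every task returns a result of about the same size, which the issue does not state.

```scala
// Hypothetical helper for illustration; not part of Spark.
object ResultSizeEstimate {
  /** Average serialized result size per task, in KB. */
  def avgKbPerTask(totalKb: Double, tasks: Int): Double =
    totalKb / tasks

  /** Projected total in KB for `partitions` tasks, ASSUMING uniform per-task size. */
  def projectedKb(totalKb: Double, tasks: Int, partitions: Int): Double =
    avgKbPerTask(totalKb, tasks) * partitions

  def main(args: Array[String]): Unit = {
    // Figures copied from the error message in the issue description.
    val tasksReported = 393
    val totalKb = 1025.9
    val limitKb = 1024.0
    // Partition count used in the failing statement (1e3.toInt).
    val partitions = 1000

    val avg = avgKbPerTask(totalKb, tasksReported)
    val projected = projectedKb(totalKb, tasksReported, partitions)

    println(f"average per task: $avg%.2f KB")
    println(f"projected for $partitions tasks: $projected%.0f KB (limit: $limitKb%.0f KB)")
  }
}
```

Under that assumption, each task sends back roughly 2.6 KB even though nothing is collected, so the 1 MB limit is reached well before all 1000 tasks have finished.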