[GitHub] [spark] JkSelf commented on issue #24978: [SPARK-28177][SQL] Adjust post shuffle partition number in adaptive execution

GitBox Mon, 01 Jul 2019 20:09:12 -0700

JkSelf commented on issue #24978: [SPARK-28177][SQL] Adjust post shuffle 
partition number in adaptive execution
URL: https://github.com/apache/spark/pull/24978#issuecomment-507500129
 
 
   @carsonwang @maryannxue I encountered the following exception with q1 of 
TPC-DS in yarn mode when enable or disable 
`spark.sql.adaptive.reducePostShufflePartitions.enabled` . And this exception 
may be not related with this PR, but with 
[PR#24706](https://github.com/apache/spark/pull/24706)
   `org.apache.spark.SparkException: Adaptive execution failed due to stage 
materialization failures.
        at 
org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec.cleanUpAndThrowException(AdaptiveSparkPlanExec.scala:383)
        at 
org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec.doExecute(AdaptiveSparkPlanExec.scala:158)
        at 
org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:168)
        at 
org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:192)
        at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
        at 
org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:189)
        at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:164)
        at 
org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:284)
        at 
org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:333)
        at 
org.apache.spark.sql.execution.SparkPlan.executeCollectPublic(SparkPlan.scala:363)
        at 
org.apache.spark.sql.execution.HiveResult$.hiveResultString(HiveResult.scala:52)
        at 
org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.$anonfun$run$1(SparkSQLDriver.scala:65)
        at 
org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$4(SQLExecution.scala:100)
        at 
org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:160)
        at 
org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:87)
        at 
org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:65)
        at 
org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:367)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:311)
        at 
org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:409)
        at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:425)
        at 
org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:196)
        at 
org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at 
org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
        at 
org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:920)
        at 
org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:179)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:202)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:89)
        at 
org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:999)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1008)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
   Caused by: org.apache.spark.SparkException: Failed to materialize query 
stage: ShuffleQueryStage 6
   `


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] [spark] JkSelf commented on issue #24978: [SPARK-28177][SQL] Adjust post shuffle partition number in adaptive execution

Reply via email to