howardli9175 opened a new issue, #1893:
URL: https://github.com/apache/auron/issues/1893

   1. Environment
   6-worker-node YARN cluster, x86 architecture, each node with 64 cores and 500GB memory.
   Hadoop 3.2.2
   Spark 3.5.4
   Blaze 5.0.0
   2. How to reproduce
   Run the TPC-DS benchmark on the 10TB dataset (Parquet + ZSTD compression) with the following configuration:
   ```
   spark.executor.cores=1
   spark.executor.memory=16g
   spark.executor.memoryOverhead=16g
   spark.driver.cores=1
   spark.driver.memory=20g
   spark.blaze.enable true
   spark.sql.extensions org.apache.spark.sql.blaze.BlazeSparkSessionExtension
   spark.shuffle.manager org.apache.spark.sql.execution.blaze.shuffle.BlazeShuffleManager
   spark.memory.offHeap.enabled false
   ```
   Queries q24a and q24b fail.
   The error message is shown in the figure.
   The failure is reproducible on every run. The failed stage has 200 tasks, of which 164 succeeded and 36 failed.
   3. Other scenarios where the queries succeed
   On the 10TB dataset with Blaze disabled, the queries succeed.
   On the 1TB dataset with Blaze enabled, the queries also succeed.
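
For anyone trying to reproduce this, the settings from step 2 can be turned into `spark-submit --conf` flags. The following is only a minimal sketch: the helper names `parse_conf` and `to_submit_args` are hypothetical, the property values are copied from this report, and the application jar and TPC-DS driver arguments are not included because they depend on the benchmark harness used.

```python
import shlex

# Settings copied from step 2 of this report (mixed "key=value" and "key value" forms).
REPORTED_CONF = """
spark.executor.cores=1
spark.executor.memory=16g
spark.executor.memoryOverhead=16g
spark.driver.cores=1
spark.driver.memory=20g
spark.blaze.enable true
spark.sql.extensions org.apache.spark.sql.blaze.BlazeSparkSessionExtension
spark.shuffle.manager org.apache.spark.sql.execution.blaze.shuffle.BlazeShuffleManager
spark.memory.offHeap.enabled false
"""


def parse_conf(text):
    """Parse properties written as 'key=value' or 'key value' into a dict (hypothetical helper)."""
    conf = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Use '=' as the separator if the first token contains it, else split on whitespace.
        if "=" in line.split()[0]:
            key, value = line.split("=", 1)
        else:
            key, value = line.split(None, 1)
        conf[key.strip()] = value.strip()
    return conf


def to_submit_args(conf):
    """Turn a property dict into a flat list of '--conf key=value' arguments (hypothetical helper)."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Print the command prefix; the application jar and its arguments must be appended.
    print(shlex.join(["spark-submit"] + to_submit_args(parse_conf(REPORTED_CONF))))
```

Setting `spark.blaze.enable` to `false` in `REPORTED_CONF` would give a command prefix for the Blaze-disabled comparison run described in step 3, assuming the other settings were kept the same for that run.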

