zhouyejoe opened a new pull request #34156:
URL: https://github.com/apache/spark/pull/34156


   We found an issue where a user had enabled both AQE and push-based shuffle, 
and the job hung after running some stages. Thread dumps taken from the 
executors showed that the tasks were still waiting to fetch shuffle blocks.
   This PR proposes changes to fix the issue.
   
   ### What changes were proposed in this pull request?
   Disable batch fetch when push-based shuffle is enabled.
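
The intent of the change can be pictured as an extra guard on the batch-fetch decision. The following is only a minimal sketch of that idea, not the code in this PR: `FetchSettings`, `BatchFetchGate`, and `useBatchFetch` are hypothetical names, and the real decision in Spark may depend on further conditions that are not modeled here.

```scala
// Hypothetical model of the guard; these names do not exist in Spark.
final case class FetchSettings(
    batchFetchRequested: Boolean,     // the reader asked for batch fetch
    pushBasedShuffleEnabled: Boolean  // push-based shuffle is on for the application
)

object BatchFetchGate {
  // Batch fetch is used only when it was requested AND push-based shuffle is off.
  def useBatchFetch(settings: FetchSettings): Boolean =
    settings.batchFetchRequested && !settings.pushBasedShuffleEnabled
}
```

With this guard, a reader that requests batch fetch falls back to regular block fetching whenever push-based shuffle is enabled, and behaves as before otherwise.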
   
   ### Why are the changes needed?
   Without this patch, enabling both AQE and push-based shuffle can cause 
tasks to hang.
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   Tested the PR in Spark shell with the following queries:
   
   sql("""SELECT CASE WHEN rand() < 0.8 THEN 100 ELSE CAST(rand() * 30000000 AS 
INT) END AS s_item_id, CAST(rand() * 100 AS INT) AS s_quantity, 
DATE_ADD(current_date(), - CAST(rand() * 360 AS INT)) AS s_date FROM 
RANGE(1000000000)""").createOrReplaceTempView("sales")
   // Dynamically coalesce partitions
   sql("""SELECT s_date, sum(s_quantity) AS q FROM sales GROUP BY s_date ORDER 
BY q DESC""").collect
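
The queries above exercise the problem only when both features are active. The PR does not list the settings that were used, so the following is a sketch of a plausible configuration, assuming the standard Spark configuration keys for AQE, the external shuffle service, and push-based shuffle. Cluster-side requirements for push-based shuffle (for example, the shuffle service setup) are not shown.

```scala
import org.apache.spark.SparkConf

// Assumed settings; the PR does not state the exact configuration that was used.
val conf = new SparkConf()
  .set("spark.sql.adaptive.enabled", "true")                    // AQE
  .set("spark.sql.adaptive.coalescePartitions.enabled", "true") // dynamic partition coalescing
  .set("spark.shuffle.service.enabled", "true")                 // external shuffle service
  .set("spark.shuffle.push.enabled", "true")                    // push-based shuffle

// In a standalone application, pass it to SparkSession.builder().config(conf).
```

In Spark shell, the shuffle-related settings would typically need to be supplied at launch as `--conf key=value` options, since a session already exists by the time the prompt appears.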
   
   Unit tests to be added.
   




