[GitHub] [spark] mridulm edited a comment on pull request #35185: [SPARK-37831][CORE] add task partition id in TaskInfo and Task Metrics

GitBox Mon, 17 Jan 2022 16:14:00 -0800


mridulm edited a comment on pull request #35185:
URL: https://github.com/apache/spark/pull/35185#issuecomment-1014966382



   > > Took an initial pass through the PR and added some comments - overall 
looks good. We would need to make sure that skew join and partition coalescing 
in SQL interact well with this change.
   > 
   > Thanks for your reply. I have tested partition coalescing in SQL, and it 
works well with this change.
   
   What I want @cloud-fan, @dongjoon-hyun, and others who are more familiar 
with SQL to look at is this: given that a single partition can be computed by 
multiple tasks (for skew), or multiple partitions can be computed by a single 
task (for coalescing), what is the expected relationship between the 
'original' reducer stage and the executed reducer stage? Also, is this 
compatible with the future evolution of SQL?
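
To make the question concrete, here is a minimal sketch in plain Python of the two mappings being discussed. It is not Spark code: `ExecutedTask`, `coalesced_tasks`, and `skew_split_tasks` are hypothetical names used only for illustration, and the grouping rules are simplifying assumptions, not how Spark SQL actually plans these stages.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ExecutedTask:
    """Hypothetical model of one task in the executed reducer stage."""
    task_index: int
    # Partition ids of the 'original' reducer stage that this task reads.
    original_partitions: Tuple[int, ...]
    # True when the task reads only a slice of one (skewed) partition.
    is_partial: bool


def coalesced_tasks(num_partitions: int, group_size: int) -> List[ExecutedTask]:
    """Coalescing: several original partitions are computed by one task.

    Assumption: consecutive partitions are grouped in fixed-size chunks.
    """
    tasks = []
    for task_index, start in enumerate(range(0, num_partitions, group_size)):
        end = min(start + group_size, num_partitions)
        tasks.append(ExecutedTask(task_index, tuple(range(start, end)), False))
    return tasks


def skew_split_tasks(partition_id: int, num_splits: int) -> List[ExecutedTask]:
    """Skew handling: one original partition is computed by several tasks."""
    return [
        ExecutedTask(task_index, (partition_id,), True)
        for task_index in range(num_splits)
    ]


if __name__ == "__main__":
    for task in coalesced_tasks(num_partitions=5, group_size=2):
        print("coalesced:", task)
    for task in skew_split_tasks(partition_id=7, num_splits=3):
        print("skew split:", task)
```

In both cases a task no longer corresponds to exactly one original partition, so a single partition id per task has no obvious meaning, which is the expectation that needs to be spelled out.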


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
