s0nskar commented on PR #2373:
URL: https://github.com/apache/celeborn/pull/2373#issuecomment-2037156713
From my understanding, this PR diverges from the vanilla Spark approach, which
is based on mapIndex, and instead divides the full partition into multiple
sub-partitions based on some heuristics. I'm new to the Celeborn code, so I
might be missing something basic, but this PR does not seem to address the
issue below. Consider a basic scenario where a partial partition read is in
progress and we see a FetchFailure.
`ShuffleMapStage --> ResultStage`
- ShuffleMapStage (attempt 0) generated [P0, P1, P2], and P0 is skewed, with
partition locations [0, 1, 2, 3, 4, 5].
- AQE asks for three splits, and this PR's logic creates three sub-partitions:
[0, 1], [2, 3], [4, 5].
- Now suppose the reducer reads [0, 1] and [2, 3] and gets a `FetchFailure`
while reading [4, 5].
- This will trigger a complete mapper stage retry according to this
[doc](https://docs.google.com/document/d/1dkG6fww3g99VAb1wkphNlUES_MpngVPNg8601chmVp8/edit)
and will clear the map output corresponding to the shuffleID.
- ShuffleMapStage (attempt 1) will again generate data for P0, at different
partition locations [a, b, c, d, e, f], and it will be divided as [a, b],
[c, d], [e, f].
- Now, if the reader stage is a `ShuffleMapStage`, it will read every
sub-partition again, but if the reader is a `ResultStage`, it will only read
the missing partition data, which is [e, f].
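
To make the split in the steps above concrete, here is a minimal sketch. I
have not checked the PR's actual heuristic, so the even, contiguous split and
the name `split_locations` are my own assumptions for illustration, not
Celeborn code.

```python
def split_locations(locations, num_splits):
    """Divide partition locations into contiguous, near-equal groups.

    Hypothetical helper: the real heuristic in the PR may differ.
    """
    base, extra = divmod(len(locations), num_splits)
    groups, start = [], 0
    for i in range(num_splits):
        # The first `extra` groups take one additional location.
        size = base + (1 if i < extra else 0)
        groups.append(locations[start:start + size])
        start += size
    return groups


# Attempt 0 and the retry produce different location lists for P0.
first_attempt = split_locations([0, 1, 2, 3, 4, 5], 3)
retry_attempt = split_locations(["a", "b", "c", "d", "e", "f"], 3)
```

The grouping is positional only: the third group of the retry is [e, f]
regardless of which records actually ended up in those locations.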
The data written to location `1` and location `a` can differ because of
factors such as network delay (the same applies to the other locations). For
example, data that lands in the 1st location in the first attempt might land
in the 2nd location, or any other location, in a different attempt, depending
on the order in which the mappers generate the data and the order in which the
server receives it.
This can cause both data loss and data duplication. It might already be
addressed somewhere else in the codebase that I'm not aware of, but I wanted
to point this problem out.
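
A toy simulation of the data loss and duplication described above. Everything
here is made up for illustration: the record names, two records per location,
and the reversed arrival order on retry are assumptions, not behaviour taken
from Celeborn.

```python
from collections import Counter


def layout(records, per_location=2):
    """Place records into locations in arrival order (toy model)."""
    return [records[i:i + per_location]
            for i in range(0, len(records), per_location)]


records = [f"r{i}" for i in range(12)]

# Attempt 0: locations 0..5 hold the records in one arrival order.
attempt0 = layout(records)
# Retry: locations a..f hold the same records, but they arrived in a
# different order (reversed here, purely as an example).
retry = layout(list(reversed(records)))

# The ResultStage reducer already read sub-partitions [0, 1] and [2, 3]
# from attempt 0 ...
read = [r for loc in attempt0[0:4] for r in loc]
# ... and after the retry it reads only the missing sub-partition [e, f].
read += [r for loc in retry[4:6] for r in loc]

counts = Counter(read)
duplicated = {r for r, c in counts.items() if c > 1}
missing = set(records) - set(read)
```

In this toy run four records are read twice and four are never read, even
though the reducer consumed exactly three sub-partitions.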