GitHub user jiangxb1987 commented on the issue:

    https://github.com/apache/spark/pull/20414
  
    Hey, I looked into `ExternalAppendOnlyMap` and here are the findings:
    The `ExternalAppendOnlyMap` claims it keeps its content sorted, but it 
actually uses a `HashComparator` that compares the elements by their hashes. 
Luckily, it sorts the elements using TimSort, which is stable. That means that 
even if there are hash collisions, the output sequence should still be 
deterministic as long as the input order is (which we can achieve by modifying 
`ShuffleBlockFetcherIterator` per the previous discussion).
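
    To illustrate the stability argument, here is a minimal sketch in Python 
(not Spark code). `toy_hash`, `hash_compare` and `sort_by_hash` are 
hypothetical names made up for this example, and the hash is deliberately 
weak so that collisions occur. Python's built-in sort is stable, so it stands 
in for the stable sort used in the map:

```python
from functools import cmp_to_key


def toy_hash(key: str) -> int:
    # Hypothetical hash with deliberate collisions: only the key length matters.
    return len(key) % 4


def hash_compare(a, b):
    # Compare two (key, value) records by the hash of their keys only,
    # so records with colliding hashes compare as equal.
    ha, hb = toy_hash(a[0]), toy_hash(b[0])
    return (ha > hb) - (ha < hb)


def sort_by_hash(records):
    # sorted() is stable: records whose hashes tie keep their input order.
    return sorted(records, key=cmp_to_key(hash_compare))


records = [("bb", 1), ("aa", 2), ("c", 3), ("dd", 4), ("e", 5)]

first = sort_by_hash(records)
second = sort_by_hash(list(records))              # same input order again
reordered = sort_by_hash(list(reversed(records)))  # different input order
```

    Here `first` and `second` are identical even though "bb", "aa" and "dd" 
collide, while `reordered` differs from both. The sort alone is not enough: 
the output is only deterministic if the input order is.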
    
    We may still need to check all the other places where we spill or compare 
objects to ensure we generate a deterministic output sequence everywhere, though.

