Github user MLnick commented on the issue:

    https://github.com/apache/spark/pull/19621
  
    It won't be deterministic across different RDDs, partitionings, shuffles, etc. For a given input RDD, though, it _should_ be deterministic?
    
    But perhaps we could guarantee determinism by first sorting alphabetically and then by frequency?
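
    A rough sketch of that idea in plain Python rather than Spark, just to illustrate the tie-breaking. The function name `ordered_labels` and the sample data are made up for this example and are not taken from the PR. It relies on the second sort being stable, so labels with equal frequency keep their alphabetical order regardless of the order the input arrived in:

```python
from collections import Counter


def ordered_labels(values):
    """Return distinct labels ordered by descending frequency,
    with ties broken alphabetically (hypothetical helper)."""
    counts = Counter(values)
    # Step 1: sort alphabetically so ties start in a well-defined order.
    alphabetical = sorted(counts)
    # Step 2: stable sort by descending frequency; equal counts keep
    # the alphabetical order established in step 1.
    return sorted(alphabetical, key=lambda label: -counts[label])


# Two orderings of the same data, as might result from different partitioning.
first = ordered_labels(["b", "a", "c", "a", "b"])
second = ordered_labels(["a", "b", "a", "c", "b"])
print(first, second)
```

    Both calls yield the same order because the result depends only on the counts and the labels themselves, not on the order in which the records are seen.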

