GitHub user viirya commented on a diff in the pull request:
    --- Diff: 
    @@ -118,10 +115,12 @@ case class HashClusteredDistribution(
      * Represents data where tuples have been ordered according to the 
    - * [[Expression Expressions]].  This is a strictly stronger guarantee than
    - * [[ClusteredDistribution]] as an ordering will ensure that tuples that share the
    - * same value for the ordering expressions are contiguous and will never be split across
    - * partitions.
    + * [[Expression Expressions]]. Its requirement is defined as the following:
    + *   - Given any 2 adjacent partitions, all the rows of the second partition must be larger than or
    + *     equal to any row in the first partition, according to the `ordering` expressions.
    --- End diff --
    Why do we need the equality here? Could we instead require that every row in the
    second partition be strictly larger than any row in the first partition?
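
    To make the difference between the two wordings concrete, here is a minimal sketch on plain integer keys. `OrderedPartitionsSketch`, `satisfiesNonStrict`, and `satisfiesStrict` are hypothetical names used only for illustration and are not part of Spark; real rows would be compared through the `ordering` expressions rather than `Int` comparison.

```scala
// Hypothetical illustration only; none of these names exist in Spark.
object OrderedPartitionsSketch {
  // Applies `ok(a, b)` to every row `a` of a partition and every row `b`
  // of the partition that immediately follows it.
  private def adjacentPairsHold(
      partitions: Seq[Seq[Int]])(ok: (Int, Int) => Boolean): Boolean =
    partitions.sliding(2).forall {
      case Seq(first, second) => first.forall(a => second.forall(b => ok(a, b)))
      case _                  => true // fewer than two partitions: nothing to compare
    }

  // Wording in the diff: rows of the second partition are >= rows of the first.
  def satisfiesNonStrict(partitions: Seq[Seq[Int]]): Boolean =
    adjacentPairsHold(partitions)(_ <= _)

  // Wording proposed in the comment: rows of the second partition are > rows of the first.
  def satisfiesStrict(partitions: Seq[Seq[Int]]): Boolean =
    adjacentPairsHold(partitions)(_ < _)
}
```

    With `Seq(Seq(1, 2, 2), Seq(2, 3))`, where the key 2 straddles the partition boundary, `satisfiesNonStrict` holds but `satisfiesStrict` does not. In other words, the equality is what allows rows with equal ordering keys to be split across adjacent partitions; the strict form would forbid that.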

