Dandandan commented on PR #4867: URL: https://github.com/apache/arrow-datafusion/pull/4867#issuecomment-1377179232
Another "problematic" case is when a column or expression that is part of a hash repartition has low cardinality and the data is sorted or semi-sorted (i.e. each value is repeated many times before the next one appears). In this case the buffer will keep filling up until the next value is consumed. For other situations, though, it is still better than the current behavior.
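To make the sorted, low-cardinality case concrete, here is a minimal sketch in plain Python. It is not DataFusion code: `partition_of` and `longest_run` are hypothetical helpers, and `hash(key) % num_partitions` is only a stand-in for whatever hash the repartition actually uses. It measures how many consecutive rows are routed to the same output partition, which is roughly how long one buffer keeps growing while the other partitions receive nothing.

```python
import random
from itertools import groupby


def partition_of(key: int, num_partitions: int) -> int:
    # Stand-in for the real hash function (an assumption for illustration).
    return hash(key) % num_partitions


def longest_run(keys, num_partitions: int) -> int:
    """Longest streak of consecutive rows routed to a single output partition."""
    targets = (partition_of(k, num_partitions) for k in keys)
    # groupby collapses consecutive equal targets into one group per streak.
    return max((sum(1 for _ in group) for _, group in groupby(targets)), default=0)


# Low cardinality: 4 distinct key values, each repeated 1000 times, in sorted order.
sorted_keys = [k for k in range(4) for _ in range(1000)]

# Same rows in random order, for comparison.
shuffled_keys = sorted_keys[:]
random.Random(0).shuffle(shuffled_keys)

print("sorted:  ", longest_run(sorted_keys, 4))
print("shuffled:", longest_run(shuffled_keys, 4))
```

With sorted input every streak is as long as the number of repeats of a key, so a single partition absorbs all of those rows before any other partition sees data. With shuffled input the streaks are short and the rows spread across partitions almost immediately.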