GitHub user ash211 commented on the issue:
https://github.com/apache/spark/pull/20372
Tagging folks who have touched this code recently: @vgankidi @ericl @davies
This seems to provide a more compact packing in every scenario, which
should improve execution times. One risk is that a partition is no longer
always a contiguous, in-order range of files; it can now contain a gap.
In the test this is the `(file1, file6)` partition. Anything that depends
on the old behavior could now break, though I don't think anything should
require this partition ordering.
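
To make the contiguity concern concrete, here is a minimal Python sketch. It is not the code from this PR: the function names, the file sizes, and the size limit are all hypothetical, and the two strategies (close the partition when the next file does not fit, versus place each file in the first partition with room) are stand-ins chosen only to show how a tighter packing can yield a partition with a gap, such as `(file1, file6)`.

```python
def pack_next_fit(sizes, max_bytes):
    """Hypothetical sequential packing: once a file does not fit,
    close the current partition and never revisit it."""
    partitions, current, used = [], [], 0
    for idx, size in enumerate(sizes):
        if current and used + size > max_bytes:
            partitions.append(current)
            current, used = [], 0
        current.append(idx)
        used += size
    if current:
        partitions.append(current)
    return partitions


def pack_first_fit(sizes, max_bytes):
    """Hypothetical tighter packing: put each file into the first
    partition that still has room, opening a new one only if none does."""
    partitions, used = [], []
    for idx, size in enumerate(sizes):
        for p, filled in enumerate(used):
            if filled + size <= max_bytes:
                partitions[p].append(idx)
                used[p] += size
                break
        else:
            partitions.append([idx])
            used.append(size)
    return partitions


def is_contiguous(partition):
    """True if the file indices in a partition form an unbroken range."""
    return all(b - a == 1 for a, b in zip(partition, partition[1:]))


# Assumed sizes for file1..file6 (indices 0..5) and an assumed limit of 5.
sizes = [3, 5, 5, 5, 5, 2]
sequential = pack_next_fit(sizes, 5)
tighter = pack_first_fit(sizes, 5)
gapped = [p for p in tighter if not is_contiguous(p)]
```

With these assumed inputs the sequential strategy needs six partitions, while the tighter one needs five and places indices 0 and 5 (file1 and file6) together. Any consumer that assumes each partition covers an unbroken range of files would see the difference only in that one partition.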