Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/20372
It sounds like we fixed a "bug" and made the actual partition size closer
to the expected one, but caused another "bug". Two speculations:
1. The expected partition size does not maximize read performance
2. The open file cost is wrongly estimated
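
To make the second speculation concrete, here is a minimal sketch of how a per-file open cost can change how files get packed into partitions. It is a simplified model and not Spark's actual code: the function name `pack_files`, the greedy packing order, and the sizes used are assumptions for illustration. The `open_cost` parameter stands in for the value configured by `spark.sql.files.openCostInBytes`, and `max_split` for the target partition size.

```python
MB = 1024 * 1024


def pack_files(file_sizes, max_split, open_cost):
    """Greedily pack files into partitions (hypothetical, simplified model).

    Each file added to a partition is charged its real size plus an
    estimated open cost, so a larger open cost closes partitions sooner.
    """
    partitions = []
    current, current_size = [], 0
    for size in sorted(file_sizes, reverse=True):
        # Close the current partition if this file would overflow it.
        if current and current_size + size > max_split:
            partitions.append(current)
            current, current_size = [], 0
        current.append(size)
        # Charge the file's bytes plus the estimated cost of opening it.
        current_size += size + open_cost
    if current:
        partitions.append(current)
    return partitions


small_files = [1 * MB] * 10

with_cost = pack_files(small_files, max_split=8 * MB, open_cost=4 * MB)
without_cost = pack_files(small_files, max_split=8 * MB, open_cost=0)

print(len(with_cost), [sum(p) // MB for p in with_cost])
print(len(without_cost), [sum(p) // MB for p in without_cost])
```

In this toy setup the same ten 1 MB files end up in five partitions when the open cost is 4 MB and in two partitions when it is zero, so the real bytes per partition depend heavily on that estimate. If the estimate is off, the actual partition size can drift from the expected one in either direction.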
