srowen commented on issue #25556: [SPARK-28853][SQL] Support conf to organize file partitions by file path
URL: https://github.com/apache/spark/pull/25556#issuecomment-524312566

OK, so the idea is that similarly named paths tend to be stored in the same location (disk, machine) on HDFS? I don't think that will hold on object stores. So one question here is how much difference this makes, and for which deployments. I am not sure why the existing code sorts by size descending; it has been that way for a very long time, and it's unclear whether that is just historical or whether there's an important reason for it. I understand the new behavior is behind a flag, but I am not sure whether the new code path causes undesired differences in behavior.
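
To make the two orderings under discussion concrete, here is a minimal Python sketch. It is not Spark's code: `pack`, `by_size_desc`, `by_path`, and the sample paths and sizes are hypothetical names and values chosen for illustration, and the packing is a simplified next-fit pass over an already-sorted file list.

```python
from typing import List, Tuple

File = Tuple[str, int]  # (path, size in bytes); hypothetical representation


def pack(files: List[File], max_split_bytes: int, open_cost_bytes: int = 0) -> List[List[str]]:
    """Next-fit packing: walk the files in the given order and close the
    current partition whenever adding the next file would exceed the limit."""
    partitions: List[List[str]] = []
    current: List[str] = []
    current_size = 0
    for path, size in files:
        if current and current_size + size > max_split_bytes:
            partitions.append(current)
            current = []
            current_size = 0
        current.append(path)
        # Each file also contributes an assumed fixed "open cost".
        current_size += size + open_cost_bytes
    if current:
        partitions.append(current)
    return partitions


def by_size_desc(files: List[File]) -> List[File]:
    """Largest files first (the long-standing ordering the comment refers to)."""
    return sorted(files, key=lambda f: f[1], reverse=True)


def by_path(files: List[File]) -> List[File]:
    """Lexicographic path order, so files under the same directory stay adjacent."""
    return sorted(files, key=lambda f: f[0])


files = [("/a/1", 60), ("/b/1", 50), ("/a/2", 40), ("/b/2", 30)]
size_first = pack(by_size_desc(files), max_split_bytes=100)
path_first = pack(by_path(files), max_split_bytes=100)
```

With this sample input, the size-descending order mixes directories within a partition, while the path order keeps each directory's files together. Whether that grouping translates into better locality depends on the storage layer, which is the open question for object stores.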
