Hey guys,

As a follow-up, I raised our target partition size to 600 MB (up from 64 MB), which grouped this report's 500 GB of tiny S3 files into ~700 partitions, and everything ran much more smoothly.
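For a rough sanity check on those numbers, here is a back-of-the-envelope sketch. The function name and the binary-unit constants are my own, purely for illustration, and this is not how Spark itself computes splits. It only divides total input size by the target partition size, so it ignores how individual files actually get packed.

```python
# Rough estimate only: partitions ~= total input size / target partition size.
# estimate_partitions is a hypothetical helper, not a Spark API.

MB = 1024 ** 2
GB = 1024 ** 3


def estimate_partitions(total_bytes: int, target_partition_bytes: int) -> int:
    """Ceiling division: the last partition may be only partly full."""
    if target_partition_bytes <= 0:
        raise ValueError("target partition size must be positive")
    return -(-total_bytes // target_partition_bytes)


total = 500 * GB

# Old setting: 64 MB targets give thousands of partitions.
print(estimate_partitions(total, 64 * MB))   # 8000

# New setting: 600 MB targets give hundreds of partitions.
print(estimate_partitions(total, 600 * MB))  # 854
```

The second figure is in the same ballpark as the ~700 partitions I actually saw. The real count depends on the exact data size and on how the files end up grouped, so I would treat this as an order-of-magnitude check, not a prediction.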
In retrospect, this was the same issue we'd run into before (too many partitions), which we had previously solved by throwing some guesses at coalesce until it magically went away. But now I feel like we have a much better understanding of why the numbers need to be what they are, which is great.

So, thanks for all the input and for helping me understand what's going on. It'd be great to see some of the optimizations to BlockManager happen, but I understand in the end why it needs to track what it does. And I was admittedly using a small cluster anyway.

- Stephen
