When I have a small query writing a small amount of data (such as aggregate tables for faster dashboard aggregates), it appears to write a ton of small files. I'm not sure why; maybe it's just how the join worked out. I have a "day" that is 1.5 MB in total size but is spread across 400 files. This seems excessive.
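For scale, 1.5 MB across 400 files averages under 4 KB per file. Below is a minimal sketch of how the file count and sizes in a CTAS output directory could be checked. It assumes the directory is reachable through a local or NFS mount; the function names and the example path are hypothetical and not part of Drill.

```python
import os


def summarize_parquet_dir(path):
    """Return (file_count, total_bytes) for all .parquet files under path."""
    count = 0
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            if name.endswith(".parquet"):
                count += 1
                total += os.path.getsize(os.path.join(root, name))
    return count, total


def average_file_size(count, total_bytes):
    """Average bytes per file; 0.0 when there are no files."""
    return total_bytes / count if count else 0.0


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python summarize.py /path/to/ctas/output/day
    if len(sys.argv) > 1:
        n, size = summarize_parquet_dir(sys.argv[1])
        print(f"{n} files, {size} bytes total, "
              f"{average_file_size(n, size):.0f} bytes per file on average")
```

Running it against one "day" directory gives the file count and average file size, which makes it easy to compare before and after any consolidation step.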
While I don't have the "small files" issue because I run MapR-FS, having 400 files that make up 1.5 MB of total data kills me in the planning phase. How can I get Drill, when doing a CTAS, to go through a round of consolidation on the Parquet files?

Thanks,
John
