Apply a sort in your CTAS; this will force the data down to a single stream
before writing.
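
As a rough sketch only, a CTAS with a sort might look something like the
following. The workspace, table name, source path, and column names are all
hypothetical placeholders, not taken from your setup:

```sql
-- Hypothetical names throughout: dfs.tmp workspace, daily_agg table,
-- /data/events source path, event_day and user_id columns.
CREATE TABLE dfs.tmp.`daily_agg` AS
SELECT event_day, user_id, COUNT(*) AS event_count
FROM dfs.`/data/events`
WHERE event_day = '2016-06-22'
GROUP BY event_day, user_id
-- The ORDER BY is the sort suggested above; the sort columns are arbitrary here.
ORDER BY event_day, user_id;
```

Which columns to sort on is up to you; the point of the ORDER BY is only to
funnel the result through one stream before the writer.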

Jason Altekruse
Software Engineer at Dremio
Apache Drill Committer

On Thu, Jun 23, 2016 at 10:23 AM, John Omernik <[email protected]> wrote:

> When I have a small query writing a small amount of data (like aggregate
> tables for faster aggregates for dashboards, etc.), it appears to write a
> ton of small files. Not sure why; maybe it's just how the join worked out.
> I have a "day" that is 1.5 MB in total size, but 400 files total. This
> seems excessive.
>
> While I don't have the "small files" issue because I run MapR-FS, having
> 400 files that make up 1.5 MB of total data kills me in the planning phase.
> How can I get Drill, when doing a CTAS, to go through a round of
> consolidation on the Parquet files?
>
> Thanks
>
> John
>
