Is there a reason you need a single file? Typically you want multiple files so 
that a distributed system like Drill can read and write them in parallel.

That said, on a single-node Drill cluster (or in embedded mode) you can reduce 
execution to a single thread and raise the Parquet block size so that it is 
large enough for the whole data set. Be prepared for both the CTAS and later 
queries against the file to slow down substantially.

alter session set `planner.width.max_per_node` = 1;

Then set the Parquet block size as large as needed. I'm not sure what the 
upper limit is, but note that enough memory must be available to support it.

alter session set `store.parquet.block-size` = <size in bytes>;
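
Put together, the whole sequence might look roughly like the sketch below. 
The 1 GiB block size, the dfs.tmp workspace, the table name and the source 
path are only placeholders I picked for illustration, so adjust them to your 
own setup. I have not tested this against your data.

```sql
-- Use a single thread per node so the CTAS has only one writer.
ALTER SESSION SET `planner.width.max_per_node` = 1;

-- Make sure CTAS writes Parquet (this is the usual default output format).
ALTER SESSION SET `store.format` = 'parquet';

-- Assumed value: 1 GiB (1024 * 1024 * 1024 bytes). Pick something at least
-- as large as the output you expect, and make sure memory allows for it.
ALTER SESSION SET `store.parquet.block-size` = 1073741824;

-- Hypothetical table name and source path; dfs.tmp is assumed to be a
-- writable workspace.
CREATE TABLE dfs.tmp.`single_file_table` AS
SELECT * FROM dfs.`/path/to/source/data`;
```

Afterwards, check the output directory to confirm that only one .parquet 
file was written. If there are still several, the block size is probably 
too small for the data.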

Perhaps someone else knows of a different way to do it. However, consider the 
implications of creating a single file.

--Andries

> On Feb 4, 2016, at 6:38 AM, Peder Jakobsen | gmail <[email protected]> 
> wrote:
> 
> Hi, is there a way to force drill to create a single file when performing a
> CTAS command (or some other method).
> 
> Right now, I'm creating CSV files, and then have to perform an extra step
> to stitch 1_0_0.parquet, 1_1_0.parquet, 1_2_0.parquet, etc. together into a
> single file.
> 
> Thank you.
> 
> Peder
