In Drill, a file is a table, but so is a directory: querying a directory
reads all the files beneath it as one table.

So go ahead and run CTAS on each day's data, putting each day's output into
its own directory, possibly grouped by month. Then query the day's directory,
or a parent directory to cover a longer range.
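
A rough sketch of what that could look like, not a tested recipe. The
dfs.tmp workspace, the /data/csv/... source path, the events/<month>/<day>
layout, and the column aliases are all made up for illustration; substitute
whatever writable workspace and CSV layout you actually have.

```sql
-- Hypothetical names throughout: dfs.tmp, /data/csv/..., events/...,
-- and the column aliases are illustrative assumptions only.

-- Parquet is Drill's default CTAS output format; set it explicitly anyway.
ALTER SESSION SET `store.format` = 'parquet';

-- One CTAS per day, run by the scheduled job. Each run creates a new
-- table (a directory of Parquet files) under events/<month>/<day>.
CREATE TABLE dfs.tmp.`events/2015-10/2015-10-05` AS
SELECT columns[0] AS event_time,
       columns[1] AS user_id,
       columns[2] AS payload
FROM dfs.`/data/csv/2015-10-05`;

-- Query a single day by naming its directory ...
SELECT COUNT(*) FROM dfs.tmp.`events/2015-10/2015-10-05`;

-- ... or query the parent directory and filter on the implicit
-- dir0 (month) and dir1 (day) directory columns.
SELECT dir0 AS month_dir, dir1 AS day_dir, COUNT(*) AS n
FROM dfs.tmp.`events`
WHERE dir0 = '2015-10'
GROUP BY dir0, dir1;
```

Since every day gets a fresh table name, the scheduled job never has to
append to an existing table, which is the part that would need INSERT.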



On Mon, Oct 5, 2015 at 5:20 PM, John Omernik <[email protected]> wrote:

> Hey all, I have a process that Drill could be awesome for, and I am trying
> to figure out how to do it. I am thinking that INSERT support is required.
> Basically I get some data in CSV files on a regular basis. Drill reads
> these well, and I am getting OK performance. For the long term, I'd like
> to move these over to Parquet. When I scan a day's worth of CSV vs. a
> day's worth of Parquet, I get a huge improvement with Parquet. That said,
> CREATE TABLE AS SELECT * FROM mycsvs doesn't really work the way I would
> want it to, i.e. I'd like to create the table once and then insert into
> the table (a regularly scheduled process).
>
> I see the INSERT JIRA doesn't have much movement, but I'd like to +1 it and
> then also ask the group: if the community doesn't see INSERT as a
> priority issue, what workarounds are people using in similar situations?
>
> Thanks
>
> John
>
>
>
>
> https://issues.apache.org/jira/browse/DRILL-3534
>
