In Drill, a file is a table, but so is a directory. So go ahead and do CTAS on each day's data, putting each new file into a directory, possibly partitioned by month. Then query based on the directory or super-directory.
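
For illustration, here is a rough sketch of that daily load in Drill SQL. The source path, target directory layout, and column names are made up for the example, and it assumes the writable dfs.tmp workspace and headerless CSV files (which Drill exposes as a `columns` array), so adjust it to your own storage plugin setup.

```sql
-- Write CTAS output as Parquet for this session.
ALTER SESSION SET `store.format` = 'parquet';

-- One CTAS per day. The table name is a path, so each day lands in its
-- own directory under a month directory (hypothetical layout).
CREATE TABLE dfs.tmp.`events/2015-10/2015-10-05` AS
SELECT
  columns[0] AS event_time,   -- hypothetical column names
  columns[1] AS user_id,
  columns[2] AS payload
FROM dfs.`/data/incoming/2015-10-05.csv`;   -- hypothetical source file

-- Query the parent directory as one table. dir0 and dir1 are Drill's
-- implicit columns for the first and second subdirectory levels.
SELECT COUNT(*)
FROM dfs.tmp.`events`
WHERE dir0 = '2015-10'
  AND dir1 = '2015-10-05';
```

Dropping the dir1 predicate scans the whole month, and dropping both scans everything loaded so far.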
On Mon, Oct 5, 2015 at 5:20 PM, John Omernik <[email protected]> wrote:

> Hey all, I have a process that Drill could be awesome and I am trying to
> figure out how to do it, I am thinking that INSERT Support is required.
> Basically I get some data in CSV files on a regular basis. Drill reads
> these well, and I am getting ok performance. For long term, I'd like to
> move these over to Parquet. When I scan a days worth CSV vs a days worth
> of Parquet loaded, I am getting a huge improvement. That said, CREATE
> TABLE as select * from mycsvs doesn't really seem to work like how I would
> want it to, i.e. I'd like to create the table once, and then insert into
> the table. (A regular scheduled process)
>
> I see the INSERT JIRA doesn't have much movement, but I'd like to +1 it and
> then also ask the group: if the community doesn't see the INSERT as a
> priority issue, what work around are people doing in similar situations.
>
> Thanks
>
> John
>
> https://issues.apache.org/jira/browse/DRILL-3534
