Drill respects a file *inclusion* pattern, so you could build a view sort of like:
select * from dfs.workspace.`dirname/*.csv`;

Chris Matta
[email protected]
215-701-3146

On Tue, Oct 13, 2015 at 5:09 AM, <[email protected]> wrote:

> FYI - by real time I mean data files which Flume has finished writing
> to...so near real time!
>
>
> -----Original Message-----
> From: England, Michael (IT/UK)
> Sent: 13 October 2015 10:06
> To: [email protected]
> Subject: Stop Drill querying .tmp files
>
> Hi,
>
> I am trying to query data ingested by Flume in real time, however, Flume
> writes out data to a file ending in .tmp and then renames it once it has
> completed its writes. If you run a drill query on a large data set and a
> .tmp file is renamed by Flume whilst the query is running, it bombs out. I
> was looking for a way to specify a file exclusion pattern with regex or
> something similar, however right now this doesn’t seem possible. Right now,
> just making Drill exclude any files ending in .tmp or starting with a . or
> a _ would be very useful for this reason.
>
> I have seen the following JIRAs relating to this issue:
>
> https://issues.apache.org/jira/browse/DRILL-2424 - closed as a duplicate
>
> https://issues.apache.org/jira/browse/DRILL-1131 - still open but related
> to Parquet
>
> Is there another way to achieve this without having to wait for a change
> on the Drill code base? I wrote a custom Hive class to achieve the same
> functionality but I am not sure this is possible in Drill.
>
> Thanks,
> Mike
>
>
> This e-mail (including any attachments) is private and confidential, may
> contain proprietary or privileged information and is intended for the named
> recipient(s) only. Unintended recipients are strictly prohibited from
> taking action on the basis of information in this e-mail and must contact
> the sender immediately, delete this e-mail (and all attachments) and
> destroy any hard copies.
> Nomura will not accept responsibility or liability
> for the accuracy or completeness of, or the presence of any virus or
> disabling code in, this e-mail. If verification is sought please request a
> hard copy. Any reference to the terms of executed transactions should be
> treated as preliminary only and subject to formal written confirmation by
> Nomura. Nomura reserves the right to retain, monitor and intercept e-mail
> communications through its networks (subject to and in accordance with
> applicable laws). No confidentiality or privilege is waived or lost by
> Nomura by any mistransmission of this e-mail. Any reference to "Nomura" is
> a reference to any entity in the Nomura Holdings, Inc. group. Please read
> our Electronic Communications Legal Notice which forms part of this e-mail:
> http://www.Nomura.com/email_disclaimer.htm
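
To make the view suggestion at the top of this thread concrete, here is a minimal sketch of wrapping that glob query in a view. It assumes a writable workspace (dfs.tmp is used here as an assumption), a hypothetical view name (flume_csv), and that the files Flume has finished writing end in .csv; dirname is the same placeholder used in the query above.

```sql
-- Sketch only: dfs.tmp and flume_csv are illustrative names, not from the thread.
-- The glob is an inclusion pattern: only names ending in .csv are matched,
-- so in-flight files such as part-0001.csv.tmp are never listed for the scan.
CREATE OR REPLACE VIEW dfs.tmp.flume_csv AS
SELECT *
FROM dfs.workspace.`dirname/*.csv`;

-- Queries then go through the view instead of the raw directory.
SELECT COUNT(*) FROM dfs.tmp.flume_csv;
```

This only narrows which files are picked up when a query is planned, so it should help with the .tmp rename case described above as long as the completed files share a suffix the glob can match.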
