FYI - by real time I mean data files which Flume has finished writing to...so near real time!
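To make the requested rule concrete, here is a minimal sketch of the kind of name-based exclusion I have in mind: skip anything ending in .tmp or starting with a . or a _. This is not Drill's API or configuration; the function names and example paths are hypothetical and only illustrate the filter itself.

```python
from pathlib import PurePosixPath

# Hypothetical rule set, taken from the request in this thread:
# skip files Flume is still writing (*.tmp) and hidden or marker
# files whose names start with "." or "_".
EXCLUDED_PREFIXES = (".", "_")
EXCLUDED_SUFFIXES = (".tmp",)


def is_queryable(path: str) -> bool:
    """Return True if the file name passes the exclusion rule."""
    name = PurePosixPath(path).name  # only the final path component matters
    if not name:
        return False
    return not (
        name.startswith(EXCLUDED_PREFIXES) or name.endswith(EXCLUDED_SUFFIXES)
    )


def queryable_files(paths):
    """Keep only the paths whose file names pass the exclusion rule."""
    return [p for p in paths if is_queryable(p)]
```

Applied to a directory listing, this would keep only files that Flume has already renamed, which is the "near real time" set described above.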
-----Original Message-----
From: England, Michael (IT/UK)
Sent: 13 October 2015 10:06
To: [email protected]
Subject: Stop Drill querying .tmp files

Hi,

I am trying to query data ingested by Flume in real time, however, Flume writes out data to a file ending in .tmp and then renames it once it has completed its writes. If you run a drill query on a large data set and a .tmp file is renamed by Flume whilst the query is running, it bombs out.

I was looking for a way to specify a file exclusion pattern with regex or something similar, however right now this doesn’t seem possible. Right now, just making Drill exclude any files ending in .tmp or starting with a . or a _ would be very useful for this reason.

I have seen the following JIRAs relating to this issue:

https://issues.apache.org/jira/browse/DRILL-2424 - closed as a duplicate
https://issues.apache.org/jira/browse/DRILL-1131 - still open but related to Parquet

Is there another way to achieve this without having to wait for a change on the Drill code base? I wrote a custom Hive class to achieve the same functionality but I am not sure this is possible in Drill.

Thanks,
Mike

This e-mail (including any attachments) is private and confidential, may contain proprietary or privileged information and is intended for the named recipient(s) only. Unintended recipients are strictly prohibited from taking action on the basis of information in this e-mail and must contact the sender immediately, delete this e-mail (and all attachments) and destroy any hard copies. Nomura will not accept responsibility or liability for the accuracy or completeness of, or the presence of any virus or disabling code in, this e-mail. If verification is sought please request a hard copy. Any reference to the terms of executed transactions should be treated as preliminary only and subject to formal written confirmation by Nomura.
Nomura reserves the right to retain, monitor and intercept e-mail communications through its networks (subject to and in accordance with applicable laws). No confidentiality or privilege is waived or lost by Nomura by any mistransmission of this e-mail. Any reference to "Nomura" is a reference to any entity in the Nomura Holdings, Inc. group. Please read our Electronic Communications Legal Notice which forms part of this e-mail: http://www.Nomura.com/email_disclaimer.htm
