Instead of defining a hard-coded set of prefixes, suffixes, and/or
patterns, can we give users some kind of configuration parameter
somewhere?
Perhaps the file-system plug-in should have a configuration parameter
that is a list of "glob" or regular-expression patterns specifying
names to ignore, with a default (bootstrap) setting that covers
common cases.
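
A minimal sketch of what such a setting could look like, written in Python for brevity rather than in Drill's own code base. The names DEFAULT_IGNORE_PATTERNS, is_ignored, and visible_files are hypothetical, and the default globs are an assumption drawn from the cases raised in this thread, not an existing Drill option:

```python
from fnmatch import fnmatchcase

# Hypothetical default ("bootstrap") ignore list; these globs are an
# assumption based on the cases raised in this thread, not a Drill setting.
DEFAULT_IGNORE_PATTERNS = ["_*", ".*", "*.tmp"]


def is_ignored(file_name, patterns=DEFAULT_IGNORE_PATTERNS):
    """Return True if the bare file name matches any ignore glob."""
    return any(fnmatchcase(file_name, pattern) for pattern in patterns)


def visible_files(file_names, patterns=DEFAULT_IGNORE_PATTERNS):
    """Keep only the names that no ignore pattern matches."""
    return [name for name in file_names if not is_ignored(name, patterns)]
```

A user-supplied list would simply replace the default, for example visible_files(names, patterns=["*.bak"]).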
Daniel
Mehant Baid wrote:
I addressed the issue mentioned in DRILL-1131 (ignoring files starting with an
underscore or a dot); this was implicitly added as part of drop table support.
However, I did not notice the additional file type that needs to be handled (files
ending with .tmp). If this is a common use case, then I can create a trivial
patch to do this.
Thanks
Mehant
On 10/13/15 1:48 PM, Steven Phillips wrote:
DRILL-2424 has a comment from Mehant that this should be fixed, but that
there was some sort of merge conflict. Was this ever resolved? Or was a new
JIRA filed?
On Tue, Oct 13, 2015 at 10:13 AM, Rajkumar Singh <[email protected]>
wrote:
There is a related JIRA already filed:
https://issues.apache.org/jira/browse/DRILL-2799
On 13-Oct-2015, at 10:36 PM, Christopher Matta <[email protected]> wrote:
I agree. Would someone in more of a leadership position care to comment on
whether this warrants an enhancement JIRA?
On Tuesday, October 13, 2015, <[email protected]> wrote:
Thanks Chris, I have tested this and it works well. I think it would still
be nice to be able to set an exclusion pattern in the workspace so that you
don't have to code each query with this in mind.
-----Original Message-----
From: Christopher Matta [mailto:[email protected]]
Sent: 13 October 2015 14:21
To: [email protected]
Subject: Re: Stop Drill querying .tmp files
Drill respects a file *inclusion* pattern, so you could build a view sort
of like:
select * from dfs.workspace.`dirname/*.csv`;
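
To illustrate why an inclusion pattern sidesteps the .tmp files, here is a small Python sketch that applies the same *.csv glob to a made-up directory listing. It assumes shell-style glob matching as provided by Python's fnmatch module; Drill's own matching may differ in details, and the file names and the included helper are invented for the example:

```python
from fnmatch import fnmatchcase

# Invented listing: one finished Flume file, one still being written,
# and a marker file.
listing = ["events-001.csv", "events-002.csv.tmp", "_SUCCESS"]


def included(file_names, pattern):
    """Return the names matched by a shell-style inclusion glob."""
    return [name for name in file_names if fnmatchcase(name, pattern)]


print(included(listing, "*.csv"))  # prints ['events-001.csv']
```

The glob must match the whole name, so a file that still carries the .tmp suffix is never selected.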
Chris Matta
[email protected]
215-701-3146
On Tue, Oct 13, 2015 at 5:09 AM, <[email protected]> wrote:
FYI - by real time I mean data files which Flume has finished writing
to...so near real time!
-----Original Message-----
From: England, Michael (IT/UK)
Sent: 13 October 2015 10:06
To: [email protected]
Subject: Stop Drill querying .tmp files
Hi,
I am trying to query data ingested by Flume in real time. However,
Flume writes out data to a file ending in .tmp and then renames it
once it has completed its writes. If you run a Drill query on a large
data set and a .tmp file is renamed by Flume whilst the query is
running, the query bombs out. I was looking for a way to specify a file
exclusion pattern with a regex or something similar, but right now
this doesn’t seem possible. Just making Drill exclude any
files ending in .tmp or starting with a . or a _ would be very useful
for this reason.
I have seen the following JIRAs relating to this issue:
https://issues.apache.org/jira/browse/DRILL-2424 - closed as a duplicate
https://issues.apache.org/jira/browse/DRILL-1131 - still open but related to Parquet
Is there another way to achieve this without having to wait for a
change on the Drill code base? I wrote a custom Hive class to achieve
the same functionality but I am not sure this is possible in Drill.
Thanks,
Mike
This e-mail (including any attachments) is private and confidential,
may contain proprietary or privileged information and is intended for
the named
recipient(s) only. Unintended recipients are strictly prohibited from
taking action on the basis of information in this e-mail and must
contact the sender immediately, delete this e-mail (and all
attachments) and destroy any hard copies. Nomura will not accept
responsibility or liability for the accuracy or completeness of, or
the presence of any virus or disabling code in, this e-mail. If
verification is sought please request a hard copy. Any reference to
the terms of executed transactions should be treated as preliminary
only and subject to formal written confirmation by Nomura. Nomura
reserves the right to retain, monitor and intercept e-mail
communications through its networks (subject to and in accordance with
applicable laws). No confidentiality or privilege is waived or lost by
Nomura by any mistransmission of this e-mail. Any reference to
"Nomura" is a reference to any entity in the Nomura Holdings, Inc.
group. Please read our Electronic Communications Legal Notice which
forms
part of this e-mail:
http://www.Nomura.com/email_disclaimer.htm
--
Chris Matta
[email protected]
215-701-3146
--
Daniel Barclay
MapR Technologies