Re: Excluding HDFS .tmp file from multi-file query?

2016-09-22 Thread Andries Engelbrecht
I noticed that if you specifically use * for file matching, Drill will still read hidden files. However, if you only point Drill at a directory, it will read the directory and its substructure without reading any hidden files.

select * from `/dir1/*` - will read hidden files
select * from `/dir1` - will not

Re: Excluding HDFS .tmp file from multi-file query?

2016-09-21 Thread Andries Engelbrecht
Add a . prefix to the Flume temp files. Drill will ignore the hidden files when you query the directory structure.

--Andries

> On Sep 21, 2016, at 2:36 PM, Robin Moffatt wrote:
>
> Hi,
> I have a stream of data from Flume landing in HDFS in files of a set
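
For anyone finding this thread later: with the Flume HDFS sink, the dot prefix suggested above can probably be set through the sink's hdfs.inUsePrefix property. This is only a sketch. The agent and sink names (agent1, hdfsSink) are made up for illustration, and /dir1 is a placeholder path, not the original poster's setup.

```properties
# Hypothetical agent and sink names; adjust to match your own Flume configuration.
agent1.sinks.hdfsSink.type = hdfs
# Placeholder path for the directory that Drill queries.
agent1.sinks.hdfsSink.hdfs.path = /dir1
# Prefix files that Flume is still writing with a dot so they look hidden.
agent1.sinks.hdfsSink.hdfs.inUsePrefix = .
```

With the default in-use suffix of .tmp, the file being written should then carry both the leading dot and the .tmp suffix, and lose both when Flume closes and renames it.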

Excluding HDFS .tmp file from multi-file query?

2016-09-21 Thread Robin Moffatt
Hi, I have a stream of data from Flume landing in HDFS in files of a set size. I can query these files individually just fine, and across multiple ones too - except if the wildcard encompasses the *currently open HDFS file that Flume is writing to*. When this happens, Drill understandably barfs.