On Wed, May 26, 2010 at 2:03 PM, Ashish Thusoo <[email protected]> wrote:
> You could probably use external tables?? CREATE EXTERNAL TABLE allows you
> to create tables on hdfs files but I do not think that it takes file
> patterns / regex. If all the files are created within a directory then you
> could point the external table to the directory location and then querying
> on that table would automatically query all the files in that directory. Are
> your files in a single directory or are they spread out?
>
> http://wiki.apache.org/hadoop/Hive/LanguageManual/DDL#Create_Table
>
> Ashish
>
> -----Original Message-----
> From: Karthik [mailto:[email protected]]
> Sent: Wednesday, May 26, 2010 10:45 AM
> To: [email protected]
> Subject: Query HDFS files without using LOAD (move)
>
> Is there a way where I can specify a list of files (or file pattern /
> regex) from a HDFS location other than the Hive Warehouse as a parameter to
> a Hive Query? I have a bunch of files that are used by other applications
> as well and I need to perform queries on those as well using Hive and so I
> do not want to use LOAD and move those files on to Hive warehouse from the
> original location.
>
> My query is on incremental data (new files) that are added on a daily basis
> and need not use the full list of files on a folder and so I need to specify
> a list of file / pattern, something like a filter of files to the query.
>
> Please suggest.
>
> - KK.
>

Also in trunk there is a feature that uses a file with a list of files as the input. I do not know the Jira #
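
For reference, the external-table approach described in the quoted reply might look roughly like the sketch below. The table name, columns, field delimiter, and HDFS path are all hypothetical placeholders, not anything taken from Karthik's setup, so treat this as an illustration rather than a tested recipe.

```sql
-- Hypothetical table name, columns, delimiter, and path: adjust to the real data.
-- EXTERNAL makes Hive read the files where they already live; no LOAD (move)
-- is involved, and dropping the table leaves the files on HDFS.
CREATE EXTERNAL TABLE daily_events (
  event_time STRING,
  user_id    STRING,
  payload    STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/data/shared/events';

-- A query on the table reads every file under the LOCATION directory.
SELECT COUNT(1) FROM daily_events;
```

Since LOCATION names a directory rather than a file pattern, limiting a query to the newly added files would presumably mean keeping those files in a directory of their own, for example one directory per day.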
