Are you using the latest Hive trunk? There were some patches such as HIVE-1200 and HIVE-132 to make Hive compatible with this Hadoop feature.
JVS

On May 26, 2010, at 5:01 PM, Karthik wrote:

> Hi John,
>
> I tried your suggestion and it almost worked :(
>
> I see that both Map and Reduce tasks complete 100% (and all the files I need
> under different sub-folders are read by the mappers without any issue), but
> after the reducers are done, instead of getting the results printed, I get
> this error:
>
> 2010-05-26 23:45:42,048 ERROR CliDriver (SessionState.java:printError(279)) -
> Failed with exception java.io.IOException:java.io.IOException: No input paths
> specified in job
> java.io.IOException: java.io.IOException: No input paths specified in job
>
> Any idea?
>
> Thanks again for helping out so far.
>
> Regards,
> Karthik.
>
> ----- Original Message ----
> From: John Sichi <[email protected]>
> To: "[email protected]" <[email protected]>
> Sent: Wed, May 26, 2010 11:14:26 AM
> Subject: RE: Query HDFS files without using LOAD (move)
>
> Use a Hadoop version which includes this:
>
> https://issues.apache.org/jira/browse/MAPREDUCE-1501
>
> and
>
> set mapred.input.dir.recursive=true;
>
> We are currently using this in production. However, it does not deal with
> the pattern case.
>
> JVS
>
> ________________________________________
> From: Karthik [[email protected]]
> Sent: Wednesday, May 26, 2010 11:08 AM
> To: [email protected]
> Subject: Re: Query HDFS files without using LOAD (move)
>
> Thanks a lot for the quick reply, Ashish.
>
> The files are currently spread across multiple folders: they are high in
> number, so they are arranged by category (functionally) across multiple
> folders in HDFS. Any workaround to support multiple folders?
>
> -KK.
>
> ----- Original Message ----
> From: Ashish Thusoo <[email protected]>
> To: "[email protected]" <[email protected]>
> Sent: Wed, May 26, 2010 11:03:43 AM
> Subject: RE: Query HDFS files without using LOAD (move)
>
> You could probably use external tables??
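The setup John describes can be sketched as a short Hive session. This is an illustrative sketch only: the table name `events`, its column, and the HDFS path are placeholders, not from the thread, and it assumes a Hadoop build that includes MAPREDUCE-1501.

```sql
-- Enable recursive traversal of input directories (requires MAPREDUCE-1501).
set mapred.input.dir.recursive=true;

-- Hypothetical external table over a directory tree; names are placeholders.
CREATE EXTERNAL TABLE events (line STRING)
LOCATION '/data/events';

-- With the recursive flag set, files under sub-folders such as
-- /data/events/category1/ and /data/events/category2/ are also scanned.
SELECT COUNT(*) FROM events;
```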
> CREATE EXTERNAL TABLE allows you to create tables on HDFS files, but I do
> not think that it takes file patterns / regexes. If all the files are
> created within a directory, then you could point the external table to the
> directory location, and querying that table would automatically query all
> the files in that directory. Are your files in a single directory, or are
> they spread out?
>
> http://wiki.apache.org/hadoop/Hive/LanguageManual/DDL#Create_Table
>
> Ashish
>
> -----Original Message-----
> From: Karthik [mailto:[email protected]]
> Sent: Wednesday, May 26, 2010 10:45 AM
> To: [email protected]
> Subject: Query HDFS files without using LOAD (move)
>
> Is there a way I can specify a list of files (or a file pattern / regex)
> from an HDFS location other than the Hive warehouse as a parameter to a
> Hive query? I have a bunch of files that are used by other applications as
> well, and I need to perform queries on them using Hive, so I do not want to
> use LOAD and move those files into the Hive warehouse from their original
> location.
>
> My query is on incremental data (new files) that are added on a daily basis
> and need not use the full list of files in a folder, so I need to specify a
> list of files / a pattern, something like a file filter, for the query.
>
> Please suggest.
>
> - KK.
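Ashish's suggestion can be sketched as follows; the table, column names, and location are hypothetical, assuming the files are tab-delimited text in a single shared directory.

```sql
-- Illustrative external table over an existing HDFS directory; all names
-- and the path are placeholders, not from the thread.
CREATE EXTERNAL TABLE app_logs (
  ts  STRING,
  msg STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/user/shared/logs';

-- Queries read every file directly in /user/shared/logs in place;
-- no LOAD (move) into the Hive warehouse is needed. Dropping an
-- EXTERNAL table later does not delete the underlying files.
SELECT COUNT(*) FROM app_logs;
```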
