Are you using the latest Hive trunk?  There were some patches such as HIVE-1200 
and HIVE-132 to make Hive compatible with this Hadoop feature.

JVS

On May 26, 2010, at 5:01 PM, Karthik wrote:

> Hi John,
> 
> I tried your suggestion and it almost worked :(
> 
> I see that both Map and Reduce tasks complete 100% (and all the files I need 
> under different subfolders are read by the mappers without any issue), but 
> after the reducers are done, instead of getting the results printed, I get 
> this error:
> 
> 2010-05-26 23:45:42,048 ERROR CliDriver (SessionState.java:printError(279)) - 
> Failed with exception java.io.IOException:java.io.IOException: No input paths 
> specified in job
> java.io.IOException: java.io.IOException: No input paths specified in job
> 
> Any idea?
> 
> Thanks again for helping out so far.
> 
> Regards,
> Karthik.
> 
> 
> 
> ----- Original Message ----
> From: John Sichi <[email protected]>
> To: "[email protected]" <[email protected]>
> Sent: Wed, May 26, 2010 11:14:26 AM
> Subject: RE: Query HDFS files without using LOAD (move)
> 
> Use a Hadoop version which includes this:
> 
> https://issues.apache.org/jira/browse/MAPREDUCE-1501
> 
> and
> 
> set mapred.input.dir.recursive=true; 
> 
> We are currently using this in production.  However, it does not deal with 
> the pattern case.
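
As a sketch of how John's setting might be used in a session (the table and column names below are hypothetical, not from this thread; the setting requires a Hadoop build that includes MAPREDUCE-1501):

```sql
-- Enable recursive listing of input directories for this session
-- (illustrative; assumes a MAPREDUCE-1501-capable Hadoop build).
SET mapred.input.dir.recursive=true;

-- Query an external table whose data lives in nested subfolders.
-- Table and column names are made up for this example.
SELECT category, COUNT(*)
FROM my_external_logs
GROUP BY category;
```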
> 
> JVS
> 
> ________________________________________
> From: Karthik [[email protected]]
> Sent: Wednesday, May 26, 2010 11:08 AM
> To: [email protected]
> Subject: Re: Query HDFS files without using LOAD (move)
> 
> Thanks a lot for the quick reply Ashish.
> 
> The files are high in number, so they are arranged by category (functionally) 
> across multiple folders in HDFS.  Is there a workaround to support multiple 
> folders?
> 
> -KK.
> 
> 
> 
> ----- Original Message ----
> From: Ashish Thusoo <[email protected]>
> To: "[email protected]" <[email protected]>
> Sent: Wed, May 26, 2010 11:03:43 AM
> Subject: RE: Query HDFS files without using LOAD (move)
> 
> You could probably use external tables?  CREATE EXTERNAL TABLE allows you to 
> create tables on HDFS files, but I do not think that it takes file patterns / 
> regexes.  If all the files are created within a single directory, you could 
> point the external table at that directory, and querying the table would then 
> automatically query all the files in it.  Are your files in a single directory 
> or are they spread out?
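
A minimal sketch of what Ashish is describing (the table name, columns, delimiter, and HDFS path here are all illustrative assumptions, not from the thread):

```sql
-- Hypothetical external table over an existing HDFS directory.
-- EXTERNAL means Hive queries the files in place; dropping the
-- table does not delete the underlying data.
CREATE EXTERNAL TABLE my_logs (
  ts       STRING,
  category STRING,
  payload  STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/data/shared/logs';
```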
> 
> http://wiki.apache.org/hadoop/Hive/LanguageManual/DDL#Create_Table
> 
> Ashish
> 
> -----Original Message-----
> From: Karthik [mailto:[email protected]]
> Sent: Wednesday, May 26, 2010 10:45 AM
> To: [email protected]
> Subject: Query HDFS files without using LOAD (move)
> 
> Is there a way I can specify a list of files (or a file pattern / regex) 
> from an HDFS location other than the Hive warehouse as a parameter to a Hive 
> query?  I have a bunch of files that are also used by other applications, and 
> I need to query them with Hive as well, so I do not want to use LOAD, which 
> would move the files out of their original location and into the Hive 
> warehouse.
> 
> My query is on incremental data (new files) that is added on a daily basis 
> and need not cover the full list of files in a folder, so I need to specify 
> a list of files / a pattern, i.e. a filter of files for the query.
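
One possible workaround for the daily-increment case, assuming the new files land in a per-day folder, is a partitioned external table where each day's folder is registered as a partition; queries can then filter to only the new data. All names and paths below are hypothetical:

```sql
-- Illustrative: external table partitioned by date.
CREATE EXTERNAL TABLE daily_logs (line STRING)
PARTITIONED BY (dt STRING)
LOCATION '/data/shared/logs';

-- Register a day's folder as it arrives (path is made up).
ALTER TABLE daily_logs ADD PARTITION (dt = '2010-05-26')
LOCATION '/data/shared/logs/2010-05-26';

-- Query only the incremental data for that day.
SELECT COUNT(*) FROM daily_logs WHERE dt = '2010-05-26';
```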
> 
> Please suggest.
> 
> - KK.
> 