You might be interested in https://issues.apache.org/jira/browse/HIVE-1662,
which uses a predicate on the file-name virtual column (INPUT__FILE__NAME) to
filter out input files. For example,

select key,INPUT__FILE__NAME from srcbucket2 where INPUT__FILE__NAME rlike
'.*/srcbucket2[03].txt'

But it's not committed yet.
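
To see which files a pattern like the one above would keep, here is a minimal
sketch in plain Java. It assumes RLIKE behaves like a java.util.regex search
(the pattern may match anywhere in the string); the class name, the filter()
helper and the file paths are hypothetical and not part of Hive or HIVE-1662.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class FileNameFilterSketch {

    // Keeps the paths in which the regex is found anywhere in the string
    // (a substring search, not a full-string match). This is an assumption
    // about how RLIKE treats its pattern.
    static List<String> filter(String regex, List<String> paths) {
        Pattern pattern = Pattern.compile(regex);
        List<String> kept = new ArrayList<>();
        for (String path : paths) {
            if (pattern.matcher(path).find()) {
                kept.add(path);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Hypothetical file paths, for illustration only.
        List<String> paths = Arrays.asList(
            "/warehouse/srcbucket2/srcbucket20.txt",
            "/warehouse/srcbucket2/srcbucket21.txt",
            "/warehouse/srcbucket2/srcbucket22.txt",
            "/warehouse/srcbucket2/srcbucket23.txt");

        // Same pattern as in the query above.
        for (String path : filter(".*/srcbucket2[03].txt", paths)) {
            System.out.println(path);
        }
    }
}
```

Note that the "." before "txt" is unescaped, so it matches any character, not
only a literal dot. Escape it if you need a stricter match.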

Thanks,



2014-03-03 23:14 GMT+09:00 Petter von Dolwitz (Hem) <
[email protected]>:

> Hi,
>
> I have implemented a few custom input formats in Hive. It seems like only
> the getRecordReader() method of these input formats is being called though,
> i.e. there is no way of overriding the listStatus() method and providing a
> custom input filter. The only way I can set a file filter is by using the
> mapred.input.pathFilter.class property, which leaves me using the same
> filter for all input formats. I would like a way to specify a filter per
> input format. Is there a way around this limitation?
>
> I am on Hive 0.10. I think I have seen that, when running jobs locally,
> the listStatus() method of my input formats is called, but not when handing
> the job over to a Hadoop cluster. It seems like listStatus() is called on
> Hadoop's CombineFileInputFormat instead.
>
> Thanks,
> Petter
>
