Hi, I have implemented a few custom input formats in Hive. It seems that only the getRecordReader() method of these input formats is being called, i.e. there is no way to override the listStatus() method and provide a custom input filter. The only way I can set a file filter is through the mapred.input.pathFilter.class property, which forces me to use the same filter for all input formats. I would like a way to specify a filter per input format. Is there a way around this limitation?
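For reference, a minimal sketch of the global setting described above, assuming it is applied per session from the Hive CLI. The class name com.example.MyPathFilter is a hypothetical placeholder, not a real class; whatever class is named there has to implement org.apache.hadoop.fs.PathFilter.

```
-- Session-wide filter: applies to every input format used by the query.
-- com.example.MyPathFilter is a hypothetical placeholder class name.
set mapred.input.pathFilter.class=com.example.MyPathFilter;
```

Because the property is a single job-level value, there is no slot in it for a second filter tied to a different input format, which is the limitation in question.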
I am on Hive 0.10. I think I have seen that the listStatus() method of my input formats is called when running jobs locally, but not when the job is handed over to a Hadoop cluster. In that case it seems that listStatus() is called on Hadoop's CombineFileInputFormat instead.
Thanks,
Petter
