Hi Qing,

That's a good idea. Can you open a JIRA? There are a lot of details to work out before we can add that feature to Hive. For example, how should the maximum acceptable amount of data corruption be specified: as an absolute number of records, or as a percentage? And what should happen with partially corrupted records when the query only needs the non-corrupted part?
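To make the "absolute number or percentage" question concrete, here is a rough sketch of how such a limit might be checked. It is only an illustration: the function name and both parameters are hypothetical, not existing Hive or Hadoop options, and with neither limit set it keeps today's behavior of failing on any bad record.

```python
from typing import Optional


def within_tolerance(bad: int, total: int,
                     max_bad_count: Optional[int] = None,
                     max_bad_fraction: Optional[float] = None) -> bool:
    """Return True if a job could continue after skipping `bad` of `total` records.

    Both limits are hypothetical settings used only for illustration.
    """
    if bad < 0 or total < bad:
        raise ValueError("need 0 <= bad <= total")
    # With no limit configured, stay strict: any bad record fails the job.
    if max_bad_count is None and max_bad_fraction is None:
        return bad == 0
    # Absolute limit: at most max_bad_count corrupted records.
    if max_bad_count is not None and bad > max_bad_count:
        return False
    # Relative limit: at most max_bad_fraction of all records seen.
    if max_bad_fraction is not None and total > 0 and bad / total > max_bad_fraction:
        return False
    return True
```

If both limits are given, this sketch requires both to hold. Whether that is the right rule is one of the details the JIRA would need to settle.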
Zheng

On 2/19/09, Qing Yan <[email protected]> wrote:
> Say I have some bad/ill-formatted records in the input, is there a way to
> configure the default Hive parser to discard those records directly(e.g.
> when a integer column get a string)?
>
> Besides, is the new skip-bad-records feature in 0.19 accessible in Hive?
> It is a quite handy feature in the real world.
>
> What I see so far is the Hive parser throws exception and cause the whole
> job to fail ultimately.
>
> Thanks for the help!
>
> Qing
>

--
Yours,
Zheng
