[ https://issues.apache.org/jira/browse/HIVE-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Namit Jain resolved HIVE-2658.
------------------------------

    Resolution: Won't Fix

Not needed - it will be difficult to come up with a generic exception.

> add a option in hive to skip corrupted data entirely
> ----------------------------------------------------
>
>                 Key: HIVE-2658
>                 URL: https://issues.apache.org/jira/browse/HIVE-2658
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Namit Jain
>            Assignee: He Yongqiang
>
> Add a new parameter:
> hive.skip.corrupted.data
> This is independent of the type of the underlying data.
> The idea is as follows:
> We have some corrupted data in our cluster right now.
> We will run hive over all the corrupted partitions:
> use bucketizedhiveinputformat
> set hive.skip.corrupted.data=true
> insert overwrite table <T> partition <P>
> select * from <T> where <P>
> This way, <T>@<P> will be regenerated with all the data that can be read.
> If HiveRecordReader gets an exception getting the next row, the mapper will
> behave as if no more data is present in the file.
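
Note: the behavior proposed in the description (treat a read failure as end-of-file) can be illustrated with a small, self-contained sketch. This is not Hive or Hadoop code, and the feature was never implemented. `RowReader`, `SkippingReader`, `readAll`, and `corruptAfter` are hypothetical names made up for illustration, and the `skipCorrupted` flag only mirrors the proposed `hive.skip.corrupted.data` parameter. The sketch also assumes that corruption surfaces as an `IOException`, which is exactly the assumption the resolution comment says is hard to make generically.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class SkipCorruptedSketch {

    /** Hypothetical stand-in for a record reader; not Hive's or Hadoop's interface. */
    interface RowReader {
        /** Returns the next row, or null when the input is exhausted. */
        String next() throws IOException;
    }

    /** Wraps a reader so that a read failure looks like end-of-file when skipping is on. */
    static final class SkippingReader implements RowReader {
        private final RowReader delegate;
        private final boolean skipCorrupted; // mirrors the proposed hive.skip.corrupted.data
        private boolean done = false;

        SkippingReader(RowReader delegate, boolean skipCorrupted) {
            this.delegate = delegate;
            this.skipCorrupted = skipCorrupted;
        }

        @Override
        public String next() throws IOException {
            if (done) {
                return null;
            }
            try {
                String row = delegate.next();
                if (row == null) {
                    done = true;
                }
                return row;
            } catch (IOException e) {
                if (!skipCorrupted) {
                    throw e; // default behavior: the failure propagates to the caller
                }
                done = true; // behave as if no more data is present in the file
                return null;
            }
        }
    }

    /** Collects every row the reader yields before it reports end-of-input. */
    static List<String> readAll(RowReader reader) throws IOException {
        List<String> rows = new ArrayList<>();
        for (String row = reader.next(); row != null; row = reader.next()) {
            rows.add(row);
        }
        return rows;
    }

    /** Simulated corrupt file: yields the given rows, then fails on the next read. */
    static RowReader corruptAfter(List<String> goodRows) {
        Iterator<String> it = goodRows.iterator();
        return () -> {
            if (it.hasNext()) {
                return it.next();
            }
            throw new IOException("simulated corruption");
        };
    }

    public static void main(String[] args) throws IOException {
        RowReader reader = new SkippingReader(corruptAfter(Arrays.asList("r1", "r2")), true);
        System.out.println(readAll(reader)); // prints [r1, r2]
    }
}
```

With the flag off, the same input makes `readAll` throw instead of returning a truncated result. Everything after the first unreadable row is dropped, so a partition rewritten this way keeps only the rows that come before the first failure in each file.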