[ https://issues.apache.org/jira/browse/HIVE-7800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14108198#comment-14108198 ]
Daniel Weeks commented on HIVE-7800:
------------------------------------

The previous implementation had an issue that is triggered only in the rare case where the first split of a task does not contain a row group. In that case the value of the input format (an ArrayWritable) is initialized to the width of the table, but the next row group produces a value only as wide as the columns available in the file. The new patch pads the resolved schema to ensure a matching size and masks the name of each padding column so there is no possibility of conflict with named columns within the file.

> Parquet Column Index Access Schema Size Checking
> ------------------------------------------------
>
>                 Key: HIVE-7800
>                 URL: https://issues.apache.org/jira/browse/HIVE-7800
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.14.0
>            Reporter: Daniel Weeks
>            Assignee: Daniel Weeks
>         Attachments: HIVE-7800.1.patch, HIVE-7800.2.patch
>
>
> In the case that a Parquet-formatted table has partitions where the files
> have schemas of different sizes, using column index access can result in an
> index out of bounds exception.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
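
The padding idea described in the comment can be illustrated with a minimal sketch. This is not the code from the attached patches: it models schemas as plain lists of column names rather than Parquet types, and the class name `SchemaPadding`, the method `padToTableWidth`, and the `_mask_` prefix are all assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SchemaPadding {
    // Hypothetical prefix for placeholder columns; an assumption for this sketch.
    static final String MASK_PREFIX = "_mask_";

    // Returns the file's columns, padded on the right with masked placeholder
    // names until the result is as wide as the table schema. If the file is
    // already as wide as (or wider than) the table, it is returned unchanged.
    static List<String> padToTableWidth(List<String> fileColumns, List<String> tableColumns) {
        List<String> padded = new ArrayList<>(fileColumns);
        for (int i = fileColumns.size(); i < tableColumns.size(); i++) {
            String candidate = MASK_PREFIX + tableColumns.get(i);
            // Keep prefixing until the placeholder cannot clash with a real column.
            while (padded.contains(candidate)) {
                candidate = "_" + candidate;
            }
            padded.add(candidate);
        }
        return padded;
    }

    public static void main(String[] args) {
        List<String> fileColumns = Arrays.asList("id", "name");
        List<String> tableColumns = Arrays.asList("id", "name", "age", "city");
        // prints [id, name, _mask_age, _mask_city]
        System.out.println(padToTableWidth(fileColumns, tableColumns));
    }
}
```

With a result that always matches the table width, a value sized for the table and a value produced from a narrower file line up index by index, which is the mismatch the comment describes.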