[ 
https://issues.apache.org/jira/browse/HIVE-18122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16261578#comment-16261578
 ] 

Andrew Olson commented on HIVE-18122:
-------------------------------------

Until non-native tables officially support being created with "PARTITIONED BY", 
changing the InitializeInput line:

{noformat}
if (table.getPartitionKeys().size() != 0) {
{noformat}

to:

{noformat}
if (!table.isNonNative() && table.getPartitionKeys().size() != 0) {
{noformat}

seems like a reasonable solution. The data selection filter could then 
presumably be supplied to the storage handler and everything would work as 
expected.
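
To show how the two conditions differ, here is a minimal standalone sketch. It does not use the real Hive classes: {{TableStub}} and both method names are hypothetical stand-ins that model only the two calls appearing in the condition above, and the sketch reports only which branch would be chosen.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class PartitionGuardSketch {

    /** Hypothetical stand-in for the table object; not the real Hive class. */
    static final class TableStub {
        private final boolean nonNative;
        private final List<String> partitionKeys;

        TableStub(boolean nonNative, List<String> partitionKeys) {
            this.nonNative = nonNative;
            this.partitionKeys = partitionKeys;
        }

        boolean isNonNative() {
            return nonNative;
        }

        List<String> getPartitionKeys() {
            return partitionKeys;
        }
    }

    /** Existing condition: any partition key sends the job down the partition-lookup path. */
    static boolean takesPartitionedPathCurrent(TableStub table) {
        return table.getPartitionKeys().size() != 0;
    }

    /** Suggested condition: non-native tables always fall through to the non-partitioned path. */
    static boolean takesPartitionedPathProposed(TableStub table) {
        return !table.isNonNative() && table.getPartitionKeys().size() != 0;
    }

    public static void main(String[] args) {
        // Non-native table (e.g. backed by a storage handler) declared with one partition column.
        TableStub nonNativeWithKeys = new TableStub(true, Arrays.asList("ds"));
        // Native table with one partition column, and native table with none.
        TableStub nativeWithKeys = new TableStub(false, Arrays.asList("ds"));
        TableStub nativeNoKeys = new TableStub(false, Collections.<String>emptyList());

        for (TableStub t : Arrays.asList(nonNativeWithKeys, nativeWithKeys, nativeNoKeys)) {
            System.out.println("nonNative=" + t.isNonNative()
                    + " keys=" + t.getPartitionKeys()
                    + " current=" + takesPartitionedPathCurrent(t)
                    + " proposed=" + takesPartitionedPathProposed(t));
        }
    }
}
```

In this sketch, only the non-native table with partition keys changes behavior. Native tables take the same branch under both conditions.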

> HCatInputFormat cannot read any data when non-native table has partition 
> columns
> --------------------------------------------------------------------------------
>
>                 Key: HIVE-18122
>                 URL: https://issues.apache.org/jira/browse/HIVE-18122
>             Project: Hive
>          Issue Type: Bug
>          Components: HCatalog
>            Reporter: Andrew Olson
>
> First, some background info: A non-native table can be created with partition 
> columns defined. However, the existence of partition columns for a non-native 
> table is problematic when using {{HCatInputFormat}}. Nothing disallows the 
> table creation, and the documentation [1] does not mention that non-native 
> tables cannot have partition columns. In fact, it suggests that "PARTITIONED 
> BY" can be specified.
> With such a table definition, no job using {{HCatInputFormat}} can ever read 
> any data, and the cause is not immediately obvious; it is only revealed by 
> debugging. The bug stems from the 
> {{org.apache.hive.hcatalog.mapreduce.InitializeInput}} class's logic in the 
> {{getInputJobInfo}} method, where it attempts to identify the partitions to 
> read. With partition columns defined, {{table.getPartitionKeys().size()}} is 
> greater than 0, so it proceeds to the {{listPartitionsByFilter(...)}} code, which will 
> never find any partitions, because partitions cannot be added to a non-native 
> table (HIVE-1223). As a result, the returned {{InputJobInfo}} holds an empty 
> {{List<PartInfo>}}, and the code never takes the "Non partitioned table" path 
> where the table's {{StorageDescriptor}} and parameters are used to build a 
> singleton {{PartInfo}}.
> This bug is quite similar to HIVE-18087 although it resides in a different 
> layer of Hive.
> We encountered this using the {{HBaseStorageHandler}}, although I don't 
> believe that's a particularly relevant detail.
> [1] 
> https://cwiki.apache.org/confluence/display/Hive/StorageHandlers#StorageHandlers-DDL



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
