[ 
https://issues.apache.org/jira/browse/HIVE-7072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051947#comment-14051947
 ] 

Sushanth Sowmyan commented on HIVE-7072:
----------------------------------------

[~daijy], could you please review/commit the latest version of this patch?

> HCatLoader only loads first region of hbase table
> -------------------------------------------------
>
>                 Key: HIVE-7072
>                 URL: https://issues.apache.org/jira/browse/HIVE-7072
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.14.0
>            Reporter: Sushanth Sowmyan
>            Assignee: Sushanth Sowmyan
>         Attachments: HIVE-7072.2.patch, HIVE-7072.3.patch
>
>
> Pig needs the config parameter 'pig.noSplitCombination' set to 'true' to be 
> able to read HBaseStorageHandler-based tables. HBaseLoader sets it at 
> getSplits time, but HCatLoader does not, which results in only a partial 
> data load.
>
> Thus, we need one more special case definition in HCat that sets this 
> parameter in the job properties when we detect that we are loading an 
> HBaseStorageHandler-based table. (Note also that we should not depend 
> directly on the HBaseStorageHandler class, and should instead depend on the 
> name of the class: we do not want HCatalog core to need a mvn dependency on 
> hive-hbase-handler in order to compile, since it is conceivable that at some 
> time there might be a reverse dependency.)
>
> The primary issue is where this code should go. It doesn't belong in Pig 
> (Pig does not know what loader behaviour should be, and this parameter is 
> its interface to a loader), and it doesn't belong in the HBaseStorageHandler 
> either, since that implements a HiveStorageHandler and connects up the two. 
> Thus, it belongs to HCatLoader. Setting this parameter across the board 
> results in poor performance for HCatLoader, so it must be set only when 
> loading from HBase. The SpecialCases definition was created specifically for 
> these kinds of odd cases and can be called from within HCatLoader, so that 
> is where the code should live.
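
For illustration only, the special case described above might look roughly like the sketch below. This is not the code in the attached patches. The class name SpecialCasesSketch, the method addSpecialCasesParameters, and the use of a plain Map for the job properties are all assumptions made so the example has no Hadoop or Hive dependencies. Only the property name 'pig.noSplitCombination' and the idea of matching the storage handler by class name come from the description.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a special-cases hook; not the actual HCatalog class.
class SpecialCasesSketch {
    // The handler is referenced by name only, so compiling this code needs
    // no dependency on hive-hbase-handler.
    static final String HBASE_STORAGE_HANDLER =
        "org.apache.hadoop.hive.hbase.HBaseStorageHandler";

    // Pig property named in the issue description.
    static final String PIG_NO_SPLIT_COMBINATION = "pig.noSplitCombination";

    // Sets the property only for HBase-backed tables, because setting it
    // across the board is reported to hurt HCatLoader performance.
    // The Map parameter is an assumed simplification of the job properties.
    static void addSpecialCasesParameters(Map<String, String> jobProperties,
                                          String storageHandlerClassName) {
        if (HBASE_STORAGE_HANDLER.equals(storageHandlerClassName)) {
            jobProperties.put(PIG_NO_SPLIT_COMBINATION, "true");
        }
    }

    public static void main(String[] args) {
        Map<String, String> hbaseProps = new HashMap<>();
        addSpecialCasesParameters(hbaseProps, HBASE_STORAGE_HANDLER);
        System.out.println("HBase-backed table: " + hbaseProps);

        // A table with no storage handler leaves the properties untouched.
        Map<String, String> plainProps = new HashMap<>();
        addSpecialCasesParameters(plainProps, null);
        System.out.println("Plain table: " + plainProps);
    }
}
```

Comparing against the class name as a string keeps the check null-safe and keeps the compile-time dependency out of HCatalog core, at the cost of not recognising subclasses of the handler.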



--
This message was sent by Atlassian JIRA
(v6.2#6252)
