[ https://issues.apache.org/jira/browse/SQOOP-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14232297#comment-14232297 ]

Ryan Blue commented on SQOOP-1744:
----------------------------------

You're right that HBase doesn't buy us much. In a situation where we can't 
isolate the subset of the data that might change, I think we have two options: 
either rewrite the entire dataset each time or maintain the dataset in HBase. 
We shouldn't overlook the second option, which would support the fetch 
frequency you want. Parquet is a great format to use, but if we have to 
constantly rewrite the entire dataset, or substantial portions of it, the 
storage savings might not be worth it.
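
To make the trade-off concrete, here is a minimal sketch of what the "maintain 
the dataset in HBase" option looks like at the client level. It assumes the 
HBase 1.x+ client API, and the table name, column family, row key, and column 
below are placeholders for illustration, not anything Sqoop or this connector 
defines. The point is that each changed source row becomes a single Put keyed 
by its primary key, so only changed rows are written instead of rewriting the 
whole Parquet dataset.

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseUpsertSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection connection = ConnectionFactory.createConnection(conf);
         // "imported_rows" and the "d" column family are hypothetical names.
         Table table = connection.getTable(TableName.valueOf("imported_rows"))) {
      // A changed source row is re-written as a Put keyed by its primary key.
      // HBase keeps the latest version, so no separate merge pass is needed.
      Put put = new Put(Bytes.toBytes("row-42"));
      put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("name"),
          Bytes.toBytes("new value"));
      table.put(put);
    }
  }
}
{code}

Because each Put is an idempotent upsert, refreshing the dataset at a high 
fetch frequency only costs writes proportional to the changed rows, whereas 
the Parquet route pays for a full (or near-full) rewrite every time.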

> TO-side: Write data to HBase
> ----------------------------
>
>                 Key: SQOOP-1744
>                 URL: https://issues.apache.org/jira/browse/SQOOP-1744
>             Project: Sqoop
>          Issue Type: Sub-task
>          Components: connectors
>            Reporter: Qian Xu
>            Assignee: Qian Xu
>             Fix For: 1.99.5
>
>
> Proposes writing data to HBase. Note that, unlike HDFS, HBase is 
> append-only, so merge does not work for HBase.



