Lars George created SQOOP-685:
---------------------------------

             Summary: Support HBase bulk loading as another way to load data into HBase
                 Key: SQOOP-685
                 URL: https://issues.apache.org/jira/browse/SQOOP-685
             Project: Sqoop
          Issue Type: New Feature
          Components: hbase-integration
    Affects Versions: 2.0.0
            Reporter: Lars George
             Fix For: 2.0.0


HBase has a bulk loading feature that Sqoop can use to stage files and then 
bulk load them into HBase. This is preferable for large amounts of data, because 
the normal CRUD-based API is otherwise quickly overloaded. See the HBase-supplied 
ImportTsv.java and its use of the "importtsv.bulk.output" command-line 
option, which shows how to switch easily between direct API import and bulk 
file staging. 
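
To illustrate the kind of switch meant here, the following is a minimal 
sketch that uses only the JDK and is not actual ImportTsv or Sqoop code. The 
property name comes from the text above; the class LoadModeSketch, the enum 
LoadMode, the method chooseMode, and the staging path are hypothetical names 
made up for this example. Treating an empty value like a missing one is also 
an assumption of the sketch.

```java
import java.util.Properties;

// Hypothetical sketch: picks a load mode based on whether a bulk output
// directory has been configured, similar in spirit to what ImportTsv does.
final class LoadModeSketch {
    // Property name taken from the issue description.
    static final String BULK_OUTPUT_KEY = "importtsv.bulk.output";

    enum LoadMode { DIRECT_API, BULK_FILE_STAGING }

    // Returns BULK_FILE_STAGING when an output directory is set,
    // otherwise DIRECT_API (writes go through the normal client API).
    // Assumption: a blank value is treated the same as an unset one.
    static LoadMode chooseMode(Properties conf) {
        String outputDir = conf.getProperty(BULK_OUTPUT_KEY);
        boolean unset = outputDir == null || outputDir.trim().isEmpty();
        return unset ? LoadMode.DIRECT_API : LoadMode.BULK_FILE_STAGING;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(chooseMode(conf));
        // Hypothetical staging directory, for illustration only.
        conf.setProperty(BULK_OUTPUT_KEY, "/tmp/sqoop-hfiles");
        System.out.println(chooseMode(conf));
    }
}
```

In a real job the two branches would configure different output paths (HFile 
staging versus direct table writes); the sketch only shows the decision.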

It might be necessary to add a step to Sqoop that samples the data and 
presplits the table into the right number of regions before the initial 
load. This could be done here or in a separate issue.
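
One possible way to derive split points from a sample is sketched below, 
using only the JDK. It is an illustration under stated assumptions, not a 
proposed Sqoop implementation: the class SplitPointSketch and the method 
splitPoints are hypothetical names, row keys are modelled as ASCII strings 
(so String ordering matches byte ordering), and the sample is assumed to be 
representative of the full data set.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical sketch: derive region split keys from sampled row keys.
final class SplitPointSketch {
    // Returns at most numRegions - 1 split keys taken at evenly spaced
    // positions in the sorted, de-duplicated sample.
    static List<String> splitPoints(List<String> sampledKeys, int numRegions) {
        if (numRegions < 1) {
            throw new IllegalArgumentException("numRegions must be >= 1");
        }
        // TreeSet sorts the sample and removes duplicate keys.
        List<String> sorted = new ArrayList<>(new TreeSet<>(sampledKeys));
        List<String> splits = new ArrayList<>();
        for (int i = 1; i < numRegions; i++) {
            int idx = (int) ((long) i * sorted.size() / numRegions);
            if (idx <= 0) {
                continue; // too few samples to place this boundary
            }
            String key = sorted.get(idx);
            // Skip a boundary that repeats the previous one.
            if (splits.isEmpty() || !splits.get(splits.size() - 1).equals(key)) {
                splits.add(key);
            }
        }
        return splits;
    }
}
```

The resulting keys would still need to be converted to byte arrays and 
passed to table creation, for example through the createTable variant of the 
HBase admin API that accepts split keys.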

