[ https://issues.apache.org/jira/browse/HBASE-1923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Todd Lipcon updated HBASE-1923:
-------------------------------
Attachment: hbase-1923-prelim.txt
Attaching a work-in-progress patch in case anyone wants to start looking at
this. A number of people sounded interested, so I figured early review would be
best.
Quick tutorial:
{code}
# Generate some data
perl -e 'for (1..10000) { print "$_\t$_\n"; }' | hadoop fs -put - mytsv.txt
perl -e 'for (1..10000) { print "$_\t" . ($_*3) . "\n"; }' | hadoop fs -put - mytsv2.txt
# Create table to hold it
./bin/hbase shell
create 'myfile', 'f1'
# Do a normal MR load
HADOOP_CLASSPATH=$(cat /tmp/hbase-core-test-classpath.txt) hadoop jar \
  target/hbase-0.21.0-SNAPSHOT.jar importtsv -Dcolumns=f1:blah myfile mytsv.txt
# scan in the shell if you like
# Potentially split table if you like
# Generate incremental from the other file
HADOOP_CLASSPATH=$(cat /tmp/hbase-core-test-classpath.txt) hadoop jar \
  target/hbase-0.21.0-SNAPSHOT.jar importtsv -Dcolumns=f1:blah -Duse.hfile=true \
  myfile mytsv2.txt
# Load incremental
HBASE_CLASSPATH=$HADOOP_CONF_DIR ./bin/hbase \
  org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles hfof myfile
# scan in the shell and see that the data has changed
{code}
A fair amount of work remains: the patch still has to handle regions that
split, and the case where there are fewer reducers than regions.
> Bulk incremental load into an existing table
> --------------------------------------------
>
> Key: HBASE-1923
> URL: https://issues.apache.org/jira/browse/HBASE-1923
> Project: Hadoop HBase
> Issue Type: New Feature
> Components: client, mapred, regionserver, scripts
> Affects Versions: 0.21.0
> Reporter: anty.rao
> Assignee: Todd Lipcon
> Attachments: hbase-1923-prelim.txt
>
>
> HBASE-48 is about bulk loading a new table; it may be more practical to
> bulk load against an existing table.