[ 
https://issues.apache.org/jira/browse/HBASE-5472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13637711#comment-13637711
 ] 

Yu Li commented on HBASE-5472:
------------------------------

With the attached patch, when a generated HFile includes an invalid column 
family, the bulk load output looks like this:
{panel}
{color:red} 13/04/21 20:47:25 ERROR mapreduce.LoadIncrementalHFiles: Unmatched 
family names found, unmatched family names in hfiles to be bulkload: [CF], 
valid family names of table t2 are: [cf]
13/04/21 20:47:25 ERROR mapreduce.LoadIncrementalHFiles: 
-------------------------------------------------
Bulk load aborted with some files not yet loaded:
-------------------------------------------------
  hdfs://9.125.91.85:9000/testBulkload/CF/b7fadfd7d188496cae412862170ea713

Exception in thread "main" java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
        at java.lang.reflect.Method.invoke(Method.java:611)
        at org.apache.hadoop.hbase.mapreduce.Driver.main(Driver.java:51)
...
{color:red} Caused by: java.lang.RuntimeException: Bulkload failed because 
invalid family name found in bulkload target hfiles, please check your codes if 
the hfiles are manually generated.
        at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:223)
        at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.run(LoadIncrementalHFiles.java:720)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
...
{panel}
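
The check behind this output can be sketched roughly as follows. This is only an illustration and not the code in HBASE-5472-trunk.patch: the class name FamilyCheck and the method findUnmatchedFamilies are hypothetical, and the sketch uses plain strings instead of the HBase client API. It assumes the family names of the HFiles are taken from the sub-directory names under the bulk load output path (for example "CF" in /testBulkload/CF/...).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not taken from the attached patch.
class FamilyCheck {

    /**
     * Returns the family names found in the bulk load output that the
     * target table does not define, in first-seen order without duplicates.
     */
    static List<String> findUnmatchedFamilies(List<String> hfileFamilies,
                                              Set<String> tableFamilies) {
        List<String> unmatched = new ArrayList<>();
        for (String family : hfileFamilies) {
            // Exact, case-sensitive comparison: "CF" and "cf" differ.
            if (!tableFamilies.contains(family) && !unmatched.contains(family)) {
                unmatched.add(family);
            }
        }
        return unmatched;
    }

    public static void main(String[] args) {
        // Values mirror the log above: HFiles under "CF", table t2 has "cf".
        List<String> hfileFamilies = Arrays.asList("CF");
        Set<String> tableFamilies = new HashSet<>(Arrays.asList("cf"));

        List<String> unmatched = findUnmatchedFamilies(hfileFamilies, tableFamilies);
        if (!unmatched.isEmpty()) {
            // Report once and stop, rather than retrying the load.
            System.out.println("Unmatched family names: " + unmatched
                    + ", valid family names: " + tableFamilies);
        }
    }
}
```

Running the check once before any load attempt means a mismatch is reported and the load stops, rather than being retried.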

My testing steps are:
{noformat}1) Create a table with a single column family named "cf"
2) HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` \
${HADOOP_HOME}/bin/hadoop jar /opt/ibm/biginsights/hbase/hbase-VERSION.jar \
importtsv -Dimporttsv.columns=HBASE_ROW_KEY,CF:a,CF:b \
-Dimporttsv.bulk.output=hdfs://9.125.91.85:9000/testBulkload \
-Dimporttsv.separator=, t2 /tmp/bulkload
3) HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` \
${HADOOP_HOME}/bin/hadoop jar /opt/ibm/biginsights/hbase/hbase-VERSION.jar \
completebulkload hdfs://9.125.91.85:9000/testBulkload t2{noformat}
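
Step 2 uses "CF" in importtsv.columns while the table only has "cf", which is what produces the HFiles under the wrong family directory. The sketch below shows how the family names could be pulled out of such a column specification so they can be compared with the table before the HFiles are generated. The class ColumnSpec and its families method are hypothetical names and not part of ImportTsv. The sketch assumes HBASE_ROW_KEY is the only special entry and that an entry without a colon names a family with an empty qualifier.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper for illustration; not part of ImportTsv.
class ColumnSpec {

    /** Extracts the distinct family names from an importtsv.columns value. */
    static Set<String> families(String columnsSpec) {
        Set<String> families = new LinkedHashSet<>();
        for (String column : columnsSpec.split(",")) {
            // Assumption: HBASE_ROW_KEY marks the row key and is not a family.
            if (column.equals("HBASE_ROW_KEY")) {
                continue;
            }
            int colon = column.indexOf(':');
            // Assumption: without a colon, the whole entry is the family name.
            families.add(colon >= 0 ? column.substring(0, colon) : column);
        }
        return families;
    }

    public static void main(String[] args) {
        // Same specification as in step 2; prints [CF]
        System.out.println(families("HBASE_ROW_KEY,CF:a,CF:b"));
    }
}
```

Because the extracted name is "CF" and the table defines "cf", the mismatch is already visible at this point.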
                
> LoadIncrementalHFiles loops forever if the target table misses a CF
> -------------------------------------------------------------------
>
>                 Key: HBASE-5472
>                 URL: https://issues.apache.org/jira/browse/HBASE-5472
>             Project: HBase
>          Issue Type: Bug
>          Components: mapreduce
>            Reporter: Lars Hofhansl
>            Assignee: Yu Li
>            Priority: Minor
>         Attachments: HBASE-5472-trunk.patch
>
>
> I have some HFiles for two column families 'y','z', but I specified a target 
> table that only has CF 'y'.
> I see the following repeated forever.
> ...
> 12/02/23 22:57:37 WARN mapreduce.LoadIncrementalHFiles: Attempt to bulk load 
> region containing  into table z with files [family:y 
> path:hdfs://bunnypig:9000/bulk/z2/y/bd6f1c3cc8b443fc9e9e5fddcdaa3b09, 
> family:z 
> path:hdfs://bunnypig:9000/bulk/z2/z/38f12fdbb7de40e8bf0e6489ef34365d] failed. 
>  This is recoverable and they will be retried.
> 12/02/23 22:57:37 DEBUG client.MetaScanner: Scanning .META. starting at 
> row=z,,00000000000000 for max=2147483647 rows using 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7b7a4989
> 12/02/23 22:57:37 INFO mapreduce.LoadIncrementalHFiles: Split occured while 
> grouping HFiles, retry attempt 1596 with 2 files remaining to group or split
> 12/02/23 22:57:37 INFO mapreduce.LoadIncrementalHFiles: Trying to load 
> hfile=hdfs://bunnypig:9000/bulk/z2/y/bd6f1c3cc8b443fc9e9e5fddcdaa3b09 first=r 
> last=r
> 12/02/23 22:57:37 INFO mapreduce.LoadIncrementalHFiles: Trying to load 
> hfile=hdfs://bunnypig:9000/bulk/z2/z/38f12fdbb7de40e8bf0e6489ef34365d first=r 
> last=r
> 12/02/23 22:57:37 DEBUG mapreduce.LoadIncrementalHFiles: Going to connect to 
> server region=z,,1330066309814.d5fa76a38c9565f614755e34eacf8316., 
> hostname=localhost, port=60020 for row 
> ...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
