[jira] [Commented] (HBASE-5741) ImportTsv does not check for table existence

[email protected] (Commented) (JIRA) Wed, 11 Apr 2012 12:03:44 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13251851#comment-13251851
 ]


[email protected] commented on HBASE-5741:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4700/
-----------------------------------------------------------

(Updated 2012-04-11 19:01:39.660787)


Review request for hbase.


Changes
-------

Changes done as per Ted.


Summary
-------

There is a bulk output option in the importtsv workload. It outputs the HFiles 
in a user defined directory. The current code assumes that a table with its 
name equal to the given output directory exists, and throws an exception 
otherwise. Here is a patch for creating a table in case it doesn't exist.


This addresses bug HBase-5741.
    https://issues.apache.org/jira/browse/HBase-5741


Diffs (updated)
-----

  src/main/java/org/apache/hadoop/hbase/mapreduce/ImportTsv.java ab22fc4 
  src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportTsv.java ac30a62 

Diff: https://reviews.apache.org/r/4700/diff


Testing
-------

Added a new test for bulkoutput; All importtsv tests pass.


Thanks,

Himanshu


                
> ImportTsv does not check for table existence 
> ---------------------------------------------
>
>                 Key: HBASE-5741
>                 URL: https://issues.apache.org/jira/browse/HBASE-5741
>             Project: HBase
>          Issue Type: Bug
>          Components: mapreduce
>    Affects Versions: 0.90.4
>            Reporter: Clint Heath
>            Assignee: Himanshu Vashishtha
>         Attachments: HBase-5741.patch
>
>
> The usage statement for the "importtsv" command to hbase claims this:
> "Note: if you do not use this option, then the target table must already 
> exist in HBase" (in reference to the "importtsv.bulk.output" command-line 
> option)
> The truth is, the table must exist no matter what, importtsv cannot and will 
> not create it for you.
> This is the case because the createSubmittableJob method of ImportTsv does 
> not even attempt to check if the table exists already, much less create it:
> (From org.apache.hadoop.hbase.mapreduce.ImportTsv.java)
> 305 HTable table = new HTable(conf, tableName);
> The HTable method signature in use there assumes the table exists and runs a 
> meta scan on it:
> (From org.apache.hadoop.hbase.client.HTable.java)
> 142 * Creates an object to access a HBase table.
> ...
> 151 public HTable(Configuration conf, final String tableName)
> What we should do inside of createSubmittableJob is something similar to what 
> the "completebulkloads" command would do:
> (Taken from org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.java)
> 690 boolean tableExists = this.doesTableExist(tableName);
> 691 if (!tableExists) this.createTable(tableName,dirPath);
> Currently the docs are misleading, the table in fact must exist prior to 
> running importtsv. We should check if it exists rather than assume it's 
> already there and throw the below exception:
> 12/03/14 17:15:42 WARN client.HConnectionManager$HConnectionImplementation: 
> Encountered problems when prefetch META table: 
> org.apache.hadoop.hbase.TableNotFoundException: Cannot find row in .META. for 
> table: myTable2, row=myTable2,,99999999999999
>       at 
> org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:150)
> ...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5741) ImportTsv does not check for table existence

Reply via email to