[ 
https://issues.apache.org/jira/browse/HBASE-11409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14510732#comment-14510732
 ] 

Ted Yu commented on HBASE-11409:
--------------------------------

{code}
    " exist  in HBase\n  storeFileDepth defaults to (2)\n "
    + "example: \" + NAME + \" /path/to/hfileoutputformat-output tablename 3\n\n "
{code}
The usage message should state that only 2 and 3 are accepted depths.
{code}
    for (FileStatus fileStatus : FSUtils.listStatus(fs, hfofDir)) {
      visitBulkHFiles(fs, fileStatus.getPath(), visitor);
{code}
Add a debug log for each directory in the for loop.
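
To illustrate the kind of per-directory logging meant here, below is a minimal, self-contained sketch. It is not the patch's code: it uses java.nio.file and java.util.logging as stand-ins for the Hadoop FileSystem API and the project's logger, and the class name DirWalkSketch and method visitChildren are hypothetical.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for the loop over hfofDir; not the patch's code.
class DirWalkSketch {
    private static final Logger LOG = Logger.getLogger(DirWalkSketch.class.getName());

    /** Returns the immediate subdirectories of root, logging each one at debug (FINE) level. */
    static List<Path> visitChildren(Path root) throws IOException {
        List<Path> visited = new ArrayList<>();
        try (DirectoryStream<Path> children = Files.newDirectoryStream(root)) {
            for (Path child : children) {
                if (!Files.isDirectory(child)) {
                    continue; // plain files at this level are skipped in this sketch
                }
                // Guard the message so the string is only built when debug is enabled.
                if (LOG.isLoggable(Level.FINE)) {
                    LOG.fine("Visiting directory " + child);
                }
                visited.add(child);
            }
        }
        return visited;
    }
}
```

In the actual patch, the equivalent would presumably be a single debug statement inside the existing for loop, guarded by the logger's debug check.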
{code}
    if (depth != 2 && depth != 3) {
      throw new IllegalArgumentException("Depth must be either 2 or 3");
    }
{code}
Input validation is already done above. Do we still need to check the depth in 
bulkHFile()?
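
One way to avoid the duplicate check is to validate the depth once, where the command-line arguments are parsed, and pass only a known-good value further down. A minimal sketch under that assumption follows. The class DepthArg and method parse are hypothetical names, not from the patch; the default of 2 and the accepted values 2 and 3 come from the snippets quoted above.

```java
// Hypothetical helper; validates the optional depth argument in one place.
final class DepthArg {
    static final int DEFAULT_DEPTH = 2; // matches "storeFileDepth defaults to (2)"

    private DepthArg() {}

    /** Returns the depth at args[index], or the default when that argument is absent. */
    static int parse(String[] args, int index) {
        if (args.length <= index) {
            return DEFAULT_DEPTH;
        }
        int depth;
        try {
            depth = Integer.parseInt(args[index]);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Depth must be either 2 or 3", e);
        }
        if (depth != 2 && depth != 3) {
            throw new IllegalArgumentException("Depth must be either 2 or 3");
        }
        return depth;
    }
}
```

With the usage shown earlier ("/path/to/hfileoutputformat-output tablename 3"), the depth would be read as DepthArg.parse(args, 2), and code further down could rely on the value without re-checking it.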

> Add more flexibility for input directory structure to LoadIncrementalHFiles
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-11409
>                 URL: https://issues.apache.org/jira/browse/HBASE-11409
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.0.0
>            Reporter: churro morales
>            Assignee: churro morales
>         Attachments: HBASE-11409.patch
>
>
> Use case:
> We were trying to combine two very large tables into a single table.  Thus we 
> ran jobs in one datacenter that populated certain column families and another 
> datacenter which populated other column families.  Took a snapshot and 
> exported them to their respective datacenters.  Wanted to simply take the 
> hdfs restored snapshot and use LoadIncremental to merge the data.  
> It would be nice to add support where we could run LoadIncremental on a 
> directory where the depth of store files is something other than two (current 
> behavior).  
> With snapshots it would be nice if you could pass a restored hdfs snapshot's 
> directory and have the tool run.  
> I am attaching a patch where I parameterize the bulkLoad timeout as well as 
> the default store file depth.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
