[ 
https://issues.apache.org/jira/browse/HBASE-12375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14188371#comment-14188371
 ] 

Ashish Singhi commented on HBASE-12375:
---------------------------------------

Thanks, Matteo, for looking into the patch.

bq. I think that by removing the check you'll fail in case of splits (but I 
haven't checked)
No, it did not fail. I manually tested the scenario below:
1. Create a table initially with 3 splits
2. Run bulkload
3. Split the table into one more region
4. Put some data into the table for that region
5. Run completebulkload

bq. the problem is that the LoadIncrementalHFiles will create a "_tmp" 
directory.
Yes, it does create the "_tmp" directory, but inside the CF directory, so it 
will not cause any problem.

We can see that from the logs generated after running the scenario above:
{noformat}
2014-10-29 20:03:40,172 INFO  [LoadIncrementalHFiles-0] mapreduce.LoadIncrementalHFiles: Trying to load hfile=hdfs://10.18.40.106:9000/s4/_d/_tmp/af37ac06db0f4a8ebe9ccd848d5864b7.top first=90 last=90
2014-10-29 20:03:40,172 INFO  [LoadIncrementalHFiles-3] mapreduce.LoadIncrementalHFiles: Trying to load hfile=hdfs://10.18.40.106:9000/s4/_d/_tmp/af37ac06db0f4a8ebe9ccd848d5864b7.bottom first=5 last=67
{noformat}
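
The layout in the log paths above can be checked with a small sketch. It assumes the bulkload output directory is /s4 (inferred from the paths, not stated in the logs) and that family directories are the direct children of that directory. It uses java.nio.file instead of the Hadoop Path API, and the class and method names are hypothetical, not part of HBase.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of HBase: shows where a split half sits
// relative to the bulkload output directory.
public class TmpDirLayout {
    // Path of the file relative to the output directory.
    static Path relative(String outputDir, String file) {
        return Paths.get(outputDir).relativize(Paths.get(file));
    }

    // First component below the output directory, i.e. the family directory.
    static String firstComponentBelow(String outputDir, String file) {
        return relative(outputDir, file).getName(0).toString();
    }

    // Number of path components between the output directory and the file,
    // counting the file name itself.
    static int depthBelow(String outputDir, String file) {
        return relative(outputDir, file).getNameCount();
    }

    public static void main(String[] args) {
        String outputDir = "/s4"; // assumption: inferred from the log paths
        String top = "/s4/_d/_tmp/af37ac06db0f4a8ebe9ccd848d5864b7.top";
        System.out.println(firstComponentBelow(outputDir, top)); // _d
        System.out.println(depthBelow(outputDir, top));          // 3
    }
}
```

Under those assumptions "_tmp" sits one level below the family directory "_d", so a scan of the direct children of the output directory sees "_d" but never "_tmp".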

bq. In theory is enough adding to this patch the rename of "_tmp" to something 
like .tmp
Do you still want me to do this?

bq. did you tried to run this patch with a set of files that requires splitting?
Yes, as mentioned above.

> LoadIncrementalHFiles fails to load data in table when CF name starts with '_'
> ------------------------------------------------------------------------------
>
>                 Key: HBASE-12375
>                 URL: https://issues.apache.org/jira/browse/HBASE-12375
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.5
>            Reporter: Ashish Singhi
>            Assignee: Ashish Singhi
>            Priority: Minor
>         Attachments: HBASE-12375.patch
>
>
> We do not restrict users from creating a table with a column family name 
> starting with '_'.
> When a user creates such a table, LoadIncrementalHFiles skips that family's 
> data instead of loading it into the table.
> {code}
> // Skip _logs, etc
> if (familyDir.getName().startsWith("_")) continue;
> {code}
> I think we should remove that check as I do not see any _logs directory being 
> created by the bulkload tool in the output directory.
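
To illustrate the effect of the check quoted above, here is a minimal sketch in plain Java. It models the family directories as a list of names instead of using the Hadoop FileSystem API. The class, the method, and the family name "cf1" are hypothetical and not taken from LoadIncrementalHFiles; "_d" is the family name from the logs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of family-directory discovery, not HBase code.
public class FamilyDirDiscovery {
    // With skipUnderscore = true this mimics the quoted check: any directory
    // whose name starts with "_" is skipped. With false, the check is removed.
    static List<String> discover(List<String> dirNames, boolean skipUnderscore) {
        List<String> families = new ArrayList<>();
        for (String name : dirNames) {
            if (skipUnderscore && name.startsWith("_")) continue;
            families.add(name);
        }
        return families;
    }

    public static void main(String[] args) {
        List<String> dirs = Arrays.asList("_d", "cf1");
        System.out.println(discover(dirs, true));  // [cf1]
        System.out.println(discover(dirs, false)); // [_d, cf1]
    }
}
```

With the check in place the "_d" family is silently dropped; without it both families are picked up.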



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
