[ 
https://issues.apache.org/jira/browse/HBASE-11861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242256#comment-14242256
 ] 

Jingcheng Du commented on HBASE-11861:
--------------------------------------

Thanks Jon. [~jmhsieh]
bq. I think the bulk load approach avoids the potential race on mob compaction 
and normal compaction.
I think the race condition remains even if the mob compaction uses bulk load.
# A mob cell, cell#1, exists in HBase.
# A mob compaction runs and merges file#1, the mob file that holds cell#1, 
into a bigger file, file#2. We write a new cell#1 whose value is the file name 
file#2 into a new store file, but have not started the bulk load yet.
# A user deletes cell#1 and a major compaction then runs, so cell#1 no longer 
exists in HBase.
# The mob compaction bulk loads the new store file.
# cell#1 is back in HBase again.
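
To make the interleaving concrete, here is a minimal toy replay of the five steps above. It is not HBase code: ToyStore, Entry and replay() are hypothetical names, store files are modelled as one flat list, and the sketch assumes the rewritten cell keeps the original timestamp and that a major compaction drops delete markers together with the cells they mask.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a single region's store; all names here are hypothetical. */
class MobRaceSketch {
    /** One cell version, or a delete marker (tombstone) when tombstone is true. */
    static final class Entry {
        final String row; final long ts; final String value; final boolean tombstone;
        Entry(String row, long ts, String value, boolean tombstone) {
            this.row = row; this.ts = ts; this.value = value; this.tombstone = tombstone;
        }
    }

    static final class ToyStore {
        final List<Entry> entries = new ArrayList<>();

        void put(String row, long ts, String value) { entries.add(new Entry(row, ts, value, false)); }

        void delete(String row, long ts) { entries.add(new Entry(row, ts, null, true)); }

        /** Assumed behaviour: drop masked cells, then drop the tombstones themselves. */
        void majorCompact() {
            List<Entry> kept = new ArrayList<>();
            for (Entry e : entries) {
                if (e.tombstone) continue;
                boolean masked = entries.stream().anyMatch(
                    t -> t.tombstone && t.row.equals(e.row) && t.ts >= e.ts);
                if (!masked) kept.add(e);
            }
            entries.clear();
            entries.addAll(kept);
        }

        /** Bulk load adopts the entries of a pre-built file unchanged. */
        void bulkLoad(List<Entry> file) { entries.addAll(file); }

        /** Newest value not masked by a tombstone, or null if nothing is visible. */
        String get(String row) {
            Entry newest = null;
            long newestTombstone = Long.MIN_VALUE;
            for (Entry e : entries) {
                if (!e.row.equals(row)) continue;
                if (e.tombstone) newestTombstone = Math.max(newestTombstone, e.ts);
                else if (newest == null || e.ts > newest.ts) newest = e;
            }
            return (newest != null && newest.ts > newestTombstone) ? newest.value : null;
        }
    }

    /** Replays the steps in order and returns what a reader sees for cell#1 at the end. */
    static String replay() {
        ToyStore store = new ToyStore();
        store.put("cell#1", 1L, "file#1");                       // step 1: cell#1 references file#1

        List<Entry> pending = new ArrayList<>();                 // step 2: new store file written,
        pending.add(new Entry("cell#1", 1L, "file#2", false));   // but not bulk loaded yet

        store.delete("cell#1", 2L);                              // step 3: user delete ...
        store.majorCompact();                                    // ... then major compaction

        store.bulkLoad(pending);                                 // step 4: late bulk load
        return store.get("cell#1");                              // step 5: what is visible now
    }

    public static void main(String[] args) {
        System.out.println("cell#1 after bulk load: " + replay());
    }
}
```

In this toy model the delete marker is already gone when the bulk load lands, so nothing masks the reloaded reference and the read returns file#2.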

This is why I insist on running the mob compaction within regions. If we run 
the mob compaction outside a region or across regions, we have to lock major 
compactions globally.
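
As a rough illustration of that point, the sketch below keeps one lock per region, so a mob compaction and a major compaction on the same region exclude each other while other regions stay unaffected. RegionCompactionGuard and its methods are hypothetical names for this sketch only, not an HBase API, and real region-level coordination would involve more than a single lock.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical per-region guard; illustrates lock scope only. */
class RegionCompactionGuard {
    private final ConcurrentHashMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    private ReentrantLock lockFor(String regionName) {
        return locks.computeIfAbsent(regionName, r -> new ReentrantLock());
    }

    /** Runs a compaction (mob or major) while holding only that region's lock. */
    void runExclusive(String regionName, Runnable compaction) {
        ReentrantLock lock = lockFor(regionName);
        lock.lock();
        try {
            compaction.run();
        } finally {
            lock.unlock();
        }
    }

    /** True while some thread is running a compaction for this region. */
    boolean isBusy(String regionName) {
        return lockFor(regionName).isLocked();
    }
}
```

A mob compaction that spans regions would have to hold every affected region's lock at once, which is the global locking the comment wants to avoid.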

> Native MOB Compaction mechanisms.
> ---------------------------------
>
>                 Key: HBASE-11861
>                 URL: https://issues.apache.org/jira/browse/HBASE-11861
>             Project: HBase
>          Issue Type: Sub-task
>          Components: regionserver, Scanners
>    Affects Versions: 2.0.0
>            Reporter: Jonathan Hsieh
>         Attachments: 141030-mob-compaction.pdf, mob compaction.pdf
>
>
> Currently, the first cut of mob will have external processes to age off old 
> mob data (the ttl cleaner), and to compact away deleted or overwritten data 
> (the sweep tool).  
> From an operational point of view, having two external tools, especially one 
> that relies on MapReduce, is undesirable.  In this issue we'll tackle 
> integrating these into hbase without requiring external processes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
