[
https://issues.apache.org/jira/browse/HBASE-11861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242256#comment-14242256
]
Jingcheng Du commented on HBASE-11861:
--------------------------------------
Thanks Jon. [~jmhsieh]
bq. I think the bulk load approach avoids the potential race on mob compaction
and normal compaction.
I think we still have a race condition even if we use bulk load in mob
compaction.
# We have a mob cell, cell#1, in HBase.
# Now the mob compaction happens: the mob file file#1, which contains cell#1,
is merged into a bigger file, file#2. We write the new cell#1, whose value is
file#2 (the file name), into a new store file, but have not started the bulk
load yet.
# A user deletes cell#1 and a major compaction then runs, so cell#1 (along with
its delete marker) no longer exists in HBase.
# The mob compaction bulk loads the new store file.
# cell#1 is back in HBase again.
This is why I insist on running the mob compaction within regions. If we do the
mob compaction outside of a region or across regions, we have to lock major
compactions globally.
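The five steps above can be modeled with a small in-memory sketch. This is not HBase code: the class MobRaceSketch and all of its methods are hypothetical names made up for illustration. It also assumes that a delete marker which is still present would mask the bulk-loaded cell, which ignores timestamp and sequence-id details. It only shows why the ordering "delete, major compaction, bulk load" brings the cell back.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model of one region's store; none of this is HBase API.
class MobRaceSketch {
    // cell name -> name of the mob file that the reference cell points to
    private final Map<String, String> cells = new HashMap<>();
    // cells covered by a delete marker that has not been compacted away yet
    private final Set<String> tombstones = new HashSet<>();

    void put(String cell, String mobFile) { cells.put(cell, mobFile); }

    void delete(String cell) { tombstones.add(cell); }

    // A major compaction drops deleted cells and the delete markers themselves.
    void majorCompact() {
        cells.keySet().removeAll(tombstones);
        tombstones.clear();
    }

    // Bulk load adds the cells of a prepared store file to the store.
    void bulkLoad(Map<String, String> storeFile) { cells.putAll(storeFile); }

    // Assumption: a surviving delete marker hides the cell from readers.
    boolean isVisible(String cell) {
        return cells.containsKey(cell) && !tombstones.contains(cell);
    }

    // Replays steps 1-5; returns true if cell#1 is readable at the end.
    static boolean runScenario(boolean majorCompactBeforeBulkLoad) {
        MobRaceSketch region = new MobRaceSketch();
        region.put("cell#1", "file#1");                 // step 1

        Map<String, String> pending = new HashMap<>();  // step 2: new store file,
        pending.put("cell#1", "file#2");                // not bulk loaded yet

        region.delete("cell#1");                        // step 3: user delete
        if (majorCompactBeforeBulkLoad) {
            region.majorCompact();                      // step 3: major compaction
        }

        region.bulkLoad(pending);                       // step 4
        return region.isVisible("cell#1");              // step 5
    }

    public static void main(String[] args) {
        System.out.println("with major compaction in between: "
            + runScenario(true));
        System.out.println("without major compaction in between: "
            + runScenario(false));
    }
}
```

In this model runScenario(true) returns true (the deleted cell is visible again), while runScenario(false) returns false because the delete marker is still there to hide the bulk-loaded cell. The problem comes from the major compaction running between writing the new store file and bulk loading it.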
> Native MOB Compaction mechanisms.
> ---------------------------------
>
> Key: HBASE-11861
> URL: https://issues.apache.org/jira/browse/HBASE-11861
> Project: HBase
> Issue Type: Sub-task
> Components: regionserver, Scanners
> Affects Versions: 2.0.0
> Reporter: Jonathan Hsieh
> Attachments: 141030-mob-compaction.pdf, mob compaction.pdf
>
>
> Currently, the first cut of mob will have external processes to age off old
> mob data (the ttl cleaner), and to compact away deleted or overwritten data
> (the sweep tool).
> From an operational point of view, having two external tools, especially one
> that relies on MapReduce, is undesirable. In this issue we'll tackle
> integrating these into hbase without requiring external processes.