[ https://issues.apache.org/jira/browse/HBASE-11861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248395#comment-14248395 ]
Jonathan Hsieh commented on HBASE-11861:
----------------------------------------
bq. This is why I insist on running the mob compaction in regions. If we do the
mob compaction out of region or across regions, we have to lock the major
compactions globally.
Nice catch on that race condition -- I buy it. This is essentially the same as
with the MR sweeper approach, right?
So we'd need to guarantee that the compacted mob and the bulk load of the new
references block any major compaction on the region where the ref bulk load is
happening. This means no major compactions before step #2, but they are allowed
after step #4.
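To make the blocking requirement concrete, here is a minimal sketch of a per-region gate. It assumes the protected window runs from the point the compacted mob is produced until the new references have been bulk loaded; the class and method names are hypothetical and are not HBase APIs.

```java
/**
 * Hypothetical per-region gate (not an HBase class). A major compaction and a
 * mob reference swap are never allowed to overlap on the same region.
 */
final class RegionCompactionGate {
    private boolean mobRefSwapInProgress = false;
    private boolean majorCompactionRunning = false;

    /** Called before the mob compaction starts its protected window; false means retry later. */
    synchronized boolean tryBeginMobRefSwap() {
        if (mobRefSwapInProgress || majorCompactionRunning) {
            return false;
        }
        mobRefSwapInProgress = true;
        return true;
    }

    /** Called once the new references have been bulk loaded into the region. */
    synchronized void endMobRefSwap() {
        mobRefSwapInProgress = false;
    }

    /** Called by the major compaction; false means it must wait for the ref swap to finish. */
    synchronized boolean tryBeginMajorCompaction() {
        if (mobRefSwapInProgress || majorCompactionRunning) {
            return false;
        }
        majorCompactionRunning = true;
        return true;
    }

    /** Called when the major compaction on this region has finished. */
    synchronized void endMajorCompaction() {
        majorCompactionRunning = false;
    }
}
```

Because the gate is held per region, only the region receiving the ref bulk load is affected; major compactions on other regions proceed normally.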
Let's spell out the costs of the different approaches: the del mob global scan
for the mob compaction approach, and the per-region mob compaction.
Meanwhile, I noticed you filed a new jira for counts, and I filed one for the
del mob generator. We can get code started on those and hash out this
higher-level design while doing so.
bq. I think we could leave the expired (live longer than TTL) cells out of the
del files. Let the ExpiredMobFileCleaner handle those mob files directly.
Sounds reasonable. We need to enforce the mob file time ordering, though, to
make sure the mob compaction is effective.
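As a rough illustration of leaving expired cells out of the del files, the sketch below drops entries whose age exceeds the TTL. It assumes expiry is judged by comparing the cell timestamp with the current time; the `DelFileFilter` and `DeleteMarker` names are hypothetical and are not HBase classes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not an HBase class) for choosing what goes into a del file. */
final class DelFileFilter {

    /** Minimal stand-in for a del file entry: the row key and the cell timestamp. */
    static final class DeleteMarker {
        final String rowKey;
        final long timestampMs;

        DeleteMarker(String rowKey, long timestampMs) {
            this.rowKey = rowKey;
            this.timestampMs = timestampMs;
        }
    }

    /**
     * Returns only the entries that have not outlived the TTL. Expired entries
     * are skipped on the assumption that a separate cleaner removes their mob
     * files directly.
     */
    static List<DeleteMarker> dropExpired(List<DeleteMarker> markers, long ttlMs, long nowMs) {
        List<DeleteMarker> kept = new ArrayList<>();
        for (DeleteMarker marker : markers) {
            boolean expired = nowMs - marker.timestampMs > ttlMs;
            if (!expired) {
                kept.add(marker);
            }
        }
        return kept;
    }
}
```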
> Native MOB Compaction mechanisms.
> ---------------------------------
>
> Key: HBASE-11861
> URL: https://issues.apache.org/jira/browse/HBASE-11861
> Project: HBase
> Issue Type: Sub-task
> Components: regionserver, Scanners
> Affects Versions: 2.0.0
> Reporter: Jonathan Hsieh
> Attachments: 141030-mob-compaction.pdf, mob compaction.pdf
>
>
> Currently, the first cut of mob will have external processes to age off old
> mob data (the ttl cleaner), and to compact away deleted or overwritten data
> (the sweep tool).
> From an operational point of view, having two external tools, especially one
> that relies on MapReduce, is undesirable. In this issue we'll tackle
> integrating these into hbase without requiring external processes.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)