[ https://issues.apache.org/jira/browse/HBASE-11861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14296596#comment-14296596 ]
Jingcheng Du commented on HBASE-11861:
--------------------------------------
I have been thinking about using a thread pool in the mob file compaction.
As you know, we have to divide the mob files into batches if there are too many
candidates in the mob file compaction (native compaction); the batch size is
100 by default.
If we use multiple threads to do the compaction in the chore, we have to lower
the batch limit, for instance to 10 (from 100).
That makes the mob compaction less efficient once the thread pool is used.
Previously, with one thread, 100 files were merged into 1. With the pool, only
10 files are merged into 1.
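
To make the arithmetic concrete, here is a minimal sketch of the batching, not the actual HBase code. The class name MobBatchSketch, the helper methods, and the "mobfile-N" names are hypothetical, and it assumes each batch is merged into exactly one output file, as described above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names come from HBase.
class MobBatchSketch {

    // Splits the candidate file names into consecutive batches of at most batchSize.
    static List<List<String>> partition(List<String> files, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<String>> batches = new ArrayList<>();
        for (int start = 0; start < files.size(); start += batchSize) {
            int end = Math.min(start + batchSize, files.size());
            batches.add(new ArrayList<>(files.subList(start, end)));
        }
        return batches;
    }

    // Builds placeholder names standing in for mob file candidates.
    static List<String> candidateNames(int count) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            names.add("mobfile-" + i);
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> candidates = candidateNames(100);
        // Assumption: one merged output file per batch.
        for (int batchSize : new int[] {100, 10}) {
            int outputs = partition(candidates, batchSize).size();
            System.out.println("batch size " + batchSize + " -> " + outputs + " merged file(s)");
        }
    }
}
```

With 100 candidates, a batch limit of 100 leaves 1 merged file, while a limit of 10 leaves 10 files that still need a later compaction pass.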
Maybe we could do the compaction in parallel in the future (by dispatching the
compaction to the HRS). But for now we could have one thread handle it.
Please advise. Thanks.
> Native MOB Compaction mechanisms.
> ---------------------------------
>
> Key: HBASE-11861
> URL: https://issues.apache.org/jira/browse/HBASE-11861
> Project: HBase
> Issue Type: Sub-task
> Components: regionserver, Scanners
> Affects Versions: 2.0.0
> Reporter: Jonathan Hsieh
> Assignee: Jingcheng Du
> Attachments: 141030-mob-compaction.pdf, HBASE-11861-V1.diff,
> HBASE-11861-V2.diff, HBASE-11861.diff, mob compaction-out-of-region.pdf, mob
> compaction.pdf
>
>
> Currently, the first cut of mob will have external processes to age off old
> mob data (the ttl cleaner), and to compact away deleted or overwritten data
> (the sweep tool).
> From an operational point of view, having two external tools, especially one
> that relies on MapReduce, is undesirable. In this issue we'll tackle
> integrating these into hbase without requiring external processes.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)