[jira] [Commented] (HBASE-11861) Native MOB Compaction mechanisms.

Jingcheng Du (JIRA) Mon, 08 Dec 2014 00:31:58 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-11861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14237612#comment-14237612
 ]


Jingcheng Du commented on HBASE-11861:
--------------------------------------

Thanks Jon for the comments.
bq. 0) when we do a mob compaction, are we compacting all mobs or just mobs 
relevant to a particular region?
Just relevant to a particular region.

bq. 1) I don't think mob compaction has to happen after major compactions. It 
could have its own schedule and could run less frequently than the normal major 
compactions. Doing them after a major compaction (or after a few) is reasonable 
first cut.
I agree.
But it is better to compact the mob files per region, because major compaction 
and mob compaction have to be synchronized to avoid a race condition. The sweep 
tool does this with ZooKeeper. If mob compaction runs inside the region, we 
could use in-region locks instead.
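
The in-region locking idea could look roughly like the sketch below. It is only an illustration, not HBase code: the class RegionCompactionGuard and the method tryRun are hypothetical names, and a plain ReentrantLock stands in for whatever lock the region would actually use. Major compaction and mob compaction of the same region would both go through the same guard, so only one of them runs at a time.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical per-region guard: at most one compaction runs at a time. */
final class RegionCompactionGuard {
    // Assumption: one lock per region; this is not an actual HBase field.
    private final ReentrantLock compactionLock = new ReentrantLock();

    /**
     * Runs the given compaction only if no other thread is compacting this
     * region. Returns false without running it when the lock is already held.
     */
    boolean tryRun(Runnable compaction) {
        if (!compactionLock.tryLock()) {
            return false; // another compaction is in progress; retry later
        }
        try {
            compaction.run();
            return true;
        } finally {
            compactionLock.unlock();
        }
    }

    public static void main(String[] args) {
        RegionCompactionGuard guard = new RegionCompactionGuard();
        boolean ran = guard.tryRun(() -> System.out.println("mob compaction"));
        System.out.println("ran = " + ran);
    }
}
```

Skipping instead of blocking is a design choice of this sketch: a mob compaction that loses the race simply waits for its next scheduled run.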

bq. 3) why should hfile links be rewritten? I think we can use the same 
criteria to decide on if we do a mob compaction on it.
Agree.

bq. 4) I don't think we want to scan all the mob files to do a compaction on a 
single store. Also, because of splits and merges, there could be other del mob 
files that are relevant that have a start key earlier or later that cover the 
range in a particular store. I think we'll have to do some start key and end 
key tracking in the delmob files and the mob files to reduce the candidate list.
I thought we could just list the mob file names from the NameNode. But a file 
name only gives us the md5 of the start key, not the start key itself.
You're right: once regions are merged, the md5 of the start key alone is not 
enough to find the related mob files.
Currently the start key and stop key are stored in the metadata of the hfiles, 
so we cannot get them from the file names; we have to open readers on the 
files.
Besides reading the metadata, do you have ideas for tracking the start and stop 
keys, for example by revising the mob file name pattern? Please advise. Thanks.

bq. 5) why do a mini del file compaction? why not just use it as is?
Is it possible that there are too many delmob files? If not, we could directly 
open scanners on these delmob files.
Jon, do you have comments on how to map the file names to deleted cells?

bq. 6) deletedCellsSizeInOneMobFile – interesting. I was thinking just a count 
of mobs associated with each mob file.
Count is a good idea. But currently we don't have accurate count information 
in the mob and del mob files. As you know, mobs have a size threshold, so we 
cannot tell how many of the cells are mob cells and how many are not. That's a 
problem, right?

bq. 7) on merge – shouldn't we try to guarantee time order in a merge so that 
the ttl cleaner is still effective?
Right, we should guarantee time order in a merge, I missed that in the design.

bq. 8) I'm not clear about the splits case here. Also does it manage merges? 
(say we have a single del file with deletes in rows a b c d. that region gets 
split into a b and c d, and then again into separate a, b, c, and d regions. 
finally someone does a merge for b and c to create a bc region. Does the 
grouping on hash idea break then?
Sorry, I missed the merge case. To get the start/stop key information, we now 
have to read the mob files in each region rather than rely on the file names.
Region splits and merges will be handled by the per-region mob compaction.
For a split, if the start key of a mob file is between the start and stop keys 
of a region, the mob file is handled by that region. Checking the file's stop 
key tells us whether it crosses regions. If it does, two or more ref files are 
created, one for each daughter region, and each ref file is handled in the mob 
compaction of its daughter region.
For example, in the mob compaction of a b, if a mob file file#1 is selected we 
need to create two ref files: one for a b named ref-ab-file#1, the other for 
c d named ref-cd-file#1 (if the mob file is not selected, we don't need to 
create them at all). The ref file ref-ab-file#1 is handled in the mob 
compaction of a b to generate a new mob file file#1ab, and ref-cd-file#1 is 
handled in the mob compaction of c d to generate the mob file file#1cd.
After the region ab is split, if (and only if) the file file#1ab is selected in 
the mob compaction of region a, the new ref files are created and handled by 
region a and region b.
Merges are easier than splits: the files do not cross regions, so we directly 
select the qualified (small or invalid) mob files whose start/stop keys fall 
within the key range of the current region.
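
The ownership and cross-region checks described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the design's actual code: the class and method names are hypothetical, region stop keys are treated as exclusive with an empty stop key meaning "end of table", and the mob file's stop key is assumed to be the last row it contains.

```java
/** Hypothetical helpers for mapping mob files to regions by key range. */
final class MobFileRangeSketch {
    /** Unsigned lexicographic comparison of two row keys. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    /** True if regionStart <= fileStart < regionStop (empty stop = unbounded). */
    static boolean ownedBy(byte[] fileStart, byte[] regionStart, byte[] regionStop) {
        return compare(fileStart, regionStart) >= 0
            && (regionStop.length == 0 || compare(fileStart, regionStop) < 0);
    }

    /** True if the file's last row lies at or beyond the region's stop key. */
    static boolean crossesRegion(byte[] fileStop, byte[] regionStop) {
        return regionStop.length != 0 && compare(fileStop, regionStop) >= 0;
    }

    public static void main(String[] args) {
        byte[] a = {'a'}, c = {'c'}, d = {'d'};
        // file#1 covers rows a..d; region "a b" spans [a, c).
        System.out.println("owned by a b: " + ownedBy(a, a, c));
        System.out.println("crosses into c d: " + crossesRegion(d, c));
    }
}
```

A file that is owned by the region and crosses its stop key is the case where ref files for the neighbouring region would be needed.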

bq. I think we need to either track both the start and end keys in the del 
files and likely the mobfiles. An alternative is something that splits mob 
files and del files but that potentially causes write amplification we want to 
avoid.
Agree, we should track the start and stop keys. We currently track them in the 
metadata of the mob files. Should we also track them in the file name, using 
the hex string of the start/stop key instead of md5(startkey)? Then we could 
read the start/stop keys directly from the file names, whereas currently we 
have to read the metadata of the mob files. Please advise. Thanks.
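
To make the difference between the two naming options concrete, here is a minimal sketch. The name layout hex(start)_hex(stop)_suffix, the class MobFileNameSketch, and all method names are assumptions for illustration only; they are not the existing mob file name format. The point is that a hex-encoded key can be decoded back from the name, while md5(startkey) cannot.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical comparison of md5-based and hex-based mob file names. */
final class MobFileNameSketch {
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(HEX[(b >> 4) & 0xF]).append(HEX[b & 0xF]);
        }
        return sb.toString();
    }

    static byte[] fromHex(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    /** One-way prefix: the start key cannot be recovered from it. */
    static String md5Prefix(byte[] startKey) {
        try {
            return toHex(MessageDigest.getInstance("MD5").digest(startKey));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Assumed reversible layout: hex(start) + "_" + hex(stop) + "_" + suffix. */
    static String rangeName(byte[] startKey, byte[] stopKey, String suffix) {
        return toHex(startKey) + "_" + toHex(stopKey) + "_" + suffix;
    }

    /** Recovers {startKey, stopKey} from a name built by rangeName. */
    static byte[][] parseRange(String fileName) {
        String[] parts = fileName.split("_", 3);
        return new byte[][] { fromHex(parts[0]), fromHex(parts[1]) };
    }
}
```

One trade-off worth keeping in mind: hex encoding doubles the key length, so long row keys would produce long file names, whereas the md5 prefix has a fixed length of 32 characters.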

bq. My gut feeling is that we need to deal with all mob files, iterate through 
ranges, and use mob counts. We'd track start/end keys and counts in each mob 
file and each del file. We could then iterate on mob files, and select only 
the del files that are relevant based on the start keys and end keys. We might 
want to track a histogram (count or size) of mob files deletions for particular 
mob file in each del file.
Currently we track the start/stop keys in the metadata of the mob files. But it 
is hard to track the counts in each mob file because of the mob threshold.
In this design doc, mob compaction is handled per region, which means only part 
of the mob files (those owned by the current region) can be handled each time.
Alternatively, we could do the mob compaction globally (in one single place) 
for all the mob files. But how would we then avoid the race condition between 
major compaction and mob compaction? Still use ZooKeeper?
Since major compaction and mob compaction are infrequent, and deletion is rare 
in the mob use cases, could we simply ignore the race condition? Please advise. 
Thanks.
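
For reference, the range-based selection of del files suggested in the quoted comment amounts to an interval-overlap test. The sketch below is only an illustration: KeyRange and relevantDelFiles are hypothetical names, both keys are treated as inclusive, and String keys are used for brevity where real row keys would be byte arrays.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical selection of del files whose key range overlaps a mob file. */
final class DelFileSelectionSketch {
    /** Assumed descriptor: file name plus inclusive first and last row keys. */
    static final class KeyRange {
        final String name;
        final String firstKey;
        final String lastKey;

        KeyRange(String name, String firstKey, String lastKey) {
            this.name = name;
            this.firstKey = firstKey;
            this.lastKey = lastKey;
        }
    }

    /** Two inclusive ranges overlap if each starts no later than the other ends. */
    static boolean overlaps(KeyRange a, KeyRange b) {
        return a.firstKey.compareTo(b.lastKey) <= 0
            && b.firstKey.compareTo(a.lastKey) <= 0;
    }

    /** Keeps only the del files that can contain deletes for the mob file. */
    static List<KeyRange> relevantDelFiles(KeyRange mobFile, List<KeyRange> delFiles) {
        List<KeyRange> selected = new ArrayList<>();
        for (KeyRange del : delFiles) {
            if (overlaps(mobFile, del)) {
                selected.add(del);
            }
        }
        return selected;
    }
}
```

With the keys available from file names or cached metadata, this filter would not need to open a reader on every del file.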

> Native MOB Compaction mechanisms.
> ---------------------------------
>
>                 Key: HBASE-11861
>                 URL: https://issues.apache.org/jira/browse/HBASE-11861
>             Project: HBase
>          Issue Type: Sub-task
>          Components: regionserver, Scanners
>    Affects Versions: 2.0.0
>            Reporter: Jonathan Hsieh
>         Attachments: 141030-mob-compaction.pdf, mob compaction.pdf
>
>
> Currently, the first cut of mob will have external processes to age off old 
> mob data (the ttl cleaner), and to compact away deleted or over written data 
> (the sweep tool).  
> From an operational point of view, having two external tools, especially one 
> that relies on MapReduce is undesirable.  In this issue we'll tackle 
> integrating these into hbase without requiring external processes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
