[
https://issues.apache.org/jira/browse/HBASE-7667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sergey Shelukhin resolved HBASE-7667.
-------------------------------------
Resolution: Fixed
Fix Version/s: 0.99.0, 0.98.0
All the pertinent patches have been committed for some time (before 0.98 was
branched).
> Support stripe compaction
> -------------------------
>
> Key: HBASE-7667
> URL: https://issues.apache.org/jira/browse/HBASE-7667
> Project: HBase
> Issue Type: New Feature
> Components: Compaction
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Fix For: 0.98.0, 0.99.0
>
> Attachments: Stripe compaction perf evaluation.pdf, Stripe compaction
> perf evaluation.pdf, Stripe compaction perf evaluation.pdf, Stripe
> compactions.pdf, Stripe compactions.pdf, Stripe compactions.pdf, Stripe
> compactions.pdf, Using stripe compactions.pdf, Using stripe compactions.pdf,
> Using stripe compactions.pdf, stripe-cdf.pdf
>
>
> So I was thinking about having many regions as a way to make compactions
> more manageable, writing the LevelDB doc about how LevelDB range overlap
> and data mixing break seqNum sorting, discussing it with Jimmy, Matteo and
> Ted, and thinking about how to avoid the LevelDB I/O multiplication factor.
> And I suggest the following idea, let's call it stripe compactions. It's a
> mix between level db ideas and having many small regions.
> It allows us to have a subset of the benefits of many regions (wrt reads and
> compactions) without many of the drawbacks (management overhead and current
> memstore/etc. limitations).
> It also doesn't break seqNum-based file sorting for any one key.
> It works like this.
> The region key space is separated into a configurable number of
> fixed-boundary stripes (determined the first time we stripe the data, see
> below).
> All the data from memstores is written to normal files with all keys present
> (not striped), similar to L0 in LevelDB, or current files.
> Compaction policy does 3 types of compactions.
> First is L0 compaction, which takes all L0 files and breaks them down by
> stripe. It may be optimized by adding more small files from different
> stripes, but the main logical outcome is that there are no more L0 files and
> all data is striped.
> Second is exactly like the current compaction, but compacting one single
> stripe. In future, nothing prevents us from applying compaction rules and
> compacting part of the stripe (e.g. similar to current policy with ratios
> and stuff, tiers, whatever), but for the first cut I'd argue let it "major
> compact" the entire stripe. Or just have the ratio and no more complexity.
> Finally, the third addresses the concern of the fixed boundaries causing
> stripes to be very unbalanced.
> It's exactly like the 2nd, except it takes 2+ adjacent stripes and writes the
> results out with different boundaries.
> There's a tradeoff here - if we always take 2 adjacent stripes, compactions
> will be smaller but rebalancing will take a ridiculous amount of I/O.
> If we take many stripes we are essentially getting into the
> epic-major-compaction problem again. Some heuristics will have to be in place.
> In general, if we initially let L0 grow before determining the stripes, we
> will get better boundaries.
> Also, unless the imbalance is really large, we don't really need to
> rebalance.
> Obviously this scheme (as well as Level) is not applicable for all scenarios,
> e.g. if timestamp is your key it completely falls apart.
> The end result:
> - many small compactions that can be spread out in time.
> - reads still read from a small number of files (one stripe + L0).
> - region splits become marvelously simple (if we could move files between
> regions, no references would be needed).
> The main advantage over Level (for HBase) is that the default store can
> still open the files and get correct results - there are no range overlap
> shenanigans.
> It also needs no metadata, although we may record some for convenience.
> It also would appear to not cause as much I/O.
--
This message was sent by Atlassian JIRA
(v6.1#6144)