[
https://issues.apache.org/jira/browse/HBASE-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086328#comment-15086328
]
Edward Bortnikov commented on HBASE-15016:
------------------------------------------
My 2 cents (well, maybe 5 :)) Apologies in case they repeat part of the past
discussion.
Getting the most out of smart (self-compacting/self-compressing/etc.) stores
depends on the breathing room they get. The more room they get, the larger the
potential to exploit redundancies in data. Therefore the idea is to let them use
more memory a priori, and rely on them to manage it smartly. Early hints to compact are
not the real problem - for example, the store can monitor its size internally,
and compact from time to time regardless of what happens outside. That's why we
think the existing region-store contract needs a little extension. The question
is how to minimize and generalize it; Region does not really need to know much
about the Store internals.
I wonder if we can get around the threshold nomenclature, which is indeed
awkward. Let's say every region has two memory areas - underflow and overflow.
Some stores can use the latter. The total of the underflow and the overflow is
budgeted. When this budget is exceeded, we seek to free up the underflow area
first - assuming that it is mostly occupied by default stores that can only
grow monotonically. Every store that exceeds its individual underflow
budget (including the self-compacting ones) is flushed to disk. We could add a
heuristic that if this fails to free enough room, we also flush all stores that
have an overflow part.
These decisions require a global view - such that a compacting store could
extend far beyond its 16M cap, into the overflow area. Giving each store a
small individual extra just won't cut it.
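To make the underflow/overflow proposal a bit more concrete, here is a rough
sketch of the flush selection described above. It is only an illustration under
assumptions: FlushPlanner, StoreUsage and selectStoresToFlush are hypothetical
names, not HBase APIs, and the sketch assumes that flushing a store frees both
its underflow and its overflow part.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: none of these names are HBase APIs.
final class FlushPlanner {

    /** Memory held by one store, split across the two region areas (bytes). */
    static final class StoreUsage {
        final String name;
        final long underflowBytes;
        final long overflowBytes; // stays 0 for default stores

        StoreUsage(String name, long underflowBytes, long overflowBytes) {
            this.name = name;
            this.underflowBytes = underflowBytes;
            this.overflowBytes = overflowBytes;
        }

        long totalBytes() {
            return underflowBytes + overflowBytes;
        }
    }

    /**
     * Returns the names of the stores to flush, in selection order.
     * Assumption: flushing a store frees its underflow and overflow parts.
     */
    static List<String> selectStoresToFlush(List<StoreUsage> stores,
                                            long storeUnderflowBudget,
                                            long regionBudget) {
        long remaining = 0;
        for (StoreUsage s : stores) {
            remaining += s.totalBytes();
        }

        Set<String> toFlush = new LinkedHashSet<>();
        if (remaining <= regionBudget) {
            return new ArrayList<>(toFlush); // region is within its total budget
        }

        // Step 1: free the underflow area first. Flush every store that
        // exceeds its individual underflow budget.
        for (StoreUsage s : stores) {
            if (s.underflowBytes > storeUnderflowBudget) {
                toFlush.add(s.name);
                remaining -= s.totalBytes();
            }
        }

        // Step 2 (heuristic): if that did not free enough room, also flush
        // every store that has an overflow part. The set drops duplicates.
        if (remaining > regionBudget) {
            for (StoreUsage s : stores) {
                if (s.overflowBytes > 0) {
                    toFlush.add(s.name);
                }
            }
        }
        return new ArrayList<>(toFlush);
    }

    public static void main(String[] args) {
        final long MB = 1024L * 1024L;
        List<StoreUsage> stores = new ArrayList<>();
        stores.add(new StoreUsage("default-a", 20 * MB, 0));
        stores.add(new StoreUsage("compacting-b", 4 * MB, 40 * MB));
        stores.add(new StoreUsage("default-c", 2 * MB, 0));

        // Region budget of 48M: flushing default-a alone is enough.
        System.out.println(selectStoresToFlush(stores, 16 * MB, 48 * MB));
        // Region budget of 30M: the overflow heuristic adds compacting-b.
        System.out.println(selectStoresToFlush(stores, 16 * MB, 30 * MB));
    }
}
```

Note that the selection needs the usage of all stores in the region at once,
which is the global view mentioned above; the 16M and region budgets in main
are example values only.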
We are very open to alternative ideas, if they can solve the core problem.
Thanks for bearing with us.
> StoreServices facility in Region
> --------------------------------
>
> Key: HBASE-15016
> URL: https://issues.apache.org/jira/browse/HBASE-15016
> Project: HBase
> Issue Type: Sub-task
> Reporter: Eshcar Hillel
> Assignee: Eshcar Hillel
> Attachments: HBASE-15016-V01.patch, HBASE-15016-V02.patch,
> HBASE-15016-V03.patch, Regioncounters.pdf
>
>
> The default implementation of a memstore ensures that between two flushes the
> memstore size increases monotonically. Supporting new memstores that store
> data in different formats (specifically, compressed), or that can
> eliminate data redundancies in memory (e.g., via compaction), means that the
> size of the data stored in memory can decrease even between two flushes. This
> requires memstores to have access to facilities that manipulate region
> counters and synchronization.
> This subtask introduces a new region interface -- StoreServices, through
> which store components can access these facilities.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)