[ 
https://issues.apache.org/jira/browse/HBASE-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086328#comment-15086328
 ] 

Edward Bortnikov commented on HBASE-15016:
------------------------------------------

My 2 cents (well, maybe 5 :)). Apologies if they repeat part of the past 
discussion. 

Getting the most out of smart (self-compacting, self-compressing, etc.) stores 
depends on the breathing room they get. The more room they have, the larger 
the potential to exploit redundancies in the data. The idea is therefore to 
let them use more memory up front, and rely on them to manage it smartly. 
Early hints to compact are not the real problem - for example, the store can 
monitor its size internally and compact from time to time regardless of what 
happens outside. That's why we think the existing region-store contract needs 
a small extension. The question is how to minimize and generalize it; Region 
does not really need to know much about the Store internals. 

I wonder if we can get around the threshold nomenclature, which is indeed 
awkward. Let's say every region has two memory areas - underflow and overflow. 
Only some stores can use the latter. The total of the underflow and the 
overflow is budgeted. When this budget is exceeded, we seek to free up the 
underflow area first - assuming that it is mostly occupied by default stores, 
which can only grow monotonically. Every store that exceeds its individual 
underflow budget (including the self-compacting ones) is flushed to disk. We 
could add a heuristic: if this fails to free enough room, we also flush all 
stores that have an overflow part. 
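
A rough sketch of the selection logic described above, in Python for brevity. 
It is not HBase code: the names (Store, select_stores_to_flush, 
store_underflow_budget, region_budget) are hypothetical. It also rests on two 
assumptions that the proposal does not spell out: each store's memory is 
accounted separately to the underflow and overflow areas, and flushing a store 
frees both parts. 

```python
from dataclasses import dataclass
from typing import List

MB = 1024 * 1024


@dataclass
class Store:
    """Hypothetical per-store accounting; not an HBase class."""
    name: str
    underflow_bytes: int       # memory accounted to the underflow area
    overflow_bytes: int = 0    # memory in the overflow area (0 for default stores)

    @property
    def total_bytes(self) -> int:
        return self.underflow_bytes + self.overflow_bytes


def select_stores_to_flush(stores: List[Store],
                           region_budget: int,
                           store_underflow_budget: int) -> List[str]:
    """Return the names of the stores to flush, in selection order."""
    total = sum(s.total_bytes for s in stores)
    if total <= region_budget:
        return []  # underflow + overflow still fit the region budget

    # Step 1: flush every store whose underflow part exceeds its individual
    # underflow budget, self-compacting stores included.
    selected = [s for s in stores if s.underflow_bytes > store_underflow_budget]

    # Assumption: a flush releases the store's whole footprint (both areas).
    freed = sum(s.total_bytes for s in selected)

    # Step 2 (heuristic): if that is not enough, also flush every store
    # that holds an overflow part.
    if total - freed > region_budget:
        selected += [s for s in stores
                     if s.overflow_bytes > 0 and s not in selected]

    return [s.name for s in selected]
```

For example, take a 16M per-store underflow budget and three stores: two 
default stores of 20M and 8M, and one compacting store with 10M in underflow 
and 60M in overflow (98M in total). With an illustrative region budget of 80M 
only the 20M default store is selected. With a budget of 64M that is not 
enough, so the compacting store is selected as well. 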

These decisions require a global view, so that a compacting store can extend 
far beyond its 16M cap into the overflow area. Giving each store a small 
individual allowance just won't cut it. 

We are very open to alternative ideas, if they can solve the core problem. 
Thanks for bearing with us. 

> StoreServices facility in Region
> --------------------------------
>
>                 Key: HBASE-15016
>                 URL: https://issues.apache.org/jira/browse/HBASE-15016
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Eshcar Hillel
>            Assignee: Eshcar Hillel
>         Attachments: HBASE-15016-V01.patch, HBASE-15016-V02.patch, 
> HBASE-15016-V03.patch, Regioncounters.pdf
>
>
> The default implementation of a memstore ensures that between two flushes the 
> memstore size increases monotonically. Supporting new memstores that store 
> data in different formats (specifically, compressed), or that allow 
> eliminating data redundancies in memory (e.g., via compaction), means that the 
> size of the data stored in memory can decrease even between two flushes. This 
> requires memstores to have access to facilities that manipulate region 
> counters and synchronization.
> This subtask introduces a new region interface -- StoreServices, through 
> which store components can access these facilities.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
