[jira] [Commented] (HBASE-3969) Outdated data can not be cleaned in time

stack (JIRA) Thu, 16 Jun 2011 10:51:00 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-3969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050594#comment-13050594
 ]


stack commented on HBASE-3969:
------------------------------

@zhoushuaifeng I take it you are doing lots of deletes, or you are doing lots 
of aging out of old versions?  Do you think YCSB represents what your 
actual load will be like?

You have identified an issue with our scheme whereby we assign a priority on 
initial queuing: the priority may have been correct at the time, but 
circumstances change.  It's as though the priority should change as the 
situation evolves?  If something has been queued a long time, its priority 
should go up?  Perhaps go up only if it is a major compaction?  This would 
require us to add something that peeks at the queues periodically.
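
To make the aging idea concrete, here is a rough sketch, not HBase code.  
Every name in it (AgingCompactionQueue, Request, age, MOST_URGENT) is made up 
for illustration, and it assumes a lower number means a more urgent request, 
with 1 as the most urgent.  A periodic task would call age() to raise the 
urgency of requests that have been waiting, optionally only for major 
compactions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.PriorityBlockingQueue;

// Hypothetical sketch, not HBase code: all names here are invented.
class AgingCompactionQueue {
    // Assumption: a lower number means more urgent; 1 is the most urgent.
    static final int MOST_URGENT = 1;

    static final class Request implements Comparable<Request> {
        final String region;
        final boolean major;
        final int basePriority;   // priority assigned at initial queuing
        final long enqueuedAtMs;  // when the request first entered the queue
        final int priority;       // effective priority after aging

        Request(String region, boolean major, int basePriority,
                long enqueuedAtMs, int priority) {
            this.region = region;
            this.major = major;
            this.basePriority = basePriority;
            this.enqueuedAtMs = enqueuedAtMs;
            this.priority = priority;
        }

        @Override
        public int compareTo(Request other) {
            int byPriority = Integer.compare(priority, other.priority);
            // Ties go to whichever request has waited longer.
            return byPriority != 0
                ? byPriority
                : Long.compare(enqueuedAtMs, other.enqueuedAtMs);
        }
    }

    private final PriorityBlockingQueue<Request> queue = new PriorityBlockingQueue<>();
    private final long agingIntervalMs;

    AgingCompactionQueue(long agingIntervalMs) {
        if (agingIntervalMs <= 0) {
            throw new IllegalArgumentException("agingIntervalMs must be positive");
        }
        this.agingIntervalMs = agingIntervalMs;
    }

    void add(String region, boolean major, int priority, long nowMs) {
        queue.add(new Request(region, major, priority, nowMs, priority));
    }

    // The periodic "peek": raise urgency by one step per interval waited.
    // Draining and re-adding is not atomic, so a real version would need
    // to coordinate with the threads that consume the queue.
    void age(long nowMs, boolean majorOnly) {
        List<Request> drained = new ArrayList<>();
        queue.drainTo(drained);
        for (Request r : drained) {
            int effective = r.priority;
            if (!majorOnly || r.major) {
                long steps = (nowMs - r.enqueuedAtMs) / agingIntervalMs;
                long aged = Math.max(MOST_URGENT, r.basePriority - steps);
                // Aging may only make a request more urgent, never less.
                effective = (int) Math.min(r.priority, aged);
            }
            queue.add(new Request(r.region, r.major, r.basePriority,
                                  r.enqueuedAtMs, effective));
        }
    }

    // Returns the most urgent region name, or null if the queue is empty.
    String poll() {
        Request r = queue.poll();
        return r == null ? null : r.region;
    }
}
```

With majorOnly set, a major compaction queued at priority 5 reaches priority 1 
after four aging intervals, while a minor compaction queued later at priority 
3 keeps its priority and is served second.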

The solution of (between 1 and blockingStoreFiles - compactionThreshold) seems 
unsatisfactory, don't you agree?  It's hard to tell how it will play out over 
time on a cluster.

Looking at your patch, you might want to do a check for 
hbase.hstore.blockingStoreFiles > hbase.hstore.compactionThreshold
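
Something along these lines is what I mean by the check.  This is a sketch 
only: the class and method names are hypothetical, a plain Map stands in for 
the real configuration object, and the fallback values of 7 and 3 are assumed 
defaults rather than something taken from the patch.

```java
import java.util.Map;

// Hypothetical sketch: names are invented, and a Map stands in for the
// real configuration object.
class CompactionConfigCheck {
    static final String BLOCKING = "hbase.hstore.blockingStoreFiles";
    static final String THRESHOLD = "hbase.hstore.compactionThreshold";

    // Returns blockingStoreFiles - compactionThreshold, the upper bound of
    // the proposed priority range, or throws if the range would be empty.
    static int priorityRange(Map<String, String> conf) {
        // Fallbacks of 7 and 3 are assumed defaults, not taken from the patch.
        int blocking = Integer.parseInt(conf.getOrDefault(BLOCKING, "7"));
        int threshold = Integer.parseInt(conf.getOrDefault(THRESHOLD, "3"));
        if (blocking <= threshold) {
            throw new IllegalArgumentException(
                BLOCKING + " (" + blocking + ") must be greater than "
                + THRESHOLD + " (" + threshold + ")");
        }
        return blocking - threshold;
    }
}
```

Without the check, a configuration where the two values are equal or inverted 
would leave the range (between 1 and blockingStoreFiles - compactionThreshold) 
empty.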

> Outdated data can not be cleaned in time
> ----------------------------------------
>
>                 Key: HBASE-3969
>                 URL: https://issues.apache.org/jira/browse/HBASE-3969
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>    Affects Versions: 0.90.1, 0.90.2, 0.90.3
>            Reporter: zhoushuaifeng
>             Fix For: 0.90.4
>
>         Attachments: HBASE-3969-solution1-for-branch.patch, 
> HBASE-3969-solution1.patch
>
>
> The compaction checker sends regions to the compaction queue to be 
> compacted, but the priority of these regions is too low if they have only a 
> few storefiles. When throughput is high, the compaction queue will always 
> have some regions with higher priority. This may cause major compactions to 
> be delayed for a long time (even a few days), and the cleaning of outdated 
> data will also be delayed.
> In our test case, we found some regions sent to the queue by the major 
> compaction checker hanging in the queue for more than 2 days! Some scanners 
> on these regions could not get available data for a long time, and their 
> leases expired.

