stack created HBASE-12626:
-----------------------------

             Summary: Archive cleaner cannot keep up; it maxes out at about 
400k deletes/hour
                 Key: HBASE-12626
                 URL: https://issues.apache.org/jira/browse/HBASE-12626
             Project: HBase
          Issue Type: Improvement
          Components: scaling, master
    Affects Versions: 0.94.25
            Reporter: stack
            Assignee: stack
            Priority: Critical


On big clusters, it is possible to overrun the archive cleaning thread.  Make 
it able to do more work per cycle when needed.

We saw this on a user's cluster. The rate at which files were being moved to 
the archive exceeded our delete rate, so the archive accumulated tens of 
millions of files, which put friction on all cluster ops.

The cluster had ~500 nodes.  It was RAM-constrained (other processes on the 
box also need RAM). Over a period of days, the loading was thrown off kilter 
because the cluster started taking double writes while migrating from one 
schema to another (it was running hot even before the double loading).  The 
master was deleting an archived file every 9ms on average, about 400k deletes 
an hour (3,600,000 ms/hour / 9 ms per delete ~= 400k).  The constrained RAM 
and their having 4-5 column families had them creating files in excess of this 
rate, so we backed up.
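One way to let the cleaner do more work per cycle is to fan the deletes out 
across a small thread pool instead of a single sequential loop. A minimal 
sketch in plain Java follows; the class name, pool size, and use of 
java.nio.file are all hypothetical illustration, not the HBase HFileCleaner 
API:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: delete archived files with N workers so one cleaner
// cycle can retire far more files than a single-threaded loop (~1 file / 9ms
// in the report above).
public class ParallelArchiveCleaner {

    // Delete every path in 'files' using 'threads' workers; returns the
    // number actually deleted. A failed delete is skipped rather than
    // stalling the cycle; a later cycle can retry it.
    public static long deleteAll(List<Path> files, int threads) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicLong deleted = new AtomicLong();
        for (Path f : files) {
            pool.submit(() -> {
                try {
                    if (Files.deleteIfExists(f)) {
                        deleted.incrementAndGet();
                    }
                } catch (IOException e) {
                    // log and move on
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
        return deleted.get();
    }
}
```

With each delete taking ~9ms of wall time, N workers would cut a cycle's 
duration roughly N-fold, assuming the underlying filesystem (the NameNode, on 
HDFS) can absorb the concurrent delete calls.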

For some helpful background/input, see the dev thread 
http://search-hadoop.com/m/DHED4UYSF9



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)