Adam Fuchs created ACCUMULO-1696:
------------------------------------

             Summary: deep copy in the compaction scope iterators can throw off the stats
                 Key: ACCUMULO-1696
                 URL: https://issues.apache.org/jira/browse/ACCUMULO-1696
             Project: Accumulo
          Issue Type: Bug
          Components: tserver
    Affects Versions: 1.5.0
            Reporter: Adam Fuchs
            Priority: Minor


When application-level iterators deep copy the source iterator during a major 
compaction, the reported compaction stats can be significantly off. We count 
two things in a major compaction:
1. Entries read. This is done using a counting iterator sitting just above the 
system iterators.
2. Entries written. This is done by counting the entries that are written to 
the RFile.
Here's an example of what we see in the Accumulo logs:
{code}
2013-09-06 11:53:31,371 [tabletserver.Compactor] DEBUG: Compaction 
k;row11;row10 20 read | 382,629 written |      3 entries/sec |  5.337 secs
{code}
In this case, we're only counting 20 entries read, presumably because the 
iterators have been deep copied and the counting iterator that is being polled 
only sees reads made through the original iterator, not through its copies. 
Given 382,629 entries written in 5.337 secs, we should have registered close 
to 72k entries/sec instead of 3 entries/sec.
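
As a rough check of the figures in that log line, the arithmetic can be sketched as below. The class and method names are hypothetical and are not part of Accumulo; the only inputs are the numbers from the log above. Using the written count as a lower bound on reads is an assumption that holds only if every written entry was read at least once.
{code}
// Hypothetical helper, not Accumulo code: recomputes the rates from the log line.
final class CompactionRate {

    // Entries per second for a given entry count and elapsed time in seconds.
    static double entriesPerSecond(long entries, double seconds) {
        return entries / seconds;
    }

    public static void main(String[] args) {
        double secs = 5.337;        // elapsed time reported in the log
        long countedReads = 20;     // reads seen by the polled counting iterator
        long written = 382_629;     // entries written to the RFile

        // Rate derived from the undercounted reads (a few entries/sec).
        System.out.printf("read-based:    %.1f entries/sec%n",
                entriesPerSecond(countedReads, secs));
        // Rate derived from the written count (roughly 72k entries/sec).
        System.out.printf("written-based: %.1f entries/sec%n",
                entriesPerSecond(written, secs));
    }
}
{code}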

To fix this, should we be counting all reads coming from any of the deep copies 
of the source iterators? This could be done by using a CountingIterator that 
keeps one counter for all deep copies. Thread-level counters could be used for 
lock-free counts in case multiple threads are ever used.
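
A minimal sketch of the shared-counter idea follows. It is not a patch: SourceIterator and ListSource are hypothetical, simplified stand-ins for Accumulo's iterator interface and a real data source, and SharedCountingIterator is an illustrative name. The sketch uses java.util.concurrent.atomic.LongAdder as one possible way to get low-contention counts across threads; explicit per-thread counters would be another.
{code}
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical, simplified iterator contract (assumption: NOT the real Accumulo API).
interface SourceIterator<T> {
    boolean hasTop();
    T getTop();
    void next();
    SourceIterator<T> deepCopy();
}

// Hypothetical in-memory source used only to exercise the sketch.
final class ListSource<T> implements SourceIterator<T> {
    private final List<T> data;
    private int pos = 0;

    ListSource(List<T> data) { this.data = data; }

    public boolean hasTop() { return pos < data.size(); }
    public T getTop() { return data.get(pos); }
    public void next() { pos++; }
    // A deep copy gets its own position but reads the same underlying data.
    public SourceIterator<T> deepCopy() { return new ListSource<>(data); }
}

// Counting wrapper whose deep copies all increment the same counter.
final class SharedCountingIterator<T> implements SourceIterator<T> {
    private final SourceIterator<T> source;
    private final LongAdder count; // shared by reference across all deep copies

    SharedCountingIterator(SourceIterator<T> source) {
        this(source, new LongAdder());
    }

    private SharedCountingIterator(SourceIterator<T> source, LongAdder count) {
        this.source = source;
        this.count = count;
    }

    public boolean hasTop() { return source.hasTop(); }
    public T getTop() { return source.getTop(); }

    public void next() {
        source.next();
        count.increment(); // safe to call from multiple threads
    }

    // The copy wraps a deep copy of the source but reuses the same counter.
    public SharedCountingIterator<T> deepCopy() {
        return new SharedCountingIterator<>(source.deepCopy(), count);
    }

    // Total advances made through this iterator and every deep copy of it.
    long getCount() { return count.sum(); }
}

final class SharedCountDemo {
    public static void main(String[] args) {
        SharedCountingIterator<String> top =
                new SharedCountingIterator<>(new ListSource<>(Arrays.asList("a", "b", "c")));
        SharedCountingIterator<String> copy = top.deepCopy();

        while (copy.hasTop()) copy.next(); // an application iterator reads via the copy
        top.next();                        // one advance via the original

        System.out.println("entries read: " + top.getCount());
    }
}
{code}
With a design like this, polling the original counting iterator would also reflect reads made through its deep copies, since every copy holds the same counter object.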

