[jira] [Commented] (ACCUMULO-4692) CompactionDriver leaves abandoned metadata scans

Adam Fuchs (JIRA) Fri, 04 Aug 2017 14:25:22 -0700

    [ 
https://issues.apache.org/jira/browse/ACCUMULO-4692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16114986#comment-16114986
 ]


Adam Fuchs commented on ACCUMULO-4692:
--------------------------------------

We saw about 7,000 scans listed on the monitor page for one metadata tablet. 
CompactionDriver launches new scans almost constantly, so 7,000 is probably 
around the equilibrium point at which scan sessions time out as fast as new 
ones are created. That works out to about one query every 10ms. I don't know 
whether there was memory associated with those sessions, but if a few hundred 
kilobytes were sitting in a result buffer for each of them, that could be a 
problem. We did not directly correlate this with any observable degradation in 
performance, but we didn't look very deeply into that.
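
For anyone who wants to redo the back-of-the-envelope numbers, here is a rough 
sketch. It treats the session count as a steady state (open sessions = scan 
rate x session lifetime). The 60 second lifetime and the 300 KB per-session 
buffer are assumptions for illustration, not measured values; the 10ms figure 
above corresponds to a lifetime of about 70 seconds.

```java
class ScanRateEstimate {
    /** Steady state: sessions = rate * lifetime, so interval = lifetime / sessions. */
    static double millisBetweenScans(long openSessions, double sessionLifetimeSeconds) {
        return sessionLifetimeSeconds * 1000.0 / openSessions;
    }

    /** Total bytes held if every open session pins a buffer of the given size. */
    static long bufferedBytes(long openSessions, long kilobytesPerSession) {
        return openSessions * kilobytesPerSession * 1024L;
    }

    public static void main(String[] args) {
        long openSessions = 7_000;            // from the monitor page
        double assumedLifetimeSeconds = 60.0; // assumption: idle timeout of about a minute
        long assumedBufferKb = 300;           // assumption: "a few hundred kilobytes"

        System.out.printf("%.1f ms between scans%n",
                millisBetweenScans(openSessions, assumedLifetimeSeconds));
        System.out.printf("%.1f GiB buffered%n",
                bufferedBytes(openSessions, assumedBufferKb) / (1024.0 * 1024.0 * 1024.0));
    }
}
```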

I think the REPO change would actually be the simplest. One-off iterators 
almost never come without bugs, and I think you could use a static 
soft-reference cache with a few lines of code to get the desired scan range 
limit. However, we might also consider reducing the polling frequency and/or 
sharing a polling mechanism between multiple CompactionDrivers (the latter 
might actually involve a custom iterator similar to the one used in the tablet 
watcher).
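
To make the cache idea concrete, here is a minimal sketch of what a static 
soft-reference cache might look like. The class name, the choice of the FATE 
transaction id as the key, and the String end row as the cached value are all 
hypothetical; nothing here is taken from the actual CompactionDriver code.

```java
import java.lang.ref.SoftReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

// Hypothetical helper: remembers the computed scan limit per transaction
// without adding a field to the serialized REPO.
class ScanLimitCache {
    private static final ConcurrentMap<Long, SoftReference<String>> CACHE =
            new ConcurrentHashMap<>();

    /** Returns the cached limit for txid, calling loader only on a miss. */
    static String scanEndRow(long txid, Supplier<String> loader) {
        SoftReference<String> ref = CACHE.get(txid);
        String cached = (ref == null) ? null : ref.get();
        if (cached == null) {
            // Either never computed or cleared by the GC under memory pressure.
            // Two threads may both load here; the duplicate work is harmless.
            cached = loader.get();
            CACHE.put(txid, new SoftReference<>(cached));
        }
        return cached;
    }

    /** Drops the entry once the transaction is finished. */
    static void forget(long txid) {
        CACHE.remove(txid);
    }
}
```

Because the values are softly referenced, a cleared entry just costs one extra 
lookup on the next isReady() call, so correctness would not depend on the 
cache.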

> CompactionDriver leaves abandoned metadata scans
> ------------------------------------------------
>
>                 Key: ACCUMULO-4692
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4692
>             Project: Accumulo
>          Issue Type: Bug
>          Components: fate
>            Reporter: Adam Fuchs
>
> We wrote a tool to kick off tablet compactions in the background while 
> minimizing compaction load per-server. The tool uses range compaction on one 
> tablet per call. We're seeing a high number of scans on the metadata table 
> (~7,000 on a ~100 node cluster).
> The metadata query in the isReady() method of CompactionDriver that is used 
> to see if the compaction has completed uses a range that goes to the end of 
> the metadata entries for the given table, but it stops consuming the results 
> of the scanner at the end of the compaction range. isReady gets called in a 
> pretty tight loop, especially with hundreds of compactions running 
> concurrently. Seems like we should limit the scan to the metadata range 
> associated with the compaction so that the scan can get cleaned up quickly.
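
The difference between the open-ended and the bounded scan described above can 
be illustrated with a toy model. This sketch uses a sorted map in place of the 
metadata table and does not call any Accumulo API; the class and method names 
are made up, and it ignores splits and merges. It only relies on the metadata 
row layout (tableId;endRow, with tableId< for the last tablet).

```java
import java.util.Iterator;
import java.util.NavigableMap;
import java.util.TreeMap;

class BoundedMetadataScan {
    /** Toy metadata rows for table id "2"; "2<" is the last tablet. */
    static NavigableMap<String, String> metadata() {
        NavigableMap<String, String> m = new TreeMap<>();
        for (String row : new String[] {"2;f", "2;m", "2;t", "2<"}) {
            m.put(row, "tablet");
        }
        return m;
    }

    /** Metadata row of the tablet containing endRow: first row >= tableId;endRow. */
    static String lastRowToScan(NavigableMap<String, String> meta,
                                String tableId, String endRow) {
        return meta.ceilingKey(tableId + ";" + endRow);
    }

    /** Consumes rows up to and including stopRow, then counts what was left unread. */
    static int unreadAfterStop(NavigableMap<String, String> scanned, String stopRow) {
        Iterator<String> it = scanned.keySet().iterator();
        while (it.hasNext()) {
            if (it.next().compareTo(stopRow) >= 0) {
                break; // the caller stops consuming here
            }
        }
        int left = 0;
        while (it.hasNext()) {
            it.next();
            left++;
        }
        return left;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> meta = metadata();
        String stop = lastRowToScan(meta, "2", "k"); // "2;m"

        // Open-ended range: rows remain, so the scan is abandoned mid-stream.
        int openEnded = unreadAfterStop(meta.tailMap("2;", true), stop);
        // Bounded range: the scan is fully consumed and can be cleaned up.
        int bounded = unreadAfterStop(meta.subMap("2;", true, stop, true), stop);

        System.out.println("open-ended unread=" + openEnded + ", bounded unread=" + bounded);
    }
}
```

Note that the limit is the row of the tablet that contains the compaction end 
row, not the end row itself, which is why the limit has to be looked up rather 
than derived directly from the compaction range.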



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
