Hi Andy,

Thanks for the clarification, it certainly makes sense.

Glenn.

On Tue, Mar 6, 2012 at 9:32 AM, Andy Seaborne <[email protected]> wrote:
> On 06/03/12 09:14, Glenn Proctor wrote:
>>
>> Hi
>>
>> I have a TDB instance (0.8.10) containing about 207m triples. I've run
>> tdbstats and moved stats.opt into the appropriate place.
>>
>> I've noticed that running the same query multiple times in succession
>> results in successively shorter query times, up to a point. For
>> example, on an otherwise-idle TDB instance, the query
>>
>> SELECT ?facet ?val (COUNT(?val) as ?vc) WHERE { ?id a ?val . ?id
>> ?facet ?val . } GROUP BY ?facet ?val ORDER BY DESC(?vc) LIMIT 25
>>
>> Takes 3707s, then 1424s, then 345s where it seems to stay for subsequent
>> runs.
>>
>> What's the reason for this initial improvement and subsequent tailing
>> off - are the indexes being optimised with every query?
>>
>> Glenn.
>
>
> Glenn,
>
> Nothing so clever I'm afraid. I think what you're seeing is the OS management
> of memory mapped files.
>
> The first run, if a cold system or if queries that have touched different
> parts of indexes, will cause the memory mapped pages to become mapped and
> this is also caching index data in memory. The latter runs benefit from the
> OS caching. If the intermediate results are large for the sort, then it's
> spilling to disk, also with possible OS cache effects.
>
> Andy
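
For anyone wanting to observe the OS caching effect described above outside of TDB, here is a minimal sketch in Python. It is not how TDB works internally; it only memory-maps a file and times repeated passes that touch every page. The helper names (touch_all_pages, time_runs, make_demo_file) and the 8 MB file size are my own assumptions for illustration.

```python
import mmap
import os
import tempfile
import time


def touch_all_pages(path):
    """Memory-map the file read-only and read one byte from every page."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            total = 0
            # Reading one byte per page forces the OS to fault that page in.
            for offset in range(0, len(m), mmap.PAGESIZE):
                total += m[offset]
            return total


def time_runs(path, runs=3):
    """Return a list of (elapsed_seconds, checksum) for repeated passes."""
    results = []
    for _ in range(runs):
        start = time.perf_counter()
        checksum = touch_all_pages(path)
        results.append((time.perf_counter() - start, checksum))
    return results


def make_demo_file(size_bytes=8 * 1024 * 1024):
    """Create a temporary file of random bytes (hypothetical demo data)."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(os.urandom(size_bytes))
    return path


if __name__ == "__main__":
    # Note: a file that was just written is usually already in the OS page
    # cache, so this demo file will look "warm" from the first pass.
    path = make_demo_file()
    try:
        for i, (elapsed, _) in enumerate(time_runs(path), 1):
            print(f"run {i}: {elapsed:.4f}s")
    finally:
        os.remove(path)
```

To see a cold-to-warm difference, pass time_runs the path of a large existing file that has not been read recently, such as one of the index files, instead of the freshly created demo file. How large the difference is depends on the OS, the free memory and the storage device.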
