Hi Andy

Thanks for the clarification; it certainly makes sense.

Glenn.


On Tue, Mar 6, 2012 at 9:32 AM, Andy Seaborne <[email protected]> wrote:
> On 06/03/12 09:14, Glenn Proctor wrote:
>>
>> Hi
>>
>> I have a TDB instance (0.8.10) containing about 207m triples. I've run
>> tdbstats and moved stats.opt into the appropriate place.
>>
>> I've noticed that running the same query multiple times in succession
>> results in successively shorter query times, up to a point. For
>> example, on an otherwise-idle TDB instance, the query
>>
>> SELECT ?facet ?val (COUNT(?val) as ?vc) WHERE { ?id a ?val . ?id
>> ?facet ?val . } GROUP BY ?facet ?val ORDER BY DESC(?vc) LIMIT 25
>>
>> Takes 3707s, then 1424s, then 345s where it seems to stay for subsequent
>> runs.
>>
>> What's the reason for this initial improvement and subsequent tailing
>> off - are the indexes being optimised with every query?
>>
>> Glenn.
>
>
> Glenn,
>
> Nothing so clever I'm afraid. I think what you're seeing is the OS management
> of memory-mapped files.
>
> The first run, on a cold system or after queries that have touched different
> parts of the indexes, causes the memory-mapped pages to be mapped in, which
> also caches index data in memory.  Later runs benefit from the OS caching.
> If the intermediate results for the sort are large, the sort spills to disk,
> again with possible OS cache effects.
>
>        Andy
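
The effect Andy describes can be illustrated outside TDB with a rough sketch. This is not TDB code: the helper names `make_sample_file` and `timed_scan`, the file size, and the 4096-byte page size are all assumptions made for illustration. It memory-maps a file and touches one byte per page, so repeated scans show how much of the cost is the OS bringing pages in. Note that a file that has just been written is probably already in the OS page cache, so a truly cold first run usually needs a file that has not been read recently.

```python
import mmap
import os
import tempfile
import time


def make_sample_file(size_bytes=8 * 1024 * 1024):
    """Create a temporary file of random bytes (a stand-in for an index file)."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(os.urandom(size_bytes))
    return path


def timed_scan(path, page_size=4096):
    """Touch one byte per page of a memory-mapped file.

    Returns (elapsed_seconds, checksum). The page size is an assumed value;
    the real one depends on the platform.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            start = time.perf_counter()
            checksum = 0
            for offset in range(0, len(mm), page_size):
                checksum += mm[offset]  # forces the page to be resident
            elapsed = time.perf_counter() - start
    return elapsed, checksum


if __name__ == "__main__":
    path = make_sample_file()
    try:
        for run in range(1, 4):
            elapsed, _ = timed_scan(path)
            print(f"run {run}: {elapsed:.4f}s")
    finally:
        os.remove(path)
```

The timings depend on the machine and on what else is competing for memory, so the numbers are only indicative. The pattern to look for is the same as in the query times above: the cost falls once the pages are resident and then levels off.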
