Re: Heavy queries followed by light queries

nadav hoze Mon, 02 Sep 2013 06:34:43 -0700

Machine size: 12 GB
OS: Windows Server 2008 64 bit
VM: varies from client to client.
data (in triples): 20,000,000 (3.6 GB)
Heap size: 2 GB
Driver program : ? (didn't understand)
No, the database is on a network shared drive (a different server).


pattern matching (where clause):

?ontologyConcept schema:code @concept.code^^xsd:string .
?ontologyConcept schema:codeSystemId @concept.codeSystemId^^xsd:string
OPTIONAL{?ontologyConcept schema:isDeleted ?ontologyConceptDeleted}
FILTER(!bound(?ontologyConceptDeleted) || (bound(?ontologyConceptDeleted) && ?ontologyConceptDeleted = false))
{
  ?child relations:subClassOf ?ontologyConcept .
  OPTIONAL{?child schema:isDeleted ?childDeleted}
  FILTER(!bound(?childDeleted) || (bound(?childDeleted) && ?childDeleted = false))
  ?concept relations:equalsTo ?child .
  OPTIONAL{?concept schema:isDeleted ?conceptDeleted}
  FILTER(!bound(?conceptDeleted) || (bound(?conceptDeleted) && ?conceptDeleted = false))
  ?concept rdf:type schema:Concept
}
UNION
{
  ?concept relations:equalsTo ?ontologyConcept .
  ?concept rdf:type schema:Concept
  OPTIONAL{?concept schema:isDeleted ?conceptDeleted}
  FILTER(!bound(?conceptDeleted) || (bound(?conceptDeleted) && ?conceptDeleted = false))
}

Basically, all of this is there to find all child concepts of a specified
parent concept, identified by concept.code and concept.codeSystemId.
The @concept.code and @concept.codeSystemId placeholders you see are
replaced at runtime with actual values.
All of the OPTIONAL sections are there to skip logically deleted concepts;
a concept with no isDeleted value bound is treated as not deleted.
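
For readers following the thread, here is a sketch of the same where clause as a complete query with a shorter deleted-check. It relies on SPARQL's error handling in FILTER: when the variable is unbound, `!bound(?x)` is already true, so the extra `bound(?x) &&` guard is not needed. The `schema:` and `relations:` prefix IRIs, the `SELECT ?concept` projection, and the literals "C123" and "SYS1" are assumptions for illustration only; none of them appear in the original message. It is meant to return the same results, not to address the slowdown.

```sparql
# Hypothetical prefix IRIs for schema: and relations: (not given in this thread).
PREFIX rdf:       <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX xsd:       <http://www.w3.org/2001/XMLSchema#>
PREFIX schema:    <http://example.org/schema#>
PREFIX relations: <http://example.org/relations#>

# The projection is an assumption; the original message shows only the where clause.
SELECT ?concept
WHERE {
  # "C123" and "SYS1" are made-up stand-ins for @concept.code and @concept.codeSystemId.
  ?ontologyConcept schema:code "C123"^^xsd:string .
  ?ontologyConcept schema:codeSystemId "SYS1"^^xsd:string .
  OPTIONAL { ?ontologyConcept schema:isDeleted ?ontologyConceptDeleted }
  # Keep the parent when isDeleted is absent or false.
  FILTER(!bound(?ontologyConceptDeleted) || ?ontologyConceptDeleted = false)
  {
    # Branch 1: concepts equal to a non-deleted child of the parent.
    ?child relations:subClassOf ?ontologyConcept .
    OPTIONAL { ?child schema:isDeleted ?childDeleted }
    FILTER(!bound(?childDeleted) || ?childDeleted = false)
    ?concept relations:equalsTo ?child .
    OPTIONAL { ?concept schema:isDeleted ?conceptDeleted }
    FILTER(!bound(?conceptDeleted) || ?conceptDeleted = false)
    ?concept rdf:type schema:Concept
  }
  UNION
  {
    # Branch 2: concepts equal to the parent itself.
    ?concept relations:equalsTo ?ontologyConcept .
    ?concept rdf:type schema:Concept
    OPTIONAL { ?concept schema:isDeleted ?conceptDeleted }
    FILTER(!bound(?conceptDeleted) || ?conceptDeleted = false)
  }
}
```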

Thanks,

Nadav

On Mon, Sep 2, 2013 at 4:14 PM, Andy Seaborne <[email protected]> wrote:

> On 02/09/13 12:51, nadav hoze wrote:
>
>> hi,
>>
>> We are stress testing our service, whose underlying data layer is Jena TDB.
>> One of our tests is to run heavy queries for a long time (about 6 hours) and
>> afterwards run light queries (we have clients that work in that mode).
>> What we see is a huge performance degradation: light queries that usually
>> take around 0.1-0.2 sec took more than 3 seconds after the heavy queries
>> had run.
>>
>
> Not surprising - the heavy queries will have taken over the OS
> cache (assuming 64 bit - a similar effect occurs on 32 bit).  The
> light-after-heavy is effectively running cold.
>
>> Also, the heavy query execution itself degraded badly after only one minute:
>> each heavy query fetched around 35000 triples, and at first it took
>> between 10-40 seconds (which is OK); afterwards it peaked at
>> 200-8000 seconds.
>> Same thing memory-wise: after a minute usage jumped from 200 MB to 2.2 GB.
>>
>> What I would like to know is whether there could be a memory leak in Jena,
>> or whether Jena objects are cached in some way and maybe we can release them.
>>
>> Here are important details for answering:
>> *jena version: 2.6.4*
>> *tdb version: 0.8.9*
>> *arq: 2.8.7*
>> *we use a single model and no datasets.*
>>
>>
>> Also, could an upgrade to the latest stable Jena version help us here?
>>
>
> You should upgrade anyway. There are bug fixes.  And a different license.
>
>
>
>> Help is much appreciated :)
>>
>>
> All depends on what the heavy query touches in the database (the pattern
> matching part), the size of the machine, whether anything else is running
> on the machine, ...
>
> There are many, many factors:
>
> What size of the machine?
> What OS?
> Is it a VM?
> How much data (in triples) is there in the DB?
> Heap size?
> The driver program is on
> the same machine as the database - does this matter?
> ...
>
>         Andy
>
>
>  Thanks,
>>
>> Nadav
>>
>>
>
