1. Regarding the VM: when I said it varies from client to client, I meant
that some clients use a VM and some don't, but the 12 GB is always for a
single machine.
I also forgot to mention that other processes run on that machine besides
the service that uses Jena, but the service gets its share of resources, so
I don't think this is a lack-of-resources issue.

2. About the matching pattern: here it is again, hopefully readable this
time (I also attached it):

Just a short explanation before you read the matching pattern:
this query should fetch all the triples related by subClassOf to a
given ontologyConcept. Its identifiers are @concept.code and
@concept.codeSystemId, which are placeholders that we replace in
our service.
The OPTIONAL parts you see in the query are for ignoring concepts that
are marked as deleted or not bound to the schema.


?ontologyConcept schema:code @concept.code^^xsd:string .
?ontologyConcept schema:codeSystemId @concept.codeSystemId^^xsd:string
OPTIONAL { ?ontologyConcept schema:isDeleted ?ontologyConceptDeleted }
FILTER(!bound(?ontologyConceptDeleted)
       || (bound(?ontologyConceptDeleted) && ?ontologyConceptDeleted = false))
{
    ?child relations:subClassOf ?ontologyConcept .
    OPTIONAL { ?child schema:isDeleted ?childDeleted }
    FILTER(!bound(?childDeleted)
           || (bound(?childDeleted) && ?childDeleted = false))
    ?concept relations:equalsTo ?child .
    OPTIONAL { ?concept schema:isDeleted ?conceptDeleted }
    FILTER(!bound(?conceptDeleted)
           || (bound(?conceptDeleted) && ?conceptDeleted = false))
    ?concept rdf:type schema:Concept
}
UNION
{
    ?concept relations:equalsTo ?ontologyConcept .
    ?concept rdf:type schema:Concept
    OPTIONAL { ?concept schema:isDeleted ?conceptDeleted }
    FILTER(!bound(?conceptDeleted)
           || (bound(?conceptDeleted) && ?conceptDeleted = false))
}
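
For illustration only, the placeholder replacement described above could be
sketched as follows. This is a minimal Python sketch under stated assumptions,
not the service's actual code: the function names fill_placeholders and
quote_literal and the sample values are hypothetical. One detail worth
handling is that @concept.code is a prefix of @concept.codeSystemId, so a
naive replace of the shorter name first would corrupt the longer one.

```python
import re

# Template fragment taken from the query above; only the two placeholder
# names come from the original message, everything else is illustrative.
TEMPLATE = (
    "?ontologyConcept schema:code @concept.code^^xsd:string .\n"
    "?ontologyConcept schema:codeSystemId @concept.codeSystemId^^xsd:string"
)

# Try the longer name first and require a word boundary, because
# "@concept.code" is a prefix of "@concept.codeSystemId".
PLACEHOLDER = re.compile(r"@concept\.(codeSystemId|code)\b")


def quote_literal(value: str) -> str:
    """Render value as a double-quoted SPARQL string literal."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def fill_placeholders(template: str, code: str, code_system_id: str) -> str:
    """Replace both placeholders with quoted, escaped literals."""
    values = {"code": code, "codeSystemId": code_system_id}
    return PLACEHOLDER.sub(lambda m: quote_literal(values[m.group(1)]), template)


if __name__ == "__main__":
    # Sample values are made up for the demonstration.
    print(fill_placeholders(TEMPLATE, "C123", "SYS-1"))
```

Escaping the value before it is spliced into the query text keeps a quote or
backslash in a code from breaking the query.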

3. About direct mode: we already use it, so no effect there. Is there a
way to clear the in-memory cache of the model?


Thanks,

Nadav


On Mon, Sep 2, 2013 at 6:21 PM, Andy Seaborne wrote:

> On 02/09/13 14:33, nadav hoze wrote:
>
>> Machine size: 12 GB
>> OS: Windows Server 2008 64 bit
>>
>
> I don't have much experience of Windows 64 bit and mmap files - you may
> find running with 32 bit mode a useful datapoint (this does not use memory
> mapped files which, from reading around the web, and anecdotal evidence on
> users@, do not have the same benefits as on Linux).
>
>
>  VM: varies from client to client.
>>
>
> Does this mean that several VMs are running on the same 12G hardware?
> If so, how much RAM is allocated to each VM?
>
>
>  data (in triples): 20,000,000 (3.6 GB)
>> Heap size: 2 GB
>>
>
> How big does the entire JVM process get?  At that scale, the entire DB
> should be mapped into memory
>
>
>  Driver program : ? (didn't understand)
>>
>
> You say the test program is calling TDB directly, so it must be in the
> same JVM.
>
> It may be useful to you to run on native hardware to see what effect VMs
> are having.  It can range from no measurable effect to very significant.
>
>
>  No the database is on a network shared drive (different server).
>>
>> pattern matching (where clause):
>>
>>
> Sorry - this is unreadable and being a partial extract, I can't reformat
> it.
>
>         Andy
>
>  *?ontologyConcept schema:code @concept.code^^xsd:string .*
>> *?ontologyConcept schema:codeSystemId @concept.codeSystemId^^xsd:**
>> string*
>> *OPTIONAL{?ontologyConcept schema:isDeleted ?ontologyConceptDeleted}
>> FILTER(!bound(?**ontologyConceptDeleted) || (bound(?**
>> ontologyConceptDeleted)
>> && ?ontologyConceptDeleted = false))*
>> *{*
>> * ?child relations:subClassOf ?ontologyConcept .*
>> * OPTIONAL{?child schema:isDeleted ?childDeleted}
>>
>> FILTER(!bound(?childDeleted) || (bound(?childDeleted) && ?childDeleted =
>> false))*
>> * ?concept relations:equalsTo ?child .*
>> * OPTIONAL{?concept schema:isDeleted ?conceptDeleted}
>> FILTER(!bound(?conceptDeleted) || (bound(?conceptDeleted) &&
>> ?conceptDeleted = false))*
>> * ?concept rdf:type schema:Concept*
>> *}*
>> *UNION*
>> *{*
>> * ?concept relations:equalsTo ?ontologyConcept .*
>> * ?concept rdf:type schema:Concept*
>> * OPTIONAL{?concept schema:isDeleted ?conceptDeleted}
>> FILTER(!bound(?conceptDeleted) || (bound(?conceptDeleted) &&
>> ?conceptDeleted = false))*
>> *}*
>>
>>
>> Basically, all this fuss is to find all child concepts of a specified
>> parent concept identified by concept.code and concept.codeSystemId.
>> The @concept.code and @concept.codeSystemId you see are replaced at
>> runtime with actual values.
>> All of the OPTIONAL sections you see are there to ignore (logically)
>> deleted or not bound concepts.
>>
>> Thanks,
>>
>> Nadav
>>
>> On Mon, Sep 2, 2013 at 4:14 PM, Andy Seaborne wrote:
>>
>>  On 02/09/13 12:51, nadav hoze wrote:
>>>
>>>  hi,
>>>>
>>>> We are stress testing our service, whose underlying data layer is
>>>> Jena TDB.
>>>> One of our tests is to run heavy queries for a long time (about 6 hours)
>>>> and afterwards run light queries (we have clients that work in that
>>>> mode).
>>>> What we see is a huge performance degradation: light queries, which
>>>> usually took around 0.1-0.2 sec, took more than 3 seconds after the
>>>> heavy queries had run.
>>>>
>>>>
>>> Not surprising - the heavy queries will have taken over the OS
>>> cache (assuming 64 bit - a similar effect occurs on 32 bit).  The
>>> light-after-heavy is effectively running cold.
>>>
>>>> Also, the heavy query execution itself degraded badly after only one
>>>> minute:
>>>> each heavy query fetched around 35,000 triples; in the first minutes
>>>> it took between 10 and 40 seconds (which is OK), and afterwards it
>>>> peaked at 200-8000 seconds.
>>>> Same thing memory-wise: after a minute it jumped from 200 MB to 2.2 GB.
>>>>
>>>> What I would like to know is whether there could be a memory leak in
>>>> Jena, or whether Jena objects are cached in some way so that maybe we
>>>> can release them.
>>>>
>>>> Here are important details for answering:
>>>> *jena version: 2.6.4*
>>>> *tdb version: 0.8.9*
>>>> *arq: 2.8.7*
>>>> *we use a single model and no datasets.*
>>>>
>>>>
>>>> Also, can an upgrade to the latest stable Jena version help us here?
>>>>
>>>>
>>> You should upgrade anyway. There are bug fixes.  And a different license.
>>>
>>>
>>>
>>>  Help is much appreciated :)
>>>>
>>>>
>>> All depends on what the heavy query touches in the database (the
>>> pattern matching part), the size of the machine, whether anything else
>>> is running on the machine, ...
>>>
>>> There are many, many factors:
>>>
>>> What size is the machine?
>>> What OS?
>>> Is it a VM?
>>> How much data (in triples) is there in the DB?
>>> Heap size?
>>> The driver program is on the same machine as the database - does this
>>> matter?
>>> ...
>>>
>>>          Andy
>>>
>>>
>>>   Thanks,
>>>
>>>>
>>>> Nadav
>>>>
>>>>
>>>>
>>>
>>
>
