TDB memory usage

2018-04-03 Thread Laurent Rucquoy
Hello,

We have questions about the TDB memory usage.

1. On some Microsoft Windows 64-bit servers we noticed that TDB uses no
mapped files, while on other servers mapped-file use is substantial. What
triggers the use of mapped files?
2. In some cases, most of the RAM is used by TDB. When other applications
need more RAM, the space used by TDB is not released and we run into
memory-shortage problems. Even if the root cause may be poor memory handling
in the other applications, is there a way to cap TDB's RAM usage so that we
can anticipate such a blocking situation, a bit like setting
'innodb_buffer_pool_size' in MariaDB? (See the sketch just below.)
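
A minimal sketch of the only knob we are aware of, assuming the TDB1
(jena-tdb) API (the same FileMode.direct setting appears in the 2016 threads
further down this archive): forcing direct mode keeps the block caches on the
JVM heap, which can then be capped with -Xmx, instead of in OS-managed
memory-mapped files. Whether this is the recommended way to bound TDB's
footprint is exactly what we would like to know.

// Sketch only - must run before any TDB dataset is opened (TDB1 / jena-tdb).
TDB.getContext().set(SystemTDB.symFileMode, FileMode.direct); // no memory-mapped files
Dataset dataset = TDBFactory.createDataset("D:/tdb");         // path is a placeholder
// The block caches now live on the JVM heap, so the process size is bounded by -Xmx.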

Environment:
- Windows Server 2008R2 (physical machine)
- Apache Jena 3.1.1 (still with this release because of the long
development cycle of our product)

Thank you in advance for your help.

Laurent


Re: Missing solution in SPARQL select result, however this solution exists in the dataset

2017-10-09 Thread Laurent Rucquoy
Hello,

Just to recap: I was able to reload the TDB to be migrated, but my migration
process became unusually slow on the reloaded TDB.
I found a way to get good performance again by querying the union-graph
named model instead of querying the dataset.
My problem seems to be solved for now.

Thank you again for your help.
Laurent


PS
Here is the source code, modified to query the union-graph model:

String sparql = "SELECT ?annotationDimension WHERE {
<
http://www.telemis.com/ImageAnnotation/000b3231-a9c3-42b1-bb71-2d416f729db8-msr>
<http://www.telemis.com/annotationDimension> ?annotationDimension .
}";

Dataset dataset = TDBFactory.createDataset("C:/tdb");
dataset.begin(ReadWrite.READ);
Model model = dataset.getNamedModel("urn:x-arq:UnionGraph")

try {

try (QueryExecution queryExecution =
QueryExecutionFactory.create(sparql, model)) {

ResultSet resultSet = queryExecution.execSelect();

while (resultSet.hasNext()) {

QuerySolution querySolution = resultSet.nextSolution();
RDFNode rdfNode = querySolution.get("annotationDimension");
if(rdfNode != null) {
if(rdfNode.isLiteral()) {
 // ...
} else if(rdfNode.isResource()) {
// ...
}
}
}
}

} finally {
dataset.end();
}
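
An alternative sketch (untested on our side) would be to keep querying the
dataset and instead ask TDB to expose the union of all named graphs as the
default graph for a single execution, via the TDB context symbol:

try (QueryExecution queryExecution = QueryExecutionFactory.create(sparql, dataset)) {
    // TDB.symUnionDefaultGraph comes from org.apache.jena.tdb.TDB.
    queryExecution.getContext().set(TDB.symUnionDefaultGraph, true);
    ResultSet resultSet = queryExecution.execSelect();
    // ... same result handling as above ...
}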



On 29 September 2017 at 18:10, Andy Seaborne <a...@apache.org> wrote:

>
>
> On 29/09/17 16:34, Andy Seaborne wrote:
>
>> On 29 September 2017 at 09:27, Laurent Rucquoy <
>> laurent.rucq...@telemis.com <mailto:laurent.rucq...@telemis.com>> wrote:
>>
>> Hello,
>>
>> After having reloaded the data, we ran our TDB migration process which
>> starts to read the source model inculding the SPARQL queries in
>> question in
>> this discussion.
>> This process part took very long to finish and this seems abnormal
>> for us.
>>
>> 1 - Could it be due to missing indexes ?
>>
>>
>> No
>>
>> It's an issue with the data in the node table or in an index, not missing
>> something.
>>
>> 2 - Is there a tool in Jena to rebuild indexes of a reloaded TDB ?
>>
>>
>> No - TDB indexes are built as data is loaded.
>>
>> 3 - If the original TDB (before having be reloaded) contains corrupted
>> data, is there tool in Jena to identify the concerned corrupted data ?
>>
>>
>> There may not be the the info from but it *may* be possible to find out
>>
>
> There may not be the info but it *may* be possible to find out
>
>
> more (it depends what is broken and how).
>>
>> You can try this to attempt to find out what is wrong - it may not work. I
>> have not tried this but ...
>>
>>
>> (1) With no other processes using the database, take a copy of the
>> database by copying the directory.
>>
>> (2) from the command line:
>>SELECT (count(*) AS ?C) { ?s ?p ?o }
>> (if no named graphs)
>>
>> because that does not access the node table to do the count.
>>
>> If that completes, and the count is right, then try from the command line:
>>
>> SELECT (count(*) AS ?C) { ?s ?p ?o . FILTER(?s != 1234) }
>>
>> because that forces the subject to be retrieved and nothing else.
>>
>> Same for ?p and ?o.
>>
>> This is checking the SPO index which is what ?s ?p ?o will choose.  It is
>> also the index used by backup. At least one of them should show the same
>> error.  Depending on which, it might be possible to fool the system to look
>> elsewhere as well.
>>
>>  Andy
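
A concrete form of those command-line checks, assuming the copied database
sits in C:/tdb-copy and the tdbquery script from the Jena distribution is on
the PATH, might be:

tdbquery --loc=C:/tdb-copy "SELECT (count(*) AS ?C) { ?s ?p ?o }"
tdbquery --loc=C:/tdb-copy "SELECT (count(*) AS ?C) { ?s ?p ?o . FILTER(?s != 1234) }"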
>>
>>
>> Thank you in advance for your help.
>>
>> Regards,
>> Laurent
>>
>>
>>
>> On 28 September 2017 at 16:26, Laurent Rucquoy
>> <laurent.rucq...@telemis.com <mailto:laurent.rucq...@telemis.com>>
>> wrote:
>>
>>  > Hello,
>>  >
>>  > I tested the tdbdump with Jena 3 instead of Jena 2.
>>  > It seemed to finish successfully (and I was able to load this
>> dump into a
>>  > TDB)
>>  >
>>  > Thank you for your help.
>>  >
>>  > Laurent
>>  >
>>  >
>>  > On 28 September 2017 at 14:04, Andy Seaborne <a...@apache.org
>> <mailto:a...@apache.org>> wrote:
>>  >
>>  >>
>>  >>
>>  >> On 28/09/17 09:33, Laurent Rucquoy wrote:
>>  >> ...
>>

Re: Missing solution in SPARQL select result, however this solution exists in the dataset

2017-09-29 Thread Laurent Rucquoy
Hello,

After having reloaded the data, we ran our TDB migration process, which
starts by reading the source model, including the SPARQL queries in question
in this discussion.
This part of the process took very long to finish, which seems abnormal to us.

1 - Could it be due to missing indexes?
2 - Is there a tool in Jena to rebuild the indexes of a reloaded TDB?

3 - If the original TDB (before it was reloaded) contains corrupted data, is
there a tool in Jena to identify the affected data?

Thank you in advance for your help.

Regards,
Laurent



On 28 September 2017 at 16:26, Laurent Rucquoy <laurent.rucq...@telemis.com>
wrote:

> Hello,
>
> I tested the tdbdump with Jena 3 instead of Jena 2.
> It seemed to finish successfully (and I was able to load this dump into a
> TDB)
>
> Thank you for your help.
>
> Laurent
>
>
> On 28 September 2017 at 14:04, Andy Seaborne <a...@apache.org> wrote:
>
>>
>>
>> On 28/09/17 09:33, Laurent Rucquoy wrote:
>> ...
>>
>>>
>>> Note that the wrong behavior discussed here is strange because the given
>>> SPARQL query does not return any data and when I remove the triple
>>> pattern
>>> concerning the "annotationDimension" linked resource object (i.e. not a
>>> literal object) the query returns the expected data (as if the linked
>>> resource object did not exist... but this object exists)
>>>
>>>
>>> I've run the tdbdump on the concerned dataset but the process ended
>>> earlier
>>> than expected with the following stacktrace:
>>>
>>> com.hp.hpl.jena.tdb.TDBException: Unrecognized node id type: 10
>>>  at com.hp.hpl.jena.tdb.store.NodeId.extract(NodeId.java:346)
>>>  at com.hp.hpl.jena.tdb.nodetable.NodeTableInline.getNodeForNode
>>> Id(
>>> NodeTableInline.java:64)
>>>
>>
>> It looks like the database files are damaged in some way - there isn't a
>> "type: 10" NodeId.  It's been a long time but I don't remember any mention
>> of this before for any version of TDB. (It's not the same as the "Invalid
>> NodeId" errors.)
>>
>> All I can think is that at some time in the past, maybe a very long time
>> ago, there was a non-transaction update that didn't get flushed.
>>
>> Or, maybe, have you run a Jena3 TDB on the database before trying to back
>> it up?  I don't see why it would cause that particular message but it is a
>> possibility to consider.
>>
>> Andy
>>
>>
>>  at com.hp.hpl.jena.tdb.lib.TupleLib.triple(TupleLib.java:126)
>>>  at com.hp.hpl.jena.tdb.lib.TupleLib.triple(TupleLib.java:114)
>>>  at com.hp.hpl.jena.tdb.lib.TupleLib.access$000(TupleLib.java:45
>>> )
>>>  at com.hp.hpl.jena.tdb.lib.TupleLib$3.convert(TupleLib.java:76)
>>>  at com.hp.hpl.jena.tdb.lib.TupleLib$3.convert(TupleLib.java:72)
>>>  at org.apache.jena.atlas.iterator.Iter$4.next(Iter.java:299)
>>>  at org.apache.jena.atlas.iterator.Iter$4.next(Iter.java:299)
>>>  at org.apache.jena.atlas.iterator.Iter.next(Iter.java:909)
>>>  at org.apache.jena.atlas.iterator.IteratorCons.next(
>>> IteratorCons.java:92)
>>>  at org.apache.jena.riot.system.StreamRDFLib.quadsToStream(
>>> StreamRDFLib.java:69)
>>>  at org.apache.jena.riot.writer.NQuadsWriter.write(
>>> NQuadsWriter.java:40)
>>>  at org.apache.jena.riot.writer.NQuadsWriter.write(
>>> NQuadsWriter.java:67)
>>>  at org.apache.jena.riot.RDFDataMgr.write$(RDFDataMgr.java:1133)
>>>  at org.apache.jena.riot.RDFDataMgr.write(RDFDataMgr.java:1007)
>>>  at org.apache.jena.riot.RDFDataMgr.write(RDFDataMgr.java:997)
>>>  at tdb.tdbdump.exec(tdbdump.java:50)
>>>  at arq.cmdline.CmdMain.mainMethod(CmdMain.java:101)
>>>  at arq.cmdline.CmdMain.mainRun(CmdMain.java:63)
>>>  at arq.cmdline.CmdMain.mainRun(CmdMain.java:50)
>>>  at tdb.tdbdump.main(tdbdump.java:32)
>>>
>>>
>>> Thank you again for your help.
>>> Sincerely,
>>> Laurent
>>>
>>>
>>>
>>> On 27 September 2017 at 13:27, Lorenz Buehmann <buehm...@informatik.uni-
>>> leipzig.de> wrote:
>>>
>>> Query works for me on the sample data.
>>>>
>>>> Btw, there is an error in the first URI in the OPTIONAL clause. I'd
>>>> suggest to use SPARQL 1.1 VALUES to avoid redundant declaration of the
>>&g

Re: Missing solution in SPARQL select result, however this solution exists in the dataset

2017-09-28 Thread Laurent Rucquoy
Hello,

I tested the tdbdump with Jena 3 instead of Jena 2.
It seemed to finish successfully (and I was able to load this dump into a
TDB)

Thank you for your help.

Laurent


On 28 September 2017 at 14:04, Andy Seaborne <a...@apache.org> wrote:

>
>
> On 28/09/17 09:33, Laurent Rucquoy wrote:
> ...
>
>>
>> Note that the wrong behavior discussed here is strange because the given
>> SPARQL query does not return any data and when I remove the triple pattern
>> concerning the "annotationDimension" linked resource object (i.e. not a
>> literal object) the query returns the expected data (as if the linked
>> resource object did not exist... but this object exists)
>>
>>
>> I've run the tdbdump on the concerned dataset but the process ended
>> earlier
>> than expected with the following stacktrace:
>>
>> com.hp.hpl.jena.tdb.TDBException: Unrecognized node id type: 10
>>  at com.hp.hpl.jena.tdb.store.NodeId.extract(NodeId.java:346)
>>  at com.hp.hpl.jena.tdb.nodetable.NodeTableInline.getNodeForNode
>> Id(
>> NodeTableInline.java:64)
>>
>
> It looks like the database files are damaged in some way - there isn't a
> "type: 10" NodeId.  It's been a long time but I don't remember any mention
> of this before for any version of TDB. (It's not the same as the "Invalid
> NodeId" errors.)
>
> All I can think is that at some time in the past, maybe a very long time
> ago, there was a non-transaction update that didn't get flushed.
>
> Or, maybe, have you run a Jena3 TDB on the database before trying to back
> it up?  I don't see why it would cause that particular message but it is a
> possibility to consider.
>
> Andy
>
>
>  at com.hp.hpl.jena.tdb.lib.TupleLib.triple(TupleLib.java:126)
>>  at com.hp.hpl.jena.tdb.lib.TupleLib.triple(TupleLib.java:114)
>>  at com.hp.hpl.jena.tdb.lib.TupleLib.access$000(TupleLib.java:45)
>>  at com.hp.hpl.jena.tdb.lib.TupleLib$3.convert(TupleLib.java:76)
>>  at com.hp.hpl.jena.tdb.lib.TupleLib$3.convert(TupleLib.java:72)
>>  at org.apache.jena.atlas.iterator.Iter$4.next(Iter.java:299)
>>  at org.apache.jena.atlas.iterator.Iter$4.next(Iter.java:299)
>>  at org.apache.jena.atlas.iterator.Iter.next(Iter.java:909)
>>  at org.apache.jena.atlas.iterator.IteratorCons.next(
>> IteratorCons.java:92)
>>  at org.apache.jena.riot.system.StreamRDFLib.quadsToStream(
>> StreamRDFLib.java:69)
>>  at org.apache.jena.riot.writer.NQuadsWriter.write(
>> NQuadsWriter.java:40)
>>  at org.apache.jena.riot.writer.NQuadsWriter.write(
>> NQuadsWriter.java:67)
>>  at org.apache.jena.riot.RDFDataMgr.write$(RDFDataMgr.java:1133)
>>  at org.apache.jena.riot.RDFDataMgr.write(RDFDataMgr.java:1007)
>>  at org.apache.jena.riot.RDFDataMgr.write(RDFDataMgr.java:997)
>>  at tdb.tdbdump.exec(tdbdump.java:50)
>>  at arq.cmdline.CmdMain.mainMethod(CmdMain.java:101)
>>  at arq.cmdline.CmdMain.mainRun(CmdMain.java:63)
>>  at arq.cmdline.CmdMain.mainRun(CmdMain.java:50)
>>  at tdb.tdbdump.main(tdbdump.java:32)
>>
>>
>> Thank you again for your help.
>> Sincerely,
>> Laurent
>>
>>
>>
>> On 27 September 2017 at 13:27, Lorenz Buehmann <buehm...@informatik.uni-
>> leipzig.de> wrote:
>>
>> Query works for me on the sample data.
>>>
>>> Btw, there is an error in the first URI in the OPTIONAL clause. I'd
>>> suggest to use SPARQL 1.1 VALUES to avoid redundant declaration of the
>>> same URI.
>>>
>>>
>>> On 27.09.2017 11:35, Andy Seaborne wrote:
>>>
>>>> That's a lot of data and it's broken by email.  A small extract to
>>>> illustrate the problem is all that is needed together with a stripped
>>>> down query that shows the effect in question.  Something runnable.
>>>>
>>>> The query is different to the original as well - some of it is matching
>>>> strings so you will need to reload the data.
>>>>
>>>>  Andy
>>>>
>>>>
>>>> On 26/09/17 20:32, Laurent Rucquoy wrote:
>>>>
>>>>> - I will test to reload the data
>>>>> - The last source code is not what I sent before because I removed some
>>>>> specific parts when I transcribed because I thought these parts not
>>>>> relevant for this case but I can be mistaken...
>>>>>
>>>>> - Here is a

Re: Missing solution in SPARQL select result, however this solution exists in the dataset

2017-09-28 Thread Laurent Rucquoy
Hello,

Here is a data sample subset:

<http://www.telemis.com/ImageAnnotation/000b3231-a9c3-42b1-bb71-2d416f729db8-msr> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.telemis.com/ImageAnnotation> .
<http://www.telemis.com/ImageAnnotation/000b3231-a9c3-42b1-bb71-2d416f729db8-msr> <http://www.telemis.com/codeMeaning> "ROI Circle measure" .
<http://www.telemis.com/ImageAnnotation/000b3231-a9c3-42b1-bb71-2d416f729db8-msr> <http://www.telemis.com/codeValue> "MSR-ROI002" .
<http://www.telemis.com/ImageAnnotation/000b3231-a9c3-42b1-bb71-2d416f729db8-msr> <http://www.telemis.com/annotationDimension> <http://www.telemis.com/AnnotationDimension/dim-4ViewAsymR3> .
<http://www.telemis.com/AnnotationDimension/dim-4ViewAsymR3> <http://www.telemis.com/numberOfDimension> "3"^^<http://www.w3.org/2001/XMLSchema#integer> .
<http://www.telemis.com/AnnotationDimension/dim-4ViewAsymR3> <http://www.telemis.com/mprLayout> "4ViewAsymR" .
<http://www.telemis.com/AnnotationDimension/dim-4ViewAsymR3> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.telemis.com/AnnotationDimension> .

Note that the behavior discussed here is strange: the given SPARQL query does
not return any data, but when I remove the triple pattern concerning the
"annotationDimension" linked resource object (i.e. not a literal object), the
query returns the expected data (as if the linked resource object did not
exist... yet this object does exist).


I've run tdbdump on the dataset in question, but the process ended earlier
than expected with the following stack trace:

com.hp.hpl.jena.tdb.TDBException: Unrecognized node id type: 10
at com.hp.hpl.jena.tdb.store.NodeId.extract(NodeId.java:346)
at com.hp.hpl.jena.tdb.nodetable.NodeTableInline.getNodeForNodeId(NodeTableInline.java:64)
at com.hp.hpl.jena.tdb.lib.TupleLib.triple(TupleLib.java:126)
at com.hp.hpl.jena.tdb.lib.TupleLib.triple(TupleLib.java:114)
at com.hp.hpl.jena.tdb.lib.TupleLib.access$000(TupleLib.java:45)
at com.hp.hpl.jena.tdb.lib.TupleLib$3.convert(TupleLib.java:76)
at com.hp.hpl.jena.tdb.lib.TupleLib$3.convert(TupleLib.java:72)
at org.apache.jena.atlas.iterator.Iter$4.next(Iter.java:299)
at org.apache.jena.atlas.iterator.Iter$4.next(Iter.java:299)
at org.apache.jena.atlas.iterator.Iter.next(Iter.java:909)
at org.apache.jena.atlas.iterator.IteratorCons.next(IteratorCons.java:92)
at org.apache.jena.riot.system.StreamRDFLib.quadsToStream(StreamRDFLib.java:69)
at org.apache.jena.riot.writer.NQuadsWriter.write(NQuadsWriter.java:40)
at org.apache.jena.riot.writer.NQuadsWriter.write(NQuadsWriter.java:67)
at org.apache.jena.riot.RDFDataMgr.write$(RDFDataMgr.java:1133)
at org.apache.jena.riot.RDFDataMgr.write(RDFDataMgr.java:1007)
at org.apache.jena.riot.RDFDataMgr.write(RDFDataMgr.java:997)
at tdb.tdbdump.exec(tdbdump.java:50)
at arq.cmdline.CmdMain.mainMethod(CmdMain.java:101)
at arq.cmdline.CmdMain.mainRun(CmdMain.java:63)
at arq.cmdline.CmdMain.mainRun(CmdMain.java:50)
at tdb.tdbdump.main(tdbdump.java:32)


Thank you again for your help.
Sincerely,
Laurent



On 27 September 2017 at 13:27, Lorenz Buehmann <buehm...@informatik.uni-
leipzig.de> wrote:

> Query works for me on the sample data.
>
> Btw, there is an error in the first URI in the OPTIONAL clause. I'd
> suggest to use SPARQL 1.1 VALUES to avoid redundant declaration of the
> same URI.
>
>
> On 27.09.2017 11:35, Andy Seaborne wrote:
> > That's a lot of data and it's broken by email.  A small extract to
> > illustrate the problem is all that is needed together with a stripped
> > down query that shows the effect in question.  Something runnable.
> >
> > The query is different to the original as well - some of it is matching
> > strings so you will need to reload the data.
> >
> > Andy
> >
> >
> > On 26/09/17 20:32, Laurent Rucquoy wrote:
> >> - I will test to reload the data
> >> - The last source code is not what I sent before because I removed some
> >> specific parts when I transcribed because I thought these parts not
> >> relevant for this case but I can be mistaken...
> >>
> >> - Here is a data sample:
> >>
> >> <http://www.telemis.com/CalculationCollection> <
> >> http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <
> >> http://www.w3.org/2000/01/rdf-schema#Class> .
> >> <http://www.telemis.com/CalculationCollection> <
> >> http://thewebsemantic.com/javaclass>
> >> "com.telemis.core.aim.base.CalculationCollec

Re: Missing solution in SPARQL select result, however this solution exists in the dataset

2017-09-26 Thread Laurent Rucquoy
- Did you reload the data?
No, I did not know that. How can I reload the data?

- We are using Apache Jena 3.1.1 instead of 3.4.0 because it is integrated in
a product with a long release cycle...

- Is there a transaction?
Yes, all the queries are made in transactions.


- Here is a minimal example

String sparql = "SELECT ?annotationDimension WHERE {
<
http://www.telemis.com/ImageAnnotation/000b3231-a9c3-42b1-bb71-2d416f729db8-msr>
<http://www.telemis.com/annotationDimension> ?annotationDimension .
}";

Dataset dataset = TDBFactory.createDataset("C:/tdb");
dataset.begin(ReadWrite.READ);

try {

try (QueryExecution queryExecution =
QueryExecutionFactory.create(sparql, dataset)) {

ResultSet resultSet = queryExecution.execSelect();

while (resultSet.hasNext()) {  // <=== No result

QuerySolution querySolution = resultSet.nextSolution();
RDFNode rdfNode = querySolution.get("annotationDimension");
if(rdfNode != null) {
if(rdfNode.isLiteral()) {
 // ...
} else if(rdfNode.isResource()) {
// ...
}
}
}
}

} finally {
dataset.end();
}


- Do you need a larger data sample?


Thank you for your help.

Laurent




On 26 September 2017 at 16:40, Andy Seaborne <a...@apache.org> wrote:

> Did you reload the data?
>
> https://jena.apache.org/documentation/migrate_jena2_jena3.html
>
>
>
> On 26/09/17 15:08, Laurent Rucquoy wrote:
>
>> Hello,
>>
>> We are currently migrating an Apache Jena 2.10.1 (jena-tdb 0.10.1) dataset
>> to a new model with Apache Jena 3.1.1
>>
>
> 3.1.1 is old - 3.4.0 is current.
>
>
>> Some SPARQL select queries on the source dataset don't return any solution
>> when the where pattern includes a triple having a resource as object
>> (others patterns have a literal as object)... however we are sure that the
>> missing solution exists.
>>
>> Here is an example of query where ?annotationDimension is a resource:
>>
>> SELECT ?annotationDimension
>> WHERE {
>>  <
>> http://www.telemis.com/ImageAnnotation/000b3231-a9c3-42b1-
>> bb71-2d416f729db8-msr>
>> <http://www.telemis.com/annotationDimension> ?annotationDimension .
>> }
>>
>> These queries are executed through
>>
>>
> Is there a transaction?
>
> QueryExecution queryExecution = QueryExecutionFactory.create(sparql,
>> dataset, initialJenaQuerySolutionMap)
>>
>^^^
>
> ResultSet resultSet = queryExecution.execSelect();
>>
>>
>> What can I do to retrieve the missing solution ?
>>
>> Thank you in advance for your help.
>>
>
> Please provide a complete, minimal example.  As described, I can't
> recreate a test case because it is about the data.
>
> Andy
>
>
>> Regards,
>> Laurent
>>
>>


Missing solution in SPARQL select result, however this solution exists in the dataset

2017-09-26 Thread Laurent Rucquoy
Hello,

We are currently migrating an Apache Jena 2.10.1 (jena-tdb 0.10.1) dataset
to a new model with Apache Jena 3.1.1

Some SPARQL SELECT queries on the source dataset don't return any solution
when the WHERE pattern includes a triple that has a resource as object (the
other patterns have a literal as object)... however, we are sure that the
missing solution exists.

Here is an example of query where ?annotationDimension is a resource:

SELECT ?annotationDimension
WHERE {
  <http://www.telemis.com/ImageAnnotation/000b3231-a9c3-42b1-bb71-2d416f729db8-msr>
    <http://www.telemis.com/annotationDimension> ?annotationDimension .
}

These queries are executed through

QueryExecution queryExecution = QueryExecutionFactory.create(sparql,
dataset, initialJenaQuerySolutionMap);
ResultSet resultSet = queryExecution.execSelect();
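
For context, a sketch of how an initial binding map such as
initialJenaQuerySolutionMap can be built (the variable and value bound here
are only an illustration, not our real bindings):

QuerySolutionMap initialJenaQuerySolutionMap = new QuerySolutionMap();
// Illustrative binding only.
initialJenaQuerySolutionMap.add("annotationDimension",
        ResourceFactory.createResource("http://www.telemis.com/AnnotationDimension/dim-4ViewAsymR3"));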


What can I do to retrieve the missing solution ?

Thank you in advance for your help.

Regards,
Laurent


Re: SPARQL to check if a specific URI exists

2017-05-16 Thread Laurent Rucquoy
Thank you for these clarifications.

Laurent


On 16 May 2017 at 07:58, Lorenz B. 
wrote:

> It depends on the definition of "existence" in an RDF graph. What if the
> URI resp. resource only occurs in the object position of a triple? Then
> you'd need something like this:
>
> ASK WHERE {
> { ?p ?o . }
> UNION
> {?s ?p  .  }
> }
> > Hello,
> >
> > I want to write a SPARQL query to attest the existence of a specific URI
> in
> > a TDB.
> >
> > Is the following query the right way to do this (considering notably the
> > execution performances with big volumes) ?
> >
> > ASK WHERE {
> >  ?p ?o .
> > }
> >
> > Thank you in advance for your help,
> >
> > Laurent
> >
> --
> Lorenz Bühmann
> AKSW group, University of Leipzig
> Group: http://aksw.org - semantic web research center
>
>


SPARQL to check if a specific URI exists

2017-05-15 Thread Laurent Rucquoy
Hello,

I want to write a SPARQL query to check whether a specific URI exists in
a TDB.

Is the following query the right way to do this (notably considering
execution performance with large volumes)?

ASK WHERE {
 ?p ?o .
}
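
For completeness, a sketch of running such a check from Java with a
hypothetical URI, using the subject-or-object pattern suggested in the reply
above:

String uri = "http://example.org/resource/123";  // hypothetical URI
String ask = "ASK WHERE { { <" + uri + "> ?p ?o } UNION { ?s ?p <" + uri + "> } }";

Dataset dataset = TDBFactory.createDataset("C:/tdb");
dataset.begin(ReadWrite.READ);
try (QueryExecution queryExecution = QueryExecutionFactory.create(ask, dataset)) {
    boolean exists = queryExecution.execAsk();  // true if the URI occurs as subject or object
    System.out.println(exists);
} finally {
    dataset.end();
}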

Thank you in advance for your help,

Laurent


Re: QueryParseException: java.lang.StackOverflowError

2017-03-01 Thread Laurent Rucquoy
Thank you for your help.
I'll try the ARQ parser.
Sincerely,
Laurent
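
PS - for reference, a sketch of what using the ARQ parser could look like in
code (untested on our data; UpdateFactory and UpdateRequest are in
org.apache.jena.update, Syntax in org.apache.jena.query):

// Parse the grouped update with the extended ARQ grammar instead of the
// strict SPARQL 1.1 grammar, then execute it against the model.
UpdateRequest request = UpdateFactory.create(updateString, Syntax.syntaxARQ);
UpdateAction.execute(request, model);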


On 22 February 2017 at 17:12, Andy Seaborne <a...@apache.org> wrote:

> Now that standards are settled, we could consider making SPARQL 1.1 have
> this change.  It is not a language change, it is only a grammar change
> (different grammar accepting and rejecting the same language).
>
> The spec grammar is simple LL(1) - the rewrite uses a javacc feature and
> so it's not pure LL(1).
>
> (if Java had tail recursion optimization this would not be necessary ...)
>
> Andy
>
>
> On 22/02/17 14:44, Rob Vesse wrote:
>
>> The standard SPARQL 1.1 parser strictly follows the grammar. The grammar
>> is recursive, which can lead to stack overflow errors when there are too
>> many individual updates within the overall update request. If you use the
>> ARQ version of the parser, it uses a slightly modified version of the
>> grammar which avoids this issue.
>>
>>  Rob
>>
>> On 22/02/2017 14:21, "Laurent Rucquoy" <laurent.rucq...@telemis.com>
>> wrote:
>>
>> Hello,
>>
>> We are currently working on TDB data migrations.
>> These migrations imply massive writing operations using SPARQL update
>> queries passed through UpdateAction.parseExecute(String
>> updateString, Model
>> model)
>>
>> To improve migrations runtime duration, we decided to group several
>> "unit"
>> update queries per transaction (please see the "unit" update query
>> sample
>> here below.)
>> We noted that when the "unit" update queries group size passes a
>> threshold
>> (about 250 "unit" updates), we get the stack overflow errors (please
>> see
>> the corresponding stacktrace here below.)
>>
>> Do I do something wrong?
>> Are there recommendations about some limits?
>>
>> Thank you in advance for your help.
>> Best regards,
>> Laurent
>>
>>
>>
>> "Unit" update query sample:
>>
>>
>> DELETE {
>> <http://company.com/data/Annotation/ffa19568-9300-477f-4d5c-
>> c2e715babac2>
>> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?annotationType ;
>> <http://company.com/model/updated> ?annotationUpdatedTime .
>> }
>> INSERT {
>> <http://company.com/data/Annotation/ffa19568-9300-477f-4d5c-
>> c2e715babac2>
>> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <
>> http://company.com/model/Annotation> ;
>> <http://company.com/model/updated>
>> "2017-02-17T15:49:52.705Z"^^<
>> http://www.w3.org/2001/XMLSchema#dateTime> .
>> }
>> WHERE {
>> OPTIONAL {
>> <
>> http://company.com/data/Annotation/ffa19568-9300-477f-4d5c-
>> c2e715babac2> <
>> http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?annotationType .
>> }
>> OPTIONAL {
>> <
>> http://company.com/data/Annotation/ffa19568-9300-477f-4d5c-
>> c2e715babac2> <
>> http://company.com/model/updated> ?annotationUpdatedTime .
>> }
>> } ;
>> DELETE {
>> <http://company.com/data/Annotation/ffa19568-9300-477f-4d5c-
>> c2e715babac2>
>> <http://company.com/model/modifiedBy> ?annotationUser .
>> }
>> INSERT {
>> <http://company.com/data/Annotation/ffa19568-9300-477f-4d5c-
>> c2e715babac2>
>> <http://company.com/model/modifiedBy> <
>> http://company.com/data/User/Default/0684d3b0-dba8-43c8-9177
>> -eb5892375032> .
>> }
>> WHERE {
>> OPTIONAL {
>> <
>> http://company.com/data/Annotation/ffa19568-9300-477f-4d5c-
>> c2e715babac2> <
>> http://company.com/model/modifiedBy> ?annotationUser .
>> }
>> } ;
>> DELETE {
>> <
>> http://company.com/data/User/Default/0684d3b0-dba8-43c8-9177
>> -eb5892375032> <
>> http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?userType ;
>> <http://company.com/model/loginName> ?userLoginName ;
>> <http://company.com/model/uid> ?userUid .
>> }
>> INSERT {
>> <
>> http://company.com/data/User/Default/0684d3b0-dba8-43c8-9177
>> -eb5892375032> <
>

QueryParseException: java.lang.StackOverflowError

2017-02-22 Thread Laurent Rucquoy
Hello,

We are currently working on TDB data migrations.
These migrations involve massive write operations using SPARQL update
queries passed through UpdateAction.parseExecute(String updateString, Model
model).

To reduce the migration run time, we decided to group several "unit" update
queries per transaction (please see the "unit" update query sample below).
We noted that when the size of a group of "unit" update queries passes a
threshold (about 250 "unit" updates), we get stack overflow errors (please
see the corresponding stack trace below).

Do I do something wrong?
Are there recommendations about some limits?
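
One possible workaround (a sketch only, not something taken from the replies)
would be to keep a single transaction per group but parse and execute each
"unit" update separately, so that no single update string grows past the
parser's limits:

// 'unitUpdates' holds the individual update strings; the graph name is a placeholder.
dataset.begin(ReadWrite.WRITE);
try {
    Model model = dataset.getNamedModel("http://my-model-name");
    for (String unitUpdate : unitUpdates) {
        UpdateAction.parseExecute(unitUpdate, model);  // each string is parsed on its own
    }
    dataset.commit();
} finally {
    dataset.end();
}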

Thank you in advance for your help.
Best regards,
Laurent



"Unit" update query sample:


DELETE {

 ?annotationType ;
 ?annotationUpdatedTime .
}
INSERT {

 <
http://company.com/model/Annotation> ;
 "2017-02-17T15:49:52.705Z"^^<
http://www.w3.org/2001/XMLSchema#dateTime> .
}
WHERE {
OPTIONAL {
<
http://company.com/data/Annotation/ffa19568-9300-477f-4d5c-c2e715babac2> <
http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?annotationType .
}
OPTIONAL {
<
http://company.com/data/Annotation/ffa19568-9300-477f-4d5c-c2e715babac2> <
http://company.com/model/updated> ?annotationUpdatedTime .
}
} ;
DELETE {

 ?annotationUser .
}
INSERT {

 <
http://company.com/data/User/Default/0684d3b0-dba8-43c8-9177-eb5892375032> .
}
WHERE {
OPTIONAL {
<
http://company.com/data/Annotation/ffa19568-9300-477f-4d5c-c2e715babac2> <
http://company.com/model/modifiedBy> ?annotationUser .
}
} ;
DELETE {
<
http://company.com/data/User/Default/0684d3b0-dba8-43c8-9177-eb5892375032> <
http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?userType ;
 ?userLoginName ;
 ?userUid .
}
INSERT {
<
http://company.com/data/User/Default/0684d3b0-dba8-43c8-9177-eb5892375032> <
http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <
http://company.com/model/User> ;
 "***" ;

"0684d3b0-dba8-43c8-9177-eb5892375032" .
}
WHERE {
OPTIONAL {
<
http://company.com/data/User/Default/0684d3b0-dba8-43c8-9177-eb5892375032> <
http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?userType .
}
OPTIONAL {
<
http://company.com/data/User/Default/0684d3b0-dba8-43c8-9177-eb5892375032> <
http://company.com/model/loginName> ?userLoginName .
}
OPTIONAL {
<
http://company.com/data/User/Default/0684d3b0-dba8-43c8-9177-eb5892375032> <
http://company.com/model/uid> ?userUid .
}
} ;
DELETE {

 ?target .
}
INSERT {

 <
http://company.com/data/Image/1.2.840.114619.2.327.3.2474926736.930.1452584764.259.503>
.
}
WHERE {
OPTIONAL {
<
http://company.com/data/Annotation/ffa19568-9300-477f-4d5c-c2e715babac2> <
http://company.com/model/hasTarget> ?target .
}
} ;
DELETE {
<
http://company.com/data/Image/1.2.840.114619.2.327.3.2474926736.930.1452584764.259.503>
 ?imageType ;
 ?imageSopInstanceUid .
}
INSERT {
<
http://company.com/data/Image/1.2.840.114619.2.327.3.2474926736.930.1452584764.259.503>
 <
http://company.com/model/Image> ;

"1.2.840.114619.2.327.3.2474926736.930.1452584764.259.503" .
}
WHERE {
OPTIONAL {
<
http://company.com/data/Image/1.2.840.114619.2.327.3.2474926736.930.1452584764.259.503>
 ?imageType .
}
OPTIONAL {
<
http://company.com/data/Image/1.2.840.114619.2.327.3.2474926736.930.1452584764.259.503>
 ?imageSopInstanceUid .
}
} ;
DELETE {
<
http://company.com/data/Image/1.2.840.114619.2.327.3.2474926736.930.1452584764.259.503>
 ?imageSeries .
}
INSERT {
<

TDB support/training/courses

2017-01-19 Thread Laurent Rucquoy
Hello,

We are looking for support/training/courses about TDB (modelling, database
optimization, ...).
Do you know a company providing such services?

Thank you in advance,
Laurent


Re: Backup of TDB-backed dataset

2016-10-04 Thread Laurent Rucquoy
Thank you for your help.
Sincerely,
Laurent

On 3 October 2016 at 17:17, Andy Seaborne <a...@apache.org> wrote:

> Laurent,
>
> The best way to take a backup is to be a read-transaction and write
> n-quads (that's what Fuseki does).
>
> In order to go in at the file level, you need to not have a running JVM.
> In 3.1.1, there will be a "lock down" mode to make the on-disk database
> consistent, if the app really must.
>
> The best way to snapshot the disk files in 3.1.0 is to start a write
> transaction, do no writes, and take a disk copy. This is not a feature
> guaranteed in the future.
>
> Taking an RDF-level backup is better.
>
> Andy
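
A minimal sketch of such an RDF-level backup, assuming the TDB1 API used
elsewhere in these mails (paths are placeholders):

Dataset dataset = TDBFactory.createDataset("C:/tdb");
dataset.begin(ReadWrite.READ);
try (OutputStream out = new FileOutputStream("C:/backup/dataset.nq")) {
    // Write every graph of the dataset as N-Quads inside the read transaction.
    RDFDataMgr.write(out, dataset, Lang.NQUADS);
} catch (IOException e) {
    e.printStackTrace();
} finally {
    dataset.end();
}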
>
>
> On 03/10/16 15:36, Laurent Rucquoy wrote:
>
>> Hello,
>>
>> We have a TDB-backed dataset on a Windows server (Apache Jena 3.1.0)
>> We want to make a .zip backup of the TDB folder from the JVM running the
>> TDB.
>> Is it a safe way to do this in order to avoid database corruption ?
>>
>> Thank you in advance for your help.
>>
>> Regards,
>> Laurent
>>
>>


-- 


*Laurent Rucquoy*
R Engineer

laurent.rucq...@telemis.com
Tel: +32 (0) 10 48 00 27
Fax: +32 (0) 10 48 00 20

Telemis
Avenue Athéna 2
1348 Louvain-la-Neuve
Belgium
www.telemis.com
*Extending Human Life*


Backup of TDB-backed dataset

2016-10-03 Thread Laurent Rucquoy
Hello,

We have a TDB-backed dataset on a Windows server (Apache Jena 3.1.0).
We want to make a .zip backup of the TDB folder from the JVM running the
TDB.
Is this a safe way to do it, i.e. does it avoid database corruption?

Thank you in advance for your help.

Regards,
Laurent


Re: Question about RDF collections

2016-09-22 Thread Laurent Rucquoy
Thank you all for your help !

On 22 September 2016 at 14:20, Nikolaos Beredimas <bere...@gmail.com> wrote:

> Just remember that there is no ordering in this.
>
> [Document_A] --contains--> [Paragraph_1]
> [Document_A] --contains--> [Paragraph_2]
> [Document_B] --contains--> [Paragraph_2]
>
> is equivalent to
>
> [Document_A] --contains--> [Paragraph_2]
> [Document_B] --contains--> [Paragraph_2]
> [Document_A] --contains--> [Paragraph_1]
>
>
> On Thu, Sep 22, 2016 at 3:14 PM, Laurent Rucquoy <
> laurent.rucq...@telemis.com> wrote:
>
> > I have a "Document" resource which could contain many "Paragraph"
> > resources.
> > A same "Paragraph" resource could also be contained by different
> "Document"
> > resources.
> >
> > What is the most relevant model to translate such a case in RDF ?
> > I have two solutions:
> >
> > Solution 1 (using RDF collections)
> > [Document_A] --contains--> ( [Paragraph_1], [Paragraph_2] )
> > [Document_B] --contains--> ( [Paragraph_2] )
> >
> > or
> >
> > Solution 2 (defining the same predicate several times on the same
> subject)
> > [Document_A] --contains--> [Paragraph_1]
> > [Document_A] --contains--> [Paragraph_2]
> > [Document_B] --contains--> [Paragraph_2]
> >
> >
> > I think that the Solution 1 requires complex and resource-consuming
> SPARQL
> > update.
> > So to keep it simple, I would choose the Solution 2. But, I don't know if
> > it's safe and if it's a good practice to define the same predicate
> several
> > times on the same subject ?
> >
> > Thank you in advance for your help.
> >
> > Regards,
> > Laurent
> >
>


Question about RDF collections

2016-09-22 Thread Laurent Rucquoy
I have a "Document" resource which could contain many "Paragraph" resources.
A same "Paragraph" resource could also be contained by different "Document"
resources.

What is the most relevant model to translate such a case in RDF ?
I have two solutions:

Solution 1 (using RDF collections)
[Document_A] --contains--> ( [Paragraph_1], [Paragraph_2] )
[Document_B] --contains--> ( [Paragraph_2] )

or

Solution 2 (defining the same predicate several times on the same subject)
[Document_A] --contains--> [Paragraph_1]
[Document_A] --contains--> [Paragraph_2]
[Document_B] --contains--> [Paragraph_2]


I think that Solution 1 requires complex and resource-consuming SPARQL
updates.
So, to keep it simple, I would choose Solution 2. But I don't know whether
it's safe, and whether it's good practice, to define the same predicate
several times on the same subject.
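
For illustration, Solution 2 written with the Jena Model API (all URIs here
are placeholders); repeating the predicate simply produces three independent
statements:

Model m = ModelFactory.createDefaultModel();
Resource docA = m.createResource("http://example.org/Document_A");
Resource docB = m.createResource("http://example.org/Document_B");
Resource par1 = m.createResource("http://example.org/Paragraph_1");
Resource par2 = m.createResource("http://example.org/Paragraph_2");
Property contains = m.createProperty("http://example.org/contains");
m.add(docA, contains, par1);
m.add(docA, contains, par2);
m.add(docB, contains, par2);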

Thank you in advance for your help.

Regards,
Laurent


Re: ConcurrentModificationException

2016-09-15 Thread Laurent Rucquoy
Hi Andy,

I've moved the model.close() call inside the transaction (sorry for this
careless mistake) and it works now!
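
Roughly, the write pattern is now the following sketch (model obtained and
released inside the write transaction, no explicit locks), in line with the
advice quoted below:

dataset.begin(ReadWrite.WRITE);
try {
    Model model = dataset.getNamedModel("http://my-model-name");
    UpdateAction.parseExecute(sparql, model);
    model.close();       // optional, but done before commit/end
    dataset.commit();
} finally {
    dataset.end();
}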

Thank you very much for your help.
Laurent



On 14 September 2016 at 18:28, Andy Seaborne <a...@apache.org> wrote:

> Hi Laurent,
>
> On 14/09/16 15:16, Laurent Rucquoy wrote:
>
>> Hi Andy,
>>
>> Thank you for your help.
>>
>> I have tested the following code according to your suggestions (get the
>> model inside the transaction and remove the use of locks):
>>
>> dataset.begin(ReadWrite.WRITE);
>>
>>> Model model = dataset.getNamedModel("http://my-model-name;);
>>> try {
>>> UpdateAction.parseExecute(sparql, model);
>>>
>>
> What is the update?
>
> if(writeMode) {
>>> dataset.commit();
>>> }
>>> } finally {
>>>
>>> model.close();
>>>
>>
> You are using the model after the commit
>
> dataset.end();
>>> }
>>>
>>
>>
> Andy
>
>
>> When multiple updates are made on our TDB-backed dataset, the java.util.
>> ConcurrentModificationException is still thrown, leaving the dataset in
>> an
>>
>> inaccessible state.
>> Here is the stacktrace part:
>>
>> Caused by: java.util.ConcurrentModificationException: Iterator: started
>> at
>>
>>> 5, now 8
>>> at
>>> org.apache.jena.tdb.sys.DatasetControlMRSW.policyError(Datas
>>> etControlMRSW.java:157)
>>> at
>>> org.apache.jena.tdb.sys.DatasetControlMRSW.access$000(Datase
>>> tControlMRSW.java:32)
>>> at
>>> org.apache.jena.tdb.sys.DatasetControlMRSW$IteratorCheckNotC
>>> oncurrent.checkCourrentModification(DatasetControlMRSW.java:110)
>>> at
>>> org.apache.jena.tdb.sys.DatasetControlMRSW$IteratorCheckNotC
>>> oncurrent.hasNext(DatasetControlMRSW.java:118)
>>> at org.apache.jena.atlas.iterator.Iter$2.hasNext(Iter.java:265)
>>> at org.apache.jena.atlas.iterator.Iter.hasNext(Iter.java:870)
>>> at org.apache.jena.atlas.iterator.Iter$1.hasNext(Iter.java:192)
>>> at org.apache.jena.atlas.iterator.Iter.hasNext(Iter.java:870)
>>> at
>>> org.apache.jena.atlas.iterator.RepeatApplyIterator.hasNext(
>>> RepeatApplyIterator.java:58)
>>> at
>>> org.apache.jena.tdb.solver.SolverLib$IterAbortable.hasNext(
>>> SolverLib.java:195)
>>> at org.apache.jena.atlas.iterator.Iter$2.hasNext(Iter.java:265)
>>> at
>>> org.apache.jena.sparql.engine.iterator.QueryIterPlainWrapper
>>> .hasNextBinding(QueryIterPlainWrapper.java:53)
>>> at
>>> org.apache.jena.sparql.engine.iterator.QueryIteratorBase.has
>>> Next(QueryIteratorBase.java:111)
>>> at
>>> org.apache.jena.sparql.engine.iterator.QueryIterRepeatApply.
>>> makeNextStage(QueryIterRepeatApply.java:101)
>>> at
>>> org.apache.jena.sparql.engine.iterator.QueryIterRepeatApply.
>>> hasNextBinding(QueryIterRepeatApply.java:65)
>>> at
>>> org.apache.jena.sparql.engine.iterator.QueryIteratorBase.has
>>> Next(QueryIteratorBase.java:111)
>>> at org.apache.jena.atlas.iterator.Iter$2.hasNext(Iter.java:265)
>>> at
>>> org.apache.jena.atlas.iterator.RepeatApplyIterator.hasNext(
>>> RepeatApplyIterator.java:45)
>>> at
>>> org.apache.jena.tdb.solver.SolverLib$IterAbortable.hasNext(
>>> SolverLib.java:195)
>>> at org.apache.jena.atlas.iterator.Iter$2.hasNext(Iter.java:265)
>>> at
>>> org.apache.jena.sparql.engine.iterator.QueryIterPlainWrapper
>>> .hasNextBinding(QueryIterPlainWrapper.java:53)
>>> at
>>> org.apache.jena.sparql.engine.iterator.QueryIteratorBase.has
>>> Next(QueryIteratorBase.java:111)
>>> at org.apache.jena.atlas.iterator.Iter$2.hasNext(Iter.java:265)
>>> at
>>> org.apache.jena.atlas.iterator.RepeatApplyIterator.hasNext(
>>> RepeatApplyIterator.java:45)
>>> at
>>> org.apache.jena.tdb.solver.SolverLib$IterAbortable.hasNext(
>>> SolverLib.java:195)
>>> at org.apache.jena.atlas.iterator.Iter$2.hasNext(Iter.java:265)
>>> at
>>> org.apache.jena.sparql.engine.iterator.QueryIterPlainWrapper
>>> .hasNextBinding(QueryIterPlainWrapper.java:53)
>>> at
>>> org.apache.jena.sparql.engine.iterator.QueryIteratorBase.has
>>> Next(QueryIteratorBase.java:111)
>>> at
>>> org.apache.jena.sparql.engine.iterator.QueryIterConcat.hasNe
>>> xtBinding(Qu

Re: ConcurrentModificationException

2016-09-14 Thread Laurent Rucquoy
engine.ResultSetCheckCondition.hasNext(ResultSetCheckCondition.java:59)



Regards,
Laurent



On 14 September 2016 at 11:36, Andy Seaborne <a...@apache.org> wrote:

> Hi Laurant,
>
> Try getting the model inside transaction, not passing it across
> transaction boundaries.
>
> dataset.begin(ReadWrite.WRITE);
> try {
>   Model model = dataset.get...
>   ...
>   dataset.commit() ;
> } ...
>
> In 3.1.0, the model is (sort of) connected to the transaction in which it
> is created.  This is fixed in the next release but, style-wise, because
> models are just views of the database, they are related to transactions.
>
> No need to close the model but it's harmless to do so.
>
> You don't need the locking as well as transactions.
>
> Andy
>
>
>
> On 14/09/16 08:59, Laurent Rucquoy wrote:
>
>> Hello,
>>
>> We use a Jena TDB-backed dataset (release 3.1.0) accessed through a single
>> JVM multi-threaded application running generally on a Microsoft Windows
>> server.
>>
>> The read and write accesses are made using transaction and read/write
>> locks.
>> Here is the SPARQL update code we use:
>>
>> dataset.begin(ReadWrite.WRITE);
>>
>>> model.enterCriticalSection(Lock.WRITE);
>>> try {
>>> UpdateAction.parseExecute(sparql, model);
>>> if(writeMode) {
>>> dataset.commit();
>>> }
>>> } finally {
>>> model.leaveCriticalSection();
>>> model.close();
>>> dataset.end();
>>> }
>>>
>>
>>
>> Note that our SPARQL query code implements also read transactions and
>> locks.
>>
>> When multiple updates are made on our TDB-backed dataset, a
>> java.util.ConcurrentModificationException is sometimes thrown, leaving
>> the
>> dataset in an inaccessible state.
>> Here is the stacktrace part:
>>
>>
>> ...
>>> Caused by: java.util.ConcurrentModificationException: Reader = 1,
>>> Writer =
>>> 1
>>> at
>>> org.apache.jena.tdb.sys.DatasetControlMRSW.policyError(Datas
>>> etControlMRSW.java:157)
>>> at
>>> org.apache.jena.tdb.sys.DatasetControlMRSW.policyError(Datas
>>> etControlMRSW.java:152)
>>> at
>>> org.apache.jena.tdb.sys.DatasetControlMRSW.checkConcurrency(
>>> DatasetControlMRSW.java:79)
>>> at
>>> org.apache.jena.tdb.sys.DatasetControlMRSW.startUpdate(Datas
>>> etControlMRSW.java:60)
>>> at
>>> org.apache.jena.tdb.store.nodetupletable.NodeTupleTableConcr
>>> ete.startWrite(NodeTupleTableConcrete.java:65)
>>> at
>>> org.apache.jena.tdb.store.nodetupletable.NodeTupleTableConcr
>>> ete.sync(NodeTupleTableConcrete.java:249)
>>>
>>
>>
>> How can we avoid such a situation ?
>> Can we safely do without read/write locks when we use transactions in a
>> single JVM multi-threaded application ?
>>
>> Thank you in advance for your help.
>>
>> Sincerely,
>> Laurent
>>
>>


ConcurrentModificationException

2016-09-14 Thread Laurent Rucquoy
Hello,

We use a Jena TDB-backed dataset (release 3.1.0) accessed through a single
JVM multi-threaded application running generally on a Microsoft Windows
server.

The read and write accesses are made using transactions and read/write locks.
Here is the SPARQL update code we use:

dataset.begin(ReadWrite.WRITE);
> model.enterCriticalSection(Lock.WRITE);
> try {
> UpdateAction.parseExecute(sparql, model);
> if(writeMode) {
> dataset.commit();
> }
> } finally {
> model.leaveCriticalSection();
> model.close();
> dataset.end();
> }


Note that our SPARQL query code also implements read transactions and locks.

When multiple updates are made on our TDB-backed dataset, a
java.util.ConcurrentModificationException is sometimes thrown, leaving the
dataset in an inaccessible state.
Here is the stacktrace part:


> ...
> Caused by: java.util.ConcurrentModificationException: Reader = 1, Writer =
> 1
> at
> org.apache.jena.tdb.sys.DatasetControlMRSW.policyError(DatasetControlMRSW.java:157)
> at
> org.apache.jena.tdb.sys.DatasetControlMRSW.policyError(DatasetControlMRSW.java:152)
> at
> org.apache.jena.tdb.sys.DatasetControlMRSW.checkConcurrency(DatasetControlMRSW.java:79)
> at
> org.apache.jena.tdb.sys.DatasetControlMRSW.startUpdate(DatasetControlMRSW.java:60)
> at
> org.apache.jena.tdb.store.nodetupletable.NodeTupleTableConcrete.startWrite(NodeTupleTableConcrete.java:65)
> at
> org.apache.jena.tdb.store.nodetupletable.NodeTupleTableConcrete.sync(NodeTupleTableConcrete.java:249)


How can we avoid such a situation ?
Can we safely do without read/write locks when we use transactions in a
single JVM multi-threaded application ?

Thank you in advance for your help.

Sincerely,
Laurent


Re: TDB store parameters

2016-08-26 Thread Laurent Rucquoy
We use Microsoft Windows servers.



On 26 August 2016 at 13:01, Andy Seaborne <a...@apache.org> wrote:

> On 26/08/16 08:59, Laurent Rucquoy wrote:
>
>> Hello Andy,
>>
>> Thank you for your help.
>>
>> The params I'm mainly interested in changing are those of the profile
>> returned by StoreParams.getSmallStoreParams() to be able to reduce the
>> dataset size.
>>
>
> That is best done when creating the dataset in the first place.
>
> It reduces the in-memory cache foot print; it uses direct mode which uses
> in-JVM file cache but it does not swamp the machine with memory mapped
> files.
>
> For small datasets, it makes the file size seem less. The memory-mapped
> files on Linux are sparse files - space allocated but not used. The empty
> dataset on disk is 150K for Linux even though many file sizes are 8M. (Some
> other OSs may allocate the whole space or they may misreport sparse files.)
>
> Except the test of changing the fileMode from mapped to direct, I've not
>> made finer tuning on the other parameters, this is why the
>> StoreParams.getSmallStoreParams()
>> seems to be convenient for our needs.
>>
>> I've another question about this case:
>>
>> What will be the size result of changing from default store params to
>> small
>> store params on an existing TDB dataset ?
>>
>
> Not much.  The files reporting 8M will report 8k but the actual size is
> the same because all databases are compatible unless you change the block
> size or indexing.
>
> I think this will have an effect on future writing (i.e. the existing size
>> on disk will not be compacted -> is there a direct way or an existing tool
>> able to compact the size of an existing dataset ?)
>>
>
> Correct.
>
>
>> Regards,
>> Laurent
>>
>
> What OS are you using?
>
> Andy
>
>
>
>>
>> On 26 August 2016 at 00:22, Andy Seaborne <a...@apache.org> wrote:
>>
>> On 25/08/16 16:16, Laurent Rucquoy wrote:
>>>
>>> Hello,
>>>>
>>>> I'm implementing a TDB-backed dataset (Jena 3.1) and I wish to provide
>>>> a
>>>> method to change the StoreParams of this dataset.
>>>>
>>>> Because changing the StoreParams implies to release the corresponding
>>>>>
>>>>
>>> dataset location, I'd like to identify the current StoreParams in use to
>>>> be
>>>> able to avoid to release the location if the StoreParams we want to
>>>> apply
>>>> now are the same as those currently used.
>>>>
>>>>
>>> Release is not so bad unless you are doing it frequently.
>>>
>>>
>>> What is the right way to do this (if possible) ?
>>>>
>>>>
>>> This may work:
>>>
>>> DatasetGraphTDB x = TDBInternal.getBaseDatasetGraphTDB(myDatasetGraph)
>>> StoreParams sp = x.getConfig().params ;
>>> System.out.println(sp);
>>>
>>> (the "may" is because I only think it works on a live dataset no tested
>>> it)
>>>
>>> Obviously the name "TDBInternal" is a warning!
>>>
>>> Which params are you interested in changing?
>>>
>>> Andy
>>>
>>> Defaults:
>>>
>>> fileMode   dft:mapped
>>> blockSize  dft:8192
>>> readCacheSize  dft:1
>>> writeCacheSize dft:2000
>>> Node2NodeIdCacheSize   dft:10
>>> NodeId2NodeCacheSize   dft:50
>>> NodeMissCacheSize  dft:100
>>> indexNode2Id   dft:node2id
>>> indexId2Node   dft:nodes
>>> primaryIndexTriplesdft:SPO
>>> tripleIndexes  dft:[SPO, POS, OSP]
>>> primaryIndexQuads  dft:GSPO
>>> quadIndexesdft:[GSPO, GPOS, GOSP, POSG, OSPG, SPOG]
>>> primaryIndexPrefix dft:GPU
>>> prefixIndexes  dft:[GPU]
>>> indexPrefixdft:prefixIdx
>>> prefixNode2Id  dft:prefix2id
>>> prefixId2Node  dft:prefixes
>>>
>>>
>>>
>>> Thank you in advance for your help.
>>>>
>>>> Sincerely,
>>>> Laurent
>>>>
>>>>
>>>>
>>>
>>
>


Re: Imported ontology handling

2016-04-25 Thread Laurent Rucquoy
Hello,

Thank you very much for your help, it was very clear and very useful for me.



On 24 April 2016 at 13:05, Dave Reynolds <dave.e.reyno...@gmail.com> wrote:

> Hi,
>
> On 22/04/16 12:45, Laurent Rucquoy wrote:
>
>> Hello,
>>
>> I want to manage a TDB notably to store observations which use terms
>> defined in an external ontology.
>>
>> This ontology is defined in OWL files available on the following web page:
>> https://bioportal.bioontology.org/ontologies/RADLEX?p=summary
>>
>> Example of OWL file used:
>> - 3.13.1 version :
>>
>> http://data.bioontology.org/ontologies/RADLEX/submissions/36/download?apikey=8b5b7825-538d-40e0-9e9e-5ab9274a9aeb
>> - 3.12 version :
>>
>> http://data.bioontology.org/ontologies/RADLEX/submissions/31/download?apikey=8b5b7825-538d-40e0-9e9e-5ab9274a9aeb
>>
>>
>> What is the best practice to handle the ontology use ?
>>
>> My idea is to import the OWL file as a named model in my TDB whereas my
>> instances are stored in the default model. These instances will be linked
>> to the ontology through <ontology_base_uri#RID> resources (where
>> RID is the local id of terms defined in this ontology)
>>
>> When I will have to reason with the ontology, I will use a 'work' model
>> resulting from the union of the ontology named model and the default
>> model.
>>
>>
>> My questions:
>>
>> 1) Is this the right way to reason with imported ontologies (i.e. the
>> default model to store the instances, named models used to import
>> different
>> versions of an ontology and a 'work' model resulting from the union of
>> default model and named model) ?
>>
>
> There's no "right" answer here. It'll depend on your work flow and the
> sorts of queries you want to make.
>
> That said I would suggest putting your instance data in a named graph as
> well, not in the default model. That leaves you free to set "union default"
> so that you can query the union of the instance and ontology data.
>
> Note that the built-in Jena reasoners are in-memory reasoners only and
> reasoning over a TDB model will be slow and not improve scaling.
>
> 2) How can I handle the different versions of OWL files ?
>>
>> e.g. in one version of this ontology, the RID31872 term is identified by
>> the
>> <http://www.owl-ontologies.com/Ontology1447432460.owl#RID31872> uri
>> while the same term is identified by the
>> <http://www.owl-ontologies.com/Ontology1415135201.owl#RID31872> uri
>>
>
> Ugh, that's completely horrible. I don't see a reasonable way you can
> handle that.
>
> As far as I can see there is no relationship between the different
> versions of the term. Just because they happen to have the same localname
> is irrelevant, they are different resources. Looking at those files I see
> no provision for versioning - there's no unversioned resources, no
> versioning links, no mapping terms, nothing. Hopefully that's somewhere and
> I'm just missing it.
>
> Unless you have some separate mapping information that isn't included in
> those links then I'm afraid this is a case of "don't start from here".
>
> Which information will be the more useful to store in my default model to
>> be able to link to the corresponding term in the different versions of the
>> ontology since the base uri could change from one version to the other
>> (while the local part is still the same) ?
>>
>
> As I say, there's just no easy way to handle that. You are dealing with
> "ontologies" that have made no provision for versioning. Indeed I would
> suggest you are dealing with data that started out not as an ontology and
> has just been mapped to OWL syntax.
>
> To fix that would require deep understanding of what the nature of the
> changes are between those different versions.
>
> Assuming the concepts actually have closely related meanings between the
> different versions (a big assumption) then my best advice would be to
> create a new URI set with unversioned URI corresponding to each concept in
> the union of the ontology versions you are looking at. Use those
> unversioned URIs in your instance data. Then create a set of mappings to
> map your unversioned resources to the versioned ones. Precisely what
> mapping terms to use depends on the detailed semantics involved.
>
> Dave
>
>


Imported ontology handling

2016-04-22 Thread Laurent Rucquoy
Hello,

I want to manage a TDB notably to store observations which use terms
defined in an external ontology.

This ontology is defined in OWL files available on the following web page:
https://bioportal.bioontology.org/ontologies/RADLEX?p=summary

Example of OWL file used:
- 3.13.1 version :
http://data.bioontology.org/ontologies/RADLEX/submissions/36/download?apikey=8b5b7825-538d-40e0-9e9e-5ab9274a9aeb
- 3.12 version :
http://data.bioontology.org/ontologies/RADLEX/submissions/31/download?apikey=8b5b7825-538d-40e0-9e9e-5ab9274a9aeb


What is the best practice to handle the ontology use ?

My idea is to import the OWL file as a named model in my TDB, whereas my
instances are stored in the default model. These instances will be linked
to the ontology through <ontology_base_uri#RID> resources (where
RID is the local id of terms defined in this ontology).

When I have to reason with the ontology, I will use a 'work' model
resulting from the union of the ontology named model and the default model.
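
A sketch of how that 'work' model could be assembled with the Jena API (the
graph name and the choice of an RDFS reasoner are assumptions for
illustration):

Dataset dataset = TDBFactory.createDataset("C:/tdb");
dataset.begin(ReadWrite.READ);
try {
    Model ontology = dataset.getNamedModel("http://example.org/graphs/radlex"); // hypothetical name
    Model instances = dataset.getDefaultModel();
    Model work = ModelFactory.createUnion(ontology, instances);
    // Built-in Jena reasoners work in memory (see the reply above), so reasoning
    // directly over a TDB-backed union may be slow for large data.
    InfModel inf = ModelFactory.createRDFSModel(work);
    // ... query 'inf' ...
} finally {
    dataset.end();
}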


My questions:

1) Is this the right way to reason with imported ontologies (i.e. the
default model to store the instances, named models used to import different
versions of an ontology and a 'work' model resulting from the union of
default model and named model) ?


2) How can I handle the different versions of OWL files ?

e.g. in one version of this ontology, the RID31872 term is identified by the
<http://www.owl-ontologies.com/Ontology1447432460.owl#RID31872> uri
while the same term is identified by the
<http://www.owl-ontologies.com/Ontology1415135201.owl#RID31872> uri

Which information will be the most useful to store in my default model to
be able to link to the corresponding term in the different versions of the
ontology, since the base URI can change from one version to another (while
the local part stays the same)?


Thank you in advance for your help.

Sincerely,
Laurent


Re: TDB suddenly seems to use more CPU

2016-02-04 Thread Laurent Rucquoy
Thank you very much for your explanations.

Regards,
Laurent

On 21 January 2016 at 11:53, Andy Seaborne <a...@apache.org> wrote:

> On 20/01/16 10:38, Laurent Rucquoy wrote:
>
>> Hi,
>>
>> 1 - About the direct mode:
>> Yes, the TDB is running in direct mode, but I have no explanation about
>> why
>> it has been explicitly set in our application source code.
>> 1.1 - What will change in our application if I remove the
>> TDB.getContext().set(SystemTDB.symFileMode, FileMode.direct); line ?
>>
>
> Firstly - I use TDB in Linux, not Windows, so I'm looking to hear of
> people's experiences.  This is based on what I have heard ...
>
> On windows, the difference in performance between direct and mapped modes
> seems to be much less (near zero?) than Linux.
>
> And you can't delete databases while the JVM using DB is alive.  This is a
> very long standing java issue (see the Java bug tracker - it is in there
> several times in different reports).
>
> The TDB test cases suffer from this - they use a lot of temporary space as
> a new DB is made for each test rather than delete-reuse the directory.
>
> 1.2 - Is there a default mode which will suit for classical cases ?
>>
>
> The only real way to know the right setting for you is to try it.  Your
> data, the usage patterns and the size may all be factors.
>
> That said, I don't remember any reports to suggest that other than the
> "delete database" issue, it makes much difference until data sizes go up.
>
> In the version you are running (which is quite old), it is hard to tune
> the cache sizes.  In mapped mode there are no index caches to manages - it
> flexes automatically (the OS does it - not that TDB has some built-in
> smarts).
>
> 1.3 - Is it possible that this 'forced' direct mode could be the cause of
>> our CPU high-usage issue ?
>>
>
> There is one possibility which is that the GC is under pressure; if you
> are close to max heap, it may be working hard to keep memory available.
>  There is no specific indication of this one way or the other in your
> report; it is just a possibility.  Solution - increase heap by 25%-50% and
> see what happens.
>
>
>>
>> 2 - About the environment:
>> OS: Windows Server 2008 R2 64bit (Virtual Machine)
>> Java: 1.7.0_67-b01 (64bit)
>>
>
> VMs can be affected by what else the real hardware is hosting.  Suppose
> the hardware is busy - your VM only gets it's allocated %-age of the CPU
> time, whereas when not busy your VM may be getting a lot more than the
> "contract" amount.  Result - requests take a bit longer and that has a
> knock-on effect of more results being active at any one time causing more
> CPU for your VM to be needed.  But again, only a possibility.
>
>
>>
>> 3 - About the data and the query
>> The changes on the data occur through Jenabean save calls (the underlying
>> object model has not changed.)
>> The query at the point logged in the dump messages is:
>>
>> PREFIX base: <http://www.telemis.com/>
>> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
>> PREFIX xmls: <http://www.w3.org/2001/XMLSchema#>
>> SELECT ?x
>> {
>> ?image base:sopInstanceUID
>> "x.x.xx.x..x.x.x.x.x"^^xmls:string .
>> ?image a base:Image .
>> ?seq ?p ?image .
>> ?x base:images ?seq .
>> ?x a base:ImageAnnotation ;
>> base:deleted false .
>> }
>>
>
> (I don't know jenabean).
>
> There is nothing strange looking about that query.
>
> If you added a lot more data, rather than steady incremental growth, it
> might have
>
> Increase RAM and increase the block caches:
> System properties:
>
> BlockReadCacheSize : default: 1 so try 25
> BlockWriteCacheSize : default: 2000 so try 5000
> NodeId2NodeCacheSize : default 50 so try 100 (1 million)
>
> these are all in-heap so increase the heap size.
>
> (changes are logged at level info so you can check they have an effect - I
> am not on my dev machine at the moment so I can't easily check details here
> I'm afraid)
>
> 4 - About the BLOCKED state:
>> Indeed it means that the thread was blocked (not using CPU) at the time of
>> the dump.
>> But looking at the threads list and the corresponding CPU usage, these
>> threads were each using about 5% of the CPU, so there is only a 1 in 20
>> chance that a thread dump will catch them running.
>> Anyway, my colleague managed to get a thread dump while some of the
>> incriminated threads were running.
>> This was 

Re: TDB suddenly seems to use more CPU

2016-01-20 Thread Laurent Rucquoy
at sun.rmi.transport.Transport$1.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.Transport.serviceCall(Unknown Source)
at sun.rmi.transport.tcp.TCPTransport.handleMessages(Unknown Source)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(Unknown Source)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)



Thank you for your help.
Regards,
Laurent.



On 19 January 2016 at 20:14, Andy Seaborne <a...@apache.org> wrote:

> Hi there,
>
> It's not clear to me what's going on here - looking at what has changed in
> the data and what query is being issued would be a good starting
> place.
>
> TDB seems to be running in direct mode, which is usually only used for 32
> bit JVMs (unless you changed the global environment specially).
>
> Is that right?
> What is the environment? OS? Java version?
> Is the environment a VM?
>
> The stack of BlockMgrJournal.release is odd - I don't think
> BlockMgrJournals get stacked like that, but this version of TDB is from a
> while ago, or this might be an artifact of the sampling process.
>
> One possible oddity : it says
> java.lang.Thread.State: BLOCKED (on object monitor)
>
> so could this thread be waiting and not consuming CPU?
>
> Andy
>
>
> On 19/01/16 17:06, Laurent Rucquoy wrote:
>
>> Yes, it is probable that the queried data has been modified, because it's a
>> production server and the TDB is always in use.
>> The data are updated in the TDB via the Jenabean library (1.0.6), which is
>> used to persist the corresponding Java object model.
>>
>> On 19 January 2016 at 17:58, A. Soroka <aj...@virginia.edu> wrote:
>>
>> I don’t have the knowledge to unwrap that trace (the real experts here can
>>> do that) but I’d like to ask: if you haven’t changed any part of the
>>> executing code, did you change the data over which you’re running the
>>> queries at the time the problem appeared, and if so, in what way?
>>>
>>> ---
>>> A. Soroka
>>> The University of Virginia Library
>>>
>>> On Jan 19, 2016, at 11:54 AM, Laurent Rucquoy <
>>>>
>>> laurent.rucq...@telemis.com> wrote:
>>>
>>>>
>>>> Hello,
>>>>
>>>> We have a production server encountering significant slowness resulting
>>>> from high CPU usage for about two weeks now.
>>>>
>>>> When we check high-CPU-usage thread dumps, we note that calls to Jena
>>>> are always implicated.
>>>>
>>>> The calling code in our application executes a SPARQL query and iterates
>>>> over the query solutions.
>>>> The Jena version used is 2.10.1 (with jena-tdb 0.10.1).
>>>> There has been no recent change in our application source code that could
>>>> explain this issue.
>>>>
>>>> Have you any idea about possible causes ?
>>>>
>>>> Thank you in advance for your support.
>>>>
>>>> Sincerely,
>>>> Laurent.
>>>>
>>>>
>>>> Here is a thread dump as an example:
>>>>
>>>> "RMI TCP Connection(17383)-10.249.203.163" daemon prio=6
>>>> tid=0x15607800 nid=0x1dd0 waiting for monitor entry
>>>> [0x1518d000]
>>>>java.lang.Thread.State: BLOCKED (on object monitor)
>>>> at
>>>>
>>> com.hp.hpl.jena.tdb.base.block.BlockMgrSync.release(BlockMgrSync.java:76)
>>>
>>>> - waiting to lock <0x00072ab58058> (a
>>>> com.hp.hpl.jena.tdb.base.block.BlockMgrCache)
>>>> at
>>>>
>>>
>>> com.hp.hpl.jena.tdb.transaction.BlockMgrJournal.release(BlockMgrJournal.java:207)
>>>
>>>> at
>>>>
>>>
>>> com.hp.hpl.jena.tdb.transaction.BlockMgrJournal.release(BlockMgrJournal.java:207)
>>>
>>>> at
>>>>
>>>
>>> com.hp.hpl.jena.tdb.transaction.BlockMgrJournal.release(BlockMgrJournal.java:207)
>>>
>>>> at
>>>>
>>>
>>> com.hp.hpl.jena.tdb.transaction.BlockMgrJournal.release(BlockMgrJournal.java:207)
>>>
>>>> at
>>>>
>>>
>>> com.hp.hpl.jena.tdb.transaction.BlockMgrJournal.release(BlockMgrJournal.java:207)
>

TDB suddenly seems to use more CPU

2016-01-19 Thread Laurent Rucquoy
Hello,

We have a production server encountering significant slowness resulting
from high CPU usage for about two weeks now.

When we check high-CPU-usage thread dumps, we note that calls to Jena are
always implicated.

The calling code in our application executes a SPARQL query and iterates
over the query solutions.
The Jena version used is 2.10.1 (with jena-tdb 0.10.1).
There has been no recent change in our application source code that could
explain this issue.

Have you any idea about possible causes ?

Thank you in advance for your support.

Sincerely,
Laurent.


Here is a thread dump as an example:

"RMI TCP Connection(17383)-10.249.203.163" daemon prio=6
tid=0x15607800 nid=0x1dd0 waiting for monitor entry
[0x1518d000]
   java.lang.Thread.State: BLOCKED (on object monitor)
at com.hp.hpl.jena.tdb.base.block.BlockMgrSync.release(BlockMgrSync.java:76)
- waiting to lock <0x00072ab58058> (a
com.hp.hpl.jena.tdb.base.block.BlockMgrCache)
at 
com.hp.hpl.jena.tdb.transaction.BlockMgrJournal.release(BlockMgrJournal.java:207)
at 
com.hp.hpl.jena.tdb.transaction.BlockMgrJournal.release(BlockMgrJournal.java:207)
at 
com.hp.hpl.jena.tdb.transaction.BlockMgrJournal.release(BlockMgrJournal.java:207)
at 
com.hp.hpl.jena.tdb.transaction.BlockMgrJournal.release(BlockMgrJournal.java:207)
at 
com.hp.hpl.jena.tdb.transaction.BlockMgrJournal.release(BlockMgrJournal.java:207)
at 
com.hp.hpl.jena.tdb.transaction.BlockMgrJournal.release(BlockMgrJournal.java:207)
at 
com.hp.hpl.jena.tdb.base.block.BlockMgrWrapper.release(BlockMgrWrapper.java:77)
at com.hp.hpl.jena.tdb.base.page.PageBlockMgr.release(PageBlockMgr.java:92)
at 
com.hp.hpl.jena.tdb.base.recordbuffer.RecordRangeIterator.close(RecordRangeIterator.java:151)
at 
com.hp.hpl.jena.tdb.base.recordbuffer.RecordRangeIterator.hasNext(RecordRangeIterator.java:134)
at org.apache.jena.atlas.iterator.Iter$4.hasNext(Iter.java:293)
at 
com.hp.hpl.jena.tdb.sys.DatasetControlMRSW$IteratorCheckNotConcurrent.hasNext(DatasetControlMRSW.java:119)
at org.apache.jena.atlas.iterator.Iter$4.hasNext(Iter.java:293)
at org.apache.jena.atlas.iterator.Iter$3.hasNext(Iter.java:179)
at org.apache.jena.atlas.iterator.Iter.hasNext(Iter.java:906)
at 
org.apache.jena.atlas.iterator.RepeatApplyIterator.hasNext(RepeatApplyIterator.java:58)
at 
com.hp.hpl.jena.tdb.solver.SolverLib$IterAbortable.hasNext(SolverLib.java:193)
at 
org.apache.jena.atlas.iterator.RepeatApplyIterator.hasNext(RepeatApplyIterator.java:46)
at 
com.hp.hpl.jena.tdb.solver.SolverLib$IterAbortable.hasNext(SolverLib.java:193)
at org.apache.jena.atlas.iterator.Iter$4.hasNext(Iter.java:293)
at 
com.hp.hpl.jena.sparql.engine.iterator.QueryIterPlainWrapper.hasNextBinding(QueryIterPlainWrapper.java:54)
at 
com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
at 
com.hp.hpl.jena.sparql.engine.iterator.QueryIterConvert.hasNextBinding(QueryIterConvert.java:59)
at 
com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
at 
com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
at 
com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
at 
com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
at 
com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
at 
com.hp.hpl.jena.sparql.engine.ResultSetStream.hasNext(ResultSetStream.java:75)
at 
com.telemis.core.measure.server.rdf.MeasureRetrieveRdf.getMeasuresFromSeriesIds(MeasureRetrieveRdf.java:138)
at 
com.telemis.core.measure.server.rdf.MeasureRetrieveRdf.getMeasuresFromSeries(MeasureRetrieveRdf.java:183)
at 
telemis.measure.tms.service.MeasureService.getMeasuresFromSeries(MeasureService.java:458)
at 
telemis.measure.tms.service.MeasureService.getMeasuresFromExam(MeasureService.java:436)
at 
telemis.measure.tms.messagehandlers.GetMeasuresFromExamMessageHandler.perform(GetMeasuresFromExamMessageHandler.java:46)
at 
telemis.measure.tms.messagehandlers.GetMeasuresFromExamMessageHandler.perform(GetMeasuresFromExamMessageHandler.java:26)
at telemis.service.MessageHandlerManager.execute(MessageHandlerManager.java:50)
at telemis.service.MomoRMIImpl.executeInternal(MomoRMIImpl.java:522)
at telemis.service.MomoRMIImpl.execute(MomoRMIImpl.java:367)
at telemis.service.MomoRMIImpl_Skel.dispatch(Unknown Source)
at sun.rmi.server.UnicastServerRef.oldDispatch(Unknown Source)
at sun.rmi.server.UnicastServerRef.dispatch(Unknown Source)
at sun.rmi.transport.Transport$1.run(Unknown Source)
at sun.rmi.transport.Transport$1.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.Transport.serviceCall(Unknown Source)
at sun.rmi.transport.tcp.TCPTransport.handleMessages(Unknown Source)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(Unknown Source)
at 

Re: TDB suddenly seems to use more CPU

2016-01-19 Thread Laurent Rucquoy
Yes, it is probable that the queried data has been modified, because it's a
production server and the TDB is always in use.
The data are updated in the TDB via the Jenabean library (1.0.6), which is
used to persist the corresponding Java object model.

On 19 January 2016 at 17:58, A. Soroka <aj...@virginia.edu> wrote:

> I don’t have the knowledge to unwrap that trace (the real experts here can
> do that) but I’d like to ask: if you haven’t changed any part of the
> executing code, did you change the data over which you’re running the
> queries at the time the problem appeared, and if so, in what way?
>
> ---
> A. Soroka
> The University of Virginia Library
>
> > On Jan 19, 2016, at 11:54 AM, Laurent Rucquoy <
> laurent.rucq...@telemis.com> wrote:
> >
> > Hello,
> >
> > We have a production server encountering significant slowness resulting
> > from high CPU usage for about two weeks now.
> >
> > When we check high-CPU-usage thread dumps, we note that calls to Jena are
> > always implicated.
> >
> > The calling code in our application executes a SPARQL query and iterates
> > over the query solutions.
> > The Jena version used is 2.10.1 (with jena-tdb 0.10.1).
> > There has been no recent change in our application source code that could
> > explain this issue.
> >
> > Have you any idea about possible causes ?
> >
> > Thank you in advance for your support.
> >
> > Sincerely,
> > Laurent.
> >
> >
> > Here is a thread dump as an example:
> >
> > "RMI TCP Connection(17383)-10.249.203.163" daemon prio=6
> > tid=0x15607800 nid=0x1dd0 waiting for monitor entry
> > [0x1518d000]
> >   java.lang.Thread.State: BLOCKED (on object monitor)
> > at
> com.hp.hpl.jena.tdb.base.block.BlockMgrSync.release(BlockMgrSync.java:76)
> > - waiting to lock <0x00072ab58058> (a
> > com.hp.hpl.jena.tdb.base.block.BlockMgrCache)
> > at
> com.hp.hpl.jena.tdb.transaction.BlockMgrJournal.release(BlockMgrJournal.java:207)
> > at
> com.hp.hpl.jena.tdb.transaction.BlockMgrJournal.release(BlockMgrJournal.java:207)
> > at
> com.hp.hpl.jena.tdb.transaction.BlockMgrJournal.release(BlockMgrJournal.java:207)
> > at
> com.hp.hpl.jena.tdb.transaction.BlockMgrJournal.release(BlockMgrJournal.java:207)
> > at
> com.hp.hpl.jena.tdb.transaction.BlockMgrJournal.release(BlockMgrJournal.java:207)
> > at
> com.hp.hpl.jena.tdb.transaction.BlockMgrJournal.release(BlockMgrJournal.java:207)
> > at
> com.hp.hpl.jena.tdb.base.block.BlockMgrWrapper.release(BlockMgrWrapper.java:77)
> > at
> com.hp.hpl.jena.tdb.base.page.PageBlockMgr.release(PageBlockMgr.java:92)
> > at
> com.hp.hpl.jena.tdb.base.recordbuffer.RecordRangeIterator.close(RecordRangeIterator.java:151)
> > at
> com.hp.hpl.jena.tdb.base.recordbuffer.RecordRangeIterator.hasNext(RecordRangeIterator.java:134)
> > at org.apache.jena.atlas.iterator.Iter$4.hasNext(Iter.java:293)
> > at
> com.hp.hpl.jena.tdb.sys.DatasetControlMRSW$IteratorCheckNotConcurrent.hasNext(DatasetControlMRSW.java:119)
> > at org.apache.jena.atlas.iterator.Iter$4.hasNext(Iter.java:293)
> > at org.apache.jena.atlas.iterator.Iter$3.hasNext(Iter.java:179)
> > at org.apache.jena.atlas.iterator.Iter.hasNext(Iter.java:906)
> > at
> org.apache.jena.atlas.iterator.RepeatApplyIterator.hasNext(RepeatApplyIterator.java:58)
> > at
> com.hp.hpl.jena.tdb.solver.SolverLib$IterAbortable.hasNext(SolverLib.java:193)
> > at
> org.apache.jena.atlas.iterator.RepeatApplyIterator.hasNext(RepeatApplyIterator.java:46)
> > at
> com.hp.hpl.jena.tdb.solver.SolverLib$IterAbortable.hasNext(SolverLib.java:193)
> > at org.apache.jena.atlas.iterator.Iter$4.hasNext(Iter.java:293)
> > at
> com.hp.hpl.jena.sparql.engine.iterator.QueryIterPlainWrapper.hasNextBinding(QueryIterPlainWrapper.java:54)
> > at
> com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
> > at
> com.hp.hpl.jena.sparql.engine.iterator.QueryIterConvert.hasNextBinding(QueryIterConvert.java:59)
> > at
> com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
> > at
> com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
> > at
> com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
> > at
> com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
> > at
> com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
> > at

Re: Check the Jena TDB files exist in a directory

2015-11-30 Thread Laurent Rucquoy
Thank you for your help.
Regards
Laurent

On 28 November 2015 at 11:45, Andy Seaborne <a...@apache.org> wrote:

> On 28/11/15 06:05, Saikat Maitra wrote:
>
>> Hi Laurent
>>
>> You can use dir.listFiles() and it will return a list of the files in that
>> dir. You can iterate over the list to check whether any TDB file is present
>> and then load the Dataset from it.
>>
>> Regards
>> Saikat
>>
>
> There will be (next release) "TDBFactory.inUseLocation" which has some
> checks, the main one being for the file "tdb.cfg".
>
> (Older versions of TDB may not have created this file.)
>
> Andy
>
>
>
>> On Fri, Nov 27, 2015 at 2:03 PM, Laurent Rucquoy <
>> laurent.rucq...@telemis.com> wrote:
>>
>> Hello,
>>>
>>> I want to load a Dataset from a directory but not to create a new one if
>>> there are no TDB files in this directory. Is there a way to check that TDB
>>> files actually exist in a directory, or to load a Dataset while preventing
>>> files from being created on disk if they don't exist ?
>>>
>>> Thank you.
>>>
>>> Sincerely,
>>> Laurent
>>>
>>>
>>
>
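
A minimal sketch combining the two suggestions above (listing the directory and checking for "tdb.cfg"), assuming the 2.x TDBFactory API used elsewhere in this archive; the tdb.cfg check is only a heuristic, since older TDB versions may not have written that file, and the path is hypothetical:

import java.io.File;

import com.hp.hpl.jena.query.Dataset;
import com.hp.hpl.jena.tdb.TDBFactory;

public class LoadDatasetIfPresent {
    public static void main(String[] args) {
        File dir = new File("C:/tdb"); // hypothetical location

        // Heuristic: a non-empty directory that contains "tdb.cfg".
        File[] files = dir.listFiles();
        boolean looksLikeTdb = files != null && files.length > 0
                && new File(dir, "tdb.cfg").exists();

        if (!looksLikeTdb) {
            System.err.println("No TDB database found in " + dir);
            return; // TDBFactory is never called, so nothing is created on disk
        }

        Dataset dataset = TDBFactory.createDataset(dir.getAbsolutePath());
        // ... use the dataset ...
        dataset.close();
    }
}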


Check the Jena TDB files exist in a directory

2015-11-27 Thread Laurent Rucquoy
Hello,

I want to load a Dataset from a directory but not to create a new one if
there are no TDB files in this directory. Is there a way to check that TDB
files actually exist in a directory, or to load a Dataset while preventing
files from being created on disk if they don't exist ?

Thank you.

Sincerely,
Laurent


Re: Trace back RDF containers in SPARQL

2015-05-12 Thread Laurent Rucquoy
I will upgrade Jena and investigate further following your advice.
Thank you for your help Andy.

Laurent.

On 11 May 2015 at 22:57, Andy Seaborne a...@apache.org wrote:

 On 11/05/15 14:55, Laurent Rucquoy wrote:

 Yes, the bad query is the good query with the last 3 triple patterns
 added.


 The optimizer in 2.10 would probably do a bad job on your query.  Adding
 the patterns makes it worse as it puts an unconstrained cross product (due
 to the ??? a :SomeClass2 parts).

 2.13 is better, using fixed.opt.

 It probably makes no difference as to whether you have a stats.opt file;
 if you have one, and with 2.13 it's worth trying both ways round.

 It could well explain what you are seeing and until that possibility is
 removed, it's hard to see any further.

 Andy


  When I run the good query (without the bad query last 3 triple patterns),
 I
 get about ten calculationResult nodes.
 When I run the bad query to try to retrieve the containing
 calculationResultCollection, the system freezes.

 What I want to do is to find the CalculationResultCollection nodes
 containing CalculationResult nodes referring to CalculationDataCollection
 nodes containing in their turn CalculationData nodes having
 0^^xsd:string
 value.

 Here is what could look like an instances diagram:

 CalculationResultCollection ---listCalculationResult--- blank_node_CR
 ---rdf:_1--- CalculationResult_1 ---calculationDataCollection---
 CalculationDataCollection ---listCalculationData--- blank_node_CD
 ---rdf:_1--- CalculationData_1_1
 CalculationResultCollection ---listCalculationResult--- blank_node_CR
 ---rdf:_1--- CalculationResult_1 ---calculationDataCollection---
 CalculationDataCollection ---listCalculationData--- blank_node_CD
 ---rdf:_2--- CalculationData_1_2
 CalculationResultCollection ---listCalculationResult--- blank_node_CR
 ---rdf:_1--- CalculationResult_1 ---calculationDataCollection---
 CalculationDataCollection ---listCalculationData--- blank_node_CD
 ---rdf:_3--- CalculationData_1_3
 CalculationResultCollection ---listCalculationResult--- blank_node_CR
 ---rdf:_2--- CalculationResult_2 ---calculationDataCollection---
 CalculationDataCollection ---listCalculationData--- blank_node_CD
 ---rdf:_1--- CalculationData_2_1
 CalculationResultCollection ---listCalculationResult--- blank_node_CR
 ---rdf:_2--- CalculationResult_2 ---calculationDataCollection---
 CalculationDataCollection ---listCalculationData--- blank_node_CD
 ---rdf:_2--- CalculationData_2_2
 CalculationResultCollection ---listCalculationResult--- blank_node_CR
 ---rdf:_2--- CalculationResult_2 ---calculationDataCollection---
 CalculationDataCollection ---listCalculationData--- blank_node_CD
 ---rdf:_3--- CalculationData_2_3
 ...


 Thank you for your help.

 Laurent.


 On 8 May 2015 at 12:42, Andy Seaborne a...@apache.org wrote:

  On 08/05/15 09:43, Laurent Rucquoy wrote:

  Hi Andy,

 Thank you for your response.

 1) Which version of Jena are you running?
 The used version of Jena is 2.10.1 (I will upgrade soon...)


 Try with 2.13.0 because the area of BGP optimizations has been improved.



 2) How are you storing the data and how big is it?
 TDBFactory.createDataset(directory)
 COUNT(*) - 1 224 103
 350MB on disk
 Do you need other details ?


 3) You say the query returns good results - what sort of query causes
 the
 system to freeze?
 This is the query returning good results appended with 3 more statements
 in
 the WHERE clause:


 So the bad query is the good query with the last 3 triple patterns added?

 It's hard to read but

 ?seqCalculationResultCollection
?seqCalculationResultCollectionIndex ?calculationResult .
 ?calculationResultCollection
 :listCalculationResult  ?seqCalculationResultCollection .
 ?calculationResultCollection rdf:type :CalculationResultCollection

 is connected to the good part by ?calculationResult; all the other
 variables are just fanning out from that point without anything like the
 :value 0^^xsd:string in the good part.  From what I understand of your
 data, that can be a huge number of results.

 Do you get no results, or that some results appear but then the query
 does
 not finish?

  Andy




  PREFIX : http://www.telemis.com/
 PREFIX xsd: http://www.w3.org/2001/XMLSchema#
 PREFIX rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#
 PREFIX rdfs: http://www.w3.org/2000/01/rdf-schema#

 SELECT ?calculationResultCollection
 WHERE {
 ?calculationData a :CalculationData ;
 :value 0^^xsd:string .
 ?seqCalculationDataCollection ?seqCalculationDataCollectionIndex
 ?calculationData .
 ?calculationDataCollection a :CalculationDataCollection ;
 :listCalculationData ?seqCalculationDataCollection .
 ?calculationResult a :CalculationResult ;
 :calculationDataCollection ?calculationDataCollection .
 ?seqCalculationResultCollection ?seqCalculationResultCollectionIndex
 ?calculationResult .
 ?calculationResultCollection :listCalculationResult
 ?seqCalculationResultCollection ;
 a :CalculationResultCollection

Re: Trace back RDF containers in SPARQL

2015-05-11 Thread Laurent Rucquoy
Yes, the bad query is the good query with the last 3 triple patterns added.

When I run the good query (without the bad query's last 3 triple patterns), I
get about ten calculationResult nodes.
When I run the bad query to try to retrieve the containing
calculationResultCollection, the system freezes.

What I want to do is to find the CalculationResultCollection nodes
containing CalculationResult nodes referring to CalculationDataCollection
nodes that in turn contain CalculationData nodes having the value
"0"^^xsd:string.

Here is what could look like an instances diagram:

CalculationResultCollection ---listCalculationResult--- blank_node_CR
---rdf:_1--- CalculationResult_1 ---calculationDataCollection---
CalculationDataCollection ---listCalculationData--- blank_node_CD
---rdf:_1--- CalculationData_1_1
CalculationResultCollection ---listCalculationResult--- blank_node_CR
---rdf:_1--- CalculationResult_1 ---calculationDataCollection---
CalculationDataCollection ---listCalculationData--- blank_node_CD
---rdf:_2--- CalculationData_1_2
CalculationResultCollection ---listCalculationResult--- blank_node_CR
---rdf:_1--- CalculationResult_1 ---calculationDataCollection---
CalculationDataCollection ---listCalculationData--- blank_node_CD
---rdf:_3--- CalculationData_1_3
CalculationResultCollection ---listCalculationResult--- blank_node_CR
---rdf:_2--- CalculationResult_2 ---calculationDataCollection---
CalculationDataCollection ---listCalculationData--- blank_node_CD
---rdf:_1--- CalculationData_2_1
CalculationResultCollection ---listCalculationResult--- blank_node_CR
---rdf:_2--- CalculationResult_2 ---calculationDataCollection---
CalculationDataCollection ---listCalculationData--- blank_node_CD
---rdf:_2--- CalculationData_2_2
CalculationResultCollection ---listCalculationResult--- blank_node_CR
---rdf:_2--- CalculationResult_2 ---calculationDataCollection---
CalculationDataCollection ---listCalculationData--- blank_node_CD
---rdf:_3--- CalculationData_2_3
...


Thank you for your help.

Laurent.


On 8 May 2015 at 12:42, Andy Seaborne a...@apache.org wrote:

 On 08/05/15 09:43, Laurent Rucquoy wrote:

 Hi Andy,

 Thank you for your response.

 1) Which version of Jena are you running?
 The used version of Jena is 2.10.1 (I will upgrade soon...)


 Try with 2.13.0 because the area of BGP optimizations has been improved.



 2) How are you storing the data and how big is it?
 TDBFactory.createDataset(directory)
 COUNT(*) - 1 224 103
 350MB on disk
 Do you need other details ?


 3) You say the query returns good results - what sort of query causes the
 system to freeze?
 This is the query returning good results appended with 3 more statements
 in
 the WHERE clause:


 So the bad query is the good query with the last 3 triple patterns added?

 It's hard to read but

 ?seqCalculationResultCollection
   ?seqCalculationResultCollectionIndex ?calculationResult .
 ?calculationResultCollection
:listCalculationResult  ?seqCalculationResultCollection .
 ?calculationResultCollection rdf:type :CalculationResultCollection

 is connected to the good part by ?calculationResult; all the other
 variables are just fanning out from that point without anything like the
 :value 0^^xsd:string in the good part.  From what I understand of your
 data, that can be a huge number of results.

 Do you get no results, or do some results appear but then the query does
 not finish?

 Andy




 PREFIX : http://www.telemis.com/
 PREFIX xsd: http://www.w3.org/2001/XMLSchema#
 PREFIX rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#
 PREFIX rdfs: http://www.w3.org/2000/01/rdf-schema#

 SELECT ?calculationResultCollection
 WHERE {
 ?calculationData a :CalculationData ;
 :value 0^^xsd:string .
 ?seqCalculationDataCollection ?seqCalculationDataCollectionIndex
 ?calculationData .
 ?calculationDataCollection a :CalculationDataCollection ;
 :listCalculationData ?seqCalculationDataCollection .
 ?calculationResult a :CalculationResult ;
 :calculationDataCollection ?calculationDataCollection .
 ?seqCalculationResultCollection ?seqCalculationResultCollectionIndex
 ?calculationResult .
 ?calculationResultCollection :listCalculationResult
 ?seqCalculationResultCollection ;
 a :CalculationResultCollection .
 }





 On 7 May 2015 at 15:59, Andy Seaborne a...@apache.org wrote:

  Hi Laurent,

 Which version of Jena are you running?  How are you storing the data and
 how big is it?

 You say the query returns good results - what sort of query causes the
 system to freeze?

  Andy


 On 06/05/15 15:17, Laurent Rucquoy wrote:

  Hello,

 I have container resources which I want to retrieve from one of their
 elements.
 Here is an RDF/XML part of my data:


 rdf:Description rdf:about=


 http://www.telemis.com/CalculationResult/9e892a88-7257-4af3-881d-2eb304437546
 
 rdf:type rdf:resource=http://www.telemis.com/CalculationResult/
 TM:uid rdf:datatype=http://www.w3.org/2001/XMLSchema#string
 9e892a88-7257-4af3-881d-2eb304437546/TM:uid

Re: Trace back RDF containers in SPARQL

2015-05-08 Thread Laurent Rucquoy
Hi Andy,

Thank you for your response.

1) Which version of Jena are you running?
The used version of Jena is 2.10.1 (I will upgrade soon...)


2) How are you storing the data and how big is it?
TDBFactory.createDataset(directory)
COUNT(*) - 1 224 103
350MB on disk
Do you need other details ?


3) You say the query returns good results - what sort of query causes the
system to freeze?
This is the query returning good results appended with 3 more statements in
the WHERE clause:

PREFIX : <http://www.telemis.com/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?calculationResultCollection
WHERE {
?calculationData a :CalculationData ;
:value "0"^^xsd:string .
?seqCalculationDataCollection ?seqCalculationDataCollectionIndex
?calculationData .
?calculationDataCollection a :CalculationDataCollection ;
:listCalculationData ?seqCalculationDataCollection .
?calculationResult a :CalculationResult ;
:calculationDataCollection ?calculationDataCollection .
?seqCalculationResultCollection ?seqCalculationResultCollectionIndex
?calculationResult .
?calculationResultCollection :listCalculationResult
?seqCalculationResultCollection ;
a :CalculationResultCollection .
}





On 7 May 2015 at 15:59, Andy Seaborne a...@apache.org wrote:

 Hi Laurent,

 Which version of Jena are you running?  How are you storing the data and
 how big is it?

 You say the query returns good results - what sort of query causes the
 system to freeze?

 Andy


 On 06/05/15 15:17, Laurent Rucquoy wrote:

 Hello,

 I have container resources which I want to retrieve from one of their
 elements.
 Here is an RDF/XML part of my data:


 rdf:Description rdf:about=

 http://www.telemis.com/CalculationResult/9e892a88-7257-4af3-881d-2eb304437546
 
 rdf:type rdf:resource=http://www.telemis.com/CalculationResult/
 TM:uid rdf:datatype=http://www.w3.org/2001/XMLSchema#string
 9e892a88-7257-4af3-881d-2eb304437546/TM:uid
 ...
 /rdf:Description

 rdf:Description rdf:nodeID=A525
 rdf:type rdf:resource=http://www.w3.org/1999/02/22-rdf-syntax-ns#Seq/
 rdf:_1 rdf:resource=

 http://www.telemis.com/CalculationResult/9e892a88-7257-4af3-881d-2eb304437546
 /
 rdf:_2 rdf:resource=

 http://www.telemis.com/CalculationResult/9e892a88-7257-4af3-881d-45208e34af13
 /
 /rdf:Description

 rdf:Description rdf:about=

 http://www.telemis.com/CalculationResultCollection/a7d808d7-7689-4dfb-82c4-b0763f0dcd34
 
 rdf:type rdf:resource=
 http://www.telemis.com/CalculationResultCollection
 /
 TM:uid rdf:datatype=http://www.w3.org/2001/XMLSchema#string
 a7d808d7-7689-4dfb-82c4-b0763f0dcd34/TM:uid
 TM:listCalculationResult rdf:nodeID=A525/
 /rdf:Description


 The representation of this context could be:

 (CalculationResultCollection)
 --listCalculationResult--
 (seq-blank-node)
 +--(rdf:_1)--(CalculationResult1)
 +--(rdf:_2)--(CalculationResult2)
 +--...

 I've a SPARQL query which gives me a set of CalculationResult nodes.
 My question is: by which SPARQL query can I trace back to the
 CalculationResultCollection containing resource(s) ?

 Here is the SPARQL query which gives me a set of CalculationResult nodes:

 PREFIX : http://www.telemis.com/
 PREFIX xsd: http://www.w3.org/2001/XMLSchema#
 PREFIX rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#
 PREFIX rdfs: http://www.w3.org/2000/01/rdf-schema#

 SELECT ?calculationResult
 WHERE {
 ?calculationData a :CalculationData ;
 :value 0^^xsd:string .
 ?seqCalculationDataCollection ?seqCalculationDataCollectionIndex
 ?calculationData .
 ?calculationDataCollection a :CalculationDataCollection ;
 :listCalculationData ?seqCalculationDataCollection .
 ?calculationResult a :CalculationResult ;
 :calculationDataCollection ?calculationDataCollection .
 }

 I tried to complete my WHERE clause with statements like I did to trace
 back to ?calculationDataCollection from ?calculationData but I cannot get
 the wanted result because the system freezes. Note that the SPARQL query
 here above returns good results.

 Is there a better practice to do that with acceptable performances ?

 Thank you in advance for your help.







-- 
*Laurent Rucquoy*
R&D Engineer

laurent.rucq...@telemis.com
Tel: +32 (0) 10 48 00 27
Fax: +32 (0) 10 48 00 20

Telemis
Avenue Athéna 2
1348 Louvain-la-Neuve
Belgium
www.telemis.com
*Extending Human Life*



Trace back RDF containers in SPARQL

2015-05-06 Thread Laurent Rucquoy
Hello,

I have container resources which I want to retrieve from one of their
elements.
Here is an RDF/XML part of my data:


<rdf:Description rdf:about="http://www.telemis.com/CalculationResult/9e892a88-7257-4af3-881d-2eb304437546">
  <rdf:type rdf:resource="http://www.telemis.com/CalculationResult"/>
  <TM:uid rdf:datatype="http://www.w3.org/2001/XMLSchema#string">9e892a88-7257-4af3-881d-2eb304437546</TM:uid>
  ...
</rdf:Description>

<rdf:Description rdf:nodeID="A525">
  <rdf:type rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Seq"/>
  <rdf:_1 rdf:resource="http://www.telemis.com/CalculationResult/9e892a88-7257-4af3-881d-2eb304437546"/>
  <rdf:_2 rdf:resource="http://www.telemis.com/CalculationResult/9e892a88-7257-4af3-881d-45208e34af13"/>
</rdf:Description>

<rdf:Description rdf:about="http://www.telemis.com/CalculationResultCollection/a7d808d7-7689-4dfb-82c4-b0763f0dcd34">
  <rdf:type rdf:resource="http://www.telemis.com/CalculationResultCollection"/>
  <TM:uid rdf:datatype="http://www.w3.org/2001/XMLSchema#string">a7d808d7-7689-4dfb-82c4-b0763f0dcd34</TM:uid>
  <TM:listCalculationResult rdf:nodeID="A525"/>
</rdf:Description>


The representation of this context could be:

(CalculationResultCollection)
--listCalculationResult--
(seq-blank-node)
+--(rdf:_1)--(CalculationResult1)
+--(rdf:_2)--(CalculationResult2)
+--...

I've a SPARQL query which gives me a set of CalculationResult nodes.
My question is: by which SPARQL query can I trace back to the
CalculationResultCollection containing resource(s) ?

Here is the SPARQL query which gives me a set of CalculationResult nodes:

PREFIX : <http://www.telemis.com/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?calculationResult
WHERE {
?calculationData a :CalculationData ;
:value "0"^^xsd:string .
?seqCalculationDataCollection ?seqCalculationDataCollectionIndex
?calculationData .
?calculationDataCollection a :CalculationDataCollection ;
:listCalculationData ?seqCalculationDataCollection .
?calculationResult a :CalculationResult ;
:calculationDataCollection ?calculationDataCollection .
}

I tried to complete my WHERE clause with statements like I did to trace
back to ?calculationDataCollection from ?calculationData but I cannot get
the wanted result because the system freezes. Note that the SPARQL query
here above returns good results.

Is there a better way to do this with acceptable performance ?

Thank you in advance for your help.



-- 
*Laurent Rucquoy*
R&D Engineer

laurent.rucq...@telemis.com
Tel: +32 (0) 10 48 00 27
Fax: +32 (0) 10 48 00 20

Telemis
Avenue Athéna 2
1348 Louvain-la-Neuve
Belgium
www.telemis.com
*Extending Human Life*



Re: Very slow SPARQL query on TDB

2015-01-29 Thread Laurent Rucquoy
Hi Andy,

I have good response times now with the SPARQL query fixed by Milorad.
Nevertheless, I will focus on the TDB Optimizer soon and also upgrade Jena
to the latest release (my current release is 2.10.1).
Thank you for your help.

Sincerely,
Laurent.

On Wed, Jan 28, 2015 at 7:44 PM, Andy Seaborne a...@apache.org wrote:

 On 28/01/15 18:34, Milorad Tosic wrote:

 Hi Laurent,
 I would give a try to a different sequencing in the query. For example:

 PREFIX base:http://www.telemis.com/PREFIX rdf: 
 http://www.w3.org/1999/02/22-rdf-syntax-ns#PREFIX XMLS: 
 http://www.w3.org/2001/XMLSchema#
 SELECT ?x{ ?image base:sopInstanceUID 1.2.840.113564.10656621.
 201302121438403281.1003000225002^^XMLS:string . ?image a base:Image .
?seq ?p ?image . ?x base:images ?seq .
   ?x a base:ImageAnnotation ;  base:deleted false .
 }
 Though, it may or may not help.
 Regards,Milorad


From: Laurent Rucquoy laurent.rucq...@telemis.com
   To: users@jena.apache.org
   Sent: Wednesday, January 28, 2015 6:13 PM
   Subject: Very slow SPARQL query on TDB

 Hello,
 I have a Java application which implements an object model persisted
 through JenaBean in my Jena TDB (see the attached image of the classes
 diagram).
 The request to retrieve an ImageAnnotation resource from the ID of a
>>  linked Image is very slow. Here is a typical SPARQL query used (more than
 40s to get the result):


>>  PREFIX base: <http://www.telemis.com/>
>>  PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
>>  PREFIX XMLS: <http://www.w3.org/2001/XMLSchema#>
>>
>>  SELECT ?x
>>  {
>>  ?x a base:ImageAnnotation ;
>>  base:deleted false ;
>>  base:images ?seq .
>>  ?seq ?p ?image .
>>  ?image a base:Image .
>>  ?image base:sopInstanceUID
>>  "1.2.840.113564.10656621.201302121438403281.1003000225002"^^XMLS:string .
>>  }


 Can you help me to find what I'm doing wrong ?
 Thank you in advance.
 Sincerely,Laurent


 Which version of TDB? 2.11.2 had possibly related fixes.

 https://issues.apache.org/jira/browse/JENA-685

 If you do take Milorad's suggestion, also put in a none.opt file to stop
 TDB reordering your improved order into a worse one.

 http://jena.apache.org/documentation/tdb/optimizer.html#choosing-the-optimizer-strategy

 Andy

 PS Attachments don't come through this list.
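
A minimal sketch of the none.opt suggestion above: the TDB optimizer strategy is chosen by a marker file in the database directory, so it is enough to create an empty none.opt there (the directory path is hypothetical):

import java.io.File;
import java.io.IOException;

public class DisableReordering {
    public static void main(String[] args) throws IOException {
        File dbDir = new File("C:/tdb"); // hypothetical TDB directory

        // An empty none.opt in the database directory tells the TDB optimizer
        // not to reorder basic graph patterns, so the hand-tuned pattern order
        // in the query is executed as written.
        File noneOpt = new File(dbDir, "none.opt");
        if (!noneOpt.exists() && !noneOpt.createNewFile()) {
            throw new IOException("Could not create " + noneOpt);
        }
    }
}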




-- 
*Laurent Rucquoy*
RD Engineer

Telemis http://www.telemis.com
*Extending Human Life*

*** NEW ADDRESS ***
Avenue Athéna 2
1348 Louvain-la-Neuve
Belgium
laurent.rucq...@telemis.com
Tel: +32 (0) 10 47 14 39
Fax: +32 (0) 10 48 00 20


Re: Very slow SPARQL query on TDB

2015-01-29 Thread Laurent Rucquoy
Hi Milorad,

Your suggestion solved my performance problem: the fixed SPARQL query took
less than 1s to get the result !
Thank you very much for your efficient support.

Sincerely,
Laurent.


On Wed, Jan 28, 2015 at 7:34 PM, Milorad Tosic mbto...@yahoo.com.invalid
wrote:

 Hi Laurent,
 I would give a try to a different sequencing in the query. For example:

 PREFIX base:http://www.telemis.com/PREFIX rdf: 
 http://www.w3.org/1999/02/22-rdf-syntax-ns#PREFIX XMLS: 
 http://www.w3.org/2001/XMLSchema#
 SELECT ?x{ ?image base:sopInstanceUID
 1.2.840.113564.10656621.201302121438403281.1003000225002^^XMLS:string .
 ?image a base:Image . ?seq ?p ?image . ?x base:images ?seq .
  ?x a base:ImageAnnotation ;  base:deleted false .
 }
 Though, it may or may not help.
 Regards,Milorad


   From: Laurent Rucquoy laurent.rucq...@telemis.com
  To: users@jena.apache.org
  Sent: Wednesday, January 28, 2015 6:13 PM
  Subject: Very slow SPARQL query on TDB

 Hello,
 I have a Java application which implements an object model persisted
 through JenaBean in my Jena TDB (see the attached image of the classes
 diagram).
 The request to retrieve an ImageAnnotation resource from the ID of a
>  linked Image is very slow. Here is a typical SPARQL query used (more than
 40s to get the result):


>  PREFIX base: <http://www.telemis.com/>
>  PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
>  PREFIX XMLS: <http://www.w3.org/2001/XMLSchema#>
>
>  SELECT ?x
>  {
>  ?x a base:ImageAnnotation ;
>  base:deleted false ;
>  base:images ?seq .
>  ?seq ?p ?image .
>  ?image a base:Image .
>  ?image base:sopInstanceUID
>  "1.2.840.113564.10656621.201302121438403281.1003000225002"^^XMLS:string .
>  }


 Can you help me to find what I'm doing wrong ?
 Thank you in advance.
 Sincerely,Laurent






-- 
*Laurent Rucquoy*
R&D Engineer

Telemis http://www.telemis.com
*Extending Human Life*

*** NEW ADDRESS ***
Avenue Athéna 2
1348 Louvain-la-Neuve
Belgium
laurent.rucq...@telemis.com
Tel: +32 (0) 10 47 14 39
Fax: +32 (0) 10 48 00 20


Very slow SPARQL query on TDB

2015-01-28 Thread Laurent Rucquoy
Hello,

I have a Java application which implements an object model persisted
through JenaBean in my Jena TDB (see the attached image of the classes
diagram).

The request to retrieve an ImageAnnotation resource from the ID of a linked
Image is very slow.
Here is a typical SPARQL query used (more than 40s to get the result):



PREFIX base: <http://www.telemis.com/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX XMLS: <http://www.w3.org/2001/XMLSchema#>

SELECT ?x
{
?x a base:ImageAnnotation ;
 base:deleted false ;
 base:images ?seq .
?seq ?p ?image .
?image a base:Image .
?image base:sopInstanceUID
"1.2.840.113564.10656621.201302121438403281.1003000225002"^^XMLS:string .
}



Can you help me to find what I'm doing wrong ?

Thank you in advance.

Sincerely,
Laurent