Re: Turtle parser fails on CONSTRUCT query result

2015-01-28 Thread Andy Seaborne

On 28/01/15 10:31, Lorenz Bühmann wrote:

Hello Andy,

first of all, thanks for the answer. I added answers to your comments
inline below.



Comments inline and at the end ...


...


This is a warning - the parser emits the data and continues ...

(I'm somewhat tempted to turn the NF tests off - while strictly
correct, few people worry about or understand NF - feedback welcome).


From my point of view the warnings are quite confusing, although I
usually tend to ignore such warnings.


Very true.

In Unicode you can write the same thing in different ways, especially
with accented characters.  You can have a single code point for the letter
with the accent, or the code point for the plain letter followed by a
combining accent (modifier) that applies to the character before it.
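The two encodings can be seen directly in Java - a minimal sketch using the standard-library java.text.Normalizer; the strings are illustrative examples, not taken from the DBpedia data:

```java
import java.text.Normalizer;

public class NfcDemo {
    public static void main(String[] args) {
        String composed = "\u00E9";     // "é" as a single precomposed code point
        String decomposed = "e\u0301";  // "e" followed by a combining acute accent
        // The two render identically but compare unequal as raw strings...
        System.out.println(composed.equals(decomposed));
        // ...until both are brought to the same normal form (NFC here).
        System.out.println(Normalizer.normalize(decomposed, Normalizer.Form.NFC)
                .equals(composed));
    }
}
```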





and now we have a real error.

What's line 513? (You can get the response by using curl or wget).

Well, from what I can see line 513 contains

ns56:Лауреати_премії_«Еммі» ,

so I guess the char « is unknown for some reason.


Yes.
« is not legal in a prefix name.
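As an illustration of why, a quick check against the first few PN_CHARS_BASE ranges of the Turtle grammar shows that Cyrillic letters may start a prefixed name but « may not. The ranges below are copied from the grammar but deliberately incomplete - this is a sketch, not a full validator:

```java
public class PrefixNameCheck {
    // First few PN_CHARS_BASE ranges from the Turtle grammar
    // (deliberately incomplete - enough for this check only).
    static boolean canStartPrefixNameLocalPart(int c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
                || (c >= 0x00C0 && c <= 0x00D6)
                || (c >= 0x00D8 && c <= 0x00F6)
                || (c >= 0x00F8 && c <= 0x02FF)
                || (c >= 0x0370 && c <= 0x1FFF);
    }

    public static void main(String[] args) {
        // Cyrillic 'Л' (U+041B) falls inside 0x0370-0x1FFF: legal.
        System.out.println(canStartPrefixNameLocalPart('\u041B'));
        // '«' (U+00AB) falls in none of the ranges: illegal.
        System.out.println(canStartPrefixNameLocalPart(0x00AB));
    }
}
```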


11:48:30,584 ErrorHandlerFactory$ErrorLogger - [line: 513, col: 24]
Unknown char: «(171;0x00AB)


The actual error comes from looking for a new Turtle token and not
finding a start-of-token marker or a digit.  So it assumes a
prefix name (which does not start with an identifying character).

It might be badly written data (some unescaped significant character
earlier in the triple).  It's a structural problem with the data sent back.

Ok, so the DBpedia endpoint (i.e. Virtuoso) seems to return some illegal
structural data. Probably I'll have to file an issue or at least ask on
their mailing list.


Yes.

This is not a data problem.  The other end (DBpedia) should not send 
illegal Turtle ever.





(Hmm - the stack trace does not seem to quite agree with the current
codebase.  What version are you running?)

I used JENA ARQ 2.11.2, but now updated to

JENA ARQ 2.12.1
JENA Core 2.12.1
JENA IRI 1.1.1

The stacktrace seems to be the same as before:



Thanks.  2.11.2 should be OK - I didn't know the code had moved about 
that much so I suspected a much older version.


Andy



Re: Forward RETE and redundant deduction

2015-01-28 Thread Christophe FAGOT [intactile DESIGN]
Hi Andy,

thanks for your answer, and I'm OK with the graph being a set of triples; it is
the (very good) reason explaining why only one triple is produced. But the
reasoner is not in forward mode. It is in forward-RETE mode, which means that
the forward rules have to work incrementally, allowing triples to be added and
removed while maintaining the consistency of the model.

So in the case described by Sébastien, the forward-RETE engine should not remove
the inferred triple, since another rule still has its body terms satisfied. At
the least, this last rule should have been fired to tell it that the triple it
did not create previously (because it was already in the graph) is about to be
removed, so that this rule can produce it again.

Chris.

Christophe FAGOT, PhD
R&D SOFTWARE MANAGER

intactile DESIGN
Création d’interfaces + subtiles
+33 (0)4 67 52 88 61
+33 (0)9 50 12 05 66
20 rue du carré du roi
34000 MONTPELLIER
France
www.intactile.com http://intactile.com/

Hugh MacLeod : It's not what the software does, it's what the user does

The information contained in this email and its attached documents is
confidential. It is addressed exclusively to the recipients explicitly
named above and may not be disclosed without its author's consent. If you
are not the intended recipient of this email, you must not use, disclose,
distribute, copy, print or forward its contents, and must destroy it
immediately.

 On 28 Jan 2015, at 12:17, Andy Seaborne a...@apache.org wrote:
 
 (Dave is not around at the moment so I'll try to answer some parts of your 
 question ...)
 
 On 28/01/15 10:28, Sébastien Boulet [intactile DESIGN] wrote:
 Hello,
 
 I have two rules which could produce the same triple:
 
 String rules = "[r1: (?a eg:p ?b) -> (?a, eg:q, ?b)]" +
                "[r2: (?a eg:r ?b) -> (?a, eg:q, ?b)]";
 
 i have configured a GenericRuleReasoner in FORWARD_RETE mode.
 
 GenericRuleReasoner reasoner = new 
 GenericRuleReasoner(Rule.parseRules(rules));
 reasoner.setMode(GenericRuleReasoner.FORWARD_RETE);
 InfModel model = ModelFactory.createInfModel(reasoner, 
 ModelFactory.createDefaultModel());
 
 When a triple satisfies the first rule and another triple satisfies the second
 rule:
 
   Resource subject = model.createResource();
 Property predicateP = model.getProperty("urn:x-hp:eg/p");
 Literal literalA = model.createTypedLiteral("A");
 Property predicateR = model.getProperty("urn:x-hp:eg/r");
 
 model.add(subject, predicateP, literalA);
 model.add(subject, predicateR, literalA);
 
 only one triple is deduced:
 
 An RDF graph is a set of triples.
 
 A set only has one of each thing in it.
 
 If you
 add(triple)
 add(triple)
 
 you will see only one triple in the output.  This is not to do with 
 inference, it is to do with an RDF graph being a set.
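The set behaviour can be mimicked with a plain Java Set - a toy analogy using only the standard library, not the Jena API:

```java
import java.util.HashSet;
import java.util.Set;

public class SetDemo {
    public static void main(String[] args) {
        // Adding the same "triple" twice keeps only one copy,
        // just as asserting the same statement twice does in an RDF graph.
        Set<String> graph = new HashSet<>();
        graph.add("<s> <p> \"A\"");
        graph.add("<s> <p> \"A\"");
        System.out.println(graph.size());
    }
}
```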
 
 
 <rdf:RDF
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:j.0="urn:x-hp:eg/" >
   <rdf:Description rdf:nodeID="A0">
     <j.0:r rdf:datatype="http://www.w3.org/2001/XMLSchema#string">A</j.0:r>
     <j.0:p rdf:datatype="http://www.w3.org/2001/XMLSchema#string">A</j.0:p>
     <j.0:q rdf:datatype="http://www.w3.org/2001/XMLSchema#string">A</j.0:q>
   </rdf:Description>
 </rdf:RDF>
 
 When I remove the first triple:
 
 model.remove(subject, predicateP, literalA);
 
 the sole deduced triple is removed even though the second rule is still
 satisfied:
 
 You ran the reasoner in forward mode - it computed all deductions at the
 start and then does not run again until you ask it to.
 
 To trigger it again:
 
 InfModel.rebind()
 Cause the inference model to reconsult the underlying data to take into 
 account changes.
 
 or run in backward mode.
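A minimal sketch of the rebind() workaround, assuming the Jena 2.x package names used elsewhere in this thread; the expected outcome is as described in the thread (rule r2 re-derives the triple after rebind), not independently verified here:

```java
import com.hp.hpl.jena.rdf.model.*;
import com.hp.hpl.jena.reasoner.rulesys.GenericRuleReasoner;
import com.hp.hpl.jena.reasoner.rulesys.Rule;

public class RebindDemo {
    public static void main(String[] args) {
        // Two rules that both deduce the same eg:q triple
        // (the built-in "eg" prefix expands to urn:x-hp:eg/).
        String rules = "[r1: (?a eg:p ?b) -> (?a eg:q ?b)]"
                     + "[r2: (?a eg:r ?b) -> (?a eg:q ?b)]";
        GenericRuleReasoner reasoner =
                new GenericRuleReasoner(Rule.parseRules(rules));
        reasoner.setMode(GenericRuleReasoner.FORWARD_RETE);
        InfModel model = ModelFactory.createInfModel(
                reasoner, ModelFactory.createDefaultModel());

        Resource s = model.createResource();
        Property p = model.getProperty("urn:x-hp:eg/p");
        Property q = model.getProperty("urn:x-hp:eg/q");
        Property r = model.getProperty("urn:x-hp:eg/r");
        Literal a = model.createTypedLiteral("A");

        model.add(s, p, a);
        model.add(s, r, a);
        model.remove(s, p, a);  // incremental removal also drops the eg:q deduction

        model.rebind();         // re-consult the raw data: r2 fires again
        System.out.println(model.contains(s, q, a));
    }
}
```

This trades incrementality for correctness: rebind() recomputes the deductions from the raw data, so the cost grows with the size of the model.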
 
   Andy
 
 
 <rdf:RDF
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:j.0="urn:x-hp:eg/" >
   <rdf:Description rdf:nodeID="A0">
     <j.0:r rdf:datatype="http://www.w3.org/2001/XMLSchema#string">A</j.0:r>
   </rdf:Description>
 </rdf:RDF>
 
 Is it the expected behavior?
 Is there a workaround to deduce the same triple twice, or at least to not
 remove the sole deduction?
 
 Thanks
 
 
 Sébastien BOULET
 LEAD DÉVELOPPEUR
 
 intactile DESIGN
 Création d’interfaces + subtiles
 04 67 52 88 61
 09 50 12 05 66
 20 rue du carré du roi
 34000 MONTPELLIER
 France
 www.intactile.com http://intactile.com/
 



Re: Time series (energy data) in jena

2015-01-28 Thread Rurik Thomas Greenall
Hi Ashley,

I worked for a large Norwegian oil company on sensor readings in relation
to time-series data from data historians.

For numerous reasons (including the sheer number of triples), we planned to
move away from storing the data directly as RDF and instead mapped data
from the historians to RDF on the fly, providing a simple REST
interface to serve the RDF. The PoC for this included data about the
sensors and the measurements taken, as well as links to previous/subsequent
measurements.

I played with the idea of requesting period series via the interface as
well as single instants.

The PoC worked well enough to be used in a real-time 3D visualisation of
the subsea template, but I'm not sure how this ended up as I ended my
contract before the project was completed.

Regards,

Rurik

On Wed, Jan 28, 2015 at 11:02 AM, Ashley Davison-White adw...@gmail.com
wrote:

 Hi all

 I'm currently looking at the feasibility of storing time series data in
 jena (TDB); specifically energy data. In fact, right now I'm looking for a
 reason not to!-  so far, my research has shown me it is possible but there
 seems to be a lack of experimentation in doing so.

 I'm wondering if anyone is aware of previous research or projects? And if
 there are any potential advantages/disadvantages?

 From my limited experience, in-place updates are not possible, so storing a
 rollup of data (i.e. an entity of readings per day, rather than per minute)
 is not possible; each reading would need to be its own triple. With a
 large data set, I can see this being a problem - especially with the
 increased verbosity of triple data vs a traditional time series database.
 However, at small scale, I don't see this as a problem.

 I'm interested to hear opinions on the topic.

 Regards,
 - Ashley



Fwd: Fwd: How can I replace TDB dataset with virtuoso dataset

2015-01-28 Thread Nauman Ramzan
Thanks for reply.

http://stackoverflow.com/questions/27958212/is-it-possible-to-add-virtuoso-as-a-storage-provider-in-jena-fuseki/27966848#27966848

I had already seen this link, but I cannot understand it clearly.




Doesn't Virtuoso have its own SPARQL HTTP server built-in?
Virtuoso provides a SPARQL HTTP server, BUT there I cannot use the jena
text:query() function...


It has its own text indexing.

I am using a Solr index because I read that Solr is the BEST text indexer ever...

What is this code trying to do?
I am trying to create a Virtuoso dataset graph instead of a TDB dataset graph.

Where is it running?
I wrote this code in the apache-fuseki project, in the fusekiConfig.java file
of the org.apache.jena.fuseki.server package.

My Requirements

The reason for using Virtuoso is to store triples as partial/partitioned
storage (as far as I know, in TDB we cannot do that).
The reason for using Fuseki is that I want to use the jena text:query()
function, so that I can use the Solr index and Virtuoso (instead of TDB) in
the same project...

Nauman


Re: results vary for the same query on same dataset for different engine

2015-01-28 Thread Andy Seaborne

On 28/01/15 12:28, Qiaser Mehmood wrote:

The query SELECT (count(distinct ?p) AS ?count ) { ?s ?p ?o } returns 21 in
both cases.  listOfPropertiesInDataset is just the name of the actual query
which I execute and store in a model: mdl = qry.execConstruct().
However, if I run the following code and get the triple count for that query:

qry = QueryExecutionFactory.sparqlService(endpoint, query);
int count = 0;
Iterator<Triple> triples = qry.execConstructTriples();
while (triples.hasNext()) {
    triples.next();
    count++;
}
System.out.print("Triples count value is " + count);

The count value is different for Fuseki (42) and Sesame (740444), although
the data is the same in both stores. What could be a reason for this difference?



execConstructTriples is an important detail.

Try execConstruct and model.size().  Should be 42 in each case.

execConstructTriples is passing back the low level stream of triples 
received.  It may include duplicates if the server sends duplicates.


There are 21 unique predicates in your data.

CONSTRUCT
{ <datasetUri> void:propertyPartition ?pUri .
  ?pUri void:property ?p . }

is 42 triples in a set of triples: 2 for each ?p (?pUri is calculated
from ?p).  Note *set*.  Sets do not show duplicates.


The default thing you are querying in Jena sends back a stream of 
triples from a set - duplicates have been suppressed.


Sesame does not: it sends back the results of each template instantiation
for every match of the pattern.  No duplicates have been suppressed.


Look at the start of the Sesame triple stream of results.  I would 
expect to see repeated triples.


There are 740444/2 = 370222 triples in your data.

370222 matches of ?s ?p ?o

so 370222 matches of WHERE {}

but your construct template does not depend on ?s or ?o. The same ?p and 
?pUri from different matching of ?s ?p ?o happen over and over again.


Hence Sesame returns 740444 triples with many duplicates.
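The counts above follow directly from the template size and the match count - a trivial check using the numbers from this thread:

```java
public class DupCount {
    public static void main(String[] args) {
        int matches = 370222;    // matches of ?s ?p ?o in the data
        int template = 2;        // triples per CONSTRUCT template instantiation
        int distinctP = 21;      // distinct predicate values

        // Streamed result: one template instantiation per match, duplicates kept.
        System.out.println(matches * template);
        // Set-based result: duplicates collapsed, 2 triples per distinct ?p.
        System.out.println(distinctP * template);
    }
}
```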

Compare

SELECT  ?p ?pUri WHERE {}

and

SELECT DISTINCT ?p ?pUri WHERE {}

Projecting out just ?p gives 370222 rows over 21 distinct values.
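The SELECT vs SELECT DISTINCT difference is the same as distinct-filtering any stream - a toy analogy in plain Java, with made-up predicate names:

```java
import java.util.List;

public class DistinctDemo {
    public static void main(String[] args) {
        // Six pattern matches drawn from only two distinct predicate values,
        // mirroring SELECT ?p (all rows) vs SELECT DISTINCT ?p.
        List<String> matches = List.of("p1", "p1", "p1", "p2", "p2", "p2");
        System.out.println(matches.size());
        System.out.println(matches.stream().distinct().count());
    }
}
```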

This data creates an RDF graph of one triple:

@prefix : <http://example/> .

:s :p "abc" .
:s :p "abc" .
:s :p "abc" .



Andy

PS I think Sesame may have changed this behaviour in recent versions, at 
least I recall some discussion.





  On Wednesday, January 28, 2015 11:13 AM, Andy Seaborne a...@apache.org 
wrote:


  On 28/01/15 10:49, Qiaser Mehmood wrote:

Thanks Andy, I forgot to mention that I am using Jena to query both Fuseki
and Sesame; moreover, I dumped the same data into both stores.
So you mean that the result difference over the same data is due to the
particular engine, which returns either duplicates (i.e. Sesame) or a set
with no duplicates (i.e. Fuseki).
Thanks, Qaiser.


So what does

SELECT (count(distinct ?p) AS ?count ) { ?s ?p ?o }

return in each case?


And how are you counting results? (listOfPropertiesInDataset is not Jena
code.)

 Andy



   On Tuesday, January 27, 2015 8:50 PM, Andy Seaborne a...@apache.org 
wrote:


   On 27/01/15 17:32, Qiaser Mehmood wrote:

What could be the reason for the difference in results
(listOfPropertiesInDataset) for the same query run on two different engines,
e.g. Fuseki and Sesame? I dumped the Kegg data into Fuseki and Sesame, and
when I run the following query the results vary.

PREFIX void: <http://rdfs.org/ns/void#>
CONSTRUCT { <datasetUri> void:propertyPartition ?pUri .
            ?pUri void:property ?p . }
WHERE { ?s ?p ?o .
        BIND(IRI(CONCAT(STR(<baseUri>), MD5(STR(?p)))) AS ?pUri) }

In Fuseki it returns 42 and in Sesame it returns 740444.
Best, Qaiser.



I guess there are 42 different predicates in the data.

SELECT (count(distinct ?p) AS ?count ) { ?s ?p ?o }

Jena returns a model, a set of triples.  Set means no duplicates.

It looks like you are using the form of execution in Sesame that returns
an iterator over a stream of triples.  No suppression of duplicates.

In your query:

PREFIX  void: <http://rdfs.org/ns/void#>

CONSTRUCT
 { <http://example/base/datasetUri> void:propertyPartition ?pUri .
   ?pUri void:property ?p . }
WHERE
 { ?s ?p ?o
   BIND(iri(concat(str(<http://example/base/baseUri>), MD5(str(?p))))
        AS ?pUri)
 }

Your query has massive duplicates - it projects out ?s and ?o.

Many ?s ?p ?o, few distinct ?p

Try this:

WHERE
 { SELECT DISTINCT ?p ?pUri {
     ?s ?p ?o
     BIND(iri(concat(str(<http://example/base/baseUri>), MD5(str(?p))))
          AS ?pUri)
   }
 }


   Andy















Re: Time series (energy data) in jena

2015-01-28 Thread Claude Warren
Ashley,

I worked for the National Renewable Energy Laboratory in Golden, CO, USA
several years ago.  They were doing lots of work with linked open data and
energy -- you might find some good information there (http://www.nrel.gov/)

As for time series data, I think that there was a group at the Digital
Enterprise Research Institute that was using data cube schema to do time
series, but I could be misremembering that.

Finally, as to changing the object of a triple: that is possible if you use
an rdf:Seq (Seq.set()) or an rdf:List (RDFList.replace()).  However, that
may throw a wrench into your schema plans.  The most common way is to
simply delete and insert (or insert and delete).

I can't think of a reason not to put the data into Jena.


Fwd: Time series (energy data) in jena

2015-01-28 Thread Ashley Davison-White
Hi all

I'm currently looking at the feasibility of storing time series data in
jena (TDB); specifically energy data. In fact, right now I'm looking for a
reason not to!-  so far, my research has shown me it is possible but there
seems to be a lack of experimentation in doing so.

I'm wondering if anyone is aware of previous research or projects? And if
there are any potential advantages/disadvantages?

From my limited experience, in-place updates are not possible, so storing a
rollup of data (i.e. an entity of readings per day, rather than per minute)
is not possible; each reading would need to be its own triple. With a
large data set, I can see this being a problem - especially with the
increased verbosity of triple data vs a traditional time series database.
However, at small scale, I don't see this as a problem.

I'm interested to hear opinions on the topic.

Regards,
- Ashley


Re: Fwd: How can I replace TDB dataset with virtuoso dataset

2015-01-28 Thread Andy Seaborne

http://stackoverflow.com/questions/27958212/is-it-possible-to-add-virtuoso-as-a-storage-provider-in-jena-fuseki/27966848#27966848

On 28/01/15 07:27, Nauman Ramzan wrote:

Hey all !
I am working on fuseki and I want to use virtuoso graph instead of TDB

Here are my requirements

1 :- Save all record in virtuoso
2 :- Use SolrIndexer for index So that i can also use text:query in
Fuseki...


Doesn't Virtuoso have its own SPARQL HTTP server built-in?

It has its own text indexing.



I am using virt_jena2.jar and virtjdbc4_1.jar in my fuseki project

Here is My code


What is this code trying to do?
Where it is running?



Dataset ds = (Dataset) Assembler.general.open(datasetDesc);
/* Virtuoso code */

datasetDesc = ((Resource) getOne(datasetDesc, "text:gdatabase"));
if (datasetDesc.getPropertyResourceValue(RDF.type)
        .equals(VirtuosoDatasetVocab.tDataset)) {
    String jdbcurl = getOne(datasetDesc, "fuvirtext:jdbcURL").toString();
    String user = getOne(datasetDesc, "fuvirtext:user").toString();
    String password = getOne(datasetDesc, "fuvirtext:password").toString();
    String graphName = "";
    Boolean readAllGraphs = false;
    if (datasetDesc.hasProperty(VirtuosoDatasetVocab.pgraphName)) {
        graphName = getOne(datasetDesc, "fuvirtext:graphName").toString();
    }
    if (datasetDesc.hasProperty(VirtuosoDatasetVocab.preadAllGraphs)) {
        readAllGraphs = getOne(datasetDesc, "fuvirtext:readAllGraphs")
                .asLiteral().getBoolean();
    }

    VirtuosoStore vstore;

    if (!graphName.isEmpty()) {
        vstore = new VirtuosoStore(jdbcurl, user, password, graphName,
                readAllGraphs);
    } else {
        vstore = new VirtuosoStore(jdbcurl, user, password, readAllGraphs);
    }

    DatasetGraph vg = vstore.getDatasetGraph();
    DatasetGraph dg = ds.asDatasetGraph();

    return null;
    /* sDesc.dataset = vstore.getDatasetGraph();
       sDesc.dataset = ds.asDatasetGraph(); */
} else {
    sDesc.dataset = ds.asDatasetGraph();
}





Re: results vary for the same query on same dataset for different engine

2015-01-28 Thread Andy Seaborne

On 28/01/15 10:49, Qiaser Mehmood wrote:

Thanks Andy, I forgot to mention that I am using Jena to query both Fuseki
and Sesame; moreover, I dumped the same data into both stores.
So you mean that the result difference over the same data is due to the
particular engine, which returns either duplicates (i.e. Sesame) or a set
with no duplicates (i.e. Fuseki).
Thanks, Qaiser.


So what does

SELECT (count(distinct ?p) AS ?count ) { ?s ?p ?o }

return in each case?


And how are you counting results? (listOfPropertiesInDataset is not Jena
code.)


Andy



  On Tuesday, January 27, 2015 8:50 PM, Andy Seaborne a...@apache.org 
wrote:


  On 27/01/15 17:32, Qiaser Mehmood wrote:

What could be the reason for the difference in results
(listOfPropertiesInDataset) for the same query run on two different engines,
e.g. Fuseki and Sesame? I dumped the Kegg data into Fuseki and Sesame, and
when I run the following query the results vary.

PREFIX void: <http://rdfs.org/ns/void#>
CONSTRUCT { <datasetUri> void:propertyPartition ?pUri .
            ?pUri void:property ?p . }
WHERE { ?s ?p ?o .
        BIND(IRI(CONCAT(STR(<baseUri>), MD5(STR(?p)))) AS ?pUri) }

In Fuseki it returns 42 and in Sesame it returns 740444.
Best, Qaiser.



I guess there are 42 different predicates in the data.

SELECT (count(distinct ?p) AS ?count ) { ?s ?p ?o }

Jena returns a model, a set of triples.  Set means no duplicates.

It looks like you are using the form of execution in Sesame that returns
an iterator over a stream of triples.  No suppression of duplicates.

In your query:

PREFIX  void: <http://rdfs.org/ns/void#>

CONSTRUCT
   { <http://example/base/datasetUri> void:propertyPartition ?pUri .
     ?pUri void:property ?p . }
WHERE
   { ?s ?p ?o
     BIND(iri(concat(str(<http://example/base/baseUri>), MD5(str(?p))))
          AS ?pUri)
   }

Your query has massive duplicates - it projects out ?s and ?o.

Many ?s ?p ?o, few distinct ?p

Try this:

WHERE
   { SELECT DISTINCT ?p ?pUri {
       ?s ?p ?o
       BIND(iri(concat(str(<http://example/base/baseUri>), MD5(str(?p))))
            AS ?pUri)
     }
   }


 Andy









Re: results vary for the same query on same dataset for different engine

2015-01-28 Thread Qiaser Mehmood
The query SELECT (count(distinct ?p) AS ?count ) { ?s ?p ?o } returns 21 in
both cases.  listOfPropertiesInDataset is just the name of the actual query
which I execute and store in a model: mdl = qry.execConstruct().
However, if I run the following code and get the triple count for that query:

qry = QueryExecutionFactory.sparqlService(endpoint, query);
int count = 0;
Iterator<Triple> triples = qry.execConstructTriples();
while (triples.hasNext()) {
    triples.next();
    count++;
}
System.out.print("Triples count value is " + count);

The count value is different for Fuseki (42) and Sesame (740444), although
the data is the same in both stores. What could be a reason for this difference?

 

 On Wednesday, January 28, 2015 11:13 AM, Andy Seaborne a...@apache.org 
wrote:
   

 On 28/01/15 10:49, Qiaser Mehmood wrote:
 Thanks Andy, I forgot to mention that I am using Jena to query both Fuseki
 and Sesame; moreover, I dumped the same data into both stores.
 So you mean that the result difference over the same data is due to the
 particular engine, which returns either duplicates (i.e. Sesame) or a set
 with no duplicates (i.e. Fuseki).
 Thanks, Qaiser.

So what does

SELECT (count(distinct ?p) AS ?count ) { ?s ?p ?o }

return in each case?


And how are you counting results? (listOfPropertiesInDataset is not Jena
code.)

    Andy


      On Tuesday, January 27, 2015 8:50 PM, Andy Seaborne a...@apache.org 
wrote:


  On 27/01/15 17:32, Qiaser Mehmood wrote:
 What could be the reason for the difference in results
 (listOfPropertiesInDataset) for the same query run on two different engines,
 e.g. Fuseki and Sesame? I dumped the Kegg data into Fuseki and Sesame, and
 when I run the following query the results vary.

 PREFIX void: <http://rdfs.org/ns/void#>
 CONSTRUCT { <datasetUri> void:propertyPartition ?pUri .
             ?pUri void:property ?p . }
 WHERE { ?s ?p ?o .
         BIND(IRI(CONCAT(STR(<baseUri>), MD5(STR(?p)))) AS ?pUri) }

 In Fuseki it returns 42 and in Sesame it returns 740444.
 Best, Qaiser.


 I guess there are 42 different predicates in the data.

 SELECT (count(distinct ?p) AS ?count ) { ?s ?p ?o }

 Jena returns a model, a set of triples.  Set means no duplicates.

 It looks like you are using the form of execution in Sesame that returns
 an iterator over a stream of triples.  No suppression of duplicates.

 In your query:

 PREFIX  void: <http://rdfs.org/ns/void#>

 CONSTRUCT
    { <http://example/base/datasetUri> void:propertyPartition ?pUri .
      ?pUri void:property ?p . }
 WHERE
    { ?s ?p ?o
      BIND(iri(concat(str(<http://example/base/baseUri>), MD5(str(?p))))
           AS ?pUri)
    }

 Your query has massive duplicates - it projects out ?s and ?o.

 Many ?s ?p ?o, few distinct ?p

 Try this:

 WHERE
    { SELECT DISTINCT ?p ?pUri {
        ?s ?p ?o
        BIND(iri(concat(str(<http://example/base/baseUri>), MD5(str(?p))))
             AS ?pUri)
      }
    }


      Andy








   

Re: Turtle parser fails on CONSTRUCT query result

2015-01-28 Thread Lorenz Bühmann

Hello Andy,

first of all, thanks for the answer. I added answers to your comments 
inline below.




Comments inline and at the end ...

On 27/01/15 10:57, Lorenz Bühmann wrote:

Hello,

when I run the SPARQL query on the DBpedia endpoint
http://dbpedia.org/sparql

CONSTRUCT {
  <http://dbpedia.org/resource/Leipzig> ?p0 ?o0 .
}
WHERE {
  <http://dbpedia.org/resource/Leipzig> ?p0 ?o0 .
}


by using the code


String query = "CONSTRUCT {\n" +
    "<http://dbpedia.org/resource/Trey_Parker> ?p0 ?o0.\n" +
    "?o0 ?p1 ?o1.\n" +
    "}\n" +
    "WHERE {\n" +
    "<http://dbpedia.org/resource/Trey_Parker> ?p0 ?o0.\n" +
    "OPTIONAL{\n" +
    "?o0 ?p1 ?o1.\n" +
    "}}";
com.hp.hpl.jena.sparql.engine.http.QueryEngineHTTP qe = new
    com.hp.hpl.jena.sparql.engine.http.QueryEngineHTTP(
        "http://dbpedia.org/sparql", query);
qe.setDefaultGraphURIs(Collections.singletonList("http://dbpedia.org"));
Model model = qe.execConstruct();
qe.close();


I get an exception thrown by the Turtle parser:

11:48:30,550 ErrorHandlerFactory$ErrorLogger - [line: 263, col: 45] Bad
IRI: http://th.dbpedia.org/resource/หมวดหมู่:ผู้กำกับภาพยนตร์ชาว อเมริกัน
Code: 47/NOT_NFKC in PATH: The IRI is not in Unicode Normal Form KC.


This is a warning - the parser emits the data and continues ...

(I'm somewhat tempted to turn the NF tests off - while strictly
correct, few people worry about or understand NF - feedback welcome).


From my point of view the warnings are quite confusing, although I
usually tend to ignore such warnings.





11:48:30,553 ErrorHandlerFactory$ErrorLogger - [line: 263, col: 45] Bad
IRI: http://th.dbpedia.org/resource/หมวดหมู่:ผู้กำกับภาพยนตร์ชาว อเมริกัน
Code: 56/COMPATIBILITY_CHARACTER in PATH: TODO
11:48:30,557 ErrorHandlerFactory$ErrorLogger - [line: 288, col: 45] Bad
IRI:
http://zh_min_nan.dbpedia.org/resource/Category:Bí-kok_tiān-iáⁿ_tō-ián
Code: 47/NOT_NFKC in PATH: The IRI is not in Unicode Normal Form KC.
11:48:30,557 ErrorHandlerFactory$ErrorLogger - [line: 288, col: 45] Bad
IRI:
http://zh_min_nan.dbpedia.org/resource/Category:Bí-kok_tiān-iáⁿ_tō-ián
Code: 56/COMPATIBILITY_CHARACTER in PATH: TODO
11:48:30,574 ErrorHandlerFactory$ErrorLogger - [line: 440, col: 13] Bad
IRI: http://th.dbpedia.org/resource/หมวดหมู่:ผู้อำนวยการสร้างรายการ โทรทัศน์
ชาวอเมริกัน Code: 47/NOT_NFKC in PATH: The IRI is not in Unicode Normal
Form KC.
11:48:30,575 ErrorHandlerFactory$ErrorLogger - [line: 440, col: 13] Bad
IRI: http://th.dbpedia.org/resource/หมวดหมู่:ผู้อำนวยการสร้างรายการ โทรทัศน์
ชาวอเมริกัน Code: 56/COMPATIBILITY_CHARACTER in PATH: TODO


and now we have a real error.

What's line 513? (You can get the response by using curl or wget).

Well, from what I can see line 513 contains

ns56:Лауреати_премії_«Еммі» ,

so I guess the char « is unknown for some reason.



11:48:30,584 ErrorHandlerFactory$ErrorLogger - [line: 513, col: 24]
Unknown char: «(171;0x00AB)


The actual error comes from looking for a new Turtle token and not
finding a start-of-token marker or a digit.  So it assumes a
prefix name (which does not start with an identifying character).

It might be badly written data (some unescaped significant character
earlier in the triple).  It's a structural problem with the data sent back.

Ok, so the DBpedia endpoint (i.e. Virtuoso) seems to return some illegal
structural data. Probably I'll have to file an issue or at least ask on
their mailing list.


(Hmm - the stack trace does not seem to quite agree with the current 
codebase.  What version are you running?)

I used JENA ARQ 2.11.2, but now updated to

JENA ARQ 2.12.1
JENA Core 2.12.1
JENA IRI 1.1.1

The stacktrace seems to be the same as before:

WARN - [line: 263, col: 45] Bad IRI: http://th.dbpedia.org/resource 
/หมวดหมู่:ผู้กำกับภาพยนตร์ชาว อเมริกัน Code: 47/NOT_NFKC in PATH: The IRI is 
not in Unicode Normal Form KC.
WARN - [line: 263, col: 45] Bad IRI: http://th.dbpedia.org/resource 
/หมวดหมู่:ผู้กำกับภาพยนตร์ชาว อเมริกัน Code: 56/COMPATIBILITY_CHARACTER in 
PATH: TODO
WARN - [line: 288, col: 45] Bad IRI: 
http://zh_min_nan.dbpedia.org/resource/Category:Bí-kok_tiān-iáⁿ_tō-ián 
Code: 47/NOT_NFKC in PATH: The IRI is not in Unicode Normal Form KC.
WARN - [line: 288, col: 45] Bad IRI: 
http://zh_min_nan.dbpedia.org/resource/Category:Bí-kok_tiān-iáⁿ_tō-ián 
Code: 56/COMPATIBILITY_CHARACTER in PATH: TODO
WARN - [line: 440, col: 13] Bad IRI: http://th.dbpedia.org/resource 
/หมวดหมู่:ผู้อำนวยการสร้างรายการ โทรทัศน์ชาวอเมริกัน Code: 47/NOT_NFKC in PATH: 
The IRI is not in Unicode Normal Form KC.
WARN - [line: 440, col: 13] Bad IRI: http://th.dbpedia.org/resource 
/หมวดหมู่:ผู้อำนวยการสร้างรายการ โทรทัศน์ชาวอเมริกัน Code: 
56/COMPATIBILITY_CHARACTER in PATH: TODO

ERROR - [line: 513, col: 24] Unknown char: «(171;0x00AB)
Exception in thread main org.apache.jena.riot.RiotException: [line: 
513, col: 24] Unknown char: «(171;0x00AB)
at 

Re: results vary for the same query on same dataset for different engine

2015-01-28 Thread Qiaser Mehmood
Thanks Andy, I forgot to mention that I am using Jena to query both Fuseki
and Sesame; moreover, I dumped the same data into both stores.
So you mean that the result difference over the same data is due to the
particular engine, which returns either duplicates (i.e. Sesame) or a set
with no duplicates (i.e. Fuseki).
Thanks, Qaiser.

 On Tuesday, January 27, 2015 8:50 PM, Andy Seaborne a...@apache.org 
wrote:
   

 On 27/01/15 17:32, Qiaser Mehmood wrote:
 What could be the reason for the difference in results
 (listOfPropertiesInDataset) for the same query run on two different engines,
 e.g. Fuseki and Sesame? I dumped the Kegg data into Fuseki and Sesame, and
 when I run the following query the results vary.

 PREFIX void: <http://rdfs.org/ns/void#>
 CONSTRUCT { <datasetUri> void:propertyPartition ?pUri .
             ?pUri void:property ?p . }
 WHERE { ?s ?p ?o .
         BIND(IRI(CONCAT(STR(<baseUri>), MD5(STR(?p)))) AS ?pUri) }

 In Fuseki it returns 42 and in Sesame it returns 740444.
 Best, Qaiser.


I guess there are 42 different predicates in the data.

SELECT (count(distinct ?p) AS ?count ) { ?s ?p ?o }

Jena returns a model, a set of triples.  Set means no duplicates.

It looks like you are using the form of execution in Sesame that returns
an iterator over a stream of triples.  No suppression of duplicates.

In your query:

PREFIX  void: <http://rdfs.org/ns/void#>

CONSTRUCT
  { <http://example/base/datasetUri> void:propertyPartition ?pUri .
    ?pUri void:property ?p . }
WHERE
  { ?s ?p ?o
    BIND(iri(concat(str(<http://example/base/baseUri>), MD5(str(?p))))
         AS ?pUri)
  }

Your query has massive duplicates - it projects out ?s and ?o.

Many ?s ?p ?o, few distinct ?p

Try this:

WHERE
  { SELECT DISTINCT ?p ?pUri {
      ?s ?p ?o
      BIND(iri(concat(str(<http://example/base/baseUri>), MD5(str(?p))))
           AS ?pUri)
    }
  }


    Andy



   

Forward RETE and redundant deduction

2015-01-28 Thread Sébastien Boulet [intactile DESIGN]
Hello,

I have two rules which could produce the same triple:

String rules = "[r1: (?a eg:p ?b) -> (?a, eg:q, ?b)]" +
               "[r2: (?a eg:r ?b) -> (?a, eg:q, ?b)]";

i have configured a GenericRuleReasoner in FORWARD_RETE mode.

GenericRuleReasoner reasoner = new 
GenericRuleReasoner(Rule.parseRules(rules));
reasoner.setMode(GenericRuleReasoner.FORWARD_RETE);
InfModel model = ModelFactory.createInfModel(reasoner, 
ModelFactory.createDefaultModel());

When a triple satisfies the first rule and another triple satisfies the second rule:

 Resource subject = model.createResource();
Property predicateP = model.getProperty("urn:x-hp:eg/p");
Literal literalA = model.createTypedLiteral("A");
Property predicateR = model.getProperty("urn:x-hp:eg/r");

model.add(subject, predicateP, literalA);
model.add(subject, predicateR, literalA);

only one triple is deduced:

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:j.0="urn:x-hp:eg/" >
  <rdf:Description rdf:nodeID="A0">
    <j.0:r rdf:datatype="http://www.w3.org/2001/XMLSchema#string">A</j.0:r>
    <j.0:p rdf:datatype="http://www.w3.org/2001/XMLSchema#string">A</j.0:p>
    <j.0:q rdf:datatype="http://www.w3.org/2001/XMLSchema#string">A</j.0:q>
  </rdf:Description>
</rdf:RDF>

When I remove the first triple:

model.remove(subject, predicateP, literalA);

 the sole deduced triple is removed even though the second rule is still satisfied:

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:j.0="urn:x-hp:eg/" >
  <rdf:Description rdf:nodeID="A0">
    <j.0:r rdf:datatype="http://www.w3.org/2001/XMLSchema#string">A</j.0:r>
  </rdf:Description>
</rdf:RDF>

Is it the expected behavior?
Is there a workaround to deduce the same triple twice, or at least to not
remove the sole deduction?

Thanks


Sébastien BOULET
LEAD DEVELOPER

intactile DESIGN
Création d’interfaces + subtiles
04 67 52 88 61
09 50 12 05 66
20 rue du carré du roi
34000 MONTPELLIER
France
www.intactile.com http://intactile.com/

The information contained in this email and its attachments is confidential. It
is addressed exclusively to the recipients explicitly named above and may not be
disclosed without the consent of its author. If you are not the intended
recipient of this email, you must not use, disclose, distribute, copy, print or
forward its contents, and you must destroy it immediately.



Very slow SPARQL query on TDB

2015-01-28 Thread Laurent Rucquoy
Hello,

I have a Java application which implements an object model persisted
through JenaBean in my Jena TDB (see the attached image of the class
diagram).

The request to retrieve an ImageAnnotation resource from the ID of a linked
Image is very slow.
Here is a typical SPARQL query used (more than 40s to get the result):



PREFIX base: <http://www.telemis.com/>
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX XMLS: <http://www.w3.org/2001/XMLSchema#>

SELECT ?x
{
  ?x a base:ImageAnnotation ;
     base:deleted false ;
     base:images ?seq .
  ?seq ?p ?image .
  ?image a base:Image .
  ?image base:sopInstanceUID
    "1.2.840.113564.10656621.201302121438403281.1003000225002"^^XMLS:string .
}



Can you help me find what I'm doing wrong?

Thank you in advance.

Sincerely,
Laurent


Re: Best way to synchronize across graphs

2015-01-28 Thread Trevor Donaldson
Trying to use DatasetAccessor. I am getting the following error. Where
should I start troubleshooting this error? Is this a problem with my
config.ttl file? I am trying to run the following commands:

DatasetAccessor datasetAccessor =
    DatasetAccessorFactory.createHTTP("http://localhost:3030/ds");
Model model = datasetAccessor.getModel("http://example.org/#serviceA");


Exception in thread "main" org.apache.jena.atlas.web.HttpException: 403 -
Forbidden: SPARQL Graph Store Protocol : Read operation : GET
at org.apache.jena.riot.web.HttpOp.exec(HttpOp.java:1118)
at org.apache.jena.riot.web.HttpOp.execHttpGet(HttpOp.java:385)
at org.apache.jena.riot.web.HttpOp.execHttpGet(HttpOp.java:354)
at
org.apache.jena.web.DatasetGraphAccessorHTTP.doGet(DatasetGraphAccessorHTTP.java:134)
at
org.apache.jena.web.DatasetGraphAccessorHTTP.httpGet(DatasetGraphAccessorHTTP.java:128)
at org.apache.jena.web.DatasetAdapter.getModel(DatasetAdapter.java:47)
at com.security.examples.FusekiExample.main(FusekiExample.java:13)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)




Re: Best way to synchronize across graphs

2015-01-28 Thread Martynas Jusevičius
HTTP error code 403 means the client does not have access to the
requested resource:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.4
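One common cause worth checking (a sketch, not a confirmed diagnosis of this setup): DatasetAccessorFactory.createHTTP expects the Graph Store Protocol endpoint, which for a Fuseki 1.x service named /ds is typically /ds/data rather than the dataset URL itself. The endpoint names below are assumptions taken from the stock Fuseki configuration and must match the services in your config.ttl:

```java
import com.hp.hpl.jena.query.DatasetAccessor;
import com.hp.hpl.jena.query.DatasetAccessorFactory;
import com.hp.hpl.jena.rdf.model.Model;

public class GspEndpointExample {
    public static void main(String[] args) {
        // Point the accessor at the GSP endpoint ("/ds/data"), not the bare
        // dataset URL ("/ds"); the latter can yield 403/404 depending on the
        // server configuration.
        DatasetAccessor accessor =
            DatasetAccessorFactory.createHTTP("http://localhost:3030/ds/data");
        Model model = accessor.getModel("http://example.org/#serviceA");
        System.out.println("Graph size: " + model.size());
    }
}
```

If the endpoint is right and the 403 persists, the server-side service configuration (which operations are enabled) is the next thing to inspect.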


Re: Best way to synchronize across graphs

2015-01-28 Thread Trevor Donaldson
Right, I know that, but what in Fuseki is making it return a 403? I am using
an in-memory dataset: --update --mem


Re: Forward RETE and redundant deduction

2015-01-28 Thread Dave Reynolds

On 28/01/15 14:06, Christophe FAGOT [intactile DESIGN] wrote:

Hi Andy,

Thanks for your answer, and I'm OK with the graph being a set of triples; it is
the (very good) reason why only one triple is produced. But the reasoner is not
in forward mode: it is in forward-RETE mode, which means that the forward rules
work incrementally, allowing triples to be added and removed while maintaining
the consistency of the model.

So in the case described by Sébastien, the forward-RETE engine should not remove
the inferred triple, since another rule still has its body terms satisfied. At
the least, that rule should have been fired so it could be told that the triple
it did not create earlier (because it was already in the graph) is about to be
removed, and could produce it again.


The RETE engine stops once a triple has been deduced by one route. If
you attempt to track each possible route by which a triple could be
deduced and reference-count them all, then you get a combinatorial
explosion in the number of possible deduction paths and performance
plummets (which is why naive truth maintenance never worked out).


The Jena engine works around this by not attempting to handle removals
incrementally at all. A remove is supposed to mark the model as needing
a new prepare stage, and the entire deduction process is run from
scratch the next time you query the model.


That certainly used to work and I can't see why Sébastien's case would 
fail, though I don't see the code by which the results are getting 
accessed. I'm not in a position to test it from here.


Dave


Chris.

Christophe FAGOT, PhD
HEAD OF SOFTWARE R&D

intactile DESIGN
Création d’interfaces + subtiles
+33 (0)4 67 52 88 61
+33 (0)9 50 12 05 66
20 rue du carré du roi
34000 MONTPELLIER
France
www.intactile.com http://intactile.com/

Hugh MacLeod: "It's not what the software does, it's what the user does"



On 28 Jan 2015, at 12:17, Andy Seaborne a...@apache.org wrote:

(Dave is not around at the moment so I'll try to answer some parts of your 
question ...)

On 28/01/15 10:28, Sébastien Boulet [intactile DESIGN] wrote:

Hello,

I have two rules which could produce the same triple:

 String rules = "[r1: (?a eg:p ?b) -> (?a eg:q ?b)]" +
                "[r2: (?a eg:r ?b) -> (?a eg:q ?b)]";

I have configured a GenericRuleReasoner in FORWARD_RETE mode.

 GenericRuleReasoner reasoner = new 
GenericRuleReasoner(Rule.parseRules(rules));
 reasoner.setMode(GenericRuleReasoner.FORWARD_RETE);
 InfModel model = ModelFactory.createInfModel(reasoner, 
ModelFactory.createDefaultModel());

When one triple satisfies the first rule and another triple satisfies the second rule:

 Resource subject = model.createResource();
 Property predicateP = model.getProperty("urn:x-hp:eg/p");
 Literal literalA = model.createTypedLiteral("A");
 Property predicateR = model.getProperty("urn:x-hp:eg/r");

 model.add(subject, predicateP, literalA);
 model.add(subject, predicateR, literalA);

only one triple is deduced:


An RDF graph is a set of triples.

A set only has one of each thing in it.

If you
add(triple)
add(triple)

you will see only one triple in the output.  This is not to do with inference, 
it is to do with an RDF graph being a set.
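Andy's point can be illustrated with plain Java set semantics (a sketch: the string below is a stand-in for a triple, not Jena's Triple class):

```java
import java.util.HashSet;
import java.util.Set;

public class SetSemantics {
    public static void main(String[] args) {
        // An RDF graph behaves like a Set of triples.
        Set<String> graph = new HashSet<>();
        String deduced = "_:A0 eg:q \"A\"";
        graph.add(deduced); // deduced via rule r1
        graph.add(deduced); // deduced again via rule r2: the set keeps one copy
        System.out.println(graph.size()); // prints 1
    }
}
```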



<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:j.0="urn:x-hp:eg/">
  <rdf:Description rdf:nodeID="A0">
    <j.0:r rdf:datatype="http://www.w3.org/2001/XMLSchema#string">A</j.0:r>
    <j.0:p rdf:datatype="http://www.w3.org/2001/XMLSchema#string">A</j.0:p>
    <j.0:q rdf:datatype="http://www.w3.org/2001/XMLSchema#string">A</j.0:q>
  </rdf:Description>
</rdf:RDF>

When I remove the first triple:

 model.remove(subject, predicateP, literalA);

the sole deduced triple is removed, even though the second rule is still satisfied:


You ran the reasoner in forward mode - it computed all deductions at the start
and then does not run again until you ask it to.

To trigger it again:

InfModel.rebind()
causes the inference model to reconsult the underlying data to take into
account changes.

Or run in backward mode.

Andy
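A sketch of that workaround, reusing Sébastien's setup (Jena 2.x `com.hp.hpl.jena` packages assumed on the classpath; `eg:` resolves to the urn:x-hp:eg/ namespace from his example): after the remove, call rebind() so all deductions are recomputed from the remaining base triples.

```java
import com.hp.hpl.jena.rdf.model.InfModel;
import com.hp.hpl.jena.rdf.model.Literal;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.rdf.model.Property;
import com.hp.hpl.jena.rdf.model.Resource;
import com.hp.hpl.jena.reasoner.rulesys.GenericRuleReasoner;
import com.hp.hpl.jena.reasoner.rulesys.Rule;

public class RebindExample {
    public static void main(String[] args) {
        String rules = "[r1: (?a eg:p ?b) -> (?a eg:q ?b)]" +
                       "[r2: (?a eg:r ?b) -> (?a eg:q ?b)]";
        GenericRuleReasoner reasoner =
            new GenericRuleReasoner(Rule.parseRules(rules));
        reasoner.setMode(GenericRuleReasoner.FORWARD_RETE);
        InfModel model = ModelFactory.createInfModel(
            reasoner, ModelFactory.createDefaultModel());

        Resource subject = model.createResource();
        Property p = model.getProperty("urn:x-hp:eg/p");
        Property q = model.getProperty("urn:x-hp:eg/q");
        Property r = model.getProperty("urn:x-hp:eg/r");
        Literal a = model.createTypedLiteral("A");

        model.add(subject, p, a);
        model.add(subject, r, a);

        model.remove(subject, p, a);
        // Recompute all deductions from the current base data:
        model.rebind();

        // r2 should still fire, so the eg:q deduction is expected to survive.
        System.out.println(model.contains(subject, q, a));
    }
}
```

rebind() re-runs the whole forward deduction, so on large models it is a full recompute rather than an incremental retraction.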



<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:j.0="urn:x-hp:eg/">
  <rdf:Description rdf:nodeID="A0">
    <j.0:r rdf:datatype="http://www.w3.org/2001/XMLSchema#string">A</j.0:r>
  </rdf:Description>
</rdf:RDF>

Is this the expected behaviour?
Is there a workaround to deduce the same triple twice, or at least not to
remove the sole deduction?

Re: Very slow SPARQL query on TDB

2015-01-28 Thread Andy Seaborne

On 28/01/15 18:34, Milorad Tosic wrote:

Hi Laurent,
I would give a try to a different sequencing in the query. For example:

PREFIX base: <http://www.telemis.com/>
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX XMLS: <http://www.w3.org/2001/XMLSchema#>

SELECT ?x
{
  ?image base:sopInstanceUID
    "1.2.840.113564.10656621.201302121438403281.1003000225002"^^XMLS:string .
  ?image a base:Image .
  ?seq ?p ?image .
  ?x base:images ?seq .
  ?x a base:ImageAnnotation ;
     base:deleted false .
}

Though, it may or may not help.

Regards,
Milorad



Which version of TDB? 2.11.2 had possibly related fixes.

https://issues.apache.org/jira/browse/JENA-685

If you do take Milorad's suggestion, also put in a none.opt file to stop 
TDB reordering your improved order into a worse one.


http://jena.apache.org/documentation/tdb/optimizer.html#choosing-the-optimizer-strategy
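Concretely (a sketch; /tmp/tdb-demo stands in for your actual TDB database directory): the optimizer strategy is chosen by a marker file in the database directory, and an empty none.opt tells TDB to execute the basic graph pattern in the order written in the query.

```shell
# TDB looks for a strategy marker file in the database directory:
# fixed.opt, none.opt, or the generated stats.opt. An empty none.opt
# disables reordering entirely.
DB="${TDB_DIR:-/tmp/tdb-demo}"   # hypothetical database location
mkdir -p "$DB"
touch "$DB/none.opt"
ls "$DB"
```

Remember to remove the marker again if you later generate statistics and want the stats-based optimizer back.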

Andy

PS Attachments don't come through this list.


Re: Very slow SPARQL query on TDB

2015-01-28 Thread Milorad Tosic
Hi Laurent,
I would give a try to a different sequencing in the query. For example:

PREFIX base: <http://www.telemis.com/>
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX XMLS: <http://www.w3.org/2001/XMLSchema#>

SELECT ?x
{
  ?image base:sopInstanceUID
    "1.2.840.113564.10656621.201302121438403281.1003000225002"^^XMLS:string .
  ?image a base:Image .
  ?seq ?p ?image .
  ?x base:images ?seq .
  ?x a base:ImageAnnotation ;
     base:deleted false .
}

Though, it may or may not help.

Regards,
Milorad

 


Re: Time series (energy data) in jena

2015-01-28 Thread Ashley Davison-White
Hi Rurik

Thanks for your reply. The number of triples is also a concern for me.
Do you remember whether this was a matter of entity size (the verbosity of
RDF) or of query efficiency?

I too am leaning towards using a time-series database and mixing in results
via an API at query level, but I lack a decent platform or database to
experiment with.

- Ashley

On 28 January 2015 at 15:31, Rurik Thomas Greenall rurik.green...@gmail.com
 wrote:

 Hi Ashley,

 I worked for a large Norwegian oil company on sensor readings in relation
 to time-series data from data historians.

 For numerous reasons (including the sheer number of triples), we planned to
 move away from the idea of storing the data directly as RDF, but rather
 mapped data from the historians to RDF on-the-fly, providing a simple REST
 interface to serve the RDF. The PoC for this included data about the
 sensors and the measurements taken as well as links to previous/subsequent
 measurements.

 I played with the idea of requesting period series via the interface as
 well as single instants.

 The PoC worked well enough to be used in a real-time 3D visualisation of
 the subsea template, but I'm not sure how this ended up as I ended my
 contract before the project was completed.

 Regards,

 Rurik

 On Wed, Jan 28, 2015 at 11:02 AM, Ashley Davison-White adw...@gmail.com
 wrote:

  Hi all
 
  I'm currently looking at the feasibility of storing time series data in
  Jena (TDB); specifically energy data. In fact, right now I'm looking for a
  reason not to! So far, my research has shown me it is possible, but there
  seems to be a lack of experimentation in doing so.
 
  I'm wondering if anyone is aware of previous research or projects? And if
  there are any potential advantages/disadvantages?
 
  From my limited experience, in-place updates are not possible, so storing a
  rollup of data (i.e. an entity of readings per day, rather than per minute)
  is not possible; each reading would need to be its own tuple. With a large
  data set, I can see this being a problem - especially with the increased
  verbosity of triple data vs a traditional time-series database. However,
  for small scale, I don't see this as a problem.
 
  I'm interested to hear opinions on the topic.
 
  Regards,
  - Ashley
 



Best way to synchronize across graphs

2015-01-28 Thread Trevor Donaldson
Hi all,

What would be the best way to update a TDB store behind Fuseki? I have a
standalone app that needs to update (delete and insert) statements as well
as insert new statements. I was thinking that I could use the Jena API with
an in-memory model and then somehow send the deletes first to Fuseki, then
send the inserts to Fuseki. Not exactly sure how to accomplish this. Is
this possible?

Thanks,
Trevor


Re: Best way to synchronize across graphs

2015-01-28 Thread Rob Vesse
Via Fuseki

Attempting to do anything that bypasses Fuseki and accesses the TDB data
directory directly is ill-advised, shouldn't work (there is process level
locking on TDB data directories) and is highly likely to corrupt your data.

One option would be to use SPARQL updates to supply your changes.

However, if your changes involve complex graph manipulations best done with
the Model API, AND they only apply to a specific named graph, then you could
use the Graph Store Protocol to replace an existing named model - see
DatasetAccessor, which is the Jena API to the protocol, and its methods such
as getModel() and putModel().

Rob







Re: Best way to synchronize across graphs

2015-01-28 Thread Trevor Donaldson
I would prefer to use the Model API - there is reification, deletion,
inserting, etc. I will take a look at the DatasetAccessor. I am only
applying changes to a named graph.
