Re: Turtle parser fails on CONSTRUCT query result
On 28/01/15 10:31, Lorenz Bühmann wrote: Hello Andy, first of all, thanks for the answer. I added answers to your comments inline below.

Comments inline and at the end ...

... This is a warning - the parser emits the data and continues ... (I'm somewhat tempted to turn the NF tests off - while strictly correct, few people worry about or understand NF - feedback welcome).

From my point of view the warnings are quite confusing, although I usually tend to ignore such kinds of warnings.

Very true. In Unicode you can write the same thing in different ways, especially with accented characters. You can have a single code point for the letter with the accent, or the code point for the letter followed by a combining accent (a modifier that acts on the character before).

... and now we have a real error. What's line 513? (You can get the response by using curl or wget).

Well, from what I can see line 513 contains ns56:Лауреати_премії_«Еммі» , so I guess the char « is unknown for some reason.
11:48:30,584 ErrorHandlerFactory$ErrorLogger - [line: 513, col: 24] Unknown char: «(171;0x00AB)

Yes. « is not legal in a prefix name. The actual error comes from looking for a new Turtle token and not finding a start-of-token marker like '<', a quote, or a digit. So it assumes a prefix name (which does not start with an identifying character). It might be badly written data (some unescaped significant character earlier in the triple). It's a structural problem with the data sent back.

Ok, so the DBpedia endpoint aka Virtuoso seems to return some illegal structural data. Probably I'll have to file an issue or at least ask on their mailing list.

Yes. This is not a data problem. The other end (DBpedia) should not send illegal Turtle ever.

(Hmm - the stack trace does not seem to quite agree with the current codebase. What version are you running?)

I used JENA ARQ 2.11.2, but now updated to JENA ARQ 2.12.1, JENA Core 2.12.1, JENA IRI 1.1.1. The stacktrace seems to be the same as before:

Thanks. 2.11.2 should be OK - I didn't know the code had moved about that much so I suspected a much older version.

Andy
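Following Andy's suggestion of fetching the raw response (as curl or wget would) to see what is actually on line 513, here is a minimal Java sketch using only the JDK. The endpoint URL, query, and default graph come from the thread; the choice of HttpURLConnection and the text/turtle Accept header are just one way to do it, not how ARQ itself issues the request.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.net.URLEncoder;
    import java.nio.charset.StandardCharsets;

    public class DumpConstructResponse {
        public static void main(String[] args) throws Exception {
            String query = "CONSTRUCT { <http://dbpedia.org/resource/Trey_Parker> ?p0 ?o0 . ?o0 ?p1 ?o1 . } "
                    + "WHERE { <http://dbpedia.org/resource/Trey_Parker> ?p0 ?o0 . OPTIONAL { ?o0 ?p1 ?o1 . } }";
            String url = "http://dbpedia.org/sparql"
                    + "?default-graph-uri=" + URLEncoder.encode("http://dbpedia.org", "UTF-8")
                    + "&query=" + URLEncoder.encode(query, "UTF-8");
            HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
            conn.setRequestProperty("Accept", "text/turtle");   // ask the endpoint for Turtle
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
                String line;
                int n = 0;
                while ((line = in.readLine()) != null) {
                    n++;
                    System.out.println(n + ": " + line);        // line 513 is the one the parser rejects
                }
            }
        }
    }

Saving the numbered output and looking at line 513 (and a few lines before it) shows exactly what Virtuoso sent back.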
Re: Forward RETE and redundant deduction
Hi Andy, thanks for your answer, and I'm ok with the graph being a set of triples; it is the (very good) reason explaining why only one triple is produced. But the reasoner is not in forward mode. It is in forward-RETE mode, which means that the forward rules have to work incrementally, allowing triples to be added and removed while maintaining the consistency of the model. So in the case described by Sébastien, the forward-RETE engine should not remove the inferred triple, since another rule still has its body terms satisfied. At the very least, that rule should have been fired, to tell it that the triple it did not create previously (because it was already in the graph) is going to be removed, so that it can produce it again.

Chris. Christophe FAGOT, PhD - intactile DESIGN - www.intactile.com http://intactile.com/

On 28 Jan 2015, at 12:17, Andy Seaborne a...@apache.org wrote: (Dave is not around at the moment so I'll try to answer some parts of your question ...)

On 28/01/15 10:28, Sébastien Boulet [intactile DESIGN] wrote: Hello, I have two rules which could produce the same triple:

String rules = "[r1: (?a eg:p ?b) -> (?a, eg:q, ?b)]" + "[r2: (?a eg:r ?b) -> (?a, eg:q, ?b)]";

I have configured a GenericRuleReasoner in FORWARD_RETE mode.

GenericRuleReasoner reasoner = new GenericRuleReasoner(Rule.parseRules(rules));
reasoner.setMode(GenericRuleReasoner.FORWARD_RETE);
InfModel model = ModelFactory.createInfModel(reasoner, ModelFactory.createDefaultModel());

When one triple satisfies the first rule and another triple satisfies the second rule:

Resource subject = model.createResource();
Property predicateP = model.getProperty("urn:x-hp:eg/p");
Literal literalA = model.createTypedLiteral("A");
Property predicateR = model.getProperty("urn:x-hp:eg/r");
model.add(subject, predicateP, literalA);
model.add(subject, predicateR, literalA);

only one triple is deduced:

An RDF graph is a set of triples. A set only has one of each thing in it. If you add(triple) add(triple) you will see only one triple in the output. This is not to do with inference, it is to do with an RDF graph being a set.

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:j.0="urn:x-hp:eg/">
  <rdf:Description rdf:nodeID="A0">
    <j.0:r rdf:datatype="http://www.w3.org/2001/XMLSchema#string">A</j.0:r>
    <j.0:p rdf:datatype="http://www.w3.org/2001/XMLSchema#string">A</j.0:p>
    <j.0:q rdf:datatype="http://www.w3.org/2001/XMLSchema#string">A</j.0:q>
  </rdf:Description>
</rdf:RDF>

When I remove the first triple:

model.remove(subject, predicateP, literalA);

the sole deduced triple is removed even though the second rule is still satisfied:

You ran the reasoner in forward mode - it includes all deductions at the start and then does not run again until you ask it to. To trigger it again: InfModel.rebind() - "Cause the inference model to reconsult the underlying data to take into account changes." Or run in backward mode.
Andy

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:j.0="urn:x-hp:eg/">
  <rdf:Description rdf:nodeID="A0">
    <j.0:r rdf:datatype="http://www.w3.org/2001/XMLSchema#string">A</j.0:r>
  </rdf:Description>
</rdf:RDF>

Is it the expected behavior? Is there a workaround to deduce the same triple twice, or at least to not remove the sole deduction? Thanks

Sébastien BOULET - intactile DESIGN - www.intactile.com http://intactile.com/
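A minimal sketch of the workaround Andy points at, reusing the rules and properties from Sébastien's message: the only addition is the InfModel.rebind() call after the removal, which makes the engine reconsult the underlying data so the eg:q triple is deduced again via r2. (Writing the result as Turtle is just for inspection.)

    import com.hp.hpl.jena.rdf.model.*;
    import com.hp.hpl.jena.reasoner.rulesys.GenericRuleReasoner;
    import com.hp.hpl.jena.reasoner.rulesys.Rule;

    public class RebindAfterRemove {
        public static void main(String[] args) {
            String rules = "[r1: (?a eg:p ?b) -> (?a, eg:q, ?b)]"
                         + "[r2: (?a eg:r ?b) -> (?a, eg:q, ?b)]";
            GenericRuleReasoner reasoner = new GenericRuleReasoner(Rule.parseRules(rules));
            reasoner.setMode(GenericRuleReasoner.FORWARD_RETE);
            InfModel model = ModelFactory.createInfModel(reasoner, ModelFactory.createDefaultModel());

            Resource subject = model.createResource();
            Property predicateP = model.getProperty("urn:x-hp:eg/p");
            Property predicateR = model.getProperty("urn:x-hp:eg/r");
            Literal literalA = model.createTypedLiteral("A");
            model.add(subject, predicateP, literalA);
            model.add(subject, predicateR, literalA);

            model.remove(subject, predicateP, literalA);
            model.rebind();                        // re-run the deductions over the remaining data
            model.write(System.out, "TURTLE");     // eg:q should still be present, deduced via r2
        }
    }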
Re: Time series (energy data) in jena
Hi Ashley, I worked for a large Norwegian oil company on sensor readings in relation to time-series data from data historians. For numerous reasons (including the sheer number of triples), we planned to move away from the idea of storing the data directly as RDF, but rather mapped data from the historians to RDF on-the-fly, providing a simple REST interface to serve the RDF. The PoC for this included data about the sensors and the measurements taken as well as links to previous/subsequent measurements. I played with the idea of requesting period series via the interface as well as single instants. The PoC worked well enough to be used in a real-time 3D visualisation of the subsea template, but I'm not sure how this ended up as I ended my contract before the project was completed. Regards, Rurik On Wed, Jan 28, 2015 at 11:02 AM, Ashley Davison-White adw...@gmail.com wrote: Hi all I'm currently looking at the feasibility of storing time series data in jena (TDB); specifically energy data. In fact, right now I'm looking for a reason not to!- so far, my research has shown me it is possible but there seems to be a lack of experimentation in doing so. I'm wondering if anyone is aware of previous research or projects? And if there are any potential advantages/disadvantages? From my limited experience, in-place updates are not possible, so storing a rollup of data (i.e. an entity of readings per day, rather than per minute) is not possible; so each reading would need to be it's own tuple. With a large data set, I can see this being a problem - especially with the increased verboseness of triple data vs a traditional time series database. However, for small scale, I don't see this as a problem. I'm interested to hear opinions on the topic. Regards, - Ashley
Fwd: Fwd: How can I replace TDB dataset with virtuoso dataset
Thanks for the reply. http://stackoverflow.com/questions/27958212/is-it-possible-to-add-virtuoso-as-a-storage-provider-in-jena-fuseki/27966848#27966848 I saw this link already but I can not understand it clearly.

Doesn't Virtuoso have its own SPARQL HTTP server built-in?

Virtuoso provides a SPARQL HTTP server, BUT there I can not use the jena text:query() function...

It has its own text indexing.

I am using a Solr index because I read that Solr is the BEST text indexer ever...

What is this code trying to do?

I am trying to create a Virtuoso dataset graph instead of a TDB dataset graph.

Where is it running?

I wrote this code in the apache-fuseki project, org.apache.jena.fuseki.server package, fusekiConfig.java file.

My requirements: The reason for using Virtuoso is to store triples as partial/partitioned storage (as far as I know, in TDB we can not do that). The reason for using Fuseki is that I want to use the jena text:query() function, so that I can use the Solr index and Virtuoso (instead of TDB) in the same project... Nauman
Re: results vary for the same query on same dataset for different engine
On 28/01/15 12:28, Qiaser Mehmood wrote: The query SELECT (count(distinct ?p) AS ?count ) { ?s ?p ?o } returns 21 in both cases. listOfPropertiesInDataset is just the name of the actual query which I execute and store in a model: mdl = qry.execConstruct() However, if I run the following code and get the triples count for that query:

qry = QueryExecutionFactory.sparqlService(endpoint, query);
int count = 0;
Iterator<Triple> triples = qry.execConstructTriples();
while(triples.hasNext()){ triples.next(); count++; }
System.out.print("Triples count value is " + count);

The count value is different: 42 for Fuseki and 740444 for Sesame, although the data is the same in both stores. What could be the reason for this difference?

execConstructTriples is an important detail. Try execConstruct and model.size(). It should be 42 in each case. execConstructTriples passes back the low-level stream of triples received. It may include duplicates if the server sends duplicates.

There are 21 unique predicates in your data. CONSTRUCT { <datasetUri> void:propertyPartition ?pUri . ?pUri void:property ?p . } is 42 triples in a set of triples: 2 for each ?p (?pUri is calculated from ?p). Note *set*. Sets do not show duplicates.

The default thing you are querying in Jena sends back a stream of triples from a set - duplicates have been suppressed. Sesame does not: it sends back the result of each template instantiation for every match of the pattern. No duplicate suppression has been done. Look at the start of the Sesame triple stream of results - I would expect to see repeated triples.

There are 740444/2 = 370222 triples in your data. 370222 matches of ?s ?p ?o, so 370222 matches of the WHERE clause, but your construct template does not depend on ?s or ?o. The same ?p and ?pUri from different matches of ?s ?p ?o come up over and over again. Hence Sesame returns 740444 triples with many duplicates. Compare SELECT ?p ?pUri WHERE {...} and SELECT DISTINCT ?p ?pUri WHERE {...} - projecting out just ?p gives 370222 rows with 21 distinct values.

This data creates an RDF graph of one triple:
@prefix : <http://example/> .
:s :p "abc" .
:s :p "abc" .
:s :p "abc" .

Andy

PS I think Sesame may have changed this behaviour in recent versions, at least I recall some discussion.

On Wednesday, January 28, 2015 11:13 AM, Andy Seaborne a...@apache.org wrote: On 28/01/15 10:49, Qiaser Mehmood wrote: Thanks Andy, I forgot to mention that I am using Jena to query both Fuseki and Sesame; moreover I dumped the same data into both stores. So you mean that the result difference over the same data is due to the particular engine, which returns either duplicates (i.e. Sesame) or a set with no duplicates (i.e. Fuseki). Thanks, Qaiser. So what does SELECT (count(distinct ?p) AS ?count ) { ?s ?p ?o } return in each case? And how are you counting results? (listOfPropertiesInDataset is not Jena code). Andy

On Tuesday, January 27, 2015 8:50 PM, Andy Seaborne a...@apache.org wrote: On 27/01/15 17:32, Qiaser Mehmood wrote: What could be the reason for the results (listOfPropertiesInDataset) differing for the same query run on two different engines, e.g. Fuseki and Sesame? I dumped the Kegg data into Fuseki and Sesame and when I run the following query the results vary. PREFIX void: <http://rdfs.org/ns/void#> CONSTRUCT { <datasetUri> void:propertyPartition ?pUri . ?pUri void:property ?p . } WHERE { ?s ?p ?o . BIND(IRI(CONCAT(STR(<baseUri>), MD5(STR(?p)))) AS ?pUri) } In Fuseki it returns 42 and in Sesame it returns back 740444. Best, Qaiser. I guess there are 42 different predicates in the data.
SELECT (count(distinct ?p) AS ?count ) { ?s ?p ?o }

Jena returns a model, a set of triples. Set means no duplicates. It looks like you are using the form of execution in Sesame that returns an iterator (stream) of triples. No suppression of duplicates.

In your query:

PREFIX void: <http://rdfs.org/ns/void#>
CONSTRUCT { <http://example/base/datasetUri> void:propertyPartition ?pUri . ?pUri void:property ?p . }
WHERE { ?s ?p ?o BIND(iri(concat(str(<http://example/base/baseUri>), MD5(str(?p)))) AS ?pUri) }

Your query has massive duplicates - it projects out ?s and ?o. Many ?s ?p ?o, few distinct ?p. Try this:

WHERE { SELECT DISTINCT ?p ?pUri { ?s ?p ?o BIND(iri(concat(str(<http://example/base/baseUri>), MD5(str(?p)))) AS ?pUri) } }

Andy
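To see the difference Andy describes from the client side, here is a small sketch comparing the deduplicated Model from execConstruct() with a raw count from execConstructTriples(). The endpoint URL is a placeholder and the query is the one from the thread (with the base URIs reconstructed); everything else is standard ARQ.

    import java.util.Iterator;
    import com.hp.hpl.jena.graph.Triple;
    import com.hp.hpl.jena.query.QueryExecution;
    import com.hp.hpl.jena.query.QueryExecutionFactory;
    import com.hp.hpl.jena.rdf.model.Model;

    public class ConstructCounts {
        public static void main(String[] args) {
            String endpoint = "http://localhost:3030/ds/query";   // placeholder endpoint
            String query = "PREFIX void: <http://rdfs.org/ns/void#> "
                    + "CONSTRUCT { <http://example/base/datasetUri> void:propertyPartition ?pUri . "
                    + "            ?pUri void:property ?p . } "
                    + "WHERE { ?s ?p ?o "
                    + "        BIND(IRI(CONCAT(STR(<http://example/base/baseUri>), MD5(STR(?p)))) AS ?pUri) }";

            // 1. execConstruct() builds a Model, i.e. a set of triples - duplicates are collapsed.
            QueryExecution qe1 = QueryExecutionFactory.sparqlService(endpoint, query);
            Model m = qe1.execConstruct();
            System.out.println("model.size()           = " + m.size());
            qe1.close();

            // 2. execConstructTriples() streams the raw triples as received - duplicates may appear,
            //    depending on what the server sends.
            QueryExecution qe2 = QueryExecutionFactory.sparqlService(endpoint, query);
            Iterator<Triple> it = qe2.execConstructTriples();
            long raw = 0;
            while (it.hasNext()) { it.next(); raw++; }
            System.out.println("execConstructTriples() = " + raw);
            qe2.close();
        }
    }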
Re: Time series (energy data) in jena
Ashley, I worked for the National Renewable Energy Laboratory in Golden, CO, USA several years ago. They were doing lots of work with linked open data and energy -- you might find some good information there (http://www.nrel.gov/). As for time series data, I think that there was a group at the Digital Enterprise Research Institute that was using the data cube schema to do time series, but I could be misremembering that. Finally, as to changing the object of a triple: that is possible if you use an rdf:Seq (Seq.set()) or an RDFList (RDFList.replace()). However, that may throw a wrench into your schema plans. The most common way is to simply delete and insert (or insert and delete), as sketched below. I can't think of a reason not to put the data into Jena.
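Since the usual approach mentioned above is delete-and-insert, a minimal Model API sketch of replacing a reading's value in place; the resource and property URIs are invented for the example, and the same pattern works against any Jena Model, including a TDB-backed one.

    import com.hp.hpl.jena.rdf.model.Model;
    import com.hp.hpl.jena.rdf.model.ModelFactory;
    import com.hp.hpl.jena.rdf.model.Property;
    import com.hp.hpl.jena.rdf.model.Resource;

    public class ReplaceReadingValue {
        public static void main(String[] args) {
            Model model = ModelFactory.createDefaultModel();
            Resource reading = model.createResource("http://example.org/reading/42");    // hypothetical
            Property value   = model.createProperty("http://example.org/vocab#value");   // hypothetical

            model.addLiteral(reading, value, 17.4);

            // "Update" = remove the old statement(s), then add the new one.
            model.removeAll(reading, value, null);
            model.addLiteral(reading, value, 18.1);

            model.write(System.out, "TURTLE");
        }
    }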
Fwd: Time series (energy data) in jena
Hi all I'm currently looking at the feasibility of storing time series data in jena (TDB); specifically energy data. In fact, right now I'm looking for a reason not to!- so far, my research has shown me it is possible but there seems to be a lack of experimentation in doing so. I'm wondering if anyone is aware of previous research or projects? And if there are any potential advantages/disadvantages? From my limited experience, in-place updates are not possible, so storing a rollup of data (i.e. an entity of readings per day, rather than per minute) is not possible; so each reading would need to be it's own tuple. With a large data set, I can see this being a problem - especially with the increased verboseness of triple data vs a traditional time series database. However, for small scale, I don't see this as a problem. I'm interested to hear opinions on the topic. Regards, - Ashley
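As a rough illustration of what "each reading as its own set of triples" might look like, here is a sketch that records one timestamped measurement per reading resource. The vocabulary URIs are invented for the example; the point is simply that one reading expands to several triples, which is where the verbosity concern comes from.

    import com.hp.hpl.jena.datatypes.xsd.XSDDatatype;
    import com.hp.hpl.jena.rdf.model.Model;
    import com.hp.hpl.jena.rdf.model.ModelFactory;
    import com.hp.hpl.jena.rdf.model.Property;
    import com.hp.hpl.jena.rdf.model.Resource;

    public class EnergyReading {
        public static void main(String[] args) {
            Model model = ModelFactory.createDefaultModel();
            String ns = "http://example.org/energy#";          // hypothetical vocabulary

            Property meterProp = model.createProperty(ns, "meter");
            Property timestamp = model.createProperty(ns, "timestamp");
            Property kwh       = model.createProperty(ns, "kWh");

            Resource meter   = model.createResource("http://example.org/meter/7");
            Resource reading = model.createResource("http://example.org/reading/2015-01-28T10-15");
            reading.addProperty(meterProp, meter)
                   .addProperty(timestamp, model.createTypedLiteral("2015-01-28T10:15:00Z", XSDDatatype.XSDdateTime))
                   .addLiteral(kwh, 0.42);

            model.write(System.out, "TURTLE");   // three triples for this one reading
        }
    }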
Re: Fwd: How can I replace TDB dataset with virtuoso dataset
http://stackoverflow.com/questions/27958212/is-it-possible-to-add-virtuoso-as-a-storage-provider-in-jena-fuseki/27966848#27966848

On 28/01/15 07:27, Nauman Ramzan wrote: Hey all! I am working on Fuseki and I want to use a Virtuoso graph instead of TDB. Here are my requirements: 1 :- Save all records in Virtuoso 2 :- Use SolrIndexer for indexing, so that I can also use text:query in Fuseki...

Doesn't Virtuoso have its own SPARQL HTTP server built-in? It has its own text indexing.

I am using virt_jena2.jar and virtjdbc4_1.jar in my fuseki project. Here is my code:

What is this code trying to do? Where is it running?

Dataset ds = (Dataset) Assembler.general.open(datasetDesc); /* Virtuoso Code */ datasetDesc = ((Resource)getOne(datasetDesc, text:gdatabase)); if (datasetDesc.getPropertyResourceValue(RDF.type).equals(VirtuosoDatasetVocab.tDataset)) { String jdbcurl = getOne(datasetDesc, fuvirtext:jdbcURL).toString(); String user = getOne(datasetDesc, fuvirtext:user).toString(); String password = getOne(datasetDesc, fuvirtext:password).toString(); String graphName = ""; Boolean readAllGraphs = false; if (datasetDesc.hasProperty(VirtuosoDatasetVocab.pgraphName)) { graphName = getOne(datasetDesc, fuvirtext:graphName).toString(); } if (datasetDesc.hasProperty(VirtuosoDatasetVocab.preadAllGraphs)) { readAllGraphs = getOne(datasetDesc, fuvirtext:readAllGraphs).asLiteral().getBoolean(); } VirtuosoStore vstore; if (!graphName.isEmpty()) { vstore = new VirtuosoStore(jdbcurl, user, password, graphName, readAllGraphs);} else { vstore = new VirtuosoStore(jdbcurl, user, password, readAllGraphs); } DatasetGraph vg = vstore.getDatasetGraph(); DatasetGraph dg = ds.asDatasetGraph(); return null; /*sDesc.dataset = vstore.getDatasetGraph(); sDesc.dataset = ds.asDatasetGraph();*/ }else{ sDesc.dataset = ds.asDatasetGraph(); }
Re: results vary for the same query on same dataset for different engine
On 28/01/15 10:49, Qiaser Mehmood wrote: Thanks Andy, I forgot to mention that I am using Jena to query both the Fuseki and Sesame, moreover I dumped the same data in both store. So you mean that result difference over same data is due to the particular engine which return either duplicate (i.e. sesame) and set with no duplicate (i.e. Fuseki). Thanks,Qaiser. So what does SELECT (count(distinct ?p) AS ?count ) { ?s ?p ?o } return in each case? And how are you counting results (listOfPropertiesInDataset is not Jena code). Andy On Tuesday, January 27, 2015 8:50 PM, Andy Seaborne a...@apache.org wrote: On 27/01/15 17:32, Qiaser Mehmood wrote: What could be the reason of results (listOfPropertiesInDataset) difference for the same query which runs on two different engine e.g. fuseki and sesame. I dumped the Kegg data into fuseki and sesame and when I run the following query the results vary. PREFIX void: http://rdfs.org/ns/void# CONSTRUCT { datasetUri void:propertyPartition ?pUri . ?pUri void:property ?p . } WHERE { ?s ?p ?o . BIND(IRI(CONCAT(STR(baseUri),MD5(STR(?p AS ?pUri)} In fuseki it returns 42 and in sesame it returns back 740444 Best,Qaiser. I guess there are 42 different predicates in the data. SELECT (count(distinct ?p) AS ?count ) { ?s ?p /o } Jena returns a model, a set of triples. Set means no duplicates. It looks liek you are using the form of execution in Sesame that returns an iterator of stream of triples. No suppression of duplicates. In your query: PREFIX void: http://rdfs.org/ns/void# CONSTRUCT { http://example/base/datasetUri void:propertyPartition ?pUri . ?pUri void:property ?p .} WHERE { ?s ?p ?o BIND(iri(concat(str(http://example/base/baseUri), MD5(str(?p AS ?pUri) } Your query has massive duplicates - it projects out ?s and ?o.. Many ?s ?p ?o, few distinct ?p Try this: WHERE { SELECT DISTINCT ?p ?pUri { ?s ?p ?o BIND(iri(concat(str(http://example/base/baseUri), MD5(str(?p AS ?pUri) } } Andy
Re: results vary for the same query on same dataset for different engine
The query SELECT (count(distinct ?p) AS ?count ) { ?s ?p ?o } returns 21 in both cases. listOfPropertiesInDataset is just the name of the actual query which I execute and store in a model: mdl = qry.execConstruct() However, if I run the following code and get the triples count for that query:

qry = QueryExecutionFactory.sparqlService(endpoint, query);
int count = 0;
Iterator<Triple> triples = qry.execConstructTriples();
while(triples.hasNext()){ triples.next(); count++; }
System.out.print("Triples count value is " + count);

The count value is different: 42 for Fuseki and 740444 for Sesame, although the data is the same in both stores. What could be the reason for this difference?

On Wednesday, January 28, 2015 11:13 AM, Andy Seaborne a...@apache.org wrote: On 28/01/15 10:49, Qiaser Mehmood wrote: Thanks Andy, I forgot to mention that I am using Jena to query both Fuseki and Sesame; moreover I dumped the same data into both stores. So you mean that the result difference over the same data is due to the particular engine, which returns either duplicates (i.e. Sesame) or a set with no duplicates (i.e. Fuseki). Thanks, Qaiser. So what does SELECT (count(distinct ?p) AS ?count ) { ?s ?p ?o } return in each case? And how are you counting results? (listOfPropertiesInDataset is not Jena code). Andy

On Tuesday, January 27, 2015 8:50 PM, Andy Seaborne a...@apache.org wrote: On 27/01/15 17:32, Qiaser Mehmood wrote: What could be the reason for the results (listOfPropertiesInDataset) differing for the same query run on two different engines, e.g. Fuseki and Sesame? I dumped the Kegg data into Fuseki and Sesame and when I run the following query the results vary. PREFIX void: <http://rdfs.org/ns/void#> CONSTRUCT { <datasetUri> void:propertyPartition ?pUri . ?pUri void:property ?p . } WHERE { ?s ?p ?o . BIND(IRI(CONCAT(STR(<baseUri>), MD5(STR(?p)))) AS ?pUri) } In Fuseki it returns 42 and in Sesame it returns back 740444. Best, Qaiser. I guess there are 42 different predicates in the data. SELECT (count(distinct ?p) AS ?count ) { ?s ?p ?o } Jena returns a model, a set of triples. Set means no duplicates. It looks like you are using the form of execution in Sesame that returns an iterator (stream) of triples. No suppression of duplicates. In your query: PREFIX void: <http://rdfs.org/ns/void#> CONSTRUCT { <http://example/base/datasetUri> void:propertyPartition ?pUri . ?pUri void:property ?p . } WHERE { ?s ?p ?o BIND(iri(concat(str(<http://example/base/baseUri>), MD5(str(?p)))) AS ?pUri) } Your query has massive duplicates - it projects out ?s and ?o. Many ?s ?p ?o, few distinct ?p. Try this: WHERE { SELECT DISTINCT ?p ?pUri { ?s ?p ?o BIND(iri(concat(str(<http://example/base/baseUri>), MD5(str(?p)))) AS ?pUri) } } Andy
Re: Turtle parser fails on CONSTRUCT query result
Hello Andy, first of all, thanks for the answer. I added answers to your comments inline below.

Comments inline and at the end ...

On 27/01/15 10:57, Lorenz Bühmann wrote: Hello, when I run the SPARQL query on the DBpedia endpoint http://dbpedia.org/sparql

CONSTRUCT { <http://dbpedia.org/resource/Leipzig> ?p0 ?o0. } WHERE { <http://dbpedia.org/resource/Leipzig> ?p0 ?o0. }

by using the code

String query = "CONSTRUCT {\n" + "<http://dbpedia.org/resource/Trey_Parker> ?p0 ?o0.\n" + "?o0 ?p1 ?o1.\n" + "}\n" + "WHERE {\n" + "<http://dbpedia.org/resource/Trey_Parker> ?p0 ?o0.\n" + "OPTIONAL{\n" + "?o0 ?p1 ?o1.\n" + "}}";
com.hp.hpl.jena.sparql.engine.http.QueryEngineHTTP qe = new com.hp.hpl.jena.sparql.engine.http.QueryEngineHTTP("http://dbpedia.org/sparql", query);
qe.setDefaultGraphURIs(Collections.singletonList("http://dbpedia.org"));
Model model = qe.execConstruct();
qe.close();

I get an exception thrown by the Turtle parser:

11:48:30,550 ErrorHandlerFactory$ErrorLogger - [line: 263, col: 45] Bad IRI: http://th.dbpedia.org/resource/หมวดหมู่:ผู้กำกับภาพยนตร์ชาว อเมริกัน Code: 47/NOT_NFKC in PATH: The IRI is not in Unicode Normal Form KC.

This is a warning - the parser emits the data and continues ... (I'm somewhat tempted to turn the NF tests off - while strictly correct, few people worry about or understand NF - feedback welcome).

From my point of view the warnings are quite confusing, although I usually tend to ignore such kinds of warnings.

11:48:30,553 ErrorHandlerFactory$ErrorLogger - [line: 263, col: 45] Bad IRI: http://th.dbpedia.org/resource/หมวดหมู่:ผู้กำกับภาพยนตร์ชาว อเมริกัน Code: 56/COMPATIBILITY_CHARACTER in PATH: TODO
11:48:30,557 ErrorHandlerFactory$ErrorLogger - [line: 288, col: 45] Bad IRI: http://zh_min_nan.dbpedia.org/resource/Category:Bí-kok_tiān-iáⁿ_tō-ián Code: 47/NOT_NFKC in PATH: The IRI is not in Unicode Normal Form KC.
11:48:30,557 ErrorHandlerFactory$ErrorLogger - [line: 288, col: 45] Bad IRI: http://zh_min_nan.dbpedia.org/resource/Category:Bí-kok_tiān-iáⁿ_tō-ián Code: 56/COMPATIBILITY_CHARACTER in PATH: TODO
11:48:30,574 ErrorHandlerFactory$ErrorLogger - [line: 440, col: 13] Bad IRI: http://th.dbpedia.org/resource/หมวดหมู่:ผู้อำนวยการสร้างรายการ โทรทัศน์ ชาวอเมริกัน Code: 47/NOT_NFKC in PATH: The IRI is not in Unicode Normal Form KC.
11:48:30,575 ErrorHandlerFactory$ErrorLogger - [line: 440, col: 13] Bad IRI: http://th.dbpedia.org/resource/หมวดหมู่:ผู้อำนวยการสร้างรายการ โทรทัศน์ ชาวอเมริกัน Code: 56/COMPATIBILITY_CHARACTER in PATH: TODO

and now we have a real error. What's line 513? (You can get the response by using curl or wget).

Well, from what I can see line 513 contains ns56:Лауреати_премії_«Еммі» , so I guess the char « is unknown for some reason.

11:48:30,584 ErrorHandlerFactory$ErrorLogger - [line: 513, col: 24] Unknown char: «(171;0x00AB)

The actual error comes from looking for a new Turtle token and not finding a start-of-token marker like '<', a quote, or a digit. So it assumes a prefix name (which does not start with an identifying character). It might be badly written data (some unescaped significant character earlier in the triple). It's a structural problem with the data sent back.

Ok, so the DBpedia endpoint aka Virtuoso seems to return some illegal structural data. Probably I'll have to file an issue or at least ask on their mailing list.

(Hmm - the stack trace does not seem to quite agree with the current codebase. What version are you running?)
I used JENA ARQ 2.11.2, but now updated to JENA ARQ 2.12.1 JENA Core 2.12.1 JENA IRI 1.1.1 The stacktrace seems to be the same as before: WARN - [line: 263, col: 45] Bad IRI: http://th.dbpedia.org/resource /หมวดหมู่:ผู้กำกับภาพยนตร์ชาว อเมริกัน Code: 47/NOT_NFKC in PATH: The IRI is not in Unicode Normal Form KC. WARN - [line: 263, col: 45] Bad IRI: http://th.dbpedia.org/resource /หมวดหมู่:ผู้กำกับภาพยนตร์ชาว อเมริกัน Code: 56/COMPATIBILITY_CHARACTER in PATH: TODO WARN - [line: 288, col: 45] Bad IRI: http://zh_min_nan.dbpedia.org/resource/Category:Bí-kok_tiān-iáⁿ_tō-ián Code: 47/NOT_NFKC in PATH: The IRI is not in Unicode Normal Form KC. WARN - [line: 288, col: 45] Bad IRI: http://zh_min_nan.dbpedia.org/resource/Category:Bí-kok_tiān-iáⁿ_tō-ián Code: 56/COMPATIBILITY_CHARACTER in PATH: TODO WARN - [line: 440, col: 13] Bad IRI: http://th.dbpedia.org/resource /หมวดหมู่:ผู้อำนวยการสร้างรายการ โทรทัศน์ชาวอเมริกัน Code: 47/NOT_NFKC in PATH: The IRI is not in Unicode Normal Form KC. WARN - [line: 440, col: 13] Bad IRI: http://th.dbpedia.org/resource /หมวดหมู่:ผู้อำนวยการสร้างรายการ โทรทัศน์ชาวอเมริกัน Code: 56/COMPATIBILITY_CHARACTER in PATH: TODO ERROR - [line: 513, col: 24] Unknown char: «(171;0x00AB) Exception in thread main org.apache.jena.riot.RiotException: [line: 513, col: 24] Unknown char: «(171;0x00AB) at
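If the application has to keep working against endpoints that occasionally send unparsable Turtle, one option (a sketch of client-side handling, not a fix for the underlying data) is to catch the RiotException shown in the stack trace above and decide how to proceed; the query string is the one from the thread.

    import com.hp.hpl.jena.rdf.model.Model;
    import com.hp.hpl.jena.sparql.engine.http.QueryEngineHTTP;
    import org.apache.jena.riot.RiotException;

    public class ConstructWithFallback {
        public static void main(String[] args) {
            String query = "CONSTRUCT { <http://dbpedia.org/resource/Trey_Parker> ?p0 ?o0 . ?o0 ?p1 ?o1 . } "
                    + "WHERE { <http://dbpedia.org/resource/Trey_Parker> ?p0 ?o0 . OPTIONAL { ?o0 ?p1 ?o1 . } }";
            QueryEngineHTTP qe = new QueryEngineHTTP("http://dbpedia.org/sparql", query);
            try {
                Model model = qe.execConstruct();
                System.out.println("Parsed " + model.size() + " triples");
            } catch (RiotException e) {
                // The endpoint returned Turtle the parser rejects; report it and carry on.
                System.err.println("Unparsable response from endpoint: " + e.getMessage());
            } finally {
                qe.close();
            }
        }
    }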
Re: results vary for the same query on same dataset for different engine
Thanks Andy, I forgot to mention that I am using Jena to query both Fuseki and Sesame; moreover I dumped the same data into both stores. So you mean that the result difference over the same data is due to the particular engine, which returns either duplicates (i.e. Sesame) or a set with no duplicates (i.e. Fuseki). Thanks, Qaiser.

On Tuesday, January 27, 2015 8:50 PM, Andy Seaborne a...@apache.org wrote: On 27/01/15 17:32, Qiaser Mehmood wrote: What could be the reason for the results (listOfPropertiesInDataset) differing for the same query run on two different engines, e.g. Fuseki and Sesame? I dumped the Kegg data into Fuseki and Sesame and when I run the following query the results vary. PREFIX void: <http://rdfs.org/ns/void#> CONSTRUCT { <datasetUri> void:propertyPartition ?pUri . ?pUri void:property ?p . } WHERE { ?s ?p ?o . BIND(IRI(CONCAT(STR(<baseUri>), MD5(STR(?p)))) AS ?pUri) } In Fuseki it returns 42 and in Sesame it returns back 740444. Best, Qaiser. I guess there are 42 different predicates in the data. SELECT (count(distinct ?p) AS ?count ) { ?s ?p ?o } Jena returns a model, a set of triples. Set means no duplicates. It looks like you are using the form of execution in Sesame that returns an iterator (stream) of triples. No suppression of duplicates. In your query: PREFIX void: <http://rdfs.org/ns/void#> CONSTRUCT { <http://example/base/datasetUri> void:propertyPartition ?pUri . ?pUri void:property ?p . } WHERE { ?s ?p ?o BIND(iri(concat(str(<http://example/base/baseUri>), MD5(str(?p)))) AS ?pUri) } Your query has massive duplicates - it projects out ?s and ?o. Many ?s ?p ?o, few distinct ?p. Try this: WHERE { SELECT DISTINCT ?p ?pUri { ?s ?p ?o BIND(iri(concat(str(<http://example/base/baseUri>), MD5(str(?p)))) AS ?pUri) } } Andy
Forward RETE and redundant deduction
Hello, I have two rules which could produce the same triple:

String rules = "[r1: (?a eg:p ?b) -> (?a, eg:q, ?b)]" + "[r2: (?a eg:r ?b) -> (?a, eg:q, ?b)]";

I have configured a GenericRuleReasoner in FORWARD_RETE mode.

GenericRuleReasoner reasoner = new GenericRuleReasoner(Rule.parseRules(rules));
reasoner.setMode(GenericRuleReasoner.FORWARD_RETE);
InfModel model = ModelFactory.createInfModel(reasoner, ModelFactory.createDefaultModel());

When one triple satisfies the first rule and another triple satisfies the second rule:

Resource subject = model.createResource();
Property predicateP = model.getProperty("urn:x-hp:eg/p");
Literal literalA = model.createTypedLiteral("A");
Property predicateR = model.getProperty("urn:x-hp:eg/r");
model.add(subject, predicateP, literalA);
model.add(subject, predicateR, literalA);

only one triple is deduced:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:j.0="urn:x-hp:eg/">
  <rdf:Description rdf:nodeID="A0">
    <j.0:r rdf:datatype="http://www.w3.org/2001/XMLSchema#string">A</j.0:r>
    <j.0:p rdf:datatype="http://www.w3.org/2001/XMLSchema#string">A</j.0:p>
    <j.0:q rdf:datatype="http://www.w3.org/2001/XMLSchema#string">A</j.0:q>
  </rdf:Description>
</rdf:RDF>

When I remove the first triple:

model.remove(subject, predicateP, literalA);

the sole deduced triple is removed even though the second rule is still satisfied:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:j.0="urn:x-hp:eg/">
  <rdf:Description rdf:nodeID="A0">
    <j.0:r rdf:datatype="http://www.w3.org/2001/XMLSchema#string">A</j.0:r>
  </rdf:Description>
</rdf:RDF>

Is it the expected behavior? Is there a workaround to deduce the same triple twice, or at least to not remove the sole deduction? Thanks

Sébastien BOULET - intactile DESIGN - www.intactile.com http://intactile.com/
Very slow SPARQL query on TDB
Hello, I have a Java application which implements an object model persisted through JenaBean in my Jena TDB (see the attached image of the classes diagram). The request to retrieve an ImageAnnotation resource from the ID of a linked Image is very slow. Here is a typical SPARQL query used (more than 40s to get the result):

PREFIX base: <http://www.telemis.com/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX XMLS: <http://www.w3.org/2001/XMLSchema#>
SELECT ?x {
  ?x a base:ImageAnnotation ;
     base:deleted false ;
     base:images ?seq .
  ?seq ?p ?image .
  ?image a base:Image .
  ?image base:sopInstanceUID "1.2.840.113564.10656621.201302121438403281.1003000225002"^^XMLS:string .
}

Can you help me to find what I'm doing wrong? Thank you in advance. Sincerely, Laurent
Re: Best way to synchonize across graphs
Trying to use DatasetAccessor. I am getting the following error. Where could I start to troubleshoot this error? Is this a problem with my config.ttl file? I am trying to run the following command:

DatasetAccessor datasetAccessor = DatasetAccessorFactory.createHTTP("http://localhost:3030/ds");
Model model = datasetAccessor.getModel("http://example.org/#serviceA");

Exception in thread "main" org.apache.jena.atlas.web.HttpException: 403 - Forbidden: SPARQL Graph Store Protocol : Read operation : GET at org.apache.jena.riot.web.HttpOp.exec(HttpOp.java:1118) at org.apache.jena.riot.web.HttpOp.execHttpGet(HttpOp.java:385) at org.apache.jena.riot.web.HttpOp.execHttpGet(HttpOp.java:354) at org.apache.jena.web.DatasetGraphAccessorHTTP.doGet(DatasetGraphAccessorHTTP.java:134) at org.apache.jena.web.DatasetGraphAccessorHTTP.httpGet(DatasetGraphAccessorHTTP.java:128) at org.apache.jena.web.DatasetAdapter.getModel(DatasetAdapter.java:47) at com.security.examples.FusekiExample.main(FusekiExample.java:13) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)

On Wed, Jan 28, 2015 at 3:07 PM, Trevor Donaldson tmdona...@gmail.com wrote: I would prefer to use the model api. There is reification, deletion, inserting, etc. I will take a look at the DatasetAccessor. I am only applying changes to a named graph.

On Wed, Jan 28, 2015 at 2:44 PM, Rob Vesse rve...@dotnetrdf.org wrote: Via Fuseki. Attempting to do anything that bypasses Fuseki and accesses the TDB data directory directly is ill-advised, shouldn't work (there is process level locking on TDB data directories) and is highly likely to corrupt your data. One option would be to use SPARQL updates to supply your changes. However if your changes involve complex graph manipulations best done with the Model API AND they only apply to a specific named graph then you could use the graph store protocol to replace an existing named model - see DatasetAccessor which is the Jena API to the protocol and its methods such as getModel() and putModel() Rob

On 28/01/2015 10:58, Trevor Donaldson tmdona...@gmail.com wrote: Hi all, What would be the best way to update a TDB store behind Fuseki? I have a standalone app that needs to update (delete and insert) statements as well as insert new statements. I was thinking that I could use the Jena api with an in memory model and then somehow send the deletes first to fuseki, then send the inserts to fuseki. Not exactly sure how to accomplish this. Is this possible? Thanks, Trevor
Re: Best way to synchonize across graphs
HTTP error code 403 means the client does not have access to the requested resource: http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.4 On Thu, Jan 29, 2015 at 2:57 AM, Trevor Donaldson tmdona...@gmail.com wrote: Trying to use DatasetAccessor. I am getting the following error. Where could I start to troubleshoot this error. Is this a problem with my config.ttl file? I am trying to run the following command : DatasetAccessor datasetAccessor = DatasetAccessorFactory.createHTTP(http://localhost:3030/ds;); Model model = datasetAccessor.getModel(http://example.org/#serviceA;); Exception in thread main org.apache.jena.atlas.web.HttpException: 403 - Forbidden: SPARQL Graph Store Protocol : Read operation : GET at org.apache.jena.riot.web.HttpOp.exec(HttpOp.java:1118) at org.apache.jena.riot.web.HttpOp.execHttpGet(HttpOp.java:385) at org.apache.jena.riot.web.HttpOp.execHttpGet(HttpOp.java:354) at org.apache.jena.web.DatasetGraphAccessorHTTP.doGet(DatasetGraphAccessorHTTP.java:134) at org.apache.jena.web.DatasetGraphAccessorHTTP.httpGet(DatasetGraphAccessorHTTP.java:128) at org.apache.jena.web.DatasetAdapter.getModel(DatasetAdapter.java:47) at com.security.examples.FusekiExample.main(FusekiExample.java:13) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at com.intellij.rt.execution.application.AppMain.main(AppMain.java:134) On Wed, Jan 28, 2015 at 3:07 PM, Trevor Donaldson tmdona...@gmail.com wrote: I would prefer to use the model api. There is reification, deletion, inserting, etc I will take a look at the DatasetAccessor. I am only applying changes to a named graph. On Wed, Jan 28, 2015 at 2:44 PM, Rob Vesse rve...@dotnetrdf.org wrote: Via Fuseki Attempting to do anything that bypasses Fuseki and accesses the TDB data directory directly is ill-advised, shouldn't work (there is process level locking on TDB data directories) and is highly likely to corrupt your data. One option would be to use SPARQL updates to supply your changes. However if your changes involve complex graph manipulations best done with the Model API AND they only apply to a specific named graph then you could use the graph store protocol to replace an existing named model - see DatasetAccessor which is the Jena API to the protocol and its methods such as getModel() and putModel() Rob On 28/01/2015 10:58, Trevor Donaldson tmdona...@gmail.com wrote: Hi all, What would be the best way to update a TDB store behind Fuseki. I have a standalone app that needs to update (delete and insert) statements as well as insert new statements. I was thinking that I could use the Jena api with an in memory model and then somehow send the deletes first to fuseki, then send the inserts to fuseki. Not exactly sure how to accomplish this. Is this possible? Thanks, Trevor
Re: Best way to synchonize across graphs
Right I know that but what in fuseki is making it return a 403. I am using an in memory graph. --update --mem On Jan 28, 2015 9:05 PM, Martynas Jusevičius marty...@graphity.org wrote: HTTP error code 403 means the client does not have access to the requested resource: http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.4 On Thu, Jan 29, 2015 at 2:57 AM, Trevor Donaldson tmdona...@gmail.com wrote: Trying to use DatasetAccessor. I am getting the following error. Where could I start to troubleshoot this error. Is this a problem with my config.ttl file? I am trying to run the following command : DatasetAccessor datasetAccessor = DatasetAccessorFactory.createHTTP(http://localhost:3030/ds;); Model model = datasetAccessor.getModel(http://example.org/#serviceA;); Exception in thread main org.apache.jena.atlas.web.HttpException: 403 - Forbidden: SPARQL Graph Store Protocol : Read operation : GET at org.apache.jena.riot.web.HttpOp.exec(HttpOp.java:1118) at org.apache.jena.riot.web.HttpOp.execHttpGet(HttpOp.java:385) at org.apache.jena.riot.web.HttpOp.execHttpGet(HttpOp.java:354) at org.apache.jena.web.DatasetGraphAccessorHTTP.doGet(DatasetGraphAccessorHTTP.java:134) at org.apache.jena.web.DatasetGraphAccessorHTTP.httpGet(DatasetGraphAccessorHTTP.java:128) at org.apache.jena.web.DatasetAdapter.getModel(DatasetAdapter.java:47) at com.security.examples.FusekiExample.main(FusekiExample.java:13) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at com.intellij.rt.execution.application.AppMain.main(AppMain.java:134) On Wed, Jan 28, 2015 at 3:07 PM, Trevor Donaldson tmdona...@gmail.com wrote: I would prefer to use the model api. There is reification, deletion, inserting, etc I will take a look at the DatasetAccessor. I am only applying changes to a named graph. On Wed, Jan 28, 2015 at 2:44 PM, Rob Vesse rve...@dotnetrdf.org wrote: Via Fuseki Attempting to do anything that bypasses Fuseki and accesses the TDB data directory directly is ill-advised, shouldn't work (there is process level locking on TDB data directories) and is highly likely to corrupt your data. One option would be to use SPARQL updates to supply your changes. However if your changes involve complex graph manipulations best done with the Model API AND they only apply to a specific named graph then you could use the graph store protocol to replace an existing named model - see DatasetAccessor which is the Jena API to the protocol and its methods such as getModel() and putModel() Rob On 28/01/2015 10:58, Trevor Donaldson tmdona...@gmail.com wrote: Hi all, What would be the best way to update a TDB store behind Fuseki. I have a standalone app that needs to update (delete and insert) statements as well as insert new statements. I was thinking that I could use the Jena api with an in memory model and then somehow send the deletes first to fuseki, then send the inserts to fuseki. Not exactly sure how to accomplish this. Is this possible? Thanks, Trevor
Re: Forward RETE and redundant deduction
On 28/01/15 14:06, Christophe FAGOT [intactile DESIGN] wrote: Hi Andy, thanks for your answer, and I’m ok for the graph being a set of triples, it is the (very good) reason explaining why only one triple is produced. But the reasoner is not in forward model. It is a forward-RETE model, which means that the forward rules have to work incrementally, allowing to add and remove triples and maintaining the consistency of the model. So in the case described by Sébastien, the forward-RETE model should not remove the inferred triple since another rule has its body terms still validated. At least, this last rule should have been fired in order to indicate it that the triple which was not created previously (because it was still in the graph) is going to be removed, so this last rule should produce it again. The RETE engine stops once a triple has been deduced by one route. If you attempt to track each possible route by which a triple could be deduced and reference count them all then you will get a combinatoric explosion in numbers of possible deduction paths and performance plummets (which is why naive truth maintenance never worked out). The Jena engine works around this by not attempting to handle removals incrementally at all. A remove is supposed to mark the model as needing a new prepare stage and the entire deduction process is run from scratch again the next time you query the model. That certainly used to work and I can't see why Sébastien's case would fail, though I don't see the code by which the results are getting accessed. I'm not in a position to test it from here. Dave Chris. Christophe FAGOT, PhD RESPONSABLE RD INFORMATIQUE intactile DESIGN Création d’interfaces + subtiles +33 (0)4 67 52 88 61 +33 (0)9 50 12 05 66 20 rue du carré du roi 34000 MONTPELLIER France www.intactile.com http://intactile.com/ Hugh MacLeod : It's not what the software does, it's what the user does Les informations contenues dans cet email et ses documents attachés sont confidentielles. Elles sont exclusivement adressées aux destinataires explicitement désignés ci-dessus et ne peuvent être divulguées sans consentement de son auteur. Si vous n'êtes pas le destinataire de cet email vous ne devez pas employer, révéler, distribuer, copier, imprimer ou transmettre le contenu de cet email et devez le détruire immédiatement. Le 28 janv. 2015 à 12:17, Andy Seaborne a...@apache.org a écrit : (Dave is not around at the moment so I'll try to answer some parts of your question ...) On 28/01/15 10:28, Sébastien Boulet [intactile DESIGN] wrote: Hello, I have two rules which could produce the same triple: String rules = [r1: (?a eg:p ?b) - (?a, eg:q, ?b)] + [r2: (?a eg:r ?b) - (?a, eg:q, ?b)]; i have configured a GenericRuleReasoner in FORWARD_RETE mode. GenericRuleReasoner reasoner = new GenericRuleReasoner(Rule.parseRules(rules)); reasoner.setMode(GenericRuleReasoner.FORWARD_RETE); InfModel model = ModelFactory.createInfModel(reasoner, ModelFactory.createDefaultModel()); When a triple satisfy the first rule and another triple satisfy the second rule: Resource subject = model.createResource(); Property predicateP = model.getProperty(urn:x-hp:eg/p); Literal literalA = model.createTypedLiteral(A); Property predicateR = model.getProperty(urn:x-hp:eg/r); model.add(subject, predicateP, literalA); model.add(subject, predicateR, literalA); only one triple is deduced: An RDF graph is a set of triples. A set only has one of each thing in it. If you add(triple) add(triple) you will see only one triple in the output. 
This is not to do with inference, it is to do with an RDF graph being a set. rdf:RDF xmlns:rdf=http://www.w3.org/1999/02/22-rdf-syntax-ns#; xmlns:j.0=urn:x-hp:eg/ rdf:Description rdf:nodeID=A0 j.0:r rdf:datatype=http://www.w3.org/2001/XMLSchema#string;A/j.0:r j.0:p rdf:datatype=http://www.w3.org/2001/XMLSchema#string;A/j.0:p j.0:q rdf:datatype=http://www.w3.org/2001/XMLSchema#string;A/j.0:q /rdf:Description /rdf:RDF When i remove the fist triple: model.remove(subject, predicateP, literalA); the sole deduced triple is removed even if the second rule is still satisfied: You ran the reasoner in forward model - it included all deductions at the start and then does not run again until you ask it to. To trigger it again: InfModel.rebind() Cause the inference model to reconsult the underlying data to take into account changes. or run in backward mode. Andy rdf:RDF xmlns:rdf=http://www.w3.org/1999/02/22-rdf-syntax-ns#; xmlns:j.0=urn:x-hp:eg/ rdf:Description rdf:nodeID=A0 j.0:r rdf:datatype=http://www.w3.org/2001/XMLSchema#string;A/j.0:r /rdf:Description /rdf:RDF is it the expected behavior ? is there a workaround to deduce twice the same triple or at least to don’t remove
Re: Very slow SPARQL query on TDB
On 28/01/15 18:34, Milorad Tosic wrote: Hi Laurent, I would give a try to a different sequencing in the query. For example:

PREFIX base: <http://www.telemis.com/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX XMLS: <http://www.w3.org/2001/XMLSchema#>
SELECT ?x {
  ?image base:sopInstanceUID "1.2.840.113564.10656621.201302121438403281.1003000225002"^^XMLS:string .
  ?image a base:Image .
  ?seq ?p ?image .
  ?x base:images ?seq .
  ?x a base:ImageAnnotation ; base:deleted false .
}

Though, it may or may not help. Regards, Milorad

From: Laurent Rucquoy laurent.rucq...@telemis.com To: users@jena.apache.org Sent: Wednesday, January 28, 2015 6:13 PM Subject: Very slow SPARQL query on TDB Hello, I have a Java application which implements an object model persisted through JenaBean in my Jena TDB (see the attached image of the classes diagram). The request to retrieve an ImageAnnotation resource from the ID of a linked Image is very slow. Here is a typical SPARQL query used (more than 40s to get the result): PREFIX base: <http://www.telemis.com/> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> PREFIX XMLS: <http://www.w3.org/2001/XMLSchema#> SELECT ?x { ?x a base:ImageAnnotation ; base:deleted false ; base:images ?seq . ?seq ?p ?image . ?image a base:Image . ?image base:sopInstanceUID "1.2.840.113564.10656621.201302121438403281.1003000225002"^^XMLS:string . } Can you help me to find what I'm doing wrong? Thank you in advance. Sincerely, Laurent

Which version of TDB? 2.11.2 had possibly related fixes. https://issues.apache.org/jira/browse/JENA-685

If you do take Milorad's suggestion, also put in a none.opt file to stop TDB reordering your improved order into a worse one. http://jena.apache.org/documentation/tdb/optimizer.html#choosing-the-optimizer-strategy

Andy

PS Attachments don't come through this list.
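A sketch of trying both suggestions together: run the hand-ordered query against the TDB dataset after creating a file named none.opt in the database directory (per Andy's note and the optimizer documentation he links, the presence of that file selects the no-reorder strategy, which then applies to all queries on that database). The dataset path is a placeholder; the query is Milorad's reordering of Laurent's query.

    import java.io.File;
    import com.hp.hpl.jena.query.Dataset;
    import com.hp.hpl.jena.query.QueryExecution;
    import com.hp.hpl.jena.query.QueryExecutionFactory;
    import com.hp.hpl.jena.query.ResultSetFormatter;
    import com.hp.hpl.jena.tdb.TDBFactory;

    public class OrderedQueryOverTDB {
        public static void main(String[] args) throws Exception {
            String dir = "/path/to/tdb";                    // placeholder TDB location
            new File(dir, "none.opt").createNewFile();      // keep the written triple-pattern order

            Dataset dataset = TDBFactory.createDataset(dir);
            String q = "PREFIX base: <http://www.telemis.com/> "
                     + "PREFIX XMLS: <http://www.w3.org/2001/XMLSchema#> "
                     + "SELECT ?x { "
                     + "  ?image base:sopInstanceUID \"1.2.840.113564.10656621.201302121438403281.1003000225002\"^^XMLS:string . "
                     + "  ?image a base:Image . "
                     + "  ?seq ?p ?image . "
                     + "  ?x base:images ?seq . "
                     + "  ?x a base:ImageAnnotation ; base:deleted false . "
                     + "}";
            QueryExecution qe = QueryExecutionFactory.create(q, dataset);
            try {
                ResultSetFormatter.out(qe.execSelect());    // print the bindings for ?x
            } finally {
                qe.close();
            }
        }
    }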
Re: Very slow SPARQL query on TDB
Hi Laurent, I would give a try to a different sequencing in the query. For example: PREFIX base:http://www.telemis.com/PREFIX rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#PREFIX XMLS: http://www.w3.org/2001/XMLSchema# SELECT ?x{ ?image base:sopInstanceUID 1.2.840.113564.10656621.201302121438403281.1003000225002^^XMLS:string . ?image a base:Image . ?seq ?p ?image . ?x base:images ?seq . ?x a base:ImageAnnotation ; base:deleted false . } Though, it may or may not help. Regards,Milorad From: Laurent Rucquoy laurent.rucq...@telemis.com To: users@jena.apache.org Sent: Wednesday, January 28, 2015 6:13 PM Subject: Very slow SPARQL query on TDB Hello, I have a Java application which implements an object model persisted through JenaBean in my Jena TDB (see the attached image of the classes diagram). The request to retrieve an ImageAnnotation resource from the ID of a linked Image is very slow.Here is a typical SPARQL query used (more than 40s to get the result): PREFIX base:http://www.telemis.com/PREFIX rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#PREFIX XMLS: http://www.w3.org/2001/XMLSchema# SELECT ?x{ ?x a base:ImageAnnotation ; base:deleted false ; base:images ?seq . ?seq ?p ?image . ?image a base:Image . ?image base:sopInstanceUID 1.2.840.113564.10656621.201302121438403281.1003000225002^^XMLS:string .} Can you help me to find what I'm doing wrong ? Thank you in advance. Sincerely,Laurent
Re: Time series (energy data) in jena
Hi Rurik, Thanks for your reply. The number of triples is a concern to me also. Do you remember if this was a matter of entity size (the verbosity of RDF) or query efficiency? I too am leaning towards using a time-series database and mixing in its results via an API at query level, but I lack a decent platform or database to experiment with. - Ashley

On 28 January 2015 at 15:31, Rurik Thomas Greenall rurik.green...@gmail.com wrote: Hi Ashley, I worked for a large Norwegian oil company on sensor readings in relation to time-series data from data historians. For numerous reasons (including the sheer number of triples), we planned to move away from the idea of storing the data directly as RDF, but rather mapped data from the historians to RDF on-the-fly, providing a simple REST interface to serve the RDF. The PoC for this included data about the sensors and the measurements taken as well as links to previous/subsequent measurements. I played with the idea of requesting period series via the interface as well as single instants. The PoC worked well enough to be used in a real-time 3D visualisation of the subsea template, but I'm not sure how this ended up as I ended my contract before the project was completed. Regards, Rurik

On Wed, Jan 28, 2015 at 11:02 AM, Ashley Davison-White adw...@gmail.com wrote: Hi all, I'm currently looking at the feasibility of storing time series data in jena (TDB); specifically energy data. In fact, right now I'm looking for a reason not to! So far, my research has shown me it is possible but there seems to be a lack of experimentation in doing so. I'm wondering if anyone is aware of previous research or projects? And if there are any potential advantages/disadvantages? From my limited experience, in-place updates are not possible, so storing a rollup of data (i.e. an entity of readings per day, rather than per minute) is not possible; so each reading would need to be its own tuple. With a large data set, I can see this being a problem - especially with the increased verboseness of triple data vs a traditional time series database. However, for small scale, I don't see this as a problem. I'm interested to hear opinions on the topic. Regards, - Ashley
Best way to synchonize across graphs
Hi all, What would be the best way to update a TDB store behind Fuseki? I have a standalone app that needs to update (delete and insert) statements as well as insert new statements. I was thinking that I could use the Jena api with an in memory model and then somehow send the deletes first to fuseki, then send the inserts to fuseki. Not exactly sure how to accomplish this. Is this possible? Thanks, Trevor
Re: Best way to synchonize across graphs
Via Fuseki Attempting to do anything that bypasses Fuseki and accesses the TDB data directory directly is ill-advised, shouldn't work (there is process level locking on TDB data directories) and is highly likely to corrupt your data. One option would be to use SPARQL updates to supply your changes. However if your changes involve complex graph manipulations best done with the Model API AND they only apply to a specific named graph then you could use the graph store protocol to replace an existing named model - see DatasetAccessor which is the Jena API to the protocol and its methods such as getModel() and putModel() Rob On 28/01/2015 10:58, Trevor Donaldson tmdona...@gmail.com wrote: Hi all, What would be the best way to update a TDB store behind Fuseki. I have a standalone app that needs to update (delete and insert) statements as well as insert new statements. I was thinking that I could use the Jena api with an in memory model and then somehow send the deletes first to fuseki, then send the inserts to fuseki. Not exactly sure how to accomplish this. Is this possible? Thanks, Trevor
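For the first option Rob mentions (sending SPARQL Updates to Fuseki over HTTP), here is a minimal ARQ sketch. The update-endpoint URL, graph URI, and the data in the DELETE/INSERT are placeholders; the createRemote call is the standard way ARQ sends an update request to a remote service.

    import com.hp.hpl.jena.update.UpdateExecutionFactory;
    import com.hp.hpl.jena.update.UpdateFactory;
    import com.hp.hpl.jena.update.UpdateProcessor;
    import com.hp.hpl.jena.update.UpdateRequest;

    public class RemoteUpdate {
        public static void main(String[] args) {
            String updateService = "http://localhost:3030/ds/update";   // placeholder Fuseki update endpoint
            UpdateRequest request = UpdateFactory.create(
                "PREFIX ex: <http://example.org/> "
                + "DELETE DATA { GRAPH <http://example.org/#serviceA> { ex:s ex:p 'old' } } ; "
                + "INSERT DATA { GRAPH <http://example.org/#serviceA> { ex:s ex:p 'new' } }");
            UpdateProcessor proc = UpdateExecutionFactory.createRemote(request, updateService);
            proc.execute();   // deletes then inserts, in one request
        }
    }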
Re: Best way to synchonize across graphs
I would prefer to use the model api. There is reification, deletion, inserting, etc I will take a look at the DatasetAccessor. I am only applying changes to a named graph. On Wed, Jan 28, 2015 at 2:44 PM, Rob Vesse rve...@dotnetrdf.org wrote: Via Fuseki Attempting to do anything that bypasses Fuseki and accesses the TDB data directory directly is ill-advised, shouldn't work (there is process level locking on TDB data directories) and is highly likely to corrupt your data. One option would be to use SPARQL updates to supply your changes. However if your changes involve complex graph manipulations best done with the Model API AND they only apply to a specific named graph then you could use the graph store protocol to replace an existing named model - see DatasetAccessor which is the Jena API to the protocol and its methods such as getModel() and putModel() Rob On 28/01/2015 10:58, Trevor Donaldson tmdona...@gmail.com wrote: Hi all, What would be the best way to update a TDB store behind Fuseki. I have a standalone app that needs to update (delete and insert) statements as well as insert new statements. I was thinking that I could use the Jena api with an in memory model and then somehow send the deletes first to fuseki, then send the inserts to fuseki. Not exactly sure how to accomplish this. Is this possible? Thanks, Trevor
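For the second option (the graph store protocol via DatasetAccessor), which matches the preference above for the Model API on a single named graph, here is a sketch of a get-modify-put round trip. It assumes the graph store endpoint is the dataset's data service (often http://localhost:3030/ds/data in a default Fuseki setup); the resource and property used in the modification are invented for illustration, and putModel replaces the whole named graph on the server.

    import com.hp.hpl.jena.query.DatasetAccessor;
    import com.hp.hpl.jena.query.DatasetAccessorFactory;
    import com.hp.hpl.jena.rdf.model.Model;
    import com.hp.hpl.jena.rdf.model.Property;
    import com.hp.hpl.jena.rdf.model.Resource;

    public class NamedGraphRoundTrip {
        public static void main(String[] args) {
            String graph = "http://example.org/#serviceA";
            DatasetAccessor accessor = DatasetAccessorFactory.createHTTP("http://localhost:3030/ds/data");

            // Fetch the current contents of the named graph as a local Model.
            Model model = accessor.getModel(graph);

            // Apply changes with the ordinary Model API (deletes, inserts, reification, ...).
            Resource s = model.createResource("http://example.org/thing/1");      // hypothetical
            Property p = model.createProperty("http://example.org/vocab#label");  // hypothetical
            model.removeAll(s, p, null);
            model.add(s, p, "updated label");

            // Replace the named graph on the server with the modified Model.
            accessor.putModel(graph, model);
        }
    }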