Re: [PR] Fix rdf-thrift URL [jena-site]
kinow merged PR #176: URL: https://github.com/apache/jena-site/pull/176 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@jena.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[PR] Fix rdf-thrift URL [jena-site]
joshmoore opened a new pull request, #176: URL: https://github.com/apache/jena-site/pull/176

The previous rdf-thrift URL (https://afs.github.io/rdf-thrift) does not render in Chrome/OSX, presumably due to a missing redirect.

```
$ curl -IL https://afs.github.io/rdf-thrift
HTTP/2 200
server: GitHub.com
content-type: text/html; charset=utf-8
permissions-policy: interest-cohort=()
last-modified: Sat, 03 Dec 2022 10:51:05 GMT
access-control-allow-origin: *
etag: "638b2a19-3cc"
expires: Thu, 18 Jan 2024 08:17:23 GMT
cache-control: max-age=600
x-proxy-cache: MISS
x-github-request-id: F4C2:2FFA55:585BEB3:59B9FAD:65A8DC3B
accept-ranges: bytes
date: Thu, 18 Jan 2024 08:07:38 GMT
via: 1.1 varnish
age: 15
x-served-by: cache-fra-eddf8230085-FRA
x-cache: HIT
x-cache-hits: 1
x-timer: S1705565258.486688,VS0,VE2
vary: Accept-Encoding
x-fastly-request-id: 44210949b1d61366b2f3687dfebb857648e79ece
content-length: 972

$ curl -IL https://afs.github.io/rdf-thrift/
HTTP/2 200
server: GitHub.com
content-type: text/html; charset=utf-8
permissions-policy: interest-cohort=()
last-modified: Sat, 03 Dec 2022 10:51:05 GMT
access-control-allow-origin: *
etag: "638b2a19-ca9"
expires: Thu, 18 Jan 2024 08:18:34 GMT
cache-control: max-age=600
x-proxy-cache: MISS
x-github-request-id: 0CF2:2C7005:302E610:30EB27B:65A8DC82
accept-ranges: bytes
date: Thu, 18 Jan 2024 08:08:34 GMT
via: 1.1 varnish
age: 0
x-served-by: cache-fra-eddf8230034-FRA
x-cache: MISS
x-cache-hits: 0
x-timer: S1705565314.436379,VS0,VE102
vary: Accept-Encoding
x-fastly-request-id: 26f87c34d577b083f023642c07a2318ae27df25b
content-length: 3241
```
Re: [PR] Fix typo in RDF API tutorial [jena-site]
rvesse merged PR #168: URL: https://github.com/apache/jena-site/pull/168
[PR] Fix typo in RDF API tutorial [jena-site]
dfsp-spirit opened a new pull request, #168: URL: https://github.com/apache/jena-site/pull/168

Very minor, just fixes a typo.
[GitHub] [jena-site] afs closed pull request #152: Documentation update for RDF/XML in Jena 4.8.0
afs closed pull request #152: Documentation update for RDF/XML in Jena 4.8.0 URL: https://github.com/apache/jena-site/pull/152
[GitHub] [jena-site] afs commented on pull request #152: Documentation update for RDF/XML in Jena 4.8.0
afs commented on PR #152: URL: https://github.com/apache/jena-site/pull/152#issuecomment-1520720653

This got merged at 4.8.0 while preparing the website, but the commits were rehashed so it isn't shown here.
Preview release - RDF ABAC data access
This is a preview open source release from Telicent of a system for data-level access control for Apache Jena Fuseki.

https://github.com/Telicent-io/public-rdf-abac

The license is Apache License 2.0.

ABAC - Attribute Based Access Control - allows data owners to define and manage access controls. Different parts of an RDF dataset can be given different access requirements. These requirements control the visibility of the data for read access (SPARQL query or Graph Store Protocol). The access-controlled dataset is a view of the underlying RDF dataset.

The access requirements are expressed as labels on the data. Every triple has a set of labels associated with it. These labels can be specified at the triple level, or on all triples with a specific property, or on triples with the same subject.

A request has a set of attributes for the user (or software system) making the request. Triples are visible to the read request only if the attributes of the request satisfy the requirements specified by the data labels.

The access controls are self-contained and can be transported with the data. A local user attribute store for stand-alone operation is provided in this preview release.

  Request: "status=employee".
  Visible Data: :s :p :o -- label "status=employee || status=contractor".

Hierarchies are provided whereby some attribute values imply other attribute values.

  public < restricted < company confidential < company private

A request at level "company confidential" has visibility of data labelled with "company confidential", "restricted" or "public".

  Request: "level=confidential"
  Visible Data: :s :p :o -- label "level=restricted"

This is a snapshot of on-going work within Telicent and the system is in active use and active development. Telicent primarily uses per-triple labelling.

Documentation: https://github.com/Telicent-io/public-rdf-abac/blob/main/docs/abac.md

This preview release is subject to design change. This is a source-only preview. There are no public Maven artifacts. User authentication is not part of this system.

This preview release has restrictions:

* Data labelling only applies to the default graph.
* Per graph access is not yet provided (c.f. https://jena.apache.org/documentation/fuseki2/fuseki-data-access-control)

Andy

https://www.telicent.io/
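The two request/label examples in the announcement can be mimicked with a few lines of plain Java. This is only an illustrative sketch and not the public-rdf-abac API: the class `AbacSketch`, its methods, the short level names and the simplified label syntax (alternatives joined by `||`, each of the form `attr=value`) are assumptions made for the example. The real label language is described in the project documentation.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch of label evaluation; not the public-rdf-abac implementation.
final class AbacSketch {
    // Assumed hierarchy, lowest to highest, using short names for the levels.
    static final List<String> LEVELS =
            List.of("public", "restricted", "confidential", "private");

    // True if the request's value for attr satisfies one required attr=value pair.
    static boolean satisfies(Map<String, String> request, String attr, String required) {
        String actual = request.get(attr);
        if (actual == null)
            return false;
        if (attr.equals("level")) {
            // A higher level implies all lower levels.
            int have = LEVELS.indexOf(actual);
            int need = LEVELS.indexOf(required);
            return have >= 0 && need >= 0 && have >= need;
        }
        return actual.equals(required);
    }

    // A label is a list of alternatives separated by "||"; one match is enough.
    static boolean visible(Map<String, String> request, String label) {
        for (String alternative : label.split("\\|\\|")) {
            String[] kv = alternative.trim().split("=", 2);
            if (kv.length == 2 && satisfies(request, kv[0].trim(), kv[1].trim()))
                return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(visible(Map.of("status", "employee"),
                "status=employee || status=contractor"));
        System.out.println(visible(Map.of("level", "confidential"), "level=restricted"));
        System.out.println(visible(Map.of("level", "public"), "level=restricted"));
    }
}
```

A triple would then be included in the read view only when `visible` holds for the request's attributes and the triple's label.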
[GitHub] [jena-site] afs opened a new pull request, #152: Documentation update for RDF/XML in Jena 4.8.0
afs opened a new pull request, #152: URL: https://github.com/apache/jena-site/pull/152

Mainly reorganisation with an entry page that warns that much of this is legacy.
Re: Resolving against bad URI - parsing CIM RDF/XML reference data for CGMES with Jena 4.8.SNAPSHOT
Hello Andy,

thank you very much for your quick and detailed help!

"urn:uid:abc" seems to work for my benchmarks. This allows me to work with typed literals from the real world for my contribution. Graphs containing typed literals may have a different distribution of hash values and different equality implementations will be used.

The information and references given will probably be useful at work when we update Jena. The "base" might play a bigger role there.

Regards
Arne
Re: Resolving against bad URI - parsing CIM RDF/XML reference data for CGMES with Jena 4.8.SNAPSHOT
Hi Arne,

Thanks for testing 4.8.0-SNAPSHOT.

Part of #1773 is to change to the same IRI handling used elsewhere in Jena. While still based on jena-iri, the IRIx layer has a specific set of scheme-specific rules. Pure jena-iri is not up-to-date with all the RFCs.

The RDF/XML file itself is fine. The issue is the base URI in the parser setup.

The URN scheme urn:uuid: defines the rest of the URI to match the syntax of a UUID: 671940cc-e6b5-47ad-9992-2d9185f53464

RFC 8141 defines URNs as urn:NID:NSS -- it tightened up on URN syntax to require at least two characters in the middle part (NID) and one in the final part (NSS). It also permitted fragments, which were not in the first URN RFC.

So the base "urn:uuid" --

* is legal by URI syntax,
* is not correct by the details of a URN (must have 2 colons),
* is not correct by the detail of the urn:uuid namespace (RFC 4122).

If you use a legal base, the file parses OK. Is that possible for you?

  urn:uid:abc
  http://example.org/

(UID isn't registered -- and also Jena only has scheme-specific rules for certain URI and URN registrations.)

Andy

https://www.rfc-editor.org/rfc/rfc8141.html
https://www.rfc-editor.org/rfc/rfc4122.html

PS There will be a transition legacy route to get to the 4.7.0 parser but that is temporary.
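The three checks listed above can be approximated with standard-library regular expressions. This is a rough sketch only, not how Jena's IRIx layer works: the class `UrnSyntaxSketch` and its method names are made up for illustration, and the patterns simplify RFC 8141 (no r-, q- or f-components, no percent-encoding validation).

```java
import java.util.regex.Pattern;

// Hypothetical, simplified syntax checks; not Jena's IRIx implementation.
final class UrnSyntaxSketch {
    // urn:NID:NSS where the NID is 2 to 32 characters, alphanumeric at both
    // ends with hyphens allowed inside, and the NSS has at least one character.
    private static final Pattern URN = Pattern.compile(
            "(?i)urn:[a-z0-9][a-z0-9-]{0,30}[a-z0-9]:[^\\s#?]+");

    // urn:uuid: followed by the 8-4-4-4-12 hexadecimal UUID layout.
    private static final Pattern UUID_URN = Pattern.compile(
            "(?i)urn:uuid:[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}");

    static boolean looksLikeUrn(String s) {
        return URN.matcher(s).matches();
    }

    static boolean looksLikeUuidUrn(String s) {
        return UUID_URN.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {
                "urn:uuid", "urn:uid:abc",
                "urn:uuid:671940cc-e6b5-47ad-9992-2d9185f53464" }) {
            System.out.println(s + " urn=" + looksLikeUrn(s)
                    + " uuid=" + looksLikeUuidUrn(s));
        }
    }
}
```

Under these simplified rules "urn:uuid" fails the URN check because it has no NSS part, while "urn:uid:abc" passes it without being a urn:uuid.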
Resolving against bad URI - parsing CIM RDF/XML reference data for CGMES with Jena 4.8.SNAPSHOT
Hello,

the following code, which works fine under Jena 4.6, no longer works under Jena 4.8.SNAPSHOT:

    RDFParser.create()
        .source(graphUri)
        .base("urn:uuid")
        .lang(Lang.RDFXML)
        .parse(streamSink);

The graph looks like this:

    http://iec.ch/TC57/CIM100#; xmlns:md=" http://iec.ch/TC57/61970-552/ModelDescription/1#; xmlns:rdf=" http://www.w3.org/1999/02/22-rdf-syntax-ns#; xmlns:eu=" http://iec.ch/TC57/CIM100-European#;>
    1555284823 LoadArea
    5b5b515b-91bb-41c6-ba63-71a711139a86
    1055343234 SubLoadArea
    27f108dd-e578-4921-8d3a-753e67bd718e

The error is:

    "org.apache.jena.riot.RiotException: [line: 3, col: 64] {E214} Resolving against bad URI : <#_5b5b515b-91bb-41c6-ba63-71a711139a86>"

The example is an extract from the CGMES Conformity Assessment Scheme v3 - Test Configurations ( https://www.entsoe.eu/data/cim/cim-conformity-and-interoperability/ -> https://www.entsoe.eu/Documents/CIM_documents/Grid_Model_CIM/ENTSO-E_Test_Configurations_v3.0.2.zip ).

Could my problem be related to the changes in https://github.com/apache/jena/issues/1773?
Are my options or my base URI wrong?
Or if the format is wrong, what specification does it violate? (I haven't figured out this URI/IRI thing yet, maybe I haven't found the right sources for it.)
How do I get Jena to accept the file, preferably as is?

Greetings
Arne
Re: Evolving RDF/XML support and ARP.
On 24/02/2023 14:24, Andy Seaborne wrote:
> Draft PR: https://github.com/apache/jena/pull/1774

Merged.

It would be good if users that have "file:" ontology imports try out the development build now this is merged.

The treatment of file URIs with a relative file path is different between RDF/XML and other formats that resolve URIs: Turtle/TriG and SPARQL.

Base = file:///home/somePath/

Turtle:: PREFIX ex: ex:p . gives for the object: . whereas RDF/XML:: gives .

RDF/XML is treating a file: URL as an opaque URL (terminology from RFC 2396 which disappeared in RFC 3986).

PR:1774 makes RDF/XML resolve relative "file:" paths at parsing time.

This impacts tracking owl:imports where there are "file:" URLs with a filesystem-relative path. It matters when reading a local RDF/XML file from a different directory to where the file resides. There is a change to track imports by resolved URI rather than unresolved URI. (Actually, it'll depend on whether the import is referenced from Turtle or from RDF/XML.)

The better way to handle relative URI references is not to include the scheme. Generally, to preserve relative URIs, the safe way is to have a custom URI scheme "relative:" and remove that after processing.

Andy
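The difference between a scheme-less relative reference and an opaque "file:" URL can be seen with the JDK's own `java.net.URI`, which follows RFC 2396 and so still has the notion of an opaque URI. This is a small sketch for illustration only, not Jena code; the class name `FileUriResolveSketch` and the file name `data.ttl` are invented for the example.

```java
import java.net.URI;

// Illustration with java.net.URI (RFC 2396 based); not Jena's resolver.
final class FileUriResolveSketch {
    // Resolve a reference against a base using the JDK's rules.
    static URI resolve(String base, String ref) {
        return URI.create(base).resolve(URI.create(ref));
    }

    public static void main(String[] args) {
        String base = "file:///home/somePath/";

        // No scheme: a true relative reference, resolved against the base path.
        URI relative = resolve(base, "data.ttl");
        System.out.println(relative.getScheme() + " " + relative.getPath());

        // "file:" plus a relative path: opaque in RFC 2396 terms, so the
        // base is ignored and the reference comes back unchanged.
        URI opaque = resolve(base, "file:data.ttl");
        System.out.println(opaque.isOpaque() + " " + opaque);
    }
}
```

This matches the advice in the message: leaving the scheme off lets the reference be resolved against the base, whereas "file:data.ttl" is already absolute as far as an RFC 2396 resolver is concerned.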
Re: Datatypes in the rdf: namespace.
Yes, that sounds like a sensible approach.

Value handling that involves normalizing and canonicalizing what are effectively document formats just seems like a major DoS vector, in the same way we’ve seen with things like XML DTDs in the past.

Rob
Re: Evolving RDF/XML support and ARP.
On 24/02/2023 14:24, Andy Seaborne wrote:
> Issue for updating ARP to use IRIx, as described below.
> https://github.com/apache/jena/issues/1773
> Draft PR: https://github.com/apache/jena/pull/1774

This has xmlinput0 (the state of ARP 4.7.0, using jena-iri directly) with ARP0 and RDFXMLReader0 as classes.

The package xmlinput is the updated RDF/XML parsing code.

The class ARP in xmlinput is deprecated as a warning that running ARP without the rest of Jena is not going to continue except while xmlinput0 exists.

Andy
Datatypes in the rdf: namespace.
(Moral: Never pull on the end of a loose bit of string in a codebase...)

There are 3 datatypes in the RDF namespace which are there for convenience but not mentioned in the RDF Abstract data model. So they are not required even if they were normatively defined.

  rdf:XMLLiteral, rdf:HTML, rdf:JSON

Jena's XMLLiteralType is compliant with RDF 1.0 but RDF 1.1 changed the rdf:XMLLiteral (no canonicalization, the value space is DOM4 based).

In RDF 1.0, rdf:XMLLiteral is the one and only required datatype. It's weird because the lexical space has canonicalization and normalization requirements (the lexical space is the same as value space - puts all the work on the user!).

In RDF 1.1, rdf:XMLLiteral is not required (even if normative, which it isn't for other reasons) and it has become just a datatype definition.

In RDF 1.1, there is rdf:HTML. The Jena RDF vocabulary has a constant. There is no value handling.

rdf:JSON exists in http://www.w3.org/1999/02/22-rdf-syntax-ns, it was defined by JSON-LD. The Jena RDF vocabulary has a constant. There is no value handling. rdf:JSON is likely to make it into RDF 1.2 Concepts. Its value space is a canonicalized form of JSON.

All three have complex requirements for the value space (making them a bit of a DoS vector!).

It might be simpler to do the same for all 3 datatypes - constants but no value support.

Andy
Re: Evolving RDF/XML support and ARP.
Issue for updating ARP to use IRIx, as described below.

https://github.com/apache/jena/issues/1773

Draft PR: https://github.com/apache/jena/pull/1774

Andy
Evolving RDF/XML support and ARP.
Jena's RDF/XML parser, ARP, was originally a separate subsystem that could be configured for different possible directions of the RDF 1.0 working group and different treatment of IRIs that were possible at the time (this is before RFC3986/3987). It is the "xmlinput" package in jena-core.

It has a close coupling to jena-iri with features such as customization of errors, and an idiosyncratic approach to relative IRIs (if called directly). These are outside normal use of RDF/XML. When used from model.read or a RIOT API, these features aren't accessible.

Both jena-iri and ARP are hard to maintain. xmlinput is the last part of Jena that uses jena-iri directly.

Jena has an IRI abstraction, IRIx, that allows switching IRI providers. The Jena releases use jena-iri as the provider through the IRIx abstraction - error messages are the same as before. There is a test suite for compatibility - on a pass/warning/error basis, not error message text - that gives the expected behaviour of an IRIx implementation.

RFCs and W3C documents that define the URIs, IRIs, and the specific URI schemes evolve so maintenance is necessary.

  RDF 1.1 removed the special "RDF URI reference" in favour of RFC 3987.
  W3C has a REC about DIDs (a new "did:" URI scheme).
  RFC 6874 changes the core URI grammar of RFC 3986, adding support for IPv6 zones.
  RFC 8089 defines "file:" as it is actually used.
  RFC 8141 replaces the definition of URNs with a new RFC.

My long-term aspiration is to have an RDF/XML parser and IRI handling that is:

1/ Maintainable.
2/ For use as a parser in Jena and only for that.

That means making RDF/XML handling much simpler, with functionality for reading conformant RDF/XML and not variations that are not used by Jena users. The test suite has good coverage.

For IRIs, switch from jena-iri to a new IRI library that has up-to-date support for IRIs.

jena-iri also has scheme-specific rules for a large number of legacy schemes (gopher:, telnet:, fax:, ...). This extensibility causes a very high cost to maintain. It has not been remade from the original configuration files for many years (that step is not in the build).

New IRI library: https://github.com/afs/x4ld/tree/main/iri4ld

jena-iri is also slower than iri4ld and this is visible in parsing (the impact is 5-10% of parsing speed on N-triples).

Error messages do change, hopefully to ones that are easier to understand. jena-iri error messages are quite technical.

This all applies to xmloutput as well but that's already converted to IRIx.

I have a new PR in progress that converts RDF/XML parsing to use IRIx. It does change the behaviour for directly using RDFXMLReader when relative URIs are given as the base. A fully legacy setup exists that passes all the tests for normal parsing use but does not pass some detailed local behaviour tests in the RDF/XML writer.

Roadmap:

Eventually have multiple packages, until we decide that migration has happened and they are getting in the way. Packages used by RIOT/model.read are essential maintenance only.

* xmlinput0 - this is ARP xmlinput as it is in Jena 4.7.0.
* xmlinput1 - this is ARP switched to use IRIx.
* xmlinput2 - an RDF/XML parser (starting with ARP and cutting out the unused parts) that covers Jena needs and not trying to do everything ARP does.

xmlinput2 does not yet exist.

The new PR gets the codebase to xmlinput1 (as "xmlinput").

If all goes well, we can have 4.8.0 default to use xmlinput1, switchable back to xmlinput0. When called from model.read or RIOT, it should not make a difference.

It would be great to have users test but any affected users are using legacy features and they are less likely to upgrade regularly. Reports about direct use of ARP have been very infrequent.

Andy
[GitHub] [jena-site] afs commented on pull request #131: RDF Patch documentation
afs commented on PR #131: URL: https://github.com/apache/jena-site/pull/131#issuecomment-1369593432 Done via website update. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@jena.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [jena-site] afs closed pull request #131: RDF Patch documentation
afs closed pull request #131: RDF Patch documentation URL: https://github.com/apache/jena-site/pull/131
[GitHub] [jena-site] kinow commented on pull request #131: RDF Patch documentation
kinow commented on PR #131: URL: https://github.com/apache/jena-site/pull/131#issuecomment-1318746965 > @kinow Thank you very much for the review. Should all be done now. Nice @afs! RDF Patch, LATERAL JOIN. Interesting features coming to Jena! Thank you!!!
[GitHub] [jena-site] afs commented on a diff in pull request #131: RDF Patch documentation
afs commented on code in PR #131: URL: https://github.com/apache/jena-site/pull/131#discussion_r1025159994 ## source/documentation/rdf-patch/__index.md: ## @@ -0,0 +1,191 @@ +--- +title: RDF Patch +slug: index +--- + +This page describes RDF Patch. An RDF Patch is a set of changes to an +[RDF dataset](https://www.w3.org/TR/rdf11-concepts/#section-dataset). +The change are for triples, quads and prefixes. + +Changes to triples involving blank nodes are handled by using their system +identifier which uniquely identifies a blank node. Unlike RDF syntaxes, blank +nodes are not generated afresh each time the document is parsed. + +## Example + +This example ensures certain prefixes are in the dataset and adds some +basic triples for a new subclass of `<http://example/SUPER_CLASS>`. + +``` +TX . +PA "rdf" "http://www.w3.org/1999/02/22-rdf-syntax-ns#" . +PA "owl" "http://www.w3.org/2002/07/owl#" . +PA "rdfs" "http://www.w3.org/2000/01/rdf-schema#" . +A <http://example/SubClass> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> . +A <http://example/SubClass> <http://www.w3.org/2000/01/rdf-schema#subClassOf> <http://example/SUPER_CLASS> . +A <http://example/SubClass> <http://www.w3.org/2000/01/rdf-schema#label> "SubClass" . +TC . +``` + +## Structure + +The text format for an RDF Patch is N-Triples-like: it is a series of +rows, each row ends with a `.` (DOT). The tokens on a row are keywords, +URIs, blank nodes, writen with their label (see below) or RDF Literals, +in N-triples syntax. A keyword follows the same rules as +Turtle prefix declarations without a trailing `:`. + +A line has an operation code, then some number of items depending on +the operation. 
+ +| Operation | | +| - | - | +| `H` | Header | +| `TX``TC``TA` | Change block: transactions| +| `PA``PD` | Change: Prefix add and delete | +| `A``D`| Change: Add and delete triples and quads | + +The general structure of an RDF patch is a header (possible empty), then a +number of change blocks. + +Each change block is a transaction. Transactions can be explicit recorded ('TX' +start, `TC` commit) to include multiple transaction in one patch. They are not +required. If not present, the patch should be applied atomically to the data. + +``` +header +TX +Quad, triple or prefix changes +TC or TA +``` + +Multiple transaction blocks are allowed for multiple sets of changes in one +patch. + +A binary version based on [RDF Thrift](../io/rdf-binary/) is provided. +Parsing binary compared to text for N-triples achieves a x3-x4 increase in +throughput. + +### Header + +The header provides for basic information about patch. It is a series of +(key, value) pairs. + +It is better to put complex metadata in a separate file and link to it +from the header, but certain information is best kept with the patch. An example +used by Delta is to keep the identifer of the global version id of the dataset Review Comment: Revised that section.
[GitHub] [jena-site] afs commented on a diff in pull request #131: RDF Patch documentation
afs commented on code in PR #131: URL: https://github.com/apache/jena-site/pull/131#discussion_r1025158734 ## source/documentation/rdf-patch/__index.md: ## @@ -0,0 +1,191 @@ +--- +title: RDF Patch +slug: index +--- + +This page describes RDF Patch. An RDF Patch is a set of changes to an +[RDF dataset](https://www.w3.org/TR/rdf11-concepts/#section-dataset). +The change are for triples, quads and prefixes. + +Changes to triples involving blank nodes are handled by using their system +identifier which uniquely identifies a blank node. Unlike RDF syntaxes, blank +nodes are not generated afresh each time the document is parsed. + +## Example + +This example ensures certain prefixes are in the dataset and adds some +basic triples for a new subclass of `<http://example/SUPER_CLASS>`. + +``` +TX . +PA "rdf" "http://www.w3.org/1999/02/22-rdf-syntax-ns#" . +PA "owl" "http://www.w3.org/2002/07/owl#" . +PA "rdfs" "http://www.w3.org/2000/01/rdf-schema#" . +A <http://example/SubClass> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> . +A <http://example/SubClass> <http://www.w3.org/2000/01/rdf-schema#subClassOf> <http://example/SUPER_CLASS> . +A <http://example/SubClass> <http://www.w3.org/2000/01/rdf-schema#label> "SubClass" . +TC . +``` + +## Structure + +The text format for an RDF Patch is N-Triples-like: it is a series of +rows, each row ends with a `.` (DOT). The tokens on a row are keywords, +URIs, blank nodes, writen with their label (see below) or RDF Literals, +in N-triples syntax. A keyword follows the same rules as +Turtle prefix declarations without a trailing `:`. + +A line has an operation code, then some number of items depending on +the operation. 
+ +| Operation | | +| - | - | +| `H` | Header | +| `TX``TC``TA` | Change block: transactions| +| `PA``PD` | Change: Prefix add and delete | +| `A``D`| Change: Add and delete triples and quads | + +The general structure of an RDF patch is a header (possible empty), then a +number of change blocks. + +Each change block is a transaction. Transactions can be explicit recorded ('TX' +start, `TC` commit) to include multiple transaction in one patch. They are not +required. If not present, the patch should be applied atomically to the data. + +``` +header +TX +Quad, triple or prefix changes +TC or TA +``` + +Multiple transaction blocks are allowed for multiple sets of changes in one +patch. + +A binary version based on [RDF Thrift](../io/rdf-binary/) is provided. +Parsing binary compared to text for N-triples achieves a x3-x4 increase in +throughput. + +### Header + +The header provides for basic information about patch. It is a series of +(key, value) pairs. + +It is better to put complex metadata in a separate file and link to it +from the header, but certain information is best kept with the patch. An example +used by Delta is to keep the identifer of the global version id of the dataset Review Comment: Delta is [RDF Delta](https://afs.github.io/rdf-delta) - the whole change propagation, replication and high-availability system. There should not be any mention of it in this patch documentation.
[GitHub] [jena-site] kinow commented on a diff in pull request #131: RDF Patch documentation
kinow commented on code in PR #131: URL: https://github.com/apache/jena-site/pull/131#discussion_r1024499875 ## source/documentation/rdf-patch/__index.md: ## @@ -0,0 +1,191 @@ +--- +title: RDF Patch +slug: index +--- + +This page describes RDF Patch. An RDF Patch is a set of changes to an +[RDF dataset](https://www.w3.org/TR/rdf11-concepts/#section-dataset). +The change are for triples, quads and prefixes. + +Changes to triples involving blank nodes are handled by using their system +identifier which uniquely identifies a blank node. Unlike RDF syntaxes, blank +nodes are not generated afresh each time the document is parsed. + +## Example + +This example ensures certain prefixes are in the dataset and adds some +basic triples for a new subclass of `<http://example/SUPER_CLASS>`. + +``` +TX . +PA "rdf" "http://www.w3.org/1999/02/22-rdf-syntax-ns#" . +PA "owl" "http://www.w3.org/2002/07/owl#" . +PA "rdfs" "http://www.w3.org/2000/01/rdf-schema#" . +A <http://example/SubClass> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> . +A <http://example/SubClass> <http://www.w3.org/2000/01/rdf-schema#subClassOf> <http://example/SUPER_CLASS> . +A <http://example/SubClass> <http://www.w3.org/2000/01/rdf-schema#label> "SubClass" . +TC . +``` + +## Structure + +The text format for an RDF Patch is N-Triples-like: it is a series of +rows, each row ends with a `.` (DOT). The tokens on a row are keywords, +URIs, blank nodes, writen with their label (see below) or RDF Literals, +in N-triples syntax. A keyword follows the same rules as +Turtle prefix declarations without a trailing `:`. + +A line has an operation code, then some number of items depending on +the operation. 
+ +| Operation | | +| - | - | +| `H` | Header | +| `TX``TC``TA` | Change block: transactions| +| `PA``PD` | Change: Prefix add and delete | +| `A``D`| Change: Add and delete triples and quads | + +The general structure of an RDF patch is a header (possible empty), then a +number of change blocks. + +Each change block is a transaction. Transactions can be explicit recorded ('TX' +start, `TC` commit) to include multiple transaction in one patch. They are not +required. If not present, the patch should be applied atomically to the data. + +``` +header +TX +Quad, triple or prefix changes +TC or TA +``` + +Multiple transaction blocks are allowed for multiple sets of changes in one +patch. + +A binary version based on [RDF Thrift](../io/rdf-binary/) is provided. +Parsing binary compared to text for N-triples achieves a x3-x4 increase in +throughput. + +### Header + +The header provides for basic information about patch. It is a series of +(key, value) pairs. + +It is better to put complex metadata in a separate file and link to it +from the header, but certain information is best kept with the patch. An example +used by Delta is to keep the identifer of the global version id of the dataset +so that patches are applied in the right order. + +A header section can be used to provide additional information. In this example +a patch has an identifier and refers to a previous patch. This might be used to +create a log of patches, a log being a sequnce of chnages to apply in-order. + +``` +H id . +H prev . +TX . +PA "rdf" "http://www.w3.org/1999/02/22-rdf-syntax-ns#" . +PA "owl" "http://www.w3.org/2002/07/owl#" . +PA "rdfs" "http://www.w3.org/2000/01/rdf-schema#" . +A <http://example/SubClass> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> . +A <http://example/SubClass> <http://www.w3.org/2000/01/rdf-schema#subClassOf> <http://example/SUPER_CLASS> . +A <http://example/SubClass> <http://www.w3.org/2000/01/rdf-schema#label> "SubClass" . 
+TC . +``` + +Header format: +``` +H word RDFTerm . +``` +where `word` is a string in quotes, or an unquoted string (no spaces, starts with a letter, +same as a prefix without the colon). + +The header is ended by the first non `H` line or the end of the patch. + +### Transactions + +``` +TX . +TC . +``` + +These delimit a block of quad, triple and prefix changes. + +Abort, `TA` is provided so that changes can be streamed, not obliging the +application to buffer change and wait to confirm the action is +committed. + +Transactions should be applied atomically when a patch is applied. + +### Changes + +A change is an add or delete of a quad or a prefix. + + Prefixes + +Prefixes do not apply to the data of the patch. They are +changes to the data the patch is applied to. + +The prefix name is without the trailing colon. It can be given as a +quoted string or unquoted string (keyword) with the same limitations as +Turtle on the pre
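The row structure described in the quoted documentation (header rows, then `TX` ... `TC`/`TA` blocks of `A`, `D`, `PA`, `PD` changes, each row ending in a DOT) can be sketched as a small reader. This is an illustration in plain Python under simplifying assumptions, not the RDF Patch code contributed to Jena: it assumes one row per line, keeps the arguments of each change as an unparsed string, and treats changes outside any `TX` block as one implicit transaction. The names `Patch` and `parse_patch` and the `<uuid:...>` header values are hypothetical.

```python
from dataclasses import dataclass, field

CHANGE_OPS = {"A", "D", "PA", "PD"}


@dataclass
class Patch:
    header: dict = field(default_factory=dict)
    transactions: list = field(default_factory=list)  # committed blocks only


def parse_patch(text: str) -> Patch:
    """Group the rows of a text-format patch into header and transactions."""
    patch = Patch()
    implicit = []     # changes that appear outside any TX ... TC block
    pending = None    # changes of the open transaction; None when none is open
    in_header = True
    for raw in text.splitlines():
        row = raw.strip()
        if not row:
            continue
        if not row.endswith("."):
            raise ValueError(f"row does not end with DOT: {row!r}")
        op, _, args = row[:-1].strip().partition(" ")
        args = args.strip()
        if op == "H":
            if not in_header:
                raise ValueError("header row after the header has ended")
            key, _, value = args.partition(" ")
            patch.header[key.strip('"')] = value.strip()
            continue
        in_header = False  # the first non-H row ends the header
        if op == "TX":
            pending = []
        elif op == "TC":
            if pending is None:
                raise ValueError("TC without a matching TX")
            patch.transactions.append(pending)
            pending = None
        elif op == "TA":
            pending = None  # abort: the streamed changes are discarded
        elif op in CHANGE_OPS:
            (implicit if pending is None else pending).append((op, args))
        else:
            raise ValueError(f"unknown operation: {op!r}")
    if implicit:
        patch.transactions.append(implicit)
    return patch


# Header values below are made-up placeholders, not taken from the documentation.
SAMPLE = """\
H id <uuid:patch-2> .
H prev <uuid:patch-1> .
TX .
PA "rdfs" "http://www.w3.org/2000/01/rdf-schema#" .
A <http://example/SubClass> <http://www.w3.org/2000/01/rdf-schema#label> "SubClass" .
TC .
TX .
D <http://example/SubClass> <http://www.w3.org/2000/01/rdf-schema#label> "SubClass" .
TA .
"""

patch = parse_patch(SAMPLE)
print(patch.header)
print(len(patch.transactions), "committed transaction(s)")
```

Discarding the block on `TA` reflects the stated purpose of abort: a producer can stream changes without buffering them, and the consumer only acts on blocks that end in `TC`.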
Contribution: RDF Patch
This is a larger-than-usual PR so I'll point it out on dev@. This is the RDF Patch code from RDF Delta. The code is available with an Apache License in RDF Delta [1]. It predates RDF Delta, and started with joint work with Rob Vesse. Issue: https://github.com/apache/jena/pull/1618 PR: https://github.com/apache/jena/pull/1619 Documentation: https://github.com/apache/jena-site/pull/131 Andy [1] https://afs.github.io/rdf-delta/
[GitHub] [jena-site] afs opened a new pull request, #131: RDF Patch documentation
afs opened a new pull request, #131: URL: https://github.com/apache/jena-site/pull/131 Companion to https://github.com/apache/jena/pull/1619.
[GitHub] [jena-site] kinow merged pull request #111: Update rdf-input.md
kinow merged PR #111: URL: https://github.com/apache/jena-site/pull/111
[GitHub] [jena-site] berezovskyi opened a new pull request, #111: Update rdf-input.md
berezovskyi opened a new pull request, #111: URL: https://github.com/apache/jena-site/pull/111 Add the terminating dot to the prefix declaration.
Re: Fuseki UI with RDF Star
Hey Andy, I see. Yeah, YASR uses N3 to parse graph results; the version used is "n3": "^1.3.5". There is already a newer version of N3 which I think is capable of parsing RDF Star [1] (modulo bugs [2]). So for graph results it would take bumping the N3 version and adapting the YASR code. For SELECT queries they use their own code, e.g. for SPARQL+XML it happens somewhere here: https://github.com/TriplyDB/Yasgui/blob/71dfb7300436c03371fb1e0a356f034f829ecfa4/packages/yasr/src/parsers/xml.ts#L18 Apparently, CONSTRUCT queries with Turtle as the result format at least return in such a state that you can view and save the response; for XML it fails with "server error". [1] https://github.com/rdfjs/N3.js/blob/main/src/N3Parser.js#L31 [2] https://github.com/rdfjs/N3.js/issues/272 On 04.02.22 11:24, Andy Seaborne wrote: Hi Lorenz, Looks like yasgui doesn't support RDF-star. I wonder how difficult it is to add the feature. jena-fuseki-ui:package.json @triply/yasr => ^4.21.1 which is latest. On 04/02/2022 08:49, LB wrote: Hi all, just using latest Fuseki UI and it looks like 1) RDF Star isn't supported by the query parser - it complains with some red marker but at least we can run queries nevertheless We have the latest yasgui - 4.2.20. https://github.com/TriplyDB/Yasgui/issues/189 2) more problematic, it can't visualize RDF Star neither from SELECT queries in a table nor for CONSTRUCT queries Same issue yasr does not support RDF-star https://github.com/TriplyDB/Yasgui/issues/190 Especially 2) I find rather misleading as the only feedback you get is something like "Unexpected "<<" on line 66." which comes from the result parser I guess. Am I missing something, maybe using just the wrong version or a too old part of YASGUI? Or is it a known limitation you're aware of? If so, maybe we should mention it somewhere? Cheers, Lorenz
Re: Fuseki UI with RDF Star
Hi Lorenz, Looks like yasgui doesn't support RDF-star. I wonder how difficult it is to add the feature. jena-fuseki-ui:package.json @triply/yasr => ^4.21.1 which is latest. On 04/02/2022 08:49, LB wrote: Hi all, just using latest Fuseki UI and it looks like 1) RDF Star isn't supported by the query parser - it complains with some red marker but at least we can run queries nevertheless We have the latest yasgui - 4.2.20. https://github.com/TriplyDB/Yasgui/issues/189 2) more problematic, it can't visualize RDF Star neither from SELECT queries in a table nor for CONSTRUCT queries Same issue yasr does not support RDF-star https://github.com/TriplyDB/Yasgui/issues/190 Especially 2) I find rather misleading as the only feedback you get is something like "Unexpected "<<" on line 66." which comes from the result parser I guess. Am I missing something, maybe using just the wrong version or a too old part of YASGUI? Or is it a known limitation you're aware of? If so, maybe we should mention it somewhere? Cheers, Lorenz
Fuseki UI with RDF Star
Hi all, just using latest Fuseki UI and it looks like 1) RDF Star isn't supported by the query parser - it complains with some red marker but at least we can run queries nevertheless 2) more problematic, it can't visualize RDF Star neither from SELECT queries in a table nor for CONSTRUCT queries Especially 2) I find rather misleading as the only feedback you get is something like "Unexpected "<<" on line 66." which comes from the result parser I guess. Am I missing something, maybe using just the wrong version or a too old part of YASGUI? Or is it a known limitation you're aware of? If so, maybe we should mention it somewhere? Cheers, Lorenz
[GitHub] [jena-site] afs merged pull request #90: Update RDF-star terminology
afs merged pull request #90: URL: https://github.com/apache/jena-site/pull/90
[jira] [Commented] (JENA-2167) Provide an RDF Binary format using Protobuf
[ https://issues.apache.org/jira/browse/JENA-2167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17422953#comment-17422953 ] ASF subversion and git services commented on JENA-2167: --- Commit 0c0a509e3b0f5f636306275ef2b7ab273e99e1fc in jena's branch refs/heads/main from Andy Seaborne [ https://gitbox.apache.org/repos/asf?p=jena.git;h=0c0a509 ] Merge pull request #1079 from afs/fixes JENA-2172, JENA-2167 and small improvements. > Provide an RDF Binary format using Protobuf > --- > > Key: JENA-2167 > URL: https://issues.apache.org/jira/browse/JENA-2167 > Project: Apache Jena > Issue Type: New Feature >Affects Versions: Jena 4.2.0 >Reporter: Andy Seaborne >Assignee: Andy Seaborne >Priority: Major > Fix For: Jena 4.3.0 > > > To go along side the RDF Thrift encoding. > Sometimes, apps want protobuf encoded RDF, e.g. for use with gRPC. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (JENA-2167) Provide an RDF Binary format using Protobuf
[ https://issues.apache.org/jira/browse/JENA-2167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17422951#comment-17422951 ] ASF subversion and git services commented on JENA-2167: --- Commit 2071508862313ab58296ef405f6ecc55ab0ba0a1 in jena's branch refs/heads/main from Andy Seaborne [ https://gitbox.apache.org/repos/asf?p=jena.git;h=2071508 ] JENA-2167: Flush output > Provide an RDF Binary format using Protobuf > --- > > Key: JENA-2167 > URL: https://issues.apache.org/jira/browse/JENA-2167 > Project: Apache Jena > Issue Type: New Feature >Affects Versions: Jena 4.2.0 >Reporter: Andy Seaborne >Assignee: Andy Seaborne >Priority: Major > Fix For: Jena 4.3.0 > > > To go along side the RDF Thrift encoding. > Sometimes, apps want protobuf encoded RDF, e.g. for use with gRPC.
[jira] [Commented] (JENA-2167) Provide an RDF Binary format using Protobuf
[ https://issues.apache.org/jira/browse/JENA-2167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17419790#comment-17419790 ] ASF subversion and git services commented on JENA-2167: --- Commit 0ffb1a795d46a642ec6263838d565158f98564cd in jena's branch refs/heads/main from Andy Seaborne [ https://gitbox.apache.org/repos/asf?p=jena.git;h=0ffb1a7 ] Merge pull request #1076 from afs/rdf-protobuf JENA-2167: Protobuf based RDF binary format > Provide an RDF Binary format using Protobuf > --- > > Key: JENA-2167 > URL: https://issues.apache.org/jira/browse/JENA-2167 > Project: Apache Jena > Issue Type: New Feature >Affects Versions: Jena 4.2.0 >Reporter: Andy Seaborne >Assignee: Andy Seaborne >Priority: Major > > To go along side the RDF Thrift encoding. > Sometimes, apps want protobuf encoded RDF, e.g. for use with gRPC.
[jira] [Resolved] (JENA-2167) Provide an RDF Binary format using Protobuf
[ https://issues.apache.org/jira/browse/JENA-2167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Seaborne resolved JENA-2167. - Fix Version/s: Jena 4.3.0 Resolution: Done > Provide an RDF Binary format using Protobuf > --- > > Key: JENA-2167 > URL: https://issues.apache.org/jira/browse/JENA-2167 > Project: Apache Jena > Issue Type: New Feature >Affects Versions: Jena 4.2.0 >Reporter: Andy Seaborne >Assignee: Andy Seaborne >Priority: Major > Fix For: Jena 4.3.0 > > > To go along side the RDF Thrift encoding. > Sometimes, apps want protobuf encoded RDF, e.g. for use with gRPC.
[jira] [Commented] (JENA-2167) Provide an RDF Binary format using Protobuf
[ https://issues.apache.org/jira/browse/JENA-2167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17419788#comment-17419788 ] ASF subversion and git services commented on JENA-2167: --- Commit 1b5617dade712b6f16a4ea7b76469f67c9aff31f in jena's branch refs/heads/main from Andy Seaborne [ https://gitbox.apache.org/repos/asf?p=jena.git;h=1b5617d ] JENA-2167: Protobuf-based RDF binary format > Provide an RDF Binary format using Protobuf > --- > > Key: JENA-2167 > URL: https://issues.apache.org/jira/browse/JENA-2167 > Project: Apache Jena > Issue Type: New Feature >Affects Versions: Jena 4.2.0 >Reporter: Andy Seaborne >Assignee: Andy Seaborne >Priority: Major > > To go along side the RDF Thrift encoding. > Sometimes, apps want protobuf encoded RDF, e.g. for use with gRPC.
[jira] [Commented] (JENA-2167) Provide an RDF Binary format using Protobuf
[ https://issues.apache.org/jira/browse/JENA-2167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17419789#comment-17419789 ] ASF subversion and git services commented on JENA-2167: --- Commit e1580129eb0aa823f2ea0eefc363e7d23ad5806d in jena's branch refs/heads/main from Andy Seaborne [ https://gitbox.apache.org/repos/asf?p=jena.git;h=e158012 ] JENA-2167: Add required text for Protobuf redistribution > Provide an RDF Binary format using Protobuf > --- > > Key: JENA-2167 > URL: https://issues.apache.org/jira/browse/JENA-2167 > Project: Apache Jena > Issue Type: New Feature >Affects Versions: Jena 4.2.0 >Reporter: Andy Seaborne >Assignee: Andy Seaborne >Priority: Major > > To go along side the RDF Thrift encoding. > Sometimes, apps want protobuf encoded RDF, e.g. for use with gRPC.
[jira] [Comment Edited] (JENA-2167) Provide an RDF Binary format using Protobuf
[ https://issues.apache.org/jira/browse/JENA-2167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17418746#comment-17418746 ] Andy Seaborne edited comment on JENA-2167 at 9/22/21, 6:46 PM: --- The Google Protocol Buffers license is 3-clause BSD-style with an explicit comment that the output of the protobuf compiler is owned by the user, and not under the protobuf license. https://github.com/protocolbuffers/protobuf/blob/master/LICENSE 3-clause BSD-style is cat-A. The requirement to acknowledge redistribution of source (not relevant for this ticket) and binaries (we will be in convenience binaries) zip files: apache-jena, apache-jena-fuseki Combined jars: jena-fuseki-server, jena-fuseki-fulljar, jena-fuseki-war, jena-fuseki-geosparql Also: jena-fuseki-docker zip was (Author: andy.seaborne): The Google Protocol Buffers license is 3-clause BSD-style with an explicit comment that the output of the protobuf compiler is owned by the user, and not under the protobuf license. https://github.com/protocolbuffers/protobuf/blob/master/LICENSE 3-clause BSD-style is cat-A. The requirement to acknowledge redistribution of source (not relevant for this ticket) and binaries (we will be in convenience binaries) zip files: apache-jena, apache-jena-fuseki Combined jars: jena-fuseki-server, jena-fuseki-fulljar, jena-fuseki-war. Also: jena-fuseki-docker zip > Provide an RDF Binary format using Protobuf > --- > > Key: JENA-2167 > URL: https://issues.apache.org/jira/browse/JENA-2167 > Project: Apache Jena > Issue Type: New Feature >Affects Versions: Jena 4.2.0 >Reporter: Andy Seaborne >Assignee: Andy Seaborne >Priority: Major > > To go along side the RDF Thrift encoding. > Sometimes, apps want protobuf encoded RDF, e.g. for use with gRPC.
[jira] [Comment Edited] (JENA-2167) Provide an RDF Binary format using Protobuf
[ https://issues.apache.org/jira/browse/JENA-2167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17418746#comment-17418746 ] Andy Seaborne edited comment on JENA-2167 at 9/22/21, 6:45 PM: --- The Google Protocol Buffers license is 3-clause BSD-style with an explicit comment that the output of the protobuf compiler is owned by the user, and not under the protobuf license. https://github.com/protocolbuffers/protobuf/blob/master/LICENSE 3-clause BSD-style is cat-A. The requirement to acknowledge redistribution of source (not relevant for this ticket) and binaries (we will be in convenience binaries) zip files: apache-jena, apache-jena-fuseki Combined jars: jena-fuseki-server, jena-fuseki-fulljar, jena-fuseki-war. Also: jena-fuseki-docker zip was (Author: andy.seaborne): The Google Protocol Buffers license is 3-clause BSD-style with an explicit comment that the output of the protobuf compiler is owned by the user, and not under the protobuf license. https://github.com/protocolbuffers/protobuf/blob/master/LICENSE 3-clause BSD-style is cat-A. The requirement to acknowledge redistribution of source (not relevant for this ticket) and binaries (we will be in convenience binaries) zip files: apache-jena, apache-jena-fuseki Combined jars: jena-fuseki-server, jena-fuseki-fulljar, jena-fuseki-war.. > Provide an RDF Binary format using Protobuf > --- > > Key: JENA-2167 > URL: https://issues.apache.org/jira/browse/JENA-2167 > Project: Apache Jena > Issue Type: New Feature >Affects Versions: Jena 4.2.0 >Reporter: Andy Seaborne >Assignee: Andy Seaborne >Priority: Major > > To go along side the RDF Thrift encoding. > Sometimes, apps want protobuf encoded RDF, e.g. for use with gRPC.
[jira] [Commented] (JENA-2167) Provide an RDF Binary format using Protobuf
[ https://issues.apache.org/jira/browse/JENA-2167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17418752#comment-17418752 ] Andy Seaborne commented on JENA-2167: - Some initial figures. Parsing BSBM 25 million (which is large enough to get stable timing figures after warm-up): Thrift: 1 million triples per second (TPS). Protobuf: 918k TPS. N-Triples: 245k TPS. The Thrift rate is faster than the last time I ran it: same hardware, same code, newer Java (this is Java 17-ea). Suspicion: Protobuf is slightly slower because protobuf does not provide length-delimited objects, whereas the Thrift encoding is self-contained. A graph is encoded by writing triples in streaming fashion, each triple a Protobuf message. The protobuf way is to add a block length into the stream, and the extra decoding of this is slightly inefficient (it creates two Java objects per triple, rather than reusing existing objects). > Provide an RDF Binary format using Protobuf > --- > > Key: JENA-2167 > URL: https://issues.apache.org/jira/browse/JENA-2167 > Project: Apache Jena > Issue Type: New Feature >Affects Versions: Jena 4.2.0 >Reporter: Andy Seaborne >Assignee: Andy Seaborne >Priority: Major > > To go along side the RDF Thrift encoding. > Sometimes, apps want protobuf encoded RDF, e.g. for use with gRPC.
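The "block length in the stream" idea from the comment above can be sketched as follows. This is a plain-Python illustration, not the Jena RDF Protobuf code: each message is written as a base-128 varint length followed by the payload bytes, which is the general scheme protobuf uses for delimited messages. The payloads are opaque byte strings standing in for encoded triples, and the function names are assumptions made for this sketch.

```python
import io


def write_varint(out, n: int) -> None:
    """Write a non-negative integer as a base-128 varint, low 7 bits first."""
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.write(bytes([byte | 0x80]))  # continuation bit: more bytes follow
        else:
            out.write(bytes([byte]))
            return


def read_varint(inp):
    """Read one varint; return None on a clean end of stream."""
    shift = 0
    result = 0
    while True:
        b = inp.read(1)
        if not b:
            if shift == 0:
                return None
            raise EOFError("stream ended inside a varint")
        result |= (b[0] & 0x7F) << shift
        if (b[0] & 0x80) == 0:
            return result
        shift += 7


def write_delimited(out, payload: bytes) -> None:
    """Frame one message: its length, then its bytes."""
    write_varint(out, len(payload))
    out.write(payload)


def read_delimited(inp):
    """Yield framed messages until the stream ends."""
    while (length := read_varint(inp)) is not None:
        payload = inp.read(length)
        if len(payload) != length:
            raise EOFError("stream ended inside a message")
        yield payload


# Stand-ins for encoded triples; the third needs a two-byte length prefix.
messages = [b"triple-1", b"", b"x" * 300]
buf = io.BytesIO()
for m in messages:
    write_delimited(buf, m)
buf.seek(0)
decoded = list(read_delimited(buf))
print(len(decoded), "messages read back")
```

The reader has to decode a length and slice out a payload before message decoding can begin, which is the extra per-triple step the comment suspects of costing some throughput. On the quoted figures, Thrift is about 4.1 times and Protobuf about 3.7 times the N-Triples rate.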
[jira] [Commented] (JENA-2167) Provide an RDF Binary format using Protobuf
[ https://issues.apache.org/jira/browse/JENA-2167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17418746#comment-17418746 ] Andy Seaborne commented on JENA-2167: - The Google Protocol Buffers license is 3-clause BSD-style with an explicit comment that the output of the protobuf compiler is owned by the user, and not under the protobuf license. https://github.com/protocolbuffers/protobuf/blob/master/LICENSE 3-clause BSD-style is cat-A. The requirement to acknowledge redistribution of source (not relevant for this ticket) and binaries (we will be in convenience binaries) zip files: apache-jena, apache-jena-fuseki Combined jars: jena-fuseki-server, jena-fuseki-fulljar, jena-fuseki-war.. > Provide an RDF Binary format using Protobuf > --- > > Key: JENA-2167 > URL: https://issues.apache.org/jira/browse/JENA-2167 > Project: Apache Jena > Issue Type: New Feature >Affects Versions: Jena 4.2.0 >Reporter: Andy Seaborne >Assignee: Andy Seaborne >Priority: Major > > To go along side the RDF Thrift encoding. > Sometimes, apps want protobuf encoded RDF, e.g. for use with gRPC.
[jira] [Created] (JENA-2167) Provide an RDF Binary format using Protobuf
Andy Seaborne created JENA-2167: --- Summary: Provide an RDF Binary format using Protobuf Key: JENA-2167 URL: https://issues.apache.org/jira/browse/JENA-2167 Project: Apache Jena Issue Type: New Feature Affects Versions: Jena 4.2.0 Reporter: Andy Seaborne Assignee: Andy Seaborne To go along side the RDF Thrift encoding. Sometimes, apps want protobuf encoded RDF, e.g. for use with gRPC.
[jira] [Resolved] (JENA-1903) Encode/decode RDF* using reification
[ https://issues.apache.org/jira/browse/JENA-1903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Seaborne resolved JENA-1903. - Resolution: Done > Encode/decode RDF* using reification > > > Key: JENA-1903 > URL: https://issues.apache.org/jira/browse/JENA-1903 > Project: Apache Jena > Issue Type: New Feature >Affects Versions: Jena 3.15.0 >Reporter: Andy Seaborne >Assignee: Andy Seaborne >Priority: Minor > Fix For: Jena 4.2.0 > > > (split off from JENA-1899)
[jira] [Commented] (JENA-1903) Encode/decode RDF* using reification
[ https://issues.apache.org/jira/browse/JENA-1903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17412797#comment-17412797 ] ASF subversion and git services commented on JENA-1903: --- Commit 8794a934bc6162ace6f742713a283cc29665414c in jena's branch refs/heads/main from Andy Seaborne [ https://gitbox.apache.org/repos/asf?p=jena.git;h=8794a93 ] Merge pull request #1065 from afs/rdf-star-translate JENA-1903: Encode RDF-star into RDF Reification > Encode/decode RDF* using reification > > > Key: JENA-1903 > URL: https://issues.apache.org/jira/browse/JENA-1903 > Project: Apache Jena > Issue Type: New Feature >Affects Versions: Jena 3.15.0 >Reporter: Andy Seaborne >Assignee: Andy Seaborne >Priority: Minor > Fix For: Jena 4.2.0 > > > (split off from JENA-1899) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (JENA-1903) Encode/decode RDF* using reification
[ https://issues.apache.org/jira/browse/JENA-1903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17412796#comment-17412796 ] ASF subversion and git services commented on JENA-1903: --- Commit f6623458e47559c965299512b4053ed0f8c47e5f in jena's branch refs/heads/main from Andy Seaborne [ https://gitbox.apache.org/repos/asf?p=jena.git;h=f662345 ] JENA-1903: Encode RDF-star into RDF Reification > Encode/decode RDF* using reification > > > Key: JENA-1903 > URL: https://issues.apache.org/jira/browse/JENA-1903 > Project: Apache Jena > Issue Type: New Feature >Affects Versions: Jena 3.15.0 >Reporter: Andy Seaborne >Assignee: Andy Seaborne >Priority: Minor > Fix For: Jena 4.2.0 > > > (split off from JENA-1899) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (JENA-1903) Encode/decode RDF* using reification
[ https://issues.apache.org/jira/browse/JENA-1903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Seaborne updated JENA-1903: Fix Version/s: Jena 4.2.0 > Encode/decode RDF* using reification > > > Key: JENA-1903 > URL: https://issues.apache.org/jira/browse/JENA-1903 > Project: Apache Jena > Issue Type: Task >Affects Versions: Jena 3.15.0 >Reporter: Andy Seaborne >Assignee: Andy Seaborne >Priority: Minor > Fix For: Jena 4.2.0 > > > (split off from JENA-1899) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (JENA-1903) Encode/decode RDF* using reification
[ https://issues.apache.org/jira/browse/JENA-1903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Seaborne updated JENA-1903: Issue Type: New Feature (was: Task) > Encode/decode RDF* using reification > > > Key: JENA-1903 > URL: https://issues.apache.org/jira/browse/JENA-1903 > Project: Apache Jena > Issue Type: New Feature >Affects Versions: Jena 3.15.0 >Reporter: Andy Seaborne >Assignee: Andy Seaborne >Priority: Minor > Fix For: Jena 4.2.0 > > > (split off from JENA-1899) -- This message was sent by Atlassian Jira (v8.3.4#803005)
Re: RDF Writer JSON LD FRAME performance
Hi Roland,

Jena uses github/jsonld-java for its JSON-LD handling. Provided the cost isn't going into the translation across the Jena/jsonld-java boundary, it is the cost in jsonld-java. I haven't had the opportunity to check but it has been said that the JSON-LD algorithms are not cheap.

Are you able to attach VisualVM (or similar) and see where the time is going?

BTW In the next release, Jena will have support for reading JSON-LD 1.1 using the Titanium project.

Andy

On 20/08/2021 08:36, Roland Bailly wrote:

Hello!

I have a question about the performance of RDFWriter when writing the JSON-LD FRAME format. Currently I have to process 40k objects in an RDF file. I convert them to JSON with the following code:

    JsonLDWriteContext ctx = new JsonLDWriteContext();
    JsonLdOptions opts = new JsonLdOptions();
    opts.setOmitGraph(true);
    opts.setEmbed(Embed.ALWAYS);
    opts.setProcessingMode(JSON_LD_1_1);
    ctx.setOptions(opts);
    RDFWriter.create().format(RDFFormat.JSONLD_FRAME_FLAT).source(graph).context(ctx).build().asString();

On average it takes 1 to 10 seconds to process. That is a bit too slow. Does anyone know how to speed up the process?

Yours faithfully, Roland Bailly
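Before attaching a profiler, a rough first check is to time the write call in isolation. The following is a minimal sketch using only the Java standard library; the class `TimingSketch`, the method `meanMillis` and the placeholder workload are all hypothetical names for illustration. In a real test the supplier would wrap the `RDFWriter...asString()` call from the question.

```java
import java.util.function.Supplier;

public class TimingSketch {
    /**
     * Runs the task a few times untimed (so the JIT can warm up), then
     * returns the mean wall-clock time of the timed runs in milliseconds.
     */
    static <T> double meanMillis(Supplier<T> task, int warmups, int runs) {
        for (int i = 0; i < warmups; i++) {
            task.get();
        }
        long totalNanos = 0;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.get();
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / (double) runs / 1_000_000.0;
    }

    /** Placeholder workload (an assumption); stands in for the real serialization call. */
    static String placeholderWork() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 100_000; i++) {
            sb.append(i);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        double ms = meanMillis(TimingSketch::placeholderWork, 5, 20);
        System.out.printf("mean time per run: %.3f ms%n", ms);
    }
}
```

Timing the graph construction and the serialization step as two separate suppliers would give a first hint about which side of the Jena/jsonld-java boundary the time is spent on; a profiler is still needed for anything finer-grained.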
RDF Writer JSON LD FRAME performance
Hello!

I have a question about the performance of RDFWriter when writing the JSON-LD FRAME format. Currently I have to process 40k objects in an RDF file. I convert them to JSON with the following code:

    JsonLDWriteContext ctx = new JsonLDWriteContext();
    JsonLdOptions opts = new JsonLdOptions();
    opts.setOmitGraph(true);
    opts.setEmbed(Embed.ALWAYS);
    opts.setProcessingMode(JSON_LD_1_1);
    ctx.setOptions(opts);
    RDFWriter.create().format(RDFFormat.JSONLD_FRAME_FLAT).source(graph).context(ctx).build().asString();

On average it takes 1 to 10 seconds to process. That is a bit too slow. Does anyone know how to speed up the process?

Yours faithfully, Roland Bailly
[jira] [Commented] (JENA-2107) RDF Star performance issue with non-concrete node triples
[ https://issues.apache.org/jira/browse/JENA-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17349712#comment-17349712 ] Andy Seaborne commented on JENA-2107: - Thanks - all the RDF-star processing should be fixed now. > RDF Star performance issue with non-concrete node triples > - > > Key: JENA-2107 > URL: https://issues.apache.org/jira/browse/JENA-2107 > Project: Apache Jena > Issue Type: Improvement > Components: ARQ >Affects Versions: Jena 3.17.0, Jena 4.0.0 >Reporter: Lorenz Bühmann >Assignee: Andy Seaborne >Priority: Critical > Fix For: Jena 4.1.0 > > > the following graph pattern is not evaluated efficiently (results in > full-scan per binding) because the second triple pattern doesn't take > advantage of the bindings generated by evaluation of the first one: > {code:java} > ?s ?o . > << ?s ?o >> ?v . > {code} > A possible fix would be to adapt the method {{rdfStarTripleSub()}} in class > > [SolverRX3.java|https://github.com/apache/jena/blob/2efff8a00b4ffa82751cf46c8a3fed84b6ff3090/jena-arq/src/main/java/org/apache/jena/sparql/engine/main/solver/SolverRX3.java#L63-L71] > by changing the beginning to > {code:java} > private static Iterator rdfStarTripleSub(Binding input, Triple > xPattern, ExecutionContext execCxt) { > Triple tPattern = Substitute.substitute(xPattern, input); > {code} > We went from 75s for a very small dataset (50k triples) to near instant > response times. > If this fix is correct and doesn't break anything, it might be the same way > to fix for its quads counterpart in {{SolverRX4}} class. > > Note, for tdbquery, this seems to be evaluated at a different place? At > least, we couldn't find any performance improvement, it's still horribly slow. -- This message was sent by Atlassian Jira (v8.3.4#803005)
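The idea behind the proposed fix, substituting the bindings already found into the next pattern before matching it, can be illustrated with a toy example. This is only a sketch under stated assumptions: `SubstituteSketch`, its `Triple` class and the string-based terms are hypothetical stand-ins, not Jena's `Triple`, `Binding` or `Substitute` classes, and a linear scan stands in for an index lookup.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SubstituteSketch {
    /** Toy triple; terms starting with '?' are variables. Not Jena's Triple class. */
    static final class Triple {
        final String s, p, o;
        Triple(String s, String p, String o) { this.s = s; this.p = p; this.o = o; }
    }

    static boolean isVar(String term) { return term.startsWith("?"); }

    /** Replaces a bound variable with its value; unbound variables and constants are unchanged. */
    static String substitute(String term, Map<String, String> binding) {
        return isVar(term) ? binding.getOrDefault(term, term) : term;
    }

    /** Applies the binding to every position of the pattern. */
    static Triple substitute(Triple pattern, Map<String, String> binding) {
        return new Triple(substitute(pattern.s, binding),
                          substitute(pattern.p, binding),
                          substitute(pattern.o, binding));
    }

    /** A variable matches anything; a constant must be equal. */
    static boolean matches(String patternTerm, String dataTerm) {
        return isVar(patternTerm) || patternTerm.equals(dataTerm);
    }

    /** Returns the candidate triples for a pattern (a scan here; an index lookup in a real store). */
    static List<Triple> find(List<Triple> data, Triple pattern) {
        List<Triple> out = new ArrayList<>();
        for (Triple t : data) {
            if (matches(pattern.s, t.s) && matches(pattern.p, t.p) && matches(pattern.o, t.o)) {
                out.add(t);
            }
        }
        return out;
    }

    static List<Triple> sampleData() {
        List<Triple> data = new ArrayList<>();
        data.add(new Triple(":a", ":p", ":x"));
        data.add(new Triple(":b", ":p", ":y"));
        data.add(new Triple(":c", ":p", ":z"));
        data.add(new Triple(":a", ":q", ":x"));
        return data;
    }

    public static void main(String[] args) {
        Map<String, String> binding = new HashMap<>();
        binding.put("?s", ":a"); // produced by an earlier pattern
        Triple pattern = new Triple("?s", ":p", "?o");

        // Without substitution every :p triple is a candidate for each input binding.
        System.out.println("without substitution: " + find(sampleData(), pattern).size()); // 3
        // With substitution only triples consistent with the binding are candidates.
        System.out.println("with substitution: " + find(sampleData(), substitute(pattern, binding)).size()); // 1
    }
}
```

In the toy data the unsubstituted pattern yields three candidates and the substituted one yields a single candidate. Repeating the larger lookup for every input binding is the kind of "full-scan per binding" behaviour the report describes.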
[jira] [Resolved] (JENA-2107) RDF Star performance issue with non-concrete node triples
[ https://issues.apache.org/jira/browse/JENA-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Seaborne resolved JENA-2107. - Resolution: Fixed > RDF Star performance issue with non-concrete node triples > - > > Key: JENA-2107 > URL: https://issues.apache.org/jira/browse/JENA-2107 > Project: Apache Jena > Issue Type: Improvement > Components: ARQ >Affects Versions: Jena 3.17.0, Jena 4.0.0 >Reporter: Lorenz Bühmann >Assignee: Andy Seaborne >Priority: Critical > Fix For: Jena 4.1.0 > > > the following graph pattern is not evaluated efficiently (results in > full-scan per binding) because the second triple pattern doesn't take > advantage of the bindings generated by evaluation of the first one: > {code:java} > ?s ?o . > << ?s ?o >> ?v . > {code} > A possible fix would be to adapt the method {{rdfStarTripleSub()}} in class > > [SolverRX3.java|https://github.com/apache/jena/blob/2efff8a00b4ffa82751cf46c8a3fed84b6ff3090/jena-arq/src/main/java/org/apache/jena/sparql/engine/main/solver/SolverRX3.java#L63-L71] > by changing the beginning to > {code:java} > private static Iterator rdfStarTripleSub(Binding input, Triple > xPattern, ExecutionContext execCxt) { > Triple tPattern = Substitute.substitute(xPattern, input); > {code} > We went from 75s for a very small dataset (50k triples) to near instant > response times. > If this fix is correct and doesn't break anything, it might be the same way > to fix for its quads counterpart in {{SolverRX4}} class. > > Note, for tdbquery, this seems to be evaluated at a different place? At > least, we couldn't find any performance improvement, it's still horribly slow. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (JENA-2107) RDF Star performance issue with non-concrete node triples
[ https://issues.apache.org/jira/browse/JENA-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17349711#comment-17349711 ] ASF subversion and git services commented on JENA-2107: --- Commit 653f9f5f23d8fed66e11eee812efc1d852424bff in jena's branch refs/heads/main from Andy Seaborne [ https://gitbox.apache.org/repos/asf?p=jena.git;h=653f9f5 ] Merge pull request #1005 from afs/solver JENA-2107: Substitute in RDF-star triple pattern > RDF Star performance issue with non-concrete node triples > - > > Key: JENA-2107 > URL: https://issues.apache.org/jira/browse/JENA-2107 > Project: Apache Jena > Issue Type: Improvement > Components: ARQ >Affects Versions: Jena 3.17.0, Jena 4.0.0 >Reporter: Lorenz Bühmann >Assignee: Andy Seaborne >Priority: Critical > Fix For: Jena 4.1.0 > > > the following graph pattern is not evaluated efficiently (results in > full-scan per binding) because the second triple pattern doesn't take > advantage of the bindings generated by evaluation of the first one: > {code:java} > ?s ?o . > << ?s ?o >> ?v . > {code} > A possible fix would be to adapt the method {{rdfStarTripleSub()}} in class > > [SolverRX3.java|https://github.com/apache/jena/blob/2efff8a00b4ffa82751cf46c8a3fed84b6ff3090/jena-arq/src/main/java/org/apache/jena/sparql/engine/main/solver/SolverRX3.java#L63-L71] > by changing the beginning to > {code:java} > private static Iterator rdfStarTripleSub(Binding input, Triple > xPattern, ExecutionContext execCxt) { > Triple tPattern = Substitute.substitute(xPattern, input); > {code} > We went from 75s for a very small dataset (50k triples) to near instant > response times. > If this fix is correct and doesn't break anything, it might be the same way > to fix for its quads counterpart in {{SolverRX4}} class. > > Note, for tdbquery, this seems to be evaluated at a different place? At > least, we couldn't find any performance improvement, it's still horribly slow. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (JENA-2107) RDF Star performance issue with non-concrete node triples
[ https://issues.apache.org/jira/browse/JENA-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17349710#comment-17349710 ] ASF subversion and git services commented on JENA-2107: --- Commit 8a1877b8c0a96b8595d1d95eb0b7fb6e88132731 in jena's branch refs/heads/main from Andy Seaborne [ https://gitbox.apache.org/repos/asf?p=jena.git;h=8a1877b ] JENA-2107: Substitute in RDF-star triple pattern > RDF Star performance issue with non-concrete node triples > - > > Key: JENA-2107 > URL: https://issues.apache.org/jira/browse/JENA-2107 > Project: Apache Jena > Issue Type: Improvement > Components: ARQ >Affects Versions: Jena 3.17.0, Jena 4.0.0 >Reporter: Lorenz Bühmann >Assignee: Andy Seaborne >Priority: Critical > Fix For: Jena 4.1.0 > > > the following graph pattern is not evaluated efficiently (results in > full-scan per binding) because the second triple pattern doesn't take > advantage of the bindings generated by evaluation of the first one: > {code:java} > ?s ?o . > << ?s ?o >> ?v . > {code} > A possible fix would be to adapt the method {{rdfStarTripleSub()}} in class > > [SolverRX3.java|https://github.com/apache/jena/blob/2efff8a00b4ffa82751cf46c8a3fed84b6ff3090/jena-arq/src/main/java/org/apache/jena/sparql/engine/main/solver/SolverRX3.java#L63-L71] > by changing the beginning to > {code:java} > private static Iterator rdfStarTripleSub(Binding input, Triple > xPattern, ExecutionContext execCxt) { > Triple tPattern = Substitute.substitute(xPattern, input); > {code} > We went from 75s for a very small dataset (50k triples) to near instant > response times. > If this fix is correct and doesn't break anything, it might be the same way > to fix for its quads counterpart in {{SolverRX4}} class. > > Note, for tdbquery, this seems to be evaluated at a different place? At > least, we couldn't find any performance improvement, it's still horribly slow. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (JENA-2107) RDF Star performance issue with non-concrete node triples
[ https://issues.apache.org/jira/browse/JENA-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17349662#comment-17349662 ] ASF subversion and git services commented on JENA-2107: --- Commit 01c07c332c8d17e293b191268456dbd1541d4be8 in jena's branch refs/heads/main from Andy Seaborne [ https://gitbox.apache.org/repos/asf?p=jena.git;h=01c07c3 ] Merge pull request #1004 from SANSA-Stack/JENA-2107 JENA-2107: Substitute in RDF-star triple pattern > RDF Star performance issue with non-concrete node triples > - > > Key: JENA-2107 > URL: https://issues.apache.org/jira/browse/JENA-2107 > Project: Apache Jena > Issue Type: Improvement > Components: ARQ >Affects Versions: Jena 3.17.0, Jena 4.0.0 >Reporter: Lorenz Bühmann >Assignee: Andy Seaborne >Priority: Critical > Fix For: Jena 4.1.0 > > > the following graph pattern is not evaluated efficiently (results in > full-scan per binding) because the second triple pattern doesn't take > advantage of the bindings generated by evaluation of the first one: > {code:java} > ?s ?o . > << ?s ?o >> ?v . > {code} > A possible fix would be to adapt the method {{rdfStarTripleSub()}} in class > > [SolverRX3.java|https://github.com/apache/jena/blob/2efff8a00b4ffa82751cf46c8a3fed84b6ff3090/jena-arq/src/main/java/org/apache/jena/sparql/engine/main/solver/SolverRX3.java#L63-L71] > by changing the beginning to > {code:java} > private static Iterator rdfStarTripleSub(Binding input, Triple > xPattern, ExecutionContext execCxt) { > Triple tPattern = Substitute.substitute(xPattern, input); > {code} > We went from 75s for a very small dataset (50k triples) to near instant > response times. > If this fix is correct and doesn't break anything, it might be the same way > to fix for its quads counterpart in {{SolverRX4}} class. > > Note, for tdbquery, this seems to be evaluated at a different place? At > least, we couldn't find any performance improvement, it's still horribly slow. 
[jira] [Commented] (JENA-2107) RDF Star performance issue with non-concrete node triples
[ https://issues.apache.org/jira/browse/JENA-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17349660#comment-17349660 ] ASF subversion and git services commented on JENA-2107: --- Commit d5a37f524a031cb86b8c5a0a076a915afd4e6851 in jena's branch refs/heads/main from Lorenz Buehmann [ https://gitbox.apache.org/repos/asf?p=jena.git;h=d5a37f5 ] Merge branch 'JENA-2107' of github.com:SANSA-Stack/jena into JENA-2107 > RDF Star performance issue with non-concrete node triples > - > > Key: JENA-2107 > URL: https://issues.apache.org/jira/browse/JENA-2107 > Project: Apache Jena > Issue Type: Improvement > Components: ARQ >Affects Versions: Jena 3.17.0, Jena 4.0.0 >Reporter: Lorenz Bühmann >Assignee: Andy Seaborne >Priority: Critical > Fix For: Jena 4.1.0 > > > the following graph pattern is not evaluated efficiently (results in > full-scan per binding) because the second triple pattern doesn't take > advantage of the bindings generated by evaluation of the first one: > {code:java} > ?s ?o . > << ?s ?o >> ?v . > {code} > A possible fix would be to adapt the method {{rdfStarTripleSub()}} in class > > [SolverRX3.java|https://github.com/apache/jena/blob/2efff8a00b4ffa82751cf46c8a3fed84b6ff3090/jena-arq/src/main/java/org/apache/jena/sparql/engine/main/solver/SolverRX3.java#L63-L71] > by changing the beginning to > {code:java} > private static Iterator rdfStarTripleSub(Binding input, Triple > xPattern, ExecutionContext execCxt) { > Triple tPattern = Substitute.substitute(xPattern, input); > {code} > We went from 75s for a very small dataset (50k triples) to near instant > response times. > If this fix is correct and doesn't break anything, it might be the same way > to fix for its quads counterpart in {{SolverRX4}} class. > > Note, for tdbquery, this seems to be evaluated at a different place? At > least, we couldn't find any performance improvement, it's still horribly slow. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (JENA-2107) RDF Star performance issue with non-concrete node triples
[ https://issues.apache.org/jira/browse/JENA-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17349661#comment-17349661 ] ASF subversion and git services commented on JENA-2107: --- Commit 01c07c332c8d17e293b191268456dbd1541d4be8 in jena's branch refs/heads/main from Andy Seaborne [ https://gitbox.apache.org/repos/asf?p=jena.git;h=01c07c3 ] Merge pull request #1004 from SANSA-Stack/JENA-2107 JENA-2107: Substitute in RDF-star triple pattern > RDF Star performance issue with non-concrete node triples > - > > Key: JENA-2107 > URL: https://issues.apache.org/jira/browse/JENA-2107 > Project: Apache Jena > Issue Type: Improvement > Components: ARQ >Affects Versions: Jena 3.17.0, Jena 4.0.0 >Reporter: Lorenz Bühmann >Assignee: Andy Seaborne >Priority: Critical > Fix For: Jena 4.1.0 > > > the following graph pattern is not evaluated efficiently (results in > full-scan per binding) because the second triple pattern doesn't take > advantage of the bindings generated by evaluation of the first one: > {code:java} > ?s ?o . > << ?s ?o >> ?v . > {code} > A possible fix would be to adapt the method {{rdfStarTripleSub()}} in class > > [SolverRX3.java|https://github.com/apache/jena/blob/2efff8a00b4ffa82751cf46c8a3fed84b6ff3090/jena-arq/src/main/java/org/apache/jena/sparql/engine/main/solver/SolverRX3.java#L63-L71] > by changing the beginning to > {code:java} > private static Iterator rdfStarTripleSub(Binding input, Triple > xPattern, ExecutionContext execCxt) { > Triple tPattern = Substitute.substitute(xPattern, input); > {code} > We went from 75s for a very small dataset (50k triples) to near instant > response times. > If this fix is correct and doesn't break anything, it might be the same way > to fix for its quads counterpart in {{SolverRX4}} class. > > Note, for tdbquery, this seems to be evaluated at a different place? At > least, we couldn't find any performance improvement, it's still horribly slow. 
[jira] [Commented] (JENA-2107) RDF Star performance issue with non-concrete node triples
[ https://issues.apache.org/jira/browse/JENA-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17349659#comment-17349659 ] ASF subversion and git services commented on JENA-2107: --- Commit d5a37f524a031cb86b8c5a0a076a915afd4e6851 in jena's branch refs/heads/main from Lorenz Buehmann [ https://gitbox.apache.org/repos/asf?p=jena.git;h=d5a37f5 ] Merge branch 'JENA-2107' of github.com:SANSA-Stack/jena into JENA-2107 > RDF Star performance issue with non-concrete node triples > - > > Key: JENA-2107 > URL: https://issues.apache.org/jira/browse/JENA-2107 > Project: Apache Jena > Issue Type: Improvement > Components: ARQ >Affects Versions: Jena 3.17.0, Jena 4.0.0 >Reporter: Lorenz Bühmann >Assignee: Andy Seaborne >Priority: Critical > Fix For: Jena 4.1.0 > > > the following graph pattern is not evaluated efficiently (results in > full-scan per binding) because the second triple pattern doesn't take > advantage of the bindings generated by evaluation of the first one: > {code:java} > ?s ?o . > << ?s ?o >> ?v . > {code} > A possible fix would be to adapt the method {{rdfStarTripleSub()}} in class > > [SolverRX3.java|https://github.com/apache/jena/blob/2efff8a00b4ffa82751cf46c8a3fed84b6ff3090/jena-arq/src/main/java/org/apache/jena/sparql/engine/main/solver/SolverRX3.java#L63-L71] > by changing the beginning to > {code:java} > private static Iterator rdfStarTripleSub(Binding input, Triple > xPattern, ExecutionContext execCxt) { > Triple tPattern = Substitute.substitute(xPattern, input); > {code} > We went from 75s for a very small dataset (50k triples) to near instant > response times. > If this fix is correct and doesn't break anything, it might be the same way > to fix for its quads counterpart in {{SolverRX4}} class. > > Note, for tdbquery, this seems to be evaluated at a different place? At > least, we couldn't find any performance improvement, it's still horribly slow. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (JENA-2107) RDF Star performance issue with non-concrete node triples
[ https://issues.apache.org/jira/browse/JENA-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17349658#comment-17349658 ] ASF subversion and git services commented on JENA-2107: --- Commit 80760e815b7abc6b9dfa743bbe4ab893b6f8e05e in jena's branch refs/heads/main from Lorenz Buehmann [ https://gitbox.apache.org/repos/asf?p=jena.git;h=80760e8 ] JENA-2107: Substitute in RDF-star triple pattern > RDF Star performance issue with non-concrete node triples > - > > Key: JENA-2107 > URL: https://issues.apache.org/jira/browse/JENA-2107 > Project: Apache Jena > Issue Type: Improvement > Components: ARQ >Affects Versions: Jena 3.17.0, Jena 4.0.0 >Reporter: Lorenz Bühmann >Assignee: Andy Seaborne >Priority: Critical > Fix For: Jena 4.1.0 > > > the following graph pattern is not evaluated efficiently (results in > full-scan per binding) because the second triple pattern doesn't take > advantage of the bindings generated by evaluation of the first one: > {code:java} > ?s ?o . > << ?s ?o >> ?v . > {code} > A possible fix would be to adapt the method {{rdfStarTripleSub()}} in class > > [SolverRX3.java|https://github.com/apache/jena/blob/2efff8a00b4ffa82751cf46c8a3fed84b6ff3090/jena-arq/src/main/java/org/apache/jena/sparql/engine/main/solver/SolverRX3.java#L63-L71] > by changing the beginning to > {code:java} > private static Iterator rdfStarTripleSub(Binding input, Triple > xPattern, ExecutionContext execCxt) { > Triple tPattern = Substitute.substitute(xPattern, input); > {code} > We went from 75s for a very small dataset (50k triples) to near instant > response times. > If this fix is correct and doesn't break anything, it might be the same way > to fix for its quads counterpart in {{SolverRX4}} class. > > Note, for tdbquery, this seems to be evaluated at a different place? At > least, we couldn't find any performance improvement, it's still horribly slow. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (JENA-2107) RDF Star performance issue with non-concrete node triples
[ https://issues.apache.org/jira/browse/JENA-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17349657#comment-17349657 ] ASF subversion and git services commented on JENA-2107: --- Commit 80804af22cf107b6074ed9ee58e1fb055428bc20 in jena's branch refs/heads/main from Lorenz Buehmann [ https://gitbox.apache.org/repos/asf?p=jena.git;h=80804af ] adresses JENA-2107 > RDF Star performance issue with non-concrete node triples > - > > Key: JENA-2107 > URL: https://issues.apache.org/jira/browse/JENA-2107 > Project: Apache Jena > Issue Type: Improvement > Components: ARQ >Affects Versions: Jena 3.17.0, Jena 4.0.0 >Reporter: Lorenz Bühmann >Assignee: Andy Seaborne >Priority: Critical > Fix For: Jena 4.1.0 > > > the following graph pattern is not evaluated efficiently (results in > full-scan per binding) because the second triple pattern doesn't take > advantage of the bindings generated by evaluation of the first one: > {code:java} > ?s ?o . > << ?s ?o >> ?v . > {code} > A possible fix would be to adapt the method {{rdfStarTripleSub()}} in class > > [SolverRX3.java|https://github.com/apache/jena/blob/2efff8a00b4ffa82751cf46c8a3fed84b6ff3090/jena-arq/src/main/java/org/apache/jena/sparql/engine/main/solver/SolverRX3.java#L63-L71] > by changing the beginning to > {code:java} > private static Iterator rdfStarTripleSub(Binding input, Triple > xPattern, ExecutionContext execCxt) { > Triple tPattern = Substitute.substitute(xPattern, input); > {code} > We went from 75s for a very small dataset (50k triples) to near instant > response times. > If this fix is correct and doesn't break anything, it might be the same way > to fix for its quads counterpart in {{SolverRX4}} class. > > Note, for tdbquery, this seems to be evaluated at a different place? At > least, we couldn't find any performance improvement, it's still horribly slow. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (JENA-2107) RDF Star performance issue with non-concrete node triples
[ https://issues.apache.org/jira/browse/JENA-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Seaborne reassigned JENA-2107: --- Assignee: Andy Seaborne > RDF Star performance issue with non-concrete node triples > - > > Key: JENA-2107 > URL: https://issues.apache.org/jira/browse/JENA-2107 > Project: Apache Jena > Issue Type: Improvement > Components: ARQ >Affects Versions: Jena 3.17.0, Jena 4.0.0 >Reporter: Lorenz Bühmann >Assignee: Andy Seaborne >Priority: Critical > Fix For: Jena 4.1.0 > > > the following graph pattern is not evaluated efficiently (results in > full-scan per binding) because the second triple pattern doesn't take > advantage of the bindings generated by evaluation of the first one: > {code:java} > ?s ?o . > << ?s ?o >> ?v . > {code} > A possible fix would be to adapt the method {{rdfStarTripleSub()}} in class > > [SolverRX3.java|https://github.com/apache/jena/blob/2efff8a00b4ffa82751cf46c8a3fed84b6ff3090/jena-arq/src/main/java/org/apache/jena/sparql/engine/main/solver/SolverRX3.java#L63-L71] > by changing the beginning to > {code:java} > private static Iterator rdfStarTripleSub(Binding input, Triple > xPattern, ExecutionContext execCxt) { > Triple tPattern = Substitute.substitute(xPattern, input); > {code} > We went from 75s for a very small dataset (50k triples) to near instant > response times. > If this fix is correct and doesn't break anything, it might be the same way > to fix for its quads counterpart in {{SolverRX4}} class. > > Note, for tdbquery, this seems to be evaluated at a different place? At > least, we couldn't find any performance improvement, it's still horribly slow. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (JENA-2107) RDF Star performance issue with non-concrete node triples
[ https://issues.apache.org/jira/browse/JENA-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17346897#comment-17346897 ] Andy Seaborne commented on JENA-2107: - All the solvers. PR #1005 includes the SolverRX3 fix from PR #1004. > RDF Star performance issue with non-concrete node triples > - > > Key: JENA-2107 > URL: https://issues.apache.org/jira/browse/JENA-2107 > Project: Apache Jena > Issue Type: Improvement > Components: ARQ >Affects Versions: Jena 3.17.0, Jena 4.0.0 >Reporter: Lorenz Bühmann >Priority: Critical > Fix For: Jena 4.1.0 > > > the following graph pattern is not evaluated efficiently (results in > full-scan per binding) because the second triple pattern doesn't take > advantage of the bindings generated by evaluation of the first one: > {code:java} > ?s ?o . > << ?s ?o >> ?v . > {code} > A possible fix would be to adapt the method {{rdfStarTripleSub()}} in class > > [SolverRX3.java|https://github.com/apache/jena/blob/2efff8a00b4ffa82751cf46c8a3fed84b6ff3090/jena-arq/src/main/java/org/apache/jena/sparql/engine/main/solver/SolverRX3.java#L63-L71] > by changing the beginning to > {code:java} > private static Iterator rdfStarTripleSub(Binding input, Triple > xPattern, ExecutionContext execCxt) { > Triple tPattern = Substitute.substitute(xPattern, input); > {code} > We went from 75s for a very small dataset (50k triples) to near instant > response times. > If this fix is correct and doesn't break anything, it might be the same way > to fix for its quads counterpart in {{SolverRX4}} class. > > Note, for tdbquery, this seems to be evaluated at a different place? At > least, we couldn't find any performance improvement, it's still horribly slow. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (JENA-2107) RDF Star performance issue with non-concrete node triples
[ https://issues.apache.org/jira/browse/JENA-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17346874#comment-17346874 ] Andy Seaborne edited comment on JENA-2107 at 5/18/21, 1:16 PM: --- Indexing: certainly it can be added. I've kept away from changes that change the disk layout. Disk changes are a more permanent commitment. Adding is a data reload; withdrawing/changing the feature is a reload and disruption. The first {{RDF*}} Jena implementation was a bit PG mode and a bit SA mode (PG mode - the triple is always also an asserted triple, like annotation syntax). It exploited the existing indexes to look up {{<<...>>}} patterns. The current {{RDF-star}} compliant implementation is code only, no disk changes. With the fix for this JIRA, use of annotation syntax should be reasonable (the asserted triple will come first). From the current state (4.1.0 onwards), functionally correct and complete, we can see what the user uptake is. One thing to avoid is making a change, then needing to make another change, ... and another ... for a feature that not everyone is going to use. The index setup is currently fixed - we can change it to look for additional indexes and have a tool for adding indexes. The bulk loaders need adjusting - TDB2 bulk loaders do work incrementally and make a difference when adding a lot of data (comparable to the size of the existing data - if smaller, there is little point using them). was (Author: andy.seaborne): Indexing: certainly it can be added. I've kept away from changes that change the disk layout. Disk changes are a more permanent commitment. Adding is a data reload; withdrawing/changing the feature is a reload and disruption. The first {{RDF*}} Jena implementation was a bit PG mode and a bit SA mode (PG mode - the triple is always also an asserted triple, like annotation syntax). It exploited the existing indexes to look up {{<<...>>}} patterns. The current \{{RDF-star} compliant implementation is code only, no disk changes. 
With the fix for this JIRA, use of annotation syntax should be reasonable (the asserted triple will come first). From the current state (4.1.0 onwards), functionally correct and complete, we can see what the user uptake is. One thing to avoid is making a change, then needing to make another change, ... and another ... for a feature that not everyone is going to use. The index setup is currently fixed - we can change it to look for additional indexes and have a tool for adding indexes. The bulk loaders need adjusting - TDB2 bulk loaders do work incrementally and make a difference when adding a lot of data (comparable to the size of the existing data - if smaller, there is little point using them). > RDF Star performance issue with non-concrete node triples > - > > Key: JENA-2107 > URL: https://issues.apache.org/jira/browse/JENA-2107 > Project: Apache Jena > Issue Type: Improvement > Components: ARQ >Affects Versions: Jena 3.17.0, Jena 4.0.0 >Reporter: Lorenz Bühmann >Priority: Critical > Fix For: Jena 4.1.0 > > > the following graph pattern is not evaluated efficiently (results in > full-scan per binding) because the second triple pattern doesn't take > advantage of the bindings generated by evaluation of the first one: > {code:java} > ?s ?o . > << ?s ?o >> ?v . > {code} > A possible fix would be to adapt the method {{rdfStarTripleSub()}} in class > > [SolverRX3.java|https://github.com/apache/jena/blob/2efff8a00b4ffa82751cf46c8a3fed84b6ff3090/jena-arq/src/main/java/org/apache/jena/sparql/engine/main/solver/SolverRX3.java#L63-L71] > by changing the beginning to > {code:java} > private static Iterator rdfStarTripleSub(Binding input, Triple > xPattern, ExecutionContext execCxt) { > Triple tPattern = Substitute.substitute(xPattern, input); > {code} > We went from 75s for a very small dataset (50k triples) to near instant > response times. > If this fix is correct and doesn't break anything, it might be the same way > to fix for its quads counterpart in {{SolverRX4}} class. 
> > Note, for tdbquery, this seems to be evaluated at a different place? At > least, we couldn't find any performance improvement, it's still horribly slow. -- This message was sent by Atlassian Jira (v8.3.4#803005)
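To illustrate why substituting the input binding into the pattern before the lookup matters, here is a minimal sketch in plain Java. It does not use Jena: `Trip`, `ToyStore`, `substitute`, `compatible` and `evaluate` are hypothetical stand-ins invented for this example, strings replace Jena's nodes, and the store is assumed to have only a subject index. The sketch only shows the shape of the problem: without substitution every binding triggers a scan, with substitution the lookup can use the index.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SubstituteSketch {
    /** A triple of plain strings; a term starting with '?' is a variable. */
    static final class Trip {
        final String s, p, o;
        Trip(String s, String p, String o) { this.s = s; this.p = p; this.o = o; }
    }

    /** Toy store with a subject index; counts how many stored triples lookups inspect. */
    static final class ToyStore {
        final List<Trip> all = new ArrayList<>();
        final Map<String, List<Trip>> bySubject = new HashMap<>();
        long inspected = 0;

        void add(Trip t) {
            all.add(t);
            bySubject.computeIfAbsent(t.s, k -> new ArrayList<>()).add(t);
        }

        /** Uses the index if the subject is concrete, otherwise scans every triple. */
        List<Trip> find(Trip pattern) {
            List<Trip> candidates = isVar(pattern.s)
                    ? all
                    : bySubject.getOrDefault(pattern.s, Collections.<Trip>emptyList());
            List<Trip> out = new ArrayList<>();
            for (Trip t : candidates) {
                inspected++;
                if (matches(pattern.s, t.s) && matches(pattern.p, t.p) && matches(pattern.o, t.o)) {
                    out.add(t);
                }
            }
            return out;
        }
    }

    static boolean isVar(String term) { return term.startsWith("?"); }

    static boolean matches(String patternTerm, String value) {
        return isVar(patternTerm) || patternTerm.equals(value);
    }

    /** Replaces every variable that the binding defines by its bound value. */
    static Trip substitute(Trip pattern, Map<String, String> binding) {
        return new Trip(binding.getOrDefault(pattern.s, pattern.s),
                        binding.getOrDefault(pattern.p, pattern.p),
                        binding.getOrDefault(pattern.o, pattern.o));
    }

    /** True if the triple agrees with the binding on every variable of the pattern. */
    static boolean compatible(Trip pattern, Trip t, Map<String, String> binding) {
        return agrees(pattern.s, t.s, binding)
                && agrees(pattern.p, t.p, binding)
                && agrees(pattern.o, t.o, binding);
    }

    static boolean agrees(String patternTerm, String value, Map<String, String> binding) {
        String bound = binding.get(patternTerm);
        return bound == null || bound.equals(value);
    }

    /** Evaluates the pattern once per input binding and returns the number of solutions. */
    static int evaluate(ToyStore store, Trip pattern,
                        List<Map<String, String>> bindings, boolean substituteFirst) {
        store.inspected = 0;
        int solutions = 0;
        for (Map<String, String> b : bindings) {
            Trip lookup = substituteFirst ? substitute(pattern, b) : pattern;
            for (Trip t : store.find(lookup)) {
                if (compatible(pattern, t, b)) solutions++;
            }
        }
        return solutions;
    }

    /** Builds n triples (n0 p v), (n1 p v), ... for the demo. */
    static ToyStore demoStore(int n) {
        ToyStore store = new ToyStore();
        for (int i = 0; i < n; i++) store.add(new Trip("n" + i, "p", "v"));
        return store;
    }

    /** Builds n input bindings, each binding ?s to one subject. */
    static List<Map<String, String>> demoBindings(int n) {
        List<Map<String, String>> bindings = new ArrayList<>();
        for (int i = 0; i < n; i++) bindings.add(Collections.singletonMap("?s", "n" + i));
        return bindings;
    }

    public static void main(String[] args) {
        ToyStore store = demoStore(100);
        List<Map<String, String>> bindings = demoBindings(100);
        Trip pattern = new Trip("?s", "p", "?o");
        for (boolean substituteFirst : new boolean[] {false, true}) {
            int solutions = evaluate(store, pattern, bindings, substituteFirst);
            System.out.println("substituteFirst=" + substituteFirst
                    + " solutions=" + solutions + " inspected=" + store.inspected);
        }
    }
}
```

Both variants return the same solutions; only the number of inspected triples differs (100 x 100 versus 100 x 1 in this toy setup). How Jena's real indexes behave for embedded triples is not modelled here.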
[jira] [Commented] (JENA-2107) RDF Star performance issue with non-concrete node triples
[ https://issues.apache.org/jira/browse/JENA-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17346864#comment-17346864 ] Andy Seaborne commented on JENA-2107: - {quote}fixed the commit message. {quote} The commit message - not the PR message (I can change the PR message, I can't change a commit message).
[jira] [Commented] (JENA-2107) RDF Star performance issue with non-concrete node triples
[ https://issues.apache.org/jira/browse/JENA-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17346772#comment-17346772 ]

Lorenz Bühmann commented on JENA-2107:
--------------------------------------

Some numbers:

#Triples = {{50,367}}

The shape of the data is:
 * 100 nodes with a directed connection to each other, i.e. 9900 triples of
 * for each connection we have 4 triples making statements about the connection
 * plus some other data about the nodes themselves

A simplified query on the data executed is
{code:sql}
SELECT (count(*) as ?cnt) { ?src ?target . < ?target>> ?val1 ; ?val2 }
{code}

h4. Runtimes:
{code:java}
sparql --time --repeat 2,5 --data ... --query ...
{code}

h5. Jena 4.0.0
Time: 81.749 sec
Total time: 403.872 sec for repeat count of 5 : average: 80.774

h5. Jena 4.1.0-SNAPSHOT with fix
Time: 0.039 sec
Total time: 0.422 sec for repeat count of 5 : average: 0.084
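As a quick sanity check on the figures reported above, the arithmetic can be sketched in a few lines of Java. The class and method names are made up for this example, and the breakdown assumes that the connection triples and the four annotation triples per connection are counted separately; the comment does not state this explicitly.

```java
public class BenchmarkArithmetic {
    /** Number of directed connections between n distinct nodes (no self-loops). */
    static long directedConnections(int nodes) {
        return (long) nodes * (nodes - 1);
    }

    /** Ratio of the average runtime before the fix to the average runtime after it. */
    static double speedup(double beforeSeconds, double afterSeconds) {
        return beforeSeconds / afterSeconds;
    }

    public static void main(String[] args) {
        long connections = directedConnections(100);   // 100 * 99 = 9900
        long annotations = 4 * connections;            // 4 triples per connection
        // Assumption: everything else is "other data about the nodes".
        long other = 50_367 - connections - annotations;
        System.out.println("connections=" + connections
                + " annotations=" + annotations + " other=" + other);
        // Averages reported for Jena 4.0.0 and 4.1.0-SNAPSHOT with the fix.
        System.out.println("speedup=" + Math.round(speedup(80.774, 0.084)) + "x");
    }
}
```

Under these assumptions the reported averages correspond to a speedup of roughly three orders of magnitude on this dataset.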
[jira] [Commented] (JENA-2107) RDF Star performance issue with non-concrete node triples
[ https://issues.apache.org/jira/browse/JENA-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17346614#comment-17346614 ]

Lorenz Bühmann commented on JENA-2107:
--------------------------------------

[~andy] fixed the commit message. And sure, I can test it with our dataset and post the numbers here to get an idea of the performance gain. Indeed, a benchmark generator as well as some test queries would be better, but I'm sure somebody will do this for RDF-star anyway in the near future. (By the way, I also tried a quick fix for TDB1/TDB2 locally; it's basically the same, I guess, except for calling the {{convToBinding}} method to get the binding from the ID.)

We were also wondering if you're thinking of any index structures for the embedded triples? At least for the top level it would be possible - though, to be fair, already here we would have to do it for subject and object position and then each permutation ... sounds like overkill, I think? Especially as there isn't currently that much demand; as usual, a tradeoff.

[~Aklakan] what would be the purpose of tracking the sizes? (OK, we can discuss internally in the office later.)
[jira] [Comment Edited] (JENA-2107) RDF Star performance issue with non-concrete node triples
[ https://issues.apache.org/jira/browse/JENA-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17346475#comment-17346475 ]

Claus Stadler edited comment on JENA-2107 at 5/17/21, 11:25 PM:
----------------------------------------------------------------

For the Dataset-based implementation we could subclass DatasetGraphWrapper and override its find methods to keep track of the internal iterator sizes. After running a query on such a dataset instance one could then check whether only a specific number of tuples have been touched.

Alternatively, one could track the arguments passed to find and check whether those match an expected sequence (or set) of reference arguments - which would be more traceable than mere counts.

Sketch:
{code:java}
import java.util.*;
import org.apache.jena.atlas.iterator.Iter;
import org.apache.jena.graph.Node;
import org.apache.jena.sparql.core.DatasetGraph;
import org.apache.jena.sparql.core.DatasetGraphWrapper;
import org.apache.jena.sparql.core.Quad;

class TrackingDatasetGraph extends DatasetGraphWrapper {
    protected long numSeenTuples = 0;
    protected Collection<List<Node>> seenArgs = new LinkedHashSet<>(); // or ArrayList

    TrackingDatasetGraph(DatasetGraph dsg) { super(dsg); }

    @Override
    public Iterator<Quad> find(Node g, Node s, Node p, Node o) {
        // Record the lookup pattern, then count the quads it yields.
        seenArgs.add(Arrays.asList(g, s, p, o));
        Iterator<Quad> it = getR().find(g, s, p, o);
        List<Quad> materialized = Iter.toList(it);
        numSeenTuples += materialized.size();
        return materialized.iterator();
    }
}
{code}

It's just somewhat cumbersome having to repeat the same pattern for NodeTupleTable(Wrapper); and I am not sure how accessible this part of the engine is for testing these internals.

Having at least a single test case would already be beneficial for detecting regressions in this regard while work on RDF star progresses.
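The second idea in the comment above, comparing the recorded find arguments against an expected sequence, can be tried out without Jena. The following is a rough, self-contained sketch in plain Java: `QuadFinder`, `ListFinder` and `TrackingFinder` are hypothetical names for this example only, strings stand in for nodes, and `null` is assumed to act as a wildcard. It is not how the Jena test suite does it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class TrackingSketch {
    /** Hypothetical minimal lookup interface standing in for a dataset's find. */
    interface QuadFinder {
        Iterator<List<String>> find(String g, String s, String p, String o);
    }

    /** In-memory finder over a list of quads; a null argument matches anything. */
    static final class ListFinder implements QuadFinder {
        private final List<List<String>> quads = new ArrayList<>();

        void add(String g, String s, String p, String o) {
            quads.add(Arrays.asList(g, s, p, o));
        }

        @Override
        public Iterator<List<String>> find(String g, String s, String p, String o) {
            List<String> pattern = Arrays.asList(g, s, p, o);
            List<List<String>> out = new ArrayList<>();
            for (List<String> q : quads) {
                boolean ok = true;
                for (int i = 0; i < 4; i++) {
                    String term = pattern.get(i);
                    if (term != null && !term.equals(q.get(i))) ok = false;
                }
                if (ok) out.add(q);
            }
            return out.iterator();
        }
    }

    /** Decorator that records every lookup pattern and counts the returned quads. */
    static final class TrackingFinder implements QuadFinder {
        private final QuadFinder delegate;
        long numSeenTuples = 0;
        final List<List<String>> seenArgs = new ArrayList<>();

        TrackingFinder(QuadFinder delegate) { this.delegate = delegate; }

        @Override
        public Iterator<List<String>> find(String g, String s, String p, String o) {
            seenArgs.add(Arrays.asList(g, s, p, o)); // Arrays.asList permits nulls
            List<List<String>> materialized = new ArrayList<>();
            delegate.find(g, s, p, o).forEachRemaining(materialized::add);
            numSeenTuples += materialized.size();
            return materialized.iterator();
        }
    }

    public static void main(String[] args) {
        ListFinder base = new ListFinder();
        base.add("g", "a", "p", "x");
        base.add("g", "a", "q", "y");
        base.add("g", "b", "p", "z");

        TrackingFinder tracking = new TrackingFinder(base);
        tracking.find("g", "a", null, null);

        // A regression test would compare against the lookups it expects.
        List<List<String>> expected = Arrays.asList(Arrays.asList("g", "a", null, null));
        System.out.println("args as expected: " + tracking.seenArgs.equals(expected));
        System.out.println("tuples touched: " + tracking.numSeenTuples);
    }
}
```

Materializing each result to count it is acceptable in a test but would defeat streaming in production code, which is presumably why the original sketch is framed as a test aid.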
[jira] [Comment Edited] (JENA-2107) RDF Star performance issue with non-concrete node triples
[ https://issues.apache.org/jira/browse/JENA-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17346475#comment-17346475 ] Claus Stadler edited comment on JENA-2107 at 5/17/21, 11:22 PM: For the Dataset-based implementation we could subclass the find methods of DatasetGraphWrapper to keep track of the internal iterator sizes. After running a query on such a dataset instance one could then check whether only a specific number of tuples have been touched Alternatively, one could track the arguments passed to find and check whether those match an expected sequence (or set) of reference arguments - which would be more traceable than mere counts. Sketch: {code:java} class TrackingDatasetGraph extends DatasetGraphWrapper { protected long numSeenTuples = 0; protected Collection seenArgs = new LinkedHashSet<>(); // or ArrayList @Override public Iterator find(Node g, Node s, Node p, Node o) { seenArgs.add(Arrays.asList(g, s, p, o)); Iterator it = getR().find() List materialized = Iter.toList(it); numSeenTuples += materialized.size(); return materialized.iterator(); } } {code} It's just somewhat cumbersome if we had to repeat the same pattern for NodeTupleTable(Wrapper) ; and I am not sure about how accessible this engine is for testing these internals. I know that TDB2 has a DatasetGraph abstraction - and I am assuming TDB1 has one too - so above sketch might already be sufficient to test all RDF star implementations. Having at least a single test case would already be beneficial for detecting regressions in this regard while work on RDF star progresses. was (Author: aklakan): For the Dataset-based implementation we could subclass the find methods of DatasetGraphWrapper to keep track of the internal iterator sizes. 
After running a query on such a dataset instance one could then check whether only a specific number of tuples have been touched Alternatively, one could track the arguments passed to find and check whether those match an expected sequence (or set) of reference arguments - which would be more traceable than mere counts. Sketch: {code:java} class TrackingDatasetGraph extends DatasetGraphWrapper { protected long numSeenTuples = 0; protected Collection seenArgs = new LinkedHashSet<>(); // or ArrayList @Override public Iterator find(Node g, Node s, Node p, Node o) { seenArgs.add(Arrays.asList(g, s, p, o)); Iterator it = getR().find() List materialized = Iter.toList(it); numSeenTuples += materialized.size(); return materialized.iterator(); } } {code} It's just somewhat cumbersome having to repeat the same pattern for NodeTupleTable(Wrapper) ; and I am not sure about how accessible this engine is for testing these internals. Having at least a single test case would already be beneficial for detecting regressions in this regard while work on RDF star progresses. > RDF Star performance issue with non-concrete node triples > - > > Key: JENA-2107 > URL: https://issues.apache.org/jira/browse/JENA-2107 > Project: Apache Jena > Issue Type: Improvement > Components: ARQ >Affects Versions: Jena 3.17.0, Jena 4.0.0 >Reporter: Lorenz Bühmann >Priority: Critical > Fix For: Jena 4.1.0 > > > the following graph pattern is not evaluated efficiently (results in > full-scan per binding) because the second triple pattern doesn't take > advantage of the bindings generated by evaluation of the first one: > {code:java} > ?s ?o . > << ?s ?o >> ?v . 
> {code} > A possible fix would be to adapt the method {{rdfStarTripleSub()}} in class > > [SolverRX3.java|https://github.com/apache/jena/blob/2efff8a00b4ffa82751cf46c8a3fed84b6ff3090/jena-arq/src/main/java/org/apache/jena/sparql/engine/main/solver/SolverRX3.java#L63-L71] > by changing the beginning to > {code:java} > private static Iterator rdfStarTripleSub(Binding input, Triple > xPattern, ExecutionContext execCxt) { > Triple tPattern = Substitute.substitute(xPattern, input); > {code} > We went from 75s for a very small dataset (50k triples) to near instant > response times. > If this fix is correct and doesn't break anything, it might be the same way > to fix for its quads counterpart in {{SolverRX4}} class. > > Note, for tdbquery, this seems to be evaluated at a different place? At > least, we couldn't find any performance improvement, it's still horribly slow. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (JENA-2107) RDF Star performance issue with non-concrete node triples
[ https://issues.apache.org/jira/browse/JENA-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17346475#comment-17346475 ] Claus Stadler edited comment on JENA-2107 at 5/17/21, 11:17 PM: For the Dataset-based implementation we could subclass the find methods of DatasetGraphWrapper to keep track of the internal iterator sizes. After running a query on such a dataset instance one could then check whether only a specific number of tuples have been touched Alternatively, one could track the arguments passed to find and check whether those match an expected sequence (or set) of reference arguments - which would be more traceable than mere counts. Sketch: {code:java} class TrackingDatasetGraph extends DatasetGraphWrapper { protected long numSeenTuples = 0; protected Collection seenArgs = new LinkedHashSet<>(); // or ArrayList @Override public Iterator find(Node g, Node s, Node p, Node o) { seenArgs.add(Arrays.asList(g, s, p, o)); Iterator it = getR().find() List materialized = Iter.toList(it); numSeenTuples += materialized.size(); return materialized.iterator(); } } {code} It's just somewhat cumbersome having to repeat the same pattern for NodeTupleTable(Wrapper) ; and I am not sure about how accessible this engine is for testing these internals. Having at least a single test case would already be beneficial for detecting regressions in this regard while work on RDF star progresses. was (Author: aklakan): For the Dataset-based implementation we could subclass the find methods of DatasetGraphWrapper to keep track of the internal iterator sizes. After running a query on such an dataset instance one could then check whether only a specific number of tuples have been touched Alternatively, one could track the arguments passed to find and check whether those match an expected sequence (or set) of reference arguments - which would be more traceable than mere counts. 
Sketch: {code:java} class TrackingDatasetGraph extends DatasetGraphWrapper { protected long numSeenTuples = 0; protected Collection seenArgs = new LinkedHashSet<>(); // or ArrayList @Override public Iterator find(Node g, Node s, Node p, Node o) { seenArgs.add(Arrays.asList(g, s, p, o)); Iterator it = getR().find() List materialized = Iter.toList(it); numSeenTuples += materialized.size(); return materialized.iterator(); } } {code} It's just somewhat cumbersome having to repeat the same pattern for NodeTupleTable(Wrapper) ; and I am not sure about how accessible this engine is for testing these internals. Having at least a single test case would already be beneficial for detecting regressions in this regard while work on RDF star progresses. > RDF Star performance issue with non-concrete node triples > - > > Key: JENA-2107 > URL: https://issues.apache.org/jira/browse/JENA-2107 > Project: Apache Jena > Issue Type: Improvement > Components: ARQ >Affects Versions: Jena 3.17.0, Jena 4.0.0 >Reporter: Lorenz Bühmann >Priority: Critical > Fix For: Jena 4.1.0 > > > the following graph pattern is not evaluated efficiently (results in > full-scan per binding) because the second triple pattern doesn't take > advantage of the bindings generated by evaluation of the first one: > {code:java} > ?s ?o . > << ?s ?o >> ?v . > {code} > A possible fix would be to adapt the method {{rdfStarTripleSub()}} in class > > [SolverRX3.java|https://github.com/apache/jena/blob/2efff8a00b4ffa82751cf46c8a3fed84b6ff3090/jena-arq/src/main/java/org/apache/jena/sparql/engine/main/solver/SolverRX3.java#L63-L71] > by changing the beginning to > {code:java} > private static Iterator rdfStarTripleSub(Binding input, Triple > xPattern, ExecutionContext execCxt) { > Triple tPattern = Substitute.substitute(xPattern, input); > {code} > We went from 75s for a very small dataset (50k triples) to near instant > response times. 
> If this fix is correct and doesn't break anything, it might be the same way > to fix for its quads counterpart in {{SolverRX4}} class. > > Note, for tdbquery, this seems to be evaluated at a different place? At > least, we couldn't find any performance improvement, it's still horribly slow. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (JENA-2107) RDF Star performance issue with non-concrete node triples
[ https://issues.apache.org/jira/browse/JENA-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17346475#comment-17346475 ] Claus Stadler edited comment on JENA-2107 at 5/17/21, 11:17 PM: For the Dataset-based implementation we could subclass the find methods of DatasetGraphWrapper to keep track of the internal iterator sizes. After running a query on such an dataset instance one could then check whether only a specific number of tuples have been touched Alternatively, one could track the arguments passed to find and check whether those match an expected sequence (or set) of reference arguments - which would be more traceable than mere counts. Sketch: {code:java} class TrackingDatasetGraph extends DatasetGraphWrapper { protected long numSeenTuples = 0; protected Collection seenArgs = new LinkedHashSet<>(); // or ArrayList @Override public Iterator find(Node g, Node s, Node p, Node o) { seenArgs.add(Arrays.asList(g, s, p, o)); Iterator it = getR().find() List materialized = Iter.toList(it); numSeenTuples += materialized.size(); return materialized.iterator(); } } {code} It's just somewhat cumbersome having to repeat the same pattern for NodeTupleTable(Wrapper) ; and I am not sure about how accessible this engine is for testing these internals. Having at least a single test case would already be beneficial for detecting regressions in this regard while work on RDF star progresses. was (Author: aklakan): For the Dataset-based implementation we could subclass the find methods of DatasetGraphWrapper to keep track of the internal iterator sizes. After running a query on such an dataset instance one could then check whether only a specific number of tuples have been touched Alternatively, one could track the arguments passed to find and check whether those match an expected sequence (or set) of reference arguments - which would be more traceable than mere counts. 
Sketch: {code:java} class TrackingDatasetGraph extends DatasetGraphWrapper { protected long numSeenTuples = 0; protected Collection seenArgs = new LinkedHashSet<>(); // or ArrayList @Override public Iterator find(Node g, Node s, Node p, Node o) { seenArgs.add(Arrays.asList(g, s, p, o)); try { Iterator it = getR().find() List materialized = Iter.toList(it); numSeenTuples += materialized.size(); return materialized.iterator(); } } {code} It's just somewhat cumbersome having to repeat the same pattern for NodeTupleTable(Wrapper) ; and I am not sure about how accessible this engine is for testing these internals. Having at least a single test case would already be beneficial for detecting regressions in this regard while work on RDF star progresses. > RDF Star performance issue with non-concrete node triples > - > > Key: JENA-2107 > URL: https://issues.apache.org/jira/browse/JENA-2107 > Project: Apache Jena > Issue Type: Improvement > Components: ARQ >Affects Versions: Jena 3.17.0, Jena 4.0.0 >Reporter: Lorenz Bühmann >Priority: Critical > Fix For: Jena 4.1.0 > > > the following graph pattern is not evaluated efficiently (results in > full-scan per binding) because the second triple pattern doesn't take > advantage of the bindings generated by evaluation of the first one: > {code:java} > ?s ?o . > << ?s ?o >> ?v . > {code} > A possible fix would be to adapt the method {{rdfStarTripleSub()}} in class > > [SolverRX3.java|https://github.com/apache/jena/blob/2efff8a00b4ffa82751cf46c8a3fed84b6ff3090/jena-arq/src/main/java/org/apache/jena/sparql/engine/main/solver/SolverRX3.java#L63-L71] > by changing the beginning to > {code:java} > private static Iterator rdfStarTripleSub(Binding input, Triple > xPattern, ExecutionContext execCxt) { > Triple tPattern = Substitute.substitute(xPattern, input); > {code} > We went from 75s for a very small dataset (50k triples) to near instant > response times. 
> If this fix is correct and doesn't break anything, it might be the same way > to fix for its quads counterpart in {{SolverRX4}} class. > > Note, for tdbquery, this seems to be evaluated at a different place? At > least, we couldn't find any performance improvement, it's still horribly slow. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (JENA-2107) RDF Star performance issue with non-concrete node triples
[ https://issues.apache.org/jira/browse/JENA-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17346475#comment-17346475 ] Claus Stadler edited comment on JENA-2107 at 5/17/21, 11:16 PM: For the Dataset-based implementation we could subclass the find methods of DatasetGraphWrapper to keep track of the internal iterator sizes. After running a query on such an dataset instance one could then check whether only a specific number of tuples have been touched Alternatively, one could track the arguments passed to find and check whether those match an expected sequence (or set) of reference arguments - which would be more traceable than mere counts. Sketch: {code:java} class TrackingDatasetGraph extends DatasetGraphWrapper { protected long numSeenTuples = 0; protected Collection seenArgs = new LinkedHashSet<>(); // or ArrayList @Override public Iterator find(Node g, Node s, Node p, Node o) { seenArgs.add(Arrays.asList(g, s, p, o)); try { Iterator it = getR().find() List materialized = Iter.toList(it); numSeenTuples += materialized.size(); return materialized.iterator(); } } {code} It's just somewhat cumbersome having to repeat the same pattern for NodeTupleTable(Wrapper) ; and I am not sure about how accessible this engine is for testing these internals. Having at least a single test case would already be beneficial for detecting regressions in this regard while work on RDF star progresses. was (Author: aklakan): For the Dataset-based implementation we could subclass the find methods of DatasetGraphWrapper to keep track of the internal iterator sizes. After running a query on such an dataset instance one could then check whether only a specific number of tuples have been touched Alternatively, one could track the arguments passed to find and check whether those match an expected sequence (or set) of reference arguments - which would be more traceable than mere counts. 
Sketch: {code:java} class TrackingDatasetGraph extends DatasetGraphWrapper { protected long numSeenTuples = 0; protected Collection seenArgs = new LinkedHashSet<>(); // or ArrayList @Override public Iterator find(Node g, Node s, Node p, Node o) { seenArgs.add(Arrays.asList(g, s, p, o)); try { Iterator it = getR().find() List materialized = Iter.toList(it); numSeenTuples += materialized.size(); return materialized.iterator(); } } {code} It's just somewhat cumbersome having to repeat the same pattern for NodeTupleTable(Wrapper). Having at least a single test case would already be beneficial for detecting regressions in this regard while work on RDF star progresses. > RDF Star performance issue with non-concrete node triples > - > > Key: JENA-2107 > URL: https://issues.apache.org/jira/browse/JENA-2107 > Project: Apache Jena > Issue Type: Improvement > Components: ARQ >Affects Versions: Jena 3.17.0, Jena 4.0.0 >Reporter: Lorenz Bühmann >Priority: Critical > Fix For: Jena 4.1.0 > > > the following graph pattern is not evaluated efficiently (results in > full-scan per binding) because the second triple pattern doesn't take > advantage of the bindings generated by evaluation of the first one: > {code:java} > ?s ?o . > << ?s ?o >> ?v . > {code} > A possible fix would be to adapt the method {{rdfStarTripleSub()}} in class > > [SolverRX3.java|https://github.com/apache/jena/blob/2efff8a00b4ffa82751cf46c8a3fed84b6ff3090/jena-arq/src/main/java/org/apache/jena/sparql/engine/main/solver/SolverRX3.java#L63-L71] > by changing the beginning to > {code:java} > private static Iterator rdfStarTripleSub(Binding input, Triple > xPattern, ExecutionContext execCxt) { > Triple tPattern = Substitute.substitute(xPattern, input); > {code} > We went from 75s for a very small dataset (50k triples) to near instant > response times. > If this fix is correct and doesn't break anything, it might be the same way > to fix for its quads counterpart in {{SolverRX4}} class. 
> > Note, for tdbquery, this seems to be evaluated at a different place? At > least, we couldn't find any performance improvement, it's still horribly slow. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (JENA-2107) RDF Star performance issue with non-concrete node triples
[ https://issues.apache.org/jira/browse/JENA-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17346475#comment-17346475 ] Claus Stadler commented on JENA-2107: -
For the Dataset-based implementation we could override the find methods of DatasetGraphWrapper to keep track of the internal iterator sizes. After running a query on such a dataset instance, one could then check whether only a specific number of tuples has been touched. Alternatively, one could track the arguments passed to find and check whether they match an expected sequence (or set) of reference arguments, which would be more traceable than mere counts.
Sketch:
{code:java}
import java.util.*;
import org.apache.jena.atlas.iterator.Iter;
import org.apache.jena.graph.Node;
import org.apache.jena.sparql.core.DatasetGraph;
import org.apache.jena.sparql.core.DatasetGraphWrapper;
import org.apache.jena.sparql.core.Quad;

class TrackingDatasetGraph extends DatasetGraphWrapper {
    protected long numSeenTuples = 0;
    protected Collection<List<Node>> seenArgs = new LinkedHashSet<>(); // or ArrayList

    TrackingDatasetGraph(DatasetGraph dsg) { super(dsg); }

    @Override
    public Iterator<Quad> find(Node g, Node s, Node p, Node o) {
        // Record the arguments, then materialize the result to count the tuples touched.
        seenArgs.add(Arrays.asList(g, s, p, o));
        List<Quad> materialized = Iter.toList(getR().find(g, s, p, o));
        numSeenTuples += materialized.size();
        return materialized.iterator();
    }
}
{code}
It's just somewhat cumbersome having to repeat the same pattern for NodeTupleTable(Wrapper). Having at least a single test case would already be beneficial for detecting regressions in this regard while work on RDF star progresses.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
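The tracking idea in the comment above can be tried out without Jena on the classpath. What follows is only a minimal sketch under stated assumptions: `TupleSource`, `ListTupleSource` and `TrackingTupleSource` are hypothetical stand-ins for `DatasetGraph` and `DatasetGraphWrapper`, tuples are plain lists of strings, and `null` plays the role of a wildcard. None of these names are Jena API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in for DatasetGraph: find() treats null as a wildcard.
interface TupleSource {
    Iterator<List<String>> find(String g, String s, String p, String o);
}

// Simple in-memory source that scans every tuple and filters.
class ListTupleSource implements TupleSource {
    private final List<List<String>> tuples = new ArrayList<>();

    void add(String g, String s, String p, String o) {
        tuples.add(Arrays.asList(g, s, p, o));
    }

    private static boolean matches(String pattern, String value) {
        return pattern == null || pattern.equals(value);
    }

    @Override
    public Iterator<List<String>> find(String g, String s, String p, String o) {
        List<List<String>> out = new ArrayList<>();
        for (List<String> t : tuples) {
            if (matches(g, t.get(0)) && matches(s, t.get(1))
                    && matches(p, t.get(2)) && matches(o, t.get(3))) {
                out.add(t);
            }
        }
        return out.iterator();
    }
}

// Wrapper that records every find() call and counts the tuples it returned.
class TrackingTupleSource implements TupleSource {
    private final TupleSource base;
    long numSeenTuples = 0;
    final List<List<String>> seenArgs = new ArrayList<>();

    TrackingTupleSource(TupleSource base) {
        this.base = Objects.requireNonNull(base);
    }

    @Override
    public Iterator<List<String>> find(String g, String s, String p, String o) {
        seenArgs.add(Arrays.asList(g, s, p, o)); // Arrays.asList tolerates nulls
        List<List<String>> materialized = new ArrayList<>();
        base.find(g, s, p, o).forEachRemaining(materialized::add);
        numSeenTuples += materialized.size();
        return materialized.iterator();
    }
}

class TrackingDemo {
    public static void main(String[] args) {
        ListTupleSource data = new ListTupleSource();
        data.add("g", "s1", "p", "o1");
        data.add("g", "s2", "p", "o2");
        TrackingTupleSource tracked = new TrackingTupleSource(data);

        tracked.find(null, "s1", null, null); // subject bound: one tuple touched
        tracked.find(null, null, null, null); // nothing bound: every tuple touched

        System.out.println(tracked.numSeenTuples);
        System.out.println(tracked.seenArgs);
    }
}
```

A regression test along these lines would assert on `seenArgs` rather than on wall-clock time: if a solver stops substituting bindings, the recorded arguments contain wildcards where concrete terms were expected.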
[jira] [Comment Edited] (JENA-2107) RDF Star performance issue with non-concrete node triples
[ https://issues.apache.org/jira/browse/JENA-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17346371#comment-17346371 ] Andy Seaborne edited comment on JENA-2107 at 5/17/21, 10:32 PM: All the solvers need this fix. I've done (locally) RX4 and the TDB SolverRX's following the pattern from the PR. [~LorenzB] if you adjust the commit message on the PR, I'll do the other three. Would you be able to test the TDB (either one, ideally both)? I can't think of a test for the situation other than timing because the core of the RDF-star triple solver code is, by design, more general than assuming substitution has been done, which is why it didn't show except from a performance effect. was (Author: andy.seaborne): All the solvers need this fix. I've don RX4 and the TDB SolverRX's following the pattern from the PR. [~LorenzB] if you adjust he commit message on the PR, I'll do the other three. Would you be able to test the TDB (either one, ideally both)? I can't think of a test for the situation other than timing because the core of the RDF-star triple solver code is, by design, more general than assuming substitution has been done, which is why it didn't show except from a performance effect. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (JENA-2107) RDF Star performance issue with non-concrete node triples
[ https://issues.apache.org/jira/browse/JENA-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17346371#comment-17346371 ] Andy Seaborne commented on JENA-2107: - All the solvers need this fix. I've done RX4 and the TDB SolverRX's following the pattern from the PR. [~LorenzB] if you adjust the commit message on the PR, I'll do the other three. Would you be able to test the TDB (either one, ideally both)? I can't think of a test for the situation other than timing because the core of the RDF-star triple solver code is, by design, more general than assuming substitution has been done, which is why it didn't show except from a performance effect. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [jena-site] afs merged pull request #51: JENA-2084: Align documentation of RDF Thrift
afs merged pull request #51: URL: https://github.com/apache/jena-site/pull/51 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Comment Edited] (JENA-2107) RDF Star performance issue with non-concrete node triples
[ https://issues.apache.org/jira/browse/JENA-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17346211#comment-17346211 ] Andy Seaborne edited comment on JENA-2107 at 5/17/21, 3:04 PM: --- {{tdbquery}} processing will be in SolverRX (one each for TDB1, TDB2). was (Author: andy.seaborne): {{tdbquery}} with be in SolverRX (one each for TDB1, TDB2). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (JENA-2107) RDF Star performance issue with non-concrete node triples
[ https://issues.apache.org/jira/browse/JENA-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17346211#comment-17346211 ] Andy Seaborne commented on JENA-2107: - {{tdbquery}} will be in SolverRX (one each for TDB1, TDB2). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (JENA-2107) RDF Star performance issue with non-concrete node triples
[ https://issues.apache.org/jira/browse/JENA-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lorenz Bühmann updated JENA-2107: - Description: the following graph pattern is not evaluated efficiently (results in full-scan per binding) because the second triple pattern doesn't take advantage of the bindings generated by evaluation of the first one: {code:java} ?s ?o . << ?s ?o >> ?v . {code} A possible fix would be to adapt the method {{rdfStarTripleSub()}} in class [SolverRX3.java|https://github.com/apache/jena/blob/2efff8a00b4ffa82751cf46c8a3fed84b6ff3090/jena-arq/src/main/java/org/apache/jena/sparql/engine/main/solver/SolverRX3.java#L63-L71] by changing the beginning to {code:java} private static Iterator rdfStarTripleSub(Binding input, Triple xPattern, ExecutionContext execCxt) { Triple tPattern = Substitute.substitute(xPattern, input); {code} We went from 75s for a very small dataset (50k triples) to near instant response times. If this fix is correct and doesn't break anything, it might be the same way to fix for its quads counterpart in {{SolverRX4}} class. Note, for tdbquery, this seems to be evaluated at a different place? At least, we couldn't find any performance improvement, it's still horribly slow. was: the following graph pattern is not evaluated efficiently (results in full-scan per binding) because the second triple pattern doesn't take advantage of the bindings generated by evaluation of the first one: {code:java} ?s ?o . << ?s ?o >> ?v . 
{code} A possible fix would be to adapt the method {{rdfStarTripleSub()}} in class [SolverRX3.java|https://github.com/apache/jena/blob/2efff8a00b4ffa82751cf46c8a3fed84b6ff3090/jena-arq/src/main/java/org/apache/jena/sparql/engine/main/solver/SolverRX3.java#L63-L71] by changing the beginning to {code:java} private static Iterator rdfStarTripleSub(Binding input, Triple xPattern, ExecutionContext execCxt) { Triple tPattern = Substitute.substitute(xPattern, input); {code} We went from 75s for a very small dataset (50k triples) to near instant response times. If this fix is correct and doesn't break anything, it might be the same way to fix for its quads counterpart in {{SolverRX4}} class -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (JENA-2107) RDF Star performance issue with non-concrete node triples
[ https://issues.apache.org/jira/browse/JENA-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lorenz Bühmann updated JENA-2107: - Description: the following graph pattern is not evaluated efficiently (results in full-scan per binding) because the second triple pattern doesn't take advantage of the bindings generated by evaluation of the first one: {code:java} ?s ?o . << ?s ?o >> ?v . {code} A possible fix would be to adapt the method {{rdfStarTripleSub()}} in class [SolverRX3.java|https://github.com/apache/jena/blob/2efff8a00b4ffa82751cf46c8a3fed84b6ff3090/jena-arq/src/main/java/org/apache/jena/sparql/engine/main/solver/SolverRX3.java#L63-L71] by changing the beginning to {code:java} private static Iterator rdfStarTripleSub(Binding input, Triple xPattern, ExecutionContext execCxt) { Triple tPattern = Substitute.substitute(xPattern, input); {code} We went from 75s for a very small dataset (50k triples) to near instant response times. If this fix is correct and doesn't break anything, it might be the same way to fix for its quads counterpart in {{SolverRX4}} class was: the following graph pattern is not evaluated efficiently (results in full-scan per binding) because the second triple pattern doesn't take advantage of the bindings generated by evaluation of the first one: {code} ?s ?o . << ?s ?o >> ?v . {code} A possible fix would be to adapt the method `rdfStarTripleSub` in class [SolverRX3.java|https://github.com/apache/jena/blob/2efff8a00b4ffa82751cf46c8a3fed84b6ff3090/jena-arq/src/main/java/org/apache/jena/sparql/engine/main/solver/SolverRX3.java#L63-L71] by changing the beginning to {code:java} private static Iterator rdfStarTripleSub(Binding input, Triple xPattern, ExecutionContext execCxt) { Triple tPattern = Substitute.substitute(xPattern, input); {code} We went from 75s for a very small dataset (50k triples) to near instant response times. 
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (JENA-2107) RDF Star performance issue with non-concrete node triples
[ https://issues.apache.org/jira/browse/JENA-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lorenz Bühmann updated JENA-2107: - Description: the following graph pattern is not evaluated efficiently (results in full-scan per binding) because the second triple pattern doesn't take advantage of the bindings generated by evaluation of the first one: {code} ?s ?o . << ?s ?o >> ?v . {code} A possible fix would be to adapt the method `rdfStarTripleSub` in class [SolverRX3.java|https://github.com/apache/jena/blob/2efff8a00b4ffa82751cf46c8a3fed84b6ff3090/jena-arq/src/main/java/org/apache/jena/sparql/engine/main/solver/SolverRX3.java#L63-L71] by changing the beginning to {code:java} private static Iterator rdfStarTripleSub(Binding input, Triple xPattern, ExecutionContext execCxt) { Triple tPattern = Substitute.substitute(xPattern, input); {code} We went from 75s for a very small dataset (50k triples) to near instant response times. was: the following graph pattern is not evaluated efficiently (results in full-scan per binding) because the second triple pattern doesn't take advantage of the bindings generated by evaluation of the first one: {code:java} ?s ?o . << ?s ?o >> ?v . {code} A possible fix would be to adapt the method `rdfStarTripleSub` in class [SolverRX3.java|https://github.com/apache/jena/blob/2efff8a00b4ffa82751cf46c8a3fed84b6ff3090/jena-arq/src/main/java/org/apache/jena/sparql/engine/main/solver/SolverRX3.java#L63-L71] by changing the beginning to {code:java} private static Iterator rdfStarTripleSub(Binding input, Triple xPattern, ExecutionContext execCxt) { Triple tPattern = Substitute.substitute(xPattern, input); {code} We went from 75s for a very small dataset (50k triples) to near instant response times. 
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (JENA-2107) RDF Star performance issue with non-concrete node triples
Lorenz Bühmann created JENA-2107: Summary: RDF Star performance issue with non-concrete node triples Key: JENA-2107 URL: https://issues.apache.org/jira/browse/JENA-2107 Project: Apache Jena Issue Type: Improvement Components: ARQ Affects Versions: Jena 4.0.0, Jena 3.17.0 Reporter: Lorenz Bühmann Fix For: Jena 4.1.0 the following graph pattern is not evaluated efficiently (results in full-scan per binding) because the second triple pattern doesn't take advantage of the bindings generated by evaluation of the first one: {code:java} ?s ?o . << ?s ?o >> ?v . {code} A possible fix would be to adapt the method `rdfStarTripleSub` in class [SolverRX3.java|https://github.com/apache/jena/blob/2efff8a00b4ffa82751cf46c8a3fed84b6ff3090/jena-arq/src/main/java/org/apache/jena/sparql/engine/main/solver/SolverRX3.java#L63-L71] by changing the beginning to {code:java} private static Iterator rdfStarTripleSub(Binding input, Triple xPattern, ExecutionContext execCxt) { Triple tPattern = Substitute.substitute(xPattern, input); {code} We went from 75s for a very small dataset (50k triples) to near instant response times. -- This message was sent by Atlassian Jira (v8.3.4#803005)
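To make the effect of substituting bindings before matching concrete, here is a small self-contained sketch. It is a toy model and not Jena code: triples are lists of strings, a term starting with `?` is a variable, and `substitute` and `candidates` are hypothetical helpers that only mimic the role `Substitute.substitute` plays in the proposed fix. The predicate `p` is made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SubstituteSketch {
    // Replace every variable (a term starting with '?') that the binding has a value for.
    static List<String> substitute(List<String> pattern, Map<String, String> binding) {
        List<String> out = new ArrayList<>();
        for (String term : pattern) {
            out.add(term.startsWith("?") && binding.containsKey(term) ? binding.get(term) : term);
        }
        return out;
    }

    // Count the triples a lookup has to hand back for a pattern; variables match anything.
    static int candidates(List<List<String>> triples, List<String> pattern) {
        int n = 0;
        for (List<String> t : triples) {
            boolean ok = true;
            for (int i = 0; i < 3; i++) {
                String term = pattern.get(i);
                if (!term.startsWith("?") && !term.equals(t.get(i))) {
                    ok = false;
                }
            }
            if (ok) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        List<List<String>> triples = Arrays.asList(
            Arrays.asList("a", "p", "1"),
            Arrays.asList("b", "p", "2"),
            Arrays.asList("c", "p", "3"));
        List<String> pattern = Arrays.asList("?s", "p", "?o");

        // Binding produced by an earlier triple pattern.
        Map<String, String> binding = new HashMap<>();
        binding.put("?s", "a");
        binding.put("?o", "1");

        // Without substitution every triple with predicate p is a candidate.
        System.out.println(candidates(triples, pattern));
        // With substitution only the triple that agrees with the binding remains.
        System.out.println(candidates(triples, substitute(pattern, binding)));
    }
}
```

In the toy model the unsubstituted pattern is matched against every triple for each incoming binding, which corresponds to the full scan per binding described in the report.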
[jira] [Resolved] (JENA-2084) Error in documentation of RDF Thrift format
[ https://issues.apache.org/jira/browse/JENA-2084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Seaborne resolved JENA-2084. - Resolution: Fixed > Error in documentation of RDF Thrift format > --- > > Key: JENA-2084 > URL: https://issues.apache.org/jira/browse/JENA-2084 > Project: Apache Jena > Issue Type: Bug > Components: Documentation >Affects Versions: Jena 4.0.0 >Reporter: Jon >Assignee: Andy Seaborne >Priority: Minor > Fix For: Jena 4.1.0 > > > Both [https://jena.apache.org/documentation/io/rdf-binary.html] > and the page that it links to > [https://afs.github.io/rdf-thrift/rdf-binary-thrift.html] > define RDF literals as: > {code:java} > struct RDF_Literal { > 1: required string lex > 2: optional string datatype > 3: optional string langtag > } {code} > however the definition in the codebase appears to be: > > [https://github.com/apache/jena/blob/9519c65f8f5c377d3d9a3983eaaceeb26a99554c/jena-arq/Grammar/RDF-Thrift/BinaryRDF.thrift#L48-L53] > {code:java} > struct RDF_Literal { > 1: required string lex ; > 2: optional string langtag ; > 3: optional string datatype ; // Either 3 or 4 but UNION is heavy. > 4: optional RDF_PrefixName dtPrefix ; // datatype as prefix name > } {code} > with langtag and datatype swapped, etc. > Perhaps these documentation pages should also link to the latest > {{BinaryRDF.thrift}} file in source control? -- This message was sent by Atlassian Jira (v8.3.4#803005)
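Why the swapped field numbers matter: Thrift identifies struct fields on the wire by their numeric id, not by name. The sketch below is a toy model only. It uses a plain `Map` instead of the Thrift library, and `FieldIdSketch` and its constants are hypothetical names; the ids themselves are taken from the two `RDF_Literal` definitions quoted in the issue.

```java
import java.util.HashMap;
import java.util.Map;

class FieldIdSketch {
    // Field ids as in the BinaryRDF.thrift excerpt quoted in the issue.
    static final int LEX = 1;
    static final int LANGTAG = 2;
    static final int DATATYPE = 3;

    // Field ids as the documentation pages listed them when the issue was filed.
    static final int DOC_DATATYPE = 2;
    static final int DOC_LANGTAG = 3;

    // Toy "wire" form: field id -> value; unset optional fields are simply absent.
    static Map<Integer, String> encode(String lex, String langtag, String datatype) {
        Map<Integer, String> wire = new HashMap<>();
        wire.put(LEX, lex);
        if (langtag != null) {
            wire.put(LANGTAG, langtag);
        }
        if (datatype != null) {
            wire.put(DATATYPE, datatype);
        }
        return wire;
    }

    public static void main(String[] args) {
        // A language-tagged literal written with the ids from the codebase.
        Map<Integer, String> wire = encode("chat", "fr", null);

        // Read with the codebase ids: the language tag is found where expected.
        System.out.println("langtag (codebase ids): " + wire.get(LANGTAG));
        // Read with the documented ids: the language tag shows up as the datatype.
        System.out.println("datatype (documented ids): " + wire.get(DOC_DATATYPE));
        System.out.println("langtag (documented ids): " + wire.get(DOC_LANGTAG));
    }
}
```

An independent implementation written from the documentation alone would therefore misread literals produced by Jena, which is why aligning the documentation with `BinaryRDF.thrift` matters.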
[jira] [Updated] (JENA-2084) Error in documentation of RDF Thrift format
[ https://issues.apache.org/jira/browse/JENA-2084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Seaborne updated JENA-2084: Fix Version/s: Jena 4.1.0 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [jena-site] afs opened a new pull request #51: JENA-2084: Align documentation of RDF Thrift
afs opened a new pull request #51: URL: https://github.com/apache/jena-site/pull/51 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Assigned] (JENA-2084) Error in documentation of RDF Thrift format
[ https://issues.apache.org/jira/browse/JENA-2084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Seaborne reassigned JENA-2084: --- Assignee: Andy Seaborne -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (JENA-2103) Update RDF-star. Align to community work; switch to community test suite
[ https://issues.apache.org/jira/browse/JENA-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17341503#comment-17341503 ] ASF subversion and git services commented on JENA-2103: --- Commit 3c49e27bdb649919ea97e2dc9dca837924a47ca9 in jena's branch refs/heads/main from Andy Seaborne [ https://gitbox.apache.org/repos/asf?p=jena.git;h=3c49e27 ] JENA-2103: Fix decoding RDF-star embedded triples in TDB1 > Update RDF-star. Align to community work; switch to community test suite > > > Key: JENA-2103 > URL: https://issues.apache.org/jira/browse/JENA-2103 > Project: Apache Jena > Issue Type: Improvement > Components: ARQ, RIOT >Reporter: Andy Seaborne >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (JENA-2103) Update RDF-star. Align to community work; switch to community test suite
[ https://issues.apache.org/jira/browse/JENA-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17341505#comment-17341505 ] ASF subversion and git services commented on JENA-2103: --- Commit 79e1343153e596a2b33fa0847b9a87ae14f76a3b in jena's branch refs/heads/main from Andy Seaborne [ https://gitbox.apache.org/repos/asf?p=jena.git;h=79e1343 ] Merge pull request #999 from afs/rdf-star JENA-2103: RDF-star updates -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (JENA-2103) Update RDF-star. Align to community work; switch to community test suite
[ https://issues.apache.org/jira/browse/JENA-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17341502#comment-17341502 ] ASF subversion and git services commented on JENA-2103: --- Commit c14f37a6d9fc57de83bddf0c224d3798bb9747d9 in jena's branch refs/heads/main from Andy Seaborne [ https://gitbox.apache.org/repos/asf?p=jena.git;h=c14f37a ] JENA-2103: RDF-star test suite -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (JENA-2103) Update RDF-star. Align to community work; switch to community test suite
[ https://issues.apache.org/jira/browse/JENA-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Seaborne resolved JENA-2103. - Fix Version/s: Jena 4.1.0 Assignee: Andy Seaborne Resolution: Done -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (JENA-2103) Update RDF-star. Align to community work; switch to community test suite
[ https://issues.apache.org/jira/browse/JENA-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17341501#comment-17341501 ] ASF subversion and git services commented on JENA-2103: --- Commit ffd6377ef7901488ec06a8dae71067210a5cc5f5 in jena's branch refs/heads/main from Andy Seaborne [ https://gitbox.apache.org/repos/asf?p=jena.git;h=ffd6377 ] JENA-2103: Allow recusive <<>> > Update RDF-star. Align to community work; switch to community test suite > > > Key: JENA-2103 > URL: https://issues.apache.org/jira/browse/JENA-2103 > Project: Apache Jena > Issue Type: Improvement > Components: ARQ, RIOT >Reporter: Andy Seaborne >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (JENA-2103) Update RDF-star. Align to community work; switch to community test suite
[ https://issues.apache.org/jira/browse/JENA-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17341504#comment-17341504 ] ASF subversion and git services commented on JENA-2103: --- Commit e727494b4a8ac4593facdf9d9d64893b2bc481a4 in jena's branch refs/heads/main from Andy Seaborne [ https://gitbox.apache.org/repos/asf?p=jena.git;h=e727494 ] JENA-2103: Additional tests for ecoding RDF-star embedded triples > Update RDF-star. Align to community work; switch to community test suite > > > Key: JENA-2103 > URL: https://issues.apache.org/jira/browse/JENA-2103 > Project: Apache Jena > Issue Type: Improvement > Components: ARQ, RIOT >Reporter: Andy Seaborne >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (JENA-2103) Update RDF-star. Align to community work; switch to community test suite
Andy Seaborne created JENA-2103: --- Summary: Update RDF-star. Align to community work; switch to community test suite Key: JENA-2103 URL: https://issues.apache.org/jira/browse/JENA-2103 Project: Apache Jena Issue Type: Improvement Components: ARQ, RIOT Reporter: Andy Seaborne -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (JENA-2101) RDF-star: Adjust grammar to disallow blank nodes in embedded triple expressions.
[ https://issues.apache.org/jira/browse/JENA-2101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Seaborne resolved JENA-2101. - Resolution: Fixed > RDF-star: Adjust grammar to disallow blank nodes in embedded triple > expressions. > > > Key: JENA-2101 > URL: https://issues.apache.org/jira/browse/JENA-2101 > Project: Apache Jena > Issue Type: Bug >Affects Versions: Jena 4.0.0 >Reporter: Andy Seaborne >Assignee: Andy Seaborne >Priority: Major > Fix For: Jena 4.1.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (JENA-2101) RDF-star: Adjust grammar to disallow blank nodes in embedded triple expressions.
[ https://issues.apache.org/jira/browse/JENA-2101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17339496#comment-17339496 ] ASF subversion and git services commented on JENA-2101: --- Commit c52f52f1cdb40271473b2146b1d049d553d3c2cb in jena's branch refs/heads/main from Andy Seaborne [ https://gitbox.apache.org/repos/asf?p=jena.git;h=c52f52f ] Merge pull request #997 from afs/rdf-star-expr JENA-2101: Expression use of <<>> > RDF-star: Adjust grammar to disallow blank nodes in embedded triple > expressions. > > > Key: JENA-2101 > URL: https://issues.apache.org/jira/browse/JENA-2101 > Project: Apache Jena > Issue Type: Bug >Affects Versions: Jena 4.0.0 >Reporter: Andy Seaborne >Assignee: Andy Seaborne >Priority: Major > Fix For: Jena 4.1.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (JENA-2101) RDF-star: Adjust grammar to disallow blank nodes in embedded triple expressions.
[ https://issues.apache.org/jira/browse/JENA-2101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17339495#comment-17339495 ] ASF subversion and git services commented on JENA-2101: --- Commit c495b2bf52189e550a12a464953f555107026ef5 in jena's branch refs/heads/main from Andy Seaborne [ https://gitbox.apache.org/repos/asf?p=jena.git;h=c495b2b ] JENA-2101: Expression use of <<>> > RDF-star: Adjust grammar to disallow blank nodes in embedded triple > expressions. > > > Key: JENA-2101 > URL: https://issues.apache.org/jira/browse/JENA-2101 > Project: Apache Jena > Issue Type: Bug >Affects Versions: Jena 4.0.0 >Reporter: Andy Seaborne >Assignee: Andy Seaborne >Priority: Major > Fix For: Jena 4.1.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (JENA-2101) RDF-star: Adjust grammar to disallow blank nodes in embedded triple expressions.
[ https://issues.apache.org/jira/browse/JENA-2101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Seaborne updated JENA-2101: Summary: RDF-star: Adjust grammar to disallow blank nodes in embedded triple expressions. (was: RDF-star: Adjust grammar to disaloow blank nodes in embedded triple expressions.) > RDF-star: Adjust grammar to disallow blank nodes in embedded triple > expressions. > > > Key: JENA-2101 > URL: https://issues.apache.org/jira/browse/JENA-2101 > Project: Apache Jena > Issue Type: Bug >Affects Versions: Jena 4.0.0 >Reporter: Andy Seaborne >Assignee: Andy Seaborne >Priority: Major > Fix For: Jena 4.1.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (JENA-2099) Converting rdfstar to plain rdf formats do not work
[ https://issues.apache.org/jira/browse/JENA-2099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andy Seaborne closed JENA-2099.
---

> Converting rdfstar to plain rdf formats do not work
> ---
>
> Key: JENA-2099
> URL: https://issues.apache.org/jira/browse/JENA-2099
> Project: Apache Jena
> Issue Type: Question
> Components: RIOT
> Affects Versions: Jena 3.17.0
> Reporter: Chris Mungall
> Priority: Major
>
> Given a file ex1.ttl
>
> {code:java}
> @prefix : <http://www.example.org/> .
> :employee38 :familyName "Smith" .
> << :employee38 :jobTitle "Assistant Designer" >> :accordingTo :employee22 .
> {code}
>
> my expectation is that I should be able to use riot to serialize as plain
> rdf, ttl, ntriples, rdf-xml, and json-ld, with reification used (I cannot
> find a normative statement in the spec to support this but I hold it's a
> reasonably user expectation given what has been written about rdfstar by the
> authors of the spec).
> This doesn't work however:
>
> {code:java}
> $ riot --out=jsonld ex1.ttl
> org.apache.jena.riot.RiotException: Subject node is not a URI or a blank node
>     at org.apache.jena.riot.writer.JenaRDF2JSONLD.parse(JenaRDF2JSONLD.java:64)
>     at org.apache.jena.riot.writer.JsonLDWriter.toJsonLDJavaAPI(JsonLDWriter.java:200)
>     at org.apache.jena.riot.writer.JsonLDWriter.serialize(JsonLDWriter.java:174)
>     at org.apache.jena.riot.writer.JsonLDWriter.write(JsonLDWriter.java:135)
>     at org.apache.jena.riot.writer.JsonLDWriter.write(JsonLDWriter.java:141)
>     at org.apache.jena.riot.RDFWriter.write$(RDFWriter.java:208)
>     at org.apache.jena.riot.RDFWriter.output(RDFWriter.java:166)
>     at org.apache.jena.riot.RDFWriter.output(RDFWriter.java:113)
>     at org.apache.jena.riot.RDFWriterBuilder.output(RDFWriterBuilder.java:204)
>     at riotcmd.CmdLangParse.lambda$createAccumulateSink$0(CmdLangParse.java:348)
>     at riotcmd.CmdLangParse.exec$(CmdLangParse.java:172)
>     at riotcmd.CmdLangParse.exec(CmdLangParse.java:130)
>     at jena.cmd.CmdMain.mainMethod(CmdMain.java:92)
>     at jena.cmd.CmdMain.mainRun(CmdMain.java:58)
>     at jena.cmd.CmdMain.mainRun(CmdMain.java:45)
>     at riotcmd.riot.main(riot.java:29)
> {code}
>
> similarly for export to rdfxml
> When I convert to turtle, the syntax remains turtlestar:
>
> {code:java}
> $ riot --out=ttl ex1.ttl
> @prefix : <http://www.example.org/> .
> :employee38 :familyName "Smith" .
> << :employee38 :jobTitle "Assistant Designer" >>
>     :accordingTo :employee22 .
> {code}
>
> I'm not totally sure if this is what should happen. I'm not totally sure of
> what the different file format options are, and if there is a distinct
> "turtlestar" and "ntriplesstar" (as an aside, it would be useful to have more
> command line help on permissible values for input and output formats in riot).
> My expectation is that parsing should be lenient as should accept *star
> syntaxes, but allow fine-grained control in output as to whether plain rdf
> syntax or *star syntaxes are used (with expansion to reification happening in
> the latter), but I may be using riot incorrectly.
> FWIW, explicitly using reification doesn't syntactically convert to rdfstar,
> e.g. with ex1r.ttl:
>
> {code:java}
> @prefix : <http://www.example.org/> .
> @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
> @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
> :employee38 :jobTitle "Assistant Designer" .
> [a rdf:Statement ;
>    rdf:subject :employee38 ;
>    rdf:predicate :jobTitle ;
>    rdf:object "Assistant Designer" ;
>    :accordingTo :employee22
> ] .
> {code}
>
> I can't seem to convert this into rdfstar syntax using riot
>

-- This message was sent by Atlassian Jira (v8.3.4#803005)
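The conversion the reporter expects (quoted triples expanded into standard `rdf:Statement` reification so that non-RDF-star writers can serialize the data) can be illustrated with a small standalone sketch. This is not Jena's implementation and does not use the Jena API; the `Triple` class, the `reify` function, and the `_:rN` blank node labels are hypothetical names chosen for illustration, and the input is the `ex1.ttl` data from the description written out by hand.

```python
from dataclasses import dataclass
from itertools import count
from typing import List, Tuple, Union

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://www.example.org/"


@dataclass(frozen=True)
class Triple:
    """A triple whose subject or object may itself be a quoted triple."""
    s: "Term"
    p: str
    o: "Term"


Term = Union[str, Triple]


def reify(triples: List[Triple]) -> List[Tuple[str, str, str]]:
    """Replace every quoted triple with a blank node described by
    rdf:type / rdf:subject / rdf:predicate / rdf:object triples."""
    ids = count()
    nodes = {}  # quoted triple -> blank node label, so repeats share a node
    out: List[Tuple[str, str, str]] = []

    def term(t: Term) -> str:
        if not isinstance(t, Triple):
            return t
        if t in nodes:
            return nodes[t]
        node = f"_:r{next(ids)}"
        nodes[t] = node
        s, o = term(t.s), term(t.o)  # handles nested quoted triples
        out.extend([
            (node, RDF + "type", RDF + "Statement"),
            (node, RDF + "subject", s),
            (node, RDF + "predicate", t.p),
            (node, RDF + "object", o),
        ])
        return node

    for t in triples:
        out.append((term(t.s), t.p, term(t.o)))
    return out


# ex1.ttl from the issue description, written as in-memory triples.
quoted = Triple(EX + "employee38", EX + "jobTitle", '"Assistant Designer"')
data = [
    Triple(EX + "employee38", EX + "familyName", '"Smith"'),
    Triple(quoted, EX + "accordingTo", EX + "employee22"),
]
plain = reify(data)
for s, p, o in plain:
    print(s, p, o)
```

Note that this sketch does not assert the quoted triple itself, whereas the hand-written `ex1r.ttl` in the description does; whether the expansion should also assert it is exactly the kind of semantic question the issue leaves open.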
[jira] [Resolved] (JENA-2099) Converting rdfstar to plain rdf formats do not work
[ https://issues.apache.org/jira/browse/JENA-2099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Seaborne resolved JENA-2099. - Resolution: Duplicate > Converting rdfstar to plain rdf formats do not work > --- > > Key: JENA-2099 > URL: https://issues.apache.org/jira/browse/JENA-2099 > Project: Apache Jena > Issue Type: Question > Components: RIOT >Affects Versions: Jena 3.17.0 >Reporter: Chris Mungall >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (JENA-2099) Converting rdfstar to plain rdf formats do not work
[ https://issues.apache.org/jira/browse/JENA-2099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17338132#comment-17338132 ] Chris Mungall commented on JENA-2099: - Apologies for the version confusion, homebrew fail on my part. I am now using 4.0.0 I think this issue can be closed in favor of https://issues.apache.org/jira/browse/JENA-1903 I'm a little confused by your answer. From the thread, one of the responses says that in fact RDF-star is syntactic sugar for reification. The current spec seems uncommitted on this however. An interesting thread though. I have the same use case as the original poster, and it's annoying that the syntax doesn't accommodate directly naming the statement. But that doesn't have any bearing on my current use case. My use case is that I want to be able to convert between my rdfstar representations to a concrete form that can be used by non-rdfstar aware toolchains. I have many rdfstar use cases, all of which revolve around property graphs being a superior modeling framework for our modeling needs. > Converting rdfstar to plain rdf formats do not work > --- > > Key: JENA-2099 > URL: https://issues.apache.org/jira/browse/JENA-2099 > Project: Apache Jena > Issue Type: Question > Components: RIOT >Affects Versions: Jena 3.17.0 >Reporter: Chris Mungall >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
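For the opposite direction raised in the JENA-2099 description (turning the explicitly reified `ex1r.ttl` form back into a quoted triple), a comparable standalone sketch is shown here. It is an illustration only, not something riot or the Jena API is known to do this way: `unreify` is a hypothetical helper, quoted triples are represented as nested 3-tuples, and the blank node label `_:b0` stands in for the anonymous `[ ... ]` node in `ex1r.ttl`.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://www.example.org/"
PARTS = (RDF + "subject", RDF + "predicate", RDF + "object")


def unreify(triples):
    """Collapse rdf:Statement reification into nested (s, p, o) tuples.

    A node is treated as a reification only if it is typed rdf:Statement
    and has exactly one rdf:subject, rdf:predicate and rdf:object.
    """
    present = set(triples)
    parts = {}
    for s, p, o in triples:
        if p in PARTS:
            parts.setdefault(s, {}).setdefault(p, []).append(o)

    stmts = {
        n: tuple(d[p][0] for p in PARTS)
        for n, d in parts.items()
        if (n, RDF + "type", RDF + "Statement") in present
        and all(len(d.get(p, [])) == 1 for p in PARTS)
    }

    def resolve(term, seen=()):
        # Leave ordinary terms alone; 'seen' guards against cyclic reification.
        if term not in stmts or term in seen:
            return term
        s, p, o = stmts[term]
        return (resolve(s, seen + (term,)), p, resolve(o, seen + (term,)))

    out = []
    for s, p, o in triples:
        is_scaffolding = s in stmts and (
            p in PARTS or (p == RDF + "type" and o == RDF + "Statement")
        )
        if not is_scaffolding:
            out.append((resolve(s), p, resolve(o)))
    return out


# ex1r.ttl from the issue description, written as in-memory triples.
reified = [
    (EX + "employee38", EX + "jobTitle", '"Assistant Designer"'),
    ("_:b0", RDF + "type", RDF + "Statement"),
    ("_:b0", RDF + "subject", EX + "employee38"),
    ("_:b0", RDF + "predicate", EX + "jobTitle"),
    ("_:b0", RDF + "object", '"Assistant Designer"'),
    ("_:b0", EX + "accordingTo", EX + "employee22"),
]
star = unreify(reified)
for t in star:
    print(t)
```

The asserted `:employee38 :jobTitle "Assistant Designer"` triple is kept as-is, and the annotation ends up with the quoted triple as its subject, which mirrors the shape of `ex1.ttl`.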