Dear all,

Regarding our contribution proposal to enable extensions to override SPARQL operators in Jena: we finally got the agreement from our institution to contribute to the Apache Software Foundation.

Question 1: what is the procedure to upload the form?

As for the how, I would like to discuss it with you first. In a nutshell, this is what I was thinking about:

Use the standard Java Service Provider API to automatically load implementations found on the classpath:

- In TypeMapper: add a method that uses the Service Provider API to find more Datatypes.
- Datatype subclasses would no longer be tied to a single URI, but could cover a set of URIs.
- ValueSpaceClassification should no longer be an enum; maybe use a class ValueSpace instead ...
- Add an interface such as NodeValueComparator, with methods like:
  - canCompare(ValueSpace vs1, ValueSpace vs2)
  - sameAs(NodeValue nv1, NodeValue nv2)
  - compare(NodeValue nv1, NodeValue nv2)
  - add(NodeValue nv1, NodeValue nv2)
  - subtract(NodeValue nv1, NodeValue nv2)
- In class NodeValue, the methods sameAs(NodeValue nv1, NodeValue nv2) and compare(...) should use the Service Provider API to find NodeValueComparators on the classpath.
- In class NodeValueOps, the methods divisionNV(NodeValue nv1, NodeValue nv2), multiplicationNV(...), additionNV(...), and subtractionNV(...) should use the Service Provider API to find more NodeValueComparators on the classpath.

Any thoughts about this?

Best regards,
Maxime Lefrançois

On Sat, 7 Apr 2018 at 15:13, ajs6f <aj...@apache.org> wrote:

> We're (well, Andy is) working on 3.7.0 now. We've been trying to maintain
> a 6-month or so release cadence, so you've hit a really good time to begin
> this work. That having been said, I don't think anyone would say that we
> are especially stringent about it, so I wouldn't worry too much about the
> timing myself.
>
> ajs6f
>
> > On Apr 6, 2018, at 9:36 AM, Maxime Lefrançois <maxime.lefranc...@emse.fr> wrote:
> >
> > Well,
> >
> > I think I have a pretty clear idea how I would do this.
> > We would end up using a registry, as for custom functions or datatypes.
> > That registry would contain an ordered list of SPARQL operator handlers,
> > pre-filled with one for handling XSD datatypes.
> >
> > I am currently requesting the right to fill in the Apache individual
> > contributor license agreement.
> >
> > What would be the timeline if we wanted this shipped in the next release?
> >
> > Best,
> > Maxime
> >
> > On Tue, 3 Apr 2018 at 15:30, ajs6f <aj...@apache.org> wrote:
> >
> >> I agree. I can imagine plenty of use cases for such a powerful pair of
> >> extension points.
> >>
> >> Maxime, how can we help you attack that work? Is there a design that is
> >> already clear to you? Are there any blockers we can help remove?
> >>
> >> ajs6f
> >>
> >>> On Mar 28, 2018, at 5:08 AM, Rob Vesse <rve...@dotnetrdf.org> wrote:
> >>>
> >>> I think work towards Option 2 would be the most valuable to the community.
> >>>
> >>> The SPARQL specification allows for the overloading of any
> >>> operator/expression where the spec currently defines the evaluation to be
> >>> an error, so extending operators is a natural and valid extension point to
> >>> provide.
> >>>
> >>> The Terms of Use for UCUM would probably need us to obtain a licensing
> >>> assessment from Apache Legal, as it is a non-standard OSS license even if
> >>> the code that implements it is under BSD (which is fine from an Apache
> >>> perspective).
> >>> Therefore, having a well-defined extension mechanism, and then having UCUM
> >>> support live outside Apache Jena as an extension implementation maintained
> >>> by yourself, would be the easiest approach.
> >>>
> >>> Rob
> >>>
> >>> From: Maxime Lefrançois <maxime.lefranc...@emse.fr>
> >>> Reply-To: <dev@jena.apache.org>
> >>> Date: Wednesday, 28 March 2018 at 09:29
> >>> To: <dev@jena.apache.org>
> >>> Subject: Re: Contribution proposal for Jena: support of a datatype for quantity values
> >>>
> >>> Dear all,
> >>>
> >>> Happy to see you are interested in the UCUM datatypes!
> >>>
> >>> OK, so let's dive into the technical details.
> >>>
> >>> # Compare Jena 3.6.0 and Jena 3.6.0-ucum
> >>>
> >>> https://github.com/apache/jena/compare/master...OpenSensingCity:jena-3.6.0-ucum
> >>>
> >>> # Modules, dependencies, licences
> >>>
> >>> Two modules forked so far: jena-core and jena-arq.
> >>>
> >>> One dependency added to jena-core (after a minor change I made today):
> >>>
> >>> systems.uom:systems-ucum-java8:0.7.2
> >>> -> BSD license of systems-uom,
> >>> and license of UCUM http://unitsofmeasure.org/trac/wiki/TermsOfUse
> >>>
> >>> --> this indeed uses an implementation of JSR 363 - Units of Measurement API
> >>> (see attached for the transitive dependencies, all from
> >>> https://github.com/unitsofmeasurement )
> >>>
> >>> # External module?
> >>>
> >>> I would have been happy to develop a separate extension of Jena for the
> >>> UCUM datatypes.
> >>>
> >>> One of the main reasons why this is not possible was pointed out by Andy:
> >>>
> >>> I had to add a new value space VSPACE_QUANTITY to overload the SPARQL
> >>> operators '<>=' and arithmetic functions '+-*/'.
> >>>
> >>> Indeed, there are two parts: the necessary extensions for operators, and
> >>> the units themselves.
> >>>
> >>> We could choose some other unit system than UCUM, but UCUM is very
> >>> comprehensive and has different implementations in different programming
> >>> languages. It would be possible to implement UCUM datatypes in other
> >>> RDF-SPARQL engines.
> >>>
> >>> # Possible directions
> >>>
> >>> I see three main possible directions of work there:
> >>>
> >>> 1. work on the proposal as is and potentially integrate it completely
> >>>
> >>> 2. work on jena-core and jena-arq to make the definition of new
> >>> datatypes and the overloading of operators as easy as the definition of new
> >>> custom functions --> so that I can easily implement UCUM datatypes as an
> >>> extension (and not a fork)
> >>>
> >>> 3. add the VSPACE_QUANTITY value space and NodeValueQuantity in jena-arq,
> >>> and move the support for the UCUM system of units into an external
> >>> module
> >>>
> >>> Best,
> >>>
> >>> Maxime
> >>>
> >>> On Tue, 27 Mar 2018 at 17:16, Andy Seaborne <a...@apache.org> wrote:
> >>>
> >>> Extending the operators for SPARQL is a new value space VSPACE_QUANTITY.
> >>>
> >>> See (comparison):
> >>>
> >>> https://github.com/OpenSensingCity/jena-ucum/blob/jena-3.6.0-ucum/jena-arq/src/main/java/org/apache/jena/sparql/expr/NodeValue.java#L566
> >>>
> >>> and (multiply)
> >>>
> >>> https://github.com/OpenSensingCity/jena-ucum/blob/jena-3.6.0-ucum/jena-arq/src/main/java/org/apache/jena/sparql/expr/nodevalue/NodeValueOps.java#L283
> >>>
> >>> with a new NodeValueQuantity for javax.measure.Quantity
> >>>
> >>> I'm seeing this as "one-dimensional units" - a quantity and a unit.
> >>>
> >>> Even then, there are two parts - the necessary extensions for operators
> >>> and the units themselves to allow for other unit systems (?).
> >>>
> >>> There are new dependencies in jena-arq and jena-core.
> >>>
> >>> http://unitsofmeasurement.github.io/
> >>> JSR 363 - Units of Measurement API
> >>> BSD-license
> >>>
> >>> and an old version of something is on central:
> >>>
> >>> http://central.maven.org/maven2/javax/measure/unit-api/1.0
> >>>
> >>> if that's the right thing.
> >>>
> >>> ---
> >>>
> >>> Maxime - what are the dependencies for this contribution and for which
> >>> pieces are they needed?
> >>>
> >>> Andy
> >>>
> >>> On 27/03/18 15:49, ajs6f wrote:
> >>>> Bruno raises an interesting question-- would this contribution have any
> >>>> effect (or should it) on jena-spatial? Would it be either necessary or, if
> >>>> not, appropriate to integrate there? (I'm particularly interested in this
> >>>> because it might help decide between core and an extension.)
> >>>>
> >>>> ajs6f
> >>>>
> >>>>> On Mar 26, 2018, at 5:40 PM, Bruno P. Kinoshita <ki...@apache.org> wrote:
> >>>>>
> >>>>> Hi Maxime,
> >>>>> Don't know whether it would be best as part of jena core or in an
> >>>>> extension, but sounds very interesting! Will let others comment on this.
> >>>>> At work, one item in my backlog is to replace jscience by jsr363 -
> >>>>> Units of Measurement.
> >>>>>
> >>>>> We use it for weather forecast and GIS, with things like wind speed,
> >>>>> rain amount, etc.
> >>>>> I think another GIS library that we use did the switch as well (some
> >>>>> OGC lib I think).
> >>>>> Perhaps it would be nice to consider taking a look at their api for
> >>>>> compatibility with other systems.
> >>>>> Cheers
> >>>>> Bruno
> >>>>>
> >>>>> On Tue, 27 Mar 2018 at 2:07, Maxime Lefrançois <maxime.lefranc...@emse.fr> wrote:
> >>>>>
> >>>>> Dear all,
> >>>>>
> >>>>> I am Associate Professor at MINES Saint-Étienne, France, working on
> >>>>> Semantic Web and Linked Data. I'd like to let you know about our
> >>>>> project *Custom Datatypes for Quantity Values* [1], which leverages the
> >>>>> Unified Code for Units of Measure (UCUM), a code system intended to
> >>>>> include all units of measure currently used in international
> >>>>> science, engineering, and business.
> >>>>> Using our UCUM Datatypes, one can encode and query quantity values in a
> >>>>> lightweight manner:
> >>>>>
> >>>>> PREFIX cdt: <http://w3id.org/lindt/custom_datatypes#>
> >>>>> PREFIX ex: <http://example.org/>
> >>>>>
> >>>>> SELECT ?value1 ?value2 ?result
> >>>>> WHERE{
> >>>>>   VALUES ( ?value1 ?value2 ) {
> >>>>>     ( "1.0 m/s"^^cdt:speed "2 s"^^cdt:time )
> >>>>>   }
> >>>>>   BIND( ?value1 * ?value2 AS ?result )
> >>>>> }
> >>>>>
> >>>>> Results in
> >>>>>
> >>>>> ----------------------------------------------------------------
> >>>>> | value1               | value2          | result              |
> >>>>> ================================================================
> >>>>> | "1.0 m/s"^^cdt:speed | "2 s"^^cdt:time | "2.0 m"^^cdt:length |
> >>>>>
> >>>>> See our demonstration online [2].
> >>>>> It uses *a fork of Jena where we implemented UCUM datatypes* [3] (in
> >>>>> jena-core and jena-arq, with several unit tests). Our implementation uses
> >>>>> the recent JSR 385, Units of Measurement API 2.0, and the UCUM extension
> >>>>> [4].
> >>>>>
> >>>>> This is not the first project I have developed in or with Jena.
> >>>>> - I forked it to support arbitrary custom datatypes in RDF and SPARQL
> >>>>> by fetching a JavaScript definition at the URI of the datatype [5]
> >>>>> - I develop SPARQL-Generate, an extension of SPARQL implemented on ARQ to
> >>>>> generate RDF from web documents in XML, JSON, CSV, HTML, CBOR, and plain
> >>>>> text with regular expressions [6]
> >>>>>
> >>>>> If you agree with me that supporting UCUM datatypes would be a nice addition
> >>>>> to Apache Jena and a nice contribution to the Semantic Web community, I
> >>>>> would be willing to help integrate our contribution with other modules
> >>>>> (with jena-tdb, ... ), and help maintain it in the future.
> >>>>>
> >>>>> Best regards,
> >>>>> Maxime Lefrançois,
> >>>>> Associate Professor, MINES Saint-Étienne
> >>>>>
> >>>>> [1] - http://w3id.org/lindt/custom_datatypes#
> >>>>> [2] - http://w3id.org/lindt/playground.html?example=05-Multiply
> >>>>> [3] - http://w3id.org/lindt/custom_datatypes#implementation
> >>>>> [4] - https://github.com/unitsofmeasurement/uom-systems/tree/master/ucum-java8
> >>>>> [5] - https://ci.mines-stetienne.fr/lindt/spec.html
> >>>>> [6] - https://ci.mines-stetienne.fr/sparql-generate/
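
PS: to make the Service Provider idea at the top of this message more concrete, here is a minimal sketch of how an ordered list of operator handlers could be pre-filled with an XSD handler and then extended with whatever java.util.ServiceLoader finds on the classpath. OperatorHandler, XsdIntegerHandler and OperatorRegistry are hypothetical names invented for this illustration; they are not existing Jena classes. Plain strings stand in for NodeValue and datatype URIs so that the sketch compiles without Jena.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.ServiceLoader;

// Hypothetical extension point (not part of Jena): one handler per family of datatypes.
interface OperatorHandler {
    // True if this handler knows how to combine operands of these two datatype URIs.
    boolean canHandle(String datatypeUri1, String datatypeUri2);

    // Adds two lexical forms and returns the lexical form of the result.
    String add(String lexical1, String lexical2);
}

// Hypothetical built-in handler standing in for the existing XSD behaviour.
final class XsdIntegerHandler implements OperatorHandler {
    static final String XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer";

    @Override
    public boolean canHandle(String datatypeUri1, String datatypeUri2) {
        return XSD_INTEGER.equals(datatypeUri1) && XSD_INTEGER.equals(datatypeUri2);
    }

    @Override
    public String add(String lexical1, String lexical2) {
        return new BigInteger(lexical1).add(new BigInteger(lexical2)).toString();
    }
}

// Hypothetical registry: an ordered list, built-in handler first, extensions after.
final class OperatorRegistry {
    private final List<OperatorHandler> handlers = new ArrayList<>();

    OperatorRegistry() {
        handlers.add(new XsdIntegerHandler());
        // Picks up providers declared in a META-INF/services file named after
        // the interface; the loop simply adds nothing when none is declared.
        for (OperatorHandler handler : ServiceLoader.load(OperatorHandler.class)) {
            handlers.add(handler);
        }
    }

    // Programmatic registration, as is possible today for custom functions.
    void register(OperatorHandler handler) {
        handlers.add(handler);
    }

    // Returns the first handler, in registration order, that accepts the operands.
    Optional<OperatorHandler> find(String datatypeUri1, String datatypeUri2) {
        return handlers.stream()
                .filter(h -> h.canHandle(datatypeUri1, datatypeUri2))
                .findFirst();
    }
}

final class OperatorRegistryDemo {
    public static void main(String[] args) {
        OperatorRegistry registry = new OperatorRegistry();
        String xsdInt = XsdIntegerHandler.XSD_INTEGER;
        String sum = registry.find(xsdInt, xsdInt)
                .map(h -> h.add("1", "2"))
                .orElse("no handler: evaluation error");
        System.out.println(sum);
    }
}
```

Putting the built-in handler first means an extension can only take over operand combinations that the XSD handler rejects, which matches Rob's remark that SPARQL allows overloading where evaluation would otherwise be an error.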