On Mon, Mar 23, 2015 at 12:04 PM, Andy Seaborne <[email protected]> wrote:
> On 23/03/15 10:25, Reto Gmür wrote:
>> Right now the API on Github says nothing about the identity and hash code
>> of any term. In order to have interoperable implementations it is
>> essential to define the value of hashCode and the identity conditions for
>> the RDF terms which are not locally scoped, i.e. for IRIs and Literals.
>
> +1
>
>> I suggest taking the definitions from the clerezza rdf commons.
>
> Absent an active JIRA at the moment, could you email here please?
>
> Given Peter is spending time on his implementation, this might be quite
> useful to him.

Sure.

Literal: the hash code of the lexical form, plus the hash code of the
datatype, plus, if the literal has a language, the hash code of the language:
https://git-wip-us.apache.org/repos/asf?p=clerezza-rdf-core.git;a=blob;f=api/src/main/java/org/apache/commons/rdf/Literal.java;h=cf5e1eea2d848a57e4e338a3d208f127103d39a4;hb=HEAD

And the IRI: 5 + the hash code of the string:
https://git-wip-us.apache.org/repos/asf?p=clerezza-rdf-core.git;a=blob;f=api/src/main/java/org/apache/commons/rdf/Iri.java;h=e1ef0f7d21a3cb668b4a3b2f2aae7e2f642b68dd;hb=HEAD
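In Java, that contract comes down to roughly the following (a minimal
sketch; class and field names are illustrative, the two files linked above
are the authoritative source):

    import java.util.Objects;

    // Sketch of the proposed identity/hashCode contract, not the actual
    // Clerezza source.
    final class Iri {
        private final String unicodeString;

        Iri(String unicodeString) {
            this.unicodeString = unicodeString;
        }

        @Override
        public boolean equals(Object obj) {
            // Two IRIs denote the same term iff their strings are equal.
            return obj instanceof Iri
                    && unicodeString.equals(((Iri) obj).unicodeString);
        }

        @Override
        public int hashCode() {
            // 5 + the hash code of the IRI string
            return 5 + unicodeString.hashCode();
        }
    }

    final class Literal {
        private final String lexicalForm;
        private final Iri dataType;
        private final String language; // null when there is no language tag

        Literal(String lexicalForm, Iri dataType, String language) {
            this.lexicalForm = lexicalForm;
            this.dataType = dataType;
            this.language = language;
        }

        @Override
        public boolean equals(Object obj) {
            // Same lexical form, same datatype, and same (possibly absent)
            // language tag, as in RDF 1.1 literal term equality.
            if (!(obj instanceof Literal)) {
                return false;
            }
            Literal other = (Literal) obj;
            return lexicalForm.equals(other.lexicalForm)
                    && dataType.equals(other.dataType)
                    && Objects.equals(language, other.language);
        }

        @Override
        public int hashCode() {
            // lexical form + datatype + (language, if present)
            return lexicalForm.hashCode()
                    + dataType.hashCode()
                    + (language == null ? 0 : language.hashCode());
        }
    }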
Reto

> Andy

>> Reto
>>
>> On Mon, Mar 23, 2015 at 10:18 AM, Stian Soiland-Reyes <[email protected]>
>> wrote:
>>> OK - I can see that settling BlankNode equality can take some more time
>>> (also considering the SPARQL example).
>>>
>>> So then we must keep the "internalIdentifier" and the abstract concept
>>> of the "local scope" for the next release.
>>>
>>> In which case this one should also be applied:
>>> https://github.com/commons-rdf/commons-rdf/pull/48/files
>>> and perhaps:
>>> https://github.com/commons-rdf/commons-rdf/pull/61/files
>>>
>>> I would then need to fix the simple GraphImpl.add() to clone the
>>> BlankNodes and change their local scope, as otherwise it would wrongly
>>> merge graph1.b1 and graph2.b1 (both having the same internalIdentifier
>>> and the abstract local scope of being in the same Graph). This can
>>> happen when doing, say, a copy from one graph to another.
>>>
>>> Raised and detailed in
>>> https://github.com/commons-rdf/commons-rdf/issues/66
>>> Adding this to the tests sounds crucial, and would help us later when
>>> sorting this out.
>>>
>>> This is in no way a complete resolution. (New bugs would arise, e.g.
>>> you could add a triple with a BlankNode and then not be able to remove
>>> it afterwards with the same arguments.)
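The cloning fix could look roughly like this (a rough sketch against the
API's Triple/BlankNode types; the GraphImpl internals and BlankNodeImpl
shown here are made-up names, not the actual simple implementation):

    import java.util.HashMap;
    import java.util.Map;

    class GraphImpl {
        // One clone per foreign BlankNode: nodes that merely share an
        // internalIdentifier across scopes are kept apart, while repeated
        // add() calls with the same node stay connected in this graph.
        private final Map<BlankNode, BlankNode> localScope = new HashMap<>();

        void add(Triple triple) {
            BlankNodeOrIRI subject = triple.getSubject();
            RDFTerm object = triple.getObject();
            if (subject instanceof BlankNode) {
                subject = localScope.computeIfAbsent(
                        (BlankNode) subject, n -> new BlankNodeImpl());
            }
            if (object instanceof BlankNode) {
                object = localScope.computeIfAbsent(
                        (BlankNode) object, n -> new BlankNodeImpl());
            }
            // ... store a triple rebuilt from the (possibly cloned) terms
        }
    }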
>>>
>>> On 22 March 2015 at 21:00, Peter Ansell <[email protected]> wrote:
>>>> +1
>>>>
>>>> Although it is not urgent to release a 1.0 version, it is urgent to
>>>> release (and keep releasing often) what we have changed since 0.0.2 so
>>>> we can start experimenting with it, particularly since I have started
>>>> working more intently on Sesame 4 in the last few weeks. Stian's pull
>>>> requests to change the BNode situation could wait until after 0.0.3 is
>>>> released, at this point.
>>>>
>>>> Cheers,
>>>>
>>>> Peter
>>>>
>>>> On 21 March 2015 at 22:37, Andy Seaborne <[email protected]> wrote:
>>>>> I agree with Sergio that releasing something is important.
>>>>>
>>>>> We need to release; then independent groups can start to build on it.
>>>>> We have grounded requirements and a wider community.
>>>>>
>>>>> Andy
>>>>>
>>>>> On 21/03/15 09:10, Reto Gmür wrote:
>>>>>> Hi Sergio,
>>>>>>
>>>>>> I don't see where an urgent agenda comes from. Several RDF APIs are
>>>>>> already there, so a new API essentially needs to be better rather
>>>>>> than done with urgency.
>>>>>>
>>>>>> The SPARQL implementation is less something that needs to be part of
>>>>>> the first release and more something that helps validate the API
>>>>>> proposal. We should validate our API against many possible use cases
>>>>>> and then discuss which are more important to support. In my opinion,
>>>>>> for an RDF API it is more important that it can be used with remote
>>>>>> repositories over standard protocols than that it supports
>>>>>> Hadoop-style processing across many machines [1], but maybe we can
>>>>>> support both use cases.
>>>>>>
>>>>>> In any case I think it's good to have prototypical implementations
>>>>>> of use cases to see which API features are needed and which are
>>>>>> problematic. So I would encourage writing prototype use cases where
>>>>>> Hadoop-style processing shows the need for an exposed blank node ID,
>>>>>> or a prototype showing that IRI is better as an interface than as a
>>>>>> class, etc.
>>>>>>
>>>>>> In the end we need to decide on the API features based on the use
>>>>>> cases they are required by, respectively compatible with. But it's
>>>>>> hard to see the requirements without prototypical code.
>>>>>>
>>>>>> Cheers,
>>>>>> Reto
>>>>>>
>>>>>> 1. https://github.com/commons-rdf/commons-rdf/pull/48#issuecomment-72689214
>>>>>>
>>>>>> On Fri, Mar 20, 2015 at 8:30 PM, Sergio Fernández
>>>>>> <[email protected]> wrote:
>>>>>>> I perfectly understand what you are targeting. But FMPOV it is
>>>>>>> still outside our urgent agenda; not because it is not interesting,
>>>>>>> just because there are more urgent things to deal with. I think the
>>>>>>> most important thing is to get running with what we have, and get a
>>>>>>> release out. But, as I said, we can discuss it.
>>>>>>>
>>>>>>> On 20/03/15 19:10, Reto Gmür wrote:
>>>>>>>> Just a little usage example to illustrate Stian's point:
>>>>>>>>
>>>>>>>> public class Main {
>>>>>>>>     public static void main(String... args) {
>>>>>>>>         Graph g = new SparqlGraph("http://dbpedia.org/sparql");
>>>>>>>>         Iterator<Triple> iter = g.filter(
>>>>>>>>                 new Iri("http://dbpedia.org/ontology/Planet"),
>>>>>>>>                 new Iri("http://www.w3.org/1999/02/22-rdf-syntax-ns#type"),
>>>>>>>>                 null);
>>>>>>>>         while (iter.hasNext()) {
>>>>>>>>             System.out.println(iter.next().getObject());
>>>>>>>>         }
>>>>>>>>     }
>>>>>>>> }
>>>>>>>>
>>>>>>>> I think with Stian's version using streams the above could be
>>>>>>>> shorter and nicer. But the important part is that the above allows
>>>>>>>> using DBpedia as a graph without worrying about SPARQL.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Reto
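With streams, the same example might shrink to something like this (a
sketch assuming a Graph method that returns the matching triples as a
Stream<Triple>; the exact method name here is an assumption, not the
current API):

    Graph g = new SparqlGraph("http://dbpedia.org/sparql");
    g.getTriples(new Iri("http://dbpedia.org/ontology/Planet"),
            new Iri("http://www.w3.org/1999/02/22-rdf-syntax-ns#type"),
            null)                          // assumed stream-returning accessor
        .map(Triple::getObject)            // keep only the object of each triple
        .forEach(System.out::println);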
>>>>>>>>
>>>>>>>> On Fri, Mar 20, 2015 at 4:16 PM, Stian Soiland-Reyes
>>>>>>>> <[email protected]> wrote:
>>>>>>>>> I think a query interface, as you say, is orthogonal to Reto's
>>>>>>>>> impl.sparql module - which is trying to be an implementation of
>>>>>>>>> RDF Commons that is backed only by a remote SPARQL endpoint. Thus
>>>>>>>>> it touches on important edges like streaming and blank node
>>>>>>>>> identities.
>>>>>>>>>
>>>>>>>>> It's not a SPARQL endpoint backed by RDF Commons! :-)
>>>>>>>>>
>>>>>>>>> On 20 March 2015 at 10:58, Sergio Fernández <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>> Hi Reto,
>>>>>>>>>>
>>>>>>>>>> yes, that was a deliberate decision in the early phases. I'd need
>>>>>>>>>> to look it up; I do not remember the concrete issue.
>>>>>>>>>>
>>>>>>>>>> Just going a bit deeper into the topic: in querying we are
>>>>>>>>>> talking not only about providing native support for querying
>>>>>>>>>> Graph instances, but also about providing common interfaces to
>>>>>>>>>> interact with the results.
>>>>>>>>>>
>>>>>>>>>> The idea was to keep the focus on RDF 1.1 concepts before moving
>>>>>>>>>> to query. Personally I'd prefer to keep that scope for the first
>>>>>>>>>> incubator release, and then start to open discussions on those
>>>>>>>>>> kinds of topics. But of course we can vote to change that
>>>>>>>>>> approach.
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>>
>>>>>>>>>> On 17/03/15 11:05, Reto Gmür wrote:
>>>>>>>>>>> Hi Sergio,
>>>>>>>>>>>
>>>>>>>>>>> I'm not sure which deliberate decision you are referring to; is
>>>>>>>>>>> it Issue #35 in Github?
>>>>>>>>>>>
>>>>>>>>>>> Anyway, the impl.sparql code is not about extending the API to
>>>>>>>>>>> allow running queries on a graph; in fact the API isn't extended
>>>>>>>>>>> at all. It's an implementation of the API which is backed by a
>>>>>>>>>>> SPARQL endpoint. Very often the triple store doesn't run in the
>>>>>>>>>>> same VM as the client, so it is necessary that implementations
>>>>>>>>>>> of the API speak to a remote triple store. This can use
>>>>>>>>>>> proprietary protocols or standard SPARQL; this is an
>>>>>>>>>>> implementation for SPARQL and can thus be used against any
>>>>>>>>>>> SPARQL endpoint.
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> Reto
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Mar 17, 2015 at 7:41 AM, Sergio Fernández
>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>> Hi Reto,
>>>>>>>>>>>>
>>>>>>>>>>>> thanks for updating us on the status from Clerezza.
>>>>>>>>>>>>
>>>>>>>>>>>> In the current Commons RDF API we deliberately skipped querying
>>>>>>>>>>>> for the early versions.
>>>>>>>>>>>>
>>>>>>>>>>>> Although I'd prefer to keep this approach in the initial steps
>>>>>>>>>>>> at ASF (I hope we can import the code soon...), that's for sure
>>>>>>>>>>>> one of the next points to discuss in the project, where all
>>>>>>>>>>>> that experience is valuable.
>>>>>>>>>>>>
>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>
>>>>>>>>>>>> On 16/03/15 13:02, Reto Gmür wrote:
>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>
>>>>>>>>>>>>> With the new repository, the clerezza rdf commons previously
>>>>>>>>>>>>> in the commons sandbox are now at:
>>>>>>>>>>>>> https://git-wip-us.apache.org/repos/asf/clerezza-rdf-core.git
>>>>>>>>>>>>>
>>>>>>>>>>>>> I will compare that code with the current status of the code
>>>>>>>>>>>>> in the incubating rdf-commons project in a later mail.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Now I would like to bring to your attention a big step forward
>>>>>>>>>>>>> towards CLEREZZA-856. The impl.sparql module provides an
>>>>>>>>>>>>> implementation of the API on top of a SPARQL endpoint.
>>>>>>>>>>>>> Currently it only supports read access. For usage examples see
>>>>>>>>>>>>> the tests in /src/test/java/org/apache/commons/rdf/impl/sparql
>>>>>>>>>>>>> (https://git-wip-us.apache.org/repos/asf?p=clerezza-rdf-core.git;a=tree;f=impl.sparql/src/test/java/org/apache/commons/rdf/impl/sparql;h=cb9c98bcf427452392e74cd162c08ab308359c13;hb=HEAD)
>>>>>>>>>>>>>
>>>>>>>>>>>>> The hard part was supporting BlankNodes. The current
>>>>>>>>>>>>> implementation handles them correctly even in tricky
>>>>>>>>>>>>> situations; however, the current code is not yet optimized for
>>>>>>>>>>>>> performance. As soon as BlankNodes are involved, many queries
>>>>>>>>>>>>> have to be sent to the backend. I'm sure some SPARQL wizard
>>>>>>>>>>>>> could help make things more efficient.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Since SPARQL is the only standardized method to query RDF
>>>>>>>>>>>>> data, I think being able to façade an RDF Graph accessible via
>>>>>>>>>>>>> SPARQL is an important use case for an RDF API, so it would be
>>>>>>>>>>>>> good to also have a SPARQL-backed implementation of the API
>>>>>>>>>>>>> proposal in the incubating commons-rdf repository.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>> Reto
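As an illustration of that façade idea: without blank nodes, a
filter(subject, predicate, object) call maps to a single SELECT query,
roughly as in this hypothetical sketch (not the actual impl.sparql code;
an Iri getUnicodeString() accessor is assumed, and literal objects are
omitted for brevity):

    // Bound positions are inlined into the query; unbound (null) positions
    // become variables that are read back from the SPARQL result set.
    static String toSelectQuery(Iri subject, Iri predicate, Iri object) {
        String s = subject == null ? "?s"
                : "<" + subject.getUnicodeString() + ">";
        String p = predicate == null ? "?p"
                : "<" + predicate.getUnicodeString() + ">";
        String o = object == null ? "?o"
                : "<" + object.getUnicodeString() + ">";
        return "SELECT * WHERE { " + s + " " + p + " " + o + " . }";
    }

Blank nodes are what break this simple picture: they have no global name
that can be repeated in a later query, which is why the current
implementation needs extra round trips as soon as they are involved.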
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Sergio Fernández
>>>>>>>>>>>> Partner Technology Manager
>>>>>>>>>>>> Redlink GmbH
>>>>>>>>>>>> m: +43 660 2747 925
>>>>>>>>>>>> e: [email protected]
>>>>>>>>>>>> w: http://redlink.co
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Sergio Fernández
>>>>>>>>>> Partner Technology Manager
>>>>>>>>>> Redlink GmbH
>>>>>>>>>> m: +43 660 2747 925
>>>>>>>>>> e: [email protected]
>>>>>>>>>> w: http://redlink.co
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Stian Soiland-Reyes
>>>>>>>>> Apache Taverna (incubating), Apache Commons RDF (incubating)
>>>>>>>>> http://orcid.org/0000-0001-9842-9718
>>>>>>>
>>>>>>> --
>>>>>>> Sergio Fernández
>>>>>>> Partner Technology Manager
>>>>>>> Redlink GmbH
>>>>>>> m: +43 660 2747 925
>>>>>>> e: [email protected]
>>>>>>> w: http://redlink.co
>>>
>>> --
>>> Stian Soiland-Reyes
>>> Apache Taverna (incubating), Apache Commons RDF (incubating)
>>> http://orcid.org/0000-0001-9842-9718
