Right now the API on GitHub says nothing about the identity and hashCode of any term. In order to have interoperable implementations it is essential to define the hashCode value and the identity conditions for the RDF terms which are not locally scoped, i.e. for IRIs and Literals.
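For illustration only, a minimal sketch of what such a contract could look like for IRIs (the class name `SimpleIRI` is hypothetical, not taken from the API or from Clerezza): equality and hashCode derive solely from the IRI's lexical string, which is enough for independent implementations to agree in hash-based collections.

```java
// Hypothetical sketch (not the Clerezza definition): pin an IRI's
// identity to its lexical string so independent implementations agree.
final class SimpleIRI {
    private final String iriString;

    SimpleIRI(String iriString) {
        this.iriString = java.util.Objects.requireNonNull(iriString);
    }

    String getIRIString() {
        return iriString;
    }

    @Override
    public boolean equals(Object other) {
        // Two IRIs are equal iff their IRI strings are equal.
        return other instanceof SimpleIRI
                && iriString.equals(((SimpleIRI) other).iriString);
    }

    @Override
    public int hashCode() {
        // Must match equals(): derived from the IRI string only.
        return iriString.hashCode();
    }
}
```

With a contract like this, a `SimpleIRI` from one implementation and an equivalent IRI from another would hash to the same bucket and compare equal, which is the interoperability point being made above.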
I suggest taking the definitions from the Clerezza RDF commons.

Reto

On Mon, Mar 23, 2015 at 10:18 AM, Stian Soiland-Reyes <[email protected]> wrote:

> OK - I can see that settling BlankNode equality can take some more time
> (also considering the SPARQL example).
>
> So then we must keep the "internalIdentifier" and the abstract concept
> of the "local scope" for the next release.
>
> In which case this one should also be applied:
> https://github.com/commons-rdf/commons-rdf/pull/48/files
> and perhaps:
> https://github.com/commons-rdf/commons-rdf/pull/61/files
>
> I would then need to fix simple GraphImpl.add() to clone and change
> the local scope of the BlankNodes:
> .. as otherwise it would wrongly merge graph1.b1 and graph2.b1 (both
> having the same internalIdentifier and the abstract local scope of
> being in the same Graph). This can happen when doing, say, a copy from
> one graph to another.
>
> Raised and detailed in
> https://github.com/commons-rdf/commons-rdf/issues/66
> .. adding this to the tests sounds crucial, and would help us later
> when sorting this out.
>
> This is in no way a complete resolution. (New bugs would arise, e.g.
> you could add a triple with a BlankNode and then not remove it
> afterwards with the same arguments.)
>
> On 22 March 2015 at 21:00, Peter Ansell <[email protected]> wrote:
> > +1
> >
> > Although it is not urgent to release a 1.0 version, it is urgent to
> > release (and keep releasing often) what we have changed since 0.0.2 so
> > we can start experimenting with it, particularly since I have started
> > working more intently on Sesame 4 in the last few weeks. Stian's pull
> > requests to change the BNode situation could wait until after 0.0.3 is
> > released, at this point.
> >
> > Cheers,
> >
> > Peter
> >
> > On 21 March 2015 at 22:37, Andy Seaborne <[email protected]> wrote:
> >> I agree with Sergio that releasing something is important.
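The BlankNode rescoping described in Stian's mail could be sketched like this (illustrative names only, not the actual GraphImpl code): on add(), each incoming BlankNode identifier is mapped to a fresh identifier local to the receiving graph, so equal internalIdentifiers from different source graphs never merge.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Illustrative sketch, not the real GraphImpl: each graph keeps its own
// mapping from incoming BlankNode identifiers to fresh, graph-local ones.
final class BlankNodeRescoper {
    private final Map<String, String> localScope = new HashMap<>();

    // The same incoming identifier always maps to the same local
    // identifier within this graph, but to a different one in any
    // other graph - so graph1.b1 and graph2.b1 stay distinct.
    String rescope(String internalIdentifier) {
        return localScope.computeIfAbsent(internalIdentifier,
                id -> UUID.randomUUID().toString());
    }
}
```

As the thread notes, this is not a complete resolution: remove() with the original BlankNode would no longer match the rescoped triple.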
> >>
> >> We need to release, then independent groups can start to build on it. We
> >> have grounded requirements and a wider community.
> >>
> >> Andy
> >>
> >> On 21/03/15 09:10, Reto Gmür wrote:
> >>>
> >>> Hi Sergio,
> >>>
> >>> I don't see where an urgent agenda comes from. Several RDF APIs are
> >>> there, so a new API essentially needs to be better rather than done
> >>> with urgency.
> >>>
> >>> The SPARQL implementation is less something that needs to be part of
> >>> the first release than something that helps validate the API proposal.
> >>> We should validate our API against many possible use cases and then
> >>> discuss which are more important to support. In my opinion, for an RDF
> >>> API it is more important that it can be used with remote repositories
> >>> over standard protocols than that it supports Hadoop-style processing
> >>> across many machines [1], but maybe we can support both use cases.
> >>>
> >>> In any case I think it's good to have prototypical implementations of
> >>> use cases to see which API features are needed and which are
> >>> problematic. So I would encourage writing prototype use cases where
> >>> Hadoop-style processing shows the need for an exposed blank node ID,
> >>> or a prototype showing that IRI is better as an interface than as a
> >>> class, etc.
> >>>
> >>> In the end we need to decide on the API features based on the use
> >>> cases they are required by, respectively compatible with. But it's
> >>> hard to see the requirements without prototypical code.
> >>>
> >>> Cheers,
> >>> Reto
> >>>
> >>> 1. https://github.com/commons-rdf/commons-rdf/pull/48#issuecomment-72689214
> >>>
> >>> On Fri, Mar 20, 2015 at 8:30 PM, Sergio Fernández <[email protected]>
> >>> wrote:
> >>>
> >>>> I perfectly understand what you target. But still, FMPOV it is still
> >>>> out of our urgent agenda. Not because it is not interesting, just
> >>>> because there are more urgent things to deal with.
> >>>> I think the most important thing is to get running with what we
> >>>> have, and get a release out. But, as I said, we can discuss it.
> >>>>
> >>>> On 20/03/15 19:10, Reto Gmür wrote:
> >>>>
> >>>>> Just a little usage example to illustrate Stian's point:
> >>>>>
> >>>>> public class Main {
> >>>>>     public static void main(String... args) {
> >>>>>         Graph g = new SparqlGraph("http://dbpedia.org/sparql");
> >>>>>         Iterator<Triple> iter = g.filter(
> >>>>>                 new Iri("http://dbpedia.org/ontology/Planet"),
> >>>>>                 new Iri("http://www.w3.org/1999/02/22-rdf-syntax-ns#type"),
> >>>>>                 null);
> >>>>>         while (iter.hasNext()) {
> >>>>>             System.out.println(iter.next().getObject());
> >>>>>         }
> >>>>>     }
> >>>>> }
> >>>>>
> >>>>> I think with Stian's version using streams the above could be
> >>>>> shorter and nicer. But the important part is that the above allows
> >>>>> using DBpedia as a graph without worrying about SPARQL.
> >>>>>
> >>>>> Cheers,
> >>>>> Reto
> >>>>>
> >>>>> On Fri, Mar 20, 2015 at 4:16 PM, Stian Soiland-Reyes
> >>>>> <[email protected]> wrote:
> >>>>>
> >>>>>> I think a query interface, as you say, is orthogonal to Reto's
> >>>>>> impl.sparql module - which is trying to be an implementation of RDF
> >>>>>> Commons that is backed only by a remote SPARQL endpoint. Thus it
> >>>>>> touches on important edges like streaming and blank node identities.
> >>>>>>
> >>>>>> It's not a SPARQL endpoint backed by RDF Commons! :-)
> >>>>>>
> >>>>>> On 20 March 2015 at 10:58, Sergio Fernández <[email protected]> wrote:
> >>>>>>
> >>>>>>> Hi Reto,
> >>>>>>>
> >>>>>>> yes, that was a deliberate decision in the early phases. I'd need
> >>>>>>> to look it up, I do not remember the concrete issue.
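For comparison, a sketch of how the Iterator loop in Reto's example might read in a stream style. The `Iri` and `Triple` records here are minimal stand-ins for illustration, not the actual API types, and the stream-returning signature is an assumption about "Stian's version", not the current API.

```java
import java.util.List;
import java.util.stream.Collectors;

// Minimal stand-ins for the API types, just enough to show the shape.
record Iri(String iriString) {}
record Triple(Iri subject, Iri predicate, Object object) {}

final class StreamFilterExample {
    // filter(s, p, null) expressed as a stream pipeline instead of an
    // explicit Iterator loop; a null object acts as a wildcard.
    static List<Object> objectsOf(List<Triple> graph, Iri subject, Iri predicate) {
        return graph.stream()
                .filter(t -> t.subject().equals(subject))
                .filter(t -> t.predicate().equals(predicate))
                .map(Triple::object)
                .collect(Collectors.toList());
    }
}
```

The pipeline keeps the same "DBpedia as a graph" abstraction while replacing the explicit `while (iter.hasNext())` loop with `map`/`forEach`-style composition.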
> >>>>>>>
> >>>>>>> Just going a bit deeper into the topic: in querying we are talking
> >>>>>>> not only about providing native support to query Graph instances,
> >>>>>>> but also about providing common interfaces to interact with the
> >>>>>>> results.
> >>>>>>>
> >>>>>>> The idea was to keep the focus on RDF 1.1 concepts before moving
> >>>>>>> to query. Personally I'd prefer to keep that scope for the first
> >>>>>>> incubator release, and then start to open discussions about such
> >>>>>>> kinds of threads. But of course we can vote to change that
> >>>>>>> approach.
> >>>>>>>
> >>>>>>> Cheers,
> >>>>>>>
> >>>>>>> On 17/03/15 11:05, Reto Gmür wrote:
> >>>>>>>
> >>>>>>>> Hi Sergio,
> >>>>>>>>
> >>>>>>>> I'm not sure which deliberate decision you are referring to; is
> >>>>>>>> it issue #35 on GitHub?
> >>>>>>>>
> >>>>>>>> Anyway, the impl.sparql code is not about extending the API to
> >>>>>>>> allow running queries on a graph; in fact the API isn't extended
> >>>>>>>> at all. It's an implementation of the API which is backed by a
> >>>>>>>> SPARQL endpoint. Very often the triple store doesn't run in the
> >>>>>>>> same VM as the client, and so it is necessary that
> >>>>>>>> implementations of the API speak to a remote triple store. This
> >>>>>>>> can use some proprietary protocol or standard SPARQL; this is an
> >>>>>>>> implementation for SPARQL and can thus be used against any SPARQL
> >>>>>>>> endpoint.
> >>>>>>>>
> >>>>>>>> Cheers,
> >>>>>>>> Reto
> >>>>>>>>
> >>>>>>>> On Tue, Mar 17, 2015 at 7:41 AM, Sergio Fernández
> >>>>>>>> <[email protected]> wrote:
> >>>>>>>>
> >>>>>>>>> Hi Reto,
> >>>>>>>>>
> >>>>>>>>> thanks for updating us with the status from Clerezza.
> >>>>>>>>>
> >>>>>>>>> In the current Commons RDF API we deliberately skipped querying
> >>>>>>>>> for the early versions.
> >>>>>>>>>
> >>>>>>>>> Although I'd prefer to keep this approach in the initial steps
> >>>>>>>>> at ASF (I hope we can import the code soon...), that's for sure
> >>>>>>>>> one of the next points to discuss in the project, where all that
> >>>>>>>>> experience is valuable.
> >>>>>>>>>
> >>>>>>>>> Cheers,
> >>>>>>>>>
> >>>>>>>>> On 16/03/15 13:02, Reto Gmür wrote:
> >>>>>>>>>
> >>>>>>>>>> Hello,
> >>>>>>>>>>
> >>>>>>>>>> With the new repository, the Clerezza RDF commons previously in
> >>>>>>>>>> the commons sandbox are now at:
> >>>>>>>>>>
> >>>>>>>>>> https://git-wip-us.apache.org/repos/asf/clerezza-rdf-core.git
> >>>>>>>>>>
> >>>>>>>>>> I will compare that code with the current status of the code in
> >>>>>>>>>> the incubating rdf-commons project in a later mail.
> >>>>>>>>>>
> >>>>>>>>>> Now I would like to bring to your attention a big step forward
> >>>>>>>>>> towards CLEREZZA-856. The impl.sparql module provides an
> >>>>>>>>>> implementation of the API on top of a SPARQL endpoint.
> >>>>>>>>>> Currently it only supports read access. For usage examples see
> >>>>>>>>>> the tests in /src/test/java/org/apache/commons/rdf/impl/sparql (
> >>>>>>>>>> https://git-wip-us.apache.org/repos/asf?p=clerezza-rdf-core.git;a=tree;f=impl.sparql/src/test/java/org/apache/commons/rdf/impl/sparql;h=cb9c98bcf427452392e74cd162c08ab308359c13;hb=HEAD
> >>>>>>>>>> )
> >>>>>>>>>>
> >>>>>>>>>> The hard part was supporting BlankNodes. The current
> >>>>>>>>>> implementation handles them correctly even in tricky
> >>>>>>>>>> situations; however, the current code is not yet optimized for
> >>>>>>>>>> performance. As soon as BlankNodes are involved, many queries
> >>>>>>>>>> have to be sent to the backend. I'm sure some SPARQL wizard
> >>>>>>>>>> could help make things more efficient.
> >>>>>>>>>>
> >>>>>>>>>> Since SPARQL is the only standardized method to query RDF data,
> >>>>>>>>>> I think being able to façade an RDF Graph accessible via SPARQL
> >>>>>>>>>> is an important use case for an RDF API, so it would be good to
> >>>>>>>>>> also have a SPARQL-backed implementation of the API proposal in
> >>>>>>>>>> the incubating commons-rdf repository.
> >>>>>>>>>>
> >>>>>>>>>> Cheers,
> >>>>>>>>>> Reto
> >>>>>>>>>
> >>>>>>>>> --
> >>>>>>>>> Sergio Fernández
> >>>>>>>>> Partner Technology Manager
> >>>>>>>>> Redlink GmbH
> >>>>>>>>> m: +43 660 2747 925
> >>>>>>>>> e: [email protected]
> >>>>>>>>> w: http://redlink.co
>
> --
> Stian Soiland-Reyes
> Apache Taverna (incubating), Apache Commons RDF (incubating)
> http://orcid.org/0000-0001-9842-9718
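To make the SPARQL-façade idea from the thread concrete: a hedged sketch of how a SPARQL-backed `filter(subject, predicate, object)` might translate its arguments into a query, with null standing for a wildcard variable. The class name and query shape are illustrative assumptions, not taken from the impl.sparql code.

```java
// Illustrative only: build a SELECT query for a filter(s, p, o) call,
// replacing null arguments with SPARQL variables.
final class SparqlFilterQuery {
    static String build(String subject, String predicate, String object) {
        // A null term becomes a variable; otherwise wrap the IRI in <>.
        String s = (subject == null) ? "?s" : "<" + subject + ">";
        String p = (predicate == null) ? "?p" : "<" + predicate + ">";
        String o = (object == null) ? "?o" : "<" + object + ">";
        return "SELECT * WHERE { " + s + " " + p + " " + o + " }";
    }
}
```

A real implementation would also have to handle Literals and, as the thread discusses, BlankNodes, which is where the extra round-trip queries come from.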
