I voted for the INFRA issue, and will create it as soon as we have it....

On Mon, Mar 23, 2015 at 10:42 AM, Stian Soiland-Reyes <[email protected]> wrote:
> Could you raise this as an issue so we can focus the discussion?
>
> Jira has not been created yet (https://issues.apache.org/jira/browse/INFRA-9245) - but I assume we would import the github issues one way or another?
>
> On 23 March 2015 at 10:25, Reto Gmür <[email protected]> wrote:
>> Right now the API on GitHub says nothing about the identity and hashCode of any term. In order to be interoperable it is essential to define the value of hashCode and the identity conditions for the RDF terms which are not locally scoped, i.e. for IRIs and Literals.
>>
>> I suggest taking the definitions from the Clerezza RDF commons.
>>
>> Reto
>>
>> On Mon, Mar 23, 2015 at 10:18 AM, Stian Soiland-Reyes <[email protected]> wrote:
>>> OK - I can see that settling BlankNode equality can take some more time (also considering the SPARQL example).
>>>
>>> So then we must keep the "internalIdentifier" and the abstract concept of the "local scope" for the next release.
>>>
>>> In which case this one should also be applied:
>>> https://github.com/commons-rdf/commons-rdf/pull/48/files
>>> and perhaps:
>>> https://github.com/commons-rdf/commons-rdf/pull/61/files
>>>
>>> I would then need to fix simple GraphImpl.add() to clone and change the local scope of the BlankNodes, as otherwise it would wrongly merge graph1.b1 and graph2.b1 (both having the same internalIdentifier and the abstract local scope of being in the same Graph). This can happen when doing, say, a copy from one graph to another.
>>>
>>> Raised and detailed in https://github.com/commons-rdf/commons-rdf/issues/66 - adding this to the tests sounds crucial, and would help us later when sorting this out.
>>>
>>> This is in no way a complete resolution. (New bugs would arise, e.g. you could add a triple with a BlankNode and then not be able to remove it afterwards with the same arguments.)
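The scope-cloning fix discussed above could be sketched roughly as follows. All names here (GraphImpl, BlankNode, rescope, createBlankNode) are illustrative stand-ins for the sketch, not the actual Commons RDF interfaces:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

// Illustrative stand-ins only - not the actual Commons RDF interfaces.
class GraphImpl {
    // Hypothetical blank node: identity is the pair (scopeId, internalIdentifier).
    record BlankNode(UUID scopeId, String internalIdentifier) {}
    record Triple(Object subject, String predicate, Object object) {}

    private final UUID scope = UUID.randomUUID();       // this graph's local scope
    private final Set<Triple> triples = new HashSet<>();
    // Maps a foreign blank node to its local clone, so the same foreign
    // node is rewritten consistently within this graph.
    private final Map<BlankNode, BlankNode> cloned = new HashMap<>();

    BlankNode createBlankNode() {
        return new BlankNode(scope, UUID.randomUUID().toString());
    }

    void add(Triple t) {
        triples.add(new Triple(rescope(t.subject()), t.predicate(), rescope(t.object())));
    }

    private Object rescope(Object node) {
        if (node instanceof BlankNode b && !b.scopeId().equals(scope)) {
            // Clone foreign blank nodes into this graph's scope with a fresh
            // identifier - otherwise graph1.b1 and graph2.b1, which share the
            // same internalIdentifier, would wrongly be merged on copy.
            return cloned.computeIfAbsent(b, x -> createBlankNode());
        }
        return node;
    }

    Set<Triple> triples() { return triples; }
    int size() { return triples.size(); }
}
```

With this, copying all triples from one graph into another keeps the two graphs' blank nodes distinct even when their internal identifiers collide.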
>>>
>>> On 22 March 2015 at 21:00, Peter Ansell <[email protected]> wrote:
>>>> +1
>>>>
>>>> Although it is not urgent to release a 1.0 version, it is urgent to release (and keep releasing often) what we have changed since 0.0.2 so we can start experimenting with it, particularly since I have started working more intently on Sesame 4 in the last few weeks. Stian's pull requests to change the BNode situation could wait until after 0.0.3 is released, at this point.
>>>>
>>>> Cheers,
>>>>
>>>> Peter
>>>>
>>>> On 21 March 2015 at 22:37, Andy Seaborne <[email protected]> wrote:
>>>>> I agree with Sergio that releasing something is important.
>>>>>
>>>>> We need to release; then independent groups can start to build on it. We have grounded requirements and a wider community.
>>>>>
>>>>> Andy
>>>>>
>>>>> On 21/03/15 09:10, Reto Gmür wrote:
>>>>>> Hi Sergio,
>>>>>>
>>>>>> I don't see where an urgent agenda comes from. Several RDF APIs are already out there, so a new API essentially needs to be better rather than done with urgency.
>>>>>>
>>>>>> The SPARQL implementation is less something that needs to be part of the first release than something that helps validate the API proposal. We should validate our API against many possible use cases and then discuss which are more important to support. In my opinion, for an RDF API it is more important that it can be used with remote repositories over standard protocols than that it supports Hadoop-style processing across many machines [1], but maybe we can support both use cases.
>>>>>>
>>>>>> In any case I think it's good to have prototypical implementations of use cases to see what API features are needed and which are problematic.
>>>>>> So I would encourage writing prototype use cases where Hadoop-style processing shows the need for exposed blank node IDs, or a prototype showing that IRI is better as an interface than as a class, etc.
>>>>>>
>>>>>> In the end we need to decide on the API features based on the use cases they are required by, respectively compatible with. But it's hard to see the requirements without prototypical code.
>>>>>>
>>>>>> Cheers,
>>>>>> Reto
>>>>>>
>>>>>> 1. https://github.com/commons-rdf/commons-rdf/pull/48#issuecomment-72689214
>>>>>>
>>>>>> On Fri, Mar 20, 2015 at 8:30 PM, Sergio Fernández <[email protected]> wrote:
>>>>>>> I perfectly understand what you target. But still, FMPOV it is still outside our urgent agenda. Not because it is not interesting, just because there are more urgent things to deal with. I think the most important thing is to get running with what we have, and get a release out. But, as I said, we can discuss it.
>>>>>>>
>>>>>>> On 20/03/15 19:10, Reto Gmür wrote:
>>>>>>>> Just a little usage example to illustrate Stian's point:
>>>>>>>>
>>>>>>>> public class Main {
>>>>>>>>     public static void main(String... args) {
>>>>>>>>         Graph g = new SparqlGraph("http://dbpedia.org/sparql");
>>>>>>>>         Iterator<Triple> iter = g.filter(
>>>>>>>>                 new Iri("http://dbpedia.org/ontology/Planet"),
>>>>>>>>                 new Iri("http://www.w3.org/1999/02/22-rdf-syntax-ns#type"),
>>>>>>>>                 null);
>>>>>>>>         while (iter.hasNext()) {
>>>>>>>>             System.out.println(iter.next().getObject());
>>>>>>>>         }
>>>>>>>>     }
>>>>>>>> }
>>>>>>>>
>>>>>>>> I think with Stian's version using streams the above could be shorter and nicer.
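The streams version Reto alludes to is not shown in the thread; a rough sketch of what the same lookup could look like with a stream-returning filter() is below. Iri, Triple, Graph and inMemory are simplified stand-ins for the sketch, not the real API, and a real SparqlGraph would translate filter() into a SPARQL query against the remote endpoint instead of scanning a list:

```java
import java.util.List;
import java.util.stream.Stream;

// Simplified stand-ins so the sketch runs without a network or the real API.
class StreamExample {
    record Iri(String value) {}
    record Triple(Object subject, Iri predicate, Object object) {}

    interface Graph {
        Stream<Triple> filter(Object subject, Iri predicate, Object object);
    }

    // In-memory Graph: null in a position acts as a wildcard, as in the API.
    static Graph inMemory(List<Triple> triples) {
        return (s, p, o) -> triples.stream()
                .filter(t -> (s == null || s.equals(t.subject()))
                          && (p == null || p.equals(t.predicate()))
                          && (o == null || o.equals(t.object())));
    }

    static void demo() {
        Iri planet = new Iri("http://dbpedia.org/ontology/Planet");
        Iri type = new Iri("http://www.w3.org/1999/02/22-rdf-syntax-ns#type");
        Graph g = inMemory(List.of(
                new Triple(planet, type, new Iri("http://www.w3.org/2002/07/owl#Class"))));
        // The explicit while-loop collapses to a single pipeline:
        g.filter(planet, type, null)
                .map(Triple::object)
                .forEach(System.out::println);
    }
}
```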
>>>>>>>> But the important part is that the above allows using DBpedia as a graph without worrying about SPARQL.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Reto
>>>>>>>>
>>>>>>>> On Fri, Mar 20, 2015 at 4:16 PM, Stian Soiland-Reyes <[email protected]> wrote:
>>>>>>>>> I think a query interface, as you say, is orthogonal to Reto's impl.sparql module - which is trying to be an implementation of RDF Commons that is backed only by a remote SPARQL endpoint. Thus it touches on important edges like streaming and blank node identities.
>>>>>>>>>
>>>>>>>>> It's not a SPARQL endpoint backed by RDF Commons! :-)
>>>>>>>>>
>>>>>>>>> On 20 March 2015 at 10:58, Sergio Fernández <[email protected]> wrote:
>>>>>>>>>> Hi Reto,
>>>>>>>>>>
>>>>>>>>>> yes, that was a deliberate decision in the early phases. I'd need to look it up; I do not remember the concrete issue.
>>>>>>>>>>
>>>>>>>>>> Just going a bit deeper into the topic: in querying we are talking not only about providing native support to query Graph instances, but also about providing common interfaces to interact with the results.
>>>>>>>>>>
>>>>>>>>>> The idea was to keep the focus on RDF 1.1 concepts before moving to query. Personally I'd prefer to keep that scope for the first incubator release, and then start to open discussions about such topics. But of course we can vote to change that approach.
>>>>>>>>>> Cheers,
>>>>>>>>>>
>>>>>>>>>> On 17/03/15 11:05, Reto Gmür wrote:
>>>>>>>>>>> Hi Sergio,
>>>>>>>>>>>
>>>>>>>>>>> I'm not sure which deliberate decision you are referring to - is it issue #35 on GitHub?
>>>>>>>>>>>
>>>>>>>>>>> Anyway, the impl.sparql code is not about extending the API to allow running queries on a graph; in fact the API isn't extended at all. It's an implementation of the API which is backed by a SPARQL endpoint. Very often the triple store doesn't run in the same VM as the client, so it is necessary that implementations of the API can speak to a remote triple store. This can use proprietary protocols or standard SPARQL; this is an implementation for SPARQL and can thus be used against any SPARQL endpoint.
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> Reto
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Mar 17, 2015 at 7:41 AM, Sergio Fernández <[email protected]> wrote:
>>>>>>>>>>>> Hi Reto,
>>>>>>>>>>>>
>>>>>>>>>>>> thanks for updating us with the status from Clerezza.
>>>>>>>>>>>>
>>>>>>>>>>>> In the current Commons RDF API we deliberately skipped querying for the early versions.
>>>>>>>>>>>> Although I'd prefer to keep this approach in the initial steps at the ASF (I hope we can import the code soon...), that's for sure one of the next points to discuss in the project, where all that experience is valuable.
>>>>>>>>>>>>
>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>
>>>>>>>>>>>> On 16/03/15 13:02, Reto Gmür wrote:
>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>
>>>>>>>>>>>>> With the new repository, the Clerezza RDF commons previously in the Commons sandbox are now at:
>>>>>>>>>>>>>
>>>>>>>>>>>>> https://git-wip-us.apache.org/repos/asf/clerezza-rdf-core.git
>>>>>>>>>>>>>
>>>>>>>>>>>>> I will compare that code with the current status of the code in the incubating rdf-commons project in a later mail.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Now I would like to bring to your attention a big step forward towards CLEREZZA-856. The impl.sparql modules provide an implementation of the API on top of a SPARQL endpoint. Currently it only supports read access.
>>>>>>>>>>>>>
>>>>>>>>>>>>> For usage examples see the tests in /src/test/java/org/apache/commons/rdf/impl/sparql (https://git-wip-us.apache.org/repos/asf?p=clerezza-rdf-core.git;a=tree;f=impl.sparql/src/test/java/org/apache/commons/rdf/impl/sparql;h=cb9c98bcf427452392e74cd162c08ab308359c13;hb=HEAD)
>>>>>>>>>>>>>
>>>>>>>>>>>>> The hard part was supporting BlankNodes.
>>>>>>>>>>>>> The current implementation handles them correctly even in tricky situations; however, the current code is not optimized for performance yet. As soon as BlankNodes are involved, many queries have to be sent to the backend. I'm sure some SPARQL wizard could help make things more efficient.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Since SPARQL is the only standardized method to query RDF data, I think being able to façade an RDF Graph accessible via SPARQL is an important use case for an RDF API, so it would be good to also have a SPARQL-backed implementation of the API proposal in the incubating commons-rdf repository.
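To make the façade idea concrete: a filter(s, p, o) call on a SPARQL-backed graph boils down to sending a SELECT query whose triple pattern mirrors the call's arguments, with null positions becoming variables. The following is a toy illustration with invented class and method names; the actual impl.sparql code is considerably more involved, especially where blank nodes are concerned:

```java
// Toy illustration with invented names - not the impl.sparql code itself.
class SparqlPattern {
    // Render one position of the pattern: an IRI becomes <iri>, null a variable.
    static String term(String iri, String varName) {
        return iri == null ? "?" + varName : "<" + iri + ">";
    }

    // The SELECT query a filter(s, p, o) call could send to the endpoint.
    static String selectQuery(String s, String p, String o) {
        return "SELECT * WHERE { "
                + term(s, "s") + " " + term(p, "p") + " " + term(o, "o") + " }";
    }
}
```

Blank nodes are what break this simple picture: they cannot be addressed by name in a follow-up query, which is why the implementation has to issue many queries to re-identify them.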
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>> Reto
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Sergio Fernández
>>>>>>>>>>>> Partner Technology Manager
>>>>>>>>>>>> Redlink GmbH
>>>>>>>>>>>> m: +43 660 2747 925
>>>>>>>>>>>> e: [email protected]
>>>>>>>>>>>> w: http://redlink.co
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Sergio Fernández
>>>>>>>>>> Partner Technology Manager
>>>>>>>>>> Redlink GmbH
>>>>>>>>>> m: +43 660 2747 925
>>>>>>>>>> e: [email protected]
>>>>>>>>>> w: http://redlink.co
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Stian Soiland-Reyes
>>>>>>>>> Apache Taverna (incubating), Apache Commons RDF (incubating)
>>>>>>>>> http://orcid.org/0000-0001-9842-9718
>>>>>>>
>>>>>>> --
>>>>>>> Sergio Fernández
>>>>>>> Partner Technology Manager
>>>>>>> Redlink GmbH
>>>>>>> m: +43 660 2747 925
>>>>>>> e: [email protected]
>>>>>>> w: http://redlink.co
>>>
>>> --
>>> Stian Soiland-Reyes
>>> Apache Taverna (incubating), Apache Commons RDF (incubating)
>>> http://orcid.org/0000-0001-9842-9718
>
> --
> Stian Soiland-Reyes
> Apache Taverna (incubating), Apache Commons RDF (incubating)
> http://orcid.org/0000-0001-9842-9718
