Hi Stian,

> It means the "hashcode of the data type" in Literal can be slightly
> ambiguous. Perhaps "hashcode of the #getDataType() IRI"? It also hammers
> in through getDataType that every Literal has a datatype, i.e. it is
> always added to the hash.
Yes, that's what I meant: the datatype is the IRI returned by the
getDataType() method. I think this could be added in brackets.

Cheers,
Reto

> BTW, the hashcode of the Optional language is conveniently compliant with
> "plus hash code of language if present", so no similar ambiguity there.
>
> https://docs.oracle.com/javase/8/docs/api/java/util/Optional.html#hashCode--
>
> On 24 Mar 2015 12:25, "Reto Gmür" <[email protected]> wrote:
>
>> On Mon, Mar 23, 2015 at 12:04 PM, Andy Seaborne <[email protected]> wrote:
>>
>>> On 23/03/15 10:25, Reto Gmür wrote:
>>>
>>>> Right now the API on Github says nothing about the identity and
>>>> hashcode of any term. In order to have interoperable implementations
>>>> it is essential to define the value of hashcode and the identity
>>>> conditions for the RDF terms which are not locally scoped, i.e. for
>>>> IRIs and Literals.
>>>
>>> +1
>>>
>>>> I suggest to take the definitions from the Clerezza RDF commons.
>>>
>>> Absent an active JIRA at the moment, could you email here please?
>>>
>>> Given Peter is spending time on his implementation, this might be
>>> quite useful to him.
>>
>> Sure.
>>
>> Literal: the hash code of the lexical form plus the hash code of the
>> datatype plus, if the literal has a language, the hash code of the
>> language
>>
>> https://git-wip-us.apache.org/repos/asf?p=clerezza-rdf-core.git;a=blob;f=api/src/main/java/org/apache/commons/rdf/Literal.java;h=cf5e1eea2d848a57e4e338a3d208f127103d39a4;hb=HEAD
>>
>> And the IRI: 5 + the hashcode of the string
>>
>> https://git-wip-us.apache.org/repos/asf?p=clerezza-rdf-core.git;a=blob;f=api/src/main/java/org/apache/commons/rdf/Iri.java;h=e1ef0f7d21a3cb668b4a3b2f2aae7e2f642b68dd;hb=HEAD
>>
>> Reto
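In code, that contract would look roughly like the following; the classes
here are minimal placeholders for illustration, not the actual Clerezza or
Commons RDF types:

import java.util.Locale;
import java.util.Optional;

// Placeholder IRI type; hashCode follows "5 + the hashcode of the string".
final class Iri {
    private final String unicodeString;

    Iri(String unicodeString) {
        this.unicodeString = unicodeString;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof Iri
                && ((Iri) other).unicodeString.equals(unicodeString);
    }

    @Override
    public int hashCode() {
        return 5 + unicodeString.hashCode();
    }
}

// Placeholder Literal type (equals omitted for brevity); hashCode is the
// lexical form plus the #getDataType() IRI plus the language if present.
final class Literal {
    private final String lexicalForm;
    private final Iri dataType;               // never null: always hashed
    private final Optional<Locale> language;  // Locale stands in for a language tag

    Literal(String lexicalForm, Iri dataType, Optional<Locale> language) {
        this.lexicalForm = lexicalForm;
        this.dataType = dataType;
        this.language = language;
    }

    @Override
    public int hashCode() {
        // Optional.hashCode() returns the hash of the value, or 0 when
        // empty, so it matches "plus hash code of language if present".
        return lexicalForm.hashCode() + dataType.hashCode() + language.hashCode();
    }
}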
>>> Andy
>>>
>>>> Reto
>>>>
>>>> On Mon, Mar 23, 2015 at 10:18 AM, Stian Soiland-Reyes
>>>> <[email protected]> wrote:
>>>>
>>>>> OK - I can see that settling BlankNode equality can take some more
>>>>> time (also considering the SPARQL example).
>>>>>
>>>>> So then we must keep the "internalIdentifier" and the abstract
>>>>> concept of the "local scope" for the next release.
>>>>>
>>>>> In which case this one should also be applied:
>>>>> https://github.com/commons-rdf/commons-rdf/pull/48/files
>>>>> and perhaps:
>>>>> https://github.com/commons-rdf/commons-rdf/pull/61/files
>>>>>
>>>>> I would then need to fix simple GraphImpl.add() to clone and change
>>>>> the local scope of the BlankNodes, as otherwise it would wrongly
>>>>> merge graph1.b1 and graph2.b1 (both having the same
>>>>> internalIdentifier and the abstract local scope of being in the same
>>>>> Graph). This can happen when doing, say, a copy from one graph to
>>>>> another.
>>>>>
>>>>> Raised and detailed in
>>>>> https://github.com/commons-rdf/commons-rdf/issues/66
>>>>> .. adding this to the tests sounds crucial, and would help us later
>>>>> when sorting this out.
>>>>>
>>>>> This is in no way a complete resolution. (New bugs would arise, e.g.
>>>>> you could add a triple with a BlankNode and then not be able to
>>>>> remove it afterwards with the same arguments.)
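The re-scoping Stian describes could look roughly like this; the types
below are simplified placeholders, not the real GraphImpl and BlankNode
interfaces:

import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

interface RdfTerm {}

// Placeholder blank node: identity-based equality, each instance unique
// within its local scope.
final class BlankNode implements RdfTerm {
    final String internalIdentifier = UUID.randomUUID().toString();
}

final class Triple {
    final RdfTerm subject, predicate, object;
    Triple(RdfTerm subject, RdfTerm predicate, RdfTerm object) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
    }
}

class GraphImpl {
    private final Set<Triple> triples = new HashSet<>();
    // One stable clone per foreign blank node, so several add() calls
    // mentioning the same node stay connected within this graph.
    private final Map<BlankNode, BlankNode> rescoped = new HashMap<>();

    public void add(Triple t) {
        // Clone incoming blank nodes into this graph's own scope, so
        // graph1.b1 and graph2.b1 can no longer be wrongly merged.
        // (Predicates cannot be blank nodes, so only subject and object
        // need re-scoping.)
        triples.add(new Triple(rescope(t.subject), t.predicate, rescope(t.object)));
    }

    private RdfTerm rescope(RdfTerm term) {
        if (term instanceof BlankNode) {
            return rescoped.computeIfAbsent((BlankNode) term, b -> new BlankNode());
        }
        return term;
    }
}

This also makes the caveat above concrete: a remove() taking the caller's
original BlankNode would no longer match the re-scoped clone stored inside
the graph.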
>>>>> On 22 March 2015 at 21:00, Peter Ansell <[email protected]> wrote:
>>>>>
>>>>>> +1
>>>>>>
>>>>>> Although it is not urgent to release a 1.0 version, it is urgent to
>>>>>> release (and keep releasing often) what we have changed since 0.0.2
>>>>>> so we can start experimenting with it, particularly since I have
>>>>>> started working more intently on Sesame 4 in the last few weeks.
>>>>>> Stian's pull requests to change the BNode situation could wait
>>>>>> until after 0.0.3 is released, at this point.
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Peter
>>>>>>
>>>>>> On 21 March 2015 at 22:37, Andy Seaborne <[email protected]> wrote:
>>>>>>
>>>>>>> I agree with Sergio that releasing something is important.
>>>>>>>
>>>>>>> We need to release, then independent groups can start to build on
>>>>>>> it. We have grounded requirements and a wider community.
>>>>>>>
>>>>>>> Andy
>>>>>>>
>>>>>>> On 21/03/15 09:10, Reto Gmür wrote:
>>>>>>>
>>>>>>>> Hi Sergio,
>>>>>>>>
>>>>>>>> I don't see where an urgent agenda comes from. Several RDF APIs
>>>>>>>> are already out there, so a new API essentially needs to be
>>>>>>>> better rather than done with urgency.
>>>>>>>>
>>>>>>>> The SPARQL implementation is less something that needs to be part
>>>>>>>> of the first release than something that helps validate the API
>>>>>>>> proposal. We should validate our API against many possible use
>>>>>>>> cases and then discuss which are more important to support. In my
>>>>>>>> opinion, for an RDF API it is more important that it can be used
>>>>>>>> with remote repositories over standard protocols than that it
>>>>>>>> supports hadoop-style processing across many machines [1], but
>>>>>>>> maybe we can support both use cases.
>>>>>>>>
>>>>>>>> In any case I think it's good to have prototypical
>>>>>>>> implementations of use cases to see which API features are needed
>>>>>>>> and which are problematic. So I would encourage writing prototype
>>>>>>>> use cases where hadoop-style processing shows the need for an
>>>>>>>> exposed blank node ID, or a prototype showing that IRI is better
>>>>>>>> as an interface than as a class, etc.
>>>>>>>>
>>>>>>>> In the end we need to decide on the API features based on the use
>>>>>>>> cases they are required by, respectively compatible with. But
>>>>>>>> it's hard to see the requirements without prototypical code.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Reto
>>>>>>>>
>>>>>>>> 1. https://github.com/commons-rdf/commons-rdf/pull/48#issuecomment-72689214
>>>>>>>>
>>>>>>>> On Fri, Mar 20, 2015 at 8:30 PM, Sergio Fernández <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> I perfectly understand what you target. But still, FMPOV it is
>>>>>>>>> out of our urgent agenda. Not because it is not interesting,
>>>>>>>>> just because there are more urgent things to deal with. I think
>>>>>>>>> the most important thing is to get running with what we have,
>>>>>>>>> and get a release out. But, as I said, we can discuss it.
>>>>>>>>>
>>>>>>>>> On 20/03/15 19:10, Reto Gmür wrote:
>>>>>>>>>
>>>>>>>>>> Just a little usage example to illustrate Stian's point:
>>>>>>>>>>
>>>>>>>>>> import java.util.Iterator;
>>>>>>>>>> import org.apache.commons.rdf.Graph;
>>>>>>>>>> import org.apache.commons.rdf.Iri;
>>>>>>>>>> import org.apache.commons.rdf.Triple;
>>>>>>>>>> import org.apache.commons.rdf.impl.sparql.SparqlGraph;
>>>>>>>>>>
>>>>>>>>>> public class Main {
>>>>>>>>>>     public static void main(String... args) {
>>>>>>>>>>         Graph g = new SparqlGraph("http://dbpedia.org/sparql");
>>>>>>>>>>         Iterator<Triple> iter = g.filter(
>>>>>>>>>>                 new Iri("http://dbpedia.org/ontology/Planet"),
>>>>>>>>>>                 new Iri("http://www.w3.org/1999/02/22-rdf-syntax-ns#type"),
>>>>>>>>>>                 null);
>>>>>>>>>>         while (iter.hasNext()) {
>>>>>>>>>>             System.out.println(iter.next().getObject());
>>>>>>>>>>         }
>>>>>>>>>>     }
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> I think with Stian's version using streams the above could be
>>>>>>>>>> shorter and nicer. But the important part is that the above
>>>>>>>>>> allows using DBpedia as a graph without worrying about SPARQL.
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Reto
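One possible shape for that stream-based version (the stream(s, p, o)
method is hypothetical here; only filter() exists in the example above):

Graph g = new SparqlGraph("http://dbpedia.org/sparql");
g.stream(new Iri("http://dbpedia.org/ontology/Planet"),
         new Iri("http://www.w3.org/1999/02/22-rdf-syntax-ns#type"),
         null)
 .map(Triple::getObject)        // assumes Stream<Triple> as the return type
 .forEach(System.out::println);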
>>>>>>>>>> On Fri, Mar 20, 2015 at 4:16 PM, Stian Soiland-Reyes <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> I think a query interface as you say is orthogonal to Reto's
>>>>>>>>>>> impl.sparql module - which is trying to be an implementation
>>>>>>>>>>> of RDF Commons that is backed only by a remote SPARQL
>>>>>>>>>>> endpoint. Thus it touches on important edges like streaming
>>>>>>>>>>> and blank node identities.
>>>>>>>>>>>
>>>>>>>>>>> It's not a SPARQL endpoint backed by RDF Commons! :-)
>>>>>>>>>>>
>>>>>>>>>>> On 20 March 2015 at 10:58, Sergio Fernández <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Reto,
>>>>>>>>>>>>
>>>>>>>>>>>> yes, that was a deliberate decision in the early phases. I'd
>>>>>>>>>>>> need to look it up, I do not remember the concrete issue.
>>>>>>>>>>>>
>>>>>>>>>>>> Just going a bit deeper into the topic: in querying we are
>>>>>>>>>>>> talking not only about providing native support to query
>>>>>>>>>>>> Graph instances, but also about providing common interfaces
>>>>>>>>>>>> to interact with the results.
>>>>>>>>>>>>
>>>>>>>>>>>> The idea was to keep the focus on RDF 1.1 concepts before
>>>>>>>>>>>> moving to query. Personally I'd prefer to keep that scope for
>>>>>>>>>>>> the first incubator release, and then start to open
>>>>>>>>>>>> discussions about such kinds of threads. But of course we can
>>>>>>>>>>>> vote to change that approach.
>>>>>>>>>>>>
>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>
>>>>>>>>>>>> On 17/03/15 11:05, Reto Gmür wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Sergio,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm not sure which deliberate decision you are referring to,
>>>>>>>>>>>>> is it issue #35 on Github?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Anyway, the impl.sparql code is not about extending the API
>>>>>>>>>>>>> to allow running queries on a graph; in fact the API isn't
>>>>>>>>>>>>> extended at all. It's an implementation of the API which is
>>>>>>>>>>>>> backed by a SPARQL endpoint. Very often the triple store
>>>>>>>>>>>>> doesn't run in the same VM as the client, and so it is
>>>>>>>>>>>>> necessary that implementations of the API speak to a remote
>>>>>>>>>>>>> triple store. This can use proprietary protocols or standard
>>>>>>>>>>>>> SPARQL; this is an implementation for SPARQL and can thus be
>>>>>>>>>>>>> used against any SPARQL endpoint.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>> Reto
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Mar 17, 2015 at 7:41 AM, Sergio Fernández <[email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Reto,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> thanks for updating us with the status from Clerezza.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In the current Commons RDF API we deliberately skipped
>>>>>>>>>>>>>> querying for the early versions.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Although I'd prefer to keep this approach in the initial
>>>>>>>>>>>>>> steps at ASF (I hope we can import the code soon...),
>>>>>>>>>>>>>> that's for sure one of the next points to discuss in the
>>>>>>>>>>>>>> project, where all that experience is valuable.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 16/03/15 13:02, Reto Gmür wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> With the new repository the Clerezza RDF commons
>>>>>>>>>>>>>>> previously in the commons sandbox are now at:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> https://git-wip-us.apache.org/repos/asf/clerezza-rdf-core.git
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I will compare that code with the current status of the
>>>>>>>>>>>>>>> code in the incubating rdf-commons project in a later mail.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Now I would like to bring to your attention a big step
>>>>>>>>>>>>>>> forward towards CLEREZZA-856. The impl.sparql module
>>>>>>>>>>>>>>> provides an implementation of the API on top of a SPARQL
>>>>>>>>>>>>>>> endpoint. Currently it only supports read access. For
>>>>>>>>>>>>>>> usage examples see the tests in
>>>>>>>>>>>>>>> /src/test/java/org/apache/commons/rdf/impl/sparql
>>>>>>>>>>>>>>> (https://git-wip-us.apache.org/repos/asf?p=clerezza-rdf-core.git;a=tree;f=impl.sparql/src/test/java/org/apache/commons/rdf/impl/sparql;h=cb9c98bcf427452392e74cd162c08ab308359c13;hb=HEAD)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The hard part was supporting BlankNodes. The current
>>>>>>>>>>>>>>> implementation handles them correctly even in tricky
>>>>>>>>>>>>>>> situations; however, the current code is not yet optimized
>>>>>>>>>>>>>>> for performance. As soon as BlankNodes are involved, many
>>>>>>>>>>>>>>> queries have to be sent to the backend. I'm sure some
>>>>>>>>>>>>>>> SPARQL wizard could help make things more efficient.
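The core of such a façade is mapping each filter(subject, predicate,
object) call onto a SPARQL query. A deliberately simplified sketch of that
translation (not the actual impl.sparql code, which also has to handle
literals and the tricky blank node cases mentioned above):

final class FilterToSparql {
    // null in a position means "unbound", i.e. it becomes a SPARQL variable.
    static String toQuery(String subjectIri, String predicateIri, String objectIri) {
        String s = subjectIri == null ? "?s" : "<" + subjectIri + ">";
        String p = predicateIri == null ? "?p" : "<" + predicateIri + ">";
        String o = objectIri == null ? "?o" : "<" + objectIri + ">";
        return "SELECT * WHERE { " + s + " " + p + " " + o + " }";
    }

    public static void main(String[] args) {
        // e.g. the filter call from the earlier Main example:
        System.out.println(toQuery("http://dbpedia.org/ontology/Planet",
                "http://www.w3.org/1999/02/22-rdf-syntax-ns#type", null));
        // -> SELECT * WHERE { <http://dbpedia.org/ontology/Planet>
        //      <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?o }
    }
}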
>>>>>>>>>>>>>>> Since SPARQL is the only standardized method to query RDF
>>>>>>>>>>>>>>> data, I think being able to façade an RDF Graph accessible
>>>>>>>>>>>>>>> via SPARQL is an important use case for an RDF API, so it
>>>>>>>>>>>>>>> would be good to also have a SPARQL-backed implementation
>>>>>>>>>>>>>>> of the API proposal in the incubating commons-rdf
>>>>>>>>>>>>>>> repository.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>> Reto
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Sergio Fernández
>>>>>>>>> Partner Technology Manager
>>>>>>>>> Redlink GmbH
>>>>>>>>> m: +43 660 2747 925
>>>>>>>>> e: [email protected]
>>>>>>>>> w: http://redlink.co
>>>>>
>>>>> --
>>>>> Stian Soiland-Reyes
>>>>> Apache Taverna (incubating), Apache Commons RDF (incubating)
>>>>> http://orcid.org/0000-0001-9842-9718
