[snip]

>> No. It is very much like the `java.util.Collections.sort` method [1]:
>> it was written/compiled only once.
>
> That wasn't quite what I had in mind - binary compatibility allows moving
> objects between systems without copy if the receiving system does not
> wish/need to copy. It's a choice of the receiver.
Oh, I thought "binary compatibility" just meant "no need to recompile" ;-)

> Injecting RDF<...> makes an algorithm independent of one base provider.

Exactly. (but I am not sure about the use of the term "injecting" here)

> If you want to work with two base providers, for example, code that does
> system A to system B copy, you need RDF<A> and RDF<B> injected. This is
> exactly your first point - common choices enable neutrality of containers.
> The "simple" commonsRDF, working on the interface objects, can work with a
> mixture of origins.

Why would you want to work with two implementations at the same time? Unless
you explicitly want to go from one to another, of course.

> RDF<A> and RDF<B> are not compatible in the sense that a triple from one
> can't be put into a graph of the other; it needs unpacking from RDF<A> to
> the fundamentals (string for IRI etc) and repacking as RDF<B>.

Yes, it is the same thing as working with `com.hp.hpl.jena.graph.Triple` and
`org.openrdf.model.Statement`. Does one need to go from one to another?

> (You can at least have a single converter lib,

Among other things, I want to avoid converters.

> because there are fundamental
> base units (Strings in various uses) but (Java-ism) the converter needs to
> be called, there being no implicit definitions.)

Sorry, I didn't get that.

> Both styles have their uses.

Yes, I know. banana-rdf has been providing an RDF abstraction for 4 years
now, one that accommodates 5 implementations (Jena, Sesame, banana-plantain,
jsonld.js, N3.js), and we have never felt the need for interfaces à la Java.
So I am genuinely trying to understand where and why people need those
interfaces, and how they will use them _in practice_ wrt the underlying
implementations.

Alexandre

>
> Andy
>
>>
>> [1]
>> http://docs.oracle.com/javase/8/docs/api/java/util/Collections.html#sort-java.util.List-java.util.Comparator-
>>
>> In practice, three things would likely happen:
>>
>> 1. Jena, Sesame, banana-rdf, etc. would have to provide an
>> implementation of `RDF<...>` so that their implementations can be used
>> with any system using the `RDF<...>` approach
>> 2. libraries that want to be abstract over the underlying RDF system
>> they work with (e.g. a Turtle parser/writer, a SPARQL client, etc.)
>> would have to be parametrized by `Graph`, `Triple`, etc.
>> 3. but libraries from 2. would likely offer modules of their APIs
>> already instantiated for the main RDF libraries (Jena, Sesame, etc.)
>> so that they are ready to use with such systems
>>
>> Best,
>> Alexandre
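(To make the quoted points concrete, here is a rough Java sketch of what
"an algorithm parametrized by `RDF<...>`" could look like, in the spirit of
`Collections.sort` taking a `Comparator`. Every name below is made up for
illustration; this is neither the commons-rdf API nor banana-rdf's.)

```java
import java.util.Collections;

// Hypothetical "typeclass-style" interface: code written against it is
// compiled once and knows nothing about Jena, Sesame, or any other provider.
interface RDF<Node, Triple, Graph> {
    Node iri(String iriString);
    Triple triple(Node subject, Node predicate, Node object);
    Graph graph(Iterable<Triple> triples);
}

final class RdfAlgorithms {

    // Analogous to Collections.sort(List<T>, Comparator<T>): one compiled
    // method, with the RDF<...> instance "injected" as a plain argument.
    static <N, T, G> G tinyGraph(RDF<N, T, G> rdf) {
        N alice = rdf.iri("http://example.org/alice");
        N knows = rdf.iri("http://xmlns.com/foaf/0.1/knows");
        N bob   = rdf.iri("http://example.org/bob");
        return rdf.graph(Collections.singletonList(rdf.triple(alice, knows, bob)));
    }
}
```

Under that reading, point 1 above just means "Jena, Sesame, etc. each ship an
object implementing `RDF<TheirNode, TheirTriple, TheirGraph>`", and point 3
means pre-applying such an object so that end users never see the type
parameters at all.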
>>
>>> Andy
>>>
>>> On 07/05/15 20:16, Alexandre Bertails wrote:
>>>>
>>>> On Thu, May 7, 2015 at 11:18 AM, Alexandre Bertails
>>>> <[email protected]> wrote:
>>>>>
>>>>> Hi Stian,
>>>>>
>>>>> tldr: [1] https://gist.github.com/betehess/8983dbff2c3e89f9dadb
>>>>
>>>> I updated the gist with three examples at the end: CommonsRDF,
>>>> StringRDF, PlantainRDF. Note how the types do not have to relate to
>>>> the current interfaces, but they can if you want/need to.
>>>>
>>>> Alexandre
>>>>
>>>>> On Wed, May 6, 2015 at 4:24 PM, Stian Soiland-Reyes <[email protected]>
>>>>> wrote:
>>>>>>
>>>>>> On 6 May 2015 at 05:58, Alexandre Bertails <[email protected]> wrote:
>>>>>>>
>>>>>>> I haven't followed the development in a long time, especially after
>>>>>>> the move to Apache. I just looked at it and I had a few remarks and
>>>>>>> questions.
>>>>>>
>>>>>> Hi, thanks for joining us! Let's hope we haven't scared you away while
>>>>>> we try to get our first Apache release out and have the odd Blank Node
>>>>>> fight.. :)
>>>>>
>>>>> Wait, there was a Blank Node fight and I wasn't part of it?
>>>>>
>>>>>>> Just some background for those who don't already know me: I am part of
>>>>>>> the banana-rdf project [1]. The main approach there relies on defining
>>>>>>> RDF and its operations as a typeclass [2]. In that world, Jena and
>>>>>>> Sesame are just two instances of that typeclass (see for example [4]
>>>>>>> and [5]). So there is no wrapper involved. Still, commons-rdf is
>>>>>>> really a good step in the right direction as we could obsolete a lot
>>>>>>> of stuff.
>>>>>>
>>>>>> I can definitely see that banana-rdf is relevant to commons-rdf - and
>>>>>> also any requirements you might have for commons-rdf coming from Scala
>>>>>> are interesting.
>>>>>
>>>>> That's cool if we can accommodate commons-rdf to make it really
>>>>> useful from Scala-land. I think it is possible [1].
>>>>>
>>>>>>> Right now, there is no support in commons-rdf for immutable
>>>>>>> operations. `Graph`s are mutable by default. Is there any plan to make
>>>>>>> an API for immutable graphs? Graphs in banana-rdf are immutable by
>>>>>>> default, and they are persistent in Plantain. We could always wrap an
>>>>>>> immutable graph in a structure with a `var`, but, well, there are
>>>>>>> better ways to do that.
>>>>>>
>>>>>> There have been suggestions along those lines. It is not a requirement
>>>>>> of Graph now to allow .add() etc. - but there is no method to ask if a
>>>>>> graph is mutable or not.
>>>>>>
>>>>>> In the user guide
>>>>>> http://commonsrdf.incubator.apache.org/userguide.html#Adding_triples
>>>>>> we therefore say:
>>>>>>
>>>>>>> Note: Some Graph implementations are immutable, in which case the below
>>>>>>> may throw an UnsupportedOperationException.
>>>>>>
>>>>>> We could probably add this to the Javadoc of the mutability methods of
>>>>>> Graph with an explicit @throws.
>>>>>>
>>>>>> I raised this as https://issues.apache.org/jira/browse/COMMONSRDF-23
>>>>>>
>>>>>> https://issues.apache.org/jira/browse/COMMONSRDF-7 discusses how we
>>>>>> should define immutability on the non-Graph objects.
>>>>>>
>>>>>> In Clerezza's Commons RDF Core (which is somewhat aligned with Commons
>>>>>> RDF) there is an additional marker interface ImmutableGraph -- perhaps
>>>>>> something along those lines would work here?
>>>>>
>>>>> If it doesn't exist in the type (i.e. statically), it's basically lost
>>>>> knowledge. @throws, whether silent or explicit, is pretty much useless
>>>>> because now client code needs to check for this possibility. It would
>>>>> be slightly better to make it an interface. And we could have another
>>>>> interface for immutable graphs, where `add(Triple)` would return
>>>>> another graph.
>>>>>
>>>>> See how it can be done in [1].
>>>>>
>>>>>> https://github.com/apache/clerezza-rdf-core/blob/master/api/src/main/java/org/apache/clerezza/commons/rdf/ImmutableGraph.java
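(For illustration, a minimal sketch of what such an interface for immutable
graphs could look like - the names are invented here; this is neither the
Clerezza ImmutableGraph nor anything currently in commons-rdf:)

```java
// Illustrative only: a persistent-style graph whose add/remove never mutate
// the receiver but return a new graph value, so immutability is expressed in
// the type instead of being signalled by UnsupportedOperationException.
interface PersistentGraph<Triple> {

    // Returns a new graph containing this graph's triples plus the given one;
    // the receiver is left untouched.
    PersistentGraph<Triple> add(Triple triple);

    // Returns a new graph without the given triple.
    PersistentGraph<Triple> remove(Triple triple);

    boolean contains(Triple triple);

    long size();
}
```

Client code then keeps the returned value (`graph = graph.add(triple);`) and
can never accidentally rely on in-place mutation.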
>>>>>>
>>>>>>> `RDFTermFactory` is stateful just to accommodate
>>>>>>> `createBlankNode(String)`. It's stateless otherwise. This is really an
>>>>>>> issue for banana-rdf as everything is defined as a pure function (the
>>>>>>> output only depends on the input).
>>>>>>
>>>>>> It does not need to be stateful.
>>>>>>
>>>>>> In simple we implemented this using a final UUID "salt" that is
>>>>>> created per instance of the factory. Do you consider this state?
>>>>>>
>>>>>> https://github.com/apache/incubator-commonsrdf/blob/master/simple/src/main/java/org/apache/commons/rdf/simple/SimpleRDFTermFactory.java#L51
>>>>>>
>>>>>> This is then used by
>>>>>> https://github.com/apache/incubator-commonsrdf/blob/master/simple/src/main/java/org/apache/commons/rdf/simple/BlankNodeImpl.java#L41
>>>>>> as part of a hashing to generate the new uniqueReference(). Thus a
>>>>>> second call on the same factory with the same name will be hashed with
>>>>>> the same salt, and produce the same uniqueReference(), which makes the
>>>>>> second BlankNode equal to the first.
>>>>>>
>>>>>> But you can achieve the contract by other non-stateful means, for
>>>>>> instance a random UUID that is static final (and hence no state at all
>>>>>> per factory instance), and you can create a uniqueReference() by
>>>>>> concatenating that UUID with the System.identityHashCode() of the
>>>>>> factory and the provided name.
>>>>>
>>>>> This approach only gives you the illusion that there is no state, but
>>>>> there *is* one (e.g. with UUID, and the atomic counter). Because of its
>>>>> current contract, `createBlankNode(String)` cannot be referentially
>>>>> transparent, and this is an issue if one wants to take a functional
>>>>> approach.
>>>>>
>>>>>> Also you are not required to implement createBlankNode(String) - you
>>>>>> can simply throw UnsupportedOperationException and only support
>>>>>> createBlankNode().
>>>>>
>>>>> What is the point of doing/allowing that? As a user or implementor,
>>>>> I want to know that I can rely on a method/function that is accessible.
>>>>> And that is also why I dislike the default implementation approach
>>>>> taken in the current draft.
>>>>>
>>>>>> This should probably be noted in the (yet so far just imagined)
>>>>>> Implementors Guide on the Commons RDF website.
>>>>>>
>>>>>> https://issues.apache.org/jira/browse/COMMONSRDF-24
>>>>>>
>>>>>>> Is `createBlankNode(String)` really needed? The internal map for
>>>>>>> bnodes could be maintained _outside_ of the factory. Or at least, we
>>>>>>> could pass it as an argument instead: `createBlankNode(Map<String,
>>>>>>> BlankNode>, String)`.
>>>>>>
>>>>>> We did discuss whether it was needed - there are arguments against it
>>>>>> due to blank nodes "existing only as themselves", and therefore a
>>>>>> single JVM object per BlankNode should be enough - however, having the
>>>>>> flexibility for, say, a streaming RDF parser to create identical blank
>>>>>> node instances without keeping lots of object references felt like a
>>>>>> compelling argument to support this through the factory - and with,
>>>>>> for example, the hashing method above, this means no state is required.
>>>>>
>>>>> This is making a *very strong structural assumption*. Being
>>>>> referentially transparent makes none, and will always accommodate all
>>>>> cases. That being said, I understand the constraints.
>>>>>
>>>>> Note: [1] does not attempt to fix that issue.
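(A rough sketch of the salt-and-hash idea described above, under assumed
names - this is not the actual SimpleRDFTermFactory/BlankNodeImpl code, just
an illustration; whether the fixed salt and the factory's identity still
count as hidden state is exactly the point being debated:)

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

// Illustration only: the blank node reference is computed from a class-wide
// salt, the identity of the factory instance, and the caller-supplied name.
// The same factory and the same name always yield the same reference, with
// no map of previously created blank nodes kept anywhere.
final class SaltedBlankNodeFactory {

    // Fixed at class-load time; no mutable field is ever updated.
    private static final UUID SALT =
            UUID.fromString("6ba7b810-9dad-11d1-80b4-00c04fd430c8"); // any fixed value

    String uniqueReference(String name) {
        String key = SALT + ":" + System.identityHashCode(this) + ":" + name;
        return UUID.nameUUIDFromBytes(key.getBytes(StandardCharsets.UTF_8))
                   .toString();
    }
}
```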
>>>>>>> # wrapped values
>>>>>>>
>>>>>>> There are a lot of unnecessary objects because of the class hierarchy.
>>>>>>> In banana-rdf, we can say that RDFTerm is a plain `String` while being
>>>>>>> 100% type-safe. That's already what's happening for the N3.js
>>>>>>> implementation. And in Plantain, Literals are just `java.lang.Object`
>>>>>>> [6] so that we can directly have String, Int, etc.
>>>>>>
>>>>>> Well, this is tricky in Java where you can't force a new interface
>>>>>> onto an existing type.
>>>>>>
>>>>>> How can you have a String as an RDFTerm? Because you use the
>>>>>> ntriplestring? This would require new "instanceOf"-like methods to
>>>>>> check what the string really is - and would require various classes
>>>>>> like LiteralInspector to dissect the string. This sounds to me like
>>>>>> building a different API..
>>>>>
>>>>> The point is to abstract things away. By choosing to use actual
>>>>> interfaces, you are forcing everything to be under a class hierarchy
>>>>> for no good reason. I do not find the motivation for this approach in
>>>>> the documentation.
>>>>>
>>>>> Please see [1] for a discussion on that subject.
>>>>>
>>>>>> While I can see this can be a valid way to model RDF in a non-OO way,
>>>>>> I think it would be difficult to align with Commons RDF as a
>>>>>> Java-focused API, where most Java programmers would expect type
>>>>>> hierarchies represented as regular Java class hierarchies.
>>>>>
>>>>> I am not convinced. The RDF model is simple enough that another
>>>>> approach is possible [1].
>>>>>
>>>>>>> That means that there is currently no way to provide a
>>>>>>> `RDFTermFactory` for Plantain. The only alternatives I see right now
>>>>>>> are:
>>>>>>
>>>>>> What is the challenge of returning wrappers? I think this is the
>>>>>> approach that Jena is also considering.
>>>>>>
>>>>>> Presumably if you are providing an RDFTermFactory then that is to
>>>>>> allow JVM code that expects any Commons RDF code to create Plantain
>>>>>> objects for RDF. They would expect to be able to do, say:
>>>>>>
>>>>>> factory.createLiteral("Fred").getDatatype()
>>>>>>
>>>>>> which would not work on a returned String
>>>>>
>>>>> You can (of course) do things like that in Scala, in a very typesafe
>>>>> way, i.e. it's not monkey patching. And this is happening in
>>>>> banana-rdf :-)
>>>>>
>>>>> That would be totally compatible with [1].
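(To illustrate the "RDFTerm as a plain String, still type-safe" point, here
is a made-up Java approximation in the spirit of the StringRDF example
mentioned for the gist [1]. The `RDF<...>` interface and the class below are
invented here for illustration - not the gist's actual contents and not the
commons-rdf API:)

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical parametrized interface (same idea as the earlier sketch).
interface RDF<Node, Triple, Graph> {
    Node iri(String iriString);
    Triple triple(Node s, Node p, Node o);
    Graph graph(Iterable<Triple> triples);
}

// One possible instantiation where an RDF term is literally a String (its
// N-Triples form) and a triple is a String[] of length 3. Code written
// against RDF<N, T, G> stays fully type-safe, and no wrapper object is ever
// allocated around the strings.
final class StringRDF implements RDF<String, String[], Set<String[]>> {

    @Override
    public String iri(String iriString) {
        return "<" + iriString + ">";
    }

    @Override
    public String[] triple(String s, String p, String o) {
        return new String[] { s, p, o };
    }

    @Override
    public Set<String[]> graph(Iterable<String[]> triples) {
        Set<String[]> result = new HashSet<>();
        triples.forEach(result::add);
        return Collections.unmodifiableSet(result);
    }
}
```

A generic algorithm written once against `RDF<N, T, G>` would then return a
`Set<String[]>` when handed a `StringRDF`, and a provider-native graph when
handed a hypothetical `JenaRDF` or `SesameRDF`, with no conversion layer in
between.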
>>>>>>> # getTriples vs iterate
>>>>>>>
>>>>>>> Not a big deal but I believe the naming could be improved. When I read
>>>>>>> getTriples, I expect to have all the triples in my hand, but this is
>>>>>>> not quite what Streams are about. On the other hand, when I read
>>>>>>> iterate, I kinda expect the opposite. Of course the types clarify
>>>>>>> everything but I believe it'd be easier to use getTriplesAsStream and
>>>>>>> getTriplesAsIterable.
>>>>>>
>>>>>> I might disagree - but I think this is valuable to discuss.
>>>>>
>>>>> As I said, not a big deal. Types are what really matter in the end.
>>>>>
>>>>>> I have taken the liberty to report this in your name as:
>>>>>> https://issues.apache.org/jira/browse/COMMONSRDF-22
>>>>>>
>>>>>> so we can discuss this further in the email thread that it should have
>>>>>> triggered.
>>>>>>
>>>>>> Thanks for all your valuable suggestions - please do keep in touch and
>>>>>> let us know if you have a go at aligning banana-rdf with the upcoming
>>>>>> 0.1.0 release, further feedback on documentation, and anything else!
>>>>>
>>>>> I believe that the main issue is that the current approach targets both
>>>>> library users _and_ library authors, but there really are two different
>>>>> targets here. I agree that the classes for the RDF model can provide a
>>>>> common framework for many Java people. But libraries relying on
>>>>> commons-rdf should not be tied to the classes.
>>>>>
>>>>> Please have a look at this gist to see what I mean and tell me what
>>>>> you think [1].
>>>>>
>>>>> Alexandre
>>>>>
>>>>> [1] https://gist.github.com/betehess/8983dbff2c3e89f9dadb
>>>>>
>>>>>> --
>>>>>> Stian Soiland-Reyes
>>>>>> Apache Taverna (incubating), Apache Commons RDF (incubating)
>>>>>> http://orcid.org/0000-0001-9842-9718
