Re: General questions

Alexandre Bertails Tue, 07 Jul 2015 20:40:41 -0700

Hi Stian,

On Tue, Jul 7, 2015 at 3:38 AM, Stian Soiland-Reyes <[email protected]> wrote:
> Wow, thank you for a great and solid blog post!  This helps a lot for
> my understanding of your view and of
> https://github.com/betehess/free-rdf


Thanks :-)

From now on, when discussing the two approaches, I will call the
current approach in Commons RDF the "dynamic approach" while I'll call
the `RDF<...>` approach the "static approach". Because that's really
what's happening.

> I think you are proposing a radical approach that takes us out of the
> classical OO world into a more functional (and in many ways more
> beautiful) approach. I am happy to play with it a bit to see where
> this goes. I am particularly interested to have a go with this for my
> dodgy Clojure approach - [1] - and I like your points on immutability.

I would claim that the static approach is still OO :-) It's just that
people often think that OO is (only) about inheritance. You can blame
Java for that.
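
To make the distinction concrete, here is a minimal sketch using the JDK's own `java.util.Comparable` and `java.util.Comparator`. The class names (`Version`, `PlainVersion`, `BY_MAJOR`) are made up for illustration and are not taken from Commons RDF or free-rdf. `Comparable` puts the behavior on the type through inheritance, as in the dynamic approach, while `Comparator` keeps the behavior in a separate object that is passed around, as in the static approach.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class ComparableVsComparator {

    // "Dynamic" style: the type carries the ordering itself, via inheritance.
    static final class Version implements Comparable<Version> {
        final int major;
        Version(int major) { this.major = major; }
        @Override public int compareTo(Version other) {
            return Integer.compare(major, other.major);
        }
    }

    // "Static" style: the type knows nothing about ordering...
    static final class PlainVersion {
        final int major;
        PlainVersion(int major) { this.major = major; }
    }

    // ...and the ordering lives in a separate object that is handed to the caller.
    static final Comparator<PlainVersion> BY_MAJOR = Comparator.comparingInt(v -> v.major);

    static int smallestDynamic() {
        List<Version> versions =
            new ArrayList<>(Arrays.asList(new Version(3), new Version(1), new Version(2)));
        Collections.sort(versions); // relies on Version.compareTo
        return versions.get(0).major;
    }

    static int smallestStatic() {
        List<PlainVersion> versions =
            new ArrayList<>(Arrays.asList(new PlainVersion(3), new PlainVersion(1), new PlainVersion(2)));
        versions.sort(BY_MAJOR); // relies on the comparator passed in
        return versions.get(0).major;
    }

    public static void main(String[] args) {
        System.out.println(smallestDynamic());
        System.out.println(smallestStatic());
    }
}
```

Both styles are object-oriented. The second one simply does not require `PlainVersion` to inherit from anything.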

> For API documentation I am slightly worried about how much harder it
> would be to understand a bunch of methods grouped together, as opposed
> to split per interface. More inter-linking between the methods would
> probably be needed.

One still has to read the entire API. Can you explain your idea re:
"More inter-linking between the methods"?

> I am however also concerned that you are adding a compile-time binding
> to the implementations,

YES!!!! That's why it works.

> which could mean clients no longer would be
> able to dynamically use any RDF implementation.

That is not the case. Keep reading ;-)

> I think your approach
> would only work for clients if they do this passing of the "RDF"
> instance throughout - or if they based it on the 'concrete' interfaces
> as you propose.

Actually, the static approach is _still_ compatible with the dynamic approach.

For example, Apache Any23 could be written/implemented in terms of
`RDF<...>` while exposing one instance of their API using the `Graph`,
`Triple`, etc. defined in Commons RDF. Users would just have to use
it, without ever having to provide the `RDF<...>` instance or the
abstract types themselves.

It could also do the same for Jena, Sesame, banana-rdf, etc. This
plumbing has to be written only once, either by the Apache Any23
folks, or by any user having an instance of `RDF<...>` in their hands.
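
As a rough sketch of that plumbing, assuming a drastically simplified `RDF` interface with two abstract types instead of seven: every name below (`TitleExtractor`, `LIST_RDF`, `extractTitle`, the method names on `RDF`) is hypothetical and does not come from Any23, Commons RDF or free-rdf. The library code is written once against the abstraction, and a pre-bound entry point hides both the type parameters and the `RDF` instance from the end user.

```java
import java.util.ArrayList;
import java.util.List;

public class StaticFacadeSketch {

    // Simplified, hypothetical stand-in for `RDF<...>`.
    interface RDF<Graph, Triple> {
        Graph emptyGraph();
        Triple createTriple(String subject, String predicate, String object);
        Graph add(Graph graph, Triple triple); // returns the resulting graph
        int size(Graph graph);
    }

    // Library code (think "an extractor"), written once against the abstraction.
    static final class TitleExtractor<Graph, Triple> {
        private final RDF<Graph, Triple> rdf;

        TitleExtractor(RDF<Graph, Triple> rdf) {
            this.rdf = rdf;
        }

        Graph extract(String documentIri, String title) {
            Triple t = rdf.createTriple(documentIri, "http://purl.org/dc/terms/title", title);
            return rdf.add(rdf.emptyGraph(), t);
        }
    }

    // The plumbing for one toy implementation: a graph is a list of String[3].
    static final RDF<List<String[]>, String[]> LIST_RDF = new RDF<List<String[]>, String[]>() {
        @Override public List<String[]> emptyGraph() {
            return new ArrayList<>();
        }
        @Override public String[] createTriple(String s, String p, String o) {
            return new String[] {s, p, o};
        }
        @Override public List<String[]> add(List<String[]> graph, String[] triple) {
            List<String[]> copy = new ArrayList<>(graph); // leave the input graph untouched
            copy.add(triple);
            return copy;
        }
        @Override public int size(List<String[]> graph) {
            return graph.size();
        }
    };

    // What an end user sees: no type parameters, no RDF instance to pass.
    static List<String[]> extractTitle(String documentIri, String title) {
        return new TitleExtractor<>(LIST_RDF).extract(documentIri, title);
    }

    public static void main(String[] args) {
        List<String[]> graph = extractTitle("http://example.com/doc", "Hello");
        System.out.println(LIST_RDF.size(graph));
    }
}
```

Supporting another implementation would mean writing another instance like `LIST_RDF` and another one-line entry point, while `TitleExtractor` stays unchanged.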

> In a way your approach can be fitted into our current approach as an
> optional higher-level abstraction.

I would actually hope the emphasis to be on the static approach so
that library authors would start with it.

It is _always_ possible to plug the interface approach in a system
using the static approach.
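
A minimal sketch of why that is, again with a simplified, hypothetical `RDF` interface that has a single abstract type. The method names `createIRI` and `getIRIString` follow the ones used elsewhere in this thread; the nested `IRI` interface is only a stand-in for an interface-based type and is not the real Commons RDF one. Binding the abstract type to an interface gives the interface approach, and binding it to a plain `String` gives an implementation with no wrapper objects at all.

```java
public class PlugInterfacesSketch {

    // Simplified, hypothetical stand-in for `RDF<...>` with one abstract type I.
    interface RDF<I> {
        I createIRI(String iriString);
        String getIRIString(I iri);
    }

    // Stand-in for an interface-based IRI, in the style of the dynamic approach.
    interface IRI {
        String getIRIString();
    }

    // The interface approach plugged into the static one: the abstract type is
    // bound to the interface and each function delegates to the matching method.
    static final RDF<IRI> INTERFACE_RDF = new RDF<IRI>() {
        @Override public IRI createIRI(String iriString) {
            return () -> iriString; // IRI has a single method, so a lambda is enough
        }
        @Override public String getIRIString(IRI iri) {
            return iri.getIRIString();
        }
    };

    // The same abstraction bound to java.lang.String, with no wrapper object.
    static final RDF<String> STRING_RDF = new RDF<String>() {
        @Override public String createIRI(String iriString) {
            return iriString;
        }
        @Override public String getIRIString(String iri) {
            return iri;
        }
    };

    // Code written against RDF<I> works unchanged with either binding.
    static <I> String roundTrip(RDF<I> rdf, String iriString) {
        return rdf.getIRIString(rdf.createIRI(iriString));
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(INTERFACE_RDF, "http://example.com"));
        System.out.println(roundTrip(STRING_RDF, "http://example.com"));
    }
}
```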

> It also means that you could fit,
> say, Jena into Commons-RDF this way without any object wrappers - just
> a single functional wrapper.  We could even have generic reverse
> wrapper classes that call the RDF methods per interface.
>
>
> Applications who are not sure which RDF implementation to use, e.g.
> for performance reasons, can with Commons RDF swap implementations
> with no or little changes - with your approach this is not very
> different as it would just mean to pass a different RDF instance in.
>
> But I also saw a big value of Commons RDF being for other libraries
> who just need to do a bit of RDF on the side, so that they can be
> pluggable for alternative RDF frameworks  without forcing an
> implementation choice on their clients. I am not sure if this would
> work with a compiled library - as you say the generics info is
> basically lost on compile in Java.

Again, the interface approach _can_ still be useful. For me, its main
interest is that the APIs in Jena and Sesame will be unified, easier
to learn, and very close to the spec. That's a huge win for everybody.
But I am not sure of the other benefits.

> So I would want to see more on how clients would use this - clients who are not
> implementing RDF in any way, but just want to make some statements or
> manipulate some graphs - and see if they would still have the benefit
> of Commons RDF without too much buy-in or compile-time options.

As I said, I believe that library authors would probably do the
plumbing themselves most of the time, in the interest of the users.
And if it happens that the library X does not already provide a module
using your favorite RDF implementation, well, then it's just a PR away
;-)

Best,
Alexandre

>
>
>
> [1] https://github.com/stain/commons-rdf-clj/tree/master/src/commons_rdf_clj
>
> On 17 June 2015 at 16:47, Alexandre Bertails <[email protected]> wrote:
>> FYI http://bertails.org/2015/06/17/an-rdf-abstraction-for-the-jvm/
>>
>> Sorry for the long delay.
>>
>> Best,
>> Alexandre
>>
>> On Sun, May 31, 2015 at 4:01 AM, Andy Seaborne <[email protected]> wrote:
>>> On 14/05/15 15:58, Alexandre Bertails wrote:
>>>>
>>>> Thank you Andy, those are the questions that must be answered before
>>>> significant code is being written.
>>>>
>>>> On Thu, May 14, 2015 at 3:37 AM, Andy Seaborne <[email protected]> wrote:
>>>>>
>>>>> Alexandre's proposal would take the project in a different direction.
>>>>
>>>>
>>>> Code-wise, this is true, as a lot of existing interfaces would become
>>>> obsolete, so I understand why this is frustrating. But I believe that
>>>> the proposal is better aligned with a larger goal of interop.
>>>
>>>
>>> Could you expand on that because it looks like a different interoperability,
>>> one where code (algorithms) can be easily ported between systems but it's by
>>> recompiling with different choices for RDF<....> whereas factory injection
>>> means the same binary code can work over different systems.
>>>
>>>>> The goals for me have been switching underlying systems (being able to
>>>>> produce portable algorithms that can be applied without recompiling the
>>>>> world, with ServiceLoader to bind to the implementation) and interoperation
>>>>> across systems.
>>>>
>>>>
>>>> The pure Scala parts of banana-rdf cannot interoperate using the
>>>> current framework. And in a larger sense, the current Commons RDF does
>>>> not accommodate a lot of more general use cases (see the 7 use cases I
>>>> listed in a previous email).
>>>>
>>>> For me, the current approach was driven by the existing Jena and
>>>> Sesame, and that's already progress. The real question is: are there
>>>> other goals we want to address? If the answer is "no" then it's fine
>>>> as well, but we need to know it, and the reason.
>>>>
>>>>> Being able to switch implementation choices by choosing different types is
>>>>> interesting but different. It would be nice to see both existing, though
>>>>> combining them into one "thing" seems to overload the focus.
>>>>
>>>>
>>>> As I showed in the code, the two can be combined. It is important that
>>>> people here take the time to understand the big picture and how the
>>>> pieces translate.
>>>>
>>>>> Do we want to "host" the generics approach as well (whether used for the
>>>>> system abstraction work or to go along side)?
>>>>
>>>>
>>>> Very good question. I believe library authors will want their work to
>>>> be usable to more people, not just Jena and Sesame. So if  the
>>>> "generics approach" doesn't happen in Commons RDF, then people with
>>>> interest in better interop will have an incentive to maintain the code
>>>> outside of the project. Especially if they know that there is an easy
>>>> way to communicate with the interfaces from Commons RDF, with no
>>>> runtime cost.
>>>>
>>>>> The other difference is a theory-practice one. The current work is not
>>>>> reworking the general style of Jena, Sesame,
>>>>
>>>>
>>>> The "general style of Jena, Sesame" was not driven by interop. So
>>>> people had different problems to solve and OOP was perfectly fine in
>>>> that case.
>>>>
>>>>> common programming ways of doing things.
>>>>
>>>>
>>>> I simply don't know what "common programming" means. In some cases, I have
>>>> heard people using that term to dismiss other forms of programming.
>>>>
>>>> What I know is that the approach in my proposal is not new at all, and
>>>> has existed in Java-land for a long time. Just look at
>>>> java.util.Comparator vs java.util.Comparable, that is the very same
>>>> discussion, just with a few more types.
>>>>
>>>> First we need to agree (or not) on the goals, then we find a technical
>>>> solution. And again, "not interested" is a perfectly fine answer.
>>>>
>>>>> I'd like to see the generics approach validated by external usage,
>>>>> not for its technical design, but addressing whether it creates sufficient
>>>>> demand and sufficient acceptance.
>>>>
>>>>
>>>> As for the design itself, you can consider it's been incubated in
>>>> banana-rdf for four years.
>>>>
>>>> Alexandre
>>>>
>>>> [1] http://en.wikipedia.org/wiki/Type_class
>>>>
>>>>>
>>>>>          Andy
>>>>>
>>>>>
>>>>> On 13/05/15 07:32, Alexandre Bertails wrote:
>>>>>>
>>>>>>
>>>>>> Sergio,
>>>>>>
>>>>>> The approach is different. A "patch" against the current codebase
>>>>>> would remove most of the interfaces.
>>>>>>
>>>>>> I suggest that you try to understand what's going on in the code,
>>>>>> after you read the other messages in that thread.
>>>>>>
>>>>>> Then if there is interest, I can work on a real patch.
>>>>>>
>>>>>> Alexandre
>>>>>>
>>>>>> On Tue, May 12, 2015 at 11:25 PM, Sergio Fernández <[email protected]>
>>>>>> wrote:
>>>>>>>
>>>>>>>
>>>>>>> I'd say it would be much more valuable to see a patch for your proposal
>>>>>>> than a quick hack from scratch.
>>>>>>> You can fork our github mirror:
>>>>>>> https://github.com/apache/incubator-commonsrdf
>>>>>>>
>>>>>>> On Wed, May 13, 2015 at 8:01 AM, Alexandre Bertails
>>>>>>> <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> On Tue, May 12, 2015 at 10:21 PM, Sergio Fernández <[email protected]>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Alexandre,
>>>>>>>>>
>>>>>>>>> git clone https://[email protected]/repos/asf/incubator-commonsrdf.git commonsrdf
>>>>>>>>>
>>>>>>>>> The incubator prefix in the name is to make clear we're still not fully
>>>>>>>>> endorsed by the ASF. I know it's a bit inconvenient, especially in later
>>>>>>>>> phases when we'd get rid of that, but it is part of the incubator process.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>>
>>>>>>>> I have hacked something quick-and-dirty and made it available at [1].
>>>>>>>>
>>>>>>>> Quick overview of the sub-packages:
>>>>>>>> * `api`: just the RDF interface, and the interfaces from commons-rdf
>>>>>>>> are moved under `concrete`
>>>>>>>> * `concrete`: shows how to implement RDF with the interfaces approach
>>>>>>>> * `simple`: a complete example adapted from commons-rdf
>>>>>>>> * `classless`: an (almost) complete example which does not rely on
>>>>>>>> shared interfaces
>>>>>>>> * `turtle`: an example of how to rely on the RDF interface
>>>>>>>>
>>>>>>>> Feel free to ask questions.
>>>>>>>>
>>>>>>>> Alexandre
>>>>>>>>
>>>>>>>> [1] https://github.com/betehess/free-rdf
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, May 12, 2015 at 6:45 PM, Alexandre Bertails <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Stian,
>>>>>>>>>>
>>>>>>>>>> It sounds stupid but I do not understand where the code actually lives.
>>>>>>>>>>
>>>>>>>>>> I have tried
>>>>>>>>>>
>>>>>>>>>> ```
>>>>>>>>>> git clone https://git-wip-us.apache.org/repos/asf/commons-rdf.git
>>>>>>>>>> ```
>>>>>>>>>>
>>>>>>>>>> and
>>>>>>>>>>
>>>>>>>>>> ```
>>>>>>>>>> git clone git://git.apache.org/commons-rdf.git
>>>>>>>>>> ```
>>>>>>>>>>
>>>>>>>>>> but both tell me that I "appear to have cloned an empty repository."
>>>>>>>>>> The github repo is empty as well.
>>>>>>>>>>
>>>>>>>>>> Can somebody please give me the right URI? Sorry if I missed that in the
>>>>>>>>>> documentation, but I did look there and couldn't find the answer :-/
>>>>>>>>>>
>>>>>>>>>> Alexandre
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, May 12, 2015 at 8:41 AM, Alexandre Bertails
>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Hi Stian,
>>>>>>>>>>>
>>>>>>>>>>> On Tue, May 12, 2015 at 7:35 AM, Stian Soiland-Reyes <[email protected]>
>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 12 May 2015 at 06:20, Alexandre Bertails <[email protected]>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> I actually didn't understand that we were discussing a
>>>>>>>>>>>>> `createBlankNode(UUID)`. I think we just need to be able to create a
>>>>>>>>>>>>> fresh blank node.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> That is what createBlankNode() does.
>>>>>>>>>>>>
>>>>>>>>>>>> Is your proposal to simply remove createBlankNode(String)?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> As it is today, yes. Because its contract implies some kind of shared
>>>>>>>>>>> state.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> But we have identified a use-case where the blank node can remember in
>>>>>>>>>>> which context it was generated e.g. the blank node label at parsing time.
>>>>>>>>>>>
>>>>>>>>>>>>> Requiring the caller to provide an explicit UUID
>>>>>>>>>>>>> means that the freshness is happening *outside* of the factory, so I
>>>>>>>>>>>>> don't see the point.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Well, you wanted to pass in the uniqueness..? You can pass it as a
>>>>>>>>>>>> String (as of today), or, loosely suggested, by restricting this to a
>>>>>>>>>>>> UUID (which would require clients to think about this very common
>>>>>>>>>>>> mapping/hashing).
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> No, the uniqueness must happen in `createBlankNode()`. That's how you
>>>>>>>>>>> can enforce the invariant.
>>>>>>>>>>>
>>>>>>>>>>>>> Also, it's forcing the strategy (UUID), which
>>>>>>>>>>>>> might not be the best one for everybody, e.g. UUID is known to be
>>>>>>>>>>>>> slow, at least for some notion of slow, and that could become a
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> There are several variations of UUID, you are free to use a
>>>>>>>>>>>> timestamp one that is rather fast to make, SHA-1 is not known to be slow
>>>>>>>>>>>> either, so version 5 hashes are also fast.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> commons-rdf should leave that choice open.
>>>>>>>>>>>
>>>>>>>>>>>> But we agreed that UUID only might be a bit strict for some implementations,
>>>>>>>>>>>> which meant that uniqueReference() can return any unique string.. so if it
>>>>>>>>>>>> considered
>>>>>>>>>>>>
>>>>>>>>>>>>     app=97975c0b-62c1-42c9-b2a9-e87948e4a46e ip=84.92.48.26 uid=1000 pid=292 name=fred
>>>>>>>>>>>>
>>>>>>>>>>>> to be a unique string (with hard-coded 97975c0b-62c1-42c9-b2a9-e87948e4a46e
>>>>>>>>>>>> in case someone else comes up with a similar scheme),
>>>>>>>>>>>> and didn't mind leaking all that vulnerability data, then that would be a
>>>>>>>>>>>> compliant uniqueReference().
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> I am not arguing for stateless vs stateful. I am just pointing at some
>>>>>>>>>>>>> design issues which do not allow it. Currently, there is just no way
>>>>>>>>>>>>> for an immutable implementation to be used with such a factory.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I am not sure what is the extent of "immutable" here. I'll assume it
>>>>>>>>>>>> just means that all fields are final, not
>>>>>>>>>>>> that the object is not allowed to have any field at all.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Being final just means that the reference won't be updated, but its
>>>>>>>>>>> state can still be updated. So to be immutable, you also need the
>>>>>>>>>>> final references to be immutable themselves.
>>>>>>>>>>>
>>>>>>>>>>>> You are free to
>>>>>>>>>>>> create RDFTermFactory as you please, so you can simply do it like this:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> public class ImmutableRDFTermFactory implements RDFTermFactory {
>>>>>>>>>>>>       private final UUID salt;
>>>>>>>>>>>>       public ImmutableRDFTermFactory(UUID salt) {
>>>>>>>>>>>>           this.salt = salt;
>>>>>>>>>>>>       }
>>>>>>>>>>>>       public BlankNode createBlankNode() {
>>>>>>>>>>>>         return new BlankNodeImpl(salt);
>>>>>>>>>>>>       }
>>>>>>>>>>>>       public BlankNode createBlankNode(String name) {
>>>>>>>>>>>>         return new BlankNodeImpl(salt, name);
>>>>>>>>>>>>       }
>>>>>>>>>>>>       // ..
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>> public class BlankNodeImpl implements BlankNode {
>>>>>>>>>>>>
>>>>>>>>>>>>     // Builds a reference from the salt, the owner's identity hash,
>>>>>>>>>>>>     // the current time and the current thread id.
>>>>>>>>>>>>     private static String unique(UUID salt, Object owner) {
>>>>>>>>>>>>        Instant now = Clock.systemUTC().instant();
>>>>>>>>>>>>        return salt.toString() + System.identityHashCode(owner) +
>>>>>>>>>>>>            now.getEpochSecond() + now.getNano() +
>>>>>>>>>>>>            Thread.currentThread().getId();
>>>>>>>>>>>>     }
>>>>>>>>>>>>
>>>>>>>>>>>>     private final String uniqueReference;
>>>>>>>>>>>>     public BlankNodeImpl(UUID salt, String name) {
>>>>>>>>>>>>       uniqueReference = salt.toString() + name;
>>>>>>>>>>>>     }
>>>>>>>>>>>>     public BlankNodeImpl(UUID salt) {
>>>>>>>>>>>>       uniqueReference = unique(salt, this);
>>>>>>>>>>>>     }
>>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> This is not immutable because of the shared state.
>>>>>>>>>>>
>>>>>>>>>>>> Here there is no hidden mutability in AtomicLong or within
>>>>>>>>>>>> java.util.UUID's SecureRandom implementation's internal state. I guess
>>>>>>>>>>>> you would not be happy with those either?
>>>>>>>>>>>>
>>>>>>>>>>>> The clock is obviously mutable - but as a device rather than a memory
>>>>>>>>>>>> state.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> There is no "but" in the immutable world :-)
>>>>>>>>>>>
>>>>>>>>>>>>> Having `add` returning a `Graph` does not mean that `Graph` is
>>>>>>>>>>>>> immutable. It just means that it *enables* `Graph` to be
>>>>>>>>>>>>> immutable.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> There is nothing stopping an immutable Graph from having an additional
>>>>>>>>>>>> method that does this.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Now I am the one asking for some code, because I don't see how that'd
>>>>>>>>>>> work :-p
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> As I said in a previous message, you can wrap an immutable Graph in a new
>>>>>>>>>>> object with a mutable reference to that graph, but, well, please let's
>>>>>>>>>>> avoid having to do that...
>>>>>>>>>>>
>>>>>>>>>>>> For some methods, like builders, returning the mutated state is good
>>>>>>>>>>>> practice.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> When using persistent datastructures, a builder is not an option.
>>>>>>>>>>>
>>>>>>>>>>> There are areas where you do not want to go back to the mutable
>>>>>>>>>>> version. It happens everywhere in banana-rdf e.g. the RDF DSL, the
>>>>>>>>>>> RDF/class mapper, etc. Just because we need to compose graphs without
>>>>>>>>>>> risking modifying an existing one.
>>>>>>>>>>>
>>>>>>>>>>>> It has been suggested earlier to return bool on add() to be compatible
>>>>>>>>>>>> with Collection, but we were not all too happy with that as it might
>>>>>>>>>>>> be difficult/expensive to know if the graph was actually mutated or
>>>>>>>>>>>> not (e.g. you insert the same triple twice, but the store doesn't
>>>>>>>>>>>> bother checking if the triple existed).
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Returning `bool` has very little value from my perspective.
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> See
>>>>>>>>>>>> https://issues.apache.org/jira/browse/COMMONSRDF-17
>>>>>>>>>>>> https://github.com/commons-rdf/commons-rdf/issues/27
>>>>>>>>>>>> https://github.com/commons-rdf/commons-rdf/issues/46
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> So your suggestion is for the mutability methods to return the mutated
>>>>>>>>>>>> object (which may or may not be the original instance). I think this
>>>>>>>>>>>> could be an interesting take for discussions - could you raise this as
>>>>>>>>>>>> a separate Jira issue?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Yes, that'd be the way to go.
>>>>>>>>>>>
>>>>>>>>>>> But I would prefer to see how much interest in the general approach
>>>>>>>>>>> there is before opening too many issues.
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> Well, Scala is just a language. Immutability and referential
>>>>>>>>>>>>> transparency are just principles, but they are becoming more and more
>>>>>>>>>>>>> important in many areas (Spark, concurrency, etc.).
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Agreed, also for distributed areas like Hadoop.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> There are *many* areas where accommodating immutable graphs has become
>>>>>>>>>>> important.
>>>>>>>>>>>
>>>>>>>>>>>>> There is no shortcut at all. The RDF model only revolves around some
>>>>>>>>>>>>> types (Graph, Triple, RDFTerm, BlankNodeOrIRI, IRI, BlankNode,
>>>>>>>>>>>>> Literal) which can be left abstract, as opposed to being concrete when
>>>>>>>>>>>>> using Java's interfaces. (it's "concrete" in the sense it's using
>>>>>>>>>>>>> nominal subtyping)
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Well, I still don't see how a java.lang.String will work with Java
>>>>>>>>>>>> code that expects to be able to call .getIRIString(). Would
>>>>>>>>>>>> Scala generate proxies on the fly?  Or would it need to call
>>>>>>>>>>>> .getIRIString() "elsewhere"?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> It's like monkey patching, just in a controlled and type safe way:
>>>>>>>>>>>
>>>>>>>>>>> ```
>>>>>>>>>>> val rdf: RDF = ???
>>>>>>>>>>>
>>>>>>>>>>> implicit class IRIWrapper(val iri: IRI) extends AnyVal {
>>>>>>>>>>>     def getIRIString(): String = rdf.getIRIString(iri)
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> val iri: IRI = rdf.createIRI("http://example.com")
>>>>>>>>>>> assert(rdf.getIRIString(iri) == iri.getIRIString())
>>>>>>>>>>> ```
>>>>>>>>>>>
>>>>>>>>>>> Scala would find that there is an implicit conversion from IRI to
>>>>>>>>>>> something with a getIRIString method, and would do the `new
>>>>>>>>>>> IRIWrapper`. But because this is also a value class (`AnyVal`) then no
>>>>>>>>>>> object would actually be allocated. It's basically free.
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> If you look at what I did, you have a *direct* translation of the
>>>>>>>>>>>>> existing interfaces+methods+factory into simple functions.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Yes, but done in Scala. Can I see a suggestion to the changes of the
>>>>>>>>>>>> current CommonsRDF Java interfaces - in Java?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> No, the gist is in Java and uses the same function names.
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> * the Java interfaces become abstract types
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Java interfaces are abstract types.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Java interfaces provide some abstraction (subtype polymorphism). Types
>>>>>>>>>>> are compile-time information. At runtime, you see a reified version of
>>>>>>>>>>> the type, as an interface or as a class (modulo type erasure).
>>>>>>>>>>> That is why Java interfaces are not really abstract types.
>>>>>>>>>>>
>>>>>>>>>>>> Do you mean generics?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Yes.
>>>>>>>>>>>
>>>>>>>>>>>>    Generics of which class/interface?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Of the RDF interface in the gist [1].
>>>>>>>>>>>
>>>>>>>>>>> [1] https://gist.github.com/betehess/8983dbff2c3e89f9dadb#file-rdf-java-L10
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> Not all Commons RDF clients are expected to interface via
>>>>>>>>>>>> RDFTermFactory. In fact many use-cases don't need it at all.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> * the methods on those interfaces become functions on the abstract types
>>>>>>>>>>>>> * the methods on the interfaces in the factory become simple
>>>>>>>>>>>>> functions on the abstract types
>>>>>>>>>>>>> * operating on a node happens with a visitor (as in visitor pattern)
>>>>>>>>>>>>> implemented as the `visit` function, taking 3 functions for the 3
>>>>>>>>>>>>> possible cases (I believe the current API asks for checking the class
>>>>>>>>>>>>> at runtime...)
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> This is too much at an abstract (!) level for me to visualize as we're
>>>>>>>>>>>> clashing programming languages here.. could you detail how this would
>>>>>>>>>>>> look in a set of *.java files? Feel free to raise it as a pull request
>>>>>>>>>>>> or similar, even if it's very draft-like. :)
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I can transform my gist into a real project. I will need a couple of
>>>>>>>>>>> days to find the time.
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> Now, let's say I am implementing a Turtle parser. The only thing I
>>>>>>>>>>>>> care about is how I can [use case 1] create/inject elements into some
>>>>>>>>>>>>> existing RDF model. If I am writing a Turtle serializer, I only care
>>>>>>>>>>>>> about how to [use case 2] traverse that type hierarchy. In none of
>>>>>>>>>>>>> those cases did I care about having the types defined in the
>>>>>>>>>>>>> class/interface hierarchy and I want anybody to use their own RDF
>>>>>>>>>>>>> model.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Yes. And with the current take of Commons RDF, the Turtle parser is free
>>>>>>>>>>>> to return its own instances of RDFTerm interfaces, which any Commons RDF
>>>>>>>>>>>> consuming client will be able to use as-is, e.g. pass to their own
>>>>>>>>>>>> Graph implementation.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> And here is what people will end up doing:
>>>>>>>>>>>
>>>>>>>>>>> ```
>>>>>>>>>>> Graph graph = JenaTurtleParser.parse(input);
>>>>>>>>>>> com.hp.hpl.jena.graph.Graph jenaGraph =
>>>>>>>>>>>     (com.hp.hpl.jena.graph.Graph) graph;
>>>>>>>>>>> ```
>>>>>>>>>>>
>>>>>>>>>>> Many will not want to see the common interface but the actual
>>>>>>>>>>> subtype.
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> class TurtleParser<Graph, Triple, RDFTerm, BlankNodeOrIRI, IRI,
>>>>>>>>>>>>> BlankNode, Literal> {
>>>>>>>>>>>>>     RDF<Graph, Triple, RDFTerm, BlankNodeOrIRI, IRI, BlankNode, Literal> rdf;
>>>>>>>>>>>>>     Graph parse(String input) { /* can call rdf.createLiteral("foo"), or
>>>>>>>>>>>>>     anything in rdf.* */ }
>>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I think the <brackets> speak for themselves here :-(
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> "Small" remark: I still don't think that `createBlankNode(String)`
>>>>>>>>>>>>> belongs to the RDF model. I would really like to see a use case that
>>>>>>>>>>>>> shows why it has to be present.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> This is a valid point of view which I think you should raise
>>>>>>>>>>>> as a new Jira issue. We did argue that it is not part of the
>>>>>>>>>>>> RDF model, but it is still a practically very useful feature,
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> "useful feature" --> this is where I would like to see a motivating
>>>>>>>>>>> use case. Then we can discuss how useful a feature it is, or how much
>>>>>>>>>>> of a problem it can be.
>>>>>>>>>>>
>>>>>>>>>>>> however it has generated many contention points in the past
>>>>>>>>>>>> as it touches on state and uniqueness.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> See also this discussion about the need (or not) for
>>>>>>>>>>>> exposing .uniqueReference()
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I am all in favor of `uniqueReference`. That is how the invariants on
>>>>>>>>>>> the blank node can be achieved.
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> https://issues.apache.org/jira/browse/COMMONSRDF-13
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> Finally, I will admit that writing all those type parameters can be a
>>>>>>>>>>>>> bit cumbersome, even if it happens only in a very few places (as a
>>>>>>>>>>>>> user: only once when you build what you need e.g. a Turtle parser).
>>>>>>>>>>>>> But please let's not sacrifice correctness and functionality to (a
>>>>>>>>>>>>> little) convenience...
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Well, if those would be exposed to any client of the Commons RDF API I
>>>>>>>>>>>> fear we would see very little uptake..
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> How so?
>>>>>>>>>>>
>>>>>>>>>>>> If they are hidden inside some upper/inner interface that is not
>>>>>>>>>>>> exposed otherwise, it is not so bad.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Yes, you can always do that.
>>>>>>>>>>>
>>>>>>>>>>> Alexandre
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Stian Soiland-Reyes
>>>>>>>>>>>> Apache Taverna (incubating), Apache Commons RDF (incubating)
>>>>>>>>>>>> http://orcid.org/0000-0001-9842-9718
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Sergio Fernández
>>>>>>>>> Partner Technology Manager
>>>>>>>>> Redlink GmbH
>>>>>>>>> m: +43 6602747925
>>>>>>>>> e: [email protected]
>>>>>>>>> w: http://redlink.co
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Sergio Fernández
>>>>>>> Partner Technology Manager
>>>>>>> Redlink GmbH
>>>>>>> m: +43 6602747925
>>>>>>> e: [email protected]
>>>>>>> w: http://redlink.co
>>>>>
>>>>>
>>>>>
>>>
>
>
>
> --
> Stian Soiland-Reyes
> Apache Taverna (incubating), Apache Commons RDF (incubating)
> http://orcid.org/0000-0001-9842-9718
