On 12 May 2015 at 06:20, Alexandre Bertails wrote:
> I actually didn't understand that we were discussing a
> `createBlankNode(UUID)`. I think we just need to be able to create a
> fresh blank node.
That is what createBlankNode() does.
Is your proposal to simply remove createBlankNode(String)?
> Requiring the caller to provide an explicit UUID
> means that the freshness is happening *outside* of the factory, so I
> don't see the point.
Well, you wanted to pass in the uniqueness..? You can pass it as a
String (as of today), or, loosely suggested, by restricting this to a
UUID (which would require clients to think about this very common
mapping/hashing).
> Also, it's forcing the strategy (UUID), which
> might not be the best one for everybody, e.g. UUID is known to be
> slow, at least for some notion of slow, and that could become a
There are several variants of UUID. You are free to use a timestamp-based
one (version 1), which is rather fast to make. SHA-1 is not known to be
slow either, so version 5 hashes are also fast.
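
As a rough sketch of the version 5 case: java.util.UUID has no built-in
version 5 generator (nameUUIDFromBytes() produces version 3, i.e. MD5), so
the class and method names below are hypothetical, and the bit-twiddling
follows my reading of RFC 4122 rather than any Commons RDF code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.UUID;

// Hypothetical helper, not part of Commons RDF or the JDK
class NameBasedUuid {

    // SHA-1 over (namespace bytes + name), truncated to 128 bits,
    // with the version and variant bits overwritten.
    static UUID version5(UUID namespace, String name) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            sha1.update(ByteBuffer.allocate(16)
                    .putLong(namespace.getMostSignificantBits())
                    .putLong(namespace.getLeastSignificantBits())
                    .array());
            byte[] hash = sha1.digest(name.getBytes(StandardCharsets.UTF_8));
            hash[6] = (byte) ((hash[6] & 0x0f) | 0x50); // version 5
            hash[8] = (byte) ((hash[8] & 0x3f) | 0x80); // RFC 4122 variant
            ByteBuffer buf = ByteBuffer.wrap(hash, 0, 16);
            return new UUID(buf.getLong(), buf.getLong());
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is required of every Java platform implementation
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }
}
```

The same namespace and name always give the same UUID, so a caller could
map its own identifiers to a UUID without keeping any state.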
But we agreed that requiring a UUID might be a bit strict for some
implementations, which is why uniqueReference() can return any unique
string. So if an implementation considered
app=97975c0b-62c1-42c9-b2a9-e87948e4a46e ip=84.92.48.26 uid=1000
pid=292 name=fred
to be a unique string (with hard-coded 97975c0b-62c1-42c9-b2a9-e87948e4a46e
in case someone else comes up with a similar scheme),
and didn't mind leaking all that potentially sensitive data, then that
would be a compliant uniqueReference().
> I am not arguing for stateless vs stateful. I am just pointing at some
> design issues which do not allow it. Currently, there is just no way
> for an immutable implementation to be used with such a factory.
I am not sure what the extent of "immutable" is here. I'll assume it
just means that all fields are final, not
that the object is not allowed to have any fields at all.
You are free to
implement RDFTermFactory as you please, so you can simply do it like this:
import java.time.Clock;
import java.time.Instant;
import java.util.UUID;

public class ImmutableRDFTermFactory implements RDFTermFactory {
    private final UUID salt;

    public ImmutableRDFTermFactory(UUID salt) {
        this.salt = salt;
    }

    public BlankNode createBlankNode() {
        return new BlankNodeImpl(salt);
    }

    public BlankNode createBlankNode(String name) {
        return new BlankNodeImpl(salt, name);
    }
    // ..
}

public class BlankNodeImpl implements BlankNode {
    private final String uniqueReference;

    public BlankNodeImpl(UUID salt, String name) {
        uniqueReference = salt.toString() + name;
    }

    public BlankNodeImpl(UUID salt) {
        uniqueReference = unique(salt);
    }

    // Combines the salt with this object's identity hash, the current
    // time and the calling thread's id.
    private String unique(UUID salt) {
        Instant now = Clock.systemUTC().instant();
        return salt.toString() + System.identityHashCode(this)
                + now.getEpochSecond() + now.getNano()
                + Thread.currentThread().getId();
    }
    // ..
}
Here there is no hidden mutability in AtomicLong or within
java.util.UUID's SecureRandom implementation's internal state. I guess
you would not be happy with those either?
The clock is obviously mutable - but as a device rather than a memory state.
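
To make the behaviour concrete, here is a self-contained variant that
compiles without Commons RDF on the classpath. SketchBlankNode and
SketchFactory are hypothetical stand-ins for the real interfaces, reduced
to the parts discussed here. Note that System.identityHashCode() is not
guaranteed to be unique, so the "fresh" case is best-effort only.

```java
import java.time.Clock;
import java.time.Instant;
import java.util.UUID;

// Hypothetical stand-in for BlankNode, reduced to one method
interface SketchBlankNode {
    String uniqueReference();
}

final class SketchBlankNodeImpl implements SketchBlankNode {
    private final String uniqueReference;

    // Named blank node: fully determined by salt and name
    SketchBlankNodeImpl(UUID salt, String name) {
        this.uniqueReference = salt.toString() + name;
    }

    // Fresh blank node: salt plus identity hash, clock and thread id
    SketchBlankNodeImpl(UUID salt) {
        Instant now = Clock.systemUTC().instant();
        this.uniqueReference = salt.toString() + System.identityHashCode(this)
                + now.getEpochSecond() + now.getNano()
                + Thread.currentThread().getId();
    }

    @Override
    public String uniqueReference() {
        return uniqueReference;
    }
}

// Hypothetical stand-in for RDFTermFactory; its only field is final
final class SketchFactory {
    private final UUID salt;

    SketchFactory(UUID salt) {
        this.salt = salt;
    }

    SketchBlankNode createBlankNode() {
        return new SketchBlankNodeImpl(salt);
    }

    SketchBlankNode createBlankNode(String name) {
        return new SketchBlankNodeImpl(salt, name);
    }
}
```

Two calls to createBlankNode("b1") on the same factory give the same
reference, while a factory with a different salt gives a different one.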
> Having `add` returning a `Graph` does not mean that `Graph` is
> immutable. It just means that it *enables* `Graph` to be immutable.
There is nothing stopping an immutable Graph from having an additional
method that does this.
For some methods, like builders, returning the mutated state is good practice.
It has been suggested earlier to return boolean from add() to be
compatible with Collection, but we were not too happy with that as it
might be difficult or expensive to know whether the graph was actually
mutated (e.g. you insert the same triple twice, but the store doesn't
bother checking if the triple existed).
See
https://issues.apache.org/jira/browse/COMMONSRDF-17
https://github.com/commons-rdf/commons-rdf/issues/27
https://github.com/commons-rdf/commons-rdf/issues/46
So your suggestion is for the mutating methods to return the mutated
object (which may or may not be the original instance). I think this
could be an interesting take for discussions - could you raise this as
a separate Jira issue?
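
Just to check that I read the suggestion right, a minimal sketch of what
that could look like. Everything here is hypothetical: SketchGraph is not
the Commons RDF Graph, and triples are plain strings to keep it short.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical, simplified graph interface where add() returns a graph
interface SketchGraph {
    SketchGraph add(String triple);
    boolean contains(String triple);
    long size();
}

// Immutable variant: add() leaves this instance alone and returns a copy
final class ImmutableSketchGraph implements SketchGraph {
    private final Set<String> triples;

    ImmutableSketchGraph() {
        this(Collections.<String>emptySet());
    }

    private ImmutableSketchGraph(Set<String> triples) {
        this.triples = triples;
    }

    @Override
    public SketchGraph add(String triple) {
        Set<String> copy = new HashSet<>(triples);
        copy.add(triple);
        return new ImmutableSketchGraph(Collections.unmodifiableSet(copy));
    }

    @Override
    public boolean contains(String triple) {
        return triples.contains(triple);
    }

    @Override
    public long size() {
        return triples.size();
    }
}

// Mutable variant: add() changes this instance and returns it
final class MutableSketchGraph implements SketchGraph {
    private final Set<String> triples = new HashSet<>();

    @Override
    public SketchGraph add(String triple) {
        triples.add(triple);
        return this;
    }

    @Override
    public boolean contains(String triple) {
        return triples.contains(triple);
    }

    @Override
    public long size() {
        return triples.size();
    }
}
```

Callers that always use the returned value would work with both variants;
callers that ignore it would silently lose the triple with the immutable one.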
> Well, Scala is just a language. Immutability and referential
> transparency, are just principles, but they are becoming more and more
> important in many areas (Spark, concurrency, etc.).
Agreed, also for distributed areas like Hadoop.
> There is no shortcut at all. The RDF model only resolves around some
> types (Graph, Triple, RDFTerm, BlankNodeOrIRI, IRI, BlankNode,
> Literal) which can be left abstract, as opposed to being concrete when
> using Java's interfaces. (it's "concrete" in the sense it's using
> nominal subtyping)
Well, I still don't see how a java.lang.String will work with Java
code that expects to be able to call .getIRIString(). Would
Scala generate proxies on the fly? Or would it need to call
.getIRIString() "elsewhere"?
> If you look at what I did, you have a *direct* translation of the
> existing interfaces+methods+factory into simple functions.
Yes, but done in Scala. Could I see the suggested changes to the
current Commons RDF Java interfaces - in Java?
> * the Java interfaces becomes abstract types
Java interfaces are abstract types. Do you mean generics? Generics of
which class/interface?
Not all Commons RDF clients are expected to interface via
RDFTermFactory. In fact many use-cases don't need it at all.
> * the methods on those interfaces become functions on the abstract types
> * the methods on the interfaces in the factory becomes simple
> functions on the abstract types
> * operating on a node happens with a visitor (as in visitor pattern)
> implemented as the `visit` function, taking 3 functions for the 3
> possible cases (I believe the current API asks for checking the class
> at runtime...)
This is at too abstract (!) a level for me to visualize, as we're
clashing programming languages here.. could you detail how this would
look in a set of *.java files? Feel free to raise it as a pull request
or similar, even if it's very draft-like. :)
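
My best guess at how the `visit` part might translate to Java is below.
This is only my reading of the proposal, so all names (NodeOps,
SimpleNodeOps, NTriplesSketch) and the choice of URI/UUID/String as the
concrete node types are assumptions for illustration.

```java
import java.net.URI;
import java.util.UUID;
import java.util.function.Function;

// Hypothetical operations "module": the node types are type parameters,
// not interfaces that the nodes themselves must implement.
interface NodeOps<N, I, B, L> {
    <T> T visit(N node,
                Function<I, T> onIri,
                Function<B, T> onBlankNode,
                Function<L, T> onLiteral);
}

// One possible binding: IRI = java.net.URI, BlankNode = java.util.UUID,
// Literal = java.lang.String. The runtime class check lives here only.
final class SimpleNodeOps implements NodeOps<Object, URI, UUID, String> {
    @Override
    public <T> T visit(Object node,
                       Function<URI, T> onIri,
                       Function<UUID, T> onBlankNode,
                       Function<String, T> onLiteral) {
        if (node instanceof URI) {
            return onIri.apply((URI) node);
        }
        if (node instanceof UUID) {
            return onBlankNode.apply((UUID) node);
        }
        if (node instanceof String) {
            return onLiteral.apply((String) node);
        }
        throw new IllegalArgumentException("Not a node: " + node);
    }
}

// A client written only against NodeOps, unaware of the concrete types
final class NTriplesSketch {
    static <N, I, B, L> String describe(NodeOps<N, I, B, L> ops, N node) {
        return ops.visit(node,
                iri -> "<" + iri + ">",
                blank -> "_:" + blank,
                literal -> "\"" + literal + "\"");
    }
}
```

If that is roughly the idea, then every client method needs the four type
parameters plus the ops instance, which is what worries me.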
> Now, let's say I am implementing a Turtle parser. The only thing I
> care about is how I can [use case 1] create/inject elements into some
> existing RDF model. If I am writing a Turtle serializer, I only care
> about how to [use case 2] traverse that type hierarchy. In none of
> those cases did I care about having the types defined in the
> class/interface hierarchy and I want anybody to use their own RDF
> model.
Yes. And with the current take of Commons RDF, the Turtle parser is free
to return its own instances of RDFTerm interfaces, which any Commons RDF
consuming client will be able to use as-is, e.g. pass to their own
Graph implementation.
> class TurtleParser<Graph, Triple, RDFTerm, BlankNodeOrIRI, IRI,
> BlankNode, Literal> {
> RDF<Graph, Triple, RDFTerm, BlankNodeOrIRI, IRI, BlankNode, Literal> rdf
> Graph parse(String input) { /* can call rdf.createLiteral("foo"), or
> anything in rdf.* */ }
> }
I think the <brackets> speak for themselves here :-(
> "Small" remark: I still don't think that `createBlankNode(String)`
> belongs to the RDF model. I would really like to see a use case that
> shows why it has to be present.
This is a valid point of view which I think you should raise
as a new Jira issue. We did argue that it is not part of the
RDF model, but it is still a very useful feature in practice.
It has, however, been a point of contention several times in the
past, as it touches on state and uniqueness.
See also this discussion about the need (or not) for
exposing .uniqueReference()
https://issues.apache.org/jira/browse/COMMONSRDF-13
> Finally, I will admit that writing all those types parameters can be a
> bit cumbersome, even if it happens only in a very few places (as a
> user: only once when you build what you need e.g. a Turtle parser).
> But please let's not sacrifice correctness and functionality to (a
> little) convenience...
Well, if those would be exposed to any client of the Commons RDF API I
fear we would see very little uptake..
If they are hidden inside some upper/inner interface that is not
exposed otherwise, it is not so bad.
--
Stian Soiland-Reyes
Apache Taverna (incubating), Apache Commons RDF (incubating)
http://orcid.org/0000-0001-9842-9718