[
https://issues.apache.org/jira/browse/COMMONSRDF-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518620#comment-14518620
]
ASF GitHub Bot commented on COMMONSRDF-6:
-----------------------------------------
Github user ansell commented on a diff in the pull request:
https://github.com/apache/incubator-commonsrdf/pull/10#discussion_r29307199
--- Diff: api/src/main/java/org/apache/commons/rdf/api/BlankNode.java ---
@@ -41,60 +41,51 @@
* on the concrete syntax or implementation. The syntactic restrictions on blank
* node identifiers, if any, therefore also depend on the concrete RDF syntax or
* implementation.
- *
+ *
* Implementations that handle blank node identifiers in concrete syntaxes need
* to be careful not to create the same blank node from multiple occurrences of
* the same blank node identifier except in situations where this is supported
* by the syntax. </blockquote>
- *
- * A BlankNode object created through the
- * {@link RDFTermFactory#createBlankNode()} method must be universally unique,
- * and SHOULD contain a {@link UUID} as part of its
- * {@link #internalIdentifier()}.
- *
- * A BlankNode object created through the
- * {@link RDFTermFactory#createBlankNode(String)} method must be universally
- * unique, but also produce the same {@link #internalIdentifier()} as any
- * previous or future calls to that method on that factory with the same
- * parameters. In addition, it SHOULD contain a {@link UUID} as part of its
- * {@link #internalIdentifier()}, created using
- * {@link UUID#nameUUIDFromBytes(byte[])} using a constant salt for each
- * instance of {@link RDFTermFactory}, with the given identifier joined to that
- * salt in a consistent manner.
- *
*
+ * A BlankNode SHOULD contain a {@link UUID} string as part of its
+ * universally unique {@link #uniqueReference()}.
+ *
+ * @see RDFTermFactory#createBlankNode()
+ * @see RDFTermFactory#createBlankNode(String)
* @see <a href="http://www.w3.org/TR/rdf11-concepts/#dfn-blank-node">RDF-1.1
* Blank Node</a>
*/
public interface BlankNode extends BlankNodeOrIRI {
/**
- * Return a <a href=
- * "http://www.w3.org/TR/rdf11-concepts/#dfn-blank-node-identifier">unique
- * label</a> for the blank node. This label is generated by either
- * {@link RDFTermFactory#createBlankNode()} or
- * {@link RDFTermFactory#createBlankNode(String)} and is unique within the
- * context of the instance of the factory. In particular, successive calls
- * to the {@link RDFTermFactory#createBlankNode(String)} method on a single
- * factory with the same parameters MUST return BlankNode objects with
- * identical internalIdentifiers, but the identifiers SHOULD be mapped to
- * unique values in the context of the factory instance.
- *
- * IMPORTANT: This is not a serialization/syntax label, and there are no
- * guarantees that it is a valid identifier in any concrete syntax. For an
- * N-Triples compatible identifier use {@link #ntriplesString()}. For all
- * other syntaxes, the result of this method must be sanitized to produce a
- * valid concrete identifier if one is needed.
+ * Return a reference for uniquely identifying the blank node.
+ * <p>
+ * The reference string MUST be universally unique, e.g. blank nodes created
+ * separately in different JVMs or from different {@link RDFTermFactory}
+ * instances MUST NOT have the same reference string.
--- End diff --
This is slightly inconsistent with RDFTermFactory.createBlankNode(String),
which says "SHOULD NOT" where this says "MUST NOT".
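For illustration, here is a minimal sketch of the behaviour the removed Javadoc
describes: a per-factory constant salt combined with
UUID.nameUUIDFromBytes(byte[]). SketchFactory and createReference are
hypothetical names, not the Commons RDF API, and the way the salt is joined to
the name is an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

public class BlankNodeReferenceSketch {

    /** Hypothetical stand-in for an RDFTermFactory; not the Commons RDF class. */
    static final class SketchFactory {
        // Constant salt, fixed for the lifetime of this factory instance.
        private final String salt = "urn:uuid:" + UUID.randomUUID() + "#";

        /** Fresh reference: a new random UUID on every call. */
        String createReference() {
            return UUID.randomUUID().toString();
        }

        /** Name-based reference: salt + name hashed into a name-based UUID. */
        String createReference(String name) {
            byte[] salted = (salt + name).getBytes(StandardCharsets.UTF_8);
            return UUID.nameUUIDFromBytes(salted).toString();
        }
    }

    public static void main(String[] args) {
        SketchFactory f1 = new SketchFactory();
        SketchFactory f2 = new SketchFactory();
        // Same factory, same name: identical reference, so this prints true.
        System.out.println(f1.createReference("b1").equals(f1.createReference("b1")));
        // Different factories have different salts: expected false,
        // barring a UUID collision.
        System.out.println(f1.createReference("b1").equals(f2.createReference("b1")));
    }
}
```

Whether the cross-factory case is a "MUST NOT" or a "SHOULD NOT" is a question
of contract wording; the sketch only shows that a salted name-based UUID makes
collisions between factories very unlikely.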
> Contract around the internal string of a blank node
> ----------------------------------------------------
>
> Key: COMMONSRDF-6
> URL: https://issues.apache.org/jira/browse/COMMONSRDF-6
> Project: Apache Commons RDF
> Issue Type: Improvement
> Reporter: Stian Soiland-Reyes (old)
> Fix For: 0.1
>
>
> From https://github.com/commons-rdf/commons-rdf/issues/56
> afs:
> {quote}
> RDF 1.1 says "IRIs, literals and blank nodes are distinct and
> distinguishable." [my emphasis]
> http://www.w3.org/TR/rdf11-concepts/#section-rdf-graph
> This is a consequence of RDF being an abstract syntax - there is no
> logic/entailment at this level - it was true in RDF 1.0 but now it is
> explicitly stated in RDF Concepts.
> Distinguishable blank nodes mean that unique characteristics need to align to
> the Java identity contract.
> At least, the same (= RDFTerm.equals) blank node, even when different java
> objects, must have the same internal string. (.equals)
> It's a one-way implication: same internal string does not imply equality so
> this works across independent implementations.
> An extreme implementation is to always return the same internal string (may
> not be helpful but should be legal).
> {quote}
> afs:
> {quote}
> This is also related to the proposed {{BlankNode.ntriplesString()}}.
> The choice of output string is dependent on the writing process. It only
> needs to be unique across the file being written. A choice for output is
> short forms like ":b0", ":b1" etc.
> The ntriples output form is not a unique property of the blank node. I think
> we should not include ntriplesString in the core common API.
> {quote}
> stain:
> {quote}
> Not sure what this is proposing, but :-1: to remove BlankNode.ntriplesString
> - and :+1: to improve the contract text for BlankNode.
> I found ntriplesString very useful as it becomes an interoperability point
> and has (largely) predictable outputs.
> The commons RDF API stays very close to the rdf11-concepts
> http://www.w3.org/TR/rdf11-concepts/ , which I like. The ntriplesString methods
> are however trivial to implement - and almost all implementations are probably
> going to have something like that anyway. I never liked much that the name
> doesn't include get - but I guess that is because it is a derived value and
> might need further calculations.
> The only contentious part is in BlankNode - so perhaps add a specialization
> of ntriplesString that clarifies the pitfalls here (as we did with equals).
> The long paragraphs of BlankNode at the top do not currently help to
> clarify this.
> See the simple implementation of BlankNode for one simple way to deal with
> those "non-ntriples-valid internal identifiers".
> Always keeping an internal UUID field or similar is another - implementations
> can decide on what is most natural to their implementations - they probably
> have dealt with this already, although possibly not within their
> equivalent of the BlankNode class. The BlankNode is also free to keep an
> internal reference to the Graph or "local scope" and use that to generate
> identifiers.
> There is no requirement anywhere for Blank Node identifiers to always be
> re-generated in serialization - this is simply a liberty that is available. A
> serializer based on Commons RDF can still do that - it can simply ignore
> BlankNode.ntriplesString and create a temporary Map from internalIdentifier
> to b1, b2, etc. I do however not see why we need to REQUIRE a serializer do
> such an operation - that is taking this API beyond its scope and into "best
> practice" (in which case we would also deal with prefixes, preserving prefix
> names, canonicalizing URIs, etc).
> As an example of the current strength, I was able to write an N-triples
> serializer in simple by just concatenating the ntriplesString of the
> components from TripleImpl.toString and then just joining with \n:
> This is powerful - if nothing else it's great for debugging. I am not
> proposing to add ntriplesString() for Triple, as it might need to be much
> closer to the Graph - but at least RDFNode.toString() could have a default
> method that calls ntriplesString() (which is 200 times more useful than
> LiteralImpl 2bd85b1f529302f9 from Object.toString :) )
> {quote}
> afs:
> {quote}
> Some display string is useful but reading the contract for ntriplesString, it
> is not (just) for display purposes. c.f. Java toString. There is a difference
> in escaping. I see that TripleImpl.toString does not do syntax escaping.
> Providing a readable RDFNode.toString() would separate the development display
> concerns (e.g. no escapes maybe) from serialization concerns.
> Some RDF systems implement blank nodes from a sequence (e.g. Mulgara).
> Actually that policy can be quite convenient for debugging development.
> We could include N-Triples in commons-rdf but to me v1 should be targeted as
> "use RDF data". Parsing and serialization is the concern of the
> implementation. The simple impl is one such example, not a new RDF system (is
> it?:-)
> {quote}
> ansell:
> {quote}
> I commented on the pull request to remove some of the tests that test or rely
> on the BlankNode internal identifier structure, particularly that it be a
> valid n-triples identifier. However, those tests made it into the merged
> version because it was otherwise basically okay and we are continually
> evolving anyway so there is no need to have perfect pull requests at this
> stage. I will review and merge #55 and then work on any further cases that we
> may not be testing for yet.
> I am all for defining/clarifying the contract for .toString in the API, even
> if it says that there is no specific escaping or formatting done on the
> output of .toString.
> Supporting N-Triples in the base API seems to be natural for two reasons to
> me. Firstly, it is the simplest syntax, so it doesn't require any particular
> optimisations and Triples can be streamed out without relying on a particular
> framework or serialiser. Secondly, for a long time it has been the sole
> established test case format for RDF, although it is defined on its own for
> RDF-1.1, so it is a natural single serialisation to support.
> As long as the output of ntriplesString is defined to be implementation and
> local scope specific for BlankNodes (no confusion with IRI or Literal), I am
> fine with having it. Given the number of times the BlankNode API references
> "local scope" right now, we are unlikely to have more users commenting that
> it is unusual than we already have had for the last 10 years with RDF-1.0.
> {quote}
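The serializer-side relabelling discussed in the quoted description (a
temporary Map from internal identifier to b0, b1, etc., with lines joined by
\n) could look roughly like the sketch below. Relabeler and labelFor are
hypothetical names, the example.org IRI is a placeholder, and the internal
identifiers are arbitrary strings chosen to show that they need not be valid
N-Triples labels.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BlankNodeRelabelSketch {

    /** Assigns short, file-local N-Triples labels to internal identifiers. */
    static final class Relabeler {
        private final Map<String, String> labels = new LinkedHashMap<>();

        String labelFor(String internalIdentifier) {
            String label = labels.get(internalIdentifier);
            if (label == null) {
                // First time this identifier is seen: hand out the next counter value.
                label = "_:b" + labels.size();
                labels.put(internalIdentifier, label);
            }
            return label;
        }
    }

    public static void main(String[] args) {
        Relabeler relabeler = new Relabeler();
        // Internal identifiers that would not be valid N-Triples labels as-is.
        String alice = "internal id with spaces";
        String bob = "another/internal#id";
        String knows = "<http://example.org/knows>";

        String output = String.join("\n",
                relabeler.labelFor(alice) + " " + knows + " " + relabeler.labelFor(bob) + " .",
                relabeler.labelFor(bob) + " " + knows + " " + relabeler.labelFor(alice) + " .");
        System.out.println(output);
        // Prints:
        // _:b0 <http://example.org/knows> _:b1 .
        // _:b1 <http://example.org/knows> _:b0 .
    }
}
```

The labels are unique only within one Relabeler, i.e. within the file being
written, which is all that the output syntax requires.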