[jira] [Commented] (COMMONSRDF-6) Contract around the internal string of a blank node

ASF GitHub Bot (JIRA) Tue, 28 Apr 2015 19:53:14 -0700

    [ https://issues.apache.org/jira/browse/COMMONSRDF-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518620#comment-14518620 ]


ASF GitHub Bot commented on COMMONSRDF-6:
-----------------------------------------

Github user ansell commented on a diff in the pull request:

    https://github.com/apache/incubator-commonsrdf/pull/10#discussion_r29307199
  
    --- Diff: api/src/main/java/org/apache/commons/rdf/api/BlankNode.java ---
    @@ -41,60 +41,51 @@
      * on the concrete syntax or implementation. The syntactic restrictions on blank
      * node identifiers, if any, therefore also depend on the concrete RDF syntax or
      * implementation.
    - * 
    + *
      * Implementations that handle blank node identifiers in concrete syntaxes need
      * to be careful not to create the same blank node from multiple occurrences of
      * the same blank node identifier except in situations where this is supported
      * by the syntax. </blockquote>
    - * 
    - * A BlankNode object created through the
    - * {@link RDFTermFactory#createBlankNode()} method must be universally unique,
    - * and SHOULD contain a {@link UUID} as part of its
    - * {@link #internalIdentifier()}.
    - * 
    - * A BlankNode object created through the
    - * {@link RDFTermFactory#createBlankNode(String)} method must be universally
    - * unique, but also produce the same {@link #internalIdentifier()} as any
    - * previous or future calls to that method on that factory with the same
    - * parameters. In addition, it SHOULD contain a {@link UUID} as part of its
    - * {@link #internalIdentifier()}, created using
    - * {@link UUID#nameUUIDFromBytes(byte[])} using a constant salt for each
    - * instance of {@link RDFTermFactory}, with the given identifier joined to that
    - * salt in a consistent manner.
    - * 
      *
    + * A BlankNode SHOULD contain a {@link UUID} string as part of its
    + * universally unique {@link #uniqueReference()}.
    + *
    + * @see RDFTermFactory#createBlankNode()
    + * @see RDFTermFactory#createBlankNode(String)
      * @see <a href="http://www.w3.org/TR/rdf11-concepts/#dfn-blank-node">RDF-1.1
      * Blank Node</a>
      */
     public interface BlankNode extends BlankNodeOrIRI {
     
         /**
    -     * Return a <a href=
    -     * "http://www.w3.org/TR/rdf11-concepts/#dfn-blank-node-identifier" >unique
    -     * label</a> for the blank node. This label is generated by either
    -     * {@link RDFTermFactory#createBlankNode()} or
    -     * {@link RDFTermFactory#createBlankNode(String)} and is unique within the
    -     * context of the instance of the factory. In particular, successive calls
    -     * to the {@link RDFTermFactory#createBlankNode(String)} method on a single
    -     * factory with the same parameters MUST return BlankNode objects with
    -     * identical internalIdentifiers, but the identifiers SHOULD be mapped to
    -     * unique values in the context of the factory instance.
    -     *
    -     * IMPORTANT: This is not a serialization/syntax label, and there are no
    -     * guarantees that it is a valid identifier in any concrete syntax. For an
    -     * N-Triples compatible identifier use {@link #ntriplesString()}. For all
    -     * other syntaxes, the result of this method must be sanitized to produce a
    -     * valid concrete identifier if one is needed.
    +     * Return a reference for uniquely identifying the blank node.
    +     * <p>
    +     * The reference string MUST be universally unique, e.g. blank nodes created
    +     * separately in different JVMs or from different {@link RDFTermFactory}
    +     * instances MUST NOT have the same reference string.
    --- End diff --
    
    This is slightly inconsistent with RDFTermFactory.createBlankNode(String), which says "SHOULD NOT" where this says "MUST NOT".
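
The salted-UUID scheme described in the removed Javadoc above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class `SaltedBlankNodeFactory`, the ":" separator, and the use of plain strings as references are hypothetical and are not taken from the Commons RDF code base.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

// Hypothetical sketch; not the Commons RDF implementation.
final class SaltedBlankNodeFactory {
    // Assumption: one random salt chosen per factory instance and never changed.
    private final String salt = UUID.randomUUID().toString();

    /** Fresh blank node reference: a new random UUID on every call. */
    String createBlankNode() {
        return UUID.randomUUID().toString();
    }

    /** Named blank node reference: same name on the same factory gives the same string. */
    String createBlankNode(String name) {
        // Join the per-factory salt and the caller's name in a fixed, consistent way.
        byte[] salted = (salt + ":" + name).getBytes(StandardCharsets.UTF_8);
        return UUID.nameUUIDFromBytes(salted).toString();
    }

    public static void main(String[] args) {
        SaltedBlankNodeFactory f1 = new SaltedBlankNodeFactory();
        SaltedBlankNodeFactory f2 = new SaltedBlankNodeFactory();
        // Same factory, same name: the two references are equal.
        System.out.println(f1.createBlankNode("b1").equals(f1.createBlankNode("b1")));
        // Different factories, same name: the salts differ, so the references should differ.
        System.out.println(f1.createBlankNode("b1").equals(f2.createBlankNode("b1")));
    }
}
```

`UUID.nameUUIDFromBytes` produces a name-based (type 3) UUID, so uniqueness across factories in this sketch is a matter of overwhelming probability rather than a hard guarantee. That may be worth keeping in mind when choosing between "MUST NOT" and "SHOULD NOT" in the contract text.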


> Contract around the internal string of a blank node 
> ----------------------------------------------------
>
>                 Key: COMMONSRDF-6
>                 URL: https://issues.apache.org/jira/browse/COMMONSRDF-6
>             Project: Apache Commons RDF
>          Issue Type: Improvement
>            Reporter: Stian Soiland-Reyes (old)
>             Fix For: 0.1
>
>
> From https://github.com/commons-rdf/commons-rdf/issues/56
> afs:
> {quote}
> RDF 1.1 says "IRIs, literals and blank nodes are distinct and 
> distinguishable." [my emphasis]
> http://www.w3.org/TR/rdf11-concepts/#section-rdf-graph
> This is a consequence of RDF being an abstract syntax - there is no 
> logic/entailment at this level - it was true in RDF 1.0 but now it is 
> explicitly stated in RDF Concepts.
> Distinguishable blank nodes mean that unique characteristics need to align to 
> the Java identity contract.
> At least, the same (= RDFTerm.equals) blank node, even when different java 
> objects, must have the same internal string. (.equals)
> It's a one-way implication: same internal string does not imply equality so 
> this works across independent implementations.
> An extreme implementation is to always return the same internal string (may 
> not be helpful but should be legal).
> {quote}
> afs:
> {quote}
> This also related to the proposed {{BlankNode.ntriplesString()}}.
> The choice of output string is dependent on the writing process. It only 
> needs to be unique across the file being written. A choice for output is 
> short forms like ":b0", ":b1" etc.
> The ntriples output form is not a unique property of the blank node. I think 
> we should not include ntriplesString in the core common API.
> {quote}
> stain:
> {quote}
> Not sure what this is proposing, but :-1: to remove BlankNode.ntriplesString 
> - and :+1: to improve the contract text for BlankNode.
> I found ntriplesString very useful as it becomes an interoperability point 
> and has (largely) predictable outputs.
> The commons RDF API stays very close to the rdf11-concepts 
> http://www.w3.org/TR/rdf11-concepts/ , which I like. The ntriplesString are 
> however trivial to implement - and almost all implementations are probably 
> going to have something like that anyway. I never liked much that the name 
> doesn't include get - but I guess that is because it is a derived value and 
> might need further calculations.
> The only contentious part is in BlankNode - so perhaps add a specialization 
> of ntriplesString that clarifies the pitfalls here (as we did with equals). 
> The long paragraphs at the top of BlankNode do not currently help to 
> clarify this.
> See the simple implementation of BlankNode for one simple way to deal with 
> those "non-ntriples-valid internal identifiers".
> Always keeping an internal UUID field or similar is another - implementations 
> can decide on what is most natural to their implementations - they have 
> probably dealt with this already, although possibly not within their 
> equivalent of the BlankNode class. The BlankNode is also free to keep an 
> internal reference to the Graph or "local scope" and use that to generate 
> identifiers.
> There is no requirement anywhere for Blank Node identifiers to always be 
> re-generated in serialization - this is simply a liberty that is available. A 
> serializer based on Commons RDF can still do that - it can simply ignore 
> BlankNode.ntriplesString and create a temporary Map from internalIdentifier 
> to b1, b2, etc. I do however not see why we need to REQUIRE a serializer to do 
> such an operation - that is taking this API beyond its scope and into "best 
> practice" (in which case we would also deal with prefixes, preserving prefix 
> names, canonicalizing URIs, etc).
> As an example of the current strength, I was able to write an N-triples 
> serializer in simple by just concatenating the ntriplesString of the 
> components from TripleImpl.toString and then just joining with \n:
> This is powerful - if nothing else, it's great for debugging. I am not 
> proposing to add ntriplesString() for Triple, as it might need to be much 
> closer to the Graph - but at least RDFNode.toString() could have a default 
> method that calls ntriplesString() (which is 200 times more useful than 
> LiteralImpl 2bd85b1f529302f9 from Object.toString :) )
> {quote}
> afs:
> {quote}
> Some display string is useful but reading the contract for ntriplesString, it 
> is not (just) for display purposes. c.f. Java toString. There is a difference 
> in escaping. I see that TripleImpl.toString does not do syntax escaping.
> Providing a readable RDFNode.toString() would separate the development display 
> concerns (e.g. no escapes maybe) from serialization concerns.
> Some RDF systems implement blank nodes from a sequence (e.g. Mulgara). 
> Actually that policy can be quite convenient for debugging development.
> We could include N-Triples in commons-rdf but to me v1 should be targeted at 
> "use RDF data". Parsing and serialization is the concern of the 
> implementation. The simple impl is one such example, not a new RDF system (is 
> it?:-)
> {quote}
> ansell:
> {quote}
> I commented on the pull request to remove some of the tests that test or rely 
> on the BlankNode internal identifier structure, particularly that it be a 
> valid n-triples identifier. However, those tests made it into the merged 
> version because it was otherwise basically okay and we are continually 
> evolving anyway so there is no need to have perfect pull requests at this 
> stage. I will review and merge #55 and then work on any further cases that we 
> may not be testing for yet.
> I am all for defining/clarifying the contract for .toString in the API, even 
> if it says that there is no specific escaping or formatting done on the 
> output of .toString.
> Supporting N-Triples in the base API seems to be natural for two reasons to 
> me. Firstly, it is the simplest syntax, so it doesn't require any particular 
> optimisations and Triples can be streamed out without relying on a particular 
> framework or serialiser. Secondly, for a long time it has been the sole 
> established test case format for RDF, although it is defined on its own for 
> RDF-1.1, so it is a natural single serialisation to support.
> As long as the output of ntriplesString is defined to be implementation and 
> local scope specific for BlankNodes (no confusion with IRI or Literal), I am 
> fine with having it. Given the number of times the BlankNode API references 
> "local scope" right now, we are unlikely to have more users commenting that 
> it is unusual than we already have had for the last 10 years with RDF-1.0.
> {quote}
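
The relabelling approach mentioned in the thread (a temporary map from internal identifier to short labels that only need to be unique within the file being written) can be sketched as below. This is a minimal illustration under stated assumptions: `BlankNodeRelabeler`, `labelFor` and `tripleLine` are hypothetical names, internal identifiers are modelled as plain strings, and only triples with a blank-node subject and object are handled.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; these types are stand-ins, not the Commons RDF API.
final class BlankNodeRelabeler {
    // Temporary map, scoped to one output file: internal identifier -> short label.
    private final Map<String, String> labels = new HashMap<>();

    /** Returns the file-local N-Triples label for an internal identifier. */
    String labelFor(String internalIdentifier) {
        String label = labels.get(internalIdentifier);
        if (label == null) {
            // First occurrence: allocate the next label in sequence (_:b0, _:b1, ...).
            label = "_:b" + labels.size();
            labels.put(internalIdentifier, label);
        }
        return label;
    }

    /** Renders one triple whose subject and object are blank nodes. */
    String tripleLine(String subjectId, String predicateIri, String objectId) {
        return labelFor(subjectId) + " <" + predicateIri + "> " + labelFor(objectId) + " .";
    }

    public static void main(String[] args) {
        BlankNodeRelabeler relabeler = new BlankNodeRelabeler();
        List<String> lines = new ArrayList<>();
        lines.add(relabeler.tripleLine("internal-id-1", "http://example.org/knows", "internal-id-2"));
        lines.add(relabeler.tripleLine("internal-id-2", "http://example.org/knows", "internal-id-1"));
        // Join the rendered triples with newlines, as described in the thread.
        System.out.println(String.join("\n", lines));
    }
}
```

Because the map lives only as long as one relabeler, the labels are unique per output and say nothing about blank node identity outside that file, which is the property afs argues for. A serializer that prefers stable labels could skip this step and use the node's own string instead.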



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
