[ 
https://issues.apache.org/jira/browse/COMMONSRDF-6?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stian Soiland-Reyes reassigned COMMONSRDF-6:
--------------------------------------------

    Assignee: Andy Seaborne

Assigned to Andy to verify and Resolve.

> Contract around the internal string of a blank node 
> ----------------------------------------------------
>
>                 Key: COMMONSRDF-6
>                 URL: https://issues.apache.org/jira/browse/COMMONSRDF-6
>             Project: Apache Commons RDF
>          Issue Type: Improvement
>            Reporter: Andy Seaborne
>            Assignee: Andy Seaborne
>             Fix For: 0.1
>
>
> From https://github.com/commons-rdf/commons-rdf/issues/56
> afs:
> {quote}
> RDF 1.1 says "IRIs, literals and blank nodes are distinct and 
> distinguishable." [my emphasis]
> http://www.w3.org/TR/rdf11-concepts/#section-rdf-graph
> This is a consequence of RDF being an abstract syntax - there is no 
> logic/entailment at this level - it was true in RDF 1.0 but now it is 
> explicitly stated in RDF Concepts.
> Distinguishable blank nodes mean that unique characteristics need to align to 
> the Java identity contract.
> At least, the same (= RDFTerm.equals) blank node, even when different java 
> objects, must have the same internal string. (.equals)
> It's a one-way implication: same internal string does not imply equality so 
> this works across independent implementations.
> An extreme implementation is to always return the same internal string (may 
> not be helpful but should be legal).
> {quote}
> afs:
> {quote}
> This is also related to the proposed {{BlankNode.ntriplesString()}}.
> The choice of output string is dependent on the writing process. It only 
> needs to be unique across the file being written. A choice for output is 
> short forms like ":b0", ":b1" etc.
> The ntriples output form is not a unique property of the blank node. I think 
> we should not include ntriplesString in the core common API.
> {quote}
> stain:
> {quote}
> Not sure what this is proposing, but :-1: to remove BlankNode.ntriplesString 
> - and :+1: to improve the contract text for BlankNode.
> I found ntriplesString very useful as it becomes an interoperability point 
> and has (largely) predictable outputs.
> The commons RDF API stays very close to the rdf11-concepts 
> http://www.w3.org/TR/rdf11-concepts/ , which I like. The ntriplesString is 
> however trivial to implement - and almost all implementations are probably 
> going to have something like that anyway. I never liked much that the name 
> doesn't include get - but I guess that is because it is a derived value and 
> might need further calculations.
> The only contentious part is in BlankNode - so perhaps add a specialization 
> of ntriplesString that clarifies the pitfalls here (as we did with equals). 
> The long paragraphs of BlankNode at the top do not currently help to 
> clarify this.
> See the simple implementation of BlankNode for one simple way to deal with 
> those "non-ntriples-valid internal identifiers".
> Always keeping an internal UUID field or similar is another - implementations 
> can decide on what is most natural to their implementations - they probably 
> have dealt with this already, although possibly not within their 
> equivalent of the BlankNode class. The BlankNode is also free to keep an 
> internal reference to the Graph or "local scope" and use that to generate 
> identifiers.
> There is no requirement anywhere for Blank Node identifiers to always be 
> re-generated in serialization - this is simply a liberty that is available. A 
> serializer based on Commons RDF can still do that - it can simply ignore 
> BlankNode.ntriplesString and create a temporary Map from internalIdentifier 
> to b1, b2, etc. I do however not see why we need to REQUIRE a serializer do 
> such an operation - that is taking this API beyond its scope and into "best 
> practice" (in which case we would also deal with prefixes, preserving prefix 
> names, canonicalizing URIs, etc).
> As an example of the current strength, I was able to write an N-triples 
> serializer in simple by just concatenating the ntriplesString of the 
> components from TripleImpl.toString and then just joining with \n:
> This is powerful - if for nothing else it's great for debugging. I am not 
> proposing to add ntriplesString() for Triple, as it might need to be much 
> closer to the Graph - but at least RDFNode.toString() could have a default 
> method that calls ntriplesString() (which is 200 times more useful than 
> LiteralImpl 2bd85b1f529302f9 from Object.toString :) )
> {quote}
> afs:
> {quote}
> Some display string is useful but reading the contract for ntriplesString, it 
> is not (just) for display purposes. c.f. Java toString. There is a difference 
> in escaping. I see that TripleImpl.toString does not do syntax escaping.
> Providing a readable RDFNode.toString() would separate the development display 
> concerns (e.g. no escapes maybe) from serialization concerns.
> Some RDF systems implement blank nodes from a sequence (e.g. Mulgara). 
> Actually that policy can be quite convenient for debugging development.
> We could include N-Triples in commons-rdf but to me v1 should be targeted at 
> "use RDF data". Parsing and serialization is the concern of the 
> implementation. The simple impl is one such example, not a new RDF system (is 
> it?:-)
> {quote}
> ansell:
> {quote}
> I commented on the pull request to remove some of the tests that test or rely 
> on the BlankNode internal identifier structure, particularly that it be a 
> valid n-triples identifier. However, those tests made it into the merged 
> version because it was otherwise basically okay and we are continually 
> evolving anyway so there is no need to have perfect pull requests at this 
> stage. I will review and merge #55 and then work on any further cases that we 
> may not be testing for yet.
> I am all for defining/clarifying the contract for .toString in the API, even 
> if it says that there is no specific escaping or formatting done on the 
> output of .toString.
> Supporting N-Triples in the base API seems to be natural for two reasons to 
> me. Firstly, it is the simplest syntax, so it doesn't require any particular 
> optimisations and Triples can be streamed out without relying on a particular 
> framework or serialiser. Secondly, for a long time it has been the sole 
> established test case format for RDF, although it is defined on its own for 
> RDF-1.1, so it is a natural single serialisation to support.
> As long as the output of ntriplesString is defined to be implementation and 
> local scope specific for BlankNodes (no confusion with IRI or Literal), I am 
> fine with having it. Given the number of times the BlankNode API references 
> "local scope" right now, we are unlikely to have more users commenting that 
> it is unusual than we already have had for the last 10 years with RDF-1.0.
> {quote}

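The one-way implication discussed in the issue (equal blank nodes share an internal string, but a shared internal string does not imply equality) can be illustrated with a minimal sketch. The class ScopedBlankNode, its scope field and its internalIdentifier() method are hypothetical names chosen for this illustration and are not the Commons RDF API; using an object reference as the "local scope" is likewise only an assumption.

```java
import java.util.Objects;

// Hypothetical illustration only: not the Commons RDF BlankNode interface.
final class ScopedBlankNode {
    private final Object scope;        // assumed stand-in for a graph or "local scope"
    private final String internalId;   // the internal string under discussion

    ScopedBlankNode(Object scope, String internalId) {
        this.scope = Objects.requireNonNull(scope);
        this.internalId = Objects.requireNonNull(internalId);
    }

    String internalIdentifier() {
        return internalId;
    }

    // Equal only when both the scope and the internal string match, so
    // equality implies the same internal string, but not the reverse.
    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof ScopedBlankNode)) {
            return false;
        }
        ScopedBlankNode other = (ScopedBlankNode) o;
        return scope == other.scope && internalId.equals(other.internalId);
    }

    @Override
    public int hashCode() {
        return 31 * System.identityHashCode(scope) + internalId.hashCode();
    }
}
```

Under these assumptions, two distinct Java objects built from the same scope and internal string compare equal, while nodes from different scopes stay distinguishable even when their internal strings coincide.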

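The per-file relabelling mentioned in the discussion (a temporary Map from internal identifier to short labels) might look like the following sketch. BlankNodeLabeller and labelFor are hypothetical names, and the "_:b0", "_:b1" form is an assumption that follows the N-Triples convention of prefixing blank node labels with "_:"; a real serializer would obtain the internal string from whatever accessor its blank node type provides.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: assigns short labels that are unique only within
// one serialization run, independent of the nodes' internal strings.
final class BlankNodeLabeller {
    private final Map<String, String> labels = new HashMap<>();

    // Returns the label already assigned to this internal identifier,
    // or allocates the next one in sequence (_:b0, _:b1, ...).
    String labelFor(String internalIdentifier) {
        String label = labels.get(internalIdentifier);
        if (label == null) {
            label = "_:b" + labels.size();
            labels.put(internalIdentifier, label);
        }
        return label;
    }
}
```

One labeller instance would be created per output file, so the labels need to be unique only across that file, as the comment from afs notes.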

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
