Hi Stian and Reto,

Blank nodes are hard to support within a single system, and they are fairly close to unsupportable in a fully general one. However, within a system that has RDF-1.1 as its theoretical basis, the W3C spec defines the mapping functions necessary to define equivalence between graphs (but does not say how translation should work in practice). Hence the discussion, and a long contract, to come to agreement on something that is consistent with the W3C specs but extends them where necessary to make blank nodes work across the JVM.
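As a concrete (purely illustrative, not from the spec text) example of what that equivalence means, the following two single-triple N-Triples documents describe isomorphic graphs, because the mapping _:a -> _:b makes them identical even though the labels differ:

_:a <http://example.org/p> "v" .
_:b <http://example.org/p> "v" .

The label only has meaning within the document it appears in, which is why consistently (re)assigning it at parse time matters so much.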
Part of this issue is that while it is necessary to expose some internally unique information about the BlankNode, the concrete syntax (or the Java object, for intra-VM translation) may not have assigned any identifier to the BlankNode. N-Triples, for instance, must know about an identifier to serialise a Triple independently of the context of a Graph. Hence we are trying to converge on a method for consistently assigning labels to blank nodes based on the parser (sorry if the JVM-wide local scope comment confused you; the local scope probably needs to be smaller than that, at either the individual document parse level or the Graph level).

Some of the use cases that we are trying to support are:

1. The same document parsed using the same parser implementation into the same graph may generate BlankNode objects that are .equals(), and if they are .equals() the .hashCode() must be the same.

2. The same document parsed using the same parser implementation into two different graphs must generate BlankNode objects that are not .equals() and hopefully do not have the same .hashCode().

3. Two different documents parsed using the same parser implementation into the same graph must generate BlankNode objects that are not .equals() and have different .hashCode() results. This includes cases where the concrete syntax contained the same label for the blank node.

4. The same document parsed using different parser implementations into two different graphs must generate BlankNode objects that are not .equals() and hopefully do not have the same .hashCode().

5. Two different documents parsed using different parser implementations may then be transferred into the same graph, and the BlankNode objects inside of the graph must not be .equals() if they came from different physical documents, even if the concrete syntax contained the same label for the blank node.

Andy has also brought up the possibility of round-tripping in addition to those requirements. I.e., a BlankNode from one graph could be inserted into another graph, and after some time it should be possible to put it back into the first graph and have it operate as if it had never been moved out. The current proposal doesn't allow for that and I am not sure what would be required for it to work. In addition, it is hoped that all of the objects in the system could be immutable within a graph.

We have not discussed trimming graphs previously. I have never come at RDF with the requirement of being able to remove triples, but I may have had a limited set of use cases. Is there a use case for that automatic trimming that could not easily be satisfied using a rules engine? Any automatic removal of triples is outside of what I envisioned the scope of Commons RDF to be, and it hasn't been brought up by anyone else. Even if RDF theory allows for it in some corner case, it is not a general requirement and is not generally used or asked for in my experience.

I am fairly ambivalent on the case for internalIdentifier being substitutable for .toString, but currently we need to work out a consistent way to identify the local scope, and that scope could be used in conjunction with either internalIdentifier or toString if both have the same contract in practice. What we are endeavouring to do is transfer BlankNodes between implementations inside of the JVM and keep their general identity (and round-tripping adds another level of difficulty on top of that); a rough sketch of one way the scope and label could fit together is below.
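To make that concrete, here is a purely illustrative sketch, not the proposed Commons RDF API: the ScopedBlankNode and Scope names, and the idea of a per-parse random UUID, are my own assumptions about one way a parser-assigned local scope could be combined with the label from the concrete syntax to get the equals/hashCode behaviour in the use cases above.

import java.util.Objects;
import java.util.UUID;

/*
 * Purely illustrative sketch -- not the proposed Commons RDF API.
 * Assumes one "local scope" per parse run (or per programmatically
 * created graph), identified by a random UUID.
 */
public final class ScopedBlankNode {

    /** A fresh Scope would be created for each parse of a document into a graph. */
    public static final class Scope {
        private final UUID id = UUID.randomUUID();
    }

    private final UUID scopeId;
    private final String label;

    public ScopedBlankNode(Scope scope, String label) {
        this.scopeId = scope.id;
        this.label = label;
    }

    /** Unique across the JVM because the scope is; the label alone is not. */
    public String internalIdentifier() {
        return scopeId + ":" + label;
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof ScopedBlankNode)) {
            return false;
        }
        ScopedBlankNode that = (ScopedBlankNode) other;
        return scopeId.equals(that.scopeId) && label.equals(that.label);
    }

    @Override
    public int hashCode() {
        return Objects.hash(scopeId, label);
    }
}

With one Scope per parse of a document into a graph, use case 1 holds (the same label within one parse yields equal nodes with equal hash codes), and use cases 2 to 5 hold because every other combination of document, parser and graph gets a fresh scope, so the nodes never compare equal even when the concrete syntax reused the same label. Round-tripping, as Andy raised, would need something extra on top of this, such as the receiving graph preserving the original scope rather than minting a new one. I have deliberately not overridden .toString() in the sketch, so an implementation could keep its existing behaviour and expose the scope-plus-label form only through a separate method.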
If we just rely on .toString then we may need to embed the local scope information into the resulting string, so the two pieces of information would be compressed into one, which may not be ideal in the end. In a broader sense, it would be great if the new Commons RDF API didn't enforce restrictions on .toString, which already has consistent meanings in each of the implementations; unique new methods give more flexibility there.

Thanks,

Peter