--- slightly unfinished message sent by accident before --

Hi Rupert,

Good to have this discussion.

> I had always the impression that Graphs are read-only
> TripleCollections and MGraphs are read- and writeable
> TripleCollections. I had never through about immutable Graphs,
> read-only MGraphs and read- and writeable MGraphs.

The thing to use in most situations is indeed the MGraph; this is what corresponds to a Graph and a Model in Jena and to a Context in Sesame. Another name could be GraphChangingOverTime. The RDF Semantics has another notion of graph: two graphs are identical iff they are isomorphic. Clearly two Jena models are not the same just because they happen to be isomorphic at some point in time.

That a triple collection is an MGraph does not (unintuitively) mean that you can actually modify it. On the one hand you might lack the necessary permissions; on the other hand the graph might mutate for reasons other than a direct modification of it. An MGraph might describe the value of a stock: it is not a fixed Graph but something that changes over time, yet adding and removing triples is not supported.

Graphs are typically useful for small triple collections: self-contained molecules of information that can, for example, be signed and added to (Hash)Sets.

> What are the use cases for immutable graphs in Clerezza? It it really
> important to have immutable Graphs?

I think for an RDF API it makes sense to have this basic element of the RDF specifications available as a core class.
Even if there are more evident use cases for MGraph, there are some situations where Graphs have practical value:

- Doing RDF synchronization (RDFSync): what you sync is a set of MSGs (minimum self-contained graphs), which are Graphs
- Similarly for diffs and versioning: the units to deal with are not triples (when there are bnodes) but small subgraphs
- Computing an ETag in HTTP: the hash of the Graph can be used for it
- Digital signing: signing a mutable graph is not what you want

> Because creating those is really
> expensive (look at the MGraph#getGraph() implementations that create
> an in-memory copy of the MGrpah in an SimpleGraph instance). I have
> already written about that on the list [1] but at that time I was not
> aware about the reason for that and also the follow up discussion
> missed to come up with the reason for that.

Missed that thread.

> I imagine that a lot of users do call MGraph#getGraph() without
> realizing that this would clone all the data in the MGrpah.

Implementations can be smarter than that and clone the data only if the mutable graph is modified after getGraph() has been called. This means that one can use an MGraph to add the triples and return mGraph.getGraph() without the triples being duplicated. A really clever implementation keeps weak references to the Graphs returned since the last change and only duplicates the data on a modification when one of those Graphs is still referenced. This would mean that the following would never (see limitation below) cause the data to be duplicated:

    Lock readLock = mGraph.getLock().readLock();
    readLock.lock();
    try {
        if (mGraph.getGraph().equals(referenceGraph)) {
            alert("you did it!");
        }
    } finally {
        readLock.unlock();
    }

As isomorphism is an expensive operation, it might be better to duplicate the graph rather than keep code that wants to add a triple waiting.
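
To make the "clone only on modification" idea above concrete, here is a minimal sketch. It is only an illustration under assumptions: CowMGraph is a hypothetical class, not part of Clerezza, triples are plain strings, and the weak-reference refinement is left out, so the data is copied on the first modification after any snapshot has been handed out.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for an MGraph; triples are plain strings to keep the sketch short. */
class CowMGraph {
    private Set<String> triples = new HashSet<>();
    // true while a snapshot handed out by getGraph() may still share 'triples'
    private boolean shared = false;
    // counts how often the data was duplicated (for illustration only)
    int copies = 0;

    /** Returns a read-only snapshot without copying any data. */
    synchronized Set<String> getGraph() {
        shared = true;
        return Collections.unmodifiableSet(triples);
    }

    synchronized boolean add(String triple) {
        detachIfShared();
        return triples.add(triple);
    }

    synchronized boolean remove(String triple) {
        detachIfShared();
        return triples.remove(triple);
    }

    /** Copies the data once, so that earlier snapshots keep seeing the old state. */
    private void detachIfShared() {
        if (shared) {
            triples = new HashSet<>(triples);
            shared = false;
            copies++;
        }
    }
}
```

With this, filling the graph and then returning getGraph() copies nothing; a copy is made only if the mutable graph is changed afterwards, and the snapshot handed out earlier stays unchanged.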

The following would duplicate the graph only (see limitation below) if a triple is added while isomorphism is being computed:

    if (mGraph.getGraph().equals(referenceGraph)) {
        alert("you did it!");
    }

Limitation: Looking at the javadoc I realize that, as long as garbage collection doesn't happen, it does not seem possible to find out that an instance no longer has a reference to it. "An entry in a WeakHashMap will automatically be removed when its key is no longer in ordinary use" sounds good; however, the further details of the WeakHashMap and WeakReference APIs indicate that the reference is cleared and queued only when the garbage collector detects it, which is probably not the very instant in which the object becomes eligible for garbage collection.

> Changing the SingleTdbDatasetTcProvider so that the union-grpah is
> exposed as read-only MGrpah is really not a big deal. I am just
> wondering if I am the only one that uses the Graph interface different
> as the Javadoc says. In any case I would add a big WARNING to the
> MGrpah#getGraph() method saying that calling this method will create a
> copy (and not a read-only wrapper) of the MGraph.

I think the warning should say that the data will typically (not if the backend supports versioning) be duplicated as soon as a triple is added to or removed from the MGraph.

Cheers,
Reto

> best
> Rupert
>
> [1]
> http://mail-archives.apache.org/mod_mbox/incubator-clerezza-dev/201203.mbox/%3c1f7adf98-d5f7-47f2-be72-fc248b921...@gmail.com%3E
>
>> Cheers,
>> Reto
>
> --
> | Rupert Westenthaler rupert.westentha...@gmail.com
> | Bodenlehenstraße 11 ++43-699-11108907
> | A-5500 Bischofshofen