Adding a graph is a copy, of the triples and of the prefixes.

The exception is dataset-map-link.


On 13/09/2021 03:47, Holger Knublauch wrote:
To be consistent, wouldn't this mean that as soon as a Graph is added to a dataset, then it should adopt the dataset namespaces too? I am thinking about operations such as when a GraphMem is added to a DatasetGraph using addGraph. But this appears problematic, as Graphs could logically be part of multiple datasets. From the documentation it seems that someone could even add a TDB Graph to another Dataset. So why should TDB behave differently from other datasets?

TDB behaves like other datasets: TDB1, TDB2, TIM (Transactional In-Memory), and the basic dataset implementation.

Adding a graph is a copy, of the triples and of the prefixes.

A dataset is isolated - if a graph is added, then there is a graph in the dataset and later altering the source graph does not affect the dataset contents.

This is the difference between DatasetFactory.create() and DatasetFactory.createGeneral().

Dataset-map-link is the exception.

It does not provide the isolation contract.

Its primary usage is to give a dataset view over graphs from any storage, or over some derived graph, Jena rules being the prototypical example.

It is best used as read-only, or at least not with both graph and dataset updates mixed together. If a graph gets created in DatasetGraphMapLink, it defaults to in-memory, so it can be lost, which people run into from time to time.
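
To make the copy versus link distinction above concrete, here is a minimal toy model in plain Java. It is not Jena code: ToyGraph, CopyingDataset and LinkingDataset are hypothetical names invented for this sketch, and triples are plain strings. CopyingDataset stands in for the isolation contract, LinkingDataset for the dataset-map-link behaviour.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class IsolationSketch {
    /** Hypothetical stand-in for a graph: a set of triples written as strings. */
    static final class ToyGraph {
        final Set<String> triples = new HashSet<>();
    }

    /** Copy-on-add: the dataset keeps its own copy, so it is isolated from the source. */
    static final class CopyingDataset {
        private final Map<String, ToyGraph> graphs = new HashMap<>();

        void addGraph(String name, ToyGraph source) {
            ToyGraph copy = new ToyGraph();
            copy.triples.addAll(source.triples);
            graphs.put(name, copy);
        }

        ToyGraph getGraph(String name) {
            return graphs.get(name);
        }
    }

    /** Link-on-add: the dataset holds a reference, so there is no isolation. */
    static final class LinkingDataset {
        private final Map<String, ToyGraph> graphs = new HashMap<>();

        void addGraph(String name, ToyGraph source) {
            graphs.put(name, source);
        }

        ToyGraph getGraph(String name) {
            return graphs.get(name);
        }
    }

    /** Adds one source graph to both datasets, then alters the source afterwards. */
    static int[] sizesAfterSourceChange() {
        ToyGraph source = new ToyGraph();
        source.triples.add("<s> <p> \"v1\"");

        CopyingDataset copying = new CopyingDataset();
        LinkingDataset linking = new LinkingDataset();
        copying.addGraph("g", source);
        linking.addGraph("g", source);

        // Later change to the source graph, made after it was added.
        source.triples.add("<s> <p> \"v2\"");

        return new int[] {
            copying.getGraph("g").triples.size(),
            linking.getGraph("g").triples.size()
        };
    }

    public static void main(String[] args) {
        int[] sizes = sizesAfterSourceChange();
        System.out.println("copying dataset sees " + sizes[0] + " triple(s)");
        System.out.println("linking dataset sees " + sizes[1] + " triple(s)");
    }
}
```

In this toy, the later change to the source shows up only through the linking dataset, which is why mixing graph and dataset updates on a linked dataset is easy to get wrong.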

DatasetFactory.create() could change to be TIM. It should not make a difference except MRSW becomes MR+SW.

Fuseki uses TIM when storing in memory. It takes an assembler to change that.

    Andy


Holger


On 2021-09-10 6:37 am, Andy Seaborne wrote:


On 07/09/2021 12:33, Holger Knublauch wrote:
Having said this, I honestly don't think the limitations of one serialization should be enough to motivate such a drastic change to how prefixes are managed in TDB.

It is not one serialization - it includes JSON-LD, where it looks like new work will include packages of graphs as a dataset. It also makes the default union graph work properly and consistently. A dataset is a logical collection of data - shared prefixes make sense and are natural for reading and writing datasets.

TDB 1 and 2 both have per-graph prefixes. Hardly drastic.

TQ has its own proprietary graph combination and security layer, which is not based on or related to RDF datasets, and it does not use Jena data access security. Using a single dataset to store many graphs of that graph combination system is local to TQ.

For example, it means that if someone changes the prefixes in one graph of the dataset, then she also changes the prefixes for all others, even for graphs that are not supposed to be writable for her. And then what would happen if two graphs are loaded and added from Turtle files where each declares an "ex" prefix? All this sounds very fragile and makes the use of such shared-prefixmapping datasets rather limiting - the old design was working just fine.

The triples are kept apart. The second prefix does not change the earlier data.

    Andy
