Re: Shared prefixes for all named graphs in TDB dataset (JENA-2006)

Andy Seaborne Mon, 13 Sep 2021 03:53:05 -0700

Adding a graph is a copy, of triples and of the prefixes.

The exception is dataset-map-link.


On 13/09/2021 03:47, Holger Knublauch wrote:

To be consistent, wouldn't this mean that as soon as a Graph is added to a dataset, then it should adopt the dataset namespaces too? I am thinking about operations such as when a GraphMem is added to a DatasetGraph using addGraph. But this appears problematic, as Graphs could logically be part of multiple datasets. From the documentation it seems that someone could even add a TDB Graph to another Dataset. So why should TDB behave differently from other datasets?

TDB behaves like other datasets: TDB1, TDB2, TIM (Transaction In Memory), and the basic dataset implementation.


Adding a graph is a copy, of the triples and of the prefixes.

A dataset is isolated - if a graph is added, then there is a graph in the dataset and later altering the source graph does not affect the dataset contents.

This is the difference between DatasetFactory.create() and DatasetFactory.createGeneral().


Dataset-map-link is the exception.

It does not provide the isolation contract.
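
To make the copy-versus-link distinction concrete, here is a small sketch in plain Java. It deliberately does not use the Jena API: ToyGraph, ToyDataset and the copyOnAdd flag are hypothetical names invented for this illustration, and the code only models the two behaviours described above, not how Jena implements them.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustration only: these classes are NOT Jena classes.
class IsolationSketch {
    public static void main(String[] args) {
        ToyGraph source = new ToyGraph();
        source.triples.add("<http://example/s> <http://example/p> \"o1\"");
        source.prefixes.put("ex", "http://example/");

        ToyDataset isolated = new ToyDataset(true);   // add = copy (isolation contract)
        ToyDataset linked = new ToyDataset(false);    // add = link (no isolation)
        isolated.addGraph("g", source);
        linked.addGraph("g", source);

        // Alter the source graph after it has been added to both datasets.
        source.triples.add("<http://example/s> <http://example/p> \"o2\"");

        System.out.println(isolated.getGraph("g").triples.size()); // prints 1
        System.out.println(linked.getGraph("g").triples.size());   // prints 2
    }
}

/** Toy graph: a set of triple strings plus a prefix map. */
final class ToyGraph {
    final Set<String> triples = new HashSet<>();
    final Map<String, String> prefixes = new HashMap<>();

    /** Copy both the triples and the prefixes into a new, independent graph. */
    ToyGraph copy() {
        ToyGraph g = new ToyGraph();
        g.triples.addAll(triples);
        g.prefixes.putAll(prefixes);
        return g;
    }
}

/** Toy dataset whose addGraph either copies the graph or keeps a link to it. */
final class ToyDataset {
    private final boolean copyOnAdd;
    private final Map<String, ToyGraph> graphs = new HashMap<>();

    ToyDataset(boolean copyOnAdd) { this.copyOnAdd = copyOnAdd; }

    void addGraph(String name, ToyGraph g) {
        graphs.put(name, copyOnAdd ? g.copy() : g);
    }

    ToyGraph getGraph(String name) { return graphs.get(name); }
}
```

In this model the copying dataset is unaffected by the later change to the source graph, while the linking dataset sees it immediately.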

Its primary usage is to give a dataset view over graphs from any storage, or over some derived graph, Jena rules being the prototypical example.

It is best used as read-only, or at least not with both graph and dataset updates mixed together. If a graph gets created in DatasetGraphMapLink it defaults to in-memory, so it can be lost, which people run into from time to time.

DatasetFactory.create() could change to be TIM. It should not make a difference except MRSW becomes MR+SW.

Fuseki uses TIM when storing in memory. It takes an assembler to change that.


    Andy

Holger


On 2021-09-10 6:37 am, Andy Seaborne wrote:
On 07/09/2021 12:33, Holger Knublauch wrote:
Having said this, I honestly don't think the limitations of one serialization should be enough to motivate such a drastic change to how prefixes are managed in TDB.
It is not one serialization - it includes JSON-LD, where it looks like new work will include packages of graphs as a dataset. It also makes default union graph work properly and consistently. A dataset is a logical collection of data - shared prefixes make sense and are natural for datasets read/write.
TDB 1 and 2 both have per-graph prefixes. Hardly drastic.
TQ has its own proprietary graph combination and security layer which is not based on or related to RDF datasets and it does not use Jena data access security. Using a single dataset to store many graphs of that graph combination system is local to TQ.
For example it means that if someone changes the prefixes in one graph of the dataset, then she also changes the prefixes for all others, even for graphs that are not supposed to be writable for her. And then what would happen if two graphs are loaded and added from Turtle files where each declares "ex" prefix? All this sounds very fragile and makes the use of such shared-prefixmapping datasets rather limiting - the old design was working just fine.
The triples are kept apart. The second prefix does not change the earlier data.
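
One way to see why a later "ex" declaration leaves earlier data alone: RDF triples hold full IRIs, and a prefix is only used to expand abbreviated names when the text is parsed. The sketch below is plain Java, not Jena code; PrefixSketch, expand and the two example namespaces are hypothetical names made up for the illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustration only: a toy model of prefix expansion at parse time.
class PrefixSketch {
    /** Expand "prefix:local" to a full IRI using the prefix map in force right now. */
    static String expand(Map<String, String> prefixes, String pname) {
        int i = pname.indexOf(':');
        if (i < 0) throw new IllegalArgumentException("Not a prefixed name: " + pname);
        String ns = prefixes.get(pname.substring(0, i));
        if (ns == null) throw new IllegalArgumentException("Unknown prefix in: " + pname);
        return ns + pname.substring(i + 1);
    }

    public static void main(String[] args) {
        Map<String, String> sharedPrefixes = new HashMap<>(); // one map for the whole dataset
        List<String> stored = new ArrayList<>();              // what ends up stored: full IRIs

        // First file declares ex: as http://one.example/
        sharedPrefixes.put("ex", "http://one.example/");
        stored.add(expand(sharedPrefixes, "ex:a"));

        // Second file redeclares ex: with a different namespace.
        sharedPrefixes.put("ex", "http://two.example/");
        stored.add(expand(sharedPrefixes, "ex:a"));

        // The earlier IRI is unchanged; only the abbreviation used for output differs.
        System.out.println(stored); // prints [http://one.example/a, http://two.example/a]
    }
}
```

What does change with a shared prefix map is how the earlier data is abbreviated when it is written out again, not the data itself.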
    Andy
