Adding a graph to a dataset copies both the triples and the prefixes.
The exception is dataset-map-link.
On 13/09/2021 03:47, Holger Knublauch wrote:
To be consistent, wouldn't this mean that as soon as a Graph is added to
a dataset, then it should adopt the dataset namespaces too? I am
thinking about operations such as when a GraphMem is added to a
DatasetGraph using addGraph. But this appears problematic, as Graphs
could logically be part of multiple datasets. From the documentation it
seems that someone could even add a TDB Graph to another Dataset. So why
should TDB behave differently from other datasets?
TDB behaves like other datasets: TDB1, TDB2, TIM (Transaction In
Memory), and the basic dataset implementation.
Adding a graph makes a copy, of the triples and of the prefixes.
A dataset is isolated - once a graph is added, the dataset holds its own
graph, and later changes to the source graph do not affect the dataset
contents.
This is the difference between DatasetFactory.create(), which copies,
and DatasetFactory.createGeneral(), which links to the graphs it is given.
Dataset-map-link is the exception.
It does not provide the isolation contract.
Its primary use is to give a dataset view over graphs from any
storage, or over some derived graph, Jena rules being the prototypical
example.
It is best used read-only, or at least without mixing graph updates and
dataset updates. If a graph gets created in DatasetGraphMapLink, it
defaults to in-memory, so its contents can be lost, which people run
into from time to time.
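
To make the copy-versus-link contract concrete, here is a minimal sketch
in plain Java. It is a toy model, not Jena code: ToyGraph, ToyDataset and
IsolationDemo are hypothetical names, triples are plain strings, and the
prefixes are simply carried along with the copied graph. It only shows
what a later change to the source graph looks like under each behaviour.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy stand-ins only: ToyGraph and ToyDataset are hypothetical, not Jena classes.
final class ToyGraph {
    final Set<String> triples = new HashSet<>();
    final Map<String, String> prefixes = new HashMap<>();
}

final class ToyDataset {
    private final Map<String, ToyGraph> graphs = new HashMap<>();
    private final boolean link; // true = keep a reference, false = copy on add

    ToyDataset(boolean link) {
        this.link = link;
    }

    void addGraph(String name, ToyGraph source) {
        if (link) {
            // Link behaviour: the dataset holds the caller's graph object itself.
            graphs.put(name, source);
            return;
        }
        // Copy behaviour: triples and prefixes are copied at the time of the add.
        ToyGraph copy = new ToyGraph();
        copy.triples.addAll(source.triples);
        copy.prefixes.putAll(source.prefixes);
        graphs.put(name, copy);
    }

    ToyGraph getGraph(String name) {
        return graphs.get(name);
    }
}

final class IsolationDemo {
    // Adds a one-triple graph, then updates the source graph afterwards,
    // and reports how many triples the dataset sees.
    static int triplesSeenAfterSourceUpdate(boolean link) {
        ToyGraph source = new ToyGraph();
        source.prefixes.put("ex", "http://example.org/");
        source.triples.add("ex:s ex:p ex:o1");

        ToyDataset dataset = new ToyDataset(link);
        dataset.addGraph("g", source);

        source.triples.add("ex:s ex:p ex:o2"); // later change to the source graph
        return dataset.getGraph("g").triples.size();
    }

    public static void main(String[] args) {
        System.out.println("copy: " + triplesSeenAfterSourceUpdate(false)); // copy: 1
        System.out.println("link: " + triplesSeenAfterSourceUpdate(true));  // link: 2
    }
}
```

With the copying dataset the later update is invisible (isolation); with
the linking one it shows through, which is why mixing graph and dataset
updates there is easy to get wrong.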
DatasetFactory.create() could change to be TIM. It should not make a
difference except MRSW becomes MR+SW.
Fuseki uses TIM when storing in memory. It takes an assembler to change
that.
Andy
Holger
On 2021-09-10 6:37 am, Andy Seaborne wrote:
On 07/09/2021 12:33, Holger Knublauch wrote:
Having said this, I honestly don't think the limitations of one
serialization should be enough to motivate such a drastic change to
how prefixes are managed in TDB.
It is not one serialization - it includes JSON-LD, where it looks like
new work will include packages of graphs as a dataset. It also makes
the default union graph work properly and consistently. A dataset is a
logical collection of data - shared prefixes make sense and are
natural for reading and writing datasets.
TDB 1 and 2 both have per-graph prefixes. Hardly drastic.
TQ has its own proprietary graph combination and security layer, which
is not based on or related to RDF datasets, and it does not use Jena
data access security. Using a single dataset to store many graphs of
that graph combination system is local to TQ.
For example it means that if someone changes the prefixes in one
graph of the dataset, then she also changes the prefixes for all
others, even for graphs that are not supposed to be writable for her.
And then what would happen if two graphs are loaded and added from
Turtle files where each declares an "ex" prefix? All this sounds very
fragile and makes the use of such shared-prefixmapping datasets
rather limiting - the old design was working just fine.
The triples are kept apart. Prefixed names are expanded to full IRIs
when the data is parsed, so the second "ex" declaration does not change
the earlier data.
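
A small plain-Java sketch of that point, again a toy model and not Jena's
parser or prefix API: PrefixDemo and its load method are hypothetical, and
it assumes that the most recent declaration of a prefix replaces the
earlier one in the shared map. What it shows is that data is stored as
full IRIs, so redeclaring "ex" only changes the abbreviation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy stand-in only: PrefixDemo is hypothetical, not Jena's parser or prefix API.
final class PrefixDemo {
    final Map<String, String> sharedPrefixes = new HashMap<>();
    final List<String> subjects = new ArrayList<>();

    // Simulates loading a file that declares one prefix and uses it once.
    void load(String prefix, String namespace, String localName) {
        sharedPrefixes.put(prefix, namespace); // assumption: the latest declaration wins
        subjects.add(namespace + localName);   // stored as a full IRI, not as "ex:..."
    }

    public static void main(String[] args) {
        PrefixDemo d = new PrefixDemo();
        d.load("ex", "http://a.example/", "thing"); // first file: ex -> a.example
        d.load("ex", "http://b.example/", "thing"); // second file: ex -> b.example
        System.out.println(d.subjects);                 // two distinct IRIs
        System.out.println(d.sharedPrefixes.get("ex")); // only the abbreviation changed
    }
}
```

Under these assumptions the only visible effect of the second declaration
is on how the first file's data is abbreviated when written back out.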
Andy