In my opinion, a graph is like a schema in a relational database, and a dataset 
is like a database instance.

Regards,
David 

-----Daan Reid <osmo...@dds.nl> wrote: -----
To: users@jena.apache.org
From: Daan Reid <osmo...@dds.nl>
Date: 22/03/2018 11:35
Subject: [MASSMAIL]Re: Splitting data into graphs vs datasets

I would say that using separate datasets is a good idea if you have sets 
of graphs that just don't belong together. The dataset as an 
organisational, abstract container is an excellent idea, in my opinion.

Regards,

Daan

On 22-03-18 11:22, Mikael Pesonen wrote:
> OK, it seems that using many datasets is not a good idea. I had no bias and 
> am not having any issues with speed; I just wanted to see what the best way 
> to go is.
> 
> On 21.3.2018 20:48, ajs6f wrote:
>>>   Those sure are good reasons for using named graphs. But what about 
>>> using different datasets too?
>> Consider that you may not be seeing such reasons because it may not 
>> actually be as good an idea.
>>
>> Here's another reason to prefer graphs: There is a standard management 
>> HTTP API for named graphs: SPARQL Graph Store. There is no equivalent 
>> for datasets, so each product rolls its own. That's not good for 
>> flexibility if you have to move products.
>>
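As an illustration of the Graph Store point above, here is a minimal Python sketch that builds (but does not send) SPARQL 1.1 Graph Store HTTP Protocol requests. The `?graph=<IRI>` and `?default` query forms come from the protocol itself; the endpoint URL, the example graph IRI and the helper name `gsp_request` are assumptions made for this sketch, so adjust them to the server actually in use.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed Fuseki-style Graph Store endpoint; not taken from the thread.
GSP_ENDPOINT = "http://localhost:3030/ds/data"


def gsp_request(method, graph_iri=None, body=None, content_type="text/turtle"):
    """Build (but do not send) a Graph Store Protocol request.

    graph_iri=None addresses the default graph (?default);
    otherwise the named graph is addressed with ?graph=<encoded IRI>.
    """
    query = "default" if graph_iri is None else urlencode({"graph": graph_iri})
    if body is not None:
        headers = {"Content-Type": content_type}
        data = body.encode("utf-8")
    else:
        headers = {"Accept": content_type}
        data = None
    return Request(f"{GSP_ENDPOINT}?{query}", data=data, headers=headers, method=method)


# Hypothetical graph IRI: replace one named graph (PUT), then read it back (GET).
GRAPH = "http://example.org/graphs/import-1"
put = gsp_request("PUT", GRAPH, body="<http://example.org/s> <http://example.org/p> 1 .")
get = gsp_request("GET", GRAPH)
print(put.get_method(), put.full_url)
```

Passing either request to `urllib.request.urlopen` would send it. Because the same calls should work against any store that implements the protocol, only the endpoint constant would need to change when moving products.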
>> As for performance, that will depend radically on the implementation. 
>> Jena TIM, for example, uses hashing for its indexes, so the 
>> difference between having a lot of quads in a dataset and a few isn't 
>> likely to be that much. Other implementations will vary.
>>
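To make the hashing argument concrete, here is a toy Python sketch of a quad store indexed by graph name with a hash map. It is not how Jena TIM is actually implemented; the class `ToyQuadStore` and all names in it are hypothetical. It only illustrates the general idea that a hash lookup for one graph does not have to scan the other quads in the dataset.

```python
from collections import defaultdict


class ToyQuadStore:
    """Toy quad store with a hash index keyed by graph name (illustration only)."""

    def __init__(self):
        # graph name -> set of (subject, predicate, object) triples
        self._by_graph = defaultdict(set)

    def add(self, graph, s, p, o):
        self._by_graph[graph].add((s, p, o))

    def triples(self, graph):
        # A single hash lookup; the other graphs are never scanned.
        return self._by_graph.get(graph, set())


store = ToyQuadStore()
store.add("g:small", "ex:a", "ex:p", "ex:b")

# Add many quads in other graphs of the same dataset.
for i in range(10_000):
    store.add(f"g:bulk{i % 100}", f"ex:s{i}", "ex:p", "ex:o")

print(len(store.triples("g:small")))
```

Whether a real store behaves this way depends on its index design, which is why measuring the actual bottleneck matters before splitting data into datasets.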
>> Are you sure that performance is going to be improved by separating 
>> out datasets? (I.e. is that the measured bottleneck?) Are you now 
>> having problems with queries accidentally querying data they shouldn't 
>> see, and can your queries be rewritten to fix that (which might also 
>> improve performance)? (Jena has a permissions framework that can 
>> secure information down to the individual triple.)
>>
>> ajs6f
>>
>>> On Mar 21, 2018, at 6:35 AM, Mikael Pesonen 
>>> <mikael.peso...@lingsoft.fi> wrote:
>>>
>>>
>>> Those sure are good reasons for using named graphs. But what about 
>>> using different datasets too?
>>>
>>> By the way, I couldn't find info on how to run many datasets with Fuseki. Is 
>>> it just one dataset per Fuseki process, using the -loc parameter for 
>>> fuseki-server.jar?
>>>
>>> Br
>>>
>>> On 20.3.2018 14:22, Martynas Jusevičius wrote:
>>>> Provenance. With named graphs, it's easier to track where data came from:
>>>> who imported it, when, etc.
>>>> You can also have meta-graphs about other graphs.
>>>>
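A small Python sketch of the meta-graph idea, modelling a dataset as a set of (graph, subject, predicate, object) quads. All example IRIs, the timestamp and the helper `provenance_of` are hypothetical; the two `prov:` predicates are taken from the W3C PROV-O vocabulary, which is one possible choice and not something the thread prescribes.

```python
# A dataset modelled as a set of (graph, subject, predicate, object) quads.
DATA_GRAPH = "http://example.org/graphs/import-1"   # hypothetical data graph
META_GRAPH = "http://example.org/graphs/meta"       # hypothetical meta-graph

PROV_ATTRIBUTED = "http://www.w3.org/ns/prov#wasAttributedTo"
PROV_GENERATED = "http://www.w3.org/ns/prov#generatedAtTime"

quads = {
    # Imported data lives in its own named graph.
    (DATA_GRAPH, "http://example.org/alice", "http://xmlns.com/foaf/0.1/name", '"Alice"'),
    # The meta-graph holds statements whose subject is the data graph itself.
    (META_GRAPH, DATA_GRAPH, PROV_ATTRIBUTED, "http://example.org/users/importer"),
    (META_GRAPH, DATA_GRAPH, PROV_GENERATED, '"2024-01-01T00:00:00Z"'),
}


def provenance_of(graph_iri):
    """Return {predicate: object} recorded about graph_iri in the meta-graph."""
    return {p: o for g, s, p, o in quads if g == META_GRAPH and s == graph_iri}


info = provenance_of(DATA_GRAPH)
print(info[PROV_ATTRIBUTED])
```

The key point is that the graph name doubles as a subject, so "who imported it, when" becomes ordinary data that can be queried like anything else.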
>>>> Also editing and updating data. You can load named graph contents (of
>>>> smallish size) in an editor, make changes and then store a new version in
>>>> the same graph. You probably would not want to do this with a large default
>>>> graph.
>>>>
>>>> On Tue, Mar 20, 2018 at 1:16 PM, Mikael Pesonen 
>>>> <mikael.peso...@lingsoft.fi>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I'm using Fuseki GSP, and so far have put all data into one default
>>>>> dataset, using graphs to split it.
>>>>>
>>>>> If I'm right, there would be benefits to using more than one dataset:
>>>>> - better performance - each query is done inside a dataset, so less data =
>>>>> faster query
>>>>> - protection of data - can't "accidentally" query data from other datasets
>>>>> Downsides:
>>>>> - combining data from various datasets is a heavier task
>>>>>
>>>>> Is this correct? Are there any other things that should be considered?
>>>>>
>>>>> Thank you
>>>>>
>>>>> -- 
>>>>> Lingsoft - 30 years of Leading Language Management
>>>>>
>>>>> www.lingsoft.fi
>>>>>
>>>>> Speech Applications - Language Management - Translation - Reader's and
>>>>> Writer's Tools - Text Tools - E-books and M-books
>>>>>
>>>>> Mikael Pesonen
>>>>> System Engineer
>>>>>
>>>>> e-mail: mikael.peso...@lingsoft.fi
>>>>> Tel. +358 2 279 3300
>>>>>
>>>>> Time zone: GMT+2
>>>>>
>>>>> Helsinki Office
>>>>> Eteläranta 10
>>>>> FI-00130 Helsinki
>>>>> FINLAND
>>>>>
>>>>> Turku Office
>>>>> Kauppiaskatu 5 A
>>>>> FI-20100 Turku
>>>>> FINLAND
>>>>>
>>>>>
> 
