Re: [MASSMAIL]Re: Splitting data into graphs vs datasets

2018-03-22 Thread DAVID MOLINA ESTRADA
In my opinion, a graph is like a schema in a relational database, and a dataset 
is like a database instance.

Regards,
David 

-Daan Reid <osmo...@dds.nl> wrote: -
To: users@jena.apache.org
From: Daan Reid <osmo...@dds.nl>
Date: 22/03/2018 11:35
Subject: [MASSMAIL]Re: Splitting data into graphs vs datasets

I would say that using separate datasets is a good idea if you have sets 
of graphs that just don't belong together. The dataset as an 
organisational, abstract container is an excellent idea, in my opinion.

Regards,

Daan

On 22-03-18 11:22, Mikael Pesonen wrote:
> Ok seems that using many datasets is not a good idea. I had no bias and 
> not having any issues with speed, just wanted to see what is best way to 
> go.
> 
> On 21.3.2018 20:48, ajs6f wrote:
>>>   Those sure are good reasons for using named graphs. But what about 
>>> using different datasets too?
>> Consider that you may not be seeing such reasons because it may not 
>> actually be as good an idea.
>>
>> Here's another reason to prefer graphs: There is a standard management 
>> HTTP API for named graphs: SPARQL Graph Store. There is no equivalent 
>> for datasets, so each product rolls its own. That's not good for 
>> flexibility if you have to move products.
>>
>> As for performance, that will depend radically on the implementation. 
>> Jena TIM, for example, using hashing for its indexes, so the 
>> difference between having a lot of quads in a dataset and a few isn't 
>> likely to be that much. Other impls will vary.
>>
>> Are you sure that performance is going to be improved by separating 
>> out datasets? (I.e. is that the measured bottleneck?) Are you now 
>> having problems with queries accidentally querying data they shouldn't 
>> see, and can your queries be rewritten to fix that (which might also 
>> improve performance)? (Jena has a permissions framework that can 
>> secure information down to the individual triple.)
>>
>> ajs6f
>>
>>> On Mar 21, 2018, at 6:35 AM, Mikael Pesonen 
>>> <mikael.peso...@lingsoft.fi> wrote:
>>>
>>>
>>> Those sure are good reasons for using named graphs. But what about 
>>> using different datasets too?
>>>
>>> btw, I couldn't find info on how to run many datasets with Fuseki. is 
>>> it just one dataset per fuseki process? -loc parameter for 
>>> fuseki-server.jar?
>>>
>>> Br
>>>
>>> On 20.3.2018 14:22, Martynas Jusevičius wrote:
>>>> Provenance. With named graphs, it's easier to track where data came 
>>>> from:
>>>> who imported it, when etc.
>>>> You can also have meta-graphs about other graphs.
>>>>
>>>> Also editing and updating data. You can load named graph contents (of
>>>> smallish size) in an editor, make changes and then store a new 
>>>> version in
>>>> the same graph. You probably would not want to do this with a large 
>>>> default
>>>> graph.
>>>>
>>>> On Tue, Mar 20, 2018 at 1:16 PM, Mikael Pesonen 
>>>> <mikael.peso...@lingsoft.fi>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I'm using Fuseki GSP, and so far have put all data into one default
>>>>> dataset and using graphs to split it.
>>>>>
>>>>> If I'm right there would be benefits using more than one dataset
>>>>> - better performance - each query is done inside a dataset so less 
>>>>> data =
>>>>> faster query
>>>>> - protection of data - can't "accidentaly" query data from other 
>>>>> datasets
>>>>> Downsides:
>>>>> - combining data from various datasets is heavier task
>>>>>
>>>>> Is this correct? Any other things that should be considered?
>>>>>
>>>>> Thank you
>>>>>
>>>>> -- 
>>>>> Lingsoft - 30 years of Leading Language Management
>>>>>
>>>>> www.lingsoft.fi
>>>>>
>>>>> Speech Applications - Language Management - Translation - Reader's and
>>>>> Writer's Tools - Text Tools - E-books and M-books
>>>>>
>>>>> Mikael Pesonen
>>>>> System Engineer
>>>>>
>>>>> e-mail: mikael.peso...@lingsoft.fi
>>>>> Tel. +358 2 279 3300
>>>>>
>>>>> Time zone: GMT+2
>>>>>
>>>>> Helsinki Office
>>>>> Eteläranta 10
>>>>> FI-00130 Helsinki
>>>>> FINLAND
>>>>>
>>>>> Turku Office
>>>>> Kauppiaskatu 5 A
>>>>> FI-20100 Turku
>>>>> FINLAND
>>>>>
>>>>>
> 




Re: Splitting data into graphs vs datasets

2018-03-22 Thread Daan Reid
I would say that using separate datasets is a good idea if you have sets 
of graphs that just don't belong together. The dataset as an 
organisational, abstract container is an excellent idea, in my opinion.


Regards,

Daan

On 22-03-18 11:22, Mikael Pesonen wrote:
Ok seems that using many datasets is not a good idea. I had no bias and 
not having any issues with speed, just wanted to see what is best way to 
go.


On 21.3.2018 20:48, ajs6f wrote:
  Those sure are good reasons for using named graphs. But what about 
using different datasets too?
Consider that you may not be seeing such reasons because it may not 
actually be as good an idea.


Here's another reason to prefer graphs: There is a standard management 
HTTP API for named graphs: SPARQL Graph Store. There is no equivalent 
for datasets, so each product rolls its own. That's not good for 
flexibility if you have to move products.


As for performance, that will depend radically on the implementation. 
Jena TIM, for example, using hashing for its indexes, so the 
difference between having a lot of quads in a dataset and a few isn't 
likely to be that much. Other impls will vary.


Are you sure that performance is going to be improved by separating 
out datasets? (I.e. is that the measured bottleneck?) Are you now 
having problems with queries accidentally querying data they shouldn't 
see, and can your queries be rewritten to fix that (which might also 
improve performance)? (Jena has a permissions framework that can 
secure information down to the individual triple.)


ajs6f

On Mar 21, 2018, at 6:35 AM, Mikael Pesonen 
 wrote:



Those sure are good reasons for using named graphs. But what about 
using different datasets too?


btw, I couldn't find info on how to run many datasets with Fuseki. is 
it just one dataset per fuseki process? -loc parameter for 
fuseki-server.jar?


Br

On 20.3.2018 14:22, Martynas Jusevičius wrote:
Provenance. With named graphs, it's easier to track where data came 
from:

who imported it, when etc.
You can also have meta-graphs about other graphs.

Also editing and updating data. You can load named graph contents (of
smallish size) in an editor, make changes and then store a new 
version in
the same graph. You probably would not want to do this with a large 
default

graph.

On Tue, Mar 20, 2018 at 1:16 PM, Mikael Pesonen 


wrote:


Hi,

I'm using Fuseki GSP, and so far have put all data into one default
dataset and using graphs to split it.

If I'm right there would be benefits using more than one dataset
- better performance - each query is done inside a dataset so less 
data =

faster query
- protection of data - can't "accidentaly" query data from other 
datasets

Downsides:
- combining data from various datasets is heavier task

Is this correct? Any other things that should be considered?

Thank you






Re: Splitting data into graphs vs datasets

2018-03-22 Thread Mikael Pesonen


OK, it seems that using many datasets is not a good idea. I had no bias and am 
not having any issues with speed; I just wanted to see what the best way to go is.


On 21.3.2018 20:48, ajs6f wrote:

  Those sure are good reasons for using named graphs. But what about using 
different datasets too?

Consider that you may not be seeing such reasons because it may not actually be 
as good an idea.

Here's another reason to prefer graphs: There is a standard management HTTP API 
for named graphs: SPARQL Graph Store. There is no equivalent for datasets, so 
each product rolls its own. That's not good for flexibility if you have to move 
products.

As for performance, that will depend radically on the implementation. Jena TIM, 
for example, using hashing for its indexes, so the difference between having a 
lot of quads in a dataset and a few isn't likely to be that much. Other impls 
will vary.

Are you sure that performance is going to be improved by separating out 
datasets? (I.e. is that the measured bottleneck?) Are you now having problems 
with queries accidentally querying data they shouldn't see, and can your 
queries be rewritten to fix that (which might also improve performance)? (Jena 
has a permissions framework that can secure information down to the individual 
triple.)

ajs6f


On Mar 21, 2018, at 6:35 AM, Mikael Pesonen  wrote:


Those sure are good reasons for using named graphs. But what about using 
different datasets too?

btw, I couldn't find info on how to run many datasets with Fuseki. is it just 
one dataset per fuseki process? -loc parameter for fuseki-server.jar?

Br

On 20.3.2018 14:22, Martynas Jusevičius wrote:

Provenance. With named graphs, it's easier to track where data came from:
who imported it, when etc.
You can also have meta-graphs about other graphs.

Also editing and updating data. You can load named graph contents (of
smallish size) in an editor, make changes and then store a new version in
the same graph. You probably would not want to do this with a large default
graph.

On Tue, Mar 20, 2018 at 1:16 PM, Mikael Pesonen 
wrote:


Hi,

I'm using Fuseki GSP, and so far have put all data into one default
dataset and using graphs to split it.

If I'm right there would be benefits using more than one dataset
- better performance - each query is done inside a dataset so less data =
faster query
- protection of data - can't "accidentaly" query data from other datasets
Downsides:
- combining data from various datasets is heavier task

Is this correct? Any other things that should be considered?

Thank you




Re: Splitting data into graphs vs datasets

2018-03-21 Thread ajs6f
>  Those sure are good reasons for using named graphs. But what about using 
> different datasets too?

Consider that you may not be seeing such reasons because it may not actually be 
as good an idea.

Here's another reason to prefer graphs: There is a standard management HTTP API 
for named graphs: SPARQL Graph Store. There is no equivalent for datasets, so 
each product rolls its own. That's not good for flexibility if you have to move 
products.
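
As a minimal sketch of what the Graph Store protocol looks like from a client, the following builds (without sending) the three basic requests for one named graph. The endpoint URL assumes a dataset called "ds" with a graph store service called "data" on localhost:3030, as in the Fuseki configuration discussed elsewhere in this thread; the graph IRI and the function name are made up for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint: dataset "ds" with a Graph Store service named "data".
GSP_ENDPOINT = "http://localhost:3030/ds/data"


def gsp_request(method, graph_iri, turtle=None):
    """Build (but do not send) a Graph Store Protocol request for one named graph."""
    # The target graph is passed as the ?graph= query parameter.
    url = GSP_ENDPOINT + "?" + urlencode({"graph": graph_iri})
    headers = {}
    body = None
    if turtle is not None:
        body = turtle.encode("utf-8")
        headers["Content-Type"] = "text/turtle; charset=utf-8"
    return Request(url, data=body, headers=headers, method=method)


# Hypothetical graph IRI, for illustration only.
graph = "http://example.org/graphs/import-1"

# PUT replaces the graph contents, GET fetches them, DELETE drops the graph.
replace = gsp_request("PUT", graph, "<http://example.org/s> <http://example.org/p> 1 .")
fetch = gsp_request("GET", graph)
drop = gsp_request("DELETE", graph)

# To actually send one of these: urllib.request.urlopen(replace)
```

Because the protocol is standardized, the same three requests should work against any store that implements it; only the endpoint URL would change.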

As for performance, that will depend radically on the implementation. Jena TIM, 
for example, uses hashing for its indexes, so the difference between having a 
lot of quads in a dataset and a few isn't likely to be that much. Other 
implementations will vary.

Are you sure that performance is going to be improved by separating out 
datasets? (I.e. is that the measured bottleneck?) Are you now having problems 
with queries accidentally querying data they shouldn't see, and can your 
queries be rewritten to fix that (which might also improve performance)? (Jena 
has a permissions framework that can secure information down to the individual 
triple.)
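
One way to rewrite a query so that it cannot match data outside a given graph is to wrap its pattern in a GRAPH clause. This is only a sketch: the helper name and the graph IRI are hypothetical, and the IRI check is a rough guard rather than full IRI validation.

```python
def query_in_graph(graph_iri, pattern):
    """Wrap a graph pattern in GRAPH <iri> { ... } so it only matches that named graph."""
    # Reject characters that are not allowed inside a SPARQL IRI reference.
    if any(ch in graph_iri for ch in '<>"{}|^`\\ '):
        raise ValueError("not a usable IRI: %r" % graph_iri)
    return "SELECT * WHERE { GRAPH <%s> { %s } }" % (graph_iri, pattern)


# Hypothetical graph IRI, for illustration only.
q = query_in_graph("http://example.org/graphs/import-1", "?s ?p ?o")
print(q)
```

With the pattern scoped like this, triples in other named graphs and in the default graph are not matched, which addresses the "accidental" querying concern without splitting the data into separate datasets.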

ajs6f

> On Mar 21, 2018, at 6:35 AM, Mikael Pesonen  
> wrote:
> 
> 
> Those sure are good reasons for using named graphs. But what about using 
> different datasets too?
> 
> btw, I couldn't find info on how to run many datasets with Fuseki. is it just 
> one dataset per fuseki process? -loc parameter for fuseki-server.jar?
> 
> Br
> 
> On 20.3.2018 14:22, Martynas Jusevičius wrote:
>> Provenance. With named graphs, it's easier to track where data came from:
>> who imported it, when etc.
>> You can also have meta-graphs about other graphs.
>> 
>> Also editing and updating data. You can load named graph contents (of
>> smallish size) in an editor, make changes and then store a new version in
>> the same graph. You probably would not want to do this with a large default
>> graph.
>> 
>> On Tue, Mar 20, 2018 at 1:16 PM, Mikael Pesonen 
>> wrote:
>> 
>>> Hi,
>>> 
>>> I'm using Fuseki GSP, and so far have put all data into one default
>>> dataset and using graphs to split it.
>>> 
>>> If I'm right there would be benefits using more than one dataset
>>> - better performance - each query is done inside a dataset so less data =
>>> faster query
>>> - protection of data - can't "accidentaly" query data from other datasets
>>> Downsides:
>>> - combining data from various datasets is heavier task
>>> 
>>> Is this correct? Any other things that should be considered?
>>> 
>>> Thank you
>>> 



Re: Splitting data into graphs vs datasets

2018-03-21 Thread Mikael Pesonen


Is this example from the web page (a bit modified)

config_ds1.ttl:

@prefix fuseki: <http://jena.apache.org/fuseki#> .
@prefix rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:   <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tdb:    <http://jena.hpl.hp.com/2008/tdb#> .
@prefix ja:     <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix :       <#> .

<#service1> rdf:type fuseki:Service ;
    fuseki:name                       "ds" ;      # http://host:port/ds
    fuseki:serviceQuery               "sparql" ;  # SPARQL query service
    fuseki:serviceQuery               "query" ;   # SPARQL query service (alt name)
    fuseki:serviceUpdate              "update" ;  # SPARQL update service
    fuseki:serviceUpload              "upload" ;  # Non-SPARQL upload service
    fuseki:serviceReadWriteGraphStore "data" ;    # SPARQL Graph Store protocol (read and write)
    fuseki:dataset                    <#dataset1> ;
    .

<#dataset1> rdf:type tdb:DatasetTDB ;
    tdb:location "/home/jena_data/dataset1/" ;
    ja:context [ ja:cxtName "arq:queryTimeout" ; ja:cxtValue "1000" ] ;
    .

run with

fuseki-server.jar --update --port 3030 --config config_ds1.ttl

the same as this?

fuseki-server.jar --update --port 3030 --loc=/home/jena_data/dataset1/ /ds



On 21.3.2018 12:46, Rob Vesse wrote:

You can run many datasets by using the --config argument and specifying an 
appropriate configuration file.  This should be used instead of the --loc 
argument which is a convenience short cut to run  a server with a single 
dataset.

http://jena.apache.org/documentation/fuseki2/fuseki-configuration.html

Rob

On 21/03/2018, 10:35, "Mikael Pesonen"  wrote:

 
 Those sure are good reasons for using named graphs. But what about using

 different datasets too?
 
 btw, I couldn't find info on how to run many datasets with Fuseki. is it

 just one dataset per fuseki process? -loc parameter for fuseki-server.jar?
 
 Br
 
 On 20.3.2018 14:22, Martynas Jusevičius wrote:

 > Provenance. With named graphs, it's easier to track where data came from:
 > who imported it, when etc.
 > You can also have meta-graphs about other graphs.
 >
 > Also editing and updating data. You can load named graph contents (of
 > smallish size) in an editor, make changes and then store a new version in
 > the same graph. You probably would not want to do this with a large 
default
 > graph.
 >
 > On Tue, Mar 20, 2018 at 1:16 PM, Mikael Pesonen 

 > wrote:
 >
 >> Hi,
 >>
 >> I'm using Fuseki GSP, and so far have put all data into one default
 >> dataset and using graphs to split it.
 >>
 >> If I'm right there would be benefits using more than one dataset
 >> - better performance - each query is done inside a dataset so less data 
=
 >> faster query
 >> - protection of data - can't "accidentaly" query data from other 
datasets
 >> Downsides:
 >> - combining data from various datasets is heavier task
 >>
 >> Is this correct? Any other things that should be considered?
 >>
 >> Thank you
 >>



Re: Splitting data into graphs vs datasets

2018-03-21 Thread Rob Vesse
You can run many datasets by using the --config argument and specifying an 
appropriate configuration file. This should be used instead of the --loc 
argument, which is a convenience shortcut to run a server with a single 
dataset.

http://jena.apache.org/documentation/fuseki2/fuseki-configuration.html
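
As a rough sketch of what "many datasets in one configuration file" can look like, the following generates Turtle with one fuseki:Service per dataset. It assumes TDB-backed datasets and reuses the service names from the example posted in this thread; the function name, the "ds1"/"ds2" names and the second location are hypothetical.

```python
PREFIXES = """\
@prefix fuseki: <http://jena.apache.org/fuseki#> .
@prefix rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix tdb:    <http://jena.hpl.hp.com/2008/tdb#> .
"""

# One service plus its TDB dataset; {name} and {location} are filled in per dataset.
SERVICE_TEMPLATE = """\
<#service_{name}> rdf:type fuseki:Service ;
    fuseki:name                       "{name}" ;
    fuseki:serviceQuery               "sparql" ;
    fuseki:serviceUpdate              "update" ;
    fuseki:serviceReadWriteGraphStore "data" ;
    fuseki:dataset                    <#dataset_{name}> ;
    .

<#dataset_{name}> rdf:type tdb:DatasetTDB ;
    tdb:location "{location}" ;
    .
"""


def fuseki_config(datasets):
    """Return Turtle text with one fuseki:Service per (name, TDB location) pair."""
    blocks = [SERVICE_TEMPLATE.format(name=n, location=loc)
              for n, loc in datasets.items()]
    return PREFIXES + "\n" + "\n".join(blocks)


config = fuseki_config({
    "ds1": "/home/jena_data/dataset1/",
    "ds2": "/home/jena_data/dataset2/",  # hypothetical second location
})
print(config)
```

Saved to a file and passed with --config, a configuration of this shape would expose each dataset under its own name (here /ds1 and /ds2) from a single server process.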

Rob

On 21/03/2018, 10:35, "Mikael Pesonen"  wrote:


Those sure are good reasons for using named graphs. But what about using 
different datasets too?

btw, I couldn't find info on how to run many datasets with Fuseki. is it 
just one dataset per fuseki process? -loc parameter for fuseki-server.jar?

Br

On 20.3.2018 14:22, Martynas Jusevičius wrote:
> Provenance. With named graphs, it's easier to track where data came from:
> who imported it, when etc.
> You can also have meta-graphs about other graphs.
>
> Also editing and updating data. You can load named graph contents (of
> smallish size) in an editor, make changes and then store a new version in
> the same graph. You probably would not want to do this with a large 
default
> graph.
>
> On Tue, Mar 20, 2018 at 1:16 PM, Mikael Pesonen 

> wrote:
>
>> Hi,
>>
>> I'm using Fuseki GSP, and so far have put all data into one default
>> dataset and using graphs to split it.
>>
>> If I'm right there would be benefits using more than one dataset
>> - better performance - each query is done inside a dataset so less data =
>> faster query
>> - protection of data - can't "accidentaly" query data from other datasets
>> Downsides:
>> - combining data from various datasets is heavier task
>>
>> Is this correct? Any other things that should be considered?
>>
>> Thank you
>>








Re: Splitting data into graphs vs datasets

2018-03-21 Thread Mikael Pesonen


Those sure are good reasons for using named graphs. But what about using 
different datasets too?


By the way, I couldn't find info on how to run many datasets with Fuseki. Is it 
just one dataset per Fuseki process, with the -loc parameter for fuseki-server.jar?


Br

On 20.3.2018 14:22, Martynas Jusevičius wrote:

Provenance. With named graphs, it's easier to track where data came from:
who imported it, when etc.
You can also have meta-graphs about other graphs.

Also editing and updating data. You can load named graph contents (of
smallish size) in an editor, make changes and then store a new version in
the same graph. You probably would not want to do this with a large default
graph.

On Tue, Mar 20, 2018 at 1:16 PM, Mikael Pesonen 
wrote:


Hi,

I'm using Fuseki GSP, and so far have put all data into one default
dataset and using graphs to split it.

If I'm right there would be benefits using more than one dataset
- better performance - each query is done inside a dataset so less data =
faster query
- protection of data - can't "accidentaly" query data from other datasets
Downsides:
- combining data from various datasets is heavier task

Is this correct? Any other things that should be considered?

Thank you




Re: Splitting data into graphs vs datasets

2018-03-20 Thread Martynas Jusevičius
Provenance. With named graphs, it's easier to track where data came from:
who imported it, when etc.
You can also have meta-graphs about other graphs.

Also editing and updating data. You can load named graph contents (of
smallish size) in an editor, make changes and then store a new version in
the same graph. You probably would not want to do this with a large default
graph.

On Tue, Mar 20, 2018 at 1:16 PM, Mikael Pesonen 
wrote:

>
> Hi,
>
> I'm using Fuseki GSP, and so far have put all data into one default
> dataset and using graphs to split it.
>
> If I'm right there would be benefits using more than one dataset
> - better performance - each query is done inside a dataset so less data =
> faster query
> - protection of data - can't "accidentally" query data from other datasets
> Downsides:
> - combining data from various datasets is heavier task
>
> Is this correct? Any other things that should be considered?
>
> Thank you
>
>
>