Re: TDB triple storage
On 26/07/16 15:10, Chao Wang wrote:
> You are right about the reasoner. I used GenericRuleReasoner and loaded a
> few rules from an external file. The statement
> reasoner.setOWLTranslation(true) is the cause of the issue. Not sure what
> it does.

It's a horrible, horrible hack that finds all explicit owl:intersectionOf groups in the data and inserts a set of forward and backward rules for each group (to compute and recognize the relevant subClassOf deductions). It is used by the OWL reasoners but shouldn't normally be used with your own rule sets.

Dave
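For reference, loading rules from an external file into a GenericRuleReasoner, without the OWL translation switch Dave warns about, might look like the sketch below. The file names `rules.txt` and `data.ttl` are illustrative, not from the thread:

```java
import org.apache.jena.rdf.model.InfModel;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.reasoner.rulesys.GenericRuleReasoner;
import org.apache.jena.reasoner.rulesys.Rule;

import java.util.List;

public class RuleReasonerSketch {
    public static void main(String[] args) {
        // Load rules from an external rule file.
        List<Rule> rules = Rule.rulesFromURL("file:rules.txt");
        GenericRuleReasoner reasoner = new GenericRuleReasoner(rules);
        reasoner.setMode(GenericRuleReasoner.HYBRID);
        // Deliberately NOT calling reasoner.setOWLTranslation(true):
        // that switch injects extra owl:intersectionOf rules intended
        // for the built-in OWL reasoners, not for custom rule sets.

        Model data = ModelFactory.createDefaultModel();
        data.read("data.ttl");
        InfModel inf = ModelFactory.createInfModel(reasoner, data);
        inf.listStatements().forEachRemaining(System.out::println);
    }
}
```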
Re: TDB triple storage
You are right about the reasoner. I used GenericRuleReasoner and loaded a few rules from an external file. The statement reasoner.setOWLTranslation(true) is the cause of the issue. Not sure what it does.

On 7/26/16, 7:42 AM, "Andy Seaborne" <a...@apache.org> wrote:
>On 26/07/16 12:08, Chao Wang wrote:
>> Changed code to use RDFFormat.TURTLE_BLOCKS, set -Xmx8192m on a 16g i7 laptop.
>> Still getting an out-of-memory error after running for a while. Any suggestions?
>
>A complete, minimal example. That is, something someone else can run,
>and just large enough to illustrate the issue.
>
>Also details of which version of Jena, and which OS.
>
>The reasoner setup is probably a factor.
>
>    Andy
Re: TDB triple storage
On 26/07/16 12:08, Chao Wang wrote:
> Changed code to use RDFFormat.TURTLE_BLOCKS, set -Xmx8192m on a 16g i7
> laptop. Still getting an out-of-memory error after running for a while.
> Any suggestions?

A complete, minimal example. That is, something someone else can run, and just large enough to illustrate the issue.

Also details of which version of Jena, and which OS.

The reasoner setup is probably a factor.

    Andy
Re: TDB triple storage
Changed code to use RDFFormat.TURTLE_BLOCKS, set -Xmx8192m on a 16g i7 laptop.
Still getting an out-of-memory error after running for a while. Any suggestions?

On 7/25/16, 4:41 PM, "Andy Seaborne" <a...@apache.org> wrote:
>On 25/07/16 21:14, Chao Wang wrote:
>> Hi Dave,
>> As you suggested, I have computed the closure in memory, totaling over 4
>> million triples, and am trying to serialize it.
>> Is there a direct API to serialize the whole model into TDB?
>> Tried to serialize into a file; keep getting memory issues. What's the
>> typical resource need for this size of model?
>
>If you are getting problems as you write out the file, try using one of
>the streaming formats. The default format for RDF/XML or Turtle is
>"pretty" and takes a significant amount of working space for analysis
>before writing.
>
>Some streaming output formats are:
>
>Lang.NTRIPLES
>RDFFormat.TURTLE_BLOCKS
>
>https://jena.apache.org/documentation/io/rdf-output.html
>
>Or does it fail during writing, after some output?
>
>    Andy
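Andy's streaming-format advice can be sketched as below: write with a streamed format such as N-Triples so triples go out as they are visited, instead of the "pretty" writers' whole-model analysis. The output file name `closure.nt` is illustrative:

```java
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.riot.RDFDataMgr;
import org.apache.jena.riot.RDFFormat;

import java.io.FileOutputStream;
import java.io.OutputStream;

public class StreamingWriteSketch {
    public static void main(String[] args) throws Exception {
        Model closure = ModelFactory.createDefaultModel();
        // ... populate 'closure' with the computed inference closure ...

        // NTRIPLES (like TURTLE_BLOCKS) is a streaming writer: it emits
        // triples as it goes, so memory use stays flat even for a model
        // of several million triples.
        try (OutputStream out = new FileOutputStream("closure.nt")) {
            RDFDataMgr.write(out, closure, RDFFormat.NTRIPLES);
        }
    }
}
```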
RE: TDB triple storage
Hi Dave,
As you suggested, I have computed the closure in memory, totaling over 4 million triples, and am trying to serialize it.
Is there a direct API to serialize the whole model into TDB?
I tried to serialize it into a file, but keep getting memory issues. What's the typical resource need for this size of model?

From: Dave Reynolds [dave.e.reyno...@gmail.com]
Sent: Thursday, July 21, 2016 9:09 AM
To: users@jena.apache.org
Subject: Re: TDB triple storage

On 21/07/16 13:45, Chao Wang wrote:
> Thanks Dave,
> So my Fuseki configuration uses TDB with the OWL reasoner. I preloaded the
> TDB with tdbloader, then started up Fuseki.
> My question is: when Fuseki starts up, does it load all triples, including
> inferred triples, into memory?

Yes. It's actually slightly worse than that. All the inferences will be in memory (including intermediate state), which will be bigger than the source data. But the data itself isn't loaded explicitly, which means the reasoner is going back to TDB for each query, which is a further slowdown.

Using a lighter reasoner config (OWL Micro, if you are not already using it) may help.

Otherwise, if your data is stable, then as I say, compute the closure once in memory, offline. Store that in TDB. Then have your Fuseki configuration use that precomputed closure with no runtime inference.

Dave
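On the "direct API to serialize the whole model into TDB" question above, one option is to add the in-memory model into a TDB-backed dataset inside a write transaction. A minimal sketch, assuming Apache Jena TDB on the classpath; the directory name `tdb-dir` is illustrative:

```java
import org.apache.jena.query.Dataset;
import org.apache.jena.query.ReadWrite;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.tdb.TDBFactory;

public class StoreInTdbSketch {
    public static void main(String[] args) {
        Model closure = ModelFactory.createDefaultModel();
        // ... the in-memory closure computed earlier ...

        Dataset dataset = TDBFactory.createDataset("tdb-dir");
        dataset.begin(ReadWrite.WRITE);
        try {
            // Model.add copies the closure, triple by triple, into the
            // TDB-backed default model -- no intermediate file needed.
            dataset.getDefaultModel().add(closure);
            dataset.commit();
        } finally {
            dataset.end();
        }
        dataset.close();
    }
}
```

For bulk loads, streaming the closure out to N-Triples and loading that with tdbloader (as used earlier in the thread) may be faster than the API route.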
Re: TDB triple storage
On 21/07/16 13:45, Chao Wang wrote:
> Thanks Dave,
> So my Fuseki configuration uses TDB with the OWL reasoner. I preloaded the
> TDB with tdbloader, then started up Fuseki.
> My question is: when Fuseki starts up, does it load all triples, including
> inferred triples, into memory?

Yes. It's actually slightly worse than that. All the inferences will be in memory (including intermediate state), which will be bigger than the source data. But the data itself isn't loaded explicitly, which means the reasoner is going back to TDB for each query, which is a further slowdown.

Using a lighter reasoner config (OWL Micro, if you are not already using it) may help.

Otherwise, if your data is stable, then as I say, compute the closure once in memory, offline. Store that in TDB. Then have your Fuseki configuration use that precomputed closure with no runtime inference.

Dave
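Dave's suggestion of computing the closure once, offline, with a lighter reasoner might look like the sketch below. The file name `data.ttl` is illustrative:

```java
import org.apache.jena.rdf.model.InfModel;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.reasoner.Reasoner;
import org.apache.jena.reasoner.ReasonerRegistry;

public class OfflineClosureSketch {
    public static void main(String[] args) {
        Model data = ModelFactory.createDefaultModel();
        data.read("data.ttl");

        // OWL Micro: a lighter rule set than the full OWL reasoner.
        Reasoner reasoner = ReasonerRegistry.getOWLMicroReasoner();
        InfModel inf = ModelFactory.createInfModel(reasoner, data);

        // Copying the InfModel into a plain model materializes all
        // inferences; the result (data plus inferences) can then be
        // stored in TDB and queried with no runtime reasoner.
        Model closure = ModelFactory.createDefaultModel();
        closure.add(inf);
    }
}
```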
RE: TDB triple storage
Thanks Dave,
So my Fuseki configuration uses TDB with the OWL reasoner. I preloaded the TDB with tdbloader, then started up Fuseki.
My question is: when Fuseki starts up, does it load all triples, including inferred triples, into memory?
I am experiencing a hanging SPARQL query; it works fine with a small dataset. I am hoping reasoning is not done during query time...

From: Dave Reynolds [dave.e.reyno...@gmail.com]
Sent: Thursday, July 21, 2016 3:35 AM
To: users@jena.apache.org
Subject: Re: TDB triple storage

On 21/07/16 02:09, Chao Wang wrote:
> A newbie question:
> Does Jena store the inferred triples into TDB? If yes, when?

No. The current reasoners operate in memory.

If you wish, you can take the results of inference (either the entire closure or the results of some selective queries) and store those back in TDB yourself. A common pattern is to use separate named graphs for the original data and for the inference closure, and use union-default. All of this is under your control but is not done automatically for you.

There is also some support for generating a partial RDFS inference closure at the time you load TDB.

Dave
Re: TDB triple storage
On 21/07/16 02:09, Chao Wang wrote:
> A newbie question:
> Does Jena store the inferred triples into TDB? If yes, when?

No. The current reasoners operate in memory.

If you wish, you can take the results of inference (either the entire closure or the results of some selective queries) and store those back in TDB yourself. A common pattern is to use separate named graphs for the original data and for the inference closure, and use union-default. All of this is under your control but is not done automatically for you.

There is also some support for generating a partial RDFS inference closure at the time you load TDB.

Dave
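The union-default pattern Dave mentions can be switched on in a Fuseki/TDB assembler configuration, so queries against the default graph see the union of all named graphs (original data plus the stored inference closure). A minimal sketch; the service name "ds" and database location "DB" are illustrative:

```turtle
@prefix :        <#> .
@prefix fuseki:  <http://jena.apache.org/fuseki#> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix tdb:     <http://jena.hpl.hp.com/2008/tdb#> .

<#service> rdf:type fuseki:Service ;
    fuseki:name         "ds" ;
    fuseki:serviceQuery "query" ;
    fuseki:dataset      <#dataset> .

<#dataset> rdf:type tdb:DatasetTDB ;
    tdb:location "DB" ;
    # Queries against the default graph see the union of all named graphs,
    # e.g. the original-data graph and the inference-closure graph.
    tdb:unionDefaultGraph true .
```

No reasoner appears in this configuration; inference happened offline when the closure graph was computed and stored.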