Re: Question about Dataset as created from a collection of ont files

Jack Park Sun, 08 Feb 2015 07:14:55 -0800

Sorry. While I am not new to Jena (I built systems with it in the previous
decade, when it was much simpler to use), I suppose I get confused easily.
The code snippet I showed appears all over the web in tutorials and so
forth. I spent some time reading around in the Jena source code, and I find
it hard to believe that "there's nothing to write a log..." since several
classes are loaded and exercised just by calling TDBFactory.createDataset.


But, I confess that I don't see anything in that code which suggests it
should actually read the database itself (though that might be the case).
Still, it created a 238+ MB log file before crashing.

Rooting around in the Fuseki code to see how it boots a given database is
truly difficult. Maybe someone familiar with the code can explain it.

Maybe there is a better way to boot a given TDB database and create an
OntModel from that?



On Sun, Feb 8, 2015 at 6:27 AM, Dave Reynolds <[email protected]>
wrote:

> On 08/02/15 13:55, Jack Park wrote:
>
>> Hi Dave,
>>
>> Thanks for your response.
>> I should have stated more clearly that the code I did show is *all* the
>> code that is running. That snippet:
>>
>> Dataset dataset = TDBFactory.createDataset(dbPath) ;
>>
>> is what is running when the system gets an OutOfMemory error. Even with a
>> 4 GB heap, it still blows up. All the code which does "begin",
>> "Model = ...", and so forth has been commented out.
>>
>> The behavior according to the log is that somewhere in the createDataset
>> code, it is reading every class in the ontology stored in the TDB database
>> it is, I presume, opening.
>>
>
> What log? If that is literally the only code line then there's nothing to
> write a log and certainly nothing that will go round trying to read classes.
>
> Dave
>
>
>> That's the current puzzle. It's almost as if there is some system property
>> I need to set somewhere which tells it that this is not an in-memory
>> event.
>>
>> Thoughts?
>>
>> Jack
>>
>> On Sun, Feb 8, 2015 at 2:06 AM, Dave Reynolds <[email protected]>
>> wrote:
>>
>>> On 07/02/15 22:18, Jack Park wrote:
>>>
>>>> I used Jena to load, on behalf of Fuseki, a collection of owl files.
>>>> There might be 4 GB of data, all told, in there.
>>>>
>>>> Now, rather than use Fuseki to access that data, I am writing code which
>>>> will use a Dataset opened against that database to create an OntModel.
>>>>
>>>> I use this code, taken from a variety of sources:
>>>>
>>>> Dataset dataset = TDBFactory.createDataset(dbPath) ;
>>>>
>>>> where dbPath points to the directory where Jena made the database.
>>>>
>>>> When I boot Fuseki against that data, it boots quickly and without any
>>>> issues.
>>>>
>>>> When I run that code against the same data, firstly, it blossoms a
>>>> logfile 260 mb, showing all the ont classes it is reading. Then, it runs
>>>> out of heap space and crashes.
>>>>
>>>>
>>> Simply accessing data in a TDBDataset won't load it all into memory, so
>>> the problem will be in how you are creating the OntModels.
>>>
>>> Since you don't show "that code", it is hard to know what the problem is.
>>>
>>> It *might* be you have dynamic imports processing switched on and so your
>>> OntModels are going out to the original sources and reloading them.
>>>
>>> It is possible to do imports processing but have the imports be found as
>>> database models [1]. In your case, though, since you have all the
>>> ontologies in there anyway, I would just switch off all imports processing.
>>>
>>> Or it might be nothing to do with imports processing but a bug in how you
>>> are creating the OntModels. Not enough information to tell.
>>>
>>> Dave
>>>
>>> [1] There used to be a somewhat old example of how to do this in the
>>> documentation but I can't find it in the current web site.
>>>
>>>
>>>
>>
>
