Hi Jack,
There's nothing wrong with the line of code that I can see. That is the
normal way to open a TDB dataset. TDB does not load all the data into
memory and there are no magic flags to set.
What I'm asking is what log file you are talking about.
I don't see why a program with just that line of code and nothing else
should be creating any log files at all, so I'm not clear on what log
file you are talking about. I guess you might have a log4j.properties
file in your path set to DEBUG level and writing to a file instead of
stdout/err, but even that's not going to result in hundreds of MB of output.
Can you provide a complete minimal example of what you are doing?
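For reference, a complete minimal program of the shape being discussed would look roughly like the sketch below. This is an assumption about the setup, not code from the thread; it uses the Jena 2.x package names (`com.hp.hpl.jena.*`) that were current at the time, and the database directory path is passed on the command line:

```java
import com.hp.hpl.jena.query.Dataset;
import com.hp.hpl.jena.tdb.TDBFactory;

public class TdbOpenOnly {
    public static void main(String[] args) {
        // Connect to the existing TDB database directory.
        // This only attaches to the store; it should not scan the data
        // and should not produce any logging by itself.
        Dataset dataset = TDBFactory.createDataset(args[0]);
        dataset.close();
    }
}
```

If even this minimal program produces a huge log, the logging configuration on the classpath is the first thing to check.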
Dave
On 08/02/15 15:12, Jack Park wrote:
Sorry. While I am not new to Jena (I built systems with it in the previous
decade when it was much simpler to use), I suppose I get confused easily.
The code snippet I showed was found all over the web in tutorials, and so
forth. I spent some time reading around in the Jena source code and find it
hard to believe that "there's nothing to write a log..." since several
classes are loaded and exercised just by calling TDBFactory.createDataset.
But, I confess that I don't see anything in that code which suggests it
should actually read the database itself (though that might be the case).
Still, it created a 238+ MB log file before crashing.
Rooting around in the Fuseki code to see how it boots a given database is
truly difficult. Maybe someone familiar with the code can explain it.
Maybe there is a better way to boot a given TDB database and create an
OntModel from that?
On Sun, Feb 8, 2015 at 6:27 AM, Dave Reynolds <[email protected]>
wrote:
On 08/02/15 13:55, Jack Park wrote:
Hi Dave,
Thanks for your response.
I should have stated more clearly that the code I did show is *all* the
code that is running. That snippet:
Dataset dataset = TDBFactory.createDataset(dbPath) ;
is what is running when the system gets an OutOfMemory error. Even with a
4 GB heap, it still blows up. All the code which does "begin", "Model = ...",
and so forth has been commented out.
The behavior according to the log is that somewhere in the createDataset
code, it is reading every class in the ontology stored in the TDB database
it is, I presume, opening.
What log? If that is literally the only code line then there's nothing to
write a log and certainly nothing that will go round trying to read classes.
Dave
That's the current puzzle. It's almost as if there is some system property
I need to set somewhere which tells it that this is not an in-memory
event.
Thoughts?
Jack
On Sun, Feb 8, 2015 at 2:06 AM, Dave Reynolds <[email protected]>
wrote:
On 07/02/15 22:18, Jack Park wrote:
I used the Jena to load, on behalf of Fuseki, a collection of owl files.
There might be 4 GB of data all totaled in there.
Now, rather than use Fuseki to access that data, I am writing code which
will use a Dataset opened against that database to create an OntModel.
I use this code, taken from a variety of sources:
Dataset dataset = TDBFactory.createDataset(dbPath) ;
where dbPath points to the directory where Jena made the database.
When I boot Fuseki against that data, it boots quickly and without any
issues.
When I run that code against the same data, it first generates a 260 MB
log file showing all the ont classes it is reading. Then it runs out of
heap space and crashes.
Simply accessing data in a TDB Dataset won't load it all into memory,
so the problem will be in how you are creating the OntModels.
Since you don't show "that code" it is hard to know what the problem is.
It *might* be you have dynamic imports processing switched on and so your
OntModels are going out to the original sources and reloading them.
It is possible to do imports processing but have the imports be found as
database models [1] but in your case since you have all the ontologies in
there anyway then I would just switch off all imports processing.
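A sketch of what switching off imports processing might look like, assuming the ontologies are already loaded into the default model of the TDB dataset (Jena 2.x package names; `dbPath` is the database directory from earlier in the thread):

```java
import com.hp.hpl.jena.ontology.OntModel;
import com.hp.hpl.jena.ontology.OntModelSpec;
import com.hp.hpl.jena.query.Dataset;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.tdb.TDBFactory;

public class OntModelOverTdb {
    public static void main(String[] args) {
        String dbPath = args[0];

        // Copy a spec with no reasoner, then tell its document manager
        // not to process owl:imports, so the OntModel will not go out
        // to the original sources and reload them.
        OntModelSpec spec = new OntModelSpec(OntModelSpec.OWL_MEM);
        spec.getDocumentManager().setProcessImports(false);

        Dataset dataset = TDBFactory.createDataset(dbPath);
        OntModel ont = ModelFactory.createOntologyModel(
                spec, dataset.getDefaultModel());
        ont.close();
        dataset.close();
    }
}
```

Using `OWL_MEM` (no inference) keeps the OntModel as a thin view over the TDB-backed model rather than pulling everything through a reasoner.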
Or it might be nothing to do with imports processing but a bug in how you
are creating the OntModels. Not enough information to tell.
Dave
[1] There used to be a somewhat old example of how to do this in the
documentation but I can't find it in the current web site.