Hi,

I am trying to load about 100,000 data graphs (roughly 10M triples) into a
Jena TDB Dataset, but I am running out of memory. I do the load by
repeatedly calling code that looks something like this:

      InputStream instream = entity.getContent(); // the RDF graph to load
      fResourceDataset.getLock().enterCriticalSection(Lock.WRITE);
      try {
          // Parse the stream and add its triples to the named graph.
          Model model = fResourceDataset.getNamedModel(resourceURI);
          model.read(instream, null);
          // model.close();
      } finally {
          fResourceDataset.getLock().leaveCriticalSection();
          instream.close();
      }

After calling this code about two to three thousand times, it starts to run
much more slowly, and eventually I get an exception like this:

      Exception in thread "pool-3-thread-43" java.lang.OutOfMemoryError:
Java heap space

I tried increasing the JVM heap, but that just increased the number of
calls that succeeded (e.g., 10,000 instead of 2,000) before the exception.
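(In case the exact setting matters: I'm raising the heap with the standard
-Xmx flag, something like the line below, where the jar and main class are
just placeholders for my loader.)

      java -Xmx2g -cp myapp.jar com.example.Loader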

I'm wondering if there's something I need to do to release memory between
these calls. I tried putting in a call to model.close() (commented out
above), but that only seemed to make it run slower; I still got the
exception.

Is there something else I should be doing, or is there a possible memory
leak in the version of Jena I'm using (a fairly recent SNAPSHOT build)?
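For example, should I be using TDB's transaction API instead of the dataset
lock? Something like the sketch below, which I haven't actually tried (it
assumes fResourceDataset comes from TDBFactory and that the build I'm on
supports TDB transactions):

      fResourceDataset.begin(ReadWrite.WRITE); // ReadWrite from com.hp.hpl.jena.query
      try {
          Model model = fResourceDataset.getNamedModel(resourceURI);
          model.read(instream, null);
          fResourceDataset.commit(); // flush the added triples to disk
      } finally {
          fResourceDataset.end();    // release the transaction's resources
      }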

Btw, I tried commenting out the call to model.read(instream, null) to
confirm that the memory leak isn't somewhere else in my program, and that
worked - i.e., it went through all 100,000 calls without an exception.

Any ideas or pointers to what may be wrong would be appreciated.

Thanks,
Frank.
