On Tue, Jan 25, 2011 at 11:14 AM, Andy Seaborne <
[email protected]> wrote:
>
>
> On 25/01/11 09:33, Hasan Hasan wrote:
>
>> Hi Andy,
>>
>> thanks for taking a look at the code.
>> This means that there is a limit to the number of triples with large
>> literals that can be returned by jenaGraph.find(). Right?
>>
>
> Not in the design - Graph.find() returns a streaming iterator from TDB. If
> the application is keeping the triples returned, then it takes space; RDF
> terms are materialized to return them - there is no delayed evaluation
> there.
>
As you can see, the test code does not keep the triples; on each call to
next(), the reference to the previous triple is dropped:

    while (jenaIter.hasNext()) {
        Triple triple = (Triple) jenaIter.next();
    }
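As a plain-Java illustration (no Jena classes; find() here is a hypothetical stand-in, not TDB's) of why this loop stays in bounded memory: a streaming iterator materializes one large value at a time, and the loop variable is the only reference to it, so each previous value becomes garbage as soon as next() returns the following one.

```java
import java.util.Iterator;

public class StreamingDemo {
    // Hypothetical stand-in for a streaming find(): each large "literal"
    // is materialized lazily, so at most one is reachable at a time.
    static Iterator<String> find(final int count, final int literalSize) {
        return new Iterator<String>() {
            int produced = 0;
            public boolean hasNext() { return produced < count; }
            public String next() {
                produced++;
                return new String(new char[literalSize]); // built on demand
            }
        };
    }

    public static void main(String[] args) {
        // ~2 GB of literal data in total if everything were retained,
        // but only ~200 KB is live at any moment.
        Iterator<String> iter = find(10_000, 100_000);
        long seen = 0;
        while (iter.hasNext()) {
            String triple = iter.next(); // previous string is now unreachable
            seen++;
        }
        System.out.println(seen);
    }
}
```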
> But once the iterator from Graph.find has returned a triple, it's not in
> TDB at all. There is an issue with how the node table cache might grow
> because of large literals in it, but it is limited to a maximum number of
> entries. Turn the cache size down.
How do I do this?
cheers
hasan
>
>
>> If this limit is
>> exceeded, it can lead to an OutOfMemoryError. And this limit
>> depends on the maximum heap allocated and the size of the literals?
>>
>
> And the size of the cache.
>
>
>> So to see whether there is a memory leak, I could try to loop over
>> jenaGraph.find(), where no iteration should cause an out-of-memory
>> error.
>>
>
> If the heap is big enough for the cache. The worst case is pretty big.
>
>
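The leak check described above can be sketched in plain Java (scan() is a hypothetical stand-in for one pass over jenaGraph.find() that discards each triple; no Jena dependency): repeat the same scan many times, and if each pass retained its literals the heap would grow by the full literal volume per pass and eventually throw OutOfMemoryError, whereas a streaming, non-retaining scan stays stable.

```java
public class LeakCheck {
    // Hypothetical scan(): streams over n large literals, keeping none,
    // mimicking a pass over a streaming find() that discards each triple.
    static long scan(int n, int literalSize) {
        long seen = 0;
        for (int i = 0; i < n; i++) {
            String literal = new String(new char[literalSize]); // fresh large literal
            seen += literal.length();                           // use it, then drop it
        }
        return seen;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // If a pass leaked, used heap would climb by n * literalSize * 2 bytes
        // per pass; with streaming it stays roughly flat.
        for (int pass = 0; pass < 20; pass++) {
            scan(1_000, 100_000);
            long used = rt.totalMemory() - rt.freeMemory();
            System.out.println("pass " + pass + " used=" + used);
        }
        System.out.println("no OutOfMemoryError after repeated passes");
    }
}
```

The printed "used" figures will fluctuate with garbage collection; the point is that the run finishes without exhausting the heap.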
>> I'll test it now and let you know.
>> But we'll consider your suggestion to store references to the large
>> literals in the triples rather than the literals themselves.
>>
>> Cheers
>> Hasan
>>
>
> let me know how it goes,
>
> Andy
>