The Parser is pdfBox. pdf is about 25% of the over all indexing volume on the productive system. I also have word-docs and loads of hmtl resources to be indexed.
In my testing environment I merely have 5 pdf docs and still those permanent object hanging around, though.
Cheers,
Daniel


Ben Litchfield wrote:

I can say that gc is not collecting these objects since I forced gc
runs when indexing every now and then (when parsing pdf-type objects,
that is): No effect.


<>
What PDF parser are you using? Is the problem within the parser and not
lucene? Are you releasing all resources?
Ben

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]



Reply via email to