My guess is Garbage Collection -- try allocating twice as much heap as before, or more. Try running with -verbose:gc to confirm whether collections are piling up.
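For example (the heap size, index location, and document path below are just placeholders for your setup, and I'm going from memory on the demo's arguments):

  java -Xmx512m -verbose:gc org.apache.lucene.demo.IndexHTML -create -index /tmp/index /path/to/crawled/docs

-verbose:gc prints a line per collection, so if the "hang" is really the collector thrashing against the heap ceiling, you'll see back-to-back full GCs right before everything stalls.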
Cheers,
Winton

>Question from a Lucene newbie... I'm trying to index a file structure which
>happens to include a relatively large file (310kb with 55,700 words) and
>for some reason it appears to be hanging the whole indexing process. Here's
>a quick run-down...
>
>1) Am using a webcrawler to retrieve files and copy to my local disk.
>2) For files like .pdf's... I'm copying an .html equivalent of the file to
>my disk (but leaving the .pdf extension).
>3) Then later in a separate batch process I run pretty much the standard
>out-of-the-box "org.apache.lucene.IndexHTML" demo class (except I've added
>.pdf as a possible indexing type).
>
>That's about it. No big deal. The transformation from pdf to html is not
>perfected yet either... so file size will definitely drop in the future...
>as nonsense terms are being included in these files. But for now... what
>should I be looking at or altering to find out what is causing the hang?
>Thanks!
>
>Jon Wasson

Winton Davies
Lead Engineer, Overture (NSDQ: OVER)
1820 Gateway Drive, Suite 360
San Mateo, CA 94404
work: (650) 403-2259
cell: (650) 867-1598
http://www.overture.com/
