Just to close this thread: we have meanwhile compiled GCJ 4.2.0 with LARGE_CONFIG on Solaris and built PyLucene 2.0 for Python 2.4 with it. This seems to solve the memory leak problem.

The next step is to raise the 'MaxFieldLength' limit, which by default caps the number of terms indexed per field at 10,000 - worth noting if anyone runs into a situation where search results seem to be incomplete. We will see how this influences the memory consumption of the indexer. Stay tuned ;-)
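For reference, raising the limit in PyLucene 2.0 would look roughly like this - a minimal sketch; the index path and analyzer choice are illustrative:

    from PyLucene import IndexWriter, StandardAnalyzer

    # open (or create) an index; path and analyzer are illustrative
    writer = IndexWriter("/path/to/index", StandardAnalyzer(), True)

    # Lucene's default MaxFieldLength is 10,000 terms per field;
    # terms beyond the limit are silently dropped at indexing time,
    # which is why long documents can look incomplete in results.
    writer.setMaxFieldLength(100000)

    # ... add documents here ...
    writer.close()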
Regards
Thomas

--
OrbiTeam Software GmbH & Co. KG
http://www.orbiteam.de

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Thomas Koch
> Sent: Wednesday, 23 May 2007 14:00
> To: [email protected]
> Subject: RE: [pylucene-dev] MemoryError on Solaris - GC memory leak?
>
> > We had to rebuild GCJ 3.4.6 with LARGE_CONFIG defined to avoid this
> > message. I checked GCJ 4.2.0, and LARGE_CONFIG still doesn't seem
> > to be defined by default. The comment from 4.2.0's Makefile.direct
> > still reads:
> > ...
>
> Aaron,
>
> Thanks for the hint - will try this (we're currently running a build).
>
> > > If there's any other way to get rid of the GC warning (and memory
> > > leak), that would be of interest of course...
> >
> > You could probably divide up your documents and index, say, 50K in
> > one process, exit, do the next 50K in a new process, etc., tuning
> > the batch sizes as needed. Inelegant, but it'd probably work.
>
> Well, that would be an option - but it would of course require some
> more "batch overhead" (like saving state and the like).
>
> Regards
> Thomas

_______________________________________________
pylucene-dev mailing list
[email protected]
http://lists.osafoundation.org/mailman/listinfo/pylucene-dev
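The batch-per-process workaround suggested in the quoted thread above could look roughly like this - a minimal sketch, assuming a hypothetical index_batch.py helper that indexes the documents [start, start+size) and then exits, so the GCJ heap is released along with the process; batch and document counts are illustrative:

    import subprocess
    import sys

    BATCH_SIZE = 50000       # tune as needed
    TOTAL_DOCS = 1000000     # illustrative total

    for start in range(0, TOTAL_DOCS, BATCH_SIZE):
        # index_batch.py is a hypothetical helper: it opens the index,
        # adds documents start..start+BATCH_SIZE-1, closes the writer,
        # and exits, so GC state never accumulates across batches
        rc = subprocess.call([sys.executable, "index_batch.py",
                              str(start), str(BATCH_SIZE)])
        if rc != 0:
            raise RuntimeError("batch starting at %d failed" % start)

The "batch overhead" mentioned above would live in the helper: it has to persist enough state (here just the start offset passed on the command line) to resume where the previous process left off.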
