Hi guys, as a follow-up: I applied both patches (https://issues.apache.org/jira/browse/NUTCH-356 and https://issues.apache.org/jira/browse/NUTCH-1640), and the first one made a huge difference. Using the loop I pasted in my original post, I can now crawl a small site ~3,500 times before running out of memory, compared to ~150 times before applying the patches (so there is still a small leak somewhere).
Thanks for your help, appreciated!

Yann

--
View this message in context: http://lucene.472066.n3.nabble.com/Memory-leak-when-crawling-repeatedly-tp4106960p4108376.html
Sent from the Nutch - User mailing list archive at Nabble.com.

