Did you look at the logs and figure out from the stack trace which portion of
the code is responsible for the OOM?
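If VisualVM isn't convenient, the same PermGen growth can be watched from inside the JVM with the standard management API. This is only a hedged monitoring sketch using plain JDK classes (no Nutch dependencies); on Java 8+ the pool is named "Metaspace" rather than "Perm Gen":

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;

public class PermGenMonitor {
    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            // Pre-Java-8 JVMs report "PS Perm Gen", "CMS Perm Gen", etc.;
            // Java 8 and later report "Metaspace" instead.
            if (name.contains("Perm Gen") || name.contains("Metaspace")) {
                System.out.println(name
                        + " used=" + pool.getUsage().getUsed()
                        + " max=" + pool.getUsage().getMax());
            }
        }
    }
}
```

Printing this once per crawl iteration would show whether class metadata really grows linearly with each pass through the loop.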

Thanks,
Tejas


On Mon, Dec 16, 2013 at 9:32 AM, yann <[email protected]> wrote:

> Hi guys,
>
> I'm writing a server / rest API for Nutch, but I'm running into a memory
> leak issue.
>
> I simplified the problem down to this: crawling a site repeatedly (as below)
> will eventually run out of memory. When I watch the running JVM with
> VisualVM, the PermGen space grows steadily until it is exhausted and the
> application crashes.
>
> I suspect there is a memory leak in Nutch or in Hadoop, as I wouldn't
> expect the code below to grow its memory footprint indefinitely.
>
> The code:
>
> while (true) {
>     Configuration configuration = NutchConfiguration.create();
>     String crawlArg = "config/urls/dev -dir crawls/dev -threads 5 -depth 2 -topN 100";
>     ToolRunner.run(configuration, new Crawl(), MiscUtils.tokenize(crawlArg));
> }
>
> Anything I can do on my side to fix this?
>
> Thanks for all comments,
>
> Yann
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Memory-leak-when-crawling-repeatedly-tp4106960.html
> Sent from the Nutch - User mailing list archive at Nabble.com.
>
