If you use the default nutch script, I would set NUTCH_HEAPSIZE to 2000. That generally works for me, and I have over 100 million URLs in the db and generally 10 million URLs per segment/index.
-byron

--- smith learner <[EMAIL PROTECTED]> wrote:
> Thanks for your reply. But I guess this solution doesn't work for me.
> Actually, I didn't use this parameter (I removed it from the nutch script).
>
> BTW: My RAM is 4G. I use Red Hat; the kernel is 2.4.20-31.9bigmem.
>
> Have you ever gotten the out of memory exception when you used nutch to
> crawl millions of websites?
>
> Regards,
>
> Jack.
>
> --- cao yuzhong <[EMAIL PROTECTED]> wrote:
> > Changing the JVM parameter -Xmx may help you.
> >
> > >From: smith learner <[EMAIL PROTECTED]>
> > >Reply-To: nutch-user@incubator.apache.org
> > >To: nutch-user@incubator.apache.org
> > >Subject: out of memory exception.
> > >Date: Fri, 22 Apr 2005 12:44:04 -0700 (PDT)
> > >
> > >I used nutch 0.6 to crawl a million websites. When it had fetched
> > >around 2.5 million web pages, it always threw an out of memory
> > >exception. I caught the exception and tried to print out the stack
> > >trace, but somehow it just printed
> > >
> > >Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
> > >
> > >The strange thing is that when I use nutch version 0.5, it can fetch
> > >more than 7 million web pages.
> > >
> > >I don't know what is happening there. Can anybody shed light on this?
> > >
> > >Thanks in advance!
> > >
> > >Regards,
> > >
> > >smith.
> >
> > _______________________________________________
> > Nutch-general mailing list
> > Nutch-general@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/nutch-general
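
For anyone trying the two suggestions in this thread (NUTCH_HEAPSIZE in the default nutch script, or passing -Xmx to the JVM directly), here is a minimal sketch of how the two relate. It assumes the launcher script reads NUTCH_HEAPSIZE as a number of megabytes and turns it into a -Xmx flag. The helper name heap_flag and the 1000 MB fallback are made up for illustration and are not taken from the actual script, so check your own copy of bin/nutch before relying on this.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nutch): shows how a launcher script
# might map NUTCH_HEAPSIZE, assumed to be in MB, onto the JVM -Xmx flag.
heap_flag() {
  # The 1000 MB fallback is an assumption for this sketch only.
  echo "-Xmx${NUTCH_HEAPSIZE:-1000}m"
}

# The value suggested at the top of this thread.
NUTCH_HEAPSIZE=2000
export NUTCH_HEAPSIZE

# Print the flag the JVM would be started with under the assumption above.
heap_flag

# A real run would then invoke the launcher with the variable exported,
# for example: bin/nutch <command> <args>
```

If the script has been edited so that it no longer reads NUTCH_HEAPSIZE, as described earlier in the thread, the variable has no effect and -Xmx has to be passed on the java command line instead.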