Dennis Kubes wrote:
> Do you think it is the parsing that is causing it?
Just checking ... probably not. You could figure out from a thread dump where it's spending time.

> I was looking at a smaller fetching run and the cpu gets pushed to
> 100% as well but the reports keep happening. This only seems to
> happen when I run very large fetches (> 500K pages). I just ran a
> 100K fetch and it worked just fine. Should I have some special
> settings for larger fetches?

You could try tweaking the io.sort values, if it times out during the sorting phase.

--
Best regards,
Andrzej Bialecki
Information Retrieval, Semantic Web
Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com
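
For illustration, tweaking the io.sort values might look roughly like the fragment below. This is only a sketch: it assumes the overrides go in the site configuration file (conf/hadoop-site.xml in many Nutch/Hadoop setups of this vintage), and the numbers are placeholders, not recommendations from this thread. io.sort.mb is the sort buffer size in megabytes and io.sort.factor is the number of streams merged at once; check the defaults shipped with your version before changing either.

```xml
<!-- Hypothetical override values; tune them to the heap and disk available. -->
<configuration>
  <property>
    <name>io.sort.mb</name>
    <!-- Sort buffer size in MB (illustrative value, assumed to fit in the task heap). -->
    <value>200</value>
  </property>
  <property>
    <name>io.sort.factor</name>
    <!-- Number of streams merged at once while sorting (illustrative value). -->
    <value>50</value>
  </property>
</configuration>
```

A larger buffer only helps if the task JVM has the heap to hold it, so the child heap size may need to be raised to match.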
