Hi Sheham,
the nutch-site.xml configures
  mapreduce.task.timeout = 1800
1.8 seconds (1800 milliseconds) is very short. The default is 600 seconds or 10
minutes, see [1]. Since Nutch needs to finish fetching before the task timeout
applies, threads fetching not quickly enough and
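
For illustration, a minimal sketch of how the property might look in nutch-site.xml with the default of 600 seconds restored. The value is given in milliseconds; 600000 is only the Hadoop default mentioned above, and a value suited to a particular crawl is an assumption to be tuned.

```xml
<!-- Sketch only: restores the default task timeout (600000 ms = 10 minutes).
     A larger value may be needed for slow fetches; that choice is an assumption. -->
<property>
  <name>mapreduce.task.timeout</name>
  <value>600000</value>
</property>
```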
Hi Sheham,
On 2024/04/20 08:47:41 Sheham Izat wrote:
> The Fetcher job was aborted, does that still mean that it went through the
> entire list of seed urls?
Yes, it processed the entire generated segment, but the fetcher…
* hung on https://disneyland.disney.go.com/, https://api.onlyoffice.com/,
Hi Lewis,
The Fetcher job was aborted, does that still mean that it went through the
entire list of seed urls?
I will go through the mailing list questions.
Thank you
On Fri, Apr 19, 2024 at 10:15 PM Lewis John McGibbney wrote:
> Hi Sheham,
>
> On 2024/04/19 15:18:01 Sheham Izat wrote:
> >
>
Hi Sheham,
On 2024/04/19 15:18:01 Sheham Izat wrote:
>
> My questions are:
>
> 1) What do I need to do to get Nutch to continue working even if there are
> hung threads?
From what I can see in the log you provided, nothing is preventing Nutch from
continuing to work. The Fetcher job
Hi Shashanka, All,
Thank you for your reply!
I'm using Nutch 1.19. I did the injection and segment generation using the
following commands:
bin/nutch inject crawl/crawldb urls
bin/nutch generate crawl/crawldb crawl/segments
When I run the fetch command, Nutch stops with errors about hung
Hi Shehamizat,
Please feel free to post your questions in an email to the list itself. Someone
from the community will be glad to help.
*Regards*
Shashanka Balakuntala Srinivasa
On Fri, 19 Apr 2024 at 7:15 AM, Sheham Izat wrote:
> Hi,
>
> I'm trying to get Nutch to work and I have issues, how can I
Hi,
I'm trying to get Nutch to work and I have issues, how can I post questions
on the group?
Thank you,
Sheham