Re: Nutch fetching times out at 3 hours, not sure why.

2018-05-01 Thread Chip Calhoun
Sent: Monday, April 30, 2018 4:53:20 PM To: user@nutch.apache.org Subject: Re: Nutch fetching times out at 3 hours, not sure why. Hi Chip, got it, you probably run bin/crawl which has the option: --time-limit-fetch Number of minutes allocated to the fetching [default: 180] It's good to have

Re: Nutch fetching times out at 3 hours, not sure why.

2018-04-30 Thread Sebastian Nagel
nore it unless it causes a problem for my other cores. > > Chip > > -----Original Message----- > From: Sebastian Nagel [mailto:wastl.na...@googlemail.com] > Sent: Monday, April 30, 2018 12:21 PM > To: user@nutch.apache.org > Subject: Re: Nutch fetching times out at 3 hou

RE: Nutch fetching times out at 3 hours, not sure why.

2018-04-30 Thread Chip Calhoun
a problem for my other cores. Chip -----Original Message----- From: Sebastian Nagel [mailto:wastl.na...@googlemail.com] Sent: Monday, April 30, 2018 12:21 PM To: user@nutch.apache.org Subject: Re: Nutch fetching times out at 3 hours, not sure why. Hi, if you still see the log message

Re: Nutch fetching times out at 3 hours, not sure why.

2018-04-30 Thread Sebastian Nagel
> Are these 3 hour loops standard for large crawls? > > -----Original Message----- > From: Chip Calhoun [mailto:ccalh...@aip.org] > Sent: Tuesday, April 17, 2018 3:27 PM > To: user@nutch.apache.org > Subject: RE: Nutch fetching times out at 3 hours, not sure why. > >

RE: Nutch fetching times out at 3 hours, not sure why.

2018-04-30 Thread Chip Calhoun
, April 17, 2018 1:43 PM To: user@nutch.apache.org Subject: RE: Nutch fetching times out at 3 hours, not sure why. Which version are you running? That value defaults to -1 in my current version (1.14), so it shouldn't be something you needed to change. My crawls, by default, go for as much

RE: Nutch fetching times out at 3 hours, not sure why.

2018-04-19 Thread Chip Calhoun
Hi Lewis, I'm using Nutch 1.2. Chip -----Original Message----- From: lewis john mcgibbney [mailto:lewi...@apache.org] Sent: Wednesday, April 18, 2018 1:55 PM To: user@nutch.apache.org Subject: Re: Nutch fetching times out at 3 hours, not sure why. Hi Chip, Which version of Nutch are you using

RE: Nutch fetching times out at 3 hours, not sure why.

2018-04-19 Thread Chip Calhoun
...@openindex.io] Sent: Tuesday, April 17, 2018 3:58 PM To: user@nutch.apache.org Subject: RE: Nutch fetching times out at 3 hours, not sure why. Hello Chip, I have no clue where the three hour limit could come from. Please take a further look in the last few minutes of the logs. The only thing I can

Re: Nutch fetching times out at 3 hours, not sure why.

2018-04-18 Thread lewis john mcgibbney
18 14:45:01 +0000 > Subject: Nutch fetching times out at 3 hours, not sure why. > I crawl a list of roughly 2600 URLs all on my local server, and I'm only > crawling around 1000 of them. The fetcher quits after exactly 3 hours (give > or take a few milliseconds) with th

RE: Nutch fetching times out at 3 hours, not sure why.

2018-04-17 Thread Markus Jelsma
Tuesday 17th April 2018 21:27 > To: user@nutch.apache.org > Subject: RE: Nutch fetching times out at 3 hours, not sure why. > > I'm on 1.12, and mine also defaulted at -1. It does not fail at the same URL, > or even at the same point in a URL's fetcher loop; it reall

RE: Nutch fetching times out at 3 hours, not sure why.

2018-04-17 Thread Chip Calhoun
@nutch.apache.org Subject: RE: Nutch fetching times out at 3 hours, not sure why. Which version are you running? That value defaults to -1 in my current version (1.14), so it shouldn't be something you needed to change. My crawls, by default, go for as much as even 12 hours with little

RE: Nutch fetching times out at 3 hours, not sure why.

2018-04-17 Thread Sadiki Latty
it. Is it always the same URL that it fails at? -----Original Message----- From: Chip Calhoun [mailto:ccalh...@aip.org] Sent: April-17-18 10:45 AM To: user@nutch.apache.org Subject: Nutch fetching times out at 3 hours, not sure why. I crawl a list of roughly 2600 URLs all on my local server

Nutch fetching times out at 3 hours, not sure why.

2018-04-17 Thread Chip Calhoun
I crawl a list of roughly 2600 URLs all on my local server, and I'm only crawling around 1000 of them. The fetcher quits after exactly 3 hours (give or take a few milliseconds) with this message in the log: 2018-04-13 15:50:48,885 INFO fetcher.FetchItemQueues - * queue: