Jérôme,
> which Nutch version do you use?
Kind of gave up on mapred for a while, so I am using
trunk.
> There were a bug concerning the content-types with
> parameters such as
> "text/html; charset=iso-8859-1".
Yeah, when I telnet in to shopthar.com and GET /, I get
Content-Type: text/html; charset=iso-8859-1
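Just to illustrate what a header like that looks like once it is pulled apart, here is a rough Python sketch. It is not Nutch's code, and `split_content_type` is a made-up helper name. It only assumes that the media type has to be separated from parameters such as charset before it can be compared against a plain type like text/html:

```python
def split_content_type(header_value):
    """Split a Content-Type header value into (media_type, params).

    Hypothetical helper for illustration only; not part of Nutch.
    """
    parts = header_value.split(";")
    # The first piece is the bare media type, e.g. "text/html".
    media_type = parts[0].strip().lower()
    params = {}
    for part in parts[1:]:
        name, sep, value = part.partition("=")
        if sep:  # skip malformed pieces that have no "="
            params[name.strip().lower()] = value.strip().strip('"')
    return media_type, params


media_type, params = split_content_type("text/html; charset=iso-8859-1")
print(media_type)  # text/html
print(params)      # {'charset': 'iso-8859-1'}
```

Matching on the bare media type rather than the whole header value is what lets the same page be recognized with or without the charset parameter.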
> This issue is fixed in trunk and mapred.
Hmm, well, I was seeing something earlier in trunk.
That said, something happened and I now seem to get a
partial crawl started. How very strange. I did catch
a few updates today, but the commits sure didn't seem
related.
Now I crawl for a while, and then it just stops. I
still get new segments starting, but no new HTTP hits
to the server. So it looks like I have something new to
track down. But yeah, when it is going, it can hammer
pretty good.
Earl