Hi Lewis - I know, no worries. I've read a bit about that proxy, so it means we cannot control exit node behaviour from Nutch. I don't know if this is even possible with Tor at all, but it would be very useful for a crawler to switch exit nodes frequently. This would help mask the crawler if it were used on the public web via Tor.
Thanks anyway, this is interesting!
Markus

-----Original message-----
From: Lewis John Mcgibbney <[email protected]>
Sent: Thursday 25th September 2014 16:24
To: [email protected]
Subject: Re: DOCUMENTATION - Nutch and Hidden Services

Hi Markus,

On Thu, Sep 25, 2014 at 2:58 AM, <[email protected]> wrote:

Hi - this is really awesome! Is there also a way to use different exit nodes for different fetchers or queues, or can you instruct it to regularly change exit nodes?

Hi Markus,

I get these in digests, so apologies if this is slightly delayed.

Essentially, the aim for this one was for us to *enable crawling of* hidden services... not for us to use hidden services to crawl. That would be something else entirely and would involve you or someone else having a node/nodes within the Tor network from which you could crawl.

The idea here is to open up more content from the dark web. Hopefully this makes sense...

Thanks
Lewis

