Hi Lewis - I know, no worries. I've read a bit about that proxy, so it means we 
cannot control exit node behaviour from Nutch. I don't know if this is even 
really possible with Tor at all, but it would be very useful for a crawler to 
frequently use different exit nodes. This would help mask the crawler if it 
were to be used on the public web via Tor.

Thanks anyway, this is interesting!

Markus

-----Original message-----
From: Lewis John Mcgibbney<[email protected]>
Sent: Thursday 25th September 2014 16:24
To: [email protected]
Subject: Re: DOCUMENTATION - Nutch and Hidden Services

Hi Markus,

On Thu, Sep 25, 2014 at 2:58 AM, <[email protected]> wrote:

Hi - this is really awesome! Is there also a way to use different exit nodes 
for different fetchers or queues, or can you instruct it to regularly change 
exit nodes?

Hi Markus, I get these in digests, so apologies if this is slightly delayed.

Essentially, the aim for this one was for us to *enable crawling of* hidden 
services... not for us to use hidden services to crawl. That would be something 
else entirely and would involve you or someone else having a node or nodes 
within the Tor network from which you could crawl. The idea here is to open 
up more content from the dark web.

Hopefully this makes sense...

Thanks
Lewis

