Hey Doug,
I'm finally picking this up again, and I believe I have everything
configured as you suggested: I'm running DFS at the indexing
datacenter, along with a jobtracker/namenode/tasktrackers. I'm running
a jobtracker/namenode/tasktrackers at the crawling datacenter, and it's
all
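For reference, a two-cluster layout like the one described is usually wired up through hadoop-site.xml on each node, with the DFS namenode and jobtracker addresses pointing at that datacenter's master. This is a minimal sketch; the hostnames and ports here are made up, not taken from the setup above:

```xml
<!-- hadoop-site.xml on the indexing datacenter's nodes (hostnames/ports are examples) -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>index-master.example.com:9000</value>
    <!-- namenode the DFS clients and datanodes on this cluster talk to -->
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>index-master.example.com:9001</value>
    <!-- jobtracker the tasktrackers on this cluster report to -->
  </property>
</configuration>
```

The crawling datacenter would carry the same two properties pointing at its own master, so each cluster runs an independent DFS and MapReduce setup.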
Jason Camp wrote:
Unfortunately, in our scenario bandwidth is cheap at our fetching
datacenter, but adding disk capacity is expensive, so we fetch the
data and send it back to another cluster (by exporting segments
from NDFS, copying them over, and importing them).
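An export/copy/import round trip like that could be sketched roughly as below. This is an assumption-laden illustration, not the poster's actual procedure: the segment path and hostname are invented, and the exact DFS shell flags should be checked against the usage output of `bin/nutch ndfs` for the version in use.

```shell
# On the fetching cluster: pull a segment out of NDFS to local disk
# (segment name and paths are hypothetical examples)
bin/nutch ndfs -get /user/nutch/segments/20060501120000 /tmp/seg-20060501120000

# Ship it to the indexing cluster over the cheap-bandwidth link
scp -r /tmp/seg-20060501120000 index-master.example.com:/tmp/

# On the indexing cluster: load the segment back into that cluster's DFS
bin/nutch ndfs -put /tmp/seg-20060501120000 /user/nutch/segments/20060501120000
```

The staging copy on local disk is the awkward part: it temporarily needs as much free local space as the segment itself, which matters when disk is the expensive resource.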
But to perform the copies, you're
Hi,
I've been using Nutch 0.7 for a few months, and recently started
working with 0.8. I'm testing everything right now on a single server,
using the local file system. I generated 10 segments with 100k URLs in
each and fetched the content. Then I ran the updatedb, but it looks like
the
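For context, the generate/fetch/updatedb cycle being described is the standard Nutch 0.8 whole-web loop. A minimal sketch, with made-up directory names and a placeholder for the seed URL list:

```shell
# One-time: seed the crawl database (urls/ is a hypothetical seed-list directory)
bin/nutch inject crawldb urls/

# Repeat per segment: generate a fetch list, fetch it, fold results back in
bin/nutch generate crawldb segments -topN 100000
segment=`ls -d segments/2* | tail -1`   # newest timestamped segment directory
bin/nutch fetch $segment
bin/nutch updatedb crawldb $segment
```

Each updatedb pass merges the fetched segment's link and status information back into crawldb, so the next generate round can pick fresh URLs.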