Doug Cutting wrote:
> Branching doesn't sound like the right solution here. ...

I couldn't agree more that the not-splitting-up approach is indeed better
for resource utilization, but how do we get round the problems that we keep
encountering?

We haven't managed to run a script without Hadoop popping up to do the
map/reduce.  And many of the problems we have encountered when debugging the
interaction with Hadoop are probably down to a lack of understanding on our
part of how Hadoop works.

It is also clear to me that most Nutch developers and users are quite happy
with the direction it is taking -- it is, after all, intended to be used
first and foremost as a web crawler à la Google, as opposed to the way we
use it: for enterprise file systems, databases, and very specific small-scope
web crawling.

[EMAIL PROTECTED] wrote:
> All good arguments, and as nobody else voiced the desire to have this
> other branch of Nutch I was rambling about, I'll consider this thread
> done.

Unfortunately, at the current juncture, we have spent so much time trying to
work around the problems that we encountered with 0.8.1 that we have had to
roll back to 0.7.2, albeit temporarily.

We would, however, be most happy to rejoin the team effort at 0.9 once our
pressing issues have been resolved, and if we can somehow find a way to add
classes/methods to the architecture to cater more successfully for
enterprise search.

Best regards,
Alan
_________________________
Alan Tanaman
iDNA Solutions
Tel: +44 (20) 7257 6125
Mobile: +44 (7796) 932 362
http://blog.idna-solutions.com


