Sean Dean wrote:
Folks,
I was wondering if anyone could shed some light on the status of this
issue heading into a potential 1.0 (or 0.x) release over the next few months?
Looks like a patch has been sitting there for a long time. Don't know
if it is still applicable or not. Checking on it.
I realize many upgrades have been made to Hadoop and Lucene, along with
bug fixes in just about every element of the system, but does this issue
not prevent Nutch from being a truly scalable system?
I don't think so. Especially with some of the new link analysis and
indexing stuff, we have run production systems up to 100 million and 150
nodes and not seen problems (I think).
Dennis
My current situation limits me from providing development work but I can
(and will) be ready to test any solution submitted against the latest
code-base. I believe getting the distributed search functionality
working correctly should be a requirement for any 1.0 release candidate.
What does the rest of the community think?