Andrzej Bialecki wrote:
Sami Siren wrote:
Lots of good thoughts and ideas, easy to agree with.
Something for the ease of use category:
-allow running on top of plain vanilla hadoop
What does "plain vanilla" mean here? Do you mean the current DB
implementation? That's the idea, we should aim for an abstract layer
Lots of good thoughts and ideas, easy to agree with.
Something for the ease of use category:
-allow running on top of plain vanilla hadoop
-split into reusable components with nice and clean public api
-publish mvn artifacts so developers can directly use mvn, ivy etc to
pull required
Subhojit Roy wrote:
Hi,
Would it be possible to include in Nutch the ability to crawl/download a
page only if the page has been updated since the last crawl? I had read
sometime back that there were plans to include such a feature. It would be a
very useful feature to have IMO. This of course
At 2:44 PM +0100 11/16/09, Andrzej Bialecki wrote:
This is already implemented - see the Signature / MD5Signature /
TextProfileSignature.
OK, then could somebody explain how to implement this feature? Does
the initial indexing require a special command line? Then does the
secondary indexing
Hi,
Would it be possible to include in Nutch the ability to crawl/download a
page only if the page has been updated since the last crawl? I had read
sometime back that there were plans to include such a feature. It would be a
very useful feature to have IMO. This of course depends on the last
Hi all,
ApacheCon is over and our 1.0 release has already been out for some
time, so I think it's a good moment to discuss the next steps
in Nutch development.
Let me share with you the topics I identified and presented in the
ApacheCon slides, and some topics that are worth