Nitin Borwankar wrote:
Hi all,

First an intro. I am another Nutch newbie and am finding 0.7.2 to be
quite an effective single machine crawler.
[..]
The ability to keep db formats compatible would be nice to allow reuse
of existing results but is not necessary.


That's probably not going to happen - each branch has specific requirements for the db and segment formats, and those formats are incompatible. However, given enough interest we could implement converters, even bi-directional ones.


As a potential developer I would like to volunteer for the ongoing
maintenance and evolution of 0.7.2 as an effective single machine
crawler.

That's excellent! I imagine the procedure to get you involved would be something like this:

* start collecting issues related to maintenance, bugfixes or improvements of that branch,

* create JIRA issues, plus start collecting patches, tested and ready for committing. One of the existing developers will commit them on your behalf.

* after a while we would consider giving you committer rights so that you could work directly with the code.


Consider this a proposal to maintain two separate versions by continuing
bug fix versions of 0.7 until one of two things happens:

a) 0.8 evolves into something satisfactory for use also as a single
machine search engine and everyone is happy moving to it
b) a critical mass of developers steps forward to support the ongoing
development of 0.7.2 into say Nutch-lite always and only meant for
single machine use.

I do hope that option a) becomes a reality sooner rather than later. But if
there is sufficient interest (and enough developers) in developing the 0.7 branch,
then go for it - keeping in mind, though, that eventually these code bases will
diverge so much that maintaining them will require two mostly separate teams ...

--
Best regards,
Andrzej Bialecki     <><
___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com

