Andrzej Bialecki wrote:

Nitin Borwankar wrote:

Hi all,

First an intro. I am another Nutch newbie and am finding 0.7.2 to be
quite an effective single machine crawler.

[..]

The ability to keep db formats compatible would be nice to allow reuse
of existing results but is not necessary.



That's probably not going to happen - each branch has specific requirements from the db and segment formats, which are incompatible. However, given enough interest we could implement converters, even bi-directional.


As a potential developer I would like to volunteer for the ongoing
maintenance and evolution of 0.7.2 as an effective single machine
crawler.


That's excellent! I imagine the procedure to get you involved would be something like this:

* start collecting issues related to maintenance, bugfixes or improvements of that branch,

What is the mechanism for this collection process? Do we create a separate email list, a separate alias, or does everyone just send me email (this may get messy fast)?


* create JIRA issues, plus start collecting patches, tested and ready for committing. One of the existing developers will commit them on your behalf.

sounds good.

* after a while we would consider giving you committer rights so that you could work directly with the code.

Fair enough. Do we take this offline to thrash out the details, or continue here?

Nitin Borwankar


Consider this a proposal to maintain two separate versions by continuing
bug fix versions of 0.7 until one of two things happens:

a) 0.8 evolves into something satisfactory for use as a single machine
search engine as well, and everyone is happy moving to it
b) a critical mass of developers steps forward to support the ongoing
development of 0.7.2 into, say, Nutch-lite, always and only meant for
single machine use.

I do hope that option a) becomes a reality sooner rather than later. But if there is sufficient interest (and enough developers) in developing the 0.7 branch, then go for it - keeping in mind, though, that eventually these code bases will diverge so much that maintaining them will require two mostly separate teams ...

