Unfortunately I am not a developer. But as a user of Nutch on a single
machine, and very happy with 0.7.2, I think this is good news.
And there is a feature I would like to see in nutch-default.xml:
"db.ignore.external.links"; I just don't know how to do it, as the current
"db.max.outlinks.per.page", from my experience, doesn't give results as good
as the former, used in 0.8.1.
Thanks
Carmmello
----- Original Message -----
From: "Piotr Kosiorowski" <[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Wednesday, November 15, 2006 11:42 AM
Subject: Re: Strategic Direction of Nutch
I agree with Andrzej. For my part, if someone takes the effort of
preparing and testing patches, I as a committer (not a very active one
recently) may focus on 0.7.2 issues and commit the patches, and in the
future prepare a 0.7.3 release.
Regards,
Piotr
On 11/15/06, Andrzej Bialecki <[EMAIL PROTECTED]> wrote:
Nitin Borwankar wrote:
> Hi all,
>
> First an intro. I am another Nutch newbie and am finding 0.7.2 to be
> quite an effective single machine crawler.
>
[..]
> The ability to keep db formats compatible would be nice to allow reuse
> of existing results but is not necessary.
>
That's probably not going to happen - each branch has specific
requirements from the db and segment formats, which are incompatible.
However, given enough interest we could implement converters, even
bi-directional.
> As a potential developer I would like to volunteer for the ongoing
> maintenance and evolution of 0.7.2 as an effective single machine
> crawler.
>
That's excellent! I imagine the procedure to get you involved would be
something like this:
* start collecting issues related to maintenance, bugfixes or
improvements of that branch,
* create JIRA issues, plus start collecting patches, tested and ready
for committing. One of the existing developers will commit them on your
behalf.
* after a while we would consider giving you committer rights so that
you could work directly with the code.
> Consider this a proposal to maintain two separate versions by continuing
> bug fix versions of 0.7 until one of two things happens
>
> a) 0.8 evolves to something satisfactory for use also as a single
> machine search engine and everyone is happy moving to it
> b) a critical mass of developers steps forward to support the ongoing
> development of 0.7.2 into say Nutch-lite always and only meant for
> single machine use.
>
I do hope that option a) becomes a reality sooner rather than later. But
if there is sufficient interest (and enough developers) in developing the
0.7 branch, then go for it - keeping in mind, though, that eventually these
code bases will diverge so much that maintaining them will require two
mostly separate teams ...
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com