Sami Siren wrote:
Hello,

It has been a while since the previous release (0.8.1), and looking at the
great fixes done in trunk I'd start thinking about baking a new release
soon.

Looking at the JIRA roadmaps there is 1 blocking issue (fixing the
license headers) for 0.8.2 and two other blocking issues for 0.9.0, of
which I think NUTCH-233 is safe to put in.

Agreed. The replacement regex mentioned in the original comment seems safe enough, and simpler.

The top 10 voted issues are currently:

NUTCH-61        Adaptive re-fetch interval. Detecting unmodified content

Well ... I'm of two minds on this. I can bring this patch up to date and apply it before 0.9.0, if we understand that this is a "0" release ... ;) Otherwise I'd prefer to wait and apply it right after the release.

I would also like to proceed with NUTCH-339 (Fetcher2 patches plus some changes I made in the meantime), since I'd like to expose the new fetcher to a broader audience, and it doesn't affect the existing implementation.


NUTCH-48        "Did you mean" query enhancement/refinement feature
NUTCH-251       Administration GUI
NUTCH-289       CrawlDatum should store IP address

I'm still not entirely convinced about this - and there is already a mechanism in place to support it if someone really wishes to keep this particular info (CrawlDatum.metaData).

NUTCH-36        Chinese in Nutch
NUTCH-185       XMLParser is configurable xml parser plugin.            
NUTCH-59        meta data support in webdb
NUTCH-92        DistributedSearch incorrectly scores results            

This is too intrusive to fix just before the release - and needs additional discussion.


NUTCH-68        A tool to generate arbitrary fetchlists

Easy to port this to 0.9.0 - I can do this.


NUTCH-87        Efficient site-specific crawling for a large number of sites



--
Best regards,
Andrzej Bialecki     <><
___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com

