> Hi all,
>
> The following issues need to be discussed and appropriate action taken
> before the 0.9 release:
>
> Blocker
> ========
> * NUTCH-400 (Update & add missing license headers) - I believe this is
> fixed and should be closed
>
> * NUTCH-353 (pages that serverside forwards will be refetched every
> time) - this was partially fixed in NUTCH-273, but a more complete
> solution would require significant changes to LinkDb. As there are no
> patches implementing this, I left it open, but it's no longer as
> critical as it was before. I propose to move it to "Major" and address
> it in the next release.
>
> * NUTCH-233 (wrong regular expression hang reduce process for ever) - I
> propose to apply the fix provided by Sean Dean and close this issue for
> now.
>
> Critical
> ========
> * NUTCH-436 (Incorrect handling of relative paths when the embedded URL
> path is empty). There is no patch available yet. If someone could
> contribute a patch I'd like to see this fixed before the release.

I am starting to take a look at this.  I will try to get it fixed before
we release.

>
> * NUTCH-427 (protocol-smb). This relies on an LGPL library, and it's
> certainly not critical (as this is an optional new feature). I propose
> to change it to Major, and make a decision - do we want another plugin
> like parse-mp3 or parse-rtf, or not.
>
> * NUTCH-381 (Ignore external link not work as expected) - I'll try to
> reproduce it, and if I find an easy fix I'd like to apply it before the
> release.
>
> * NUTCH-277 (Fetcher dies because of "max. redirects") - I wasn't able
> to reproduce it. If there is no updated information on this I propose to
> close it with "Can't reproduce".
>
> * NUTCH-167 (Observation of <META NAME="ROBOTS" CONTENT="NOARCHIVE">) -
> there's a patch which I tested in a limited production env. If there are
> no objections I'd like to apply it before the release.
>
> Major
> =====
> There are 84 major issues, but some of them are either invalid, or
> should be "minor", or no longer apply and should be closed. Please
> review them if you can and provide some comments or recommendations if
> you think you have some new information.
>
>
> One decision that we also need to make is which version of Hadoop should
> be included in the release. Current trunk uses 0.10.1, I have a set of
> production-tested patches that use 0.11.2, and today the Hadoop team
> released 0.12.0 (to be followed shortly by a 0.12.1, most likely in time
> before our release). The most conservative option is to stay with
> 0.10.1, but by the time people start using Nutch this will be a fairly
> old version already. I propose to upgrade to 0.11.2. We could use 0.12.1,
> but in that case with the expectation that we release a less-than-stable
> version of Nutch, soon to be followed by a minor stable release ...

+1 for using 0.11.2.  I looked through the release notes for 0.12 and
there were some niceties such as HADOOP-432 for undeletes and a lot of bug
fixes, but it didn't look like there were any critical issues as far as
Nutch is concerned.

Dennis Kubes

>
> --
> Best regards,
> Andrzej Bialecki     <><
>  ___. ___ ___ ___ _ _   __________________________________
> [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
> ___|||__||  \|  ||  |  Embedded Unix, System Integration
> http://www.sigram.com  Contact: info at sigram dot com
>
>
>



_______________________________________________
Nutch-developers mailing list
Nutch-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-developers
