Hello,

It has been a while since the previous release (0.8.1), and looking at the great fixes done in trunk I'd like to start thinking about baking a new release soon.

Looking at the JIRA roadmaps, there is one blocking issue (fixing the license headers) for 0.8.2 and two other blocking issues for 0.9.0, of which I think NUTCH-233 is safe to put in.

The top 10 voted issues are currently:

NUTCH-61 Adaptive re-fetch interval. Detecting unmodified content
NUTCH-48 "Did you mean" query enhancement/refinement feature
NUTCH-251 Administration GUI
NUTCH-289 CrawlDatum should store IP address
NUTCH-36 Chinese in Nutch
NUTCH-185 XMLParser is configurable xml parser plugin.
NUTCH-59 meta data support in webdb
NUTCH-92 DistributedSearch incorrectly scores results
NUTCH-68 A tool to generate arbitrary fetchlists
NUTCH-87 Efficient site-specific crawling for a large number of sites

Are there any opinions about issues that should go in before the next release? (Answering yes means that you are willing to provide a patch for it.)

--
Sami Siren