And here is a list of issues from me that need more discussion/review:

NUTCH-442 - Integrate Nutch/Solr: If NUTCH-442 is too complex for people to review, for now we can just write a SolrIndexer like Sami Siren's and deal with 442 after 1.0. I would be happy to provide such a patch.

NUTCH-631 - MoreIndexingFilter fails with NoSuchElementException: I don't know how to fix this one, but indexing almost always fails with index-more enabled.

NUTCH-652 - AdaptiveFetchSchedule#setFetchSchedule doesn't calculate fetch interval correctly: I botched it once so now I am afraid to commit it :D

NUTCH-626 - fetcher2 breaks out the domain with db.ignore.external.links set at cross domain redirects: I am going to update the patch and commit it if there are no objections.

Also, I think NUTCH-658 would be a nice feature for 1.0.

There are some others, but these are the most recent, and we really should push 1.0 out the door already :D

Oh, and finally, we should review all libraries in Nutch (libraries in plugins included) and update them to their latest versions. I am going to open an issue with the intention of updating all the libraries that do not require code changes.

-- Doğacan Güney