And here is a list of issues from me that needs more discussion/review:

NUTCH-442 - Integrate Nutch/Solr: If NUTCH-442 is too complex to
review for people, for now we can just write a SolrIndexer like Sami
Siren's and deal with 442 after 1.0. I would be happy to provide such
a patch.

NUTCH-631 - MoreIndexingFilter fails with NoSuchElementException: I
don't know how to fix this one but indexing almost always fails with
index-more enabled.

NUTCH-652 - AdaptiveFetchSchedule#setFetchSchedule doesn't calculate
fetch interval correctly: I botched it once so now I am afraid to
commit it :D

NUTCH-626 - fetcher2 breaks out the domain with
db.ignore.external.links set at cross domain redirects: I am going to
update the patch and commit it if no objections.

Also, I think NUTCH-658 would be a nice feature for 1.0.

There are some others but these are the most recent and we really
should push 1.0 out the door already :D

Oh and finally we should do a review of all libraries in nutch
(libraries in plugins included) and update them to latest versions. I
am going to open an issue with the intenton of updating all the
libraries that do not require code changes.

-- 
Doğacan Güney

Reply via email to