Is NUTCH-442 going to be part of the 1.0 release? I hope so; Nutch/
Solr integration would be huge.
Just my two cents.
-John
On Nov 27, 2008, at 12:10 PM, Doğacan Güney wrote:
And here is a list of issues from me that need more discussion/
review:
NUTCH-442 - Integrate Nutch/Solr: If NUTCH-442 is too complex for
people to review, for now we can just write a SolrIndexer like Sami
Siren's and deal with 442 after 1.0. I would be happy to provide such
a patch.
NUTCH-631 - MoreIndexingFilter fails with NoSuchElementException: I
don't know how to fix this one, but indexing almost always fails with
index-more enabled.
NUTCH-652 - AdaptiveFetchSchedule#setFetchSchedule doesn't calculate
fetch interval correctly: I botched it once so now I am afraid to
commit it :D
NUTCH-626 - fetcher2 breaks out the domain with
db.ignore.external.links set at cross domain redirects: I am going to
update the patch and commit it if there are no objections.
Also, I think NUTCH-658 would be a nice feature for 1.0.
There are some others but these are the most recent and we really
should push 1.0 out the door already :D
Oh, and finally, we should do a review of all libraries in Nutch
(libraries in plugins included) and update them to their latest versions.
I am going to open an issue with the intention of updating all the
libraries that do not require code changes.
--
Doğacan Güney