> Guys,
>
> I added a new label 1.4 on the JIRA. Shall we create a new branch 1.4 on
> SVN from the existing 1.3? I agree that it is a pain to have to maintain
> 1.x AND trunk in parallel but my feeling is that 2.0 needs more work
> before being completely reliable and in the meantime we might want to add
> new features to the stable 1.x branch.
Agreed.

>
> One possible feature would be to add a new endpoint for indexing-backends
> and make the indexing pluggable. At the moment we are hardwired to SOLR -
> which is OK - but as other resources like ElasticSearch are becoming more
> popular it would be better to handle this as plugins. Not sure about the
> name of the endpoint though : we already have indexing-plugins (which are
> about generating fields sent to the backends) and moreover the backends are
> not necessarily for indexing / searching but could be just an external
> storage e.g. CouchDB. The term backend on its own would be confusing in 2.0
> as this could be pertaining to the storage in GORA. 'indexing-backend' is
> the best name that came to my mind so far - please suggest better ones.

Yes, I'd like to see this `renamed` as well. It makes perfect sense to have a
plugin to `index` to CouchDB as well as send the stuff to Solr and ES. I'm
unsure how to name this. Indexing has become a bit ambiguous since 1.3.

>
> For 1.4 (and 2.0) it would be good to improve the detection of duplicates
> so that it detects them using mapreduce on the crawldb instead of pulling
> the docs from SOLR.

Yes, I remember a ticket for deduplicating locally (or was it mentioned in the
404 cleaner?). Anyway, this is really desirable, as the current approach can
put a lot of strain on the Solr index, especially if it is also a query/slave
node. I think we should come up with generic map/reduce jobs for indexing,
deduplicating and cleaning, and maybe add a Nutch extension point there so we
can easily hook up indexing, cleaning and deduplicating for various ...
endpoints?

>
> Let's just add to the wishlist on JIRA with the tag 1.4. Is everybody happy
> with having a new branch 1.4?

I'm not everybody but +1 anyway ;)

>
> Jul
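
To make the local deduplication idea a bit more concrete, here is a rough, language-neutral sketch of the map/reduce shape it could take. It is not Nutch code: the `Entry` record, its `signature` and `fetch_time` fields, and the "newest copy wins" rule are all assumptions made for illustration, and everything runs in memory rather than on Hadoop.

```python
from collections import defaultdict
from typing import NamedTuple


class Entry(NamedTuple):
    """Hypothetical stand-in for a crawldb record; not the Nutch data model."""
    url: str
    signature: str   # assumed: a digest of the page content
    fetch_time: int  # assumed tie-breaker: the newest copy is kept


def map_phase(entries):
    """Emit (signature, entry) pairs so identical content shares a key."""
    for entry in entries:
        yield entry.signature, entry


def reduce_phase(group):
    """Keep the newest entry of a group; return (kept, duplicates)."""
    ordered = sorted(group, key=lambda e: e.fetch_time, reverse=True)
    return ordered[0], ordered[1:]


def find_duplicates(entries):
    """Run both phases in memory and return the URLs flagged as duplicates."""
    groups = defaultdict(list)
    for signature, entry in map_phase(entries):
        groups[signature].append(entry)
    duplicates = []
    for group in groups.values():
        _, dups = reduce_phase(group)
        duplicates.extend(e.url for e in dups)
    return sorted(duplicates)
```

The point of the sketch is that the grouping happens entirely on the crawl side, so the index only needs to receive delete requests for the flagged URLs instead of being queried for every document.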

