> Guys,
> 
> I added a new label 1.4 on the JIRA. Shall we create a new branch 1.4 on
> SVN from the existing 1.3? I agree that it is a pain to have to maintain
> 1.x AND trunk in parallel but my feeling is that 2.0 needs more work
> before being completely reliable and in the meantime we might want to add
> new features to the stable 1.x branch.

Agreed.

> 
> One possible feature would be to add a new endpoint for indexing-backends
> and make the indexing pluggable. At the moment we are hardwired to SOLR -
> which is OK - but as other resources like ElasticSearch are becoming more
> popular it would be better to handle this as plugins. Not sure about the
> name of the endpoint though: we already have indexing-plugins (which are
> about generating fields sent to the backends) and moreover the backends are
> not necessarily for indexing / searching but could be just an external
> storage e.g. CouchDB. The term backend on its own would be confusing in 2.0
> as this could be pertaining to the storage in GORA. 'indexing-backend' is
> the best name that came to my mind so far - please suggest better ones.

Yes, I'd like to see this `renamed` as well. It makes perfect sense to have a
plugin to `index` to CouchDB as well as send the stuff to Solr and ES. I'm
unsure how to name this. Indexing has become a bit ambiguous since 1.3.
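
To make the idea concrete, here is a rough sketch of what such an extension
point could look like, in plain Java. Every name in it (IndexingBackend,
InMemoryBackend, indexAll) is made up for illustration and is not an existing
Nutch class; the in-memory backend only stands in for a real Solr, ES or
CouchDB client.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class IndexingBackendSketch {

    /** Hypothetical extension point: one implementation per backend (Solr, ES, CouchDB, ...). */
    interface IndexingBackend {
        String name();
        void write(Map<String, String> doc);
        void delete(String url);
        void commit();
    }

    /** In-memory stand-in for illustration only; a real plugin would talk to a server. */
    static class InMemoryBackend implements IndexingBackend {
        private final String name;
        final Map<String, Map<String, String>> docs = new LinkedHashMap<>();

        InMemoryBackend(String name) { this.name = name; }

        public String name() { return name; }
        // Documents are keyed by their "url" field (an assumption of this sketch).
        public void write(Map<String, String> doc) { docs.put(doc.get("url"), doc); }
        public void delete(String url) { docs.remove(url); }
        public void commit() { /* nothing to flush in memory */ }
    }

    /** Sends every document to all configured backends, so the job stays backend-agnostic. */
    static void indexAll(List<Map<String, String>> docs, List<IndexingBackend> backends) {
        for (IndexingBackend backend : backends) {
            for (Map<String, String> doc : docs) {
                backend.write(doc);
            }
            backend.commit();
        }
    }

    public static void main(String[] args) {
        InMemoryBackend solr = new InMemoryBackend("solr");
        InMemoryBackend couch = new InMemoryBackend("couchdb");
        List<IndexingBackend> backends = new ArrayList<>();
        backends.add(solr);
        backends.add(couch);

        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("url", "http://example.org/");
        doc.put("title", "Example");
        List<Map<String, String>> docs = new ArrayList<>();
        docs.add(doc);

        indexAll(docs, backends);
        System.out.println(solr.name() + ": " + solr.docs.size() + " doc(s)");
        System.out.println(couch.name() + ": " + couch.docs.size() + " doc(s)");
    }
}
```

The point is only that the job calls write/delete/commit and never needs to
know which backend sits behind the interface.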

> 
> For 1.4 (and 2.0) it would be good to improve the detection of duplicates
> so that it detects them using mapreduce on the crawldb instead of pulling
> the docs from SOLR.

Yes, I remember a ticket for deduplicating locally (or was it mentioned in the
404 cleaner?). Anyway, this is really desired as it can put a lot of strain on
the Solr index, especially if it is also a query/slave node.
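
As a rough illustration of deduplicating on the crawldb side, here is a plain
Java sketch that mimics the two map/reduce phases without using the Hadoop
API. The Entry record, its fields and the "keep the highest score" rule are
assumptions made for the example, not what Nutch actually does.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CrawlDbDedupSketch {

    /** Hypothetical, simplified crawldb record: URL, content signature, score. */
    static class Entry {
        final String url;
        final String signature;
        final float score;

        Entry(String url, String signature, float score) {
            this.url = url;
            this.signature = signature;
            this.score = score;
        }
    }

    /** "Map" phase: key each entry by its content signature. */
    static Map<String, List<Entry>> groupBySignature(List<Entry> entries) {
        Map<String, List<Entry>> groups = new LinkedHashMap<>();
        for (Entry e : entries) {
            List<Entry> group = groups.get(e.signature);
            if (group == null) {
                group = new ArrayList<>();
                groups.put(e.signature, group);
            }
            group.add(e);
        }
        return groups;
    }

    /**
     * "Reduce" phase: per signature, keep the highest-scoring entry (the first
     * one wins on a tie) and report the URLs of all the others as duplicates.
     */
    static List<String> findDuplicates(List<Entry> entries) {
        List<String> duplicates = new ArrayList<>();
        for (List<Entry> group : groupBySignature(entries).values()) {
            Entry keep = group.get(0);
            for (Entry e : group) {
                if (e.score > keep.score) {
                    keep = e;
                }
            }
            for (Entry e : group) {
                if (e != keep) {
                    duplicates.add(e.url);
                }
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        List<Entry> entries = new ArrayList<>();
        entries.add(new Entry("http://a.example/", "sig1", 1.0f));
        entries.add(new Entry("http://b.example/", "sig1", 2.0f));
        entries.add(new Entry("http://c.example/", "sig2", 1.0f));
        System.out.println(findDuplicates(entries));
    }
}
```

The resulting list of duplicate URLs is what would then be sent to the
backend as deletes, so the index itself is never queried for the comparison.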

I think we should come up with generic map/reduce jobs for indexing, 
deduplicating and cleaning and maybe add a Nutch extension point there so we 
can easily hook up indexing, cleaning and deduplicating for various ...
endpoints?

> 
> Let's just add to the wishlist on JIRA with the tag 1.4. Is everybody happy
> with having a new branch 1.4?

I'm not everybody but +1 anyway ;)

> 
> Jul
