Pluggable indexing backends
---------------------------

                 Key: NUTCH-1047
                 URL: https://issues.apache.org/jira/browse/NUTCH-1047
             Project: Nutch
          Issue Type: Improvement
          Components: indexer
    Affects Versions: 1.4
            Reporter: Julien Nioche
             Fix For: 1.4


One possible feature would be to add a new endpoint for indexing backends and 
make indexing pluggable. At the moment we are hardwired to Solr, which is OK, 
but as other resources like ElasticSearch are becoming more popular it would 
be better to handle this as plugins. I am not sure about the name of the 
endpoint though: we already have indexing-plugins (which are about generating 
the fields sent to the backends), and moreover the backends are not 
necessarily for indexing / searching but could be just an external storage, 
e.g. CouchDB. The term 'backend' on its own would be confusing in 2.0, as it 
could be taken to refer to the storage in GORA. 'indexing-backend' is the best 
name that has come to my mind so far; please suggest better ones.

We should come up with generic map/reduce jobs for indexing, deduplicating and 
cleaning and maybe add a Nutch extension point there so we can easily hook up 
indexing, cleaning and deduplicating for various end-points.
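
As a rough illustration of the idea above, here is a minimal sketch of what a 
pluggable backend contract could look like. Everything in it is an assumption 
made for illustration: the names IndexingBackend, InMemoryBackend and 
BackendRegistry are hypothetical and are not part of the Nutch API, and the 
registry merely stands in for Nutch's real plugin / extension-point mechanism. 
The point is only that generic indexing, cleaning and deduplicating jobs could 
talk to one small interface and leave the Solr / ElasticSearch / CouchDB 
specifics to the plugin.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

public class IndexingBackendSketch {

    /** Hypothetical contract a backend plugin would implement (not a Nutch API). */
    public interface IndexingBackend extends AutoCloseable {
        void write(String id, Map<String, String> fields); // used by an indexing job
        void delete(String id);                            // used by cleaning / dedup jobs
        void commit();                                     // make staged changes visible
        @Override
        void close();
    }

    /** Toy backend that keeps documents in memory; stands in for a real plugin. */
    public static class InMemoryBackend implements IndexingBackend {
        private final Map<String, Map<String, String>> staged = new LinkedHashMap<>();
        private Map<String, Map<String, String>> committed = new LinkedHashMap<>();

        @Override
        public void write(String id, Map<String, String> fields) {
            staged.put(id, new HashMap<>(fields)); // defensive copy of the fields
        }

        @Override
        public void delete(String id) {
            staged.remove(id);
        }

        @Override
        public void commit() {
            committed = new LinkedHashMap<>(staged); // snapshot of the staged state
        }

        @Override
        public void close() {
            // nothing to release for the in-memory case
        }

        public int committedCount() {
            return committed.size();
        }
    }

    /** Hypothetical name-to-factory lookup, in place of a real plugin repository. */
    public static class BackendRegistry {
        private final Map<String, Supplier<IndexingBackend>> factories = new HashMap<>();

        public void register(String name, Supplier<IndexingBackend> factory) {
            factories.put(name, factory);
        }

        public IndexingBackend create(String name) {
            Supplier<IndexingBackend> factory = factories.get(name);
            if (factory == null) {
                throw new IllegalArgumentException("Unknown indexing backend: " + name);
            }
            return factory.get();
        }
    }

    public static void main(String[] args) {
        BackendRegistry registry = new BackendRegistry();
        registry.register("memory", InMemoryBackend::new);

        // A generic job would only see the interface, never the concrete backend.
        IndexingBackend backend = registry.create("memory");
        backend.write("doc1", Collections.singletonMap("title", "first"));
        backend.write("doc2", Collections.singletonMap("title", "second"));
        backend.delete("doc1");
        backend.commit();
        System.out.println("committed docs: " + ((InMemoryBackend) backend).committedCount());
        backend.close();
    }
}
```

In this sketch the backend would be selected by name (for example from a 
configuration property), so switching from one store to another would not 
require touching the jobs themselves.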

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
