[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Renaud Delbru updated MAPREDUCE-1722:
-------------------------------------

          Status: Patch Available  (was: Open)
    Release Note: 
Upgraded to use the new Hadoop API.
Upgraded to use the new Lucene API (based on a Lucene 3.1 snapshot).

A few notes:
- I have reorganised the source directories to be compatible with Maven and 
added a POM file (I work with Maven, so this made it easier for me to deploy 
and test the index components in my own applications).
- I have also changed the logger to use logback and slf4j (to match our own 
logging system). You may have to change it back to the original logging 
system.
- UpdateIndex has not been upgraded (it still uses the old API).
- I removed the assertion on the done file in TestIndexUpdater (it should be 
added back).

Possible future improvements:
- The IIndexUpdater interface is designed for a normal Hadoop job, but it is 
incompatible with a Hadoop job that takes an HBase table as input (there are 
no input paths; a table name and a scan are needed instead).
- The contrib/index component should be extended to support Solr (maybe using 
EmbeddedSolrServer) in order to create a distributed index using a Solr 
schema.

I'll work on these improvements in the future, but they will add more 
dependencies to the contrib/index components, i.e., HBase and Solr. I am not 
sure if you would like to have this.





> contrib/index - Upgrade to new Hadoop and Lucene API
> ----------------------------------------------------
>
>                 Key: MAPREDUCE-1722
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1722
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: contrib/index
>    Affects Versions: 0.20.2
>            Reporter: Renaud Delbru
>            Priority: Minor
>
> contrib/index is still using the old Hadoop API. In addition, Lucene 3.x 
> also has a new API. contrib/index should be updated to use these new 
> APIs.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
