Hi Iqbal,

> Am doing a POC to help decide if we should be using Nutch 1.9 or 2.2.1
> version.
>
> We would be indexing our crawled data in ElasticSearch 1.x version.
>
> I know the 2.2.1 version provides OTB support for Elastic 0.x version but
> to use 2.x I need to change the code (ElasticWriter.java) This means its a
> customised Nutch installation, which I don't prefer.
>
> However even though 1.9 doesn't provide Elastic as default it does support
> 1.x OTB which means no code change at all. And this is a big advantage.

what do you mean by '1.9 doesn't provide ES by default'?

> I don't really need the flexibility provided by GORA as we're ok to use
> HBase.

do you mean HDFS?

> Also 2.x doesn't seem to have periodic commits compared to 1.9
>
> Therefore I was wondering what others think as am not sure about the
> Roadmap going forward, are we going to cease 1.x at some point and migrate
> the missing functionality to 2.x or we going to continue to have two
> parallel versions.

more likely two parallel versions. 2.x is not making much progress. IMHO of the two versions 1.x is not the one which is going to die first ;-)

> Any suggestion to help me make my decision please?

See discussion on this list ( http://www.mail-archive.com/[email protected]/msg12550.html). 1.x is more robust, faster and more actively maintained. Since it sounds like you don't have any need for any specific features from 2.x then I'd recommend to use 1.x.

HTH

Julien

> Thanks,
>
> Iqbal Shaikh
> Transform is a trading division of Engine Partners UK LLP, a limited
> liability partnership registered in England & Wales with registered number
> OC365812.
> Our registered office is at 60 Great Portland Street, London W1W 7RT,
> United Kingdom.
> A list of our members is open for inspection at our registered office.

--
Open Source Solutions for Text Engineering
http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble

