Hi Otis,

Thank you for this. From reading various posts on this list and the roadmap for Nutch 2.0, I had gathered that using HBase was probably the most supported option within the community.
Lewis

________________________________________
From: Otis Gospodnetic [[email protected]]
Sent: 16 January 2011 10:45
To: [email protected]
Subject: Re: Database data storage question

There are lots of factors to consider, so one can't give a good general answer, but:

Nutch already uses HBase (trunk), so that's +1 for HBase.
HBase makes it easy to scale and has built-in replication thanks to being built on top of HDFS.

Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

----- Original Message ----
> From: "McGibbney, Lewis John" <[email protected]>
> To: "[email protected]" <[email protected]>
> Sent: Fri, January 14, 2011 8:00:50 AM
> Subject: Database data storage question
>
> Hello List,
>
> I am gathering information on the above topic as I intend to integrate a
> database to store fetched data. I would like community input of any experiences
> using different database implementations before doing so. E.g. comparison
> between HBase & MySQL etc.
>
> Thank you
>
> Lewis
>
> Glasgow Caledonian University is a registered Scottish charity, number SC021474
>
> Winner: Times Higher Education's Widening Participation Initiative of the Year
> 2009 and Herald Society's Education Initiative of the Year 2009
> http://www.gcu.ac.uk/newsevents/news/bycategory/theuniversity/1/name,6219,en.html

