Nutch 2.1 officially supported MySQL as a datastore. A lot of issues were reported with MySQL, so support for it was removed in the newer 2.2.x versions. I would recommend using HBase, as it is the most stable of the supported backends.
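
In case it helps, here is a rough sketch of what pointing Nutch 2.x at HBase usually involves. It assumes a stock source checkout with the usual conf/ and ivy/ directories and a running HBase instance; please verify the property names against the tutorial for your exact version before relying on it.

```xml
<!-- conf/nutch-site.xml (sketch; assumes the default Nutch 2.x layout) -->
<configuration>
  <property>
    <!-- Tells Nutch which Gora datastore implementation to use -->
    <name>storage.data.store.class</name>
    <value>org.apache.gora.hbase.store.HBaseStore</value>
  </property>
</configuration>
<!--
  Assumed companion changes, to be checked against your version:
    conf/gora.properties:
      gora.datastore.default=org.apache.gora.hbase.store.HBaseStore
    ivy/ivy.xml:
      enable (uncomment) the gora-hbase dependency, then rebuild with ant
-->
```

After rebuilding, the crawl data should be written to HBase rather than MySQL.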

On Thu, Aug 1, 2013 at 7:01 AM, Jayadeep Reddy <[email protected]> wrote:

> Thank you Julien,
> Will get hbase and try to crawl.
>
> On Thu, Aug 1, 2013 at 7:10 PM, A Laxmi <[email protected]> wrote:
>
> > Julien - whatever you are saying about Nutch 2.x and SQL - does it apply
> > for the recent release 2.2.1 as well?
> >
> > On Thu, Aug 1, 2013 at 9:38 AM, Julien Nioche <[email protected]> wrote:
> >
> > > If you are using Nutch 2.x then you are actually accessing the SQL
> > > storage via Apache GORA. The SQL backend in GORA does not work and it
> > > is not advised to use it. If you want to use Nutch 2 then use a
> > > different backend like HBase or Cassandra or use Nutch 1.x
> > >
> > > On 1 August 2013 14:32, Jayadeep Reddy <[email protected]> wrote:
> > >
> > > > No Julien Using Mysql
> > > >
> > > > On Thu, Aug 1, 2013 at 7:00 PM, Julien Nioche <[email protected]> wrote:
> > > >
> > > > > What GORA backend are you using?
> > > > >
> > > > > On 1 August 2013 14:03, Jayadeep Reddy <[email protected]> wrote:
> > > > >
> > > > > > I am using Nutch 2.1 every time I run crawl from dmoz directory
> > > > > > my existing crawled pages in the database are fetched
> > > > > > again(Taking long time/). Is there a way to crawl only new sites.
> > > > > >
> > > > > > Thank you
> > > > > >
> > > > > > --
> > > > > > Jayadeep Reddy.S,
> > > > > > M.D & C.E.O
> > > > > > e Health Access Pvt.Ltd
> > > > > > www.ehealthaccess.com
> > > > > > Hyderabad-Chennai-Banglore
> > > > > > http://www.youtube.com/watch?v=0k5LX8mw6Sk
> > > > >
> > > > > --
> > > > > Open Source Solutions for Text Engineering
> > > > >
> > > > > http://digitalpebble.blogspot.com/
> > > > > http://www.digitalpebble.com
> > > > > http://twitter.com/digitalpebble

