You are correct, my bad. I forgot to mention that my cluster runs HBase 0.94.2, which makes it incompatible with Nutch 2.1. And there is that bug I mentioned with MySQL (NUTCH-970). So should I go for Nutch 1.6?

On Mon, Feb 18, 2013 at 3:31 PM, kiran chitturi <[email protected]> wrote:

> Hi Amit,
>
> Nutch 2.1 with Hbase is stable than using MySQL as backend. Please check
> the link here [0] on how to use Hbase as backend.
>
> [0] - http://wiki.apache.org/nutch/Nutch2Tutorial
>
>
> On Mon, Feb 18, 2013 at 8:07 AM, Amit Sela <[email protected]> wrote:
>
> > Hi all,
> >
> > I installed Nutch 2.1 with Gora and MySQL and I tried running the inject
> > job i got the following exception:
> >
> > org.apache.gora.util.GoraException: java.io.IOException:
> > com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Column length
> > too big for column 'text' (max = 16383); use BLOB or TEXT instead
> >
> > Then I found out it's a known BUG
> > NUTCH-970<https://issues.apache.org/jira/browse/NUTCH-970>
> >
> > So what version should I use for a stable crawler to parse about 12MM
> > urls ?
> > I want to try it first on my laptop (with much less urls to parse...) and
> > then deploy on an existing Hadoop cluster.
> >
> > Any suggestions ?
> >
> > Thanks,
> >
> > Amit.
> >
>
>
>
> --
> Kiran Chitturi
>

