Thanks a lot for the suggestion, Julien. I suspected that might be the case, and I really appreciate the recommendation to use 1.x for robustness.
Also that sounds like a wonderful idea regarding extending the indexer. I think that's exactly what we'll do! Is this something you all would be interested in having as part of the 1.x code base? We would be glad to contribute it back to you all once we have done this.

Alex

On Wed, May 1, 2013 at 4:25 PM, Julien Nioche <[email protected]> wrote:

> Nutch 1.x is definitely more tested and robust than 2.x. Loads of work is
> done for the latter but the former is probably a safer option in
> production. You could use the pluggable indexer and send the documents to
> HBase (ideally via GORA)? This would be an elegant way of migrating from
> 1.x to 2.x BTW.
>
> On 1 May 2013 19:41, AC Nutch <[email protected]> wrote:
>
> > Hello All,
> >
> > Has anyone gotten the latest version of HBase 0.94.6 to work with Nutch 2.1
> > on Ubuntu with Hadoop >= 1.0.X. I keep getting the error:
> >
> > Exception in thread "main" org.apache.gora.util.GoraException:
> > java.lang.IllegalArgumentException: Not a host:port pair:
> >
> > Googling around I saw the suggestion to replace the hbase-0.90.4 jar with
> > the hbase-0.94.6.jar from my hbase distro (btw I understand I'm trying to
> > do something that is unsupported by using the latest hbase version). The
> > suggestion didn't appear to work - I get the same error. Has anyone gotten
> > the latest HBase to work with Nutch 2.1 and if so, how did you get around
> > this error?
> >
> > As a little bit of background, the overall problem I'm trying to solve is
> > that I really want to use Nutch 2.1 as opposed to the 1.6 branch for what
> > will become a production application. However, I have the requirement of
> > using at least Hadoop 1.0.X which, as I understand it, is not supported by
> > HBase 0.90.x. On the other hand, Nutch 2.1 (or rather GORA) doesn't support
> > later HBase versions, which leaves me in quite the pickle - it seems that
> > either I use an older Hadoop (which I can't do) or I use Nutch 1.6 (which I
> > don't want to do). Any suggestions?
>
> --
> Open Source Solutions for Text Engineering
>
> http://digitalpebble.blogspot.com/
> http://www.digitalpebble.com
> http://twitter.com/digitalpebble

