Nutch 1.x is definitely better tested and more robust than 2.x. A lot of work
is being done on the latter, but the former is probably the safer option in
production. You could use the pluggable indexer and send the documents to
HBase (ideally via GORA)? This would be an elegant way of migrating from
1.x to 2.x, BTW.


On 1 May 2013 19:41, AC Nutch <[email protected]> wrote:

> Hello All,
>
> Has anyone gotten the latest version of HBase (0.94.6) to work with Nutch 2.1
> on Ubuntu with Hadoop >= 1.0.X? I keep getting the error:
>
> Exception in thread "main" org.apache.gora.util.GoraException:
> java.lang.IllegalArgumentException: Not a host:port pair:
>
> Googling around I saw the suggestion to replace the hbase-0.90.4 jar with
> the hbase-0.94.6.jar from my hbase distro (btw I understand I'm trying to
> do something that is unsupported by using the latest hbase version). The
> suggestion didn't appear to work - I get the same error. Has anyone gotten
> the latest HBase to work with Nutch 2.1 and if so, how did you get around
> this error?
>
> As a little bit of background, the overall problem I'm trying to solve is
> that I really want to use Nutch 2.1 as opposed to the 1.6 branch for what
> will become a production application. However, I have the requirement of
> using at least Hadoop 1.0.X which, as I understand it, is not supported by
> HBase 0.90.x. On the other hand, Nutch 2.1 (or rather GORA) doesn't support
> later HBase versions, which leaves me in quite the pickle - it seems that
> either I use an older Hadoop (which I can't do) or I use Nutch 1.6 (which I
> don't want to do). Any suggestions?
>



-- 
Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble
