Dear Wiki user, You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.
The "GORA_HBase" page has been changed by LewisJohnMcgibbney:
http://wiki.apache.org/nutch/GORA_HBase?action=diff&rev1=11&rev2=12

- This document describes how to get Nutch to use HBase as a backend for GORA and is based on the revision 993857 of the Nutch trunk
+ = Nutch 2.0 Tutorial =
+ {{http://www.interadvertising.co.uk/files/nutch_logo_medium.gif}} {{http://gora.apache.org/images/gora-logo.png}} {{http://hbase.apache.org/images/hbase_logo.png}}

- * Install and configure HBase 0.20.6. You can check it out from [[http://svn.apache.org/repos/asf/hbase/tags/0.20.6/|here]] ('''N.B.''' It is important that you grab HBase version 0.20.6 at this is supported by Gora)
+ This document describes how to get Nutch 2.0 to use HBase as a storage backend for Gora.
+
+ * Install and configure HBase. You can get it [[http://www.apache.org/dyn/closer.cgi/hbase/|here]] ('''N.B.''' Gora 0.2 uses HBase 0.90.4; however, the setup is known to work with more recent versions of HBase.)
  * Specify the GORA backend in nutch-site.xml
  {{{
@@ -12, +15 @@
  <description>Default class for storing data</description>
  </property>
  }}}
- Note: Currently HBaseStore is NOT YET THREAD-SAFE, so all processes should have single threaded settings (i.e. set number of fetchers to 1). Work to make it thread-safe is in progress.
  * Compile Nutch -> ant runtime
  * Make sure HBase is started and working properly as per the quick start tutorial [[http://hbase.apache.org/book/quickstart.html|here]]
@@ -24, +26 @@
  nutch readdb
  }}}
- You should find more details in the logs on ''$NUTCH_HOME/runtime/local/logs/hadoop.log''
+ You should find more details in the logs on ''$NUTCH_HOME/runtime/local/logs/hadoop.log''.
+ For more details of the command line interface options, please see [[http://wiki.apache.org/nutch/CommandLineOptions|here]], or run ./bin/nutch, which will print usage to stdout.
+ Finally, for a more detailed Nutch (1.X) tutorial, please see [[http://wiki.apache.org/nutch/NutchTutorial|here]]
+
+ '''back to FrontPage'''
+

