I solved my issue, and I'm writing up how I fixed it in case somebody else runs into the same problem.
*/etc/hosts* was not configured properly: I had to configure it as described
in [0]. On each machine of my cluster, I had to comment out the line
*127.0.0.1 localhost* and add *localhost* to the line where my master's
address was written. A before/after sketch of the change is at the bottom of
this message, below the quoted thread.

[0] http://stackoverflow.com/questions/7791788/hbase-client-do-not-able-to-connect-with-remote-hbase-server

2013/1/31 Adriana Farina <[email protected]>

> Hello,
>
> I've set up a cluster of 4 machines with Hadoop 1.0.4 and I'm trying to
> run Nutch 2.0 in distributed mode, using HBase 0.90.4 to store crawling
> information.
> I've followed the Nutch2Tutorial <https://wiki.apache.org/nutch/Nutch2Tutorial>
> and configured HBase following the guide
> http://hbase.apache.org/book/quickstart.html.
> However, when I try to run Nutch, the crawling process runs for a little
> bit and then I get the following exception:
>
> org.apache.gora.util.GoraException: org.apache.hadoop.hbase.MasterNotRunningException: master:60000
>         at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:167)
>         at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:118)
>         at org.apache.gora.mapreduce.GoraOutputFormat.getRecordWriter(GoraOutputFormat.java:88)
>         at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.<init>(MapTask.java:628)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:753)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
>         at org.apache.hadoop.mapred.Child.main(Child.java:249)
> Caused by: org.apache.hadoop.hbase.MasterNotRunningException: master:60000
>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:396)
>         at org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:94)
>         at org.apache.gora.hbase.store.HBaseStore.initialize(HBaseStore.java:108)
>         at org.apache.gora.store.DataStoreFactory.initializeDataStore(DataStoreFactory.java:102)
>         at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:161)
>         ... 10 more
>
> After that, the crawling process keeps on running, but after some
> map/reduce cycles it outputs that exception again, and so on.
> The strange thing is that the HBase master is up and running: there are no
> errors in the log files and I can access http://localhost:60010/ with no
> problem.
>
> My hbase-site.xml is:
>
> <configuration>
>
>   <property>
>     <name>hbase.master</name>
>     <value>crawler1a:60000</value>
>     <description>The host and port that the HBase master runs
>     at.</description>
>   </property>
>
>   <property>
>     <name>hbase.rootdir</name>
>     <value>hdfs://*master ip address*:54310/hbase</value>
>     <description>The directory shared by region servers.</description>
>   </property>
>
>   <property>
>     <name>hbase.cluster.distributed</name>
>     <value>true</value>
>     <description>The mode the cluster will be in. Possible values are
>       false: standalone and pseudo-distributed setups with managed Zookeeper
>       true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)
>     </description>
>   </property>
>
>   <!--<property>
>     <name>hbase.zookeeper.quorum</name>
>     <value>*master ip address*</value>
>   </property>-->
>
>   <property>
>     <name>hbase.zookeeper.property.dataDir</name>
>     <value>/usr/local/hbase-0.90.4/zookeeper_data</value>
>   </property>
>
>   <property>
>     <name>hbase.zookeeper.quorum</name>
>     <value>*cluster machines addresses*</value>
>     <description>Comma separated list of servers in the ZooKeeper Quorum.
>       For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com".
>       By default this is set to localhost for local and pseudo-distributed
>       modes of operation. For a fully-distributed setup, this should be set
>       to a full list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set
>       in hbase-env.sh this is the list of servers which we will start/stop
>       ZooKeeper on.
>     </description>
>   </property>
>
>   <property>
>     <name>zookeeper.session.timeout</name>
>     <value>30000</value>
>     <description>ZooKeeper session timeout.
>       HBase passes this to the zk quorum as suggested maximum time for a
>       session (this setting becomes zookeeper's 'maxSessionTimeout'). See
>       http://hadoop.apache.org/zookeeper/docs/current/zookeeperProgrammers.html#ch_zkSessions
>       "The client sends a requested timeout, the server responds with the
>       timeout that it can give the client." In milliseconds.
>     </description>
>   </property>
>
> </configuration>
>
> Searching on Google, I found that it can be an issue with /etc/hosts, but
> it's correctly configured:
>
>   127.0.0.1 crawler1a localhost.localdomain localhost
>
> where crawler1a is the master machine both for Hadoop and for HBase.
>
> Can anybody help?
>
> Thank you very much.
>
> --
> Adriana Farina

--
Adriana Farina
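P.S. For reference, here is a before/after sketch of the /etc/hosts change on
each cluster machine. The address 192.168.1.10 is a made-up placeholder for
the master's real IP; substitute your own (and keep whatever entries you have
for the other nodes):

    # Before (crawler1a resolved to the loopback address, so remote clients
    # failed with MasterNotRunningException):
    127.0.0.1   crawler1a localhost.localdomain localhost

    # After (the loopback line is commented out; localhost is added to the
    # line carrying the master's real address):
    # 127.0.0.1   crawler1a localhost.localdomain localhost
    192.168.1.10   crawler1a localhost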
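P.P.S. To double-check that the master is reachable after editing /etc/hosts
and before re-running the crawl, a minimal sketch like the one below can be
compiled against the HBase 0.90.x client jars. The class name is my own
invention, and it assumes hbase-site.xml is on the classpath:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.MasterNotRunningException;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    public class CheckHBaseMaster {
        public static void main(String[] args) throws Exception {
            // Picks up hbase-site.xml (and hbase-default.xml) from the classpath.
            Configuration conf = HBaseConfiguration.create();
            try {
                // Throws MasterNotRunningException when the master cannot be
                // reached -- the same failure Gora surfaced in the trace above.
                HBaseAdmin.checkHBaseAvailable(conf);
                System.out.println("HBase master is reachable.");
            } catch (MasterNotRunningException e) {
                System.out.println("Master not running: " + e.getMessage());
            }
        }
    }

Running "status" in the hbase shell on the master gives a similar quick check
from the command line.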

