Thanks for your response St.Ack!

No exceptions are being thrown -- the task gets killed once the default 10-minute 
task timeout is up.  I put some log statements in setup() and it does try to get 
the table, but the log statement after the call to pool.getTable(name) is never 
reached.

The hbase-default.xml and hbase-site.xml files are identical on every machine 
(double-checked).  However, I did notice that the zookeeper folder was not 
present on the machines that failed.  I put the folder on all the machines, 
restarted ZooKeeper and HBase, and still had no luck.

All the region servers are showing up in the UI with regions in them, and the 
count does run through all the data in the table.

Another thing I noticed: there is no ZooKeeper QuorumPeerMain process running 
on any machine except the two that are functioning.  The other eight machines 
are running only an HRegionServer.  Is this an issue?  I was told that ZooKeeper 
does not need to be running on them for things to work properly.
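(For what it's worth, in a standalone ZooKeeper deployment, QuorumPeerMain only 
runs on the hosts listed as servers in zoo.cfg, so seeing it on just the quorum 
machines may be expected.  A sketch of what I mean -- hostnames here are 
hypothetical, not our actual config:)

```
# zoo.cfg on each ZooKeeper host (hypothetical hostnames)
tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
initLimit=5
syncLimit=2
# Only the hosts named in server.N entries run QuorumPeerMain:
server.1=node1.example.com:2888:3888
server.2=node2.example.com:2888:3888
```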

Thanks!
-- Adam

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Stack
Sent: Friday, July 15, 2011 12:37 PM
To: [email protected]
Subject: Re: Failure to get HTable using MapReduce on some Nodes

On Fri, Jul 15, 2011 at 7:36 AM, Adam Shook <[email protected]> wrote:
> I am running a MapReduce job using standard input and output formats and 
> using an HTable as a reference data set in my Mapper code.  I am using a 
> small cluster of around 10 nodes.  In my setup phase I am using an HTablePool 
> to get a reference to a table.  On all but two nodes, the call to getting the 
> table hangs and eventually causes the task to fail.  However, on two of the 
> 10 machines, it retrieves the table and it's business as usual.  (I also just 
> tried creating a new HTable without the pool - no dice).
>

Any exception thrown?

> It just so happens that the two machines I can successfully get the table on 
> are the ones listed in the hbase.zookeeper.quorum property in hbase-site.xml.
>

Any other configuration differences on these machines?  Are the two
working nodes using localhost to find zk, and finding something only
because a zk instance is actually running locally (whereas the others
fail to find a zk member)?
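One quick way to rule this out is to make sure every client/task node's
hbase-site.xml names the actual quorum hosts explicitly, rather than falling
back to the localhost default.  Something along these lines (hostnames are
hypothetical):

```xml
<!-- hbase-site.xml on every node that runs tasks; hostnames are hypothetical -->
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>node1.example.com,node2.example.com</value>
</property>
<property>
  <name>hbase.zookeeper.property.clientPort</name>
  <value>2181</value>
</property>
```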

>  I was told that the other machines don't need to be in this file - ZooKeeper 
> will handle everything and I should be able to get a table just fine.  The 
> cluster is configured so HBase is not managing ZooKeeper.
>
This should be fine.

> If I ssh into any of the 8 machines that don't work, I am able to use the 
> HBase shell and scan through a table.
>

If you look at your UI, are all ten machines showing, all with regions
loaded?  If you run a count from the shell, does it run through all of
your table contents?  (Could take a while if the table is big.)

St.Ack

> Any help would be very much appreciated.
>
> Thanks!
> Adam
>
