I have a third datapoint. A database import here failed today when the master 
timed out the leases of some regionservers (5 out of 16), including the one 
hosting META. The regionservers restarted and eventually things came back. 
There was little in the logs to go on. The progression was:

1) In master log, lease timeout notices.

2) In regionserver logs, quiesce/restart.

3) In master log, errors related to META going away.

4) In master log, reassignment of META to a regionserver still up.

5) In master log, new start messages from the regionservers that have finished 
restarting.

6) Import continues (eventually).

Before #1 there are no errors or anything out of the ordinary in either the 
master or regionserver logs. 

I upped the regionserver lease period from 60 to 120 seconds, reinitialized, 
and ran the import again. No problems.
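
For anyone wanting to try the same change, here is a sketch of how it might 
look as an override in hbase-site.xml. It assumes the setting involved is the 
hbase.regionserver.lease.period property, with the value given in 
milliseconds; the message above does not name the property, so check it 
against the hbase-default.xml shipped with your release before relying on it.

```xml
<!-- hbase-site.xml: site-specific override (sketch only).
     The property name is an assumption; verify it in hbase-default.xml. -->
<configuration>
  <property>
    <name>hbase.regionserver.lease.period</name>
    <!-- Milliseconds: 120000 ms = 120 s, raised from 60000 ms = 60 s -->
    <value>120000</value>
  </property>
</configuration>
```

The override would need to be in place on the master and every regionserver, 
followed by a restart, before rerunning the import.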

   - Andy


--- On Wed, 7/23/08, stack <[EMAIL PROTECTED]> wrote:

> From: stack <[EMAIL PROTECTED]>
> Subject: Re: [ANN] hbase-0.2.0 Release Candidate 1
> To: [email protected]
> Date: Wednesday, July 23, 2008, 4:10 PM
> The below happens for every task?  None can see the master? 
> You verify the master is running via shell or something?
> There could be something going on here.  I heard a second
> report of such a phenomenon where there is a timeout though
> master seems to be listening fine. 




      
