2010/12/20 Zhou Shuaifeng <[email protected]>:
> Hi,
> I checked the log. It was not the master that caused the regionserver
> shutdown; the regionserver's log rolling failed, and that caused the shutdown.
>
Did the problem block have only one replica? If you look in the hdfs
logs, they will usually list the other nodes that have the wanted block.
I don't believe you said which hbase/hdfs versions you are using. In
0.89.x hbases, at least for the WAL log, we go out of our way to
guarantee sufficient replicas.
St.Ack
> According to the log, the error occurred in the pipeline, but why is hdfs
> not able to select another good datanode when one datanode in the pipeline
> is not available?
>
>
> The log:
> 2010-12-20 09:15:41,769 FATAL org.apache.hadoop.hbase.regionserver.LogRoller: Log rolling failed with ioe:
> java.io.IOException: Error Recovery for block blk_1292656843439_2494096 failed because recovery from primary datanode 167.6.5.17:50010 failed 6 times. Pipeline was 167.6.5.17:50010. Aborting...
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:3249)
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2654)
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2837)
>
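On why the client gives up here: as far as I understand the DFSClient of
this era, when a datanode in the write pipeline fails, the client drops it
and carries on with the surviving nodes rather than picking a replacement.
The trace says "Pipeline was 167.6.5.17:50010", so only one node was left
and there was nothing to fall back to. Below is a toy Java model of that
behaviour. It is not the real DFSClient code: the class, method, and
constant names are hypothetical, and the retry limit of 5 is an assumption
chosen so that the sixth failure aborts, as in the log.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of write-pipeline error handling. NOT the real DFSClient code;
// every name in this class is hypothetical.
class PipelineRecoverySketch {

    // Assumed retry limit, chosen so the sixth failure aborts as in the log.
    static final int MAX_RECOVERY_ERRORS = 5;

    // Returns the pipeline that survives recovery, or throws once the last
    // remaining node has failed recovery more than MAX_RECOVERY_ERRORS times.
    static List<String> recover(List<String> pipeline, Set<String> unreachable)
            throws IOException {
        if (pipeline.isEmpty()) {
            throw new IllegalArgumentException("pipeline must not be empty");
        }
        List<String> remaining = new ArrayList<>(pipeline);
        int recoveryErrors = 0;
        while (true) {
            int bad = firstUnreachable(remaining, unreachable);
            if (bad < 0) {
                return remaining; // every remaining node answers: keep writing
            }
            if (remaining.size() > 1) {
                remaining.remove(bad); // shrink the pipeline, no replacement node
                continue;
            }
            // Only one node is left and it is unreachable: retry, then give up.
            recoveryErrors++;
            if (recoveryErrors > MAX_RECOVERY_ERRORS) {
                throw new IOException("recovery from primary datanode "
                        + remaining.get(0) + " failed " + recoveryErrors
                        + " times. Aborting...");
            }
        }
    }

    // Index of the first unreachable node in the pipeline, or -1 if none.
    static int firstUnreachable(List<String> nodes, Set<String> unreachable) {
        for (int i = 0; i < nodes.size(); i++) {
            if (unreachable.contains(nodes.get(i))) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        Set<String> down = new HashSet<>(Arrays.asList("dn2:50010"));
        // Three-node pipeline: the dead node is dropped and the write goes on.
        System.out.println(recover(
                Arrays.asList("dn1:50010", "dn2:50010", "dn3:50010"), down));
        // One-node pipeline: nothing to fall back to, so the write aborts.
        try {
            recover(Arrays.asList("dn2:50010"), down);
        } catch (IOException e) {
            System.out.println("abort: " + e.getMessage());
        }
    }
}
```

In this model a three-node pipeline survives losing one node, while a
pipeline that is already down to a single node aborts. That is why the
replica count of the block being written matters.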
> The corresponding code in the regionserver:
> LOG.fatal("Log rolling failed with ioe: ",
>     RemoteExceptionHandler.checkIOException(ex));
> server.checkFileSystem();
> // Abort if we get here. We probably won't recover an IOE. HBASE-1132
> server.abort();
>
> The abort() code:
> public void abort() {
>   this.abortRequested = true;
>   this.reservedSpace.clear();
>   LOG.info("Dump of metrics: " + this.metrics.toString());
>   stop();
> }
>
> The corresponding log:
> 2010-12-20 09:15:41,777 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics:
> request=9.666667, regions=1512, stores=1512, storefiles=5833,
> storefileIndexSize=1833, memstoreSize=2941, compactionQueueSize=1228,
> usedHeap=6849, maxHeap=8165, blockCacheSize=14047672,
> blockCacheFree=1698276936, blockCacheCount=0, blockCacheHitRatio=0,
> fsReadLatency=0, fsWriteLatency=59, fsSyncLatency=0
>
>
>
>
> Zhou Shuaifeng(Frank)
> HUAWEI TECHNOLOGIES CO.,LTD.
>
>
> -----Original Message-----
> From: Daniel Iancu [mailto:[email protected]]
> Sent: December 20, 2010 23:46
> To: [email protected]
> Subject: Re: all regionserver shutdown after close hdfs datanode
>
> Hi Zhou
> You should check whether the HMaster is still up. If not, check its logs:
> if for some reason the HMaster thinks HDFS is not available, it will
> shut down the HBase cluster.
> Regards
> Daniel
>
> On 12/20/2010 06:15 AM, Zhou Shuaifeng wrote:
>> Hi,
>>
>>
>>
>> I have a cluster of 8 hdfs datanodes and 8 hbase regionservers. When I
>> shut down one node (a PC with one datanode and one regionserver running),
>> all hbase regionservers shut down after a while.
>>
>> The other 7 hdfs datanodes are OK.
>>
>>
>>
>> I think this is not reasonable. HBase is a distributed system that should
>> tolerate some abnormal nodes. So, what's the matter? Is there any
>> configuration that can solve this problem, or is it a bug?
>>
>>
>>
>> Thanks and best Regards.
>>
>>
>>
>> Zhou
>
> --
> Daniel Iancu
> Java Developer,Web Components Romania
> 1&1 Internet Development srl.
> 18 Mircea Eliade St
> Sect 1, Bucharest
> RO Bucharest, 012015
> www.1and1.ro
> Phone:+40-031-223-9081
> Email:[email protected]
> IM:[email protected]
>
>
>
>