Re: Help request: a single region's flush takes too long on an HBase cluster

Duo Zhang Wed, 29 Mar 2023 07:00:48 -0700

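That depends on how much data you are flushing, and on how fast the writes to
HDFS are. A dozen or so seconds is usually acceptable, not especially slow. Do
you have monitoring of the cluster's overall load and IO while the snapshot is
being taken?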
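
To make the "data size versus HDFS write speed" point concrete, here is a rough sketch of the arithmetic. The 128 MiB and 12 second figures are hypothetical placeholders, not numbers from this thread; substitute the flushed size and duration reported in your own RegionServer flush logs.

```python
def flush_throughput_mib_per_s(flushed_bytes: int, flush_seconds: float) -> float:
    """Effective write rate of one flush, in MiB/s."""
    if flush_seconds <= 0:
        raise ValueError("flush_seconds must be positive")
    return flushed_bytes / (1024 * 1024) / flush_seconds


# Hypothetical example values (assumptions, not taken from the thread):
# a 128 MiB memstore flushed in 12 seconds.
rate = flush_throughput_mib_per_s(128 * 1024 * 1024, 12.0)
print(f"{rate:.1f} MiB/s")
```

If the rate that comes out is far below what the disks and network normally sustain, the time is more likely going to HDFS than to the size of the flush itself.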


邢* <xingxuem...@163.com> wrote on Wed, 29 Mar 2023 at 19:20:

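> hi all,
>
>
> I would like to ask the community for help with an HBase issue. Snapshots
> of one large table on our HBase cluster have recently been taking too
> long, about half an hour. Going through the logs, we found that each
> region's flush takes a dozen or so seconds. Why would a flush take this
> long?
>
>     Cluster setup: 14 RegionServer machines (96 cores, 384 GB memory
> each) plus 3 ZooKeeper machines (8 cores, 32 GB memory each), running
> HBase 2.1.9.
>
>     Log entries found during recent troubleshooting:
>
>     RegionServer DFSClient log details: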
>
>     2023-03-29 16:24:55,395 INFO org.apache.hadoop.hdfs.DFSClient: Could
> not complete
> /hbase/.hbase-snapshot/.tmp/sdhz_user_info_realtime_1680076952828/region-manifest.603b4d8028af279648af4bfaa3889fd0
> retrying...
>
>     NameNode log details:
>
>     2023-03-29 16:24:54,995 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: BLOCK*
> blk_1109699571_35958782 is COMMITTED but not COMPLETE(numNodes= 0 <
> minimum = 1) in file
> /hbase/.hbase-snapshot/.tmp/sdhz_user_info_realtime_1680076952828/region-manifest.603b4d8028af279648af4bfaa3889fd0
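
To see how often the "Could not complete ... retrying..." message quoted above occurs during a snapshot, a minimal sketch like the following can count the retries per file. It assumes each log entry sits on a single line in the real RegionServer log (the quoted message is wrapped by the mail client). The helper name `count_complete_retries` and the second sample line are illustrative only.

```python
import re
from collections import Counter

# Matches the DFSClient retry message in the form quoted in this thread.
RETRY_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) INFO "
    r"org\.apache\.hadoop\.hdfs\.DFSClient: Could not complete "
    r"(?P<path>\S+) retrying\.\.\."
)


def count_complete_retries(lines):
    """Return a Counter mapping HDFS path -> number of retry messages."""
    counts = Counter()
    for line in lines:
        match = RETRY_RE.match(line.strip())
        if match:
            counts[match.group("path")] += 1
    return counts


sample = [
    # First line: the entry from the thread, joined onto one line.
    "2023-03-29 16:24:55,395 INFO org.apache.hadoop.hdfs.DFSClient: "
    "Could not complete /hbase/.hbase-snapshot/.tmp/"
    "sdhz_user_info_realtime_1680076952828/"
    "region-manifest.603b4d8028af279648af4bfaa3889fd0 retrying...",
    # Second line: made-up filler to show that other entries are ignored.
    "2023-03-29 16:24:55,400 INFO some.other.Logger: unrelated line",
]
retries = count_complete_retries(sample)
for path, n in retries.most_common():
    print(n, path)
```

On a real log, pass the open file object instead of `sample`. Many retries spread across many region-manifest files would point toward slow block completion on the HDFS side rather than toward any one region.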
