Help needed: a single region flush takes too long on our HBase cluster

邢* Wed, 29 Mar 2023 04:20:47 -0700

Hi all,

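    I would like to ask the community for help with an HBase issue. Taking a snapshot of one large table on our HBase cluster has recently become very slow, at roughly half an hour per snapshot. Going through the logs, we found that each region flush takes a little over ten seconds. Why would a flush take this long?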


    Current cluster setup: 14 RegionServer machines (96 cores, 384 GB memory each) plus 3 ZooKeeper machines (8 cores, 32 GB memory each). The HBase version is 2.1.9.

    The log entries we have found so far are as follows:

    RegionServer DFSClient log:

    2023-03-29 16:24:55,395 INFO org.apache.hadoop.hdfs.DFSClient: Could not complete /hbase/.hbase-snapshot/.tmp/sdhz_user_info_realtime_1680076952828/region-manifest.603b4d8028af279648af4bfaa3889fd0 retrying...

    NameNode log:

    2023-03-29 16:24:54,995 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: BLOCK* blk_1109699571_35958782 is COMMITTED but not COMPLETE(numNodes= 0 <  minimum = 1) in file /hbase/.hbase-snapshot/.tmp/sdhz_user_info_realtime_1680076952828/region-manifest.603b4d8028af279648af4bfaa3889fd0
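
    To see how many region manifests hit this retry during a snapshot, the RegionServer log could be scanned with something like the rough Python sketch below. It assumes each DFSClient entry sits on a single line in the format shown above. The function name count_retries and the file name regionserver.log are hypothetical, chosen only for illustration.

```python
import re
from collections import Counter

# Matches the DFSClient retry message in the single-line format quoted above.
# Assumption: timestamp, level, logger and message all sit on one log line.
RETRY_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) INFO "
    r"org\.apache\.hadoop\.hdfs\.DFSClient: "
    r"Could not complete (?P<path>\S+) retrying"
)


def count_retries(lines):
    """Count 'Could not complete ... retrying' messages per HDFS file path."""
    counts = Counter()
    for line in lines:
        match = RETRY_RE.match(line.strip())
        if match:
            counts[match.group("path")] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical log file name; point this at the real RegionServer log.
    with open("regionserver.log", encoding="utf-8", errors="replace") as fh:
        for path, n in count_retries(fh).most_common(20):
            print(n, path)
```

    A high retry count concentrated on the region-manifest files would suggest that the time is being spent waiting for HDFS to complete the block, not in the memstore flush itself, but this is only a way to narrow things down.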

