[ https://issues.apache.org/jira/browse/HBASE-27850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17756348#comment-17756348 ]
Haoze Wu commented on HBASE-27850:
----------------------------------
Hello, could you please share the logs that were printed out?
> TimeoutIOException: Failed to get sync result after 300000 ms for
> txid=16920651960, WAL system stuck?
> -----------------------------------------------------------------------------------------------------
>
> Key: HBASE-27850
> URL: https://issues.apache.org/jira/browse/HBASE-27850
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Affects Versions: 2.2.6
> Environment: hbase 2.2.6
> hadoop 3.3.1
> Reporter: longping_jie
> Priority: Major
> Attachments: 49151.log1
>
>
> On a node in an RSGroup (hosting only one table), the write call queue
> became blocked at a certain moment. From then on, the read and write QPS of
> this table dropped to 0 and clients could not read or write the table. From
> the time the RS call queue became blocked, the following error was reported
> continuously in the log:
>
> 2023-05-08 12:42:27,310 ERROR [MemStoreFlusher.2] regionserver.MemStoreFlusher: Cache flush failed for region user_feature_v2,eacf_1658057555,1660314723816.2376cc2326b5372131cc530b115d959a.
> org.apache.hadoop.hbase.exceptions.TimeoutIOException: Failed to get sync result after 300000 ms for txid=16920651960, WAL system stuck?
>     at org.apache.hadoop.hbase.regionserver.wal.SyncFuture.get(SyncFuture.java:155)
>     at org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.blockOnSync(AbstractFSWAL.java:743)
>     at org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.sync(AsyncFSWAL.java:625)
>     at org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.sync(AsyncFSWAL.java:602)
>     at org.apache.hadoop.hbase.regionserver.HRegion.doSyncOfUnflushedWALChanges(HRegion.java:2754)
>     at org.apache.hadoop.hbase.regionserver.HRegion.internalPrepareFlushCache(HRegion.java:2691)
>     at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2549)
>     at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2523)
>     at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:2409)
>     at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:611)
>     at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:580)
>     at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$1000(MemStoreFlusher.java:68)
>     at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:360)
>     at java.lang.Thread.run(Thread.java:748)
> The data in the node's memstore could not be flushed because the WAL sync
> timed out. Other indicators of the node were normal, and HDFS was not under
> pressure. After the blocked node was restarted, the table returned to normal.
>