[
https://issues.apache.org/jira/browse/HBASE-26411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17436722#comment-17436722
]
Duo Zhang commented on HBASE-26411:
-----------------------------------
OK, so first, in AsyncFSWAL.syncCompleted, besides calling requestLogRoll, we
should also check whether the WAL file is already too big and, if so, stop
writing to the current writer, or maybe just abort the region server.
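
A minimal sketch of such a check, assuming a soft threshold that triggers a
roll request and a separate hard limit past which appends are refused.
WalSizeGuard, its Action enum, and both thresholds are hypothetical names used
only for illustration; they are not part of HBase, and the real decision would
have to be made inside AsyncFSWAL.syncCompleted.

```java
// Hypothetical guard, not HBase code: class, enum, and limits are illustrative.
final class WalSizeGuard {
  enum Action { CONTINUE, REQUEST_ROLL, STOP_WRITING }

  private final long rollSize;   // soft limit: ask the roller for a new file
  private final long hardLimit;  // hard limit: refuse to grow the file further

  WalSizeGuard(long rollSize, long hardLimit) {
    if (rollSize <= 0 || hardLimit < rollSize) {
      throw new IllegalArgumentException("need 0 < rollSize <= hardLimit");
    }
    this.rollSize = rollSize;
    this.hardLimit = hardLimit;
  }

  // Decide what to do after a sync, given the current length of the WAL file.
  Action onSyncCompleted(long currentFileLength) {
    if (currentFileLength >= hardLimit) {
      // The roll request was evidently not honored in time; the caller
      // could fail pending appends or abort the server at this point.
      return Action.STOP_WRITING;
    }
    if (currentFileLength >= rollSize) {
      return Action.REQUEST_ROLL;
    }
    return Action.CONTINUE;
  }

  public static void main(String[] args) {
    WalSizeGuard guard = new WalSizeGuard(128L << 20, 1L << 30);
    long[] lengths = { 64L << 20, 200L << 20, 3L << 40 };
    for (long length : lengths) {
      System.out.println(length + " bytes: " + guard.onSyncCompleted(length));
    }
  }
}
```

The hard limit is kept separate from the roll size so that a slow but
progressing roll is not punished; only a file that keeps growing far past the
roll size reaches STOP_WRITING.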
And second, why do we have an infinite wait for the write here?
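
For illustration, the trace below shows the log roller parked in the untimed
CompletableFuture.get(). A bounded wait could look like the following sketch.
BoundedWait, await, and the timeout value are hypothetical and not taken from
HBase; only CompletableFuture.get(long, TimeUnit) is the standard JDK call.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not HBase code: waits for a write future with a deadline.
final class BoundedWait {
  private BoundedWait() {}

  static <T> T await(CompletableFuture<T> future, long timeoutMs) throws IOException {
    try {
      // Timed variant of get(): gives up after timeoutMs instead of parking forever.
      return future.get(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (TimeoutException e) {
      future.cancel(true);
      throw new IOException("write did not complete within " + timeoutMs + " ms", e);
    } catch (ExecutionException e) {
      Throwable cause = e.getCause();
      throw cause instanceof IOException ? (IOException) cause : new IOException(cause);
    } catch (InterruptedException e) {
      // Restore the interrupt flag so callers can still observe it.
      Thread.currentThread().interrupt();
      throw new InterruptedIOException("interrupted while waiting for write");
    }
  }

  public static void main(String[] args) {
    CompletableFuture<Long> neverCompletes = new CompletableFuture<>();
    try {
      await(neverCompletes, 100);
    } catch (IOException e) {
      System.out.println("gave up: " + e.getMessage());
    }
  }
}
```

With a timeout, a stuck writer creation surfaces as an IOException that the
caller can handle, for example by retrying with a new writer, rather than
blocking the roller thread indefinitely.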
> Wal do not roll and write a big wal
> ------------------------------------
>
> Key: HBASE-26411
> URL: https://issues.apache.org/jira/browse/HBASE-26411
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.4.8
> Reporter: Lijin Bin
> Priority: Major
>
> We see that the WAL takes a long time to roll and writes a big WAL file of 3TB.
> According to the jstack output, the WAL writer creation hangs.
> {code}
> "regionserver/11.149.48.227:60020.logRoller" #667 daemon prio=5 os_prio=0 cpu=116916.81ms elapsed=447455.26s tid=0x00007fa35d231000 nid=0xbdd2 waiting on condition [0x00007f79c7407000]
> java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x00007f9f10df5158> (a java.util.concurrent.CompletableFuture$Signaller)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1707)
> at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
> at java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1742)
> at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
> at org.apache.hadoop.hbase.regionserver.wal.AsyncProtobufLogWriter.write(AsyncProtobufLogWriter.java:178)
> at org.apache.hadoop.hbase.regionserver.wal.AsyncProtobufLogWriter.writeMagicAndWALHeader(AsyncProtobufLogWriter.java:191)
> at org.apache.hadoop.hbase.regionserver.wal.AbstractProtobufLogWriter.init(AbstractProtobufLogWriter.java:170)
> at org.apache.hadoop.hbase.wal.AsyncFSWALProvider.createAsyncWriter(AsyncFSWALProvider.java:113)
> at org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.createWriterInstance(AsyncFSWAL.java:615)
> at org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.createWriterInstance(AsyncFSWAL.java:126)
> at org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.rollWriter(AbstractFSWAL.java:763)
> at org.apache.hadoop.hbase.regionserver.LogRoller.run(LogRoller.java:184)
> at java.lang.Thread.run(Thread.java:748)
> {code}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)