[ https://issues.apache.org/jira/browse/HBASE-14317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14730360#comment-14730360 ]
Hudson commented on HBASE-14317:
--------------------------------
FAILURE: Integrated in HBase-TRUNK #6778 (See [https://builds.apache.org/job/HBase-TRUNK/6778/])
HBASE-14317 Stuck FSHLog: bad disk (HDFS-8960) and can't roll WAL (stack: rev 661faf6fe0833726d7ce7ad44a829eba3f8e3e45)
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogKey.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SyncFuture.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogWriter.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestWALLockup.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/LogRoller.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogReader.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MultiVersionConcurrencyControl.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSWALEntry.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestMultiVersionConcurrencyControlBasic.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestMultiVersionConcurrencyControl.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALKey.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/DamagedWALException.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestFailedAppendAndSync.java
HBASE-14317 Stuck FSHLog: bad disk (HDFS-8960) and can't roll WAL; addendum
(stack: rev 54717a6314ef6673f7607091e5f77321c202d49f)
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
> Stuck FSHLog: bad disk (HDFS-8960) and can't roll WAL
> -----------------------------------------------------
>
> Key: HBASE-14317
> URL: https://issues.apache.org/jira/browse/HBASE-14317
> Project: HBase
> Issue Type: Bug
> Affects Versions: 1.2.0, 1.1.1
> Reporter: stack
> Assignee: stack
> Priority: Blocker
> Fix For: 2.0.0, 1.2.0, 1.0.3, 1.1.3
>
> Attachments: 14317.test.txt, 14317v10.txt, 14317v11.txt,
> 14317v12.txt, 14317v13.txt, 14317v14.txt, 14317v15.txt,
> 14317v5.branch-1.2.txt, 14317v5.txt, 14317v9.txt, HBASE-14317-v1.patch,
> HBASE-14317-v2.patch, HBASE-14317-v3.patch, HBASE-14317-v4.patch,
> HBASE-14317.patch, [Java] RS stuck on WAL sync to a dead DN -
> Pastebin.com.html, append-only-test.patch, raw.php, repro.txt, san_dump.txt,
> subset.of.rs.log
>
>
> hbase-1.1.1 and hadoop-2.7.1
> We try to roll logs because we can't append (see HDFS-8960), but we get stuck.
> See the attached thread dump and associated log. What is interesting is that
> the syncers are waiting for syncs to run while, at the same time, we want to
> flush and so are waiting on a safe point, yet there seems to be nothing in our
> ring buffer. Did we go to roll the log without adding a safe point sync to
> clear out the ring buffer?
> Needs a bit of study. Try to reproduce.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)