[ https://issues.apache.org/jira/browse/HADOOP-2558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12557687#action_12557687 ]

stack commented on HADOOP-2558:
-------------------------------

TestHLog did a vintage hang-at-end-of-successful-test for ten hours last night:

{code}
2008-01-10 07:59:51,981 INFO  [main] hbase.HRegionServer$ShutdownThread(151): Starting shutdown thread.
    [junit] 2008-01-10 07:59:51,981 INFO  [main] hbase.HRegionServer$ShutdownThread(156): Shutdown thread complete
    [junit] Running org.apache.hadoop.hbase.TestHLog
    [junit] Starting DataNode 0 with dfs.data.dir: /tmp/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data1,/tmp/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data2
    [junit] Starting DataNode 1 with dfs.data.dir: /tmp/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data3,/tmp/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data4
    [junit] 2008-01-10 07:59:55,430 INFO  [main] hbase.HLog(313): new log writer created at /hbase/hlog.dat.000
    [junit] 2008-01-10 07:59:55,608 WARN  [IPC Server handler 1 on 37583] dfs.ReplicationTargetChooser(177): Not able to place enough replicas, still in need of 1
    [junit] 2008-01-10 07:59:55,658 DEBUG [main] hbase.HLog(301): Closing current log writer /hbase/hlog.dat.000 to get a new one
    [junit] 2008-01-10 07:59:55,662 INFO  [main] hbase.HLog(313): new log writer created at /hbase/hlog.dat.001
    [junit] 2008-01-10 07:59:55,665 DEBUG [main] hbase.HLog(346): Found 0 logs to remove using oldest outstanding seqnum of 0 from region 0
    [junit] 2008-01-10 07:59:55,670 WARN  [IPC Server handler 7 on 37583] dfs.ReplicationTargetChooser(177): Not able to place enough replicas, still in need of 1
    [junit] 2008-01-10 07:59:55,678 DEBUG [main] hbase.HLog(301): Closing current log writer /hbase/hlog.dat.001 to get a new one
    [junit] 2008-01-10 07:59:55,683 INFO  [main] hbase.HLog(313): new log writer created at /hbase/hlog.dat.002
    [junit] 2008-01-10 07:59:55,684 DEBUG [main] hbase.HLog(346): Found 0 logs to remove using oldest outstanding seqnum of 0 from region 0
    [junit] 2008-01-10 07:59:55,689 WARN  [IPC Server handler 2 on 37583] dfs.ReplicationTargetChooser(177): Not able to place enough replicas, still in need of 1
    [junit] 2008-01-10 07:59:55,694 DEBUG [main] hbase.HLog(301): Closing current log writer /hbase/hlog.dat.002 to get a new one
    [junit] 2008-01-10 07:59:55,699 INFO  [main] hbase.HLog(313): new log writer created at /hbase/hlog.dat.003
    [junit] 2008-01-10 07:59:55,700 DEBUG [main] hbase.HLog(346): Found 0 logs to remove using oldest outstanding seqnum of 0 from region 0
    [junit] 2008-01-10 07:59:55,707 INFO  [main] hbase.HLog(148): splitting 4 log(s) in /hbase
    [junit] 2008-01-10 07:59:55,708 DEBUG [main] hbase.HLog(155): Splitting 0 of 4: hdfs://localhost:37583/hbase/hlog.dat.000
    [junit] 2008-01-10 07:59:55,750 DEBUG [main] hbase.HLog(177): Creating new log file writer for path test.build.data/testSplit/hregion_14095470/oldlogfile.log; map content {}
    [junit] 2008-01-10 07:59:55,757 DEBUG [main] hbase.HLog(177): Creating new log file writer for path test.build.data/testSplit/hregion_1701666436/oldlogfile.log; map content [EMAIL PROTECTED]
    [junit] 2008-01-10 07:59:55,762 DEBUG [main] hbase.HLog(177): Creating new log file writer for path test.build.data/testSplit/hregion_1249881816/oldlogfile.log; map content [EMAIL PROTECTED], [EMAIL PROTECTED]
    [junit] 2008-01-10 07:59:55,767 DEBUG [main] hbase.HLog(192): Applied 9 total edits
    [junit] 2008-01-10 07:59:55,768 DEBUG [main] hbase.HLog(155): Splitting 1 of 4: hdfs://localhost:37583/hbase/hlog.dat.001
    [junit] 2008-01-10 07:59:55,771 DEBUG [main] hbase.HLog(192): Applied 9 total edits
    [junit] 2008-01-10 07:59:55,772 DEBUG [main] hbase.HLog(155): Splitting 2 of 4: hdfs://localhost:37583/hbase/hlog.dat.002
    [junit] 2008-01-10 07:59:55,791 DEBUG [main] hbase.HLog(192): Applied 9 total edits
    [junit] 2008-01-10 07:59:55,792 DEBUG [main] hbase.HLog(155): Splitting 3 of 4: hdfs://localhost:37583/hbase/hlog.dat.003
    [junit] 2008-01-10 07:59:55,793 INFO  [main] hbase.HLog(160): Skipping hdfs://localhost:37583/hbase/hlog.dat.003 because zero length
    [junit] 2008-01-10 07:59:55,794 WARN  [IPC Server handler 3 on 37583] dfs.ReplicationTargetChooser(177): Not able to place enough replicas, still in need of 1
    [junit] 2008-01-10 07:59:55,802 WARN  [IPC Server handler 5 on 37583] dfs.ReplicationTargetChooser(177): Not able to place enough replicas, still in need of 1
    [junit] 2008-01-10 07:59:55,811 WARN  [IPC Server handler 8 on 37583] dfs.ReplicationTargetChooser(177): Not able to place enough replicas, still in need of 1
    [junit] 2008-01-10 07:59:55,819 INFO  [main] hbase.HLog(212): log file splitting completed for /hbase
    [junit] 2008-01-10 07:59:55,821 INFO  [main] hbase.StaticTestEnvironment(135): Shutting down FileSystem
    [junit] 2008-01-10 07:59:55,822 WARN  [IPC Server handler 4 on 37583] dfs.FSNamesystem(1689): DIR* NameSystem.internalReleaseCreate: attempt to release a create lock on /hbase/hlog.dat.003 file does not exist.
    [junit] 2008-01-10 07:59:56,106 INFO  [main] hbase.StaticTestEnvironment(142): Shutting down Mini DFS
    [junit] Shutting down the Mini HDFS Cluster
    [junit] Shutting down DataNode 1
    [junit] 2008-01-10 07:59:56,419 WARN  [DataNode: [/tmp/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data3,/tmp/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data4]] dfs.DataNode(658): java.io.InterruptedIOException
    [junit]     at java.io.FileInputStream.readBytes(Native Method)
    [junit]     at java.io.FileInputStream.read(FileInputStream.java:194)
    [junit]     at java.lang.UNIXProcess$DeferredCloseInputStream.read(UNIXProcess.java:227)
    [junit]     at java.io.BufferedInputStream.read1(BufferedInputStream.java:254)
    [junit]     at java.io.BufferedInputStream.read(BufferedInputStream.java:313)
    [junit]     at sun.nio.cs.StreamDecoder$CharsetSD.readBytes(StreamDecoder.java:411)
    [junit]     at sun.nio.cs.StreamDecoder$CharsetSD.implRead(StreamDecoder.java:453)
    [junit]     at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:183)
    [junit]     at java.io.InputStreamReader.read(InputStreamReader.java:167)
    [junit]     at java.io.BufferedReader.fill(BufferedReader.java:136)
    [junit]     at java.io.BufferedReader.readLine(BufferedReader.java:299)
    [junit]     at java.io.BufferedReader.readLine(BufferedReader.java:362)
    [junit]     at org.apache.hadoop.fs.DU.parseExecResult(DU.java:73)
    [junit]     at org.apache.hadoop.util.Shell.runCommand(Shell.java:145)
    [junit] Shutting down DataNode 0
    [junit]     at org.apache.hadoop.util.Shell.run(Shell.java:100)
    [junit]     at org.apache.hadoop.fs.DU.getUsed(DU.java:53)
    [junit]     at org.apache.hadoop.dfs.FSDataset$FSVolume.getDfsUsed(FSDataset.java:299)
    [junit]     at org.apache.hadoop.dfs.FSDataset$FSVolumeSet.getDfsUsed(FSDataset.java:396)
    [junit]     at org.apache.hadoop.dfs.FSDataset.getDfsUsed(FSDataset.java:516)
    [junit]     at org.apache.hadoop.dfs.DataNode.offerService(DataNode.java:562)
    [junit]     at org.apache.hadoop.dfs.DataNode.run(DataNode.java:1736)
    [junit]     at java.lang.Thread.run(Thread.java:595)

    [junit] 2008-01-10 07:59:56,419 WARN  [EMAIL PROTECTED] dfs.ReplicationTargetChooser(177): Not able to place enough replicas, still in need of 1
    [junit] 2008-01-10 07:59:56,422 WARN  [Thread-44] util.Shell$1(137): Error reading the error stream
    [junit] java.io.InterruptedIOException
    [junit]     at java.io.FileInputStream.readBytes(Native Method)
    [junit]     at java.io.FileInputStream.read(FileInputStream.java:194)
    [junit]     at java.lang.UNIXProcess$DeferredCloseInputStream.read(UNIXProcess.java:227)
    [junit]     at sun.nio.cs.StreamDecoder$CharsetSD.readBytes(StreamDecoder.java:411)
    [junit]     at sun.nio.cs.StreamDecoder$CharsetSD.implRead(StreamDecoder.java:453)
    [junit]     at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:183)
    [junit]     at java.io.InputStreamReader.read(InputStreamReader.java:167)
    [junit]     at java.io.BufferedReader.fill(BufferedReader.java:136)
    [junit]     at java.io.BufferedReader.readLine(BufferedReader.java:299)
    [junit]     at java.io.BufferedReader.readLine(BufferedReader.java:362)
    [junit]     at org.apache.hadoop.util.Shell$1.run(Shell.java:130)
    [junit] Starting DataNode 0 with dfs.data.dir: /tmp/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data1,/tmp/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data2
    [junit] Starting DataNode 1 with dfs.data.dir: /tmp/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data3,/tmp/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data4
    [junit] 2008-01-10 07:59:58,650 INFO  [main] hbase.HLog(313): new log writer created at /hbase/hlog.dat.000
    [junit] 2008-01-10 07:59:58,652 DEBUG [main] hbase.HLog(399): closing log writer in /hbase
    [junit] 2008-01-10 07:59:58,654 WARN  [IPC Server handler 3 on 37603] dfs.ReplicationTargetChooser(177): Not able to place enough replicas, still in need of 1
    [junit] tablename/regionname/row/0 (0/1199951998651/0)
    [junit] tablename/regionname/row/1 (1/1199951998651/1)
    [junit] tablename/regionname/row/2 (2/1199951998651/2)
    [junit] tablename/regionname/row/3 (3/1199951998651/3)
    [junit] tablename/regionname/row/4 (4/1199951998651/4)
    [junit] tablename/regionname/row/5 (5/1199951998651/5)
    [junit] tablename/regionname/row/6 (6/1199951998651/6)
    [junit] tablename/regionname/row/7 (7/1199951998651/7)
    [junit] tablename/regionname/row/8 (8/1199951998651/8)
    [junit] tablename/regionname/row/9 (9/1199951998651/9)
    [junit] tablename/regionname/METAROW/10 (METACOLUMN:/1199951998652/HBASE::CACHEFLUSH)
    [junit] 2008-01-10 07:59:58,667 INFO  [main] hbase.StaticTestEnvironment(135): Shutting down FileSystem
    [junit] 2008-01-10 07:59:59,646 INFO  [main] hbase.StaticTestEnvironment(142): Shutting down Mini DFS
    [junit] Shutting down the Mini HDFS Cluster
    [junit] Shutting down DataNode 1
    [junit] Shutting down DataNode 0
    [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 7.84 sec
    [junit] Running org.apache.hadoop.hbase.TestHMemcache
    [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 1.018 sec
    [junit] Running org.apache.hadoop.hbase.TestHRegion
    [junit] Starting DataNode 0 with dfs.data.dir: /tmp/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data1,/tmp/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data2
    [junit] Starting DataNode 1 with dfs.data.dir: /tmp/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data3,/tmp/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data4
    [junit] 2008-01-10 16:21:52,123 INFO  [main] hbase.HLog(313): new log writer created at /hbase/log/hlog.dat.000
{code}
I just did a kill -9 (a plain kill and a kill -QUIT had no effect).
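
For the record, the DataNode stack traces above show threads parked in Shell/DU, blocked reading the output of a forked 'du' process. A non-daemon thread that stays blocked like that after the tests complete will keep the JVM alive indefinitely, which would produce exactly this kind of hang-after-success. Below is a minimal sketch of that pattern, assuming this is what is going on; the class name and the 'sleep' child process are illustrative stand-ins, not the actual Hadoop code:

{code}
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

// Illustrative sketch only: a non-daemon thread blocked reading a child
// process's stdout keeps the JVM alive after main() returns. The 'sleep'
// child stands in for the 'du' invocation in the stack traces above.
public class HangAfterSuccessSketch {
  public static void main(String[] args) throws IOException {
    Process child = Runtime.getRuntime().exec(new String[] {"sleep", "3600"});
    Thread reader = new Thread(() -> {
      try (BufferedReader in = new BufferedReader(
          new InputStreamReader(child.getInputStream()))) {
        in.readLine(); // blocks: no output and no EOF until the child dies
      } catch (IOException e) {
        // an interrupt can surface here as an InterruptedIOException,
        // like the ones logged above
      }
    });
    // Without setDaemon(true) here (or a child.destroy() at shutdown),
    // the JVM cannot exit after main() returns -- it just hangs.
    // reader.setDaemon(true);
    reader.start();
  }
}
{code}

If that is the culprit, marking such helper threads as daemons, or making sure the child process gets destroyed during shutdown, should let the JVM exit.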

> [hbase] fixes for build up on hudson
> ------------------------------------
>
>                 Key: HADOOP-2558
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2558
>             Project: Hadoop
>          Issue Type: Bug
>          Components: contrib/hbase
>            Reporter: stack
>         Attachments: 2558-v2.patch, 2558-v3.patch, 2558.patch
>
>
> Fixes for hbase breakage up on hudson.  There seem to be many reasons for the failures.
> One is that the .META. region all of a sudden decides it's 'no good' and gets redeployed elsewhere.  Tests don't have the tolerance for this kind of churn.  A previous commit added logging of why .META. is 'no good'.  Hopefully that will help.
> Also found a case where TestTableMapReduce would fail because there was no sleep between retries when getting new scanners.
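
On that last point, the shape of the fix is presumably a bounded retry loop that sleeps between scanner-open attempts instead of retrying immediately. A hedged sketch follows; ScannerClient, openScanner, and the retry constants are made-up names, not the identifiers from the actual patch:

{code}
import java.io.IOException;

// Hedged sketch of "sleep between retries when getting new scanners".
// All names here are illustrative; this is not the HADOOP-2558 patch itself.
public abstract class ScannerClient {
  private static final int MAX_RETRIES = 5;        // assumed retry bound
  private static final long RETRY_SLEEP_MS = 1000; // assumed pause length

  // The underlying call that can fail while a region is being redeployed.
  protected abstract AutoCloseable openScanner() throws IOException;

  public AutoCloseable getScannerWithRetries() throws IOException {
    IOException lastFailure = null;
    for (int attempt = 0; attempt < MAX_RETRIES; attempt++) {
      try {
        return openScanner();
      } catch (IOException e) {
        lastFailure = e;
        try {
          // the fix: pause so the region has a chance to come back up,
          // rather than immediately hammering it with another request
          Thread.sleep(RETRY_SLEEP_MS);
        } catch (InterruptedException ie) {
          Thread.currentThread().interrupt();
          break;
        }
      }
    }
    throw lastFailure;
  }
}
{code}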

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
