[ 
https://issues.apache.org/jira/browse/HBASE-3234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12931960#action_12931960
 ] 

stack commented on HBASE-3234:
------------------------------

Good on you Todd.

Ignore the user issue mentioned above.

Lining up a good log and a bad log, here is a good log:

{code}
 8173 2010-11-15 01:42:29,781 DEBUG [PRI IPC Server handler 1 on 51098] 
hdfs.DFSClient$DFSOutputStream(3099): DFSClient writeChunk allocating new 
packet seqno=1, src=/user/stack/.lo      
gs/pynchon-432.lan,51098,1289785308076/192.168.1.69%3A51098.1289785308941, 
packetSize=65557, chunksPerPacket=127, bytesCurBlock=0
 8174 2010-11-15 01:42:29,782 DEBUG 
[RegionServer:0;pynchon-432.lan,51098,1289785308076.logSyncer] 
hdfs.DFSClient$DFSOutputStream(3170): DFSClient flush() : saveOffset 512 
bytesCurB      lock 762 lastFlushOffset 301
 8175 2010-11-15 01:42:29,782 DEBUG [DataStreamer for file 
/user/stack/.logs/pynchon-432.lan,51098,1289785308076/192.168.1.69%3A51098.1289785308941
 block blk_8612924244108136372_100      7] 
hdfs.DFSClient$DFSOutputStream$DataStreamer(2376): DataStreamer block 
blk_8612924244108136372_1007 wrote packet seqno:1 size:795 offsetInBlock:0 
lastPacketInBlock:false
 8176 2010-11-15 01:42:29,782 DEBUG 
[org.apache.hadoop.hdfs.server.datanode.dataxcei...@1a83e35b] 
datanode.BlockReceiver(393): Receiving one packet for block 
blk_8612924244108136372      _1007 of length 774 seqno 1 offsetInBlock 0 
lastPacketInBlock false
 8177 2010-11-15 01:42:29,782 DEBUG 
[org.apache.hadoop.hdfs.server.datanode.dataxcei...@1a83e35b] 
datanode.BlockReceiver(635): Changing block file offset of block 
blk_86129242441081      36372_1007 from 301 to 0 meta file offset to 7
 8178 2010-11-15 01:42:29,783 DEBUG 
[org.apache.hadoop.hdfs.server.datanode.dataxcei...@1a83e35b] 
datanode.BlockReceiver$PacketResponder(736): PacketResponder 0 adding seqno 1 
to ac      k queue.
 8179 2010-11-15 01:42:29,783 DEBUG [PacketResponder 0 for Block 
blk_8612924244108136372_1007] datanode.BlockReceiver$PacketResponder(806): 
PacketResponder 0 for block blk_861292424      4108136372_1007 acking for 
packet 1
 8180 2010-11-15 01:42:29,783 DEBUG [ResponseProcessor for block 
blk_8612924244108136372_1007] 
hdfs.DFSClient$DFSOutputStream$ResponseProcessor(2481): DFSClient Replies for 
seqno 1       are SUCCESS
 8181 2010-11-15 01:42:29,783 DEBUG [PRI IPC Server handler 1 on 51098] 
ipc.HBaseRPC$Server(574): Served: put queueTime= 0 procesingTime= 3
 8182 2010-11-15 01:42:29,783 DEBUG [PRI IPC Server handler 1 on 51098] 
ipc.HBaseServer$Responder(718): IPC Server Responder: responding to #157 from 
192.168.1.69:51194
 8183 2010-11-15 01:42:29,783 DEBUG [PRI IPC Server handler 1 on 51098] 
ipc.HBaseServer$Responder(737): IPC Server Responder: responding to #157 from 
192.168.1.69:51194 Wrote 8 byte      s.
 8184 2010-11-15 01:42:29,783 DEBUG [IPC Client (47) connection to 
pynchon-432.lan/192.168.1.69:51098 from an unknown user] 
ipc.HBaseClient$Connection(524): IPC Client (47) connecti      on to 
pynchon-432.lan/192.168.1.69:51098 from an unknown user got value #157
 8185 2010-11-15 01:42:29,784 DEBUG [IPC Server handler 4 on 51095] 
ipc.HBaseRPC$Invoker(261): Call: put 4
 8186 2010-11-15 01:42:29,784 INFO  [IPC Server handler 4 on 51095] 
catalog.MetaEditor(59): Added region 
test,,1289785349688.03c89b2db74c8c3215c5fd46429fa2f0. to META
{code}

Here is bad log

{code}
 8118 2010-11-14 23:38:57,616 DEBUG [PRI IPC Server handler 8 on 64748] 
hdfs.DFSClient$DFSOutputStream(3154): DFSClient writeChunk allocating new 
packet seqno=1, src=/user/stack/.lo      
gs/172.16.198.219,64748,1289777896452/172.16.198.219%3A64748.1289777897337, 
packetSize=65557, chunksPerPacket=127, bytesCurBlock=0
 8119 2010-11-14 23:38:57,617 DEBUG 
[RegionServer:0;172.16.198.219,64748,1289777896452.logSyncer] 
hdfs.DFSClient$DFSOutputStream(3238): DFSClient flush() : saveOffset 512 
bytesCurBl      ock 761 lastFlushOffset 300
 8120 2010-11-14 23:38:57,617 DEBUG [DataStreamer for file 
/user/stack/.logs/172.16.198.219,64748,1289777896452/172.16.198.219%3A64748.1289777897337
 block blk_-8256147503844598069_1      007] 
hdfs.DFSClient$DFSOutputStream$DataStreamer(2429): DataStreamer block 
blk_-8256147503844598069_1007 wrote packet seqno:1 size:794 offsetInBlock:0 
lastPacketInBlock:false
 8121 2010-11-14 23:38:57,617 DEBUG 
[org.apache.hadoop.hdfs.server.datanode.dataxcei...@753bc30a] 
datanode.BlockReceiver(393): Receiving one packet for block 
blk_-825614750384459806      9_1007 of length 773 seqno 1 offsetInBlock 0 
lastPacketInBlock false
 8122 2010-11-14 23:38:57,617 DEBUG 
[org.apache.hadoop.hdfs.server.datanode.dataxcei...@753bc30a] 
datanode.BlockReceiver$PacketResponder(737): PacketResponder 0 adding seqno 1 
to ac      k queue.
 8123 2010-11-14 23:38:57,618 DEBUG [PacketResponder 0 for Block 
blk_-8256147503844598069_1007] datanode.BlockReceiver$PacketResponder(891): 
PacketResponder 0 for block blk_-8256147      503844598069_1007 responded an 
ack: Replies for seqno 1 are SUCCESS
 8124 2010-11-14 23:38:57,618 DEBUG [ResponseProcessor for block 
blk_-8256147503844598069_1007] 
hdfs.DFSClient$DFSOutputStream$ResponseProcessor(2534): DFSClient Replies for 
seqno 0       are FAILED
 8125 2010-11-14 23:38:57,618 DEBUG [PacketResponder 0 for Block 
blk_-8256147503844598069_1007] datanode.BlockReceiver$PacketResponder(789): 
PacketResponder 0 seqno = -2 for block b      lk_-8256147503844598069_1007 
waiting for local datanode to finish write.
 8126 2010-11-14 23:38:57,618 WARN  [ResponseProcessor for block 
blk_-8256147503844598069_1007] 
hdfs.DFSClient$DFSOutputStream$ResponseProcessor(2580): DFSOutputStream 
ResponseProce      ssor exception  for block 
blk_-8256147503844598069_1007java.io.IOException: Bad response 1 for block 
blk_-8256147503844598069_1007 from datanode 127.0.0.1:64740
 8127     at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$ResponseProcessor.run(DFSClient.java:2542)
 8128 
 8129 2010-11-14 23:38:57,619 WARN  [DataStreamer for file 
/user/stack/.logs/172.16.198.219,64748,1289777896452/172.16.198.219%3A64748.1289777897337
 block blk_-8256147503844598069_1      007] 
hdfs.DFSClient$DFSOutputStream(2616): Error Recovery for block 
blk_-8256147503844598069_1007 bad datanode[0] 127.0.0.1:64740
 8130 2010-11-14 23:38:57,619 INFO  
[org.apache.hadoop.hdfs.server.datanode.dataxcei...@753bc30a] 
datanode.BlockReceiver(565): Exception in receiveBlock for block 
blk_-8256147503844      598069_1007 java.io.EOFException: while trying to read 
794 bytes
{code}


In the bad, its writing 794 bytes but in good its writing 795 bytes.

The good does a ' Changing block file offset of block 
blk_8612924244108136372_1007 from 301 to 0 meta file offset to 7' before it 
does   PacketResponder 0 adding seqno 1 to ack queue.'  and then getting an ack 
to report SUCCESS.   The 'bad' doesn't do this its size seem to be smaller by 
one byte and after acking seqno 1, it then goes on to look for ack on seqno 0 
and reports FAILED.

I don't know what this means (smile).  Seems significant. 

> hdfs-724 "breaks" TestHBaseTestingUtility multiClusters
> -------------------------------------------------------
>
>                 Key: HBASE-3234
>                 URL: https://issues.apache.org/jira/browse/HBASE-3234
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>            Priority: Critical
>             Fix For: 0.90.0
>
>         Attachments: 
> org.apache.hadoop.hbase.TestHBaseTestingUtility-output.txt, 
> org.apache.hadoop.hbase.TestHBaseTestingUtility-output.txt, 
> org.apache.hadoop.hbase.TestHBaseTestingUtility.txt
>
>
> We upgraded our hadoop jar in TRUNK to latest on 0.20-append branch.  
> TestHBaseTestingUtility started failing reliably.  If I back out hdfs-724, 
> the test passes again.  This issue is about figuring whats up here.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to