[ 
https://issues.apache.org/jira/browse/HBASE-1973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776855#action_12776855
 ] 

stack commented on HBASE-1973:
------------------------------

Weird thing is that it seems like the create eventually succeeds.. after we've 
given up on it if I look at the NN:

{code}
[st...@aa0-000-12 logs]$ grep 
'/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617'
 hadoop-stack-namenode-aa0-000-12.u.powerset.com.log 
2009-11-12 05:09:38,618 DEBUG org.apache.hadoop.hdfs.StateChange: *DIR* 
NameNode.create: file 
/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
 for DFSClient_276778250 at 208.76.44.141
2009-11-12 05:09:38,618 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* 
NameSystem.startFile: 
src=/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617,
 holder=DFSClient_276778250, clientMachine=208.76.44.141, createParent=true, 
replication=3, overwrite=true, append=false
2009-11-12 05:09:38,619 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* 
FSDirectory.addFile: 
/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
 is added to the file system
2009-11-12 05:09:38,619 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* 
NameSystem.startFile: add 
/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
 to namespace for DFSClient_276778250
2009-11-12 05:09:38,620 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: 
ugi=stack,powerset,engineering  ip=/208.76.44.141       cmd=create      
src=/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
   dst=null        perm=stack:supergroup:rw-r--r--
2009-11-12 05:09:38,630 DEBUG org.apache.hadoop.hdfs.StateChange: *BLOCK* 
NameNode.addBlock: file 
/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
 for DFSClient_276778250
2009-11-12 05:09:38,630 DEBUG org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.getAdditionalBlock: file 
/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
 for DFSClient_276778250
2009-11-12 05:09:38,631 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* 
FSDirectory.addFile: 
/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
 with blk_1786802275426334496_1076 block is added to the in-memory file system
2009-11-12 05:09:38,631 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.allocateBlock: 
/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617.
 blk_1786802275426334496_1076{blockUCState=UNDER_CONSTRUCTION, 
primaryNodeIndex=-1, 
replicas=[ReplicaUnderConstruction[208.76.44.141:51010|RBW], 
ReplicaUnderConstruction[208.76.44.139:51010|RBW], 
ReplicaUnderConstruction[208.76.44.140:51010|RBW]]}
2009-11-12 05:09:38,646 DEBUG org.apache.hadoop.hdfs.StateChange: *BLOCK* 
NameNode.abandonBlock: blk_1786802275426334496_1076 of file 
/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
2009-11-12 05:09:38,646 DEBUG org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.abandonBlock: blk_1786802275426334496_1076of file 
/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
2009-11-12 05:09:38,646 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* 
FSDirectory.addFile: 
/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
 with blk_1786802275426334496_1076 block is added to the file system
2009-11-12 05:09:44,655 DEBUG org.apache.hadoop.hdfs.StateChange: *BLOCK* 
NameNode.addBlock: file 
/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
 for DFSClient_276778250
2009-11-12 05:09:44,655 DEBUG org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.getAdditionalBlock: file 
/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
 for DFSClient_276778250
2009-11-12 05:09:44,656 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* 
FSDirectory.addFile: 
/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
 with blk_5838454320666652488_1076 block is added to the in-memory file system
2009-11-12 05:09:44,656 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.allocateBlock: 
/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617.
 blk_5838454320666652488_1076{blockUCState=UNDER_CONSTRUCTION, 
primaryNodeIndex=-1, 
replicas=[ReplicaUnderConstruction[208.76.44.141:51010|RBW], 
ReplicaUnderConstruction[208.76.44.140:51010|RBW], 
ReplicaUnderConstruction[208.76.44.139:51010|RBW]]}
2009-11-12 05:09:44,662 DEBUG org.apache.hadoop.hdfs.StateChange: *BLOCK* 
NameNode.abandonBlock: blk_5838454320666652488_1076 of file 
/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
2009-11-12 05:09:44,662 DEBUG org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.abandonBlock: blk_5838454320666652488_1076of file 
/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
2009-11-12 05:09:44,662 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* 
FSDirectory.addFile: 
/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
 with blk_5838454320666652488_1076 block is added to the file system
2009-11-12 05:09:50,675 DEBUG org.apache.hadoop.hdfs.StateChange: *BLOCK* 
NameNode.addBlock: file 
/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
 for DFSClient_276778250
2009-11-12 05:09:50,675 DEBUG org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.getAdditionalBlock: file 
/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
 for DFSClient_276778250
2009-11-12 05:09:50,676 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* 
FSDirectory.addFile: 
/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
 with blk_8615264034947520229_1076 block is added to the in-memory file system
2009-11-12 05:09:50,676 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.allocateBlock: 
/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617.
 blk_8615264034947520229_1076{blockUCState=UNDER_CONSTRUCTION, 
primaryNodeIndex=-1, 
replicas=[ReplicaUnderConstruction[208.76.44.141:51010|RBW], 
ReplicaUnderConstruction[208.76.44.139:51010|RBW], 
ReplicaUnderConstruction[208.76.44.140:51010|RBW]]}
2009-11-12 05:09:50,683 DEBUG org.apache.hadoop.hdfs.StateChange: *BLOCK* 
NameNode.abandonBlock: blk_8615264034947520229_1076 of file 
/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
2009-11-12 05:09:50,683 DEBUG org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.abandonBlock: blk_8615264034947520229_1076of file 
/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
2009-11-12 05:09:50,683 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* 
FSDirectory.addFile: 
/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
 with blk_8615264034947520229_1076 block is added to the file system
2009-11-12 05:09:56,685 DEBUG org.apache.hadoop.hdfs.StateChange: *BLOCK* 
NameNode.addBlock: file 
/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
 for DFSClient_276778250
2009-11-12 05:09:56,685 DEBUG org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.getAdditionalBlock: file 
/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
 for DFSClient_276778250
2009-11-12 05:09:56,686 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* 
FSDirectory.addFile: 
/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
 with blk_-5632723632070749366_1077 block is added to the in-memory file system
2009-11-12 05:09:56,686 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.allocateBlock: 
/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617.
 blk_-5632723632070749366_1077{blockUCState=UNDER_CONSTRUCTION, 
primaryNodeIndex=-1, 
replicas=[ReplicaUnderConstruction[208.76.44.141:51010|RBW], 
ReplicaUnderConstruction[208.76.44.140:51010|RBW], 
ReplicaUnderConstruction[208.76.44.139:51010|RBW]]}
2009-11-12 05:09:56,691 DEBUG org.apache.hadoop.hdfs.StateChange: *BLOCK* 
NameNode.abandonBlock: blk_-5632723632070749366_1077 of file 
/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
2009-11-12 05:09:56,691 DEBUG org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.abandonBlock: blk_-5632723632070749366_1077of file 
/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
2009-11-12 05:09:56,691 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* 
FSDirectory.addFile: 
/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
 with blk_-5632723632070749366_1077 block is added to the file system

Its failing above opening the file.

Below is open by the master doing its split....

2009-11-12 05:10:10,924 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: 
ugi=stack,powerset,engineering  ip=/208.76.44.139       cmd=open        
src=/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
   dst=null        perm=null
2009-11-12 05:10:10,925 DEBUG org.apache.hadoop.hdfs.StateChange: *DIR* 
Namenode.delete: 
src=/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617,
 recursive=true
2009-11-12 05:10:10,926 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* 
NameSystem.delete: 
/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
2009-11-12 05:10:10,926 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* 
FSDirectory.delete: 
/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
2009-11-12 05:10:10,926 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* 
FSDirectory.unprotectedDelete: 
/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
 is removed
2009-11-12 05:10:10,926 DEBUG 
org.apache.hadoop.hdfs.server.namenode.LeaseManager: LeaseManager.findLease: 
prefix=/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
2009-11-12 05:10:10,926 DEBUG 
org.apache.hadoop.hdfs.server.namenode.LeaseManager: 
LeaseManager.removeLeaseWithPrefixPath: 
entry=/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617=[Lease.
  Holder: DFSClient_276778250, pendingcreates: 1]
2009-11-12 05:10:10,926 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* 
Namesystem.delete: 
/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
 is removed
2009-11-12 05:10:10,927 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: 
ugi=stack,powerset,engineering  ip=/208.76.44.139       cmd=delete      
src=/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
   dst=null        perm=null

{code}

The above open on master actually throws an EOFE.  File length is seen as zero. 
 We don't do the available on the backing stream to get the length hdfs sees.

> Second RS goes down because "Bad connect ack with firstBadLink as..."
> ---------------------------------------------------------------------
>
>                 Key: HBASE-1973
>                 URL: https://issues.apache.org/jira/browse/HBASE-1973
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>
> Placeholder for issue where a second RS goes down when I kill another RS and 
> its DN.  I asked over on hdfs-user why we're stuck on 140.  Why can't it move 
> on to another DN creating a block.
> {code}
> 2009-11-12 05:09:38,624 [regionserver/208.76.44.141:60020.logRoller] INFO 
> org.apache.hadoop.hbase.regionserver.wal.HLog: Roll 
> /hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002405346,
>  entries=52747, calcsize=63757973, filesize=58588977. New hlog 
> /hbase/.logs/aa0-000-14.u.powerset.com,60020,12
> 58002404935/hlog.dat.1258002578617
> 2009-11-12 05:09:38,628 [regionserver/208.76.44.141:60020.logRoller] DEBUG 
> org.apache.hadoop.hbase.regionserver.wal.HLog: Found 0 hlogs to remove  out 
> of total 1; oldest outstanding seqnum is 1 from region .META.,,1
> 2009-11-12 05:09:38,628 [IPC Server handler 16 on 60020] INFO 
> org.apache.hadoop.hbase.regionserver.wal.HLog: edit=0, 
> write=TestTable/TestTable,,1258002466308/612228
> 2009-11-12 05:09:38,630 [regionserver/208.76.44.141:60020.logSyncer] INFO 
> org.apache.hadoop.hbase.regionserver.wal.HLog: sync
> 2009-11-12 05:09:38,646 [Thread-61] INFO org.apache.hadoop.hdfs.DFSClient: 
> Exception in createBlockOutputStream java.io.IOException: Bad connect ack 
> with firstBadLink as 208.76.44.140:51010
> 2009-11-12 05:09:38,646 [Thread-61] INFO org.apache.hadoop.hdfs.DFSClient: 
> Abandoning block blk_1786802275426334496_1076
> 2009-11-12 05:09:44,662 [Thread-61] INFO org.apache.hadoop.hdfs.DFSClient: 
> Exception in createBlockOutputStream java.io.IOException: Bad connect ack 
> with firstBadLink as 208.76.44.140:51010
> 2009-11-12 05:09:44,662 [Thread-61] INFO org.apache.hadoop.hdfs.DFSClient: 
> Abandoning block blk_5838454320666652488_1076
> 2009-11-12 05:09:46,226 [pool-1-thread-1] DEBUG 
> org.apache.hadoop.hbase.io.hfile.LruBlockCache: Cache Stats: Sizes: 
> Total=5.5656967MB (5836056), Free=672.77185MB (705452392), Max=678.3375MB 
> (711288448), Counts: Blocks=0, Access=0, Hit=0, Miss=0, Evictions=0, 
> Evicted=0, Ratios: Hit Ratio=NaN%, Miss Ratio=NaN%, Evicted
> /Run=NaN
> 2009-11-12 05:09:50,683 [Thread-61] INFO org.apache.hadoop.hdfs.DFSClient: 
> Exception in createBlockOutputStream java.io.IOException: Bad connect ack 
> with firstBadLink as 208.76.44.140:51010
> 2009-11-12 05:09:50,684 [Thread-61] INFO org.apache.hadoop.hdfs.DFSClient: 
> Abandoning block blk_8615264034947520229_1076
> 2009-11-12 05:09:56,691 [Thread-61] INFO org.apache.hadoop.hdfs.DFSClient: 
> Exception in createBlockOutputStream java.io.IOException: Bad connect ack 
> with firstBadLink as 208.76.44.140:51010
> 2009-11-12 05:09:56,692 [Thread-61] INFO org.apache.hadoop.hdfs.DFSClient: 
> Abandoning block blk_-5632723632070749366_1077
> 2009-11-12 05:10:02,696 [Thread-61] WARN org.apache.hadoop.hdfs.DFSClient: 
> DataStreamer Exception: java.io.IOException: Unable to create new block.
>         at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSClient.java:3100)
>         at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2681)
> 2009-11-12 05:10:02,697 [Thread-61] WARN org.apache.hadoop.hdfs.DFSClient: 
> Could not get block locations. Source file 
> "/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617"
>  - Aborting...
> 2009-11-12 05:10:02,699 [regionserver/208.76.44.141:60020.logSyncer] FATAL 
> org.apache.hadoop.hbase.regionserver.wal.HLog: Could not append. Requesting 
> close of hlog
> java.io.IOException: Bad connect ack with firstBadLink as 208.76.44.140:51010
>         at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.createBlockOutputStream(DFSClient.java:3160)
>         at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSClient.java:3080)
>         at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2681)
> 2009-11-12 05:10:02,701 [regionserver/208.76.44.141:60020.logSyncer] ERROR 
> org.apache.hadoop.hbase.regionserver.wal.HLog: Error while syncing, 
> requesting close of hlog 
> java.io.IOException: Bad connect ack with firstBadLink as 208.76.44.140:51010
>         at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.createBlockOutputStream(DFSClient.java:3160)
>         at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSClient.java:3080)
>         at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2681)
> 2009-11-12 05:10:02,703 [IPC Server handler 20 on 60020] FATAL 
> org.apache.hadoop.hbase.regionserver.wal.HLog: Could not append. Requesting 
> close of hlog
> java.io.IOException: IOException flush:java.io.IOException: Bad connect ack 
> with firstBadLink as 208.76.44.140:51010
>         at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.hflush(DFSClient.java:3527)
>         at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.sync(DFSClient.java:3473)
>         at 
> org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:97)
>         at org.apache.hadoop.hbase.regionserver.wal.HLog.hflush(HLog.java:829)
>         at 
> org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.run(HLog.java:751)
> 2009-11-12 05:10:02,703 [regionserver/208.76.44.141:60020.logSyncer] INFO 
> org.apache.hadoop.hbase.regionserver.wal.HLog: 
> regionserver/208.76.44.141:60020.logSyncer exiting
> 2009-11-12 05:10:02,706 [IPC Server handler 10 on 60020] FATAL 
> org.apache.hadoop.hbase.regionserver.wal.HLog: Could not append. Requesting 
> close of hlog
> java.io.IOException: IOException flush:java.io.IOException: Bad connect ack 
> with firstBadLink as 208.76.44.140:51010
>         at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.hflush(DFSClient.java:3527)
>         at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.sync(DFSClient.java:3473)
>         at 
> org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:97)
>         at org.apache.hadoop.hbase.regionserver.wal.HLog.hflush(HLog.java:829)
>         at 
> org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.run(HLog.java:751)
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to