[ https://issues.apache.org/jira/browse/HBASE-1960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13025989#comment-13025989 ]

Naresh Rapolu commented on HBASE-1960:
--------------------------------------

Got this again with HBase-0.90.2 and the Hadoop-0.20.2 append branch while 
launching a cluster on EC2.
The master retries creating /hbase/hbase.version, but the namenode rejects 
each attempt:
{panel}
2011-04-27 20:26:15,004 WARN org.apache.hadoop.hdfs.StateChange: DIR* 
NameSystem.startFile: failed to create file /hbase/hbase.version for 
DFSClient_hb_m_ip-10-108-79-232.ec2.internal:60000_1303935162105 on client 
10.108.79.232 because current leaseholder is trying to recreate file.

2011-04-27 20:26:15,005 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 
on 50001, call create(/hbase/hbase.version, rwxr-xr-x, 
DFSClient_hb_m_ip-10-108-79-232.ec2.internal:60000_1303935162105, true, 3, 
67108864) from 10.108.79.232:36701: error: 
org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create 
file /hbase/hbase.version for 
DFSClient_hb_m_ip-10-108-79-232.ec2.internal:60000_1303935162105 on client 
10.108.79.232 because current leaseholder is trying to recreate file.
org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create 
file /hbase/hbase.version for 
DFSClient_hb_m_ip-10-108-79-232.ec2.internal:60000_1303935162105 on client 
10.108.79.232 because current leaseholder is trying to recreate file.
        at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1182)
        at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1054)
        at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1002)
        at 
org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:381)
        at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:961)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:957)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:955)
{panel}

This sequence of events (retry by the master, rejection by the namenode) 
continues forever; the master never starts. 
The following is from the HBase master logs:

{panel}
2011-04-27 20:12:44,760 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery 
for block null bad datanode[0] nodes == null
2011-04-27 20:12:44,760 WARN org.apache.hadoop.hdfs.DFSClient: Could not get 
block locations. Source file "/hbase/hbase.version" - Aborting...
2011-04-27 20:12:44,762 WARN org.apache.hadoop.hbase.util.FSUtils: Unable to 
create version file at hdfs://ip-10-108-79-232.ec2.internal:50001/hbase, 
retrying: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File 
/hbase/hbase.version could only be replicated to 0 nodes, instead of 1
        at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1363)
        at 
org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:449)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
..........
.......... (While retrying)
..........

2011-04-27 20:28:15,044 WARN org.apache.hadoop.hbase.util.FSUtils: Unable to 
create version file at hdfs://ip-10-108-79-232.ec2.internal:50001/hbase, 
retrying: org.apache.hadoop.ipc.RemoteException: 
org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create 
file /hbase/hbase.version for 
DFSClient_hb_m_ip-10-108-79-232.ec2.internal:60000_1303935162105 on client 
10.108.79.232 because current leaseholder is trying to recreate file.
        at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1182)
        at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1054)
{panel}
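The sleep-and-retry idea can be sketched as a generic bounded-retry loop. This is a minimal illustration, not the actual HBASE-1960 patch: the class and method names are made up, and a simulated flaky operation stands in for the HDFS version-file write, which fails while no datanodes have reported in and succeeds once they do.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

public class RetrySketch {
    // Runs op up to maxAttempts times, sleeping sleepMs between failed
    // attempts; rethrows the last IOException once retries are exhausted.
    static <T> T withRetries(Callable<T> op, int maxAttempts, long sleepMs)
            throws Exception {
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (IOException e) {  // e.g. "replicated to 0 nodes"
                last = e;
                Thread.sleep(sleepMs); // give DFS time to come up
            }
        }
        throw last;                    // exhausted: surface the failure
    }

    public static void main(String[] args) throws Exception {
        // Simulated DFS write: fails twice (no datanodes yet), then succeeds.
        final int[] calls = {0};
        String result = withRetries(() -> {
            if (++calls[0] < 3)
                throw new IOException("could only be replicated to 0 nodes");
            return "version file written";
        }, 10, 1L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Note that a bounded loop like this would still not help with the AlreadyBeingCreatedException case above: there the namenode believes the same client already holds the lease on /hbase/hbase.version, so blind re-creation keeps failing regardless of how long the master waits.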

> Master should wait for DFS to come up when creating hbase.version
> -----------------------------------------------------------------
>
>                 Key: HBASE-1960
>                 URL: https://issues.apache.org/jira/browse/HBASE-1960
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Andrew Purtell
>            Assignee: Andrew Purtell
>            Priority: Minor
>             Fix For: 0.90.2, 0.92.0
>
>         Attachments: HBASE-1960-redux.patch, HBASE-1960.patch
>
>
> The master does not wait for DFS to come up in the circumstance where the DFS 
> master is started for the first time after format and no datanodes have been 
> started yet. 
> {noformat}
> 2009-11-07 11:47:28,115 INFO org.apache.hadoop.hbase.master.HMaster: 
> vmName=Java HotSpot(TM) 64-Bit Server VM, vmVendor=Sun Microsystems Inc., 
> vmVersion=14.2-b01
> 2009-11-07 11:47:28,116 INFO org.apache.hadoop.hbase.master.HMaster: 
> vmInputArguments=[-Xmx1000m, -XX:+HeapDumpOnOutOfMemoryError, 
> -XX:+UseConcMarkSweepGC, -XX:+CMSIncrementalMode, 
> -Dhbase.log.dir=/mnt/hbase/logs, 
> -Dhbase.log.file=hbase-root-master-ip-10-242-15-159.log, 
> -Dhbase.home.dir=/usr/local/hbase-0.20.1/bin/.., -Dhbase.id.str=root, 
> -Dhbase.root.logger=INFO,DRFA, 
> -Djava.library.path=/usr/local/hbase-0.20.1/bin/../lib/native/Linux-amd64-64]
> 2009-11-07 11:47:28,247 INFO org.apache.hadoop.hbase.master.HMaster: My 
> address is ip-10-242-15-159.ec2.internal:60000
> 2009-11-07 11:47:28,728 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer 
> Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File 
> /hbase/hbase.version could only be replicated to 0 nodes, instead of 1
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1267)
>       at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
> [...]
> 2009-11-07 11:47:28,728 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery 
> for block null bad datanode[0] nodes == null
> 2009-11-07 11:47:28,728 WARN org.apache.hadoop.hdfs.DFSClient: Could not get 
> block locations. Source file "/hbase/hbase.version" - Aborting...
> 2009-11-07 11:47:28,729 FATAL org.apache.hadoop.hbase.master.HMaster: Not 
> starting HMaster because:
> org.apache.hadoop.ipc.RemoteException: java.io.IOException: File 
> /hbase/hbase.version could only be replicated to 0 nodes, instead of 1
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1267)
>       at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
> {noformat}
> Should probably sleep and retry the write a few times.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira