Anyone have insight on the following message from a near-TRUNK namenode log?
2007-11-26 01:16:23,282 WARN dfs.StateChange - DIR*
NameSystem.startFile: failed to create file
/hbase/hregion_-1194436719/oldlogfile.log for DFSClient_610028837 on
client 38.99.77.80 because current leaseholder is trying to recreate file.
It starts for no apparent reason: there is no exception or warning
preceding it in the log, and creation of files in DFS by the pertinent
code had been working fine for a good while before this. Once it starts,
the message repeats every minute until the client side goes away. While
the client is up, it retries creating the file every 5 minutes or so.
From the client-side, the exception looks like this:
2007-11-26 01:21:23,999 WARN hbase.HMaster - Processing pending
operations: ProcessServerShutdown of XX.XX.XX.XX:60020
org.apache.hadoop.dfs.AlreadyBeingCreatedException:
org.apache.hadoop.dfs.AlreadyBeingCreatedException: failed to create
file /hbase/hregion_-1194436719/oldlogfile.log for DFSClient_610028837
on client XX.XX.XX.XX because current leaseholder is trying to recreate
file.
    at org.apache.hadoop.dfs.FSNamesystem.startFileInternal(FSNamesystem.java:848)
    at org.apache.hadoop.dfs.FSNamesystem.startFile(FSNamesystem.java:804)
    at org.apache.hadoop.dfs.NameNode.create(NameNode.java:276)
    at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)

    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:82)
    at org.apache.hadoop.hbase.HMaster.run(HMaster.java:1094)
The code is not exotic:
SequenceFile.Writer w = SequenceFile.createWriter(fs, conf, logfile,
    HLogKey.class, HLogEdit.class);
(See HLog from hbase around line 180 if you want more context.)
I see the comments in the still-open HADOOP-2050. There is some
implication that an attempt at a create should be wrapped in a try/catch
that retries on AlreadyBeingCreatedException. Should I be doing this
(it looks like it's done in DFSClient)?
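For the record, here is roughly what I take that suggestion to mean; a
rough sketch only, not what is in hbase TRUNK. The retry count and sleep
interval are made-up numbers, the method would presumably live in HLog,
and it assumes the RemoteException coming back over RPC has already been
decoded into an AlreadyBeingCreatedException (as our
RemoteExceptionHandler.decodeRemoteException does):

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.dfs.AlreadyBeingCreatedException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.hbase.HLogKey;
import org.apache.hadoop.hbase.HLogEdit;

  /**
   * Retry the create when the namenode says the old leaseholder is
   * still on the file, on the theory that the stale lease will
   * eventually expire and the create will then succeed.
   */
  private SequenceFile.Writer createWriterWithRetry(final FileSystem fs,
      final Configuration conf, final Path logfile) throws IOException {
    AlreadyBeingCreatedException last = null;
    for (int i = 0; i < 5; i++) {                   // made-up retry count
      try {
        return SequenceFile.createWriter(fs, conf, logfile,
            HLogKey.class, HLogEdit.class);
      } catch (AlreadyBeingCreatedException e) {
        last = e;
        try {
          Thread.sleep(60 * 1000);                  // made-up wait interval
        } catch (InterruptedException ie) {
          throw new IOException("Interrupted waiting to retry create");
        }
      }
    }
    throw last;                                     // give up after N tries
  }

Is that the right shape, or is there something smarter the client can do
than sleep and hope the lease clears?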
Thanks for any insight,
St.Ack
P.S. HADOOP-2283 is the issue I've opened to cover this particular topic.