I reread your earlier posting. I misunderstood what you were asking.
Pardon me.
I'll file an issue, but I'll run some tests first. I think I have a simple
recipe for provoking this failure mode.
Thanks for the help,
St.Ack
Raghu Angadi wrote:
Yes, the earlier message was also from the Namenode log. Before that line
there should have been a message that adds a block to this file in the
same log. I think you should file a Jira for this. It is probably related
to HADOOP-999.
Raghu.
Michael Stack wrote:
Thanks for responding Raghu.
The exception in my previous mail IS from the namenode log and
contains the correct filename, but let me paste more of the log because I
notice there is a subsequent message that may be related:
....
2007-08-22 21:38:48,914 INFO org.apache.hadoop.net.NetworkTopology:
Adding a new node: /default-rack/208.76.44.140:50010
2007-08-22 21:38:49,022 INFO org.apache.hadoop.net.NetworkTopology:
Adding a new node: /default-rack/208.76.44.142:50010
2007-08-22 21:38:49,069 INFO org.apache.hadoop.net.NetworkTopology:
Adding a new node: /default-rack/208.76.44.141:50010
2007-08-22 21:38:49,071 INFO org.apache.hadoop.net.NetworkTopology:
Adding a new node: /default-rack/208.76.44.139:50010
2007-08-22 21:43:50,216 INFO org.apache.hadoop.fs.FSNamesystem: Roll
Edit Log from 208.76.44.139
2007-08-22 21:45:21,459 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 1 on 9000, call
complete(/bfd/hadoop-stack-data/tmp/hbase/compaction.tmp/hregion_hbaserepository,,8918388410463499185/repo/mapfiles/-1/data,
DFSClient_1857293290) from 208.76.44.139:52301: error:
java.io.IOException: Unknown file:
/bfd/hadoop-stack-data/tmp/hbase/compaction.tmp/hregion_hbaserepository,,8918388410463499185/repo/mapfiles/-1/data
java.io.IOException: Unknown file:
/bfd/hadoop-stack-data/tmp/hbase/compaction.tmp/hregion_hbaserepository,,8918388410463499185/repo/mapfiles/-1/data
at
org.apache.hadoop.dfs.FSDirectory.addBlocks(FSDirectory.java:561)
at
org.apache.hadoop.dfs.FSNamesystem.completeFileInternal(FSNamesystem.java:1002)
at
org.apache.hadoop.dfs.FSNamesystem.completeFile(FSNamesystem.java:952)
at org.apache.hadoop.dfs.NameNode.complete(NameNode.java:348)
at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:340)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:566)
2007-08-22 21:45:24,394 WARN org.apache.hadoop.dfs.StateChange: DIR*
NameSystem.startFile: failed to create file
/bfd/hadoop-stack-data/tmp/hbase/compaction.tmp/hregion_hbaserepository,,8918388410463499185/repo/mapfiles/-1/data
for DFSClient_1857293290 on client 208.76.44.139 because current
leaseholder is trying to recreate file.
2007-08-22 21:45:24,394 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 1 on 9000, call
create(/bfd/hadoop-stack-data/tmp/hbase/compaction.tmp/hregion_hbaserepository,,8918388410463499185/repo/mapfiles/-1/data,
DFSClient_1857293290, true, 3, 67108864) from 208.76.44.139:52312:
error: org.apache.hadoop.dfs.AlreadyBeingCreatedException: failed to
create file
/bfd/hadoop-stack-data/tmp/hbase/compaction.tmp/hregion_hbaserepository,,8918388410463499185/repo/mapfiles/-1/data
for DFSClient_1857293290 on client 208.76.44.139 because current
leaseholder is trying to recreate file.
org.apache.hadoop.dfs.AlreadyBeingCreatedException: failed to create
file
/bfd/hadoop-stack-data/tmp/hbase/compaction.tmp/hregion_hbaserepository,,8918388410463499185/repo/mapfiles/-1/data
for DFSClient_1857293290 on client 208.76.44.139 because current
leaseholder is trying to recreate file.
at
org.apache.hadoop.dfs.FSNamesystem.startFile(FSNamesystem.java:740)
at org.apache.hadoop.dfs.NameNode.create(NameNode.java:307)
at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:340)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:566)
2007-08-22 21:46:24,405 WARN org.apache.hadoop.dfs.StateChange: DIR*
NameSystem.startFile: failed to create file
/bfd/hadoop-stack-data/tmp/hbase/compaction.tmp/hregion_hbaserepository,,8918388410463499185/repo/mapfiles/-1/data
for DFSClient_1857293290 on client 208.76.44.139 because current
leaseholder is trying to recreate file.
...
St.Ack
Raghu Angadi wrote:
Can you check the Namenode log for the filename? There should be at
least one message regarding allocating a block to this file. If an exact
filename grep does not give any results, then try to look for
something close to it.
Raghu.
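The search suggested above (exact filename first, then something close to it) could be sketched roughly as follows. This is only an illustration, not anything from Hadoop itself: the function name find_mentions is made up, and treating "something close" as "any line naming the file's parent directory" is an assumption about what a useful near match looks like.

```python
import posixpath


def find_mentions(log_lines, path):
    """Look for log lines that mention an HDFS path.

    Returns ("exact", lines) when the full path appears in the log.
    Otherwise returns ("near", lines) with the lines that mention the
    path's parent directory, which is an assumed notion of "close".
    """
    exact = [line for line in log_lines if path in line]
    if exact:
        return "exact", exact

    # Fall back to the parent directory of the file (hypothetical heuristic).
    parent = posixpath.dirname(path)
    near = [line for line in log_lines if parent and parent in line]
    return "near", near
```

It would be fed the lines of the Namenode log (each log entry on one line, as in the file on disk rather than as wrapped in mail) plus the full path from the exception. Whether a block-allocation message shows up among the results is the thing to look for.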
Michael Stack wrote:
Anyone have any pointers for debugging why the odd HDFS close is failing?
Here is the exception I'm getting.
2007-08-22 21:45:21,459 INFO org.apache.hadoop.ipc.Server: IPC
Server handler 1 on 9000, call
complete(/bfd/hadoop-stack-data/tmp/hbase/compaction.tmp/hregion_hbaserepository,,8918388410463499185/repo/mapfiles/-1/data,
DFSClient_1857293290) from 208.76.44.139:52301: error:
java.io.IOException: Unknown file:
/bfd/hadoop-stack-data/tmp/hbase/compaction.tmp/hregion_hbaserepository,,8918388410463499185/repo/mapfiles/-1/data
java.io.IOException: Unknown file:
/bfd/hadoop-stack-data/tmp/hbase/compaction.tmp/hregion_hbaserepository,,8918388410463499185/repo/mapfiles/-1/data
at
org.apache.hadoop.dfs.FSDirectory.addBlocks(FSDirectory.java:561)
at
org.apache.hadoop.dfs.FSNamesystem.completeFileInternal(FSNamesystem.java:1002)
at
org.apache.hadoop.dfs.FSNamesystem.completeFile(FSNamesystem.java:952)
at org.apache.hadoop.dfs.NameNode.complete(NameNode.java:348)
at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:340)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:566)
...
It's an open, write a few records and then a close. Usually it works.
My hadoop is TRUNK, perhaps a couple of hours old (r567876).
Nothing in dmesg. On this machine I only have 1024 file
descriptors, but I'd expect to see a complaint about file handles if
this were the issue.
Thanks,
St.Ack