Michael Stack wrote:
Why would a lightly loaded nameserver w/ no other emissions on a seemingly healthy machine have trouble allocating blocks in a job that is almost done?

From the nameserver log:

060323 173126 Server handler 4 on 8009 call error: java.io.IOException: Cannot obtain additional block for file /user/stack/e04/outputs/segments/20060322213322/crawl_parse/part-00019 java.io.IOException: Cannot obtain additional block for file /user/stack/e04/outputs/segments/20060322213322/crawl_parse/part-00019
   at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:160)
   at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:585)
   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:237)
   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:216)

It looks like we try 5 times at 100ms intervals and still come back empty-handed. The error keeps recurring and is threatening my jobs' completion.

From reading the code, this could happen if previous blocks of this file haven't been written out yet (i.e. their replication is still less than 1). We should probably wait a bit longer here, or perhaps datanodes should report block completion sooner.
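
To make "wait a bit longer" concrete, here is a rough sketch of a retry loop that doubles its delay after each failed attempt instead of sleeping a fixed 100ms. This is not the actual Hadoop code: the class BlockRetrySketch, the method allocateWithBackoff, and all of its parameters are hypothetical names made up for illustration, and the Supplier merely stands in for whatever call asks the namenode for a new block (assumed here to return null when no block can be handed out yet).

```java
import java.io.IOException;
import java.util.function.Supplier;

// Hypothetical illustration only; none of these names exist in Hadoop.
final class BlockRetrySketch {

    /**
     * Calls attempt up to maxAttempts times. A null result is treated as
     * "no block available yet". Between attempts the thread sleeps, and the
     * sleep time doubles each round, capped at maxDelayMs.
     */
    static <T> T allocateWithBackoff(Supplier<T> attempt,
                                     int maxAttempts,
                                     long initialDelayMs,
                                     long maxDelayMs)
            throws IOException, InterruptedException {
        long delay = initialDelayMs;
        for (int i = 1; i <= maxAttempts; i++) {
            T block = attempt.get();
            if (block != null) {
                return block;               // allocation succeeded
            }
            if (i < maxAttempts) {
                Thread.sleep(delay);        // give datanodes time to report
                delay = Math.min(delay * 2, maxDelayMs);
            }
        }
        throw new IOException("Cannot obtain additional block after "
                + maxAttempts + " attempts");
    }
}
```

With 5 attempts, an initial delay of 100ms and a generous cap, the total wait grows from roughly 400ms (four fixed 100ms sleeps) to roughly 1.5s (100 + 200 + 400 + 800), which may or may not be enough depending on how quickly datanodes report completed blocks.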

--
Best regards,
Andrzej Bialecki     <><
___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com

