I have a Java application that retrieves documents from the web and
writes them to a sequence file on a Hadoop cluster.
After running for a while (about 500MB written) it dies with the error
below. Does anyone have any idea what might be causing this?
I'm connecting to the remote filesystem using:
FileSystem fs = FileSystem.get(conf);
Creating a sequence file like so:
SequenceFile.Writer w = SequenceFile.createWriter(fs, fs.getConf(), thePath,
        IntWritable.class, MapWritable.class);
Then writing to it like so:
w.append(new IntWritable(itemNum), mw);
Thanks for any help you can provide.
Here's the error:
org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.dfs.LeaseExpiredException: No lease on /input/items.seq
    at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:976)
    at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:293)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
    at org.apache.hadoop.ipc.Client.call(Client.java:482)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
    at org.apache.hadoop.dfs.$Proxy0.addBlock(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
    at org.apache.hadoop.dfs.$Proxy0.addBlock(Unknown Source)
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:1541)
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:1487)
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.endBlock(DFSClient.java:1613)
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.writeChunk(DFSClient.java:1589)
    at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk(FSOutputSummer.java:140)
    at org.apache.hadoop.fs.FSOutputSummer.write1(FSOutputSummer.java:100)
    at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:86)
    at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:39)
    at java.io.DataOutputStream.write(DataOutputStream.java:90)
    at org.apache.hadoop.io.SequenceFile$RecordCompressWriter.append(SequenceFile.java:1077)
    at com.XXX.crawler.HadoopDestination.writeItem(HadoopDestination.java:79)
    at com.XXX.crawler.Crawler.main(Crawler.java:142)