I have a Java application that retrieves documents from the web and
writes them to a sequence file on a Hadoop cluster.
After running for a while (roughly 500 MB written), it dies with the
error below. Does anyone have an idea what might be causing this?

I'm connecting to the remote filesystem using:

    FileSystem fs = FileSystem.get(conf);
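
A minimal sketch of the setup around that call (simplified; the
namenode URI below is only a placeholder, the real settings come from
the cluster's hadoop-site.xml):

    Configuration conf = new Configuration();
    // Placeholder URI; normally picked up from hadoop-site.xml on the classpath
    conf.set("fs.default.name", "hdfs://namenode:9000");
    FileSystem fs = FileSystem.get(conf);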

Creating the sequence file writer like so:

    SequenceFile.Writer w = SequenceFile.createWriter(fs, fs.getConf(), thePath,
            IntWritable.class, MapWritable.class);

Then writing to it like so:

    w.append(new IntWritable(itemNum), mw);
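
Putting it all together, a stripped-down version of the write path
looks like the sketch below. The docs list and the "body" field are
stand-ins for the real crawler data, but the Hadoop calls are the ones
shown above, and the writer is only closed once at the very end:

    import java.io.IOException;
    import java.util.Arrays;
    import java.util.List;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.MapWritable;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.Text;

    public class SeqFileWriteSketch {
        public static void main(String[] args) throws IOException {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);
            Path thePath = new Path("/input/items.seq");

            // Same createWriter overload as above: fs, conf, path, key class, value class
            SequenceFile.Writer w = SequenceFile.createWriter(fs, fs.getConf(), thePath,
                    IntWritable.class, MapWritable.class);

            // Stand-in for the documents the crawler actually fetches
            List<String> docs = Arrays.asList("first document", "second document");
            try {
                for (int itemNum = 0; itemNum < docs.size(); itemNum++) {
                    MapWritable mw = new MapWritable();
                    mw.put(new Text("body"), new Text(docs.get(itemNum)));
                    w.append(new IntWritable(itemNum), mw);
                }
            } finally {
                // The writer is closed once, after all items have been appended
                w.close();
            }
        }
    }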

Thanks for any help you can provide.

Here's the error:

org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.dfs.LeaseExpiredException: No lease on /input/items.seq
    at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:976)
    at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:293)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)

    at org.apache.hadoop.ipc.Client.call(Client.java:482)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
    at org.apache.hadoop.dfs.$Proxy0.addBlock(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
    at org.apache.hadoop.dfs.$Proxy0.addBlock(Unknown Source)
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:1541)
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:1487)
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.endBlock(DFSClient.java:1613)
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.writeChunk(DFSClient.java:1589)
    at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk(FSOutputSummer.java:140)
    at org.apache.hadoop.fs.FSOutputSummer.write1(FSOutputSummer.java:100)
    at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:86)
    at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:39)
    at java.io.DataOutputStream.write(DataOutputStream.java:90)
    at org.apache.hadoop.io.SequenceFile$RecordCompressWriter.append(SequenceFile.java:1077)
    at com.XXX.crawler.HadoopDestination.writeItem(HadoopDestination.java:79)
    at com.XXX.crawler.Crawler.main(Crawler.java:142)
