Hi,

I'm writing a fairly simple client application that basically concatenates
the output files of a MapReduce job (Hadoop 0.20.2). The code is as follows:

DFSClient client = new DFSClient(new Configuration());
FileStatus[] listing = client.listPaths("/myoutputdir");
byte[] buffer = new byte[2048];
StringBuilder builder = new StringBuilder();

for (FileStatus file : listing) {
  String filename = file.getPath().getName();

  if (filename.startsWith("part")) {
    // Separate consecutive part files with a newline.
    if (builder.length() > 0)
      builder.append("\n");

    InputStream input = client.open(filename);

    int read;
    while ((read = input.read(buffer)) > 0) {
      builder.append(new String(buffer, 0, read));
    }

    input.close();
  }
}
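
Stripped of the DFS-specific calls, the read-and-append logic I'm after boils
down to something like this (plain-Java sketch with in-memory streams standing
in for the part files; the ConcatSketch name is just for illustration):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ConcatSketch {
  // Same read-and-append loop as above, with the DFS parts stripped out
  // so the concatenation logic can be looked at in isolation.
  static String concat(InputStream... parts) throws IOException {
    byte[] buffer = new byte[2048];
    StringBuilder builder = new StringBuilder();
    for (InputStream input : parts) {
      // Separate consecutive inputs with a newline.
      if (builder.length() > 0)
        builder.append("\n");
      int read;
      while ((read = input.read(buffer)) > 0) {
        builder.append(new String(buffer, 0, read));
      }
      input.close();
    }
    return builder.toString();
  }

  public static void main(String[] args) throws IOException {
    // Two in-memory streams standing in for part-r-00000 and part-r-00001.
    System.out.println(concat(
        new ByteArrayInputStream("a\t1".getBytes()),
        new ByteArrayInputStream("b\t2".getBytes())));
  }
}
```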

I'm tracing the RPC calls and notice most of them work. For instance, I see a
call to getListing("/myoutputdir"), after which I successfully retrieve the
files in the directory. Once I reach the client.open() statement, however, a
call is sent: getBlockLocations("part-r-00000", 0, 671088640), which I
believe, going out on a limb here, looks up the block locations for the file
part-r-00000. Unfortunately this fails, and worse yet, the debugging
information is slim. I get back:

org.apache.hadoop.ipc.RemoteException: java.io.IOException:
java.lang.NullPointerException

  at org.apache.hadoop.ipc.Client.call(Client.java:740)
  at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
  at $Proxy54.getBlockLocations(Unknown Source)
  ...

Since the exception is thrown on the remote side, the stack trace isn't much
help: Client.java:740 is just the statement that fills in the remote stack
trace. I couldn't find anything in the namenode log either. Any suggestions?
