Hi,

I've written a parse-exe plugin for downloading EXE files from crawled
pages, using parse-pdf as my template.
The plugin works (it downloads the EXE for any related content type, i.e.
application/(x-exe|x-msdos|x-dosexec..)), but I still get a
NullPointerException for parseData.
I don't fully understand the code at the end, and I may have missed
something. Can anyone help?

Here is the getParse(Content content) I've written:

  public Parse getParse(Content content) {
    String resultText = "No textual content available";
    String resultTitle = "No textual content available";
    Outlink[] outlinks = new Outlink[0];
    Metadata metadata = new Metadata();

    try {

      byte[] raw = content.getContent();

      String contentLength =
          content.getMetadata().get(Response.CONTENT_LENGTH);
      if (contentLength != null
          && raw.length != Integer.parseInt(contentLength)) {
        return new ParseStatus(ParseStatus.FAILED,
            ParseStatus.FAILED_TRUNCATED,
            "Content truncated at " + raw.length
            + " bytes. Parser can't handle incomplete exe file.")
            .getEmptyParse(getConf());
      }
      // download the file - separate method (doesn't affect the other vars)
      downloadContentType(content);

    } catch (Exception e) { // run time exception
      if (LOG.isWarnEnabled()) {
        LOG.warn("General exception in EXE parser: " + e.getMessage());
        e.printStackTrace(LogUtil.getWarnStream(LOG));
      }
      return new ParseStatus(ParseStatus.FAILED,
          "Can't be handled as exe document. " + e)
          .getEmptyParse(getConf());
    }

     ParseData parseData = new ParseData(ParseStatus.STATUS_SUCCESS,
                                              resultTitle, outlinks,
                                              content.getMetadata());

    return new ParseImpl(resultText, parseData);
  }
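
As a side note, the Content-Length comparison in the try block could be pulled out into a small helper. This is only a sketch in plain Java with no Nutch classes: the name isTruncated is hypothetical, and treating a missing or unparsable header as "length unknown" rather than as an error is an assumption, not what the code above does (there a bad header falls through to the general catch).

```java
public class TruncationCheck {

    /**
     * Returns true only when the header holds a valid integer that differs
     * from the number of bytes actually received. A missing or malformed
     * header is treated as "length unknown" (an assumption of this sketch).
     */
    public static boolean isTruncated(byte[] raw, String contentLength) {
        if (contentLength == null) {
            return false;
        }
        try {
            return raw.length != Integer.parseInt(contentLength.trim());
        } catch (NumberFormatException e) {
            // Header present but not a number: cannot prove truncation.
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] raw = new byte[10];
        System.out.println(isTruncated(raw, "10"));
        System.out.println(isTruncated(raw, "2048"));
        System.out.println(isTruncated(raw, null));
    }
}
```

In getParse the call would take content.getContent() and the Response.CONTENT_LENGTH metadata value as its two arguments.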


When running, I get this exception:

java.lang.NullPointerException
        at org.apache.nutch.parse.ParseData.write(ParseData.java:163)
        at org.apache.nutch.parse.ParseImpl.write(ParseImpl.java:55)
        at org.apache.nutch.fetcher.FetcherOutput.write(FetcherOutput.java:63)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:315)
        at org.apache.nutch.fetcher.Fetcher$FetcherThread.output(Fetcher.java:403)
        at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:164)
fetch of http://www2.ati.com/misc/themes/ATI_ThemeManager_July2004.exe failed with:
java.lang.NullPointerException
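
Since the trace fails inside ParseData.write rather than inside getParse, one possibility is that a value handed to the ParseData constructor is null and only blows up later at serialization time. That is a guess, not a confirmed cause. A way to narrow it down would be to check each argument just before constructing ParseData. The sketch below is plain Java with no Nutch classes; the helper name firstNull and the field labels are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class NullFieldCheck {

    /**
     * Returns the label of the first null value, or null if no value is null.
     * Labels and values are hypothetical stand-ins for the ParseData arguments.
     */
    public static String firstNull(Map<String, Object> namedValues) {
        for (Map.Entry<String, Object> e : namedValues.entrySet()) {
            if (e.getValue() == null) {
                return e.getKey();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order and permits null values.
        Map<String, Object> fields = new LinkedHashMap<String, Object>();
        fields.put("status", "success");
        fields.put("title", "No textual content available");
        fields.put("outlinks", new Object[0]);
        fields.put("contentMeta", null); // simulate a missing metadata object

        System.out.println("first null field: " + firstNull(fields));
    }
}
```

In the plugin, the same check would be fed the real status, title, outlinks and content.getMetadata() values, and the result logged before the return statement.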

Thanks,

Eyal.


-- 
Eyal Edri
