I have a valid PDF file for testing. When I pass its name to the test program below, I get back:

got file ./test.pdf
Title: null
Author: null
Content:
Closing stream...

Can anyone see what I am doing wrong? This is Tika 0.9, which comes with the latest Solr release.


Thanks - Tod

import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;

import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.Parser;
import org.apache.tika.sax.BodyContentHandler;

public class testTika {

  public static void main(String[] args) {
    try {
      InputStream stream = new FileInputStream(new File(args[0]));
      System.err.println("got file: " + args[0]);
      try {
        // Let Tika detect the document type and choose a parser.
        Parser parser = new AutoDetectParser();
        // Collects the plain text of the document body.
        BodyContentHandler textHandler = new BodyContentHandler();
        Metadata metadata = new Metadata();
        // The file name is passed as a hint for type detection.
        metadata.set(Metadata.RESOURCE_NAME_KEY, args[0]);
        ParseContext context = new ParseContext();
        parser.parse(stream, textHandler, metadata, context);

        System.out.println("Title: " + metadata.get(Metadata.TITLE));
        System.out.println("Author: " + metadata.get("Author"));
        System.out.println("Content: " + textHandler.toString());
      } finally {
        System.out.println("Closing stream...");
        stream.close();
      }
    } catch (Exception ge) {
      System.err.println("Problem ... bailing");
      ge.printStackTrace();
    }
  }
}
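
One thing I could try ruling out is the classpath. This is only a guess, not a confirmed diagnosis: if the PDF parser classes or PDFBox are not visible at run time, the parse might complete without an exception and still produce no title, author, or text, which would match the output above. The small check below uses only the JDK and reports whether three classes can be located. The class name ClasspathCheck and the helper isOnClasspath are made up for this sketch, and the assumption is that these three class names are the relevant ones for PDF support.

```java
// Hypothetical diagnostic: reports whether classes that PDF parsing
// is assumed to need can be found on the current classpath.
class ClasspathCheck {

    // Returns true if the named class can be located (without initializing it).
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Either the class is absent or one of its dependencies is.
            return false;
        }
    }

    public static void main(String[] args) {
        String[] names = {
            "org.apache.tika.parser.AutoDetectParser", // Tika core
            "org.apache.tika.parser.pdf.PDFParser",    // Tika parsers
            "org.apache.pdfbox.pdmodel.PDDocument"     // PDFBox
        };
        for (String name : names) {
            String status = isOnClasspath(name) ? "found" : "MISSING";
            System.out.println(name + ": " + status);
        }
    }
}
```

Running it with the same classpath as the test program should show whether anything is missing. If all three are found, the problem presumably lies elsewhere.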
