Hello,
Sorry if this has already been answered; I am new here.
I am using Tika 0.7 on Windows XP. This is my code:
for (File f : files) {
    if (isFile(f)) {
        try {
            input = new FileInputStream(f);
            ParseContext context = new ParseContext();
            Metadata metadata = new Metadata();
            // -1 to disable write length limit
            ContentHandler handler = new BodyContentHandler(-1);
            AutoDetectParser parse = new AutoDetectParser();
            parse.parse(input, handler, metadata, context);
        } catch (FileNotFoundException fnfe) {
            System.out.println(fnfe.getMessage());
        } catch (TikaException te) {
            System.out.println(te.getMessage());
        } catch (SAXException se) {
            System.out.println(se.getMessage());
        } catch (IOException ie) {
            System.out.println(ie.getMessage());
        } finally {
            // always release the file handle, even if parsing failed
            if (input != null) {
                try {
                    input.close();
                } catch (IOException ie) {
                    this.error.append(ie.getMessage() + "\n");
                }
            }
        }
    }
}
I have several questions:
- What is the ParseContext for? What should I set in it?
- I know POI does not support Office 2007, but the call above,
parse.parse(input, handler, metadata, context);
does not throw any exception, and for some reason I do not understand, the
for loop does not continue after it reads an Office 2007 PPT file.
- Is there a plan to support Office 2007? (Sorry, I should ask this on the POI
forum, but just in case anyone here knows.)
Thanks,
canal
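
A note on the second question: the catch clauses in my loop only cover FileNotFoundException, TikaException, SAXException and IOException. One possibility (only a guess, not confirmed for Tika 0.7) is that the parser fails with an unchecked exception or an Error, which would leave the loop without any of those handlers printing a message. Below is a minimal, Tika-free sketch of that behaviour. The class LoopDiagnosis, the method parse(String) and the simulated NoClassDefFoundError are all hypothetical stand-ins for illustration, not Tika's API or its actual failure mode.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LoopDiagnosis {

    // Hypothetical stand-in for parse.parse(...): fails with an unchecked
    // Error for .pptx names. This is an assumption made for illustration.
    static void parse(String name) throws IOException {
        if (name.endsWith(".pptx")) {
            throw new NoClassDefFoundError("simulated missing class for " + name);
        }
    }

    // Same shape as the loop in the post, plus a last-resort catch (Throwable)
    // so that no failure can leave the loop unreported.
    static List<String> processAll(List<String> names) {
        List<String> log = new ArrayList<String>();
        for (String name : names) {
            try {
                parse(name);
                log.add("ok: " + name);
            } catch (IOException ie) {
                log.add("io: " + ie.getMessage());
            } catch (Throwable t) {
                // Reached for RuntimeException and Error, which the
                // checked-exception handlers above do not cover.
                log.add("unexpected: " + t);
            }
        }
        return log;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("a.doc", "b.pptx", "c.doc");
        for (String line : processAll(names)) {
            System.out.println(line);
        }
    }
}
```

With the extra catch (Throwable) in place, the loop reports the failing file and moves on to the next one. Adding a similar clause to the real loop, purely as a diagnostic, should show whether something outside the four caught exception types is being thrown.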