This could be a bug in PDFBox. You might want to ask on the PDFBox users' list.
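If you do post there, the full stack trace will help more than the message alone. Below is a minimal sketch of one way to capture it, using only the JDK. The class name FailureReport and its two helper methods are hypothetical names made up for this illustration, and the stand-in exception in main assumes the real failure is a ClassCastException, which the wording of the message suggests but the report does not confirm.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

public class FailureReport {

    /** Follows getCause() links down to the innermost exception. */
    public static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    /** Renders the full stack trace, including "Caused by:" sections, as text. */
    public static String fullTrace(Throwable t) {
        StringWriter buffer = new StringWriter();
        t.printStackTrace(new PrintWriter(buffer, true));
        return buffer.toString();
    }

    public static void main(String[] args) {
        // Stand-in for the real failure (assumed type); the message is taken
        // from the quoted report.
        Exception inner = new ClassCastException(
                "org.apache.pdfbox.cos.COSArray cannot be cast to "
                + "org.apache.pdfbox.cos.COSDictionary");
        // Passing the exception as the cause keeps its stack trace attached.
        RuntimeException wrapped = new RuntimeException(inner.getMessage(), inner);

        System.out.println(rootCause(wrapped).getClass().getName());
        System.out.println(fullTrace(wrapped));
    }
}
```

Wrapping with new RuntimeException(e.getMessage()) alone drops the cause, so the frames inside PDFBox would not show up in what the caller sees; passing the exception as the second argument keeps them.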
-----Original Message-----
From: [email protected] [mailto:[email protected]]
Sent: Friday, September 16, 2016 7:30 AM
To: [email protected]
Subject: [Tika] I have a question. --> "Exception :
org.apache.pdfbox.cos.COSArray cannot be cast to
org.apache.pdfbox.cos.COSDictionary"
An exception is raised at the line "parser.parse(new Fil ....":
"Exception : org.apache.pdfbox.cos.COSArray cannot be cast to
org.apache.pdfbox.cos.COSDictionary"
Why does this exception occur?
It does not occur with dozens of other PDFs.
My program is below.
-----------------------------------------------------
import java.io.File;
import java.io.FileInputStream;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.pdf.PDFParser;
import org.apache.tika.sax.BodyContentHandler;
import org.xml.sax.ContentHandler;

try {
    File document = new File("/usr/local/sample.pdf");
    PDFParser parser = new PDFParser();
    // Integer.MAX_VALUE effectively lifts the handler's write limit.
    ContentHandler handler = new BodyContentHandler(Integer.MAX_VALUE);
    Metadata metadata = new Metadata();
    parser.parse(new FileInputStream(document), handler, metadata, new ParseContext());
    String plainText = handler.toString();
    System.out.println(plainText);
} catch (Exception e) {
    // Every original catch branch did the same thing, so one handler covers them all.
    e.printStackTrace();
    // Pass e as the cause so the original stack trace is not lost.
    throw new RuntimeException(e.getMessage(), e);
}
-----------------------------------------------------
--
syosinnsya