Could be a bug in PDFBox. Might want to ask on the pdfbox users' list.

-----Original Message-----
From: question.answer...@gmail.com [mailto:question.answer...@gmail.com]
Sent: Friday, September 16, 2016 7:30 AM
To: user@tika.apache.org
Subject: [Tika] I have a question. --> "Exception : org.apache.pdfbox.cos.COSArray cannot be cast to org.apache.pdfbox.cos.COSDictionary"

An exception is raised at the line "parser.parse(new Fil ....":

"Exception : org.apache.pdfbox.cos.COSArray cannot be cast to org.apache.pdfbox.cos.COSDictionary"

Why does this exception occur? It does not occur with dozens of other PDFs. My program is below.

-----------------------------------------------------
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import org.apache.tika.exception.TikaException;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.pdf.PDFParser;
import org.apache.tika.sax.BodyContentHandler;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;

// Class name chosen only so the snippet compiles on its own.
public class PdfTextDump {
    public static void main(String[] args) {
        File document = new File("/usr/local/sample.pdf");
        // try-with-resources closes the stream even if parsing fails.
        try (InputStream stream = new FileInputStream(document)) {
            PDFParser parser = new PDFParser();
            ContentHandler handler = new BodyContentHandler(Integer.MAX_VALUE);
            Metadata metadata = new Metadata();
            parser.parse(stream, handler, metadata, new ParseContext());
            String plainText = handler.toString();
            System.out.println(plainText);
        } catch (IOException | SAXException | TikaException e) {
            // FileNotFoundException is an IOException, so it is handled here too.
            e.printStackTrace();
            throw new RuntimeException(e.getMessage(), e);
        } catch (RuntimeException e) {
            // Unchecked exceptions such as ClassCastException end up here.
            e.printStackTrace();
            throw new RuntimeException(e.getMessage(), e);
        }
    }
}
-----------------------------------------------------

-- 
syosinnsya
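
Before taking this to the pdfbox users' list, it may help to confirm that the exception really originates inside PDFBox rather than in Tika's wrapper code. The following is a minimal sketch using only the JDK; the class ExceptionOrigin and its methods rootCause and firstFrameIn are hypothetical helpers written for this illustration, not part of Tika or PDFBox. The only detail taken from the report is the package prefix "org.apache.pdfbox.", which comes from the class names in the exception message. The cast inside main is just a stand-in for the failing parser.parse(...) call.

```java
import java.util.Optional;

// Hypothetical helper for illustration; not part of Tika or PDFBox.
public class ExceptionOrigin {

    /** Walks the cause chain and returns the innermost cause. */
    public static Throwable rootCause(Throwable t) {
        Throwable current = t;
        // Stop when there is no further cause to unwrap.
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    /**
     * Returns the first stack frame, starting at the throw site, whose
     * class name begins with the given package prefix.
     */
    public static Optional<StackTraceElement> firstFrameIn(Throwable t, String packagePrefix) {
        for (StackTraceElement frame : t.getStackTrace()) {
            if (frame.getClassName().startsWith(packagePrefix)) {
                return Optional.of(frame);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        try {
            // Stand-in for the failing parser.parse(...) call.
            Object notAString = new int[0];
            String s = (String) notAString; // throws ClassCastException
            System.out.println(s);
        } catch (Exception e) {
            Throwable root = rootCause(e);
            System.out.println("Root cause: " + root);
            System.out.println("First PDFBox frame: "
                    + firstFrameIn(root, "org.apache.pdfbox.")
                            .map(StackTraceElement::toString)
                            .orElse("none"));
        }
    }
}
```

In the real program, the same two calls could go inside the catch block around parser.parse(...). If a PDFBox frame is reported, that frame together with the full stack trace and, if it can be shared, the PDF itself is probably the most useful material to include in a message to the pdfbox users' list.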