Chris,

Is this on an updated and/or reverted trunk or on a modified rc-3?
I haven't gotten around to installing tesseract yet so I can't actually kick the tires, but the last time there was a test for 5 items on line 91 of RFC822ParserTest was in r1552405...before the fixes for TIKA-1422. But r1552405 doesn't quite seem to fit the error message, which says that it can't find 5 "div" (if I understand correctly), and in r1552405 the test was for 5 "p". In r1633331 and 1633325, there is a path through the code to test for 5 "div" if Tesseract is running, but that isn't occurring on line 91 in those revisions.

-----Original Message-----
From: Mattmann, Chris A (3980) [mailto:[email protected]]
Sent: Sunday, January 11, 2015 7:33 PM
To: [email protected]
Subject: TestMultiPart tests failing

Hey Guys,

I’m on Mac OS X 10.9.4, Java version:

[chipotle:~/src/tika] mattmann% uname -a
Darwin chipotle.local 13.3.0 Darwin Kernel Version 13.3.0: Tue Jun 3 21:27:35 PDT 2014; root:xnu-2422.110.17~1/RELEASE_X86_64 x86_64
[chipotle:~/src/tika] mattmann% java -version
java version "1.7.0_60"
Java(TM) SE Runtime Environment (build 1.7.0_60-b19)
Java HotSpot(TM) 64-Bit Server VM (build 24.60-b09, mixed mode)
[chipotle:~/src/tika] mattmann%

With Tesseract installed:

[chipotle:~/src/tika] mattmann% tesseract --version
tesseract 3.02.02
 leptonica-1.71
  libjpeg 8d : libpng 1.6.13 : libtiff 4.0.3 : zlib 1.2.5
[chipotle:~/src/tika] mattmann%

And the following tests are failing:

Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.007 sec
Running org.apache.tika.parser.xml.EmptyAndDuplicateElementsXMLParserTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.006 sec
Running org.apache.tika.parser.xml.FictionBookParserTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.004 sec
Running org.apache.tika.sax.PhoneExtractingContentHandlerTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.009 sec
Running org.apache.tika.TestParsers
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.669 sec

Results :

Failed tests:
  testMultipart(org.apache.tika.parser.mail.RFC822ParserTest): (..)

Tests run: 572, Failures: 1, Errors: 0, Skipped: 2

[INFO] -------------------------------------------------
Test set: org.apache.tika.parser.mail.RFC822ParserTest
-------------------------------------------------------------------------------
Tests run: 8, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.903 sec <<< FAILURE!
testMultipart(org.apache.tika.parser.mail.RFC822ParserTest)  Time elapsed: 0.289 sec  <<< FAILURE!
org.mockito.exceptions.verification.TooLittleActualInvocations:
xHTMLContentHandler.startElement(
    "http://www.w3.org/1999/xhtml",
    "div",
    "div",
    isA(org.xml.sax.Attributes)
);
Wanted 5 times but was 4
        at org.apache.tika.parser.mail.RFC822ParserTest.testMultipart(RFC822ParserTest.java:91)
Caused by: org.mockito.exceptions.cause.TooLittleInvocations:
Too little invocations:
        at org.apache.tika.sax.ContentHandlerDecorator.startElement(ContentHandlerDecorator.java:126)
        at org.apache.tika.sax.SafeContentHandler.startElement(SafeContentHandler.java:264)
        at org.apache.tika.sax.XHTMLContentHandler.startElement(XHTMLContentHandler.java:254)
        at org.apache.tika.sax.XHTMLContentHandler.startElement(XHTMLContentHandler.java:291)
        at org.apache.tika.parser.mail.MailContentHandler.startBodyPart(MailContentHandler.java:242)
        at org.apache.james.mime4j.parser.MimeStr

Ideas?

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: [email protected]
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
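
One possible way to see how many "div" start events are actually emitted, independent of the Mockito verification in the trace above, is to count them with a plain SAX handler. The following is only a minimal sketch using JDK classes: DivCounter, countDivs and the sample XHTML string are hypothetical names and data made up for illustration, not anything in the Tika test suite. It counts startElement events whose namespace is XHTML and whose local name is "div", which is the same event the failing verification is matching on.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper (not part of Tika): counts XHTML "div" start events.
public class DivCounter extends DefaultHandler {
    static final String XHTML = "http://www.w3.org/1999/xhtml";

    private int divs = 0;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Same namespace and local name that the Mockito verification matches on.
        if (XHTML.equals(uri) && "div".equals(localName)) {
            divs++;
        }
    }

    public int getDivs() {
        return divs;
    }

    // Parses an XHTML string with a namespace-aware SAX parser and returns the div count.
    public static int countDivs(String xhtml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // needed so uri/localName are populated
        DivCounter counter = new DivCounter();
        factory.newSAXParser().parse(new InputSource(new StringReader(xhtml)), counter);
        return counter.getDivs();
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample with four divs; real output would come from the parser under test.
        String sample = "<html xmlns=\"http://www.w3.org/1999/xhtml\"><body>"
                + "<div>a</div><div>b</div><div>c</div><div>d</div>"
                + "</body></html>";
        System.out.println("div start events: " + countDivs(sample));
    }
}
```

Since DivCounter is an ordinary org.xml.sax.ContentHandler, an instance could presumably also be handed to the mail parser in place of the mocked handler, to compare the count with and without Tesseract on the path. That part is an untested assumption.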
