[ https://issues.apache.org/jira/browse/TIKA-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14142309#comment-14142309 ]
Chris A. Mattmann commented on TIKA-1421:
-----------------------------------------
Here's how it fails when Tesseract is installed:
{noformat}
[INFO] Surefire report directory:
/data/home/mattmann/src/tika/tika-parsers/target/surefire-reports
-------------------------------------------------------
T E S T S
-------------------------------------------------------
Running org.apache.tika.parser.audio.AudioParserTest
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.827 sec
Running org.apache.tika.parser.audio.MidiParserTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.015 sec
Running org.apache.tika.parser.microsoft.ooxml.OOXMLContainerExtractionTest
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.483 sec
Running org.apache.tika.parser.microsoft.ooxml.OOXMLParserTest
Tests run: 34, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.665 sec
Running org.apache.tika.parser.microsoft.VisioParserTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.047 sec
Running org.apache.tika.parser.microsoft.PowerPointParserTest
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.296 sec
Running org.apache.tika.parser.microsoft.POIContainerExtractionTest
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.127 sec
Running org.apache.tika.parser.microsoft.ExcelParserTest
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.793 sec
Running org.apache.tika.parser.microsoft.WriteProtectedParserTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.091 sec
Running org.apache.tika.parser.microsoft.OfficeParserTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.004 sec
Running org.apache.tika.parser.microsoft.ProjectParserTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.006 sec
Running org.apache.tika.parser.microsoft.WordParserTest
Tests run: 15, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.466 sec
Running org.apache.tika.parser.microsoft.OutlookParserTest
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.418 sec
Running org.apache.tika.parser.microsoft.PublisherParserTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.008 sec
Running org.apache.tika.parser.microsoft.TNEFParserTest
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.049 sec
Running org.apache.tika.parser.xml.EmptyAndDuplicateElementsXMLParserTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.004 sec
Running org.apache.tika.parser.xml.FictionBookParserTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.025 sec
Running org.apache.tika.parser.xml.DcXMLParserTest
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.009 sec
Running org.apache.tika.parser.iwork.IWorkParserTest
Tests run: 18, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.634 sec
Running org.apache.tika.parser.iwork.AutoPageNumberUtilsTest
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0 sec
Running org.apache.tika.parser.asm.ClassParserTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.01 sec
Running org.apache.tika.parser.chm.TestPmglHeader
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.006 sec
Running org.apache.tika.parser.chm.TestChmItspHeader
Tests run: 16, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.01 sec
Running org.apache.tika.parser.chm.TestPmgiHeader
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.013 sec
Running org.apache.tika.parser.chm.TestChmExtractor
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.185 sec
Running org.apache.tika.parser.chm.TestChmLzxState
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.006 sec
Running org.apache.tika.parser.chm.TestChmLzxcResetTable
Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.02 sec
Running org.apache.tika.parser.chm.TestChmExtraction
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.938 sec
Running org.apache.tika.parser.chm.TestDirectoryListingEntry
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.004 sec
Running org.apache.tika.parser.chm.TestChmLzxcControlData
Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.004 sec
Running org.apache.tika.parser.chm.TestChmBlockInfo
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.003 sec
Running org.apache.tika.parser.chm.TestChmItsfHeader
Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.005 sec
Running org.apache.tika.parser.txt.TXTParserTest
Tests run: 11, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.017 sec
Running org.apache.tika.parser.txt.CharsetDetectorTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.02 sec
Running org.apache.tika.parser.image.xmp.JempboxExtractorTest
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.012 sec
Running org.apache.tika.parser.image.PSDParserTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.002 sec
Running org.apache.tika.parser.image.ImageParserTest
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.034 sec
Running org.apache.tika.parser.image.ImageMetadataExtractorTest
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.233 sec
Running org.apache.tika.parser.image.MetadataFieldsTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0 sec
Running org.apache.tika.parser.image.TiffParserTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.017 sec
Running org.apache.tika.parser.font.FontParsersTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.183 sec
Running org.apache.tika.parser.mp4.MP4ParserTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.059 sec
Running org.apache.tika.parser.mp3.Mp3ParserTest
Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.035 sec
Running org.apache.tika.parser.mp3.MpegStreamTest
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.001 sec
Running org.apache.tika.parser.dwg.DWGParserTest
Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.014 sec
Running org.apache.tika.parser.pkg.GzipParserTest
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.186 sec
Running org.apache.tika.parser.pkg.Seven7ParserTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.314 sec
Running org.apache.tika.parser.pkg.TarParserTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.082 sec
Running org.apache.tika.parser.pkg.Bzip2ParserTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.166 sec
Running org.apache.tika.parser.pkg.ArParserTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.016 sec
Running org.apache.tika.parser.pkg.ZipParserTest
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.216 sec
Running org.apache.tika.parser.video.FLVParserTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.017 sec
Running org.apache.tika.parser.solidworks.SolidworksParserTest
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.014 sec
Running org.apache.tika.parser.ibooks.iBooksParserTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.013 sec
Running org.apache.tika.parser.ParsingReaderTest
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.013 sec
Running org.apache.tika.parser.mail.RFC822ParserTest
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.46 sec
Running org.apache.tika.parser.mbox.MboxParserTest
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.026 sec
Running org.apache.tika.parser.mbox.OutlookPSTParserTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.072 sec
Running org.apache.tika.parser.jpeg.JpegParserTest
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.178 sec
Running org.apache.tika.parser.executable.ExecutableParserTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.001 sec
Running org.apache.tika.parser.rtf.RTFParserTest
Tests run: 31, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.12 sec
Running org.apache.tika.parser.fork.ForkParserIntegrationTest
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.299 sec
Running org.apache.tika.parser.envi.EnviHeaderParserTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.001 sec
Running org.apache.tika.parser.AutoDetectParserTest
Tests run: 22, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.884 sec
Running org.apache.tika.parser.epub.EpubParserTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.003 sec
Running org.apache.tika.parser.code.SourceCodeParserTest
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.065 sec
Running org.apache.tika.parser.netcdf.NetCDFParserTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.127 sec
Running org.apache.tika.parser.pdf.PDFParserTest
WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 205317
WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 56931
WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 56931
WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 56931
WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 56931
WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 205317
INFO [main] (PDFParser.java:248) - Document is encrypted
ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref at
offset 116
ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref at
offset 5592
WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 51851
WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 51851
ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref at
offset 116
ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref at
offset 5592
ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref at
offset 12324
ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref at
offset 116
ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref at
offset 5969
ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref at
offset 116
ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref at
offset 5687
WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 44785
WARN [main] (FontManager.java:312) - Font not found: Times New Roman
WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 44785
WARN [main] (FontManager.java:312) - Font not found: Times New Roman
WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 56931
WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 56931
ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref at
offset 116
ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref at
offset 26441
ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref at
offset 116
ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref at
offset 5592
WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 205317
WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 205317
ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref at
offset 116
ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref at
offset 8777
ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref at
offset 2314576
WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 68229
WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 68229
ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref at
offset 116
ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref at
offset 5500
WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 56931
WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 51851
INFO [main] (PDFParser.java:248) - Document is encrypted
INFO [main] (PDFParser.java:248) - Document is encrypted
Tests run: 27, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 14.361 sec
Running org.apache.tika.parser.RecursiveParserWrapperTest
Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.243 sec
Running org.apache.tika.parser.prt.PRTParserTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.011 sec
Running org.apache.tika.parser.html.HtmlParserTest
Tests run: 38, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 0.133 sec
Running org.apache.tika.parser.mat.MatParserTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.536 sec
Running org.apache.tika.parser.feed.FeedParserTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.009 sec
Running org.apache.tika.parser.ocr.TesseractOCRTest
Tests run: 3, Failures: 3, Errors: 0, Skipped: 0, Time elapsed: 0.272 sec <<<
FAILURE!
Running org.apache.tika.parser.odf.ODFParserTest
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.097 sec
Running org.apache.tika.parser.hdf.HDFParserTest
WARN [main] (H4header.java:392) - **dimension length=0 for TagVGroup=
*refno=53 tag= VG (1965) Vgroup length=34 class= Dim0.0 name= Longitude using
data 52
WARN [main] (H4header.java:392) - **dimension length=0 for TagVGroup=
*refno=55 tag= VG (1965) Vgroup length=33 class= Dim0.0 name= Latitude using
data 54
WARN [main] (H4header.java:392) - **dimension length=0 for TagVGroup=
*refno=57 tag= VG (1965) Vgroup length=33 class= Dim0.0 name= fakeDim2 using
data 56
WARN [main] (H4header.java:392) - **dimension length=0 for TagVGroup=
*refno=59 tag= VG (1965) Vgroup length=33 class= Dim0.0 name= fakeDim3 using
data 58
WARN [main] (H4header.java:844) - data tag missing vgroup= 70 Sea Surface
Temperature
WARN [main] (H4header.java:844) - data tag missing vgroup= 73 Number of
Observations per Bin
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.087 sec
Running org.apache.tika.embedder.ExternalEmbedderTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.008 sec
Running org.apache.tika.mime.MimeTypesTest
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.001 sec
Running org.apache.tika.mime.TestMimeTypes
Tests run: 47, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.126 sec
Running org.apache.tika.mime.MimeTypeTest
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0 sec
Running org.apache.tika.detect.TestContainerAwareDetector
Tests run: 15, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.262 sec
Running org.apache.tika.TestParsers
WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 68229
WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 44785
WARN [main] (FontManager.java:312) - Font not found: Times New Roman
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.186 sec
Results :
Failed tests: testPPTXOCR(org.apache.tika.parser.ocr.TesseractOCRTest): Check
for the image's text.
testDOCXOCR(org.apache.tika.parser.ocr.TesseractOCRTest)
testPDFOCR(org.apache.tika.parser.ocr.TesseractOCRTest)
Tests run: 538, Failures: 3, Errors: 0, Skipped: 1
{noformat}
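Since the one failing class is easy to miss in a run this long, here is a minimal sketch of how the Surefire console output above could be scanned for classes that report failures or errors. It assumes only the "Running <class>" and "Tests run: ..., Time elapsed: ..." line shapes visible in the log; SurefireSummary and failingClasses are hypothetical names for illustration, not part of Tika or Surefire.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Tika or Surefire.
class SurefireSummary {
    // "Running org.apache.tika.parser.ocr.TesseractOCRTest"
    private static final Pattern RUNNING = Pattern.compile("^Running (\\S+)$");
    // Per-class result line. The final totals line has no "Time elapsed" part,
    // so it does not match and is ignored.
    private static final Pattern RESULT = Pattern.compile(
            "^Tests run: (\\d+), Failures: (\\d+), Errors: (\\d+), Skipped: (\\d+), Time elapsed:.*");

    // Returns the test classes whose result line reports failures or errors.
    static List<String> failingClasses(List<String> lines) {
        List<String> failing = new ArrayList<>();
        String current = null; // class named by the most recent "Running" line
        for (String raw : lines) {
            String line = raw.trim();
            Matcher running = RUNNING.matcher(line);
            if (running.matches()) {
                current = running.group(1);
                continue;
            }
            Matcher result = RESULT.matcher(line);
            if (result.matches() && current != null) {
                int failures = Integer.parseInt(result.group(2));
                int errors = Integer.parseInt(result.group(3));
                if (failures + errors > 0) {
                    failing.add(current);
                }
                current = null; // WARN/ERROR log lines in between are skipped
            }
        }
        return failing;
    }

    public static void main(String[] args) {
        // A few lines copied from the run above.
        List<String> log = Arrays.asList(
                "Running org.apache.tika.parser.feed.FeedParserTest",
                "Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.009 sec",
                "Running org.apache.tika.parser.ocr.TesseractOCRTest",
                "Tests run: 3, Failures: 3, Errors: 0, Skipped: 0, Time elapsed: 0.272 sec <<<",
                "FAILURE!",
                "Tests run: 538, Failures: 3, Errors: 0, Skipped: 1");
        System.out.println(failingClasses(log));
    }
}
```

Run against the full output above, this should single out org.apache.tika.parser.ocr.TesseractOCRTest as the only class with failures. A class that only skips its tests is not reported.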
> Tika-Parsers tests fail on CentOS6 if tesseract isn't installed
> ---------------------------------------------------------------
>
> Key: TIKA-1421
> URL: https://issues.apache.org/jira/browse/TIKA-1421
> Project: Tika
> Issue Type: Bug
> Components: parser
> Environment: CentOS6 AWS VM for DARPA Memex
> Reporter: Chris A. Mattmann
> Assignee: Chris A. Mattmann
> Fix For: 1.7
>
>
> While testing TIKA-93 on CentOS6, I ran into some test failures in
> tika-parsers on a fresh install of Tika from 1.7 trunk:
> {noformat}
> Running org.apache.tika.parser.chm.TestChmLzxcControlData
> Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.008 sec
> Running org.apache.tika.parser.chm.TestChmBlockInfo
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.003 sec
> Running org.apache.tika.parser.chm.TestChmItsfHeader
> Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.005 sec
> Running org.apache.tika.parser.txt.TXTParserTest
> Tests run: 11, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.016 sec
> Running org.apache.tika.parser.txt.CharsetDetectorTest
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.02 sec
> Running org.apache.tika.parser.image.xmp.JempboxExtractorTest
> Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.014 sec
> Running org.apache.tika.parser.image.PSDParserTest
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.002 sec
> Running org.apache.tika.parser.image.ImageParserTest
> Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.034 sec
> Running org.apache.tika.parser.image.ImageMetadataExtractorTest
> Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.241 sec
> Running org.apache.tika.parser.image.MetadataFieldsTest
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0 sec
> Running org.apache.tika.parser.image.TiffParserTest
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.02 sec
> Running org.apache.tika.parser.font.FontParsersTest
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.192 sec
> Running org.apache.tika.parser.mp4.MP4ParserTest
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.07 sec
> Running org.apache.tika.parser.mp3.Mp3ParserTest
> Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.046 sec
> Running org.apache.tika.parser.mp3.MpegStreamTest
> Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.003 sec
> Running org.apache.tika.parser.dwg.DWGParserTest
> Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.02 sec
> Running org.apache.tika.parser.pkg.GzipParserTest
> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.252 sec
> Running org.apache.tika.parser.pkg.Seven7ParserTest
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.37 sec
> Running org.apache.tika.parser.pkg.TarParserTest
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.118 sec
> Running org.apache.tika.parser.pkg.Bzip2ParserTest
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.233 sec
> Running org.apache.tika.parser.pkg.ArParserTest
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.017 sec
> Running org.apache.tika.parser.pkg.ZipParserTest
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.302 sec
> Running org.apache.tika.parser.video.FLVParserTest
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.026 sec
> Running org.apache.tika.parser.solidworks.SolidworksParserTest
> Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.019 sec
> Running org.apache.tika.parser.ibooks.iBooksParserTest
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.019 sec
> Running org.apache.tika.parser.ParsingReaderTest
> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.018 sec
> Running org.apache.tika.parser.mail.RFC822ParserTest
> Tests run: 8, Failures: 1, Errors: 1, Skipped: 0, Time elapsed: 0.31 sec <<<
> FAILURE!
> Running org.apache.tika.parser.mbox.MboxParserTest
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.026 sec
> Running org.apache.tika.parser.mbox.OutlookPSTParserTest
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.094 sec
> Running org.apache.tika.parser.jpeg.JpegParserTest
> Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.153 sec
> Running org.apache.tika.parser.executable.ExecutableParserTest
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.003 sec
> Running org.apache.tika.parser.rtf.RTFParserTest
> Tests run: 31, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.221 sec
> Running org.apache.tika.parser.fork.ForkParserIntegrationTest
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.322 sec
> Running org.apache.tika.parser.envi.EnviHeaderParserTest
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.001 sec
> Running org.apache.tika.parser.AutoDetectParserTest
> Tests run: 22, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 2.439 sec
> <<< FAILURE!
> Running org.apache.tika.parser.epub.EpubParserTest
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.005 sec
> Running org.apache.tika.parser.code.SourceCodeParserTest
> Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.069 sec
> Running org.apache.tika.parser.netcdf.NetCDFParserTest
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.125 sec
> Running org.apache.tika.parser.pdf.PDFParserTest
> WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 205317
> WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 56931
> WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 56931
> WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 56931
> WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 56931
> WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 205317
> INFO [main] (PDFParser.java:248) - Document is encrypted
> ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref
> at offset 116
> ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref
> at offset 5592
> WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 51851
> WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 51851
> ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref
> at offset 116
> ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref
> at offset 5592
> ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref
> at offset 12324
> ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref
> at offset 116
> ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref
> at offset 5969
> ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref
> at offset 116
> ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref
> at offset 5687
> WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 44785
> WARN [main] (FontManager.java:312) - Font not found: Times New Roman
> WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 44785
> WARN [main] (FontManager.java:312) - Font not found: Times New Roman
> WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 56931
> WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 56931
> ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref
> at offset 116
> ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref
> at offset 26441
> ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref
> at offset 116
> ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref
> at offset 5592
> WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 205317
> WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 205317
> ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref
> at offset 116
> ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref
> at offset 8777
> ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref
> at offset 2314576
> WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 68229
> WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 68229
> ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref
> at offset 116
> ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref
> at offset 5500
> WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 56931
> WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 51851
> INFO [main] (PDFParser.java:248) - Document is encrypted
> INFO [main] (PDFParser.java:248) - Document is encrypted
> Tests run: 27, Failures: 3, Errors: 0, Skipped: 0, Time elapsed: 14.305 sec
> <<< FAILURE!
> Running org.apache.tika.parser.RecursiveParserWrapperTest
> Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.233 sec
> Running org.apache.tika.parser.prt.PRTParserTest
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.014 sec
> Running org.apache.tika.parser.html.HtmlParserTest
> Tests run: 38, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 0.162 sec
> Running org.apache.tika.parser.mat.MatParserTest
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.543 sec
> Running org.apache.tika.parser.feed.FeedParserTest
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.011 sec
> Running org.apache.tika.parser.ocr.TesseractOCRTest
> Tests run: 3, Failures: 0, Errors: 0, Skipped: 3, Time elapsed: 0.007 sec
> Running org.apache.tika.parser.odf.ODFParserTest
> Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.098 sec
> Running org.apache.tika.parser.hdf.HDFParserTest
> WARN [main] (H4header.java:392) - **dimension length=0 for TagVGroup=
> *refno=53 tag= VG (1965) Vgroup length=34 class= Dim0.0 name= Longitude using
> data 52
> WARN [main] (H4header.java:392) - **dimension length=0 for TagVGroup=
> *refno=55 tag= VG (1965) Vgroup length=33 class= Dim0.0 name= Latitude using
> data 54
> WARN [main] (H4header.java:392) - **dimension length=0 for TagVGroup=
> *refno=57 tag= VG (1965) Vgroup length=33 class= Dim0.0 name= fakeDim2 using
> data 56
> WARN [main] (H4header.java:392) - **dimension length=0 for TagVGroup=
> *refno=59 tag= VG (1965) Vgroup length=33 class= Dim0.0 name= fakeDim3 using
> data 58
> WARN [main] (H4header.java:844) - data tag missing vgroup= 70 Sea Surface
> Temperature
> WARN [main] (H4header.java:844) - data tag missing vgroup= 73 Number of
> Observations per Bin
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.087 sec
> Running org.apache.tika.embedder.ExternalEmbedderTest
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.014 sec
> Running org.apache.tika.mime.MimeTypesTest
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.002 sec
> Running org.apache.tika.mime.TestMimeTypes
> Tests run: 47, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.163 sec
> Running org.apache.tika.mime.MimeTypeTest
> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.001 sec
> Running org.apache.tika.detect.TestContainerAwareDetector
> Tests run: 15, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.277 sec
> Running org.apache.tika.TestParsers
> WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 68229
> WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 44785
> WARN [main] (FontManager.java:312) - Font not found: Times New Roman
> Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.2 sec
> Results :
> Failed tests: testMultipart(org.apache.tika.parser.mail.RFC822ParserTest):
> Exception thrown: TIKA-198: Illegal IOException from
> org.apache.tika.parser.ocr.TesseractOCRParser@2657d8a0
> testInlineSelector(org.apache.tika.parser.pdf.PDFParserTest): expected:<2>
> but was:<0>
> testInlineConfig(org.apache.tika.parser.pdf.PDFParserTest): expected:<2>
> but was:<0>
> testEmbeddedFilesInChildren(org.apache.tika.parser.pdf.PDFParserTest):
> expected:<5> but was:<3>
> Tests in error:
> testUnusualFromAddress(org.apache.tika.parser.mail.RFC822ParserTest):
> TIKA-198: Illegal IOException from
> org.apache.tika.parser.ocr.TesseractOCRParser@1574a7af
> testImages(org.apache.tika.parser.AutoDetectParserTest): TIKA-198: Illegal
> IOException from org.apache.tika.parser.ocr.TesseractOCRParser@107aac4a
> Tests run: 538, Failures: 4, Errors: 2, Skipped: 4
> {noformat}
> I tried installing Tesseract from here:
> http://pkgs.org/centos-6/naulinux-school-x86_64/tesseract-3.01-2.el6.x86_64.rpm.html
> Installing it makes the other tests pass, but the Tesseract tests themselves
> then fail (I think something is wrong with the English config and am looking
> into it).