Tim,

 

Are you seeing this?

 

Results :

 

Failed tests: 

  PDFParserTest.testEmbeddedDocsWithOCROnly:1250->TikaTest.assertContains:103 pdf_haystack not found in:

<html xmlns="http://www.w3.org/1999/xhtml">

<head>

<meta name="date" content="2013-05-23T18:30:00Z" />

<meta name="cp:revision" content="1" />

<meta name="extended-properties:AppVersion" content="14.0000" />

<meta name="meta:paragraph-count" content="1" />

<meta name="meta:word-count" content="16" />

<meta name="extended-properties:Company" content="" />

<meta name="Word-Count" content="16" />

<meta name="dcterms:created" content="2013-05-23T18:30:00Z" />

<meta name="meta:line-count" content="1" />

<meta name="Last-Modified" content="2013-05-23T18:30:00Z" />

<meta name="dcterms:modified" content="2013-05-23T18:30:00Z" />

<meta name="Last-Save-Date" content="2013-05-23T18:30:00Z" />

<meta name="meta:character-count" content="96" />

<meta name="Template" content="Normal.dotm" />

<meta name="Line-Count" content="1" />

<meta name="Paragraph-Count" content="1" />

<meta name="meta:save-date" content="2013-05-23T18:30:00Z" />

<meta name="meta:character-count-with-spaces" content="111" />

<meta name="Application-Name" content="Microsoft Office Word" />

<meta name="modified" content="2013-05-23T18:30:00Z" />

<meta name="Content-Type" content="application/vnd.openxmlformats-officedocument.wordprocessingml.document" />

<meta name="X-Parsed-By" content="org.apache.tika.parser.DefaultParser" />

<meta name="X-Parsed-By" content="org.apache.tika.parser.microsoft.ooxml.OOXMLParser" />

<meta name="meta:creation-date" content="2013-05-23T18:30:00Z" />

<meta name="extended-properties:Application" content="Microsoft Office Word" />

<meta name="Creation-Date" content="2013-05-23T18:30:00Z" />

<meta name="xmpTPg:NPages" content="1" />

<meta name="Character-Count-With-Spaces" content="111" />

<meta name="Character Count" content="96" />

<meta name="Page-Count" content="1" />

<meta name="Revision-Number" content="1" />

<meta name="Application-Version" content="14.0000" />

<meta name="extended-properties:Template" content="Normal.dotm" />

<meta name="publisher" content="" />

<meta name="meta:page-count" content="1" />

<meta name="dc:publisher" content="" />

<title></title>

</head>

<body><p class="header" />

<p class="header" />

<p class="header" />

<p>Outer_haystack</p>

<p>Outer_haystack</p>

<p><div class="embedded" id="rId8" />

</p>

<p>Outer_haystack</p>

<p />

<p>Outer_haystack</p>

<p />

<p>Outer_haystack</p>

<p><a name="_GoBack" /></p>

<p class="footer" />

<p class="footer" />

<p class="footer" />

<p>attached.pdf</p>

<div class="page"><div class="ocr">dehayslack dehaystack dehayslack dehaystack dehaystack dehaystack pd'

 

</div>

</div>

<p class="header" />

 

<p class="header" />

 

<p class="header" />

 

<p>Haystack</p>

 

<p>Needle</p>

 

<p>Haystack</p>

 

<p><a name="_GoBack" /></p>

 

<p class="footer" />

 

<p class="footer" />

 

<p class="footer" />

 

<div source="attachment" class="embedded" id="Test.docx" />

</body></html>

 

Tests run: 1009, Failures: 1, Errors: 0, Skipped: 30

 

[INFO] ------------------------------------------------------------------------

[INFO] Reactor Summary:

[INFO] 

[INFO] Apache Tika parent ................................. SUCCESS [  1.565 s]

[INFO] Apache Tika core ................................... SUCCESS [ 32.977 s]

[INFO] Apache Tika parsers ................................ FAILURE [05:52 min]

[INFO] Apache Tika XMP .................................... SKIPPED

[INFO] Apache Tika serialization .......................... SKIPPED

[INFO] Apache Tika batch .................................. SKIPPED

[INFO] Apache Tika language detection ..................... SKIPPED

[INFO] Apache Tika application ............................ SKIPPED

[INFO] Apache Tika OSGi bundle ............................ SKIPPED

[INFO] Apache Tika translate .............................. SKIPPED

[INFO] Apache Tika server ................................. SKIPPED

[INFO] Apache Tika examples ............................... SKIPPED

[INFO] Apache Tika Java-7 Components ...................... SKIPPED

[INFO] Apache Tika eval ................................... SKIPPED

[INFO] Apache Tika Deep Learning (powered by DL4J) ........ SKIPPED

[INFO] Apache Tika Natural Language Processing ............ SKIPPED

[INFO] Apache Tika ........................................ SKIPPED

[INFO] ------------------------------------------------------------------------

[INFO] BUILD FAILURE

[INFO] ------------------------------------------------------------------------

[INFO] Total time: 06:27 min

[INFO] Finished at: 2018-05-24T09:04:59-07:00

[INFO] Final Memory: 72M/1029M

[INFO] ------------------------------------------------------------------------

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.18.1:test (default-test) on project tika-parsers: There are test failures.

[ERROR] 

[ERROR] Please refer to /Users/mattmann/tmp/tika2.0.0/tika-parsers/target/surefire-reports for the individual test results.

[ERROR] -> [Help 1]

[ERROR] 

[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.

[ERROR] Re-run Maven using the -X switch to enable full debug logging.

[ERROR] 

[ERROR] For more information about the errors and possible solutions, please read the following articles:

[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException

[ERROR] 

[ERROR] After correcting the problems, you can resume the build with the command

[ERROR]   mvn <goals> -rf :tika-parsers

 

Keeps failing for me.

nonas:tika2.0.0 mattmann$ java -version

java version "1.8.0_144"

Java(TM) SE Runtime Environment (build 1.8.0_144-b01)

Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode)

nonas:tika2.0.0 mattmann$ 

 

Any ideas?

 

Cheers,

Chris

 
