Hi,

For some time (in 2.x) we have had this test commented out, as it was
waiting for TIKA-748 to be resolved. That issue has now been resolved,
but I'm getting some confusing output when trying to resurrect the
test!

So at line 105 we do

String text = parse.getText();
assertEquals("The quick brown fox jumps over the lazy dog", text.trim());

But I wanted to implement the suggested test for the title as well, e.g.

String title = parse.getTitle();
String text = parse.getText();
assertEquals("test rft document", title);
assertEquals("The quick brown fox jumps over the lazy dog", text.trim());

The test fails on the 2nd assertion with the following output:

Testcase: testIt took 5.668 sec
        FAILED
null expected:<[The quick brown fox jumps over the lazy dog]> but was:<[test rft document]>
junit.framework.ComparisonFailure: null expected:<[The quick brown fox jumps over the lazy dog]> but was:<[test rft document]>
        at org.apache.nutch.parse.tika.TestRTFParser.testIt(TestRTFParser.java:)

So this looks like parse.getText() returns the same (in this instance)
as parse.getTitle()... which smells like rotting herring to me.

Any immediate thoughts on whether this is a known problem in the Tika RTF
parser, in parse-tika's DomContentUtils class, or somewhere in between?
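
One way I could narrow this down, as a rough sketch only: dump the title
and body text of the DOM separately and see whether the body text is in
there at all. If the body is already missing from the DOM, the problem
would sit on the Tika side; if it is present, the text extraction in
between becomes the suspect. The class below uses only the JDK's DOM API.
DomTextDump, parse() and textOf() are hypothetical names of my own, and
the XHTML string is a made-up stand-in for whatever DOM parse-tika really
receives, so this is not the actual Nutch or Tika code path.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DomTextDump {

    /** Parses an XHTML string into a DOM (a stand-in for the parser's real output). */
    static Document parse(String xhtml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample document", e);
        }
    }

    /** Trimmed text of the first element with the given tag name, or "" if there is none. */
    static String textOf(Document doc, String tag) {
        NodeList nodes = doc.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        // Assumed sample input: title in <head>, sentence in <body>.
        String xhtml = "<html><head><title>test rft document</title></head>"
                + "<body><p>The quick brown fox jumps over the lazy dog</p></body></html>";
        Document doc = parse(xhtml);
        // Print the two pieces separately so a missing body is obvious.
        System.out.println("title: [" + textOf(doc, "title") + "]");
        System.out.println("body:  [" + textOf(doc, "body") + "]");
    }
}
```

If the body line comes back empty for the real RTF document, that would
point at the DOM itself rather than at the code that walks it.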

Thank you

Lewis

