testextract failure on Linux and Mac OS X
-----------------------------------------

                 Key: PDFBOX-568
                 URL: https://issues.apache.org/jira/browse/PDFBOX-568
             Project: PDFBox
          Issue Type: Bug
          Components: Text extraction
    Affects Versions: 0.8.0-incubator
            Reporter: Jukka Zitting


As discussed on the mailing list, the extraction test case seems to fail on 
non-Windows platforms.

The troublesome test file is sample_fonts_solidconvertor.pdf, and the 
textextract.log file says the following (^@ is U+0000 and � is U+FFFD):

Lines differ at index expected:46-253 actual:46-65533
FAILURE: Line mismatch for file sample_fonts_solidconvertor.pdf at expected 
line: 8 at actual line: 8
  expected line was: "^...@v^@e...@r^@d...@a^@n...@a^@:^@ ^...@t^@o...@t^@o^@ 
^...@j^@e^@ ^...@p^@o...@k^@u...@s^@n...@ý^@ ^...@t^@e...@x^@t^@ ^...@s^@ ^A"
  actual line was:   "^...@v^@e...@r^@d...@a^@n...@a^@:^@ ^...@t^@o...@t^@o^@ 
^...@j^@e^@ ^...@p^@o...@k^@u...@s^@n...@�^@ ^...@t^@e...@x^@t^@ ^...@s^@ ^A"
Lines differ at index expected:4-253 actual:4-65533
FAILURE: Line mismatch for file sample_fonts_solidconvertor.pdf at expected 
line: 10 at actual line: 10
  expected line was: "^ay^...@ý^@�...@í^@é"
  actual line was:   "^ay^...@�^@�...@�^@�"
Lines differ at index expected:52-253 actual:52-65533
FAILURE: Line mismatch for file sample_fonts_solidconvertor.pdf at expected 
line: 11 at actual line: 11
  expected line was: "^...@s^@a...@n^@s^@ ^...@s^@e...@r^@i...@f^@:^@ 
^...@t^@o...@t^@o^@ ^...@j^@e^@ ^...@p^@o...@k^@u...@s^@n...@ý^@ 
^...@t^@e...@x^@t^@ ^...@s^@ ^A"
  actual line was:   "^...@s^@a...@n^@s^@ ^...@s^@e...@r^@i...@f^@:^@ 
^...@t^@o...@t^@o^@ ^...@j^@e^@ ^...@p^@o...@k^@u...@s^@n...@�^@ 
^...@t^@e...@x^@t^@ ^...@s^@ ^A"
Lines differ at index expected:4-253 actual:4-65533
FAILURE: Line mismatch for file sample_fonts_solidconvertor.pdf at expected 
line: 13 at actual line: 13
  expected line was: "^ay^...@ý^@�...@í^@é"
  actual line was:   "^ay^...@�^@�...@�^@�"
Preparing to parse sample_fonts_solidconvertor.pdf for sorted test
Lines differ at index expected:46-253 actual:46-65533
FAILURE: Line mismatch for file sample_fonts_solidconvertor.pdf at expected 
line: 8 at actual line: 8
  expected line was: "^...@v^@e...@r^@d...@a^@n...@a^@:^@ ^...@t^@o...@t^@o^@ 
^...@j^@e^@ ^...@p^@o...@k^@u...@s^@n...@ý^@ ^...@t^@e...@x^@t^@ ^...@s^@ ^A"
  actual line was:   "^...@v^@e...@r^@d...@a^@n...@a^@:^@ ^...@t^@o...@t^@o^@ 
^...@j^@e^@ ^...@p^@o...@k^@u...@s^@n...@�^@ ^...@t^@e...@x^@t^@ ^...@s^@ ^A"
Lines differ at index expected:0-253 actual:0-65533
FAILURE: Line mismatch for file sample_fonts_solidconvertor.pdf at expected 
line: 10 at actual line: 10
  expected line was: "^...@ý^@�...@í^@é"
  actual line was:   "^...@�^@�...@�^@�"
Lines differ at index expected:52-253 actual:52-65533
FAILURE: Line mismatch for file sample_fonts_solidconvertor.pdf at expected 
line: 11 at actual line: 11
  expected line was: "^...@s^@a...@n^@s^@ ^...@s^@e...@r^@i...@f^@:^@ 
^...@t^@o...@t^@o^@ ^...@j^@e^@ ^...@p^@o...@k^@u...@s^@n...@ý^@ 
^...@t^@e...@x^@t^@ ^...@s^@ ^A"
  actual line was:   "^...@s^@a...@n^@s^@ ^...@s^@e...@r^@i...@f^@:^@ 
^...@t^@o...@t^@o^@ ^...@j^@e^@ ^...@p^@o...@k^@u...@s^@n...@�^@ 
^...@t^@e...@x^@t^@ ^...@s^@ ^A"
Lines differ at index expected:4-253 actual:4-65533
FAILURE: Line mismatch for file sample_fonts_solidconvertor.pdf at expected 
line: 13 at actual line: 13
  expected line was: "^a~^...@ý^@�...@í^@é"
  actual line was:   "^a~^...@�^@�...@�^@�"
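
In the "Lines differ" messages above, each pair appears to be a character 
index followed by a character code: 253 is U+00FD (ý) and 65533 is U+FFFD, 
the Unicode replacement character. One way such a mismatch can arise, which 
this report does not confirm as the cause, is that the same bytes get decoded 
with different charsets on different platforms. The sketch below illustrates 
that with the single byte 0xFD. The class DecodeMismatch and its methods are 
hypothetical names made up for this illustration and are not part of PDFBox 
or its test suite.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only; not PDFBox code.
final class DecodeMismatch {

    // Decodes raw bytes with an explicit charset instead of the platform default.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    // Returns the index of the first differing char, or -1 if the strings are equal.
    static int firstDifference(String expected, String actual) {
        int common = Math.min(expected.length(), actual.length());
        for (int i = 0; i < common; i++) {
            if (expected.charAt(i) != actual.charAt(i)) {
                return i;
            }
        }
        return expected.length() == actual.length() ? -1 : common;
    }

    // Formats a message in the same shape as the log lines quoted in this report.
    static String describe(String expected, String actual) {
        int i = firstDifference(expected, actual);
        if (i < 0) {
            return "Lines match";
        }
        if (i >= expected.length() || i >= actual.length()) {
            return "Lines differ in length at index " + i;
        }
        return "Lines differ at index expected:" + i + "-" + (int) expected.charAt(i)
                + " actual:" + i + "-" + (int) actual.charAt(i);
    }

    public static void main(String[] args) {
        // 0xFD is 'ý' in ISO-8859-1, but it is not a valid byte in UTF-8.
        byte[] raw = {'n', (byte) 0xFD, ' '};
        String latin1 = decode(raw, StandardCharsets.ISO_8859_1);
        String utf8 = decode(raw, StandardCharsets.UTF_8);
        System.out.println(describe(latin1, utf8));
    }
}
```

If the real cause is of this kind, passing an explicit charset wherever the 
test reads or writes text, rather than relying on the platform default, 
would make the comparison behave the same on every platform.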


