[Bug 374500] Re: There are disorder codes.

Mike Pontillo Tue, 12 May 2009 19:40:57 -0700

I ran a script that uses "pdftotext" to extract the text of this PDF,
then uses "iconv" to display that text as converted from each possible
source encoding.
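
For reference, here is a minimal sketch of what such a script could look
like. It is not the exact script I ran: the helper name try_encodings,
the output directory layout, and the list of candidate encodings in the
usage comment are all assumptions for illustration. It relies only on
iconv exiting with a non-zero status when the input is not valid in the
given source encoding.

```shell
#!/bin/sh
# Sketch: convert one text file from several candidate encodings to UTF-8.
# try_encodings is a hypothetical helper, not part of pdftotext or iconv.

try_encodings() {
    input=$1      # text file produced by pdftotext
    outdir=$2     # directory that receives one converted file per encoding
    shift 2       # remaining arguments are the candidate encodings

    mkdir -p "$outdir" || return 1

    for enc in "$@"; do
        out="$outdir/$enc.txt"
        # iconv fails if the encoding is unknown or the bytes are invalid in it
        if iconv -f "$enc" -t UTF-8 "$input" > "$out" 2>/dev/null; then
            echo "ok: $enc -> $out"
        else
            # drop the empty or partial output left behind by the failed run
            rm -f "$out"
            echo "failed: $enc"
        fi
    done
}

# Example usage (file names and encoding list are placeholders):
#   pdftotext problem.pdf tmp.txt
#   try_encodings tmp.txt converted WINDOWS-936 GB18030 BIG5 EUC-JP
```

A successful conversion only means the bytes were valid in that
encoding, not that the result is the intended text, so each output file
still has to be checked by someone who can read the language.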


Could you try this:

 - Open a terminal window
 - Change directory to where your problem .pdf file is located
 - Run: pdftotext <name-of-your-problem-pdf-file> tmp.txt
 - Run: iconv --from-code=WINDOWS-936 --to-code=UTF-8 tmp.txt -o text.txt
 - Run: gedit text.txt

Then try to read the text. For me, this produced text that looked like
it could be correct (though I cannot read Chinese, so I can't tell for
sure).

If it works, that may be useful data for an upstream bug report.

Note that even if the PDF reader can understand the encoding of the
text, it may still have trouble with the fonts. With the second .pdf
file you attached, Acrobat Reader (on Linux) looked like it might have
been able to display the text, but it reported that the font was not
found. It's possible that if the encoding is not set properly in the
.pdf file, the font name is not converted properly either, so the
lookup fails even though a matching font might otherwise have been
found.

Note: I also tried FoxitReader under Wine in Jaunty, and it cannot
display the text either.

-- 
There are disorder codes.
https://bugs.launchpad.net/bugs/374500
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.

-- 
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
