unijis-ucs2-hw-h problems

牛小伟 Sat, 25 Jul 2015 00:46:12 -0700

Dear team,

We are using your product PDFBox 1.6 for text extraction.
When we process documents that use the encoding UniJIS-UCS2-HW-H,
the extracted text is unreadable, like this: (????????????????????????3?????????????).
We have tried several other approaches, but none of them work.
We also have some documents that use the encoding GBK-EUC-H, and PDFBox
handles those perfectly. I also tried PDFBox 1.8, and it did not work either.
I checked the charsets included in PDFBox, and both encodings are there.
I don't understand why one works and the other does not.
We would appreciate your support with this. Thank you very much.
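
For context, a minimal sketch of what the encoding name implies. UniJIS-UCS2-HW-H is one of the predefined Unicode (UCS-2) CMaps for Japanese, so the character codes in such strings are two-byte big-endian UCS-2 values. The sketch below uses only the JDK, not PDFBox, and the class name, method name, and sample bytes are hypothetical. It is not established that this explains the output above; it only shows what the codes should decode to.

```java
import java.nio.charset.StandardCharsets;

public class Ucs2CodeDemo {
    // Interprets each pair of bytes as one big-endian UCS-2 code unit.
    // (UTF-16BE matches UCS-2 for all characters outside the surrogate range.)
    static String decodeUcs2(byte[] codes) {
        return new String(codes, StandardCharsets.UTF_16BE);
    }

    public static void main(String[] args) {
        // Hypothetical string bytes for U+65E5 U+672C U+8A9E,
        // three kanji meaning "Japanese language".
        byte[] codes = {
            (byte) 0x65, (byte) 0xE5,
            (byte) 0x67, (byte) 0x2C,
            (byte) 0x8A, (byte) 0x9E
        };
        System.out.println(decodeUcs2(codes));
    }
}
```

Whether the characters display correctly when printed depends on the console encoding, so the result is best checked by comparing code points.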



Best regards.


A snapshot of the document's encoding:
