Encoding troubles

Kévin Sailly Sat, 10 Mar 2012 00:21:41 -0800

Hello,

I am building a web app that receives text from a rich text component
(TinyMCE), and I plan to build a PDF from this text using PDFBox (pasting
directly from a Word document is allowed).


As my server is running on Debian, everything is encoded and decoded in
UTF-8 (server, JVM and the database storing the text). However, as you can
tell from my poor English, I am French, and so is the text. Eventually, I
would like to handle any language.

I first unescape the XML rich text with StringEscapeUtils from Apache, but
characters like the right quote are then not rendered correctly in the
PDFs.

Re-encoding the text in UTF-16 looked like a solution to me, but I am
running into this bug:
https://issues.apache.org/jira/browse/PDFBOX-1242

As nobody seems to be fixing this bug, is there any workaround I could use?
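
To make the question concrete, here is a rough sketch of the kind of
stopgap that might apply. It rests on an assumption that is not confirmed
anywhere above: that only Latin-1 (ISO-8859-1) characters survive when the
text is drawn, so typographic characters pasted from Word have to be
swapped for plain equivalents first. The class name Latin1Fallback, the
method toLatin1Safe and the replacement table are hypothetical and use only
the JDK, not any PDFBox API.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of PDFBox or Commons Lang.
final class Latin1Fallback {

    // Illustrative, incomplete table of typographic characters that Word
    // tends to insert, mapped to plain equivalents.
    private static final Map<Character, String> REPLACEMENTS = new HashMap<>();
    static {
        REPLACEMENTS.put('\u2018', "'");   // left single quote
        REPLACEMENTS.put('\u2019', "'");   // right single quote
        REPLACEMENTS.put('\u201C', "\"");  // left double quote
        REPLACEMENTS.put('\u201D', "\"");  // right double quote
        REPLACEMENTS.put('\u2013', "-");   // en dash
        REPLACEMENTS.put('\u2014', "-");   // em dash
        REPLACEMENTS.put('\u2026', "..."); // horizontal ellipsis
        REPLACEMENTS.put('\u0153', "oe");  // French ligature, lower case
        REPLACEMENTS.put('\u0152', "OE");  // French ligature, upper case
    }

    // Keeps every character that ISO-8859-1 can encode, substitutes the
    // known typographic ones, and falls back to "?" for anything else.
    static String toLatin1Safe(String text) {
        CharsetEncoder latin1 = StandardCharsets.ISO_8859_1.newEncoder();
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (latin1.canEncode(c)) {
                out.append(c);
            } else {
                String replacement = REPLACEMENTS.get(c);
                out.append(replacement != null ? replacement : "?");
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Sanitize each command-line argument and print the result.
        for (String arg : args) {
            System.out.println(toLatin1Safe(arg));
        }
    }
}
```

The sanitized string would then be passed to the PDF drawing call in place
of the raw text. This loses information and does nothing for non-Latin
scripts, so it would only be a temporary measure until real Unicode output
works.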

Thanks,
Kévin
