Encoding problem with one specific letter and postscript
Hello,

I am using FOP 1.1 to generate PostScript files with embedded fonts. Later I use ps2pdf to convert the PostScript files to PDFs. The text includes Lithuanian letters. However, after converting them to PDF, two specific letters are displayed as squares; all the other Lithuanian letters are displayed correctly. The problematic letters are the upper- and lowercase Ė and ė (E with dot above, U+0116 and U+0117 in Unicode). I can copy all the letters from the PDF, including the two problematic ones: when I copy the square and paste it somewhere, it displays the letter correctly.

I am using the standard Arial font from Windows fonts (arial.ttf). This is my fop-config.xml:

  <configuration>
    <renderers>
      <renderer mime="application/postscript">
        <auto-rotate-landscape>true</auto-rotate-landscape>
        <fonts>
          <font embed-url="./arial.ttf" encoding-mode="single-byte">
            <font-triplet name="Arial" style="normal" weight="normal"/>
          </font>
        </fonts>
      </renderer>
    </renderers>
  </configuration>

When leaving out encoding-mode="single-byte", the letters display correctly, but when copying from the PDF, I get gibberish. When generating straight to PDF with FOP, everything is displayed correctly and copying is also possible.

I have tried other PS-to-PDF converters and they give the same result. Using a metrics file did not help.

The problem can be reproduced with the XML and XSLT in the FOP quick start guide (https://xmlgraphics.apache.org/fop/quickstartguide.html) with these modifications:

- Set the name in name.xml to ABC14pąęčėųūĘĖŲČĄ.
- Add the attribute font-family="Arial" to the fo:block in name2fo.xsl.
- Use the above fop-config file and include the standard Arial font in FOP's directory.
- Run: fop -xml name.xml -xsl name2fo.xsl -ps name.ps -c fop-config.xml

Can anyone suggest what the issue could be, or how I should go about debugging this?

Thank you.
Lembit
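To pin down exactly which characters are involved, the code points in the test string can be listed with a few lines of Python. This is only a small sketch using the standard unicodedata module; the helper name describe_non_ascii is made up for illustration and has nothing to do with FOP itself.

```python
import unicodedata

# Test string from the reproduction steps above.
TEST_STRING = "ABC14pąęčėųūĘĖŲČĄ"


def describe_non_ascii(text):
    """Return (character, code point, Unicode name) for every non-ASCII character."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch))
        for ch in text
        if ord(ch) > 0x7F
    ]


if __name__ == "__main__":
    # Print only ASCII columns so the output is safe on any console encoding.
    for _ch, code_point, name in describe_non_ascii(TEST_STRING):
        print(code_point, name)
```

The two letters that render as squares are the ones named LATIN CAPITAL LETTER E WITH DOT ABOVE and LATIN SMALL LETTER E WITH DOT ABOVE; having the exact code points at hand makes it easier to search the generated PostScript for how those two characters were encoded.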
Re: Encoding problem with one specific letter and postscript
Why aren't you generating PDF directly from FOP?

On Sun, Oct 12, 2014 at 1:16 PM, Lembit Gerz lembit.g...@nortal.com wrote: [...]
RE: Encoding problem with one specific letter and postscript
To add watermarks or other transformations to the document. The current setup is the following: generate PostScript, apply an awk script to the PS that, for example, adds a watermark, then convert the PS to PDF. I know it might be a hacky solution, but unfortunately changing this setup is currently out of the question.

From: Glenn Adams [mailto:gl...@skynav.com]
Sent: 12 October 2014, 22:31
To: FOP Users
Subject: Re: Encoding problem with one specific letter and postscript
[...]
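The three-step setup described above (FOP to PostScript, an awk pass, then ps2pdf) can be sketched as a small Python driver that keeps every intermediate file, which makes it possible to check at which step the squares first appear. This is a sketch under assumptions: the awk script name and the "-marked" file suffix are hypothetical, and only the fop and ps2pdf invocations already shown in this thread are used.

```python
import subprocess


def build_steps(xml, xsl, config, awk_script, stem):
    """Return (argv, stdout_path) pairs for the PS -> awk -> PDF pipeline.

    stdout_path is None when the tool writes its own output file.
    The awk script and the '-marked' suffix are illustrative assumptions.
    """
    ps = f"{stem}.ps"
    marked = f"{stem}-marked.ps"
    pdf = f"{stem}.pdf"
    return [
        (["fop", "-xml", xml, "-xsl", xsl, "-ps", ps, "-c", config], None),
        (["awk", "-f", awk_script, ps], marked),  # awk writes to stdout
        (["ps2pdf", marked, pdf], None),
    ]


def run_steps(steps):
    """Run each step in order, stopping at the first failure."""
    for argv, stdout_path in steps:
        if stdout_path is None:
            subprocess.run(argv, check=True)
        else:
            with open(stdout_path, "wb") as out:
                subprocess.run(argv, check=True, stdout=out)
```

Calling run_steps(build_steps("name.xml", "name2fo.xsl", "fop-config.xml", "watermark.awk", "name")) would leave name.ps, name-marked.ps and name.pdf side by side. Running ps2pdf on the untouched name.ps as well shows whether the awk pass plays any part in the problem.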
Re: Encoding problem with one specific letter and postscript
If you try the same data with FOP generating PDF directly, does the problem occur?

On Sun, Oct 12, 2014 at 1:42 PM, Lembit Gerz lembit.g...@nortal.com wrote: [...]
RE: Encoding problem with one specific letter and postscript
No, if I generate the PDF directly using the same data and font file, then all the letters are displayed correctly and copying from the PDF is also possible.

From: Glenn Adams [mailto:gl...@skynav.com]
Sent: 12 October 2014, 23:03
To: FOP Users
Subject: Re: Encoding problem with one specific letter and postscript
[...]
Re: Encoding problem with one specific letter and postscript
You should provide the following:

- a minimal input FO file (the XML/XSLT input files are irrelevant)
- the output PS file you obtain (when producing PS directly)
- the output PDF file you obtain (when producing PDF directly)

On Sun, Oct 12, 2014 at 2:09 PM, Lembit Gerz lembit.g...@nortal.com wrote: [...]
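One way to prepare the requested files is sketched below: a short Python script that builds a single-block FO document containing the test string and lists the two FOP runs (PS and PDF) against it. The page master name, page size, margins, and the file names minimal.fo, minimal.ps and minimal.pdf are assumptions chosen for illustration; only the test string, the Arial font family and the fop options -fo, -ps, -pdf and -c come from FOP usage as shown in this thread or its command line.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Test string from the original report.
TEST_STRING = "ABC14pąęčėųūĘĖŲČĄ"

# Page geometry and master name are arbitrary illustrative choices.
FO_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <fo:simple-page-master master-name="A4" page-height="29.7cm"
                           page-width="21cm" margin="2cm">
      <fo:region-body/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <fo:page-sequence master-reference="A4">
    <fo:flow flow-name="xsl-region-body">
      <fo:block font-family="Arial">{text}</fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
"""


def minimal_fo(text=TEST_STRING):
    """Return a one-block FO document; raises if the result is not well-formed XML."""
    fo = FO_TEMPLATE.format(text=escape(text))
    ET.fromstring(fo.encode("utf-8"))  # well-formedness check only
    return fo


def fop_commands(fo_path="minimal.fo", config="fop-config.xml"):
    """Return the two FOP invocations: one producing PS, one producing PDF."""
    return [
        ["fop", "-c", config, "-fo", fo_path, "-ps", "minimal.ps"],
        ["fop", "-c", config, "-fo", fo_path, "-pdf", "minimal.pdf"],
    ]


def write_minimal_fo(path="minimal.fo"):
    """Write the FO document as UTF-8 so the Lithuanian letters survive intact."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(minimal_fo())
```

Note that the fop-config.xml posted earlier registers Arial only for the application/postscript renderer, so the PDF run would probably need a matching font entry under an application/pdf renderer for the comparison to use the same font.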