Re: Bug fix for non ASCII environments
Thx!

Hoang Nam wrote:

> Sure! Try this link:
> http://marc.theaimsgroup.com/?l=fop-dev&w=2&r=1&s=compression&q=b
>
> ----- Original Message -----
> From: "Markus Bernhardt" <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Sent: Monday, July 23, 2001 4:26 PM
> Subject: Re: Bug fix for non ASCII environments
>
> > Is there also a SEARCHABLE list?

begin:vcard
n:Bernhardt;Markus
tel;cell:0171-5770462
tel;fax:089-420903-20
tel;home:089-6378949
tel;work:089-420903-14
x-mozilla-html:FALSE
url:www.swsgmbh.de
org:Software Service Wulf Schupp GmbH;Spieljoch
adr:;;Spieljochstr. 34;München;;81825;Germany
version:2.1
email;internet:[EMAIL PROTECTED]
title:Entwicklungsleiter
note;quoted-printable:[dF]Quisam=0D=0Awww.discordian-Front.de=0D=0AHail Eris !!!=0D=0AHappy Frag !!!
fn:Markus Bernhardt
end:vcard

To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, email: [EMAIL PROTECTED]
Re: Bug fix for non ASCII environments
Tore Engvig wrote:

> To answer your questions:
> Yes, there are Unicode PDF files.

Good.

> FOP can generate PDF files for non-ISO-8859 encodings if you provide
> fonts that support them.

I think fonts haven't been my problem. The problem is that the PDF
structure is coded as strings in the Java source of FOP.

Let's say you have the PDF keyword 'FlateDecode'. Somewhere in the FOP
Java source there is something like:

    String text = "FlateDecode";

Later this String gets written to the PDF file via text.getBytes() as:

    46 6C 61 74 65 44 65 63 6F 64 65
    F  l  a  t  e  D  e  c  o  d  e

But that works only on an ASCII-based system! On EBCDIC:

    C6 93 81 A3 85 C4 85 83 96 84 85
    F  l  a  t  e  D  e  c  o  d  e

So Acrobat can't read the structure of the document at all.

> However, the generated PDF isn't Unicode but multibyte indexes into the
> font's glyphs. It will be Unicode as soon as the ToUnicode CMap is
> implemented. To generate the multibyte text,
> getBytes("UnicodeBigUnmarked") is used.

Will that fix my problem?
(To be honest with you, I simply don't understand the last two sentences.)

> I think you could (and should) replace getBytes() with
> getBytes("ISO-8859-1") in most places, but you have to be careful in the
> renderers and should test FOP thoroughly when doing so.

We have done that and it works great. Not one problem found (in our kind
of documents).

> Tore

[SNIP]

- markus
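The byte difference described in this message can be reproduced with a short Java sketch. It is illustrative only and not taken from the FOP source: the class name `KeywordBytes` and the `hex` helper are made up for this example, and `IBM037` is assumed as a representative EBCDIC code page (the thread does not say which code page the OS/390 system used), so it is exercised only if the running JRE provides it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class KeywordBytes {
    /** Formats bytes as space-separated upper-case hex pairs. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String keyword = "FlateDecode";

        // Explicit charset: the same bytes on every platform.
        System.out.println("ISO-8859-1: "
                + hex(keyword.getBytes(StandardCharsets.ISO_8859_1)));

        // No argument: the platform default charset decides the bytes.
        System.out.println(Charset.defaultCharset() + ": "
                + hex(keyword.getBytes()));

        // Assumption: IBM037 stands in for the EBCDIC code page in question.
        if (Charset.isSupported("IBM037")) {
            System.out.println("IBM037: "
                    + hex(keyword.getBytes(Charset.forName("IBM037"))));
        }
    }
}
```

On an ASCII-based machine the first two lines normally agree, which is why the problem only shows up once the default charset is an EBCDIC one.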
Re: Bug fix for non ASCII environments
Sure! Try this link:
http://marc.theaimsgroup.com/?l=fop-dev&w=2&r=1&s=compression&q=b

----- Original Message -----
From: "Markus Bernhardt" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Monday, July 23, 2001 4:26 PM
Subject: Re: Bug fix for non ASCII environments

> Is there also a SEARCHABLE list?
Re: Bug fix for non ASCII environments
David BAUDOIN wrote:

> Hi,
>
> http://xml.apache.org/mail/fop-dev/ is a working archive of this mailing
> list.

Is there also a SEARCHABLE list?

> Regards,
> David Baudoin
>
> ----- Original Message -----
> From: Darren Munt <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Sent: Monday, July 23, 2001 2:34 PM
> Subject: RE: Bug fix for non ASCII environments
>
> [SNIP]
Re: Bug fix for non ASCII environments
To answer your questions:

Yes, there are Unicode PDF files.

FOP can generate PDF files for non-ISO-8859 encodings if you provide fonts
that support them. However, the generated PDF isn't Unicode but multibyte
indexes into the font's glyphs. It will be Unicode as soon as the ToUnicode
CMap is implemented. To generate the multibyte text,
getBytes("UnicodeBigUnmarked") is used.

I think you could (and should) replace getBytes() with
getBytes("ISO-8859-1") in most places, but you have to be careful in the
renderers and should test FOP thoroughly when doing so.

Tore

On Mon, 23 Jul 2001, Markus Bernhardt wrote:

> Hi!
>
> Darren Munt wrote:
>
> > Forgive me if I show my ignorance of FOP internals, but isn't that going to
> > convert all text read by getBytes into ISO-8859-1? If that's the case, there
> > might be a few complaints from Unicode users (such as myself).
>
> We have been using FOP since 0.12 and are now trying to port our stuff to
> an EBCDIC-based system, but I'm quite new to the FOP source code.
>
> As I understand you, you have Unicode input files, right?
> I'm interested in the output.
>
> At the moment
>
>     string.getBytes()
>
> is used to convert java.lang.Strings to bytes, which are written through a
> ByteArrayOutputStream into the PDF file. The problem here is that Java uses
> the default encoding to convert its internal 2-byte character
> representation to 1-byte output. On almost every system under the sun the
> standard Java encoding is ISO-8859-1.
>
> AFAIK getBytes() is NEVER Unicode safe.
> As far as I can see, there should be NO change in the behavior of FOP on
> any ASCII-based system, because Java already uses this encoding scheme
> there.
>
> Are there Unicode PDF files?
> Can FOP create Unicode PDF files?
>
> [SNIP]
Re: Bug fix for non ASCII environments
Hi,

http://xml.apache.org/mail/fop-dev/ is a working archive of this mailing
list.

Regards,
David Baudoin

----- Original Message -----
From: Darren Munt <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Monday, July 23, 2001 2:34 PM
Subject: RE: Bug fix for non ASCII environments

> Markus,
>
> You ask two questions there:
> 1. Are there Unicode PDFs? The answer to that one is yes.
> 2. Can FOP create them? This I do not know and will leave to someone else
>    to answer.
>
> The point is that if you force ISO-8859-1 encoding, it won't ever be able
> to. I've been working with FOP for only a week or two and I can't find a
> working archive of this mailing list, so I don't know if this has been
> discussed already.
RE: Bug fix for non ASCII environments
Markus,

You ask two questions there:
1. Are there Unicode PDFs? The answer to that one is yes.
2. Can FOP create them? This I do not know and will leave to someone else
   to answer.

The point is that if you force ISO-8859-1 encoding, it won't ever be able
to. I've been working with FOP for only a week or two and I can't find a
working archive of this mailing list, so I don't know if this has been
discussed already.

All I can say is that for a tool like FOP to reach its full potential, it
must support more than one encoding. I'd have thought this would be
possible by adding an encoding property, to be supplied to the getBytes
method where required.

I have an application for FOP where Unicode support is required in the PDF
output. We want to eventually be able to produce the same reports in many
different languages, including Asian and Arabic languages. As you know,
ISO-8859-1 only supports Latin characters.

-----Original Message-----
From: Markus Bernhardt
To: [EMAIL PROTECTED]
Sent: 23/07/2001 19:59
Subject: Re: Bug fix for non ASCII environments

Hi!

Darren Munt wrote:
> Forgive me if I show my ignorance of FOP internals, but isn't that going to
> convert all text read by getBytes into ISO-8859-1? If that's the case, there
> might be a few complaints from Unicode users (such as myself).

We have been using FOP since 0.12 and are now trying to port our stuff to
an EBCDIC-based system, but I'm quite new to the FOP source code.

As I understand you, you have Unicode input files, right?
I'm interested in the output.

At the moment

    string.getBytes()

is used to convert java.lang.Strings to bytes, which are written through a
ByteArrayOutputStream into the PDF file. The problem here is that Java uses
the default encoding to convert its internal 2-byte character
representation to 1-byte output. On almost every system under the sun the
standard Java encoding is ISO-8859-1.

AFAIK getBytes() is NEVER Unicode safe.
As far as I can see, there should be NO change in the behavior of FOP on
any ASCII-based system, because Java already uses this encoding scheme
there.

Are there Unicode PDF files?
Can FOP create Unicode PDF files?

[SNIP]
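Darren's concern and his suggestion of an encoding property can be illustrated with a small Java sketch. It is not FOP code: `EncodingChoice`, `encode`, and `fitsLatin1` are hypothetical names, and the sketch assumes a JDK that has `java.nio.charset.StandardCharsets`. `UTF_16BE` is used as the big-endian, no-byte-order-mark encoding that the legacy name "UnicodeBigUnmarked" mentioned earlier in the thread is generally understood to refer to. Note that, per Tore's explanation, FOP writes glyph indexes rather than Unicode code points, so this only shows what the encodings themselves do to a string.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class EncodingChoice {
    /** Hypothetical helper: the caller supplies the charset instead of relying on a default. */
    static byte[] encode(String text, Charset charset) {
        return text.getBytes(charset);
    }

    /** True if every character of the text has an ISO-8859-1 representation. */
    static boolean fitsLatin1(String text) {
        return StandardCharsets.ISO_8859_1.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String german = "M\u00FCnchen";    // contains u-umlaut, which Latin-1 covers
        String japanese = "\u65E5\u672C";  // two CJK characters, outside Latin-1

        System.out.println("German fits Latin-1:   " + fitsLatin1(german));
        System.out.println("Japanese fits Latin-1: " + fitsLatin1(japanese));

        // Unmappable characters are silently replaced by '?' (0x3F).
        byte[] lossy = encode(japanese, StandardCharsets.ISO_8859_1);

        // Two bytes per character, big-endian, no byte-order mark.
        byte[] wide = encode(japanese, StandardCharsets.UTF_16BE);

        System.out.println("Latin-1 bytes: " + lossy.length
                + ", UTF-16BE bytes: " + wide.length);
    }
}
```

The design point is that a fixed ISO-8859-1 is safe for the PDF's structural keywords, which are plain ASCII, while document text would need a charset chosen per use, which is roughly what an encoding property would provide.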
Re: Bug fix for non ASCII environments
Hi!

Darren Munt wrote:

> Forgive me if I show my ignorance of FOP internals, but isn't that going to
> convert all text read by getBytes into ISO-8859-1? If that's the case, there
> might be a few complaints from Unicode users (such as myself).

We have been using FOP since 0.12 and are now trying to port our stuff to
an EBCDIC-based system, but I'm quite new to the FOP source code.

As I understand you, you have Unicode input files, right?
I'm interested in the output.

At the moment

    string.getBytes()

is used to convert java.lang.Strings to bytes, which are written through a
ByteArrayOutputStream into the PDF file. The problem here is that Java uses
the default encoding to convert its internal 2-byte character
representation to 1-byte output. On almost every system under the sun the
standard Java encoding is ISO-8859-1.

AFAIK getBytes() is NEVER Unicode safe.
As far as I can see, there should be NO change in the behavior of FOP on
any ASCII-based system, because Java already uses this encoding scheme
there.

Are there Unicode PDF files?
Can FOP create Unicode PDF files?

[SNIP]
RE: Bug fix for non ASCII environments
Forgive me if I show my ignorance of FOP internals, but isn't that going to
convert all text read by getBytes into ISO-8859-1? If that's the case, there
might be a few complaints from Unicode users (such as myself).

-----Original Message-----
From: Markus Bernhardt [mailto:[EMAIL PROTECTED]]
Sent: Monday, 23 July 2001 6:26
To: [EMAIL PROTECTED]
Subject: Bug fix for non ASCII environments

Hi!

Finally we have the current FOP running under OS/390.

We simply replaced all occurrences of

    string.getBytes()

with:

    try {
        string.getBytes("ISO-8859-1");
    } catch (UnsupportedEncodingException e) {}

Is there any chance this fix could go into the official FOP package?
It will take only about 30 minutes to include it.

- markus
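The replacement quoted above discards the result of getBytes and swallows the exception, so as a rough sketch it may help to see the same idea wrapped in a helper that returns the bytes. This is an assumption about how such a fix could be packaged, not the patch that was actually applied to FOP; `Latin1Bytes` and `toLatin1` are hypothetical names.

```java
import java.io.UnsupportedEncodingException;

final class Latin1Bytes {
    /**
     * Encodes the string as ISO-8859-1 regardless of the platform default.
     * Characters outside ISO-8859-1 cannot be represented and are replaced.
     */
    static byte[] toLatin1(String s) {
        try {
            return s.getBytes("ISO-8859-1");
        } catch (UnsupportedEncodingException e) {
            // ISO-8859-1 is one of the charsets every Java platform must
            // support, so reaching this branch indicates a broken runtime.
            throw new IllegalStateException("ISO-8859-1 not supported", e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = toLatin1("FlateDecode");
        System.out.println(bytes.length + " bytes, first byte 0x"
                + Integer.toHexString(bytes[0] & 0xFF));
    }
}
```

Call sites would then read `out.write(Latin1Bytes.toLatin1(text))` instead of `out.write(text.getBytes())`, where `out` stands for whatever output stream is in use. On Java 7 and later, `s.getBytes(java.nio.charset.StandardCharsets.ISO_8859_1)` achieves the same without the checked exception.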