Re: XMLParser error with unicode characters in XML file.
Manoj,

For the XML declaration, the value of the encoding attribute is defined by the IANA character set registry. The uppercase form, 'UTF-8', is the preferred name for that character set. This is the first time I have seen a case where choosing the preferred form actually mattered, but I will make a note of it for my stylesheets. You can find a list of character set names online (below).

Mike Ferrando
Washington, DC

IANA character set spec
http://www.iana.org/assignments/character-sets

W3C XML 1.0 spec
http://www.w3.org/TR/2004/REC-xml-20040204/#sec-existing-stds
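One way to avoid depending on how a label happens to be spelled is to resolve it to the canonical charset name before writing it into the XML declaration. The following is only a sketch, assuming a JDK with java.nio.charset (1.4 or later); the class and method names are hypothetical and do not come from this thread.

```java
import java.nio.charset.Charset;

public class EncodingLabel {

    // Resolve any label the JVM recognizes (for example "utf8" or "utf-8")
    // to the charset's canonical name.
    static String canonicalName(String label) {
        return Charset.forName(label).name();
    }

    // Build an XML declaration that always carries the canonical name.
    // Hypothetical helper, for illustration only.
    static String xmlDeclaration(String label) {
        return "<?xml version=\"1.0\" encoding=\"" + canonicalName(label) + "\"?>";
    }

    public static void main(String[] args) {
        System.out.println(xmlDeclaration("utf8"));
    }
}
```

This only normalizes the label on the Java side. Whether a given XML parser accepts a non-canonical spelling in the declaration is up to that parser, so writing the canonical name is the conservative choice.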
Re: XMLParser error with unicode characters in XML file.
Thanks for the reply. However, this problem has nothing to do with data transfer, as I am creating the XML file on the fly. So if it is installed on Unix, the file gets created on Unix, and the same goes for Windows.

My guess turned out to be right. On Unix you need to specify the encoding as 'UTF-8' (case sensitive), while on Windows 'utf8' (and, for that matter, 'UTF-8') works fine.

FYI,
Manoj

J.Pietschmann wrote on 07/08/2004 03:59 AM:
> The most likely problem is that the files got corrupted during transfer.
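A quick way to check this kind of fix on any platform is a round trip: write the document with an explicit UTF-8 writer and a matching 'UTF-8' declaration, then parse it back. This is a minimal sketch using the JDK's built-in JAXP parser rather than the WebLogic Xerces build discussed in this thread; the class name, method names, and element name are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class Utf8RoundTrip {

    // Write a tiny XML document as UTF-8 bytes, with a declaration that
    // names the same encoding the writer actually uses.
    // Assumes 'text' contains no markup characters (no escaping is done).
    static byte[] writeXml(String text) throws Exception {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            w.write("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
            w.write("<label>" + text + "</label>");
        }
        return out.toByteArray();
    }

    // Parse the bytes back and return the text of the root element.
    static String readLabel(byte[] xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Unicode escapes keep this source file ASCII-only.
        String original = "N\u00famero de identificaci\u00f3n";
        String parsed = readLabel(writeXml(original));
        System.out.println("round trip ok: " + original.equals(parsed));
    }
}
```

Passing a Charset constant instead of a string label removes the spelling question from the writer side entirely; only the declaration text remains to be spelled correctly.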
RE: XMLParser error with unicode characters in XML file.
My boss just called me and informed me that on UNIX (where we have our jars and where we run our application server) it is getting an error saying "encoding error utf8". It worked on Windows 2000 Pro, which I am using. Does the utf8 string need to be different on Unix?

Paging Vinuta.

Thanks
Manoj

----- Forwarded by Manoj Nair/LA/SPE on 07/07/2004 06:08 PM -----

To: [EMAIL PROTECTED]
From: [EMAIL PROTECTED]
Date: 07/07/2004 11:01 AM
Subject: RE: XMLParser error with unicode characters in XML file.

Thanks Vinuta! That worked fine. I have tested with Spanish, German, Italian, French and Portuguese. I still have to test Japanese (which might be a pain in the neck)... Will keep you posted... Thanks again to all who replied.

Manoj

From: Vinuta Nagaraddi [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
cc:
Date: 07/06/2004 02:31 PM
Subject: RE: XMLParser error with unicode characters in XML file.
Please respond to fop-user

I had a similar problem. I am writing to a file using the following code:

    File outputDir = new File(outputPath);
    outputDir.mkdir();
    String foFile = outputPath + "/image.fo";
    log.debug("Writing to file " + foFile);
    FileOutputStream fileoutstream = new FileOutputStream(foFile);
    Writer writer = new OutputStreamWriter(fileoutstream, "utf8");
    writer.write(foDoc.toString().trim());
    writer.close();

The important part of the code is writing to the file using utf8 encoding.

Vinuta Nagaraddi

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Tuesday, July 06, 2004 5:28 PM
To: [EMAIL PROTECTED]
Cc: fop-dev@xml.apache.org
Subject: XMLParser error with unicode characters in XML file.

I am getting an XML parsing error from weblogic.apache.xerces when I parse an XML document which contains accented characters. This is what I am doing:

1) Some database columns have accented data for Spanish, Japanese, etc., like "Número de identificação:" and "número de identificación".
2) I am reading this data and creating an XML file using some processing, and then writing the file to disk using the weblogic.xml.stream.XMLOutputStream flush() method.
3) Then I am using FOP to render this XML as PDF. In the rendering process the weblogic.apache.xerces.XMLparser gets called to parse the XML. Here the parser throws an org.xml.sax.SAXParseException (An invalid XML character (Unicode: 0xfa) was found in the element content of the document).

I was under the impression that the XML parser should take care of the accented characters. When I open the XML file which I created in XML SPY, I see "box" characters like "cliente n de identificaci". Please let me know how I should handle this in my code.

Thanks
Manoj
Re: XMLParser error with unicode characters in XML file.
[EMAIL PROTECTED] wrote:
> My boss just called me and informed me that on UNIX (where we have
> our jars and where we run our application server) it is getting an
> error saying "encoding error utf8". It worked on Windows 2000 Pro,
> which I am using. Does the utf8 string need to be different on Unix?

The most likely problem is that the files got corrupted during transfer. You should definitely get some education about encoding issues rather than stumbling along blindly. Start here for the XML-related problems:
http://skew.org/xml/tutorial/
There are some websites dealing with encoding issues in Java and with transferring files across machines as well.

J.Pietschmann
RE: XMLParser error with unicode characters in XML file.
Thanks Vinuta! That worked fine. I have tested with Spanish, German, Italian, French and Portuguese. I still have to test Japanese (which might be a pain in the neck)... Will keep you posted... Thanks again to all who replied.

Manoj

From: Vinuta Nagaraddi [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
cc:
Date: 07/06/2004 02:31 PM
Subject: RE: XMLParser error with unicode characters in XML file.
Please respond to fop-user

I had a similar problem. I am writing to a file using the following code:

    File outputDir = new File(outputPath);
    outputDir.mkdir();
    String foFile = outputPath + "/image.fo";
    log.debug("Writing to file " + foFile);
    FileOutputStream fileoutstream = new FileOutputStream(foFile);
    Writer writer = new OutputStreamWriter(fileoutstream, "utf8");
    writer.write(foDoc.toString().trim());
    writer.close();

The important part of the code is writing to the file using utf8 encoding.

Vinuta Nagaraddi

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Tuesday, July 06, 2004 5:28 PM
To: [EMAIL PROTECTED]
Cc: fop-dev@xml.apache.org
Subject: XMLParser error with unicode characters in XML file.

I am getting an XML parsing error from weblogic.apache.xerces when I parse an XML document which contains accented characters. This is what I am doing:

1) Some database columns have accented data for Spanish, Japanese, etc., like "Número de identificação:" and "número de identificación".
2) I am reading this data and creating an XML file using some processing, and then writing the file to disk using the weblogic.xml.stream.XMLOutputStream flush() method.
3) Then I am using FOP to render this XML as PDF. In the rendering process the weblogic.apache.xerces.XMLparser gets called to parse the XML. Here the parser throws an org.xml.sax.SAXParseException (An invalid XML character (Unicode: 0xfa) was found in the element content of the document).

I was under the impression that the XML parser should take care of the accented characters. When I open the XML file which I created in XML SPY, I see "box" characters like "cliente n de identificaci". Please let me know how I should handle this in my code.

Thanks
Manoj
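The 0xfa in the error message is the code point of "ú", and it is also the single byte that "ú" becomes in ISO-8859-1. One plausible reading, which is an assumption and not something confirmed in this thread, is that the file was written in a single-byte platform encoding and then read as UTF-8. The sketch below only illustrates that mismatch with plain JDK calls; the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;

public class EncodingMismatch {

    // Strictly decode the bytes as UTF-8; a decoder obtained from
    // newDecoder() reports malformed input instead of replacing it.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder().decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Render bytes as space-separated lowercase hex for display.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String u = "\u00fa"; // the letter u with acute accent
        byte[] latin1 = u.getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = u.getBytes(StandardCharsets.UTF_8);
        System.out.println("ISO-8859-1: " + hex(latin1)
                + "  valid UTF-8? " + isValidUtf8(latin1));
        System.out.println("UTF-8:      " + hex(utf8)
                + "  valid UTF-8? " + isValidUtf8(utf8));
    }
}
```

If the bytes on disk look like the first case, the writer's encoding and the encoding the parser assumes do not agree, which is consistent with the fix reported above of writing the file through an explicit UTF-8 writer.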