Hi,

I've seen that link already but, as you can see from the previous
attachment, the 0x1c character is inside a <![CDATA[...]]> section, and
the XML specification says:

Within a CDATA section, only the CDEnd string is recognized as markup

http://www.w3.org/TR/2008/REC-xml-20081126/#sec-cdata-sect

So are you saying that the content of the CDATA section ALSO has to use
the same encoding as the one given in the XML declaration below?

<?xml version="1.0" encoding="ISO-8859-1" ?>

I think that our supplier is aware of the fact that he has two different
encodings and uses the <![CDATA[...]]> section precisely so that he can
put UTF-8 bytes inside an ISO-8859-1 encoded XML document.
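
One thing I can check on my side is how the 0x1c byte decodes under each of
the two encodings. This is only a small sketch: the class and method names
are made up for illustration and nothing here comes from the supplier's
file. Since 0x1c is below 0x80, both ISO-8859-1 and UTF-8 should decode it
to the same control character U+001C:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of XMLBeans or of our code base.
final class ControlByteCheck {

    // Decode a single byte with the given charset and return the code point.
    static int decode(byte b, Charset charset) {
        return new String(new byte[] { b }, charset).codePointAt(0);
    }

    public static void main(String[] args) {
        byte suspect = 0x1c; // the byte seen in the hex editor
        System.out.printf("ISO-8859-1: U+%04X%n", decode(suspect, StandardCharsets.ISO_8859_1));
        System.out.printf("UTF-8:      U+%04X%n", decode(suspect, StandardCharsets.UTF_8));
    }
}
```

If both lines print the same code point, then whichever encoding the CDATA
content really uses, the parser sees the same character at that position.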

Could anyone help?
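
In the meantime, a workaround I am considering is to strip the offending
characters from the Java String before handing it to parse(String). This is
only a sketch, and it assumes that the 0x1c and 0x1d characters carry no
meaning we need to keep. The class and method names are my own, not part of
XMLBeans; the allowed ranges are those of the Char production in the XML 1.0
specification.

```java
// Hypothetical helper, not part of XMLBeans: removes characters that
// XML 1.0 does not allow anywhere in a document (CDATA sections included).
final class XmlCharFilter {

    // XML 1.0 Char production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isLegalXml10(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Return a copy of the input with every illegal code point dropped.
    static String stripIllegal(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            if (isLegalXml10(cp)) {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Made-up sample, not the supplier's real payload.
        String dirty = "<a><![CDATA[" + (char) 0x1c + "denominaciones" + (char) 0x1d + "]]></a>";
        System.out.println(stripIllegal(dirty));
    }
}
```

The cleaned string would then go to the same parse(String) call as before.
The obvious drawback is that the document is silently altered, so I would
prefer a proper fix if there is one.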

Many thanks

Best regards

On Tue, 25/08/2009 at 01:35 -0700, Jacob Danner wrote:
> Ahh, just re-looked at your old post and clicked on the link and ended
> up at the following page which I think might explain some of your
> issue.
> http://www.w3schools.com/xmL/xml_encoding.asp
> 
> 
> 
> On Mon, Aug 24, 2009 at 11:47 PM, Bartolomeo Nicolotti
> <bnicolo...@siapcn.it> wrote:
>         Hi,
>         
>         if you open the attached file with an editor that lets you see the
>         hex code of the file, for example ghex2 on Linux, you'll see that
>         before the string
>         
>         "denominaciones de origen espa"
>         
>         there's a 0x1c byte, and that is the one that causes the exception.
>         If I remove this byte, there's a further failure due to 0x1d bytes.
>         
>         I have the XML in a Java String and I use the method parse(String),
>         which fails due to the 0x1c and 0x1d bytes inside the string. No
>         HTTP header is involved, as I have the XML in memory as a Java
>         String.
>         
>         Many thanks
>         
>         Best regards.
>         
>         
>         
>         On Mon, 24/08/2009 at 11:39 -0700, Jacob Danner wrote:
>         
>         > Can you properly parse the XMLObject when the value you are
>         > trying to parse comes from a file?
>         >
>         > Again, I do not think this error is caused by an entry in the
>         > CDATA of an element but rather by the HTTP content. When I
>         > received this error before, I found the issue was in some data
>         > that I received before I had even received the XML PI.
>         > Also, what are you doing with the HTTP headers, since those occur
>         > before the payload?
>         >
>         > -jacobd
>         >
>         > On Mon, Aug 24, 2009 at 12:25 AM, Bartolomeo Nicolotti
>         > <bnicolo...@siapcn.it> wrote:
>         >         Hi,
>         >
>         >         we do the same: we have the attached file in a string,
>         >         having POSTed it with
>         >
>         >         int org.apache.commons.httpclient.HttpClient.executeMethod(HttpMethod method)
>         >         throws IOException, HttpException
>         >
>         >         and then we do XMLObject.parse, as you can also see from
>         >         the call stack:
>         >
>         >         org.apache.xmlbeans.impl.schema.SchemaTypeLoaderBase.parse(SchemaTypeLoaderBase.java:208)
>         >
>         >
>         >         Our problem is that in the XML itself there's a character
>         >         0x1c that causes the parser to crash, giving
>         >
>         >         > e.toString():org.apache.xmlbeans.XmlException: error: Illegal XML character: 0x1c
>         >         > org.apache.xmlbeans.impl.piccolo.io.IllegalCharException: Illegal XML character: 0x1c
>         >
>         >         I think that the parser ignores that inside CDATA there
>         >         could be 0x1c characters due to a different encoding.
>         >
>         >         This is a big problem, isn't it? Especially because the
>         >         parse fails completely!
>         >
>         >         Many thanks
>         >
>         >         Bye
>         >
>         >         On Fri, 21/08/2009 at 14:10 -0700, Jacob Danner wrote:
>         >
>         >         > I've seen something similar when working with content
>         >         > retrieved from URLs. What I found was that the problem
>         >         > wasn't in the content of the XML, but in some additional
>         >         > data that was passed along prior to the XML payload I
>         >         > wanted. My workaround for this was to use some IO Stream
>         >         > APIs to read the content into a string and then parse
>         >         > the data.
>         >         >
>         >         > Out of curiosity, if you save the payload to a file, can
>         >         > you read it with XMLBeans (i.e., XMLObject.parse(...))?
>         >         >
>         >         > HTH,
>         >         > -jacobd
>         >         >
>         >         > On Fri, Aug 21, 2009 at 10:03 AM, Bartolomeo
>         Nicolotti
>         >         > <bnicolo...@siapcn.it> wrote:
>         >         >         Hi,
>         >         >
>         >         >         we're receiving XML from a supplier encoded in
>         >         >         ISO-8859-1, but the bodies of some tags are
>         >         >         encoded in UTF-8. They are surrounded with CDATA,
>         >         >         so strange bytes, like the 0x1c character,
>         >         >         shouldn't be a problem for the parser, as said
>         >         >         here:
>         >         >
>         >         >         http://www.w3schools.com/xmL/xml_cdata.asp
>         >         >
>         >         >         We've built a parser with the last stable version
>         >         >         of XMLBeans, but the parser complains about this
>         >         >         0x1c character; see the attachment near:
>         >         >
>         >         >         ...
>         >         >         "denominaciones de origen espa"
>         >         >         ...
>         >         >
>         >         >         Fri Aug 21 16:14:39 CEST 2009:class com.siap.DPKWebServices.Util.OTA_literal_HttpPost.queryHttp
>         >         >         caught an exception: 29047814 org.apache.xmlbeans.XmlException
>         >         >         e.toString():org.apache.xmlbeans.XmlException: error: Illegal XML character: 0x1c
>         >         >         org.apache.xmlbeans.impl.piccolo.io.IllegalCharException: Illegal XML character: 0x1c
>         >         >                at org.apache.xmlbeans.impl.piccolo.xml.XMLReaderReader.read(XMLReaderReader.java:169)
>         >         >                at org.apache.xmlbeans.impl.piccolo.xml.PiccoloLexer.yy_refill(PiccoloLexer.java:3474)
>         >         >                at org.apache.xmlbeans.impl.piccolo.xml.PiccoloLexer.yynextChar(PiccoloLexer.java:3721)
>         >         >                at org.apache.xmlbeans.impl.piccolo.xml.PiccoloLexer.parseCdataSection(PiccoloLexer.java:2671)
>         >         >                at org.apache.xmlbeans.impl.piccolo.xml.PiccoloLexer.yylex(PiccoloLexer.java:4850)
>         >         >                at org.apache.xmlbeans.impl.piccolo.xml.Piccolo.yylex(Piccolo.java:1290)
>         >         >                at org.apache.xmlbeans.impl.piccolo.xml.Piccolo.yyparse(Piccolo.java:1400)
>         >         >                at org.apache.xmlbeans.impl.piccolo.xml.Piccolo.parse(Piccolo.java:714)
>         >         >                at org.apache.xmlbeans.impl.store.Locale$SaxLoader.load(Locale.java:3439)
>         >         >                at org.apache.xmlbeans.impl.store.Locale.parse(Locale.java:706)
>         >         >                at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:690)
>         >         >                at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:677)
>         >         >                at org.apache.xmlbeans.impl.schema.SchemaTypeLoaderBase.parse(SchemaTypeLoaderBase.java:208)
>         >         >                at com.siap.TransHotel.GetAvailAccomDocument$Factory.parse(Unknown Source)
>         >         >                at com.siap.DPKWebServices.Util.TransHotelUtil.validateRS(TransHotelUti
>         >         >
>         >         >         Is there a way to work around this problem?
>         >         >
>         >         >         Many thanks
>         >         >
>         >         >         Best regards
>         >         >
>         >         >         Bartolomeo
>         >         >
>         >         >         --
>         >         >         Bartolomeo Nicolotti
>         >         >         SIAP s.r.l.
>         >         >         www.siapcn.it
>         >         >         v.S.Albano 13 12049
>         >         >         Trinità(CN) Italy
>         >         >         ph:+39 0172 652553
>         >         >         centralino: +39 0172 652511
>         >         >         fax: +39 0172 652519
>         >         >
>         >         >
>         >         >
>         >
>         ---------------------------------------------------------------------
>         >         >         To unsubscribe, e-mail:
>         >         user-unsubscr...@xmlbeans.apache.org
>         >         >         For additional commands, e-mail:
>         >         user-h...@xmlbeans.apache.org
>         >         >
>         >
>         >
>         >
>         >
>         
>         
>         
> 
-- 
Bartolomeo Nicolotti
SIAP s.r.l.
www.siapcn.it
v.S.Albano 13 12049
Trinità(CN) Italy
ph:+39 0172 652553
centralino: +39 0172 652511
fax: +39 0172 652519


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@xmlbeans.apache.org
For additional commands, e-mail: user-h...@xmlbeans.apache.org
