Thanks Dave, and you are correct: I should have said 'not well-formed' as
opposed to 'invalid'.  They mean two different things.

Appreciate the response.  In my particular case, I'll probably have to fix this 
on the originating side with CDATA blocks.
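
For illustration, here is a minimal sketch of what an originating-side fix
could look like, assuming the producer builds its XML by string concatenation.
The helper name cdata() and the <msg> element are hypothetical, not taken from
any real producer. It shows two options: wrapping the text in a CDATA section,
or escaping it with xml.sax.saxutils.escape from the standard library.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def cdata(text):
    """Wrap text in a CDATA section (hypothetical helper).

    A literal "]]>" would end the section early, so it is split
    across two adjacent CDATA sections.
    """
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


# Hypothetical payload containing unescaped brackets.
payload = "if a <> b then x ]]> y"

# Option 1: CDATA section around the raw text.
doc_cdata = "<msg>" + cdata(payload) + "</msg>"

# Option 2: escape &, < and > as entity references.
doc_escaped = "<msg>" + escape(payload) + "</msg>"

# Both documents are well-formed and parse back to the same text.
print(ET.fromstring(doc_cdata).text == payload)
print(ET.fromstring(doc_escaped).text == payload)
```

Either form should be accepted by any conforming parser; which one is easier
depends on how the producer assembles its output.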

Cheers,
Keith


----- Original Message -----
> From: "Dave Kuhlman" <dkuhl...@pacbell.net>
> To: "Keith Robertson" <krobe...@redhat.com>, 
> generateds-users@lists.sourceforge.net
> Sent: Thursday, February 28, 2013 2:47:30 PM
> Subject: Re: [Generateds-users] What to do with invalid XML
> 
> > From: Keith Robertson
> 
> > Sent: Wednesday, February 27, 2013 12:13 PM
> > 
> 
> > Is there anything that can be done to "relax" processing of XML
> > documents that may contain invalid XML?  I am specifically thinking
> > about cases where the XML document contains <> within another
> > element and the creator of the XML either didn't escape them or
> > surround them with CDATA.
> > 
> 
> Keith -
> 
> Just so that we have our terminology consistent, to say that an XML
> document is invalid usually means that an attempt has been made to
> validate the document against the XML Schema (or DTD) for that
> document's document type and that validation attempt has failed.
> 
> What you are talking about in the case of misplaced corner brackets
> is referred to as not well-formed XML.  When an XML document is
> not well-formed, usually that means that an XML parser will not
> accept it, and may even be required to reject it.  For more on this,
> see: http://en.wikipedia.org/wiki/Well-formed_document
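
To make the point concrete, here is a small sketch using the standard-library
ElementTree parser. The function name is_well_formed() and both sample
documents are made up for illustration. It only checks well-formedness; it
does not validate against a schema or DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if ElementTree can parse xml_text, False if it rejects it."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# Hypothetical sample documents, not taken from this thread.
good = "<note><body>a &lt;&gt; b</body></note>"  # brackets escaped
bad = "<note><body>a <> b</body></note>"         # raw brackets in text

print(is_well_formed(good))
print(is_well_formed(bad))
```

The second document is rejected with a ParseError because the raw "<" starts
markup that the parser cannot make sense of.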
> 
> The code generated by generateDS.py uses the ElementTree or lxml
> parsers to read in input XML documents.  Take a look at the parseXXX
> functions generated at the bottom of the generated Python module.
> 
> So, at the least, you need to get those parsers (ElementTree or
> lxml) to accept your document.
> 
> If you think that you know how to do some automatic code clean-up,
> then you might try writing a Python script, and pre-process your
> documents with that.
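
As a rough sketch of such a pre-processing script, the following wraps the
text of one known element in a CDATA section before parsing. It rests on
several assumptions: the offending text always sits inside an element whose
name you know (the hypothetical <expr> here), that element has no attributes
and no child elements, and its content is not already entity-escaped (escaped
text such as &lt; would come out literally once inside CDATA). The function
name wrap_in_cdata() is invented for this example.

```python
import re
import xml.etree.ElementTree as ET


def wrap_in_cdata(xml_text, tag):
    """Wrap the text of every <tag>...</tag> in a CDATA section.

    Assumes the element has no attributes, holds only text, and is
    not nested inside itself. Content already in CDATA is left alone.
    """
    pattern = re.compile(
        r"<%s>(.*?)</%s>" % (re.escape(tag), re.escape(tag)), re.DOTALL
    )

    def repl(match):
        content = match.group(1)
        if content.startswith("<![CDATA["):
            return match.group(0)
        # A literal "]]>" would close the section early, so split it.
        content = content.replace("]]>", "]]]]><![CDATA[>")
        return "<%s><![CDATA[%s]]></%s>" % (tag, content, tag)

    return pattern.sub(repl, xml_text)


# Hypothetical input that a parser would reject as not well-formed.
raw = "<report><expr>a <> b</expr></report>"
fixed = wrap_in_cdata(raw, "expr")

root = ET.fromstring(fixed)
print(root.find("expr").text)
```

A regular expression cannot repair arbitrary broken XML, so this only helps
when the damage is confined to a predictable place in the document.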
> 
> By the way, the decision to require that XML parsers not accept
> documents that are ill-formed seems to be a conscious one.  I
> believe that there was quite a bit of discussion and controversy
> about that decision.  One rationale is that, because of this
> requirement that XML parsers reject documents that are not
> well-formed, you will be forced to push back against the producers
> of the document and "encourage" them to fix those documents.  You
> will have to make up your own mind about this policy.  But, basically,
> it's not optional.
> 
> - Dave
> 
> 
>  
> --
> 
> Dave Kuhlman
> http://www.rexx.com/~dkuhlman
> 
> 

_______________________________________________
generateds-users mailing list
generateds-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/generateds-users
