On Wed, Jun 27, 2012 at 3:43 PM, Mark Rejhon <[email protected]> wrote:
> On Wed, Jun 27, 2012 at 1:32 PM, Kevin Smith <[email protected]> wrote:
>
>> On Wed, Jun 27, 2012 at 6:26 PM, Mark Rejhon <[email protected]> wrote:
>> > As long as the XML-processor limits its Unicode
>> > processing to comply with the XML processing standard at w3.org/XML....
>>
>> Just thinking aloud here, but - how does this interact with allowed
>> XML mangling by parsers/serialisers?
>>
>> For example, if the client tries to send
>> 1
>> 1 >
>> 1 > 2
>> 1 <
>> 1 < 2
>>
>> And the initial > gets transformed into &gt; by the XML library at the
>> sending end? (I pick this example because it's one I've come across in
>> the past).
>>
>
> It's the responsibility of the XML processor to handle entity
> encode/decode.
> So from the XEP-0301, I only see one character whenever I see "<" or ">".
> At the flow chart,
> http://www.realjabber.org/flowchart_of_xmpp_rtt_path.pdf<http://www.marky.com/realjabber/flowchart_of_xmpp_rtt_path.pdf>
> the
> XML entity handling is handled by the XML processing step.
>
> I used to mention entity processing in an early draft of XEP-0301 almost
> two years ago, but removed it, because it's the responsibility of the XML
> processor, and thus outside the XEP-0301 chain. Comments?
>

Oh, and it theoretically becomes a problem only if the entity encoding
done by the XML processor on the sender's end is not compatible with the
entity decoding done by the XML processor on the recipient's end. But
that would be an XML processor compliance issue. I've never seen this
happen.

The vast majority of common XML processors (e.g. System.Xml in C#,
xmlpull in Java, etc.) handle entities already. Strings I pass to the
XML processor have their entities automatically encoded and are
converted to UTF-8, and strings returned to me from the XML processor
already have their entities decoded and are converted to the language's
native string format.

If you're writing your own XML processor from scratch, it's your
responsibility to follow the w3.org/XML processing spec.
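
As a rough illustration of that round trip, here is a minimal sketch
using Python's standard xml.etree.ElementTree module. It is not taken
from any XEP-0301 implementation; the element name "t" and the helper
round_trip() are placeholders chosen only for this example.

```python
import xml.etree.ElementTree as ET


def round_trip(text):
    """Serialize text into an element, then parse it back.

    Returns (wire, received): the XML as it would go over the wire,
    and the string the recipient's parser hands back.
    """
    # Sender side: the serializer escapes "&", "<" and ">" in text.
    element = ET.Element("t")  # placeholder element name (assumption)
    element.text = text
    wire = ET.tostring(element, encoding="unicode")

    # Recipient side: the parser decodes the entities again.
    received = ET.fromstring(wire).text
    return wire, received


# The keystroke sequence from the example quoted above.
for typed in ["1", "1 >", "1 > 2", "1 <", "1 < 2"]:
    wire, received = round_trip(typed)
    assert received == typed  # the application sees the original text
    print(repr(typed), "->", wire)
```

The application code only ever handles the plain strings; the escaped
form exists only between the serializer and the parser.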

Entity encoding happens when you're generating the XML, and entity
decoding happens when you're parsing the XML. So your example is
handled fine; I've never seen a problem with any of the common XML
processors I've used.

Good inquiry, mind you.

Cheers,
Mark Rejhon
