On Wed, Jun 27, 2012 at 3:43 PM, Mark Rejhon <[email protected]> wrote:

> On Wed, Jun 27, 2012 at 1:32 PM, Kevin Smith <[email protected]> wrote:
>
>> On Wed, Jun 27, 2012 at 6:26 PM, Mark Rejhon <[email protected]> wrote:
>> > As long as the XML-processor limits its Unicode
>> > processing to comply with the XML processing standard at w3.org/XML....
>>
>> Just thinking aloud here, but - how does this interact with allowed
>> XML mangling by parsers/serialisers?
>>
>> For example, if the client tries to send
>> 1
>> 1 >
>> 1 > 2
>> 1 <
>> 1 < 2
>>
>> And the initial > gets transformed into &gt; by the XML library at the
>> sending end? (I pick this example because it's one I've come across in
>> the past).
>>
>
> It's the responsibility of the XML processor to handle entity
> encode/decode.
> So from the XEP-0301, I only see one character whenever I see "<" or ">".
> At the flow chart,
> http://www.realjabber.org/flowchart_of_xmpp_rtt_path.pdf
> <http://www.marky.com/realjabber/flowchart_of_xmpp_rtt_path.pdf>
> the XML entity handling is handled by the XML processing step.
>
> I used to mention entity processing in an early draft of XEP-0301 almost
> two years ago, but removed it, because it's the responsibility of the XML
> processor, and thus outside the XEP-0301 chain.   Comments?
>

-- Oh, and it theoretically becomes a problem if the entity encoding done by
the XML processor on the sender end is not compatible with the entity
decoding done by the XML processor on the recipient end.  But that would be
an XML processor compliance issue.  I've never seen this happen.
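
To illustrate why differing encoders should not matter, here is a minimal
sketch using Python's standard xml.etree.ElementTree (my choice for
illustration only; nothing in XEP-0301 depends on it).  The <body/> element
name and the four spellings are assumptions picked for the example.  A
compliant parser should hand back the same single character for each one:

```python
import xml.etree.ElementTree as ET

# Four spellings of the same text that a sending-side serializer might emit.
# The <body/> wrapper is a hypothetical element chosen for this sketch.
variants = [
    "<body>1 > 2</body>",       # a raw '>' is legal in character data
    "<body>1 &gt; 2</body>",    # predefined entity
    "<body>1 &#62; 2</body>",   # decimal character reference
    "<body>1 &#x3E; 2</body>",  # hexadecimal character reference
]

# The parser decodes every variant before the application sees the text.
decoded = [ET.fromstring(v).text for v in variants]

for wire, text in zip(variants, decoded):
    print(f"{wire!r} -> {text!r}")
```

So whichever form the sender's library picks, the receiving application
layer only ever sees the decoded text.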

The vast majority of common XML processors (e.g. System.Xml in C#, xmlpull
in Java, etc.) handle entities already.  Strings I pass to the XML processor
have their entities automatically encoded and are converted to UTF-8, and
strings returned to me from the XML processor already have their entities
decoded and are converted to the language's native string format.  If you're
writing your own XML processor from scratch, it's your responsibility to
follow the w3.org/XML processing spec.  Entity encoding happens when you're
generating the XML, and entity decoding happens when you're parsing the
XML.
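
As a rough sketch of that round trip, here is Kevin's example pushed through
Python's standard xml.etree.ElementTree.  The library, the <body/> element
name, and the helper names encode_body/decode_body are all assumptions for
illustration, not anything defined by XEP-0301:

```python
import xml.etree.ElementTree as ET


def encode_body(text: str) -> bytes:
    """Generate XML: the library escapes '<', '>' and '&' on the way out."""
    elem = ET.Element("body")  # hypothetical element name for this sketch
    elem.text = text
    return ET.tostring(elem, encoding="utf-8")


def decode_body(xml_bytes: bytes) -> str:
    """Parse XML: the library decodes entities before returning the text."""
    return ET.fromstring(xml_bytes).text or ""


# The successive states of the text from the example earlier in the thread.
samples = ["1", "1 >", "1 > 2", "1 <", "1 < 2"]

for original in samples:
    wire = encode_body(original)
    # The application-level string survives the round trip unchanged.
    assert decode_body(wire) == original
    print(f"{original!r} -> {wire!r}")
```

The application code on either side only ever handles the plain strings;
the &gt; and &lt; forms exist only on the wire.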

So, your example is handled fine -- never seen a problem whenever I've used
common XML processors...
Good inquiry, mind you.

Cheers
Mark Rejhon
