Re: Parser removes whitespace before and after html entity, is this a bug?

Michael Glavassevich Mon, 15 Feb 2010 07:50:39 -0800

There is a lot that you haven't said about your application, but the most
likely reason for whitespace being stripped is attribute value
normalization [1]. XML parsers are required to normalize attribute values
before passing them to the application. You cannot turn this process off.
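
As a rough illustration of what that normalization does, here is a minimal
sketch using Python's standard-library parser rather than whichever parser
you are using (an assumption made only for the sake of a self-contained
example). The element name `item` and attribute name `title` are
hypothetical. For an attribute with no DTD declaration, literal tabs and
line breaks in the value become spaces, while a character reference such as
&#10; is kept as the character it names. Element content is not subject to
this step, so a space next to a character reference there should survive.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the attribute holds a literal tab, a literal
# newline, and a newline written as the character reference &#10;.
# The element content has a space directly before a character reference.
doc = (
    '<item title="one\ttwo\nthree&#10;four">'
    'sm&#246;rg&#229;s &#246;l</item>'
)

elem = ET.fromstring(doc)

# Attribute value: normalized by the parser before the application sees it.
title = elem.get("title")

# Element content: character references are decoded, whitespace is kept.
text = elem.text

print(repr(title))
print(repr(text))
```

If whitespace is vanishing from element content rather than from an
attribute value, attribute value normalization is probably not the cause,
and the code that collects the text is worth a look.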


Thanks.

[1] http://www.w3.org/TR/2006/REC-xml-20060816/#AVNormalize

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: [email protected]

Leon Radley <[email protected]> wrote on 02/15/2010 06:24:54 AM:

> I've parsed an RSS feed from Twitter; it contains HTML entities such as
> &#246;. The problem I have is that if a word ends or begins with an
> entity, the whitespace before or after it gets stripped out.
> The parser seems to decode the entities correctly but simply removes
> too much whitespace.
>
> Is this a bug, or can I somehow tell the parser to not decode the
> html entities?
>
> Cheers
> Leon
