Re: [fpc-devel] XML Components

Sergei Gorelkin Fri, 02 Nov 2012 06:32:44 -0700

02.11.2012 17:08, Michael Van Canneyt wrote:



On Fri, 2 Nov 2012, Andrew Brunner wrote:


I think it would be a good solution and even prove faster in controlled
environments. Plus, all data is stored as widestrings in the DOM.

The first question I have is: if there were such an option, would the patch
be accepted?


I don't see how you can fix the problem. If the input is UTF-8 and the result
must be converted to a widestring for the DOM, then a conversion MUST take
place; there is no way to avoid it. And a conversion means scanning the input
byte for byte.

In any case, the input must be scanned byte for byte anyway, to detect all the
tags. That's what makes XML slow and unusable for large amounts of data.

The next question is: what is the problem with the UTF-8 routine, such that it
left the offending byte sequence intact without converting the bytes in my
sample data?


Without an error message, it is impossible to tell.

In this case, the issue is not encoding, but a literal ESC (#27) code used in
the data. The XML specification does not allow codepoints below 32, except TAB,
CR and LF, to appear in data, in either literal or escaped form.

In other words, XML is the wrong technology for working with binary data,
unless the data is first encoded into a textual form (Base64 or the like).
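To illustrate the point, here is a minimal sketch in Python (chosen only for
brevity; it is not FPC code and does not use the fcl-xml API). All function
names are hypothetical. It checks only the "below 32" rule mentioned above, not
the full character range of the XML specification, and it assumes that Python's
standard expat-based parser rejects such characters, much as a conforming
parser should.

```python
import base64
import xml.etree.ElementTree as ET


def first_illegal_xml_char(text):
    """Return the index of the first control character below 32 that is
    not TAB, LF or CR, or -1 if there is none. (Hypothetical helper; it
    covers only the low control range, not every XML restriction.)"""
    for i, ch in enumerate(text):
        if ord(ch) < 0x20 and ch not in "\t\n\r":
            return i
    return -1


def parses(doc):
    """Return True if the standard-library parser accepts the document."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


def wrap_binary(payload):
    """Base64-encode bytes and wrap them in a <data> element
    (the element name is an arbitrary choice for this sketch)."""
    encoded = base64.b64encode(payload).decode("ascii")
    return "<data>" + encoded + "</data>"


def unwrap_binary(doc):
    """Parse the document and decode the Base64 text back into bytes."""
    root = ET.fromstring(doc)
    return base64.b64decode(root.text or "")


if __name__ == "__main__":
    payload = b"abc\x1bdef"  # binary data containing a literal ESC (#27)

    # The literal ESC is found at index 3 and the parser refuses it.
    print(first_illegal_xml_char(payload.decode("latin-1")))
    print(parses("<data>abc\x1bdef</data>"))

    # Writing it as a character reference does not help either.
    print(parses("<data>abc&#27;def</data>"))

    # Base64 text contains only letters, digits, '+', '/' and '=',
    # so the wrapped form parses and round-trips unchanged.
    print(unwrap_binary(wrap_binary(payload)) == payload)
```

The same approach applies in any language: scan for forbidden codepoints
before writing the document, and Base64-encode anything that is really binary.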


Regards,
Sergei
_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel
