Re: [fpc-devel] XML Components

2012-11-03 Thread Andrew Brunner
I just added this Prepare method to my database API. class function XML.IsInvalid(var Value:Byte):boolean; begin Result:=(Value<9) or (Value=11) or (Value=12) or ( (Value>13) and (Value<32)); end; class function XML.Prepare(var sInput:string; Refactor:TStream):string; var bChar:byte;

Re: [fpc-devel] XML Components

2012-11-03 Thread Hans-Peter Diettrich
Andrew Brunner wrote: I just added this Prepare method to my database API. class function XML.IsInvalid(var Value:Byte):boolean; begin Result:=(Value<9) or (Value=11) or (Value=12) or ( (Value>13) and (Value<32)); end; [...] The question is, what is going to happen when the encoding is

Re: [fpc-devel] XML Components

2012-11-02 Thread Michael Van Canneyt
On Thu, 1 Nov 2012, Andrew Brunner wrote: I'm having a problem getting the XML parser to read. Is there any way I can get the attached program to work by changing a parsing option to a less strict one? My XML documents can exceed 1-2 GB since they represent files. So having to convert/scan

Re: [fpc-devel] XML Components

2012-11-02 Thread Andrew Brunner
On Nov 2, 2012, at 7:24 AM, Michael Van Canneyt mich...@freepascal.org wrote: On Thu, 1 Nov 2012, Andrew Brunner wrote: I'm having a problem getting the XML parser to read. Is there any way I can get the attached program to work by changing a parsing option to a less strict one? My

Re: [fpc-devel] XML Components

2012-11-02 Thread Michael Van Canneyt
On Fri, 2 Nov 2012, Andrew Brunner wrote: As a consequence, the codepage in the XML must be checked and converted if need be. The input data in the example attached is converted. There is no attachment to your mail. Imagine you have an XML file encoded in UTF16, and we assume it's

Re: [fpc-devel] XML Components

2012-11-02 Thread Andrew Brunner
On 11/02/2012 08:08 AM, Michael Van Canneyt wrote: There is no attachment to your mail. The attachment was in my first posting. But just in case, I've attached it again. Please feel free to check it out. The example is stripped of most of the XML code that was successfully parsed.

Re: [fpc-devel] XML Components

2012-11-02 Thread Mattias Gaertner
Andrew Brunner atbrun...@aurawin.com wrote on 2 November 2012 at 13:59: On Nov 2, 2012, at 7:24 AM, Michael Van Canneyt mich...@freepascal.org wrote: On Thu, 1 Nov 2012, Andrew Brunner wrote: I'm having a problem getting the XML parser to read. Is there any way I can

Re: [fpc-devel] XML Components

2012-11-02 Thread Sergei Gorelkin
02.11.2012 17:08, Michael Van Canneyt wrote: On Fri, 2 Nov 2012, Andrew Brunner wrote: I think it would be a good solution and even prove faster in controlled environments. Plus all data is stored as widestrings in the DOM. The first question I have is if there was such an option would

Re: [fpc-devel] XML Components

2012-11-02 Thread Andrew Brunner
On Nov 2, 2012, at 8:32 AM, Sergei Gorelkin sergei_gorel...@mail.ru wrote: In this case, the issue is not encoding, but a literal ESC (#27) code used in the data. The XML specification does not allow codepoints below 32, except TAB, CR and LF, to appear in data, both in literal and escaped forms.
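The rule quoted above (no codepoints below 32 except TAB, LF and CR) can be sketched as a small check. This is only an illustration under stated assumptions: the names IsForbiddenControl and FirstForbidden are hypothetical, the scan is byte-wise (so it assumes ASCII or UTF-8 input, where control characters are single bytes), and it is not the parser's own implementation.

```pascal
program XmlCharCheck;
{$mode objfpc}{$H+}

// Hypothetical helper: True for control bytes below 32
// other than TAB (9), LF (10) and CR (13).
function IsForbiddenControl(B: Byte): Boolean;
begin
  Result := (B < 32) and not (B in [9, 10, 13]);
end;

// Hypothetical helper: returns the 1-based index of the first
// forbidden byte in S, or 0 if there is none.
function FirstForbidden(const S: string): Integer;
var
  I: Integer;
begin
  Result := 0;
  for I := 1 to Length(S) do
    if IsForbiddenControl(Ord(S[I])) then
    begin
      Result := I;
      Exit;
    end;
end;

begin
  // TAB is allowed, so nothing is reported here.
  WriteLn(FirstForbidden('plain text'#9'with tab'));
  // ESC (#27) is the fourth byte and is not allowed.
  WriteLn(FirstForbidden('abc'#27'def'));
end.
```

With these inputs the program prints 0 and then 4. A pre-scan like this still touches every byte, which is the cost being argued about in this thread.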

Re: [fpc-devel] XML Components

2012-11-02 Thread Mattias Gaertner
Sergei Gorelkin sergei_gorel...@mail.ru wrote on 2 November 2012 at 14:32: 02.11.2012 17:08, Michael Van Canneyt wrote: On Fri, 2 Nov 2012, Andrew Brunner wrote: I think it would be a good solution and even prove faster in controlled environments. Plus all data is

Re: [fpc-devel] XML Components

2012-11-02 Thread Sergei Gorelkin
02.11.2012 17:44, Mattias Gaertner wrote: Sergei Gorelkin sergei_gorel...@mail.ru wrote on 2 November 2012 at 14:32: In this case, the issue is not encoding, but a literal ESC (#27) code used in the data. The XML specification does not allow codepoints below 32, except TAB, CR and LF, to

Re: [fpc-devel] XML Components

2012-11-02 Thread Andrew Brunner
So where in the specs does it say that parsers must reject certain byte sequences between cdata tags excepting XML tags. If this is supported by specs it would help shape a viable solution. On Nov 2, 2012, at 9:01 AM, Sergei Gorelkin sergei_gorel...@mail.ru wrote: 02.11.2012 17:44,

Re: [fpc-devel] XML Components

2012-11-02 Thread Michael Van Canneyt
On Fri, 2 Nov 2012, Andrew Brunner wrote: So where in the specs does it say that parsers must reject certain byte sequences between cdata tags excepting XML tags. If this is supported by specs it would help shape a viable solution. Where did you get that it is supported ? The specs list

Re: [fpc-devel] XML Components

2012-11-02 Thread Jeppe Græsdal Johansen
On 02-11-2012 14:32, Sergei Gorelkin wrote: 02.11.2012 17:08, Michael Van Canneyt wrote: On Fri, 2 Nov 2012, Andrew Brunner wrote: I think it would be a good solution and even prove faster in controlled environments. Plus all data is stored as widestrings in the DOM. The first

Re: [fpc-devel] XML Components

2012-11-02 Thread Michael Van Canneyt
On Fri, 2 Nov 2012, Jeppe Græsdal Johansen wrote: and LF, to appear in data, both in literal and escaped forms. In other words, XML is wrong technology to work with binary data, unless it is encoded into textual form (Base64 or alike). Regards, Sergei
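As a minimal sketch of the Base64 suggestion quoted above, the following round-trips a payload containing a literal ESC byte through the `base64` unit shipped with Free Pascal (`EncodeStringBase64` and `DecodeStringBase64`). The payload and variable names are made up for illustration, and a multi-gigabyte file would more likely be streamed than held in a string.

```pascal
program Base64Demo;
{$mode objfpc}{$H+}

uses
  base64;

var
  Raw, Encoded: string;
begin
  // Illustrative payload with a literal ESC (#27), which XML text may not contain.
  Raw := 'abc'#27'def';

  // The encoded form uses only letters, digits, '+', '/' and '=',
  // all of which are legal XML character data.
  Encoded := EncodeStringBase64(Raw);
  WriteLn(Encoded);

  // Decoding restores the original bytes.
  if DecodeStringBase64(Encoded) = Raw then
    WriteLn('round trip ok')
  else
    WriteLn('round trip FAILED');
end.
```

The encoded text can then be placed in an element or attribute without tripping the parser's character checks.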

Re: [fpc-devel] XML Components

2012-11-02 Thread Jeppe Græsdal Johansen
On 02-11-2012 18:04, Michael Van Canneyt wrote: On Fri, 2 Nov 2012, Jeppe Græsdal Johansen wrote: and LF, to appear in data, both in literal and escaped forms. In other words, XML is wrong technology to work with binary data, unless it is encoded into textual form (Base64 or alike).

Re: [fpc-devel] XML Components

2012-11-02 Thread Michael Van Canneyt
On Fri, 2 Nov 2012, Jeppe Græsdal Johansen wrote: On 02-11-2012 18:04, Michael Van Canneyt wrote: On Fri, 2 Nov 2012, Jeppe Græsdal Johansen wrote: and LF, to appear in data, both in literal and escaped forms. In other words, XML is wrong technology to work with binary data, unless it

Re: [fpc-devel] XML Components

2012-11-02 Thread Sergei Gorelkin
02.11.2012 21:06, Jeppe Græsdal Johansen wrote: On 02-11-2012 18:04, Michael Van Canneyt wrote: On Fri, 2 Nov 2012, Jeppe Græsdal Johansen wrote: and LF, to appear in data, both in literal and escaped forms. In other words, XML is wrong technology to work with binary data, unless it is

Re: [fpc-devel] XML Components

2012-11-02 Thread Sergei Gorelkin
02.11.2012 19:57, Andrew Brunner wrote: So where in the specs does it say that parsers must reject certain byte sequences between cdata tags excepting XML tags. If this is supported by specs it would help shape a viable solution. This is not supported. Encoding processing, line-feed

Re: [fpc-devel] XML Components

2012-11-02 Thread Jeppe Græsdal Johansen
On 02-11-2012 18:19, Sergei Gorelkin wrote: 02.11.2012 21:06, Jeppe Græsdal Johansen wrote: On 02-11-2012 18:04, Michael Van Canneyt wrote: On Fri, 2 Nov 2012, Jeppe Græsdal Johansen wrote: and LF, to appear in data, both in literal and escaped forms. In other words, XML is wrong

Re: [fpc-devel] XML Components

2012-11-02 Thread Sergei Gorelkin
02.11.2012 21:22, Jeppe Græsdal Johansen wrote: On 02-11-2012 18:19, Sergei Gorelkin wrote: 02.11.2012 21:06, Jeppe Græsdal Johansen wrote: On 02-11-2012 18:04, Michael Van Canneyt wrote: On Fri, 2 Nov 2012, Jeppe Græsdal Johansen wrote: and LF, to appear in data, both in literal and

Re: [fpc-devel] XML Components

2012-11-02 Thread waldo kitty
On 11/2/2012 09:32, Sergei Gorelkin wrote: In other words, XML is wrong technology to work with binary data, unless it is encoded into textual form (Base64 or alike). encoding into textual form one increases the size of the stream by at least 1/3rd... a 3M file will be a 4M stream when
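The one-third overhead mentioned above follows from Base64 emitting 4 output characters for every 3 input bytes. A small sketch of that arithmetic, assuming padded output with no line breaks (the function name Base64EncodedSize is hypothetical):

```pascal
program Base64Size;
{$mode objfpc}{$H+}

// Hypothetical helper: padded Base64 output length for N input bytes,
// i.e. 4 characters per (possibly partial) 3-byte group.
function Base64EncodedSize(N: Int64): Int64;
begin
  Result := ((N + 2) div 3) * 4;
end;

begin
  // 3 MiB of input
  WriteLn(Base64EncodedSize(3 * 1024 * 1024));
  // 2 GiB of input
  WriteLn(Base64EncodedSize(Int64(2) * 1024 * 1024 * 1024));
end.
```

For 3 MiB (3145728 bytes) this gives 4194304, exactly 4 MiB. A 2 GiB payload grows to about 2.67 GiB, and any line wrapping in the encoded output adds slightly more.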

Re: [fpc-devel] XML Components

2012-11-02 Thread Andrew Brunner
On Nov 2, 2012, at 6:39 PM, waldo kitty wkitt...@windstream.net wrote: On 11/2/2012 09:32, Sergei Gorelkin wrote: In other words, XML is wrong technology to work with binary data, unless it is encoded into textual form (Base64 or alike). encoding into textual form one increases the size

[fpc-devel] XML Components

2012-11-01 Thread Andrew Brunner
I'm having a problem getting the XML parser to read. Is there any way I can get the attached program to work by changing a parsing option to a less strict one? My XML documents can exceed 1-2 GB since they represent files. So having to convert/scan each byte is unacceptable. Is there