Re: [webkit-dev] Writing a new XML parser with no external libraries

2011-07-05 Thread David Hyatt
Writing a new XML parser is a complete waste of time. If libxml has problems, fix them. If you throw out libxml, you'd have to throw out libxslt as well. The end result is not worth the engineering effort it would take to build it and make it work better than libxml/libxslt. dave (hy...@apple.

Re: [webkit-dev] Writing a new XML parser with no external libraries

2011-06-29 Thread Brent Fulgham
Hi, On Tue, Jun 28, 2011 at 10:10 PM, Dirk Schulze wrote: > > On 29.06.2011 at 05:42, TAMURA, Kent wrote: > >> I'm a little negative of developing a new XML parser. I'm afraid that the >> new parser introduces a lot of security/stability problems which existing >> parsers already resolved. > >

Re: [webkit-dev] Writing a new XML parser with no external libraries

2011-06-29 Thread Maciej Stachowiak
On Jun 29, 2011, at 7:14 AM, Alex Milowski wrote: > On Tue, Jun 28, 2011 at 10:10 PM, Dirk Schulze wrote: >> >> On 29.06.2011 at 05:42, TAMURA, Kent wrote: >> >>> I'm a little negative of developing a new XML parser. I'm afraid that the >>> new parser introduces a lot of security/stability p

Re: [webkit-dev] Writing a new XML parser with no external libraries

2011-06-29 Thread Rob Buis
On 29 June 2011 01:10, Dirk Schulze wrote: > > On 29.06.2011 at 05:42, TAMURA, Kent wrote: > >> I'm a little negative of developing a new XML parser. I'm afraid that the >> new parser introduces a lot of security/stability problems which existing >> parsers already resolved. > > I feel the same

Re: [webkit-dev] Writing a new XML parser with no external libraries

2011-06-29 Thread Alex Milowski
On Wed, Jun 29, 2011 at 10:01 AM, Adam Barth wrote: > For what it's worth, we've got an extremely primitive XML parser > PerformanceTest already: > > http://trac.webkit.org/browser/trunk/PerformanceTests/Parser/xml-parser.html > I also have some tests I could post as a patch. -- --Alex Milowski

Re: [webkit-dev] Writing a new XML parser with no external libraries

2011-06-29 Thread Adam Barth
For what it's worth, we've got an extremely primitive XML parser PerformanceTest already: http://trac.webkit.org/browser/trunk/PerformanceTests/Parser/xml-parser.html Adam On Wed, Jun 29, 2011 at 9:22 AM, Oliver Hunt wrote: > I think considerable effort should be put into building up a suite o

Re: [webkit-dev] Writing a new XML parser with no external libraries

2011-06-29 Thread Oliver Hunt
I think considerable effort should be put into building up a suite of performance tests in advance of the new parser (probably culled from XML encountered in the wild, but a number of extreme edge cases wouldn't go amiss either). We should also put effort into reducing any/all recursion i

Re: [webkit-dev] Writing a new XML parser with no external libraries

2011-06-29 Thread İsmail Dönmez
Hi; On Wed, Jun 29, 2011 at 3:12 AM, Jeffrey Pfau wrote: > Currently, WebCore uses libxml2, or, if available, QtXml to parse incoming > XML. However, QtXml isn't always available, and using libxml2 exposes its > own share of problems. As such, I'm undertaking writing an XML parser that > uses no

Re: [webkit-dev] Writing a new XML parser with no external libraries

2011-06-29 Thread Evan Martin
On Tue, Jun 28, 2011 at 6:12 PM, Jeffrey Pfau wrote: > Currently, WebCore uses libxml2, or, if available, QtXml to parse incoming > XML. However, QtXml isn't always available, and using libxml2 exposes its own > share of problems. As such, I'm undertaking writing an XML parser that uses > no ex

Re: [webkit-dev] Writing a new XML parser with no external libraries

2011-06-29 Thread Ryosuke Niwa
On Wed, Jun 29, 2011 at 6:55 AM, Alex Milowski wrote: > On Tue, Jun 28, 2011 at 6:50 PM, Eric Seidel wrote: > > > > I'm in general in favor of this effort (having worked extensively on > > the existing XML parsers). > > > > But I would caution you that xml is a ridiculously tiny fraction of > >

Re: [webkit-dev] Writing a new XML parser with no external libraries

2011-06-29 Thread Alex Milowski
On Wed, Jun 29, 2011 at 7:18 AM, wrote: > On Wed, 29 Jun 2011 06:55:57 -0700, Alex Milowski > wrote: >> I know the parser's speed is terrible as I've measured it recently. >> This is partially due to some of the things we are doing to deal with >> Unicode decoding to work around libxml2 issues.

Re: [webkit-dev] Writing a new XML parser with no external libraries

2011-06-29 Thread paroga
On Wed, 29 Jun 2011 06:55:57 -0700, Alex Milowski wrote: > I know the parser's speed is terrible as I've measured it recently. > This is partially due to some of the things we are doing to deal with > Unicode decoding to work around libxml2 issues. I think moving to > native strings and decoding

Re: [webkit-dev] Writing a new XML parser with no external libraries

2011-06-29 Thread Alex Milowski
On Tue, Jun 28, 2011 at 10:10 PM, Dirk Schulze wrote: > > On 29.06.2011 at 05:42, TAMURA, Kent wrote: > >> I'm a little negative of developing a new XML parser. I'm afraid that the >> new parser introduces a lot of security/stability problems which existing >> parsers already resolved. > > I fe

Re: [webkit-dev] Writing a new XML parser with no external libraries

2011-06-29 Thread Alex Milowski
On Wed, Jun 29, 2011 at 3:39 AM, Maciej Stachowiak wrote: > > Both RapidXml and Expat apparently have not been updated in quite some time > (since 2009 and 2007 respectively). Copying an unmaintained project into the > WebKit repository and forking it is certainly a possible alternative to > wr

Re: [webkit-dev] Writing a new XML parser with no external libraries

2011-06-29 Thread Alex Milowski
On Tue, Jun 28, 2011 at 6:50 PM, Eric Seidel wrote: > > I'm in general in favor of this effort (having worked extensively on > the existing XML parsers). > > But I would caution you that xml is a ridiculously tiny fraction of > the web.  And it may not be worth the engineering effort to make a > b

Re: [webkit-dev] Writing a new XML parser with no external libraries

2011-06-29 Thread Alex Milowski
On Tue, Jun 28, 2011 at 6:41 PM, Jeffrey Pfau wrote: > See responses inline: > > On Jun 28, 2011, at 6:26 PM, Adam Barth wrote: > >> A question and a comment: >> >> 1) Will this let us to remove the code for both the libxml2 and the >> QtXml parsers?  I'd certainly much rather have one XML parser

Re: [webkit-dev] Writing a new XML parser with no external libraries

2011-06-29 Thread Maciej Stachowiak
On Jun 29, 2011, at 2:13 AM, Konstantin Tokarev wrote: > > > 29.06.2011, 07:42, "TAMURA, Kent" : >> I'm a little negative of developing a new XML parser. I'm afraid that the >> new parser introduces a lot of security/stability problems which existing >> parsers already resolved. >> How about

Re: [webkit-dev] Writing a new XML parser with no external libraries

2011-06-29 Thread Konstantin Tokarev
29.06.2011, 07:42, "TAMURA, Kent" : > I'm a little negative of developing a new XML parser. I'm afraid that the new > parser introduces a lot of security/stability problems which existing parsers > already resolved. > How about importing Expat parser to WebKit repository and maintain it by > o

Re: [webkit-dev] Writing a new XML parser with no external libraries

2011-06-29 Thread Patrick Gansterer
I had the "same idea" a year ago, and got only negative feedback. My main intention was/is the performance of the parser (see [1]). I improved the performance of it a lot in the meantime (see dependencies of [2]). [2] tries to remove this UTF-8 -> UTF-16 -> UTF-8 overhead. The patch isn't perfect

Re: [webkit-dev] Writing a new XML parser with no external libraries

2011-06-28 Thread Dirk Schulze
On 29.06.2011 at 05:42, TAMURA, Kent wrote: > I'm a little negative of developing a new XML parser. I'm afraid that the new > parser introduces a lot of security/stability problems which existing parsers > already resolved. I feel the same. Writing a new parser from scratch means introducing

Re: [webkit-dev] Writing a new XML parser with no external libraries

2011-06-28 Thread TAMURA, Kent
I'm a little negative of developing a new XML parser. I'm afraid that the new parser introduces a lot of security/stability problems which existing parsers already resolved. How about importing Expat parser to WebKit repository and maintain it by ourselves? On Wed, Jun 29, 2011 at 10:12, Jeffre

Re: [webkit-dev] Writing a new XML parser with no external libraries

2011-06-28 Thread Maciej Stachowiak
Consolidating replies to avoid spamming the thread: On Jun 28, 2011, at 6:26 PM, Adam Barth wrote: > A question and a comment: > > 1) Will this let us to remove the code for both the libxml2 and the > QtXml parsers? I'd certainly much rather have one XML parser than > three. If the new parser

Re: [webkit-dev] Writing a new XML parser with no external libraries

2011-06-28 Thread Adam Barth
In case you're not aware, I believe you can access the XML parser via JavaScript at window.DOMParser, which might be helpful for testing. Adam On Jun 28, 2011 6:41 PM, "Jeffrey Pfau" wrote: > See responses inline: > > On Jun 28, 2011, at 6:26 PM, Adam Barth wrote: > >> A question and a comment:

Re: [webkit-dev] Writing a new XML parser with no external libraries

2011-06-28 Thread Eric Seidel
That was done, long ago. You can find the old patches in our svn history. :) On Tue, Jun 28, 2011 at 6:44 PM, Wyatt Carss wrote: > If that were all, would it be possible to patch libxml2 to use UTF-16? That > might be less of an undertaking than writing a new xml library, but that > could just b

Re: [webkit-dev] Writing a new XML parser with no external libraries

2011-06-28 Thread Eric Seidel
Correct. We convert from UTF-16 to UTF-8 (for libxml2) and then back to UTF-16. There has been at least one libxml-related security fix to WebCore in recent memory. We have various hacks in the libxml2 parser due to libxml2 being designed to be a library used by applications, and not used by a libr

Re: [webkit-dev] Writing a new XML parser with no external libraries

2011-06-28 Thread Wyatt Carss
If that were all, would it be possible to patch libxml2 to use UTF-16? That might be less of an undertaking than writing a new xml library, but that could just be my youthful naivety.. On Tue, Jun 28, 2011 at 6:36 PM, Jeffrey Pfau wrote: > I don't know all of the problems libxml2 has, but one of

Re: [webkit-dev] Writing a new XML parser with no external libraries

2011-06-28 Thread Jeffrey Pfau
See responses inline: On Jun 28, 2011, at 6:26 PM, Adam Barth wrote: > A question and a comment: > > 1) Will this let us to remove the code for both the libxml2 and the > QtXml parsers? I'd certainly much rather have one XML parser than > three. This won't replace libxslt or QtXmlPatterns for

Re: [webkit-dev] Writing a new XML parser with no external libraries

2011-06-28 Thread Jeffrey Pfau
I don't know all of the problems libxml2 has, but one of the ones I've heard is that WebCore uses UTF-16 internally, and libxml2 uses UTF-8, so the data is perpetually converted between the two formats--and this is slow. If there are any other big ones, I haven't been told them, only that it wou
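
The conversion overhead described above can be illustrated with a small sketch. This is only an illustration in Python, not WebCore or libxml2 code: the function name `round_trip_through_utf8` and the sample document are hypothetical, and Python's built-in codecs stand in for the transcoding steps. No real parser is invoked.

```python
def round_trip_through_utf8(utf16_input: bytes) -> tuple[bytes, bytes]:
    """Return (utf8_for_parser, utf16_result) for a UTF-16LE input buffer.

    Hypothetical helper: it only models the two transcoding passes that
    surround a UTF-8 parser embedded in a UTF-16 host.
    """
    # Pass 1: the host's UTF-16 buffer is transcoded to UTF-8 for the parser.
    utf8_for_parser = utf16_input.decode("utf-16-le").encode("utf-8")
    # (A UTF-8 parser would run here; this sketch skips the parsing itself.)
    # Pass 2: the parser's UTF-8 output is transcoded back to UTF-16.
    utf16_result = utf8_for_parser.decode("utf-8").encode("utf-16-le")
    return utf8_for_parser, utf16_result


# Sample document containing one non-ASCII character (U+00E9).
source = "<doc>caf\u00e9</doc>".encode("utf-16-le")
utf8_bytes, result = round_trip_through_utf8(source)
```

The text survives the round trip unchanged, so both passes are pure overhead. A parser that consumed UTF-16 directly would need neither.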

Re: [webkit-dev] Writing a new XML parser with no external libraries

2011-06-28 Thread Dirk Pranke
Can you expand a bit more on "using libxml2 exposes its own share of problems"? -- Dirk On Tue, Jun 28, 2011 at 6:12 PM, Jeffrey Pfau wrote: > Currently, WebCore uses libxml2, or, if available, QtXml to parse incoming > XML. However, QtXml isn't always available, and using libxml2 exposes its o

Re: [webkit-dev] Writing a new XML parser with no external libraries

2011-06-28 Thread Adam Barth
A question and a comment: 1) Will this let us to remove the code for both the libxml2 and the QtXml parsers? I'd certainly much rather have one XML parser than three. 2) One thing we found very helpful in working on the HTML parser was a good test suite. Presumably there are existing XML parsin

[webkit-dev] Writing a new XML parser with no external libraries

2011-06-28 Thread Jeffrey Pfau
Currently, WebCore uses libxml2, or, if available, QtXml to parse incoming XML. However, QtXml isn't always available, and using libxml2 exposes its own share of problems. As such, I'm undertaking writing an XML parser that uses no external libraries. The first step to doing this is to add a ne