On Wed, Jun 29, 2011 at 7:18 AM, <par...@paroga.com> wrote:
> On Wed, 29 Jun 2011 06:55:57 -0700, Alex Milowski <a...@milowski.org>
> wrote:
>> I know the parser's speed is terrible as I've measured it recently.
>> This is partially due to some of the things we are doing to deal with
>> Unicode decoding to work around libxml2 issues. I think moving to
>> native strings and decoding would improve the speed by a huge amount.
>> It would be well worth it to some to fix this.
>
> With the same UTF-8 content the libxml2 parser is _faster_ than our HTML
> parser:
> https://bugs.webkit.org/show_bug.cgi?id=52036#c1
>
> I don't think that there is a huge difference between the HTML and XML
> parser, so comparing should be ok in this case.
>
> After my (simple) performance tests I still think that parsing UTF-8 is
> better than UTF-16, since it usually has only half of the memory size.
>
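
As a rough illustration of the memory-size point quoted above, here is a
minimal Python sketch that compares the encoded size of a string in UTF-8 and
UTF-16. It measures nothing about WebKit or libxml2; the helper name
`encoded_sizes` and the sample strings are made up for this example. It only
shows why "half the size" holds for ASCII-heavy markup and why it is "usually"
rather than "always".

```python
def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for text; hypothetical helper."""
    utf8_bytes = len(text.encode("utf-8"))
    # "utf-16-le" is used so that no byte-order mark is counted.
    utf16_bytes = len(text.encode("utf-16-le"))
    return utf8_bytes, utf16_bytes


# ASCII-only markup: one byte per character in UTF-8, two in UTF-16.
markup = '<item id="1"><name>example</name></item>'
print(encoded_sizes(markup))

# CJK text: three bytes per character in UTF-8, two in UTF-16,
# so the advantage reverses for content like this.
print(encoded_sizes("日本語"))
```

For typical XML or HTML, where tags and attribute names are ASCII, the first
case dominates.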
I should test your patch against the speed tests I used. I'll try to get
to that soon.

It is unclear to me how this relates to the original reasons why we
decode, recode, and then decode again to work around issues with libxml2.

-- 
--Alex Milowski
"The excellence of grammar as a guide is proportional to the paucity of the
inflexions, i.e. to the degree of analysis effected by the language
considered."

Bertrand Russell in a footnote of Principles of Mathematics
_______________________________________________
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev