Maciej,

*I* deemed a character type template for the HTMLTokenizer to be unwieldy.
Given the existing SegmentedString input abstraction, it made logical sense
to put the 8/16-bit handling there.  Had I moved the 8/16-bit logic into the
tokenizer itself, we might have needed 8->16 up-conversions whenever a
SegmentedString contained substrings of mixed bit-ness.  Even if that weren't
the case, the patch would have been far larger and would likely have included
tricky code for escapes.
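
To illustrate the trade-off described above, here is a minimal sketch, not
WebKit's actual code.  All names in it (Segment, SegmentedInput,
countTagOpens) are hypothetical, and it assumes the 8-bit data is Latin-1.
The input abstraction remembers the width of each substring and widens one
character at a time on read, so the consumer needs no character type template
and mixed-width input never requires converting a whole substring to 16-bit.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>

// Hypothetical stand-in for one substring of the input; it is either
// 8-bit (assumed Latin-1) or 16-bit, and remembers which.
class Segment {
public:
    explicit Segment(std::string s) : m_is8Bit(true), m_chars8(std::move(s)) {}
    explicit Segment(std::u16string s) : m_is8Bit(false), m_chars16(std::move(s)) {}

    std::size_t length() const { return m_is8Bit ? m_chars8.size() : m_chars16.size(); }

    // Widen a single character on read; the stored data is never converted.
    char16_t at(std::size_t i) const
    {
        if (m_is8Bit)
            return static_cast<char16_t>(static_cast<unsigned char>(m_chars8[i]));
        return m_chars16[i];
    }

private:
    bool m_is8Bit;
    std::string m_chars8;
    std::u16string m_chars16;
};

// Hypothetical input abstraction: a queue of segments of mixed width that
// always hands the consumer a 16-bit code unit.
class SegmentedInput {
public:
    void append(Segment segment)
    {
        if (segment.length())
            m_segments.push_back(std::move(segment));
    }

    bool isEmpty() const { return m_segments.empty(); }

    char16_t currentChar() const
    {
        assert(!isEmpty());
        return m_segments.front().at(m_offset);
    }

    void advance()
    {
        assert(!isEmpty());
        if (++m_offset == m_segments.front().length()) {
            m_segments.pop_front();
            m_offset = 0;
        }
    }

private:
    std::deque<Segment> m_segments;
    std::size_t m_offset = 0;
};

// A non-templated consumer standing in for a tokenizer: it counts '<'
// characters without knowing the width of the underlying data.
std::size_t countTagOpens(SegmentedInput& input)
{
    std::size_t count = 0;
    for (; !input.isEmpty(); input.advance()) {
        if (input.currentChar() == u'<')
            ++count;
    }
    return count;
}
```

The cost of this shape is a width check on every character read; a real
implementation would presumably work to avoid or amortize that branch.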

As I got into the middle of the 8-bit string work, I realized that not only
could I keep performance parity, but some of the techniques I came up with
offered a good performance improvement.  The HTMLTokenizer ended up being one
of those cases.  This patch required a couple of reworks for performance
reasons and garnered a lot of discussion from various parts of the WebKit
community.  See https://bugs.webkit.org/show_bug.cgi?id=90321 for the trail.
Ryosuke noted that this patch was responsible for a 24% improvement in the
url-parser test on their bots (comment 47).  My final performance results are
in comment 43 and show a progression of between 1% and 9% on the various HTML
parser tests.

Adam, if you believe there is more work to be done in the HTMLTokenizer,
please file a bug and cc me.  I'm interested in hearing your thoughts.

- Michael

On Mar 9, 2013, at 4:24 PM, Maciej Stachowiak <m...@apple.com> wrote:

> 
> On Mar 9, 2013, at 3:05 PM, Adam Barth <aba...@webkit.org> wrote:
>> 
>> In retrospect, I think what I was reacting to was msaboff statement
>> that an unnamed group of people had decided that the HTML tokenizer
>> was too unwieldy to have a dedicated 8-bit path.  In particular, it's
>> unclear to me who made that decision.  I certainly do not consider the
>> matter decided.
> 
> It would be good to find out who it was that said that (or more specifically: 
> "Using a character type template approach was deemed to be too unwieldy for 
> the HTML tokenizer.") so you can talk to them about it.
> 
> Michael?
> 
> Regards,
> Maciej
> 

_______________________________________________
webkit-dev mailing list
webkit-dev@lists.webkit.org
https://lists.webkit.org/mailman/listinfo/webkit-dev
