On Jan 7, 2008, at 1:04 PM, Jakob Praher wrote:

just out of curiosity, I would like to ask why you decided to implement your own container structures, like Vector or HashTable/Map/Set ...

What was your driving force?

We didn't make a blanket decision to implement our own container objects. We decided separately about each one.

For HashMap and HashSet, there was no suitable standard library version available. The details of the hash algorithms are carefully tuned, as is the way we can store a RefPtr in a hash table with minimal overhead. The hash-based collections from the standard C++ library are still not present in all the compilers we need to support, and even if they were, I believe they'd be insufficient.

For Vector, one of the reasons was that WTF::Vector has a feature where it uses the vector object itself to store an initial fixed capacity. We use this in contexts where we have a variable sized object but don't want to do any memory allocation unless it exceeds the fixed size.

The standard C++ library std::vector and std::hash_map also rely on C++ exceptions, and our entire project works with a limited dialect of C++ that doesn't use RTTI or exceptions.

Note that we do use the standard C++ library functions such as std::sort in a number of places.

In addition, why did you choose to make the string internal representation (UChar) 2 bytes wide? Isn't it the case that most websites are encoded in UTF-8/Latin-1?


It's true that most websites are encoded in Latin-1 (although it's the Windows variant with different meanings for 0x80-0x9F). And many modern websites are encoded in UTF-8. Note, though, that those are two different encodings; the internal encoding couldn't be Latin-1 because it can't cover all the Unicode characters. So the candidate for internal encoding is UTF-8.

There are multiple reasons we chose UTF-16 over UTF-8.

One "reason" is that the KHTML code base was already using UTF-16 when we started the WebKit project.

Another reason is that the JavaScript language gets at the DOM with JavaScript strings, and all JavaScript string operations are defined in terms of UTF-16 code units. If things were stored as UTF-8, they'd have to be converted back and forth from UTF-16. Or we could change JavaScript to use UTF-8, but then many JavaScript string operations would require scanning from the beginning of the string to count UTF-16 code units.

I'm sure the reasons I list here are not all the reasons for any of these decisions.

The theme seems to be performance.

    -- Darin

_______________________________________________
webkit-dev mailing list
[email protected]
http://lists.webkit.org/mailman/listinfo/webkit-dev
