Hi Vincent,

Thank you for this very detailed explanation of the issue. I need to think 
about this for a bit.

It seems like the WK2 infrastructure should suffer from this same issue. 
Perhaps if we revived WK2 on Windows it would provide a better match to this 
situation.

Could you please do me a favor: copy the text of your e-mail and paste it into 
a new bug at https://bugs.webkit.org and cc me and Darin on this? Just file it 
under "WebKit Misc".

Thanks!

-Brent

Sent from my iPad

On Sep 15, 2013, at 8:53 AM, "Van Den Berghe, Vincent" 
<[email protected]> wrote:

> Hello again,
>  
> (for Brent Fulgham: codeword “onion”, I repeat, codeword “onion” J)
>  
> Context
> Consider embedding WebKit in an ASP.NET website (even though what follows is 
> valid in other scenarios). In order to do this, you need a (managed) thread 
> with a message loop. If you want this to work reliably, you make sure all 
> WebKit’s COM objects are created on that thread. If you chose not to, what 
> follows is still valid.
> Note that this thread doesn’t need to be the main process thread: in the case 
> of a worker process of a web server, you don’t even have access to that. For 
> WebKit, it just needs to be “a” thread with a message loop, as long as it’s 
> the same one.
>  
> When you instantiate a Webkit COM object for the first time, Windows’ COM 
> will ultimately call LoadLibrary on WebKit.dll. It will do so on the thread 
> that wants to instantiate the COM object. This initializes the CRT and 
> triggers a call to DllMain with the argument ul_reason_for_call set to 
> DLL_PROCESS_ATTACH.
> When the last reference to the last WebKit object is released, COM will NOT 
> call FreeLibrary(). COM doesn’t know the concept of “last object”, and will 
> keep the DLL around to avoid the overhead of loading it again should another 
> instance be needed.
> By default, your unmanaged DLL will remain loaded until the process is 
> terminated. The main process thread will then call FreeLibrary(), prompting 
> indirectly another call to DllMain with the argument ul_reason_for_call 
> hanksset to DLL_PROCESS_DETACH.
> The main thing to take away from this, is that the thread executing 
> DLL_PROCESS_ATTACH code and the thread executing DLL_PROCESS_DETACH code will 
> not necessarily be the same. In the above scenario, they will never be the 
> same.
> For those who prefer the horse’s mouth: http://msdn.microsoft.com/en-ure 
> us/library/windows/desktop/ms682583(v=vs.85).aspxus 
>  
> AtomicString
> Consider the following code:
>  
> LONG FontPlatformData::adjustedGDIFontWeight(LONG gdiFontWeight, const 
> String& family)
> {
>     static AtomicString lucidaStr("Lucida Grande");
>     if (equalIgnoringCase(family, lucidaStr)) {
>         if (gdiFontWeight == FW_NORMAL)
>             return FW_MEDIUM;
>         if (gdiFontWeight == FW_BOLD)
>             return FW_SEMIBOLD;
>     }
>     return gdiFontWeight;
>  
>  
> This is a method (one of many examples) with a static AtomicString object. 
> The constructor of lucidaStr will be executed in a thread-safe way (exactly 
> once) by the first thread that will enter this method. The C++ compiler will 
> generate code that will make sure of it, at least if the compiler obeys 
> C++0x, which mandates this behavior.
> Code will also be executed that registers the destructor  of lucidaStr so 
> that it is executed when the program terminates. The mechanism is similar to 
> functions registered with atexit(). Destruction order will be reverse 
> construction order, as expected.
> This code being in a DLL, it means that the destructor will be called when 
> the DLL is unloaded: after the the DLL_PROCESS_DETACH code of DllMain, the 
> CRT will do so.
> Given the context described above, it means that the destructor of an 
> AtomicString will be executed by a different thread than the one that 
> executed its constructor.
>  
> When an AtomicString is destroyed, the following method is executed:
>  
> void AtomicString::remove(StringImpl* string)
> {
>     ASSERT(string->isAtomic());
>     AtomicStringTableLocker locker;
>     HashSet<StringImpl*>& atomicStringTable = stringTable();
>     HashSet<StringImpl*>::iterator iterator = atomicStringTable.find(string);
>     ASSERT_WITH_MESSAGE(iterator != atomicStringTable.end(), "The string 
> being removed is atomic in the string table of an other thread!");
>     atomicStringTable.remove(iterator);
> }
>  
>  
> This will fail on the assertion, since the string was created on the string 
> table of a different thread.
> The process will therefore crash on termination. Every time.
>  
> Why this is relevant
> One can argue that for “eternal’ processes, this is not relevant. 
> Unfortunately, this is wrong. For example on Windows, IIS worker processes 
> can be periodically recycled for legitimate reasons (elapsed or scheduled 
> time, number of requests, memory usage or on demand). Every legitimate 
> requests will log a process crash, which is bad form (to say the least).
>  
> Workaround
> The workaround I’ve applied was to change the ASSERT_WITH_MESSAGE and the 
> line below it in the above method to this:
>  
>     if(iterator != atomicStringTable.end())
>       atomicStringTable.remove(iterator);
>  
> The workaround sucks because is never a good idea to eliminate a perfectly 
> valid an assertion just for taking care of a corner case. Unfortunately, this 
> corner case was systematic crashing behavior, which is worse.
> I was afraid there were other classes whose static instances are so sensitive 
> to threading, but it seems that AtomicString is the only one. Whew!
> In my defense, I’ve seen worse. I’ve seen .NET wrappers for WebKit creating a 
> static instance of WebView and call various methods in order to initialize 
> the variables on the main thread and avoid the crash. Needless  to say, these 
> workarounds don’t work.
>  
> Theoretical but impractical workaround 1
> You can convince COM to unload unused libraries with a call to 
> CoFreeUnusedLibraries(). This will trigger a call to DllCanUnloadNow (which 
> is implemented by WebKit in WebKitDLL.cpp) and if that function returns S_OK 
> (for WebKit this means: when there are no instances and no locks left), the 
> DLL will be unloaded. Unfortunately, the question remains on which thread to 
> call CoFreeUnusedLibraries? If you take the “multithreaded route” 
> instantiating your objects, there is no right answer to that question, since 
> different static atomic strings inside methods can be initialized on demand 
> by different threads.
> On .NET, there is also the problem of termination detection: usually, your 
> managed “webkit thread” will be a background thread to avoid blocking the 
> termination of the worker process. This means that the thread can be 
> terminated without notice, and hence there is no good place to put 
> termination code.
>  
> Theoretical but impractical workaround 2
> You can forego all the built-in COM support, call LoadLibrary/FreeLibrary 
> yourself., and handle all instantiating/calling/marshalling in .NET. Tricky 
> and difficult, but not impossible.
> But then, what’s the point of providing COM support? And besides, calling 
> FreeLibrary is as impractical as calling CoFreeUnusedLibraries() so you’re 
> back to square one.
>  
> AtomicString (again)
> Unless I’m mistaken, an AtomicString instance is an immutable object, at 
> least conceptually.  It makes a lot of sense for immutable objects (like the 
> String class in managed environments Java and .NET) to be thread safe.
> This is not the case for AtomicString because of some (thread) affinity to a 
> thread-specific string table. This design doesn’t work for static instances. 
> For those instances, shouldn’t AtomicString reference a global and 
> thread-neutral string table? Perhaps via a special constructor to be used 
> only in those instances?
> I’m new here, so maybe the discussion is moot and there is some very good 
> reason why things are designed the way they are. I just don’t see it.  
> Explanations are welcome.
> The closest design I can compare it with is string interning 
> (http://en.wikipedia.org/wiki/String_interning). But in managed languages, 
> there’s one big (thread-neutral and thread-safe) string table, not one string 
> table per thread.
>  
> This is one of those cases where I feel WebKit’s design conspires against 
> you. Well, it certainly did against me. Now excuse me while I get my 
> handkerchief.
>  
>  
> Vincent
>  
> _______________________________________________
> webkit-dev mailing list
> [email protected]
> https://lists.webkit.org/mailman/listinfo/webkit-dev
_______________________________________________
webkit-dev mailing list
[email protected]
https://lists.webkit.org/mailman/listinfo/webkit-dev

Reply via email to