Re: [webkit-dev] TimerWindowWndProc and dangling page instances

Van Den Berghe, Vincent Fri, 13 Sep 2013 01:35:25 -0700

Hello Mark,

Thank you for your tips. I am painfully aware of #1 and have compensated for it 
in much the same way you have. Tip #2 doesn't really apply to my scenario.
I'm embedding the Wincairo build of WebKit to generate PDF files from web pages 
in an ASP.NET application. Webkit isn't really designed for such a scenario but 
with some careful patching, a solution can be constructed which is "reasonably 
reliable".   However, these patches touch on WebKit's fundamental design 
decisions and I'm reluctant to submit them without first some validation of 
someone who knows the code better.   Some are more hacks than patches.
So I became a member of the webkit-dev list and submitted the result of my last 
frustrating bug hunt to see what happens. And it looks like it's not totally 
worthless.  As an added bonus, I can show it to my manager to explain why I was 
doing these last few days and why I was in such a foul disposition. Yay!

If people are interested, I'll post the rest of my experiences here, with the 
solutions I came up with. My employer's needs have priority, however.

Vincent

From: Salisbury, Mark [mailto:mark.salisb...@hp.com]
Sent: Friday, September 13, 2013 10:07 AM
To: Van Den Berghe, Vincent; webkit-dev@lists.webkit.org
Subject: RE: TimerWindowWndProc and dangling page instances

Vincent,

I've also embedded WebKit in a .NET application, and I've seen similar issues - 
not exactly what you're seeing though.

Maybe some tips I've learned could be of help to you or others:

*         Calling DestroyWindow will cause WebView::close() to be called (when 
WM_DESTROY is received), but it does not cause WebView's destructor to be 
called, that doesn't happen until the RCW holding the last COM reference is 
destroyed and the COM ref count goes to 0.  This happens on the GC thread (.NET 
internals), and deleting core webkit objects from a thread besides the main 
thread can lead to undefined behavior.  I'd see crashes from concurrent 
javascript GCs.  I fixed this by marshaling the delete call back to the main 
thread (in every single COM exposed class).

*         WebView's destructor calls DestroyWindow() on a tool tip window which 
may or may not have been destroyed (at least on WinCE).  When embedding in a 
.NET application, there can be a decent gap between the time the WebView's 
window is destroyed and the destructor is called.  It's possible that in the 
instances when Windows destroyed the tool tip window (because it is a child of 
the webview window) that the tool tip handle is assigned to another window by 
the time WebView's destructor is called.  You can guess what happens then when 
WebView's destructor deletes the tool tip window.  Windows in the system 
disappear - sometimes causing a crash.  The only fix I could figure out what to 
check to see if the window is still a tool tip window before deleting it.

I don't know if anyone wants to see a patch land for either of these problems, 
they're confined to embedded WebKit on Windows under .NET...

Mark

From: 
webkit-dev-boun...@lists.webkit.org<mailto:webkit-dev-boun...@lists.webkit.org> 
[mailto:webkit-dev-boun...@lists.webkit.org] On Behalf Of Van Den Berghe, 
Vincent
Sent: Friday, September 13, 2013 1:43 AM
To: webkit-dev@lists.webkit.org<mailto:webkit-dev@lists.webkit.org>
Subject: [webkit-dev] TimerWindowWndProc and dangling page instances

Hello,

Background
---------------
I'm currently embedding WebKit in a .NET application, which required me to 
solve a whole set of issues that are not relevant here. However, I've stumbled 
upon a scenario which I suspect is *really* problematic (but I can be wrong). 
Before patching anything, I'd like to submit this to greater minds for 
validation. The version of WebKit to which this applies is irrelevant: any 
recent version will do.

Problem
----------
When I'm finish with a WebView instance and before I'm releasing the COM 
object, I close it by doing a

DestroyWindow(m_viewWindow)

...where m_viewWindow is WebView's view window. Note that calling 
WebView::close() would work just as well.
This triggers a sequence of events that ultimately causes the destructor of the 
page to be called:

Page::~Page()

This works well, but when stress testing this will crash the WebKit.DLL with 
the following stack trace:

                WebKit.dll!WebCore::setImageLoadingSettings(WebCore::Page * 
page)  Line 57 + 0x3 bytes      C++

WebKit.dll!WebCore::Settings::imageLoadingSettingsTimerFired(WebCore::Timer<WebCore::Settings>
 * __formal)  Line 363 + 0x8 bytes C++

WebKit.dll!WebCore::Timer<WebCore::XMLHttpRequestProgressEventThrottle>::fired()
  Line 114 + 0xb bytes C++
               WebKit.dll!WebCore::ThreadTimers::sharedTimerFiredInternal()  
Line 132           C++
               WebKit.dll!WebCore::TimerWindowWndProc(HWND__ * hWnd, unsigned 
int message, unsigned int wParam, long lParam)  Line 111              C++

Depending on the managed environment (details are irrelevant for this 
discussion), the crash happens either immediately or after a while. Because of 
the way I embed WebKit, this usually happens when the second (or third, 
forth...) instance is being created.
The visible cause of the crash is the call to

WebKit.dll!WebCore::setImageLoadingSettings(WebCore::Page * page)

... with a page instance for which its destructor has already been executed.

Primary Cause
-----------------
I suspect the primary cause of a crash is the pending timer whose message is 
still in the message queue, that is executed just after the destruction of the 
page, but before any pending timers are killed. This causes a delayed execution 
of a function on a dangling page instance.
The problem doesn't occur when I don't call DestroyWindow or WebView::Close(), 
but this leads to unacceptable memory leaks.

Workaround
---------------
The minimal (but far from elegant) workaround is to add a method to 
WebCore::Settings:

        void clearPage()
        {
          m_page = 0;
        }

And call this method as the first line of Page's destructor:

Page::~Page()
{
    m_settings->clearPage();
    m_mainFrame->setView(0);
...

...and finally add a if(page) guard in setImageLoadingSettings like so:

static void setImageLoadingSettings(Page* page)
{
    if(page)
    for (Frame* frame = &page->mainFrame(); frame; frame = 
frame->tree().traverseNext()) {

frame->document()->cachedResourceLoader()->setImagesEnabled(page->settings().areImagesEnabled());

frame->document()->cachedResourceLoader()->setAutoLoadImages(page->settings().loadsImagesAutomatically());
    }
}

This doesn't remedy the delayed execution of the timer, but solves the problem 
by setting the deleted instance to NULL and making sure it's not used in the 
delayed call.

The questions of course are:

-          Is this behavior by design?

-          If it is, is there a way to clean up WebKit instances without 
causing this crashing behavior, and if so which one?

-          If it isn't, is there a more elegant workaround than the one 
outlined here?

Thanks for any feedback,

Vincent

_______________________________________________
webkit-dev mailing list
webkit-dev@lists.webkit.org
https://lists.webkit.org/mailman/listinfo/webkit-dev

Re: [webkit-dev] TimerWindowWndProc and dangling page instances

Reply via email to