Re: [webkit-dev] Skipping Flakey Tests

2009-10-01 Thread Drew Wilson
I wanted to re-open this discussion with some real-world feedback.
In this case, there was a failure in one of the layout tests on the Windows
platform, so, following the advice below, aroben correctly checked in an
update to the test expectations instead of skipping the test.

Downstream, this busted the Chromium tests, because that failure was not
happening in Chromium, and now our correct test output doesn't match the
incorrect test output that's been codified in the test expectations. We can
certainly manage this downstream by rebaselining the test and maintaining a
custom Chromium test expectation, but that's a pain and is somewhat fragile
as it requires maintenance every time someone adds a new test case to the
test.

I'd really like to suggest that we skip broken tests rather than codify
their breakages in the expectations file. Perhaps we'd make exceptions to
this rule for tests that have a bunch of working test cases (in which case
there's value in running the other test cases instead of skipping the entire
test). But in general it's less work for everyone just to skip broken tests.

I don't have an opinion about flakey tests, but flat-out-busted tests should
get skipped. Any thoughts/objections?

-atw

On Fri, Sep 25, 2009 at 1:59 PM, Darin Adler da...@apple.com wrote:

 Green buildbots have a lot of value.

 I think it’s worthwhile finding a way to have them even when there are test
 failures.

 For predictable failures, the best approach is to land the expected failure
 as an expected result, and use a bug to track the fact that it’s wrong. To
 me this does seem a bit like “sweeping something under the rug”: a bug
 report is much easier to overlook than a red buildbot. We don’t have a great
 system for keeping track of the most important bugs.

 For tests that give intermittent and inconsistent results, the best we can
 currently do is to skip the test. I think it would make sense to instead
 allow multiple expected results. I gather that one of the tools used in the
 Chromium project has this concept and I think there’s no real reason not to
 add the concept to run-webkit-tests as long as we are conscientious about
 not using it when it’s not needed. And use a bug to track the fact that the
 test gives inconsistent results. This has the same downsides as landing the
 expected failure results.

 For tests that have an adverse effect on other tests, the best we can
 currently do is to skip the test.

 I think we are overusing the Skipped machinery at the moment for platform
 differences. I think in many cases it would be better to instead land an
 expected failure result. On the other hand, one really great thing about the
 Skipped file is that there’s a complete list in the file, allowing everyone
 to see the list. It makes a good to-do list, probably better than just a
 list of bugs. This made Darin Fisher’s recent “why are so many tests
 skipped, let’s fix it” message possible.

-- Darin




Re: [webkit-dev] Skipping Flakey Tests

2009-10-01 Thread Dirk Pranke
On Thu, Oct 1, 2009 at 11:47 AM, Darin Adler da...@apple.com wrote:
 On Oct 1, 2009, at 11:41 AM, Drew Wilson wrote:

 I don't have an opinion about flakey tests, but flat-out-busted tests
 should get skipped. Any thoughts/objections?

 I object.

 If a test fails on some platforms and succeeds on others, we should have the
 success result checked in as the default case, and the failure as an
 exception. And we should structure test results and exceptions so that it’s
 easy to get the expected failure on the right platforms and success on
 others. Your story about a slight inconvenience because a test failed on the
 base Windows WebKit but succeeded on the Chromium WebKit does not seem like
 a reason to change this!

 Skipping the test does not seem like a good thing to do for the long term
 health of the project. It is good to exercise all the other code each test
 covers and also to notice when a test result gets even worse or gets better
 when a seemingly unrelated change is made.

 I think we should skip only tests that endanger the testing strategy because
 they are super-slow, crash, or adversely affect other tests in some way.


I agree that skipping the test is the wrong thing to do. However,
checking in an incorrect baseline over the correct baseline is also
the wrong thing to do (because, as Drew points out, this can break
other platforms that don't have the bug).

Chromium does have the concept of marking tests as expected to FAIL,
but it does not have a way to capture what the expected failure is
(i.e., there is no way to capture a FAIL baseline). We discussed
this recently and punted on it because it was unclear how useful this
would really be, and -- as we all probably agree -- it's better not to
have failing tests in the first place.

Eric and Dimitry have suggested that we look into pulling the Chromium
expectations framework upstream into WebKit and adding the features
that WebKit's framework has that Chromium's doesn't. It sounds to me
like this might be the right long-term solution, and I'd be happy to
work on it.

In the meantime, maybe it makes sense to add Fail files alongside the
Skipped files? That would allow the bots to stay green, but would at
least keep the tests running.
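
Just to illustrate the idea (the "Fail" file name, its location, and any
run-webkit-tests support for it are hypothetical at this point), such a file
could mirror the one-test-per-line format of the existing Skipped files,
e.g. a LayoutTests/platform/win/Fail containing something like:

    # Hypothetical entry: fails on Windows only; tracked by a bug.
    # https://bugs.webkit.org/show_bug.cgi?id=XXXXX
    fast/forms/example-test.html

Unlike landing a failing -expected.txt, this wouldn't codify the wrong
output as the baseline that other platforms inherit.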

-- Dirk


Re: [webkit-dev] Skipping Flakey Tests

2009-10-01 Thread Eric Seidel
I agree with Darin.  I don't think that this is a good example of
where skipping would be useful.

I think what you're really identifying here is a test hierarchy problem.
Chromium really wants to base its tests off of some base "win"
implementation, from which "win-apple", "win-chromium", and "win-cairo"
results could derive, similar to how "mac-leopard", "mac-tiger", and
"mac-snowleopard" derive from "mac".

 I think we should skip only tests that endanger the testing strategy because
 they are super-slow, crash, or adversely affect other tests in some way.

Back to the original topic:  I do however see flakey tests as
endangering our testing strategy because they provide false
negatives, and greatly reduce the value of the layout tests and things
which run the layout tests, like the buildbots or the commit-bot.

I also agree with Darin's earlier comment that WebKit needs something
like Chromium's multiple-expected results support so that we can
continue to run flakey tests, even if they're flakey instead of having
to resort to skipping them.  But for now, skipping is the best we
have, and I still encourage us to use it when necessary instead of
leaving layout tests flakey. :)

-eric

On Thu, Oct 1, 2009 at 11:47 AM, Darin Adler da...@apple.com wrote:
 On Oct 1, 2009, at 11:41 AM, Drew Wilson wrote:

 I don't have an opinion about flakey tests, but flat-out-busted tests
 should get skipped. Any thoughts/objections?

 I object.

 If a test fails on some platforms and succeeds on others, we should have the
 success result checked in as the default case, and the failure as an
 exception. And we should structure test results and exceptions so that it’s
 easy to get the expected failure on the right platforms and success on
 others. Your story about a slight inconvenience because a test failed on the
 base Windows WebKit but succeeded on the Chromium WebKit does not seem like
 a reason to change this!

 Skipping the test does not seem like a good thing to do for the long term
 health of the project. It is good to exercise all the other code each test
 covers and also to notice when a test result gets even worse or gets better
 when a seemingly unrelated change is made.

 I think we should skip only tests that endanger the testing strategy because
 they are super-slow, crash, or adversely affect other tests in some way.

    -- Darin



Re: [webkit-dev] Skipping Flakey Tests

2009-10-01 Thread Darin Adler

On Oct 1, 2009, at 11:58 AM, Eric Seidel wrote:

 I think what you're really identifying here is a test hierarchy problem.
 Chromium really wants to base its tests off of some base "win"
 implementation, from which "win-apple", "win-chromium", and "win-cairo"
 results could derive, similar to how "mac-leopard", "mac-tiger", and
 "mac-snowleopard" derive from "mac".


Something like that would be excellent if this pattern turns up often.  
I don’t think we should make the change because of one test, but if it  
comes up a lot we definitely should.


 Back to the original topic:  I do however see flakey tests as
 endangering our testing strategy because they provide false
 negatives, and greatly reduce the value of the layout tests and things
 which run the layout tests, like the buildbots or the commit-bot.

 I also agree with Darin's earlier comment that WebKit needs something
 like Chromium's multiple-expected results support so that we can
 continue to run flakey tests, even if they're flakey instead of having
 to resort to skipping them.  But for now, skipping is the best we
 have, and I still encourage us to use it when necessary instead of
 leaving layout tests flakey. :)


I agree on all of this.

Except that the two specific flakey tests that got us started on this
discussion were really serious bugs, and it was really good to fix them
rather than skip them. After this experience, I now do share Alexey’s fear
that if we had skipped them, we would not have fixed the regression. Best,
if possible, would have been to notice when they turned from reliable tests
into flakey tests and to roll out the change that made them flakey.


-- Darin



Re: [webkit-dev] Skipping Flakey Tests

2009-10-01 Thread Drew Wilson
OK, I agree as well - skipping is not a good solution here. I don't think
the status quo is perfect, but it's probably not imperfect enough to do
anything about :)
I guess there's just a process wrinkle we need to address on the Chromium
side. It's easy to rebaseline a test in Chromium, but less easy to figure
out when it's safe to un-rebaseline it.
-atw

On Thu, Oct 1, 2009 at 11:57 AM, Eric Seidel esei...@google.com wrote:

 I agree with Darin.  I don't think that this is a good example of
 where skipping would be useful.

 I think what you're really identifying here is a test hierarchy problem.
 Chromium really wants to base its tests off of some base "win"
 implementation, from which "win-apple", "win-chromium", and "win-cairo"
 results could derive, similar to how "mac-leopard", "mac-tiger", and
 "mac-snowleopard" derive from "mac".

  I think we should skip only tests that endanger the testing strategy because
  they are super-slow, crash, or adversely affect other tests in some way.

 Back to the original topic:  I do however see flakey tests as
 endangering our testing strategy because they provide false
 negatives, and greatly reduce the value of the layout tests and things
 which run the layout tests, like the buildbots or the commit-bot.

 I also agree with Darin's earlier comment that WebKit needs something
 like Chromium's multiple-expected results support so that we can
 continue to run flakey tests, even if they're flakey instead of having
 to resort to skipping them.  But for now, skipping is the best we
 have, and I still encourage us to use it when necessary instead of
 leaving layout tests flakey. :)

 -eric



[webkit-dev] Purging as much memory as possible

2009-10-01 Thread Peter Kasting
In Chromium, we have various events that we'd like to respond to by freeing
as much memory as possible.  (One example is system sleep, where we'd like
to dump memory before sleeping to avoid having to page it back in after
waking.)

I'm trying to find what areas in WebCore are good candidates for this kind
of work.  So far I've found WebCore::Cache, which is a singleton per
process, on which I can call setCapacity(..., ..., 0) to try and flush as
much as possible.

It's also been suggested to me that the Glyph cache is a good candidate; I
haven't quite figured out how this works, although I have found mitz'
recently-added showGlyphPageTree() function that I can probably use to do
some investigation.

My questions are:
* I notice that even when I set the WebCore::Cache capacity to zero, I can't
necessarily dump _everything_ out of it.  Is there some other set of calls I
should make to drop more references somewhere?
* Does anyone already know the likely footprint of the Glyph cache, or how
to clear it?
* Are there other obvious places to look?  Certainly the JS engine can hold
a lot of memory, but that's outside the scope of this question; I'm just
looking in WebCore.  Any other caches I can dump?

Thanks,
PK


Re: [webkit-dev] Purging as much memory as possible

2009-10-01 Thread Dan Bernstein


On Oct 1, 2009, at 6:24 PM, Peter Kasting wrote:

 * Does anyone already know the likely footprint of the Glyph cache, or how
 to clear it?


FontCache::purgeInactiveFontData().

If you’re using Safari then you can see Font and Glyph Caches  
statistics in its Caches window, which also includes a Purge Inactive  
Font Data button that calls through to purgeInactiveFontData().
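
Putting the two together, here is a rough sketch of what a purge-everything
helper might look like. This is only a sketch: it assumes WebCore's global
cache() and fontCache() accessors, Cache::setCapacities(minDeadBytes,
maxDeadBytes, totalBytes) (the call referred to above as setCapacity), and
FontCache::purgeInactiveFontData(); the helper name is made up, and exact
headers and signatures may differ by revision.

    #include "Cache.h"
    #include "FontCache.h"

    namespace WebCore {

    // Hypothetical helper -- not an existing WebCore or Chromium API.
    static void releaseAsMuchMemoryAsPossible()
    {
        // Zero the resource cache's dead and total capacities so everything
        // evictable gets pruned. Resources still referenced by live
        // documents cannot be evicted, which is likely why the cache never
        // empties completely.
        cache()->setCapacities(0, 0, 0);

        // Release glyph pages and platform font data that are no longer in
        // use (the same call Safari's "Purge Inactive Font Data" button
        // ends up making).
        fontCache()->purgeInactiveFontData();
    }

    } // namespace WebCore

Presumably the embedder would want to restore the original cache capacities
once the purge is done, rather than leaving the cache disabled.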
