Re: [webkit-dev] DRT/WTR should clear the cache at the beginning of each test?

2012-11-02 Thread Antti Koivisto
On Wed, Oct 31, 2012 at 12:05 AM, Alexey Proskuryakov a...@apple.com wrote:

 This will mean that cache is always almost empty, and all resources in it
 are extremely fresh. I don't know if this would provide substantial
 additional test coverage over cleaning the cache all the time, or just
 completely disabling it in WebKitTestRunner.


Certain areas of coverage would improve. The code paths taken when a
resource is restored from the memory cache can be quite different from the
usual loading. Many operations (like script execution) happen synchronously
if the resource is found in the cache. We reuse various decoded forms
(bitmaps, stylesheets, JSC parse structures, likely more in the future).
All data is available in a single chunk. It is possible to write tests that
detect these differences (and it is possible that some tests hit them
accidentally).
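
A minimal sketch of the difference described above, in Python rather than WebKit code. ToyLoader, FAKE_NETWORK and trace_load are hypothetical names invented for illustration; the only point is that a cache hit completes before load() returns and delivers one chunk, while a miss completes later in several chunks.

```python
from collections import deque

# Hypothetical stand-in for the network: URL -> response body.
FAKE_NETWORK = {"script.js": "alert('hi');"}


class ToyLoader:
    """Toy loader: cache hits complete synchronously, misses are queued."""

    def __init__(self):
        self.memory_cache = {}
        self.pending = deque()

    def load(self, url, on_chunk, on_done):
        if url in self.memory_cache:
            # Cached path: all data arrives in one chunk, before load() returns.
            on_chunk(self.memory_cache[url])
            on_done()
            return
        # Uncached path: callbacks run later, when the "network" is pumped.
        self.pending.append((url, on_chunk, on_done))

    def run_pending(self, chunk_size=4):
        while self.pending:
            url, on_chunk, on_done = self.pending.popleft()
            body = FAKE_NETWORK[url]
            for start in range(0, len(body), chunk_size):
                on_chunk(body[start:start + chunk_size])
            self.memory_cache[url] = body
            on_done()


def trace_load(loader, url):
    """Record the order of events around a single load() call."""
    events = []
    loader.load(url,
                lambda chunk: events.append(("chunk", chunk)),
                lambda: events.append(("done",)))
    events.append(("load() returned",))
    loader.run_pending()
    return events
```

Calling trace_load twice on the same ToyLoader gives two different event orders for the same URL, which is the kind of difference a test can detect on purpose or hit by accident.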

We would still lose coverage for things that depend on having lots of
resources around, like cache pruning.


   antti


 - WBR, Alexey Proskuryakov


___
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo/webkit-dev


Re: [webkit-dev] DRT/WTR should clear the cache at the beginning of each test?

2012-11-02 Thread Vyacheslav Ostapenko
On Fri, Nov 2, 2012 at 12:33 PM, Antti Koivisto koivi...@iki.fi wrote:

 On Wed, Oct 31, 2012 at 12:05 AM, Alexey Proskuryakov a...@apple.com wrote:

 This will mean that cache is always almost empty, and all resources in it
 are extremely fresh. I don't know if this would provide substantial
 additional test coverage over cleaning the cache all the time, or just
 completely disabling it in WebKitTestRunner.


 Certain areas of coverage would improve. The code paths taken when a
 resource is restored from the memory cache can be quite different from the
 usual loading. Many operations (like script execution) happen synchronously
 if the resource is found in the cache. We reuse various decoded forms
 (bitmaps, stylesheets, JSC parse structures, likely more in the future).
 All data is available in a single chunk. It is possible to write tests that
 detect these differences (and it is possible that some tests hit them
 accidentally).

 We would still lose coverage for things that depend on having lots of
 resources around, like cache pruning.


In this case, to improve code coverage, all tests should run twice: first
with a clear cache, and then a second time right after that in order to test
the cached case.

Slava


Re: [webkit-dev] DRT/WTR should clear the cache at the beginning of each test?

2012-10-29 Thread Maciej Stachowiak

On Oct 28, 2012, at 3:30 PM, Antti Koivisto koivi...@iki.fi wrote:

 We could clear the cache between tests but run each test twice in a row.
 The second run will then happen with a deterministically pre-populated cache.
 That would both make things more predictable and improve our test coverage
 for cached cases. Unfortunately it would also slow down testing
 significantly, though less than 2x.

I actually really like this idea. Doing it this way would effectively run each 
test both completely uncached, and fully cached, which would be better test 
coverage than our current approach. Can we get an estimate on what this would 
cost if applied to our whole test suite? Could we do it for just a subset of 
the tests?

(BTW I think this is better than the virtual test suite approach suggested by 
Dirk; running the test with all its resources cached from having loaded it 
immediately before is more reliable and better test coverage than running it 
later as part of some sequence that doesn't clear the cache.)

Does anyone strongly object to this approach? It seems way better to me than 
other options discussed on this thread.

Regards,
Maciej


Re: [webkit-dev] DRT/WTR should clear the cache at the beginning of each test?

2012-10-29 Thread Dirk Pranke
On Mon, Oct 29, 2012 at 5:48 AM, Maciej Stachowiak m...@apple.com wrote:

 On Oct 28, 2012, at 10:09 PM, Dirk Pranke dpra...@chromium.org wrote:


 On Sun, Oct 28, 2012 at 6:32 AM, Maciej Stachowiak m...@apple.com wrote:

 I think the nature of loader and cache code is that it's very hard to make 
 tests which always fail deterministically when regressions are introduced, 
 as opposed to randomly. The reason for this is that bugs in these areas are 
 often timing-dependent. I think it's likely this tendency to fail randomly 
 will be the case whether or not the tests are trying to explicitly test the 
 cache or are just incidentally doing so in the course of other things.


 I am not familiar with the loader and caching code in webkit, but I
 know enough about similar problem spaces to be puzzled by why it's
 impossible to write tests that can adequately test the code.

 Has anyone claimed that? I think "impossible to write tests that can
 adequately test the code" is not a position that anyone in this thread has
 taken, certainly not me above.

 My claim is only that many classes of loader and cache bugs, when first 
 introduced, are likely to cause nondeterministic test failures. And further, 
 this is likely to be the case even if tests are written to target that 
 subsystem. That's not the same as saying adequate tests are impossible.

I'm sorry, I didn't mean "impossible" literally. Please strike that,
as it sounds like it has just made a confusing situation worse.

But, you did claim that it would be very hard to make tests that
always fail deterministically, and I don't see why that's true?
Testing things that are timing-dependent only requires that you be able
to control or simulate time. It may be that this is hard to do with
layout tests, but it's pretty straightforward with unit tests that
allow you to control the layers above and below the cache.
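
As a rough illustration of controlling time in a unit test, here is a sketch in Python. FakeClock and ExpiringCache are hypothetical classes made up for this example and are not modeled on WebKit's cache; the assumption is simply that the code under test reads time from an injected clock object instead of the real one.

```python
class FakeClock:
    """Test double for a clock: time only moves when the test says so."""

    def __init__(self, now=0.0):
        self.now = now

    def advance(self, seconds):
        self.now += seconds


class ExpiringCache:
    """Toy cache whose entries expire once they are older than max_age seconds."""

    def __init__(self, clock, max_age):
        self._clock = clock
        self._max_age = max_age
        self._entries = {}

    def put(self, key, value):
        # Remember when the entry was stored, according to the injected clock.
        self._entries[key] = (value, self._clock.now)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, stored_at = entry
        if self._clock.now - stored_at > self._max_age:
            # Entry is stale: drop it and report a miss.
            del self._entries[key]
            return None
        return value
```

With this shape, a test can store an entry, call clock.advance() by exactly the amount that matters, and check the hit or miss, so the expiry boundary is exercised the same way on every run.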

 It just means to have good testing of some areas of the code, we need a good 
 way of dealing with nondeterministic failures.

This is backwards. If you *don't* have good testing, more of your
failures are likely to show up sporadically, which leads you to want
to build tools for them. Randomized testing is a helpful tool to use
*alongside* focused testing to ensure coverage, but should not be used
as a replacement.


 What I personally would most wish for is good tools to catch when a test 
 starts failing nondeterministically, and to identify the revision where the 
 failures began. The reason we hate random failures is that they are hard to 
 track down and diagnose. But some types of bugs are unlikely to manifest in 
 a purely deterministic way. It would be good if we had a reliable and 
 useful way to catch those types of bugs.

 This is a fine idea -- and I'm always happy to talk about ways we can
 improve our test tooling, please feel free to start a separate thread
 on these issues -- but I don't want to lose sight of the main issue
 here.

 I think the problem I identified -- that it's overly hard to track down and 
 diagnose regressions that cause tests to fail only part of the time -- is 
 more important and more fundamental than any of the three problems that you 
 cite below. Our test infrastructure ultimately exists to help us notice and 
 promptly fix regressions, and for some types of regressions, namely those 
 that do not manifest 100% of the time, it is not working so well. The 
 problems you mention are all secondary consequences of that fundamental 
 problem, in my opinion.

First of all, this isn't an either/or situation. We should be capable
of addressing all of these issues in parallel.

Second, I don't see how the existence of bugs in the code, the lack of
test isolation, or the lack of good test coverage for certain layers
of the code follow from not having good tools to triage intermittent
failures? That seems like putting the cart before the horse.

Third, are you familiar with the flakiness dashboard?

http://test-results.appspot.com/dashboards/flakiness_dashboard.html#group=%40ToT%20-%20webkit.org&builder=Apple%20Lion%20Debug%20WK1%20(Tests)

Does it not do exactly what you're describing? Are there things that
you would like added? If it would be helpful for us to have a meeting
or something to help explain how this works, I'm sure we could set one
up.


  - Maciej


 It sounds like we've identified three existing problems - please
 correct me if I'm misstating them:

 1. There appears to be a bug in the caching code that is causing tests
 for other parts of the system to fail randomly.

 2. DRT and WTR on some ports are implemented in a way that is causing
 the system to be more fragile than some of us would like it to be, and
 there doesn't seem to be an a priori need for this to be the case;
 indeed some ports already don't do this.

 3. We don't apparently have dedicated test coverage for caching and
 the loader that people think is good enough, and getting such tests
 might be hard.

 P.S. I do think your problem 

Re: [webkit-dev] DRT/WTR should clear the cache at the beginning of each test?

2012-10-28 Thread Maciej Stachowiak

On Oct 26, 2012, at 11:11 PM, Ryosuke Niwa rn...@webkit.org wrote:

 
 I’m sure Antti, Alexey, and others who have worked on the loader and other 
 parts of WebKit are happy to write those tests or list the kind of things 
 they want to test. Heck, I don’t mind writing those tests if someone could 
 make a list.
 
 I totally sympathize with the sentiment to reduce the test flakiness but 
 loader and cache code have historically been under-tested, and we’ve had a 
 number of bugs detected only by running non-loader tests consecutively.
 
 On the contrary, we’ve had this DRT behavior for ages. Is there any reason we
 can’t wait for another couple of weeks or months until we add more loader &
 cache tests before making the behavior change?

I think the nature of loader and cache code is that it's very hard to make 
tests which always fail deterministically when regressions are introduced, as 
opposed to randomly. The reason for this is that bugs in these areas are often 
timing-dependent. I think it's likely this tendency to fail randomly will be 
the case whether or not the tests are trying to explicitly test the cache or 
are just incidentally doing so in the course of other things.

Unfortunately, it's very tempting when a test is failing randomly to blame the 
test rather than to investigate whether there is an actual regression affecting 
it. And sometimes it really is the test's fault. But sometimes it is a genuine 
bug in the code. 

On the other hand, nondeterministic test failures make it harder to use test 
infrastructure in general.

These are difficult things to reconcile. The original philosophy of WebKit 
tests is to test end-to-end under relatively realistic conditions, but at the 
same time unpredictability makes it hard to stay at zero regressions.

I think making different ports do testing under different conditions makes it 
more likely that some contributors will introduce regressions without noticing, 
leaving it for others to clean up. So it's regrettable if we go that way 
because we are unable to reach consensus. Creating some special opt-in --antti 
mode would be even worse, as it's almost certain that failures would creep into 
a mode that nobody runs.

What I personally would most wish for is good tools to catch when a test starts 
failing nondeterministically, and to identify the revision where the failures 
began. The reason we hate random failures is that they are hard to track down 
and diagnose. But some types of bugs are unlikely to manifest in a purely 
deterministic way. It would be good if we had a reliable and useful way to 
catch those types of bugs.
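
One way such a tool could work, sketched in Python under loud assumptions: run_test(revision) is a hypothetical callback that builds a revision, runs one test once and returns True on a pass; the last revision in the list is known to be flaky; and once the flakiness is introduced it stays. None of this describes an existing WebKit tool.

```python
def fails_at(revision, run_test, attempts):
    """Treat a revision as bad if any one of `attempts` runs fails."""
    return any(not run_test(revision) for _ in range(attempts))


def first_flaky_revision(revisions, run_test, attempts=50):
    """Binary-search for the first revision at which the test can fail.

    Assumes revisions are ordered oldest to newest, that revisions[-1] is
    flaky, and that every revision after the culprit is flaky too.
    """
    lo, hi = 0, len(revisions) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if fails_at(revisions[mid], run_test, attempts):
            hi = mid        # failures already present: culprit is here or earlier
        else:
            lo = mid + 1    # never failed in `attempts` runs: look later
    return revisions[lo]
```

The catch is the choice of attempts: a revision that fails one run in a hundred will look good after fifty clean runs, so the search can land too late. That cost in machine time is part of why such failures are expensive to track down.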

Regards,
Maciej



Re: [webkit-dev] DRT/WTR should clear the cache at the beginning of each test?

2012-10-28 Thread Antti Koivisto
We could clear the cache between tests but run each test twice in a row.
The second run will then happen with a deterministically pre-populated cache.
That would both make things more predictable and improve our test coverage
for cached cases. Unfortunately it would also slow down testing
significantly, though less than 2x.
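
A minimal sketch of that loop in Python. It is not NRWT or DRT code; run_each_test_twice and FakeHarness are hypothetical names, and the harness only tracks which resource names are in its cache so the effect is visible.

```python
def run_each_test_twice(test_names, clear_cache, run_test):
    """Run every test uncached, then immediately again with the cache it filled."""
    results = {}
    for name in test_names:
        clear_cache()               # nothing left over from earlier tests
        uncached = run_test(name)   # first run populates the cache
        cached = run_test(name)     # second run sees exactly that cache
        results[name] = (uncached, cached)
    return results


class FakeHarness:
    """Stand-in for a test runner: remembers which resources are cached."""

    def __init__(self, resources_by_test):
        self.resources_by_test = resources_by_test
        self.cache = set()

    def clear_cache(self):
        self.cache.clear()

    def run_test(self, name):
        needed = self.resources_by_test[name]
        # Report which of this test's resources were served from the cache.
        hits = sorted(r for r in needed if r in self.cache)
        self.cache.update(needed)
        return hits
```

In this toy model the first run of each test never sees a cache hit and the second run always hits every resource, no matter which tests ran before it or in what order.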


  antti


Re: [webkit-dev] DRT/WTR should clear the cache at the beginning of each test?

2012-10-28 Thread Ami Fischman

 We can live in one of two worlds:
 1) LayoutTests that concern themselves with specific network/loading
 concerns need to use unique URLs to refer to static data; or
 2) DRT clears JS-visible state between tests.
 The pros/cons seem clear to me:
 Pro#1: loading/caching code is coincidentally tested by (unknown) tests
 that reuse URLs among themselves.
 Con#1: requires additional cognitive load for all webkit developers; the
 only way to write a test that won't be affected by future addition of
 unrelated tests is to use unique URLs
 Pro#2: principle of least-surprise is maintained; understanding DRT &
 reading a test (and not every other test) is enough to understand its
 behavior
 Con#2: loading/caching code needs to be tested explicitly.
 IMO (Pro#2 + -Con#1) > (Pro#1 + -Con#2).
 Are you saying you believe the inequality goes a different way, or am I
 missing some other feature of your thesis?

 Yes, this is a fair description.


I'm going to assume you mean that yes, you believe the inequality goes the
other way:  (Pro#2 + -Con#1) < (Pro#1 + -Con#2)


 This accidental testing is not something to be neglected


I'm not neglecting it, I'm evaluating its benefit to be less than its cost.

To make concrete the cost/benefit tradeoff, would you add a random sleep()
into DRT execution to detect timing-related bugs?
It seems like a crazy thing to do, to me, but it would certainly catch
timing-related bugs quite effectively.
If you don't think we should do that, can you describe how you're
evaluating cost/benefit in each of the cases and why you arrive at
different conclusions?

(of course, adding such random sleeps under default-disabled flag control
for bug investigation could make a lot of sense; but here I'm talking about
what we do on the bots  by default)
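
For concreteness, a default-disabled delay hook might look like the Python sketch below. The flag names --random-delay-max-ms and --random-delay-seed are invented for this example and are not real DRT or run-webkit-tests options.

```python
import argparse
import random
import time


def build_parser():
    """Parser with the hypothetical delay flags; both default to 'off'."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--random-delay-max-ms", type=int, default=0,
                        help="upper bound for injected delays; 0 disables them")
    parser.add_argument("--random-delay-seed", type=int, default=None,
                        help="seed, so a failing run can be replayed")
    return parser


def make_delay_hook(max_ms, seed=None, sleep=time.sleep):
    """Return a callable that sleeps a random time, or does nothing if disabled."""
    if max_ms <= 0:
        return lambda: 0.0          # default: no delay at all
    rng = random.Random(seed)

    def hook():
        delay = rng.uniform(0, max_ms) / 1000.0
        sleep(delay)
        return delay

    return hook
```

With no flags the hook is a no-op, so the bots would behave as they do today; someone chasing a timing bug could pass a bound and a seed to get the same delays again on the next run.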


 It's not humanly possible to have tests for everything in advance.


Of course.  But we should at least make it humanly possible to understand
our tests as written :)
Making understanding our tests not humanly possible isn't the way to make
up for the not-humanly-possible nature of testing everything in every way.
It just means we push off not knowing how much coverage we really have, and
derive a false sense of security from the fact that bugs have been found in
the past.

I completely agree with Maciej's idea that we should think about ways to
 make non-deterministic failures easier to work with, so that they would
 lead to discovering the root cause more directly, and without the costs
 currently associated with it.


I have no problem with that, but I'm not sure how it relates to this thread
unless one takes an XOR approach, in which case I guess I have low faith
that the bigger problem Maciej highlights will be solved in a reasonable
timeframe (weeks/months).

 Memory allocator state. Computer's real time clock. Hard drive's head
 position if you have a spinning hard drive, or SSD controller state if you
 have an SSD. HTTP cookies. Should I continue the list?

 These things are all outside of webkit.

 Yes, they are outside WebKit, but not outside WebKit control, if needed.
 Did you intend that to be an objection?


I imagine Balazs was pointing out that you included items that are not
JS-visible in an answer to my question about things that are JS-visible.
 But that was part of an earlier fork of this thread that went nowhere, so
let's let it go.

Cheers,
-a


Re: [webkit-dev] DRT/WTR should clear the cache at the beginning of each test?

2012-10-28 Thread Dirk Pranke
 On Oct 26, 2012, at 11:11 PM, Ryosuke Niwa rn...@webkit.org wrote:
 I’m sure Antti, Alexey, and others who have worked on the loader and other 
 parts of WebKit are happy to write those tests or list the kind of things 
 they want to test. Heck, I don’t mind writing those tests if someone could 
 make a list.

 I totally sympathize with the sentiment to reduce the test flakiness but 
 loader and cache code have historically been under-tested, and we’ve had a 
 number of bugs detected only by running non-loader tests consecutively.

 On the contrary, we’ve had this DRT behavior for ages. Is there any reason
 we can’t wait for another couple of weeks or months until we add more loader
 & cache tests before making the behavior change?


Please correct me if I'm misinformed, but it's been three months since
this issue was first raised, and it doesn't sound like they've been
writing those tests or are happy to do so, and despite people asking
on this thread, they haven't been listing the kinds of tests they
think they need.

Have we actually made any progress here, or was the issue dropped
until Ami raised it again? It seems like the latter to me ... again,
please correct me if this is being actively worked on, because that
would change the whole tenor of this debate.

On Sun, Oct 28, 2012 at 6:32 AM, Maciej Stachowiak m...@apple.com wrote:

 I think the nature of loader and cache code is that it's very hard to make 
 tests which always fail deterministically when regressions are introduced, as 
 opposed to randomly. The reason for this is that bugs in these areas are 
 often timing-dependent. I think it's likely this tendency to fail randomly 
 will be the case whether or not the tests are trying to explicitly test the 
 cache or are just incidentally doing so in the course of other things.


I am not familiar with the loader and caching code in webkit, but I
know enough about similar problem spaces to be puzzled by why it's
impossible to write tests that can adequately test the code. Is the
caching disk-based, and maybe running tests in parallel screwing with
things? If so, then maybe the fact that we now run tests in parallel
is why this is a problem now and hasn't been before? Or maybe the fact
that a given process doesn't always see the same tests in the same
order is the problem?

 Unfortunately, it's very tempting when a test is failing randomly to blame 
 the test rather than to investigate whether there is an actual regression 
 affecting it. And sometimes it really is the test's fault. But sometimes it 
 is a genuine bug in the code.

 On the other hand, nondeterministic test failures make it harder to use test 
 infrastructure in general.

 These are difficult things to reconcile. The original philosophy of WebKit 
 tests is to test end-to-end under relatively realistic conditions, but at the 
 same time unpredictability makes it hard to stay at zero regressions.


Exactly. Personally, the cost of unpredictability in the test
infrastructure is so much higher than the value we're getting
(implicitly) that this is a no-brainer to me. There are some tradeoffs
(like running tests in parallel) that are worth it, but this isn't one
of them. I am happy to explain further my thinking and standards if
there's interest.

Hopefully that partially answers Alexey's questions about where we
should draw the line in trying to make our tests deterministic and
hermetic: do everything you reasonably can. We're not picking on
caching here.

 I think making different ports do testing under different conditions makes it 
 more likely that some contributors will introduce regressions without 
 noticing, leaving it for others to clean up. So it's regrettable if we go 
 that way because we are unable to reach consensus.

I agree that it is bad to have different ports behaving differently,
and I would like to avoid that as well. I don't want any port
suffering from flaky tests, but I also don't think it's reasonable to
have one group force that on everyone else indefinitely, either.

I am also fine with having some way to test systems more
non-deterministically in a way to expose more bugs, but that needs to
be clearly separated from the other testing we do; it is an unfair
cost to impose on the rest of the system otherwise and should be
tolerated only if we have no other choice. We have other choices.

 Creating some special opt-in --antti mode would be even worse, as it's almost 
 certain that failures would creep into a mode that nobody runs.


This comment (and Antti's suggestion, below) makes me think that you
didn't understand my virtual test suite suggestion; that's not
surprising, since Apple doesn't actually use this feature of NRWT yet.

A virtual test suite is a way of saying "(re-)run the tests under
directory X with command-line flags Y and Z, and put the results in a
new directory." For example, Chromium runs all of the tests in
fast/canvas twice, once normally using the regular software code
path, and once with a 

Re: [webkit-dev] DRT/WTR should clear the cache at the beginning of each test?

2012-10-28 Thread Balazs Kelemen

On 10/28/2012 08:25 PM, Ami Fischman wrote:



We can live in one of two worlds:
1) LayoutTests that concern themselves with specific
network/loading concerns need to use unique URLs to refer to
static data; or
2) DRT clears JS-visible state between tests.
The pros/cons seem clear to me:
Pro#1: loading/caching code is coincidentally tested by (unknown)
tests that reuse URLs among themselves.
Con#1: requires additional cognitive load for all webkit
developers; the only way to write a test that won't be affected
by future addition of unrelated tests is to use unique URLs
Pro#2: principle of least-surprise is maintained; understanding
DRT & reading a test (and not every other test) is enough to
understand its behavior
Con#2: loading/caching code needs to be tested explicitly.
IMO (Pro#2 + -Con#1) > (Pro#1 + -Con#2).
Are you saying you believe the inequality goes a different way,
or am I missing some other feature of your thesis?

Yes, this is a fair description.


I'm going to assume you mean that yes, you believe the inequality goes 
the other way:  (Pro#2 + -Con#1) < (Pro#1 + -Con#2)


This accidental testing is not something to be neglected


I'm not neglecting it, I'm evaluating its benefit to be less than its 
cost.


To make concrete the cost/benefit tradeoff, would you add a random 
sleep() into DRT execution to detect timing-related bugs?
It seems like a crazy thing to do, to me, but it would certainly catch 
timing-related bugs quite effectively.
If you don't think we should do that, can you describe how you're 
evaluating cost/benefit in each of the cases and why you arrive at 
different conclusions?


(of course, adding such random sleeps under default-disabled flag 
control for bug investigation could make a lot of sense; but here I'm 
talking about what we do on the bots  by default)


It's not humanly possible to have tests for everything in advance.


Of course.  But we should at least make it humanly possible to 
understand our tests as written :)
Making understanding our tests not humanly possible isn't the way to 
make up for the not-humanly-possible nature of testing everything in 
every way.
It just means we push off not knowing how much coverage we really 
have, and derive a false sense of security from the fact that bugs 
have been found in the past.


I completely agree with Maciej's idea that we should think about
ways to make non-deterministic failures easier to work with, so
that they would lead to discovering the root cause more directly,
and without the costs currently associated with it.


I have no problem with that, but I'm not sure how it relates to this 
thread unless one takes an XOR approach, in which case I guess I have 
low faith that the bigger problem Maciej highlights will be solved in 
a reasonable timeframe (weeks/months).



Memory allocator state. Computer's real time clock. Hard drive's
head position if you have a spinning hard drive, or SSD
controller state if you have an SSD. HTTP cookies. Should I
continue the list?

These things are all outside of webkit.

Yes, they are outside WebKit, but not outside WebKit control, if
needed.
Did you intend that to be an objection?


I imagine Balazs was pointing out that you included items that are not 
JS-visible in an answer to my question about things that are 
JS-visible.  But that was part of an earlier fork of this thread that 
went nowhere, so let's let it go.


I just meant that it is not feasible to force every external
dependency to reset its state, nor do we want to. We simply trust
them. But the cache is in WebKit, and we can reset its state. So whether
resetting the cache is a good or a bad idea, I think it has nothing to
do with the fact that we cannot reset the OS and the hardware (and
external libs, of course).




Re: [webkit-dev] DRT/WTR should clear the cache at the beginning of each test?

2012-10-28 Thread Dirk Pranke
On Sun, Oct 28, 2012 at 2:47 PM, Ryosuke Niwa rn...@webkit.org wrote:
 On Sun, Oct 28, 2012 at 2:09 PM, Dirk Pranke dpra...@chromium.org wrote:

  On Oct 26, 2012, at 11:11 PM, Ryosuke Niwa rn...@webkit.org wrote:
  I’m sure Antti, Alexey, and others who have worked on the loader and
  other parts of WebKit are happy to write those tests or list the kind of
  things they want to test. Heck, I don’t mind writing those tests if
  someone could make a list.

  I totally sympathize with the sentiment to reduce the test flakiness
  but loader and cache code have historically been under-tested, and we’ve
  had a number of bugs detected only by running non-loader tests
  consecutively.

  On the contrary, we’ve had this DRT behavior for ages. Is there any
  reason we can’t wait for another couple of weeks or months until we add
  more loader & cache tests before making the behavior change?
 

 Please correct me if I'm misinformed, but it's been three months since
 this issue was first raised, and it doesn't sound like they've been
 writing those tests or are happy to do so, and despite people asking
 on this thread, they haven't been listing the kinds of tests they
 think they need.


 I don't think anyone else had suggested adding tests as an option or set a
 deadline until I suggested yesterday (or when I did in my original reply to
 the thread). In fact, since Ami posted his reply on October 26th 1:20AM
 (PST), many contributors from non-PST timezones haven't even had a chance to
 read his post during normal business hours.

 Given that, I'd think it's totally unreasonable to land the patch as is
 without giving people a reasonable amount of time (~one week) to respond to
 this thread.


Both you and Eric U suggested adding new tests for this in the
original thread on 8/9; in fact, this whole issue got a fair amount of
discussion then, and this round hasn't really added anything new.

I'm happy to wait a little longer if others want to come up with some
other suggestions; I apologize if my previous response sounded like I
was throwing down a gauntlet or otherwise not open to ideas; that was
definitely not my intent.

Rather, I was attempting to say that unless someone else has other
ideas, the right path forward seems fairly clear to me and that I
intended to proceed down it.

Does that seem more reasonable to you?

-- Dirk


Re: [webkit-dev] DRT/WTR should clear the cache at the beginning of each test?

2012-10-28 Thread Ryosuke Niwa
On Sun, Oct 28, 2012 at 4:37 PM, Dirk Pranke dpra...@chromium.org wrote:

 On Sun, Oct 28, 2012 at 2:47 PM, Ryosuke Niwa rn...@webkit.org wrote:
  On Sun, Oct 28, 2012 at 2:09 PM, Dirk Pranke dpra...@chromium.org wrote:

   On Oct 26, 2012, at 11:11 PM, Ryosuke Niwa rn...@webkit.org wrote:
   I’m sure Antti, Alexey, and others who have worked on the loader and
   other parts of WebKit are happy to write those tests or list the kind
   of things they want to test. Heck, I don’t mind writing those tests if
   someone could make a list.

   I totally sympathize with the sentiment to reduce the test flakiness
   but loader and cache code have historically been under-tested, and
   we’ve had a number of bugs detected only by running non-loader tests
   consecutively.

   On the contrary, we’ve had this DRT behavior for ages. Is there any
   reason we can’t wait for another couple of weeks or months until we
   add more loader & cache tests before making the behavior change?
  
 
  Please correct me if I'm misinformed, but it's been three months since
  this issue was first raised, and it doesn't sound like they've been
  writing those tests or are happy to do so, and despite people asking
  on this thread, they haven't been listing the kinds of tests they
  think they need.
 
 
  I don't think anyone else had suggested adding tests as an option or set
  a deadline until I suggested yesterday (or when I did in my original
  reply to the thread). In fact, since Ami posted his reply on October 26th
  1:20AM (PST), many contributors from non-PST timezones haven't even had a
  chance to read his post during normal business hours.

  Given that, I'd think it's totally unreasonable to land the patch as is
  without giving people a reasonable amount of time (~one week) to respond
  to this thread.
 

 Both you and Eric U suggested adding new tests for this in the
 original thread on 8/9; in fact, this whole issue got a fair amount of
 discussion then, and this round hasn't really added anything new.


Yeah, but I don't think it got much traction back then. Also, we didn't
have any deadlines like weeks or months.

 I'm happy to wait a little longer if others want to come up with some
 other suggestions; I apologize if my previous response sounded like I
 was throwing down a gauntlet or otherwise not open to ideas; that was
 definitely not my intent.

 Rather, I was attempting to say that unless someone else has other
 ideas, the right path forward seems fairly clear to me and that I
 intended to proceed down it.

 Does that seem more reasonable to you?


Yes.

- Ryosuke


Re: [webkit-dev] DRT/WTR should clear the cache at the beginning of each test?

2012-10-27 Thread Ami Fischman

 There are a lot of things remaining in the process across test runs


What things remain in the process across test runs that are visible to
DRT/JS?

As I've said before in this thread, it seems axiomatic to me that tests can
only be reasoned about if they run in a pristine environment.  This is why
we TestShell::resetTestController()
(http://trac.webkit.org/browser/trunk/Tools/DumpRenderTree/chromium/TestShell.cpp#L300);
so that a given test passes or fails the same way regardless of what other
tests have run in the same process earlier.  Given that we *do* reset the
execution environment between tests, it seems arbitrary (and unworkable) to
*not* reset the cache.

Cheers,
-a


Re: [webkit-dev] DRT/WTR should clear the cache at the beginning of each test?

2012-10-27 Thread Alexey Proskuryakov

On 27.10.2012, at 20:47, Ami Fischman fisch...@chromium.org wrote:

 There are a lot of things remaining in the process across test runs
 
 What things remain in the process across test runs that are visible to 
 DRT/JS?

Memory allocator state. Computer's real time clock. Hard drive's head position 
if you have a spinning hard drive, or SSD controller state if you have an SSD. 
HTTP cookies. Should I continue the list?

 As I've said before in this thread, it seems axiomatic to me that tests can 
 only be reasoned about if they run in a pristine environment.

This is an empty statement. A computer always provides you with a pristine
environment until its RAM or other storage starts randomly failing. I would
agree that tests would become useless if run on a machine with faulty RAM. But
people working on the project have successfully reasoned about flaky test
failures many times in the past.

 This is why we TestShell::resetTestController(); so that a given test passes 
 or fails the same way regardless of what other tests have run in the same 
 process earlier.  Given that we *do* reset the execution environment between 
 tests, it seems arbitrary (and unworkable) to *not* reset the cache.


I don't think that pure logic can prove the need. As mentioned before, cache is 
just an entirely arbitrary target from this point of view.

We do reset preferences that are temporarily changed by tests. This is 
basically modeled on user expectations - changing a preference is expected to 
change how your browser behaves, so it's OK for tests to depend on that. But 
visiting site A is not expected to affect behavior on site B, even though cache 
state was affected by site A.

- WBR, Alexey Proskuryakov



Re: [webkit-dev] DRT/WTR should clear the cache at the beginning of each test?

2012-10-26 Thread Ami Fischman
This thread stalled out because although there seemed to be majority
agreement that hermetic/repeatable tests are a good thing, there was a
requirement that all ports be updated to the new behavior at the same time,
and I'm only competent to do the chromium DRT (see
https://bugs.webkit.org/show_bug.cgi?id=93195 for details).

Is anyone interested in stepping up and doing the equivalent (clear caches
between tests) for the mac and/or gtk ports' DRTs?


On Wed, Aug 8, 2012 at 2:35 PM, Dirk Pranke dpra...@chromium.org wrote:

 On Wed, Aug 8, 2012 at 10:47 AM, Ojan Vafai o...@chromium.org wrote:
  See https://bugs.webkit.org/show_bug.cgi?id=93195.
 
  media/W3C/video/networkState/networkState_during_progress.html and
  media/video-poster-blocked-by-willsendrequest.html are flaky on all
  platforms because they behave differently if the loaded resource is
 cached.
 
  Every time I've taken a stab at reducing test flakiness, I've come
 across at
  least a few tests that pass when run as part of the test suite, but fail
  when run by themselves (or in parallel) because they accidentally expect
 an
  image or something to be in the cache.
 
  I think it would make the tests more maintainable if we cleared the cache
  before each test run. This is *not* before each page load though. So
 tests
  that do multiple page loads will still test cross-navigation caching
  behavior.
 
  While it's true that we could one-off fix each of these tests, it's
 usually
  very time consuming to figure out that caching is the problem, that's
  assuming anyone takes the time to look into why the test is flaky in the
  first place.
 
  Any objections?
 

 Given that the way we run tests in parallel in NRWT means that
 different processes get different lists of tests each time, it sounds
 like we may be getting a fair amount of nondeterminism from the cache
 not being cleared between tests. That seems bad, so I'm in favor of
 clearing the cache :)

 -- Dirk



Re: [webkit-dev] DRT/WTR should clear the cache at the beginning of each test?

2012-10-26 Thread Antti Koivisto
On Wed, Aug 8, 2012 at 9:54 PM, Eric U er...@google.com wrote:

 On Wed, Aug 8, 2012 at 11:43 AM, Alexey Proskuryakov a...@webkit.org
 wrote:
  I can see some downsides to emptying the cache before each test:
 
  - we won't be getting any test coverage for cache behavior when it hits
  non-trivial size;

 Then let's add a cache test explicitly for this.  Otherwise we just
 have to hope it gets tested accidentally along the way.


Cache has subtle interactions with other things being tested (-> flakiness).
More explicit cache tests would be nice but we can't hope to replicate all
the accidental testing we now get. We are going to lose a large chunk of
existing test coverage if we do this.


  antti



  - this may well make tests measurably slower;
 
  - this will be yet another cause of subtle difference between platforms,
 as
  some will undoubtedly have this unimplemented for a long time.

 Both good points, but probably worth it, given the reliability
 improvement in the tests IMO.

 Eric


Re: [webkit-dev] DRT/WTR should clear the cache at the beginning of each test?

2012-10-26 Thread Ami Fischman
On Fri, Oct 26, 2012 at 1:44 AM, Antti Koivisto koivi...@iki.fi wrote:

 Cache has subtle interactions with other things being tested
 (-> flakiness). More explicit cache tests would be nice but we can't hope
 to replicate all the accidental testing we now get. We are going to lose a
 large chunk of existing test coverage if we do this.


The reality is that this test coverage today shows up as flakiness and so
is ignored anyway, meaning we don't actually have useful coverage here.
 Even when flakiness is investigated, the fix is to cache-bust using
unique URL params, which just means we lose the coverage you describe for
that test, anyway.
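
For readers unfamiliar with the cache-busting fix mentioned above, here is a
minimal Python sketch of the idea: append a unique query parameter so the
URL never matches an existing cache entry. The function name cache_busted,
the parameter name "cachebust", and the example URL are assumptions made up
for illustration; they are not taken from any WebKit test.

```python
import uuid
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def cache_busted(url, param="cachebust"):
    """Return `url` with a unique query parameter appended.

    Hypothetical helper: every call yields a different URL, so a cache
    keyed on the full URL cannot serve a previously stored copy.
    """
    parts = urlsplit(url)
    # Keep any existing query parameters, then add the unique one.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((param, uuid.uuid4().hex))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    # Two requests for the same resource now use distinct URLs.
    print(cache_busted("http://127.0.0.1:8000/resources/test.mp4?type=video"))
    print(cache_busted("http://127.0.0.1:8000/resources/test.mp4?type=video"))
```

The trade-off is the one described above: a test rewritten this way always
takes the uncached load path, so it no longer exercises the cache at all.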

Brian notes in the bug that GTK & wk2 GTK+ are done.
I believe that just leaves chromium & mac.
Anyone wanting to step up to do mac, and, I guess, wk2 mac?

-a


Re: [webkit-dev] DRT/WTR should clear the cache at the beginning of each test?

2012-10-26 Thread Dirk Pranke
I don't know that there was consensus that every port had to be
updated at the same time; in fact Balazs said Qt and EFL already clear
the cache.

I think you should just land the change for Chromium and let others
update their ports as needed. The value in reduced flakiness and more
predictability outweighs anything else in my book. Test coverage that
you can't explain or rely on doesn't count for much to me.

-- Dirk

On Fri, Oct 26, 2012 at 1:20 AM, Ami Fischman fisch...@chromium.org wrote:
 This thread stalled out because although there seemed to be majority
 agreement that hermetic/repeatable tests are a good thing, there was a
 requirement that all ports be updated to the new behavior at the same time,
 and I'm only competent to do the chromium DRT (see
 https://bugs.webkit.org/show_bug.cgi?id=93195 for details).

 Is anyone interested in stepping up and doing the equivalent (clear caches
 between tests) for the mac and/or gtk ports' DRTs?


 On Wed, Aug 8, 2012 at 2:35 PM, Dirk Pranke dpra...@chromium.org wrote:

 On Wed, Aug 8, 2012 at 10:47 AM, Ojan Vafai o...@chromium.org wrote:
  See https://bugs.webkit.org/show_bug.cgi?id=93195.
 
  media/W3C/video/networkState/networkState_during_progress.html and
  media/video-poster-blocked-by-willsendrequest.html are flaky on all
  platforms because they behave differently if the loaded resource is
  cached.
 
  Every time I've taken a stab at reducing test flakiness, I've come
  across at
  least a few tests that pass when run as part of the test suite, but fail
  when run by themselves (or in parallel) because they accidentally expect
  an
  image or something to be in the cache.
 
  I think it would make the tests more maintainable if we cleared the
  cache
  before each test run. This is *not* before each page load though. So
  tests
  that do multiple page loads will still test cross-navigation caching
  behavior.
 
  While it's true that we could one-off fix each of these tests, it's
  usually
  very time consuming to figure out that caching is the problem, that's
  assuming anyone takes the time to look into why the test is flaky in the
  first place.
 
  Any objections?
 

 Given that the way we run tests in parallel in NRWT means that
 different processes get different lists of tests each time, it sounds
 like we may be getting a fair amount of nondeterminism from the cache
 not being cleared between tests. That seems bad, so I'm in favor of
 clearing the cache :)

 -- Dirk




Re: [webkit-dev] DRT/WTR should clear the cache at the beginning of each test?

2012-10-26 Thread Antti Koivisto
On Fri, Oct 26, 2012 at 6:09 PM, Ami Fischman fisch...@chromium.org wrote:

 The reality is that this test coverage today shows up as flakiness and
 so is ignored anyway, meaning we don't actually have useful coverage here.
  Even when flakiness is investigated, the fix is to cache-bust using
 unique URL params, which just means we lose the coverage you describe for
 that test, anyway.


When making cache related changes I have frequently found bugs from my
patches because some seemingly random test started failing and I
investigated. Without the test coverage some of those bugs would probably
now be in the tree.


  antti



 Brian notes in the bug that GTK & wk2 GTK+ are done.
 I believe that just leaves chromium & mac.
 Anyone wanting to step up to do mac, and, I guess, wk2 mac?

 -a



Re: [webkit-dev] DRT/WTR should clear the cache at the beginning of each test?

2012-10-26 Thread Elliott Sprehn
On Fri, Oct 26, 2012 at 11:17 AM, Ryosuke Niwa rn...@webkit.org wrote:
 ...

 I agree this is a good change but it appears that we should add more
 cache/loader tests before changing DRT's behavior given that there are
 active contributors who rely on the current DRT behaviors to detect
 regressions.


Can we add a flag to control this behavior? Then Antti could run the
tests without cache clearing when modifying things possibly related to
the cache code. We could even run a separate cr-linux bot like we do
for debug builds.

- E


Re: [webkit-dev] DRT/WTR should clear the cache at the beginning of each test?

2012-10-26 Thread Ryosuke Niwa
On Fri, Oct 26, 2012 at 11:33 AM, Elliott Sprehn espr...@chromium.org wrote:

 On Fri, Oct 26, 2012 at 11:17 AM, Ryosuke Niwa rn...@webkit.org wrote:
  ...
 
  I agree this is a good change but it appears that we should add more
  cache/loader tests before changing DRT's behavior given that there are
  active contributors who rely on the current DRT behaviors to detect
  regressions.
 

 Can we add a flag to control this behavior? Then Antti could run the
 tests without cache clearing when modifying things possibly related to
 the cache code. We could even run a separate cr-linux bot like we do
 for debug builds.


I think having a set of tests that tests loaders/caches explicitly is more
useful.

- Ryosuke


Re: [webkit-dev] DRT/WTR should clear the cache at the beginning of each test?

2012-10-26 Thread Dirk Pranke
On Fri, Oct 26, 2012 at 11:38 AM, Ryosuke Niwa rn...@webkit.org wrote:
 On Fri, Oct 26, 2012 at 11:33 AM, Elliott Sprehn espr...@chromium.org
 wrote:

 On Fri, Oct 26, 2012 at 11:17 AM, Ryosuke Niwa rn...@webkit.org wrote:
  ...
 
  I agree this is a good change but it appears that we should add more
  cache/loader tests before changing DRT's behavior given that there are
  active contributors who rely on the current DRT behaviors to detect
  regressions.
 

 Can we add a flag to control this behavior? Then Antti could run the
 tests without cache clearing when modifying things possibly related to
 the cache code. We could even run a separate cr-linux bot like we do
 for debug builds.


 I think having a set of tests that tests loaders/caches explicitly is more
 useful.


I think having a set of tests for loaders and caches would be more
useful as well, but I don't think it's fair to make that a requirement
to changing the default behavior here, especially since it's not clear
who all would be best suited to writing those tests or what the extent
of that work is.

I think Elliott's suggestion is a good one. I think the overall cost to
the project by having flakiness in the tests probably outweighs the
value we get in mysterious additional coverage, and it seems like
having a flag would be a good compromise.

-- Dirk


 - Ryosuke




Re: [webkit-dev] DRT/WTR should clear the cache at the beginning of each test?

2012-10-26 Thread Ami Fischman
On Fri, Oct 26, 2012 at 11:17 AM, Ryosuke Niwa rn...@webkit.org wrote:

 I agree this is a good change but it appears that we should add more
 cache/loader tests before changing DRT's behavior given that there are
 active contributors who rely on the current DRT behaviors to detect
 regressions.


Not knowing the specifics of the regressions in question, I don't have any
idea what these new cache-related tests would be.

-a


Re: [webkit-dev] DRT/WTR should clear the cache at the beginning of each test?

2012-10-26 Thread Alexey Proskuryakov

On 26.10.2012, at 11:04, Antti Koivisto koivi...@iki.fi wrote:

 The reality is that this test coverage today shows up as flakiness and so 
 is ignored anyway, meaning we don't actually have useful coverage here.  Even 
 when flakiness is investigated, the fix is to cache-bust using unique URL 
 params, which just means we lose the coverage you describe for that test, 
 anyway.

I think that this is the real issue here. Test flakiness is very important to
investigate; this often leads to discovery of bad bugs, including security
ones. The phrase "flaky test" often misplaces the blame.

 When making cache related changes I have frequently found bugs from my 
 patches because some seemingly random test started failing and I 
 investigated. Without the test coverage some of those bugs would probably now 
 be in the tree.

I agree with Antti. Finding regressions is what tests are for, and it would be 
difficult to make enough explicit tests to compensate for such loss of 
coverage. It would certainly be very unfortunate to lose test coverage without 
even an attempt to compensate for that.


- WBR, Alexey Proskuryakov




Re: [webkit-dev] DRT/WTR should clear the cache at the beginning of each test?

2012-10-26 Thread Ami Fischman
Should we add random sleeps to DRT?  It'll certainly help find some
regressions (and even security bugs).
Of course the down-side is that it makes tests non-repeatable and difficult
to reason about.

I'm baffled by your priorities and don't know how to continue this
conversation productively.  Sorry.

Cheers,
-a

On Fri, Oct 26, 2012 at 12:43 PM, Alexey Proskuryakov a...@webkit.org wrote:


 On 26.10.2012, at 11:04, Antti Koivisto koivi...@iki.fi wrote:

 The reality is that this test coverage today shows up as flakiness and
 so is ignored anyway, meaning we don't actually have useful coverage here.
  Even when flakiness is investigated, the fix is to cache-bust using
 unique URL params, which just means we lose the coverage you describe for
 that test, anyway.


 I think that this is the real issue here. Test flakiness is very important
 to investigate, this often leads to discovery of bad bugs, including
 security ones. The phrase flaky test often misplaces the blame.

 When making cache related changes I have frequently found bugs from my
 patches because some seemingly random test started failing and I
 investigated. Without the test coverage some of those bugs would probably
 now be in the tree.


 I agree with Antti. Finding regressions is what tests are for, and it
 would be difficult to make enough explicit tests to compensate for such
 loss of coverage. It would certainly be very unfortunate to lose test
 coverage without even an attempt to compensate for that.


 - WBR, Alexey Proskuryakov





Re: [webkit-dev] DRT/WTR should clear the cache at the beginning of each test?

2012-10-26 Thread Ryosuke Niwa
On Fri, Oct 26, 2012 at 11:43 AM, Dirk Pranke dpra...@chromium.org wrote:

 On Fri, Oct 26, 2012 at 11:38 AM, Ryosuke Niwa rn...@webkit.org wrote:
  On Fri, Oct 26, 2012 at 11:33 AM, Elliott Sprehn espr...@chromium.org
  wrote:
 
  On Fri, Oct 26, 2012 at 11:17 AM, Ryosuke Niwa rn...@webkit.org
 wrote:
   ...
  
   I agree this is a good change but it appears that we should add more
   cache/loader tests before changing DRT's behavior given that there are
   active contributors who rely on the current DRT behaviors to detect
   regressions.
  
 
  Can we add a flag to control this behavior? Then Antti could run the
  tests without cache clearing when modifying things possibly related to
  the cache code. We could even run a separate cr-linux bot like we do
  for debug builds.
 
 
  I think having a set of tests that tests loaders/caches explicitly is
 more
  useful.
 

 I think having a set of tests for loaders and caches would be more
 useful as well, but I don't think it's fair to make that a requirement
 to changing the default behavior here, especially since it's not clear
 who all would be best suited to writing those tests or what the extent
 of that work is.


I’m sure Antti, Alexey, and others who have worked on the loader and other
parts of WebKit are happy to write those tests or list the kind of things
they want to test. Heck, I don’t mind writing those tests if someone could
make a list.

I totally sympathize with the sentiment to reduce the test flakiness but
loader and cache code have historically been under-tested, and we’ve had a
number of bugs detected only by running non-loader tests consecutively.

On the contrary, we’ve had this DRT behavior for ages. Is there any reason
we can’t wait for another couple of weeks or months until we add more
loader & cache tests before making the behavior change?

- Ryosuke


Re: [webkit-dev] DRT/WTR should clear the cache at the beginning of each test?

2012-10-26 Thread Ami Fischman
On Fri, Oct 26, 2012 at 2:11 PM, Ryosuke Niwa rn...@webkit.org wrote:

 Is there any reason we can’t wait for another couple of weeks or months
 until we add more loader & cache tests before making the behavior change?


There is no time pressure here other than a desire to avoid this falling
between the cracks and (continuing to) never being done.
Is anyone signing up to write or enumerate the tests, who can do the work
in the next weeks/months, but not immediately?

-a


Re: [webkit-dev] DRT/WTR should clear the cache at the beginning of each test?

2012-10-26 Thread Alexey Proskuryakov

On 26.10.2012, at 14:57, Dirk Pranke dpra...@chromium.org wrote:

 Perhaps a slight variant of this is that we can agree to make the
 changes on the Chromium port to clear the cache (much like the Qt and
 EFL ports already do), and you can continue to not clear the cache on
 the Apple Mac port until you feel comfortable that you've added
 additional tests?

This means that when someone introduces flakiness into resource caching, it 
will be only seen on Apple Mac bots. How is this good for anyone? I personally 
find this unacceptable, as this will reduce usefulness of Apple Mac bots.

The whole idea to clear cache between tests seems very arbitrary to me. There
are a lot of things remaining in the process across test runs, and I'm not sure
why you are picking on the one with the least explicit test coverage.

Historically, test flakiness appears to increase whenever we do anything to 
address it without actual investigation of the root cause. Not long ago, we 
could run tests without re-running flaky tests, and get 100% pass. Now, we have 
many more flaky tests, re-run them, but flakiness remains even after second 
run. I don't think that this is a result of project scale change - I think that 
this is a result of the desire to get green bots without doing real WebCore 
work to fix underlying bugs.

- WBR, Alexey Proskuryakov



Re: [webkit-dev] DRT/WTR should clear the cache at the beginning of each test?

2012-08-09 Thread Balazs Kelemen

Actually, the Qt and EFL DRTs already do that.

On 08/08/2012 07:47 PM, Ojan Vafai wrote:

See https://bugs.webkit.org/show_bug.cgi?id=93195.

media/W3C/video/networkState/networkState_during_progress.html 
and media/video-poster-blocked-by-willsendrequest.html are flaky on 
all platforms because they behave differently if the loaded resource 
is cached.


Every time I've taken a stab at reducing test flakiness, I've come 
across at least a few tests that pass when run as part of the test 
suite, but fail when run by themselves (or in parallel) because they 
accidentally expect an image or something to be in the cache.


I think it would make the tests more maintainable if we cleared the 
cache before each test run. This is *not* before each page load 
though. So tests that do multiple page loads will still test 
cross-navigation caching behavior.


While it's true that we could one-off fix each of these tests, it's 
usually very time consuming to figure out that caching is the problem, 
that's assuming anyone takes the time to look into why the test is 
flaky in the first place.


Any objections?

Ojan




[webkit-dev] DRT/WTR should clear the cache at the beginning of each test?

2012-08-08 Thread Ojan Vafai
See https://bugs.webkit.org/show_bug.cgi?id=93195.

media/W3C/video/networkState/networkState_during_progress.html
and media/video-poster-blocked-by-willsendrequest.html are flaky on all
platforms because they behave differently if the loaded resource is cached.

Every time I've taken a stab at reducing test flakiness, I've come across
at least a few tests that pass when run as part of the test suite, but fail
when run by themselves (or in parallel) because they accidentally expect an
image or something to be in the cache.

I think it would make the tests more maintainable if we cleared the cache
before each test run. This is *not* before each page load though. So tests
that do multiple page loads will still test cross-navigation caching
behavior.

While it's true that we could one-off fix each of these tests, it's usually
very time consuming to figure out that caching is the problem, that's
assuming anyone takes the time to look into why the test is flaky in the
first place.

Any objections?

Ojan


Re: [webkit-dev] DRT/WTR should clear the cache at the beginning of each test?

2012-08-08 Thread Ryosuke Niwa
That sounds like a great idea to me. I was actually surprised when fischman
told me we don't currently do this.

- Ryosuke

On Wed, Aug 8, 2012 at 10:47 AM, Ojan Vafai o...@chromium.org wrote:

 See https://bugs.webkit.org/show_bug.cgi?id=93195.

 media/W3C/video/networkState/networkState_during_progress.html
 and media/video-poster-blocked-by-willsendrequest.html are flaky on all
 platforms because they behave differently if the loaded resource is cached.

 Every time I've taken a stab at reducing test flakiness, I've come across
 at least a few tests that pass when run as part of the test suite, but fail
 when run by themselves (or in parallel) because they accidentally expect an
 image or something to be in the cache.

 I think it would make the tests more maintainable if we cleared the cache
 before each test run. This is *not* before each page load though. So tests
 that do multiple page loads will still test cross-navigation caching
 behavior.

 While it's true that we could one-off fix each of these tests, it's
 usually very time consuming to figure out that caching is the problem,
 that's assuming anyone takes the time to look into why the test is flaky in
 the first place.

 Any objections?

 Ojan



Re: [webkit-dev] DRT/WTR should clear the cache at the beginning of each test?

2012-08-08 Thread Alexey Proskuryakov

I can see some downsides to emptying the cache before each test:

- we won't be getting any test coverage for cache behavior when it hits 
non-trivial size;

- this may well make tests measurably slower;

- this will be yet another cause of subtle difference between platforms, as 
some will undoubtedly have this unimplemented for a long time.

- WBR, Alexey Proskuryakov

On 08.08.2012, at 10:47, Ojan Vafai wrote:

 See https://bugs.webkit.org/show_bug.cgi?id=93195.
 
 media/W3C/video/networkState/networkState_during_progress.html and 
 media/video-poster-blocked-by-willsendrequest.html are flaky on all platforms 
 because they behave differently if the loaded resource is cached.
 
 Every time I've taken a stab at reducing test flakiness, I've come across at 
 least a few tests that pass when run as part of the test suite, but fail when 
 run by themselves (or in parallel) because they accidentally expect an image 
 or something to be in the cache.
 
 I think it would make the tests more maintainable if we cleared the cache 
 before each test run. This is *not* before each page load though. So tests 
 that do multiple page loads will still test cross-navigation caching behavior.
 
 While it's true that we could one-off fix each of these tests, it's usually 
 very time consuming to figure out that caching is the problem, that's 
 assuming anyone takes the time to look into why the test is flaky in the 
 first place.
 
 Any objections?
 
 Ojan


Re: [webkit-dev] DRT/WTR should clear the cache at the beginning of each test?

2012-08-08 Thread Eric U
On Wed, Aug 8, 2012 at 11:43 AM, Alexey Proskuryakov a...@webkit.org wrote:
 I can see some downsides to emptying the cache before each test:

 - we won't be getting any test coverage for cache behavior when it hits
 non-trivial size;

Then let's add a cache test explicitly for this.  Otherwise we just
have to hope it gets tested accidentally along the way.

 - this may well make tests measurably slower;

 - this will be yet another cause of subtle difference between platforms, as
 some will undoubtedly have this unimplemented for a long time.

Both good points, but probably worth it, given the reliability
improvement in the tests IMO.

Eric


Re: [webkit-dev] DRT/WTR should clear the cache at the beginning of each test?

2012-08-08 Thread Ryosuke Niwa
On Wed, Aug 8, 2012 at 11:43 AM, Alexey Proskuryakov a...@webkit.org wrote:


 I can see some downsides to emptying the cache before each test:

 - we won't be getting any test coverage for cache behavior when it hits
 non-trivial size;


We should have a separate test for that as Eric pointed out.

- this may well make tests measurably slower;

 - this will be yet another cause of subtle difference between platforms,
 as some will undoubtedly have this unimplemented for a long time.


On the contrary, it may well improve the overall bot cycle time because
flaky tests are run twice on new-run-webkit-tests if we actually have many
tests that are flaky because of this.

We also parallelize tests and resources are loaded from the disk (with
cache) so I highly doubt this will be an issue.

- Ryosuke
___
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo/webkit-dev


Re: [webkit-dev] DRT/WTR should clear the cache at the beginning of each test?

2012-08-08 Thread Dirk Pranke
On Wed, Aug 8, 2012 at 10:47 AM, Ojan Vafai o...@chromium.org wrote:
 See https://bugs.webkit.org/show_bug.cgi?id=93195.

 media/W3C/video/networkState/networkState_during_progress.html and
 media/video-poster-blocked-by-willsendrequest.html are flaky on all
 platforms because they behave differently if the loaded resource is cached.

 Every time I've taken a stab at reducing test flakiness, I've come across at
 least a few tests that pass when run as part of the test suite, but fail
 when run by themselves (or in parallel) because they accidentally expect an
 image or something to be in the cache.

 I think it would make the tests more maintainable if we cleared the cache
 before each test run. This is *not* before each page load though. So tests
 that do multiple page loads will still test cross-navigation caching
 behavior.

 While it's true that we could one-off fix each of these tests, it's usually
 very time consuming to figure out that caching is the problem, that's
 assuming anyone takes the time to look into why the test is flaky in the
 first place.

 Any objections?


Given that the way we run tests in parallel in NRWT means that
different processes get different lists of tests each time, it sounds
like we may be getting a fair amount of nondeterminism from the cache
not being cleared between tests. That seems bad, so I'm in favor of
clearing the cache :)
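
To make the nondeterminism concrete, here is a toy Python model (not NRWT
code) of one worker process running a list of tests against a shared cache.
The test names, the resource name, and the run_shard function are invented
for this sketch; the only assumption carried over from the thread is that
some test accidentally expects a resource to already be cached.

```python
def run_shard(tests, clear_cache_between_tests):
    """Run one worker's test list and return {test name: "PASS" or "FAIL"}.

    Each test is (name, resource, expects_cached). A test passes only if
    the resource's cache state matches what the test silently relies on.
    """
    cache = set()
    outcomes = {}
    for name, resource, expects_cached in tests:
        if clear_cache_between_tests:
            cache.clear()
        was_cached = resource in cache
        outcomes[name] = "PASS" if was_cached == expects_cached else "FAIL"
        cache.add(resource)  # loading the resource populates the cache
    return outcomes


loads_image = ("loads-image.html", "image.png", False)
expects_cached = ("expects-cached.html", "image.png", True)

if __name__ == "__main__":
    # Same test, two different shard assignments, cache not cleared:
    print(run_shard([loads_image, expects_cached], False))
    print(run_shard([expects_cached], False))
    # With clearing, the result no longer depends on the shard contents:
    print(run_shard([loads_image, expects_cached], True))
    print(run_shard([expects_cached], True))
```

In this model the cache-dependent test flips between passing and failing
depending on which tests share its worker when the cache is kept, and fails
consistently when the cache is cleared, which at least makes the hidden
dependency visible and repeatable.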

-- Dirk