On Fri, Jul 1, 2011 at 3:37 PM, Dirk Pranke <dpra...@chromium.org> wrote:
> On Fri, Jul 1, 2011 at 3:24 PM, Darin Fisher <da...@chromium.org> wrote:
> > On Fri, Jul 1, 2011 at 3:04 PM, Darin Adler <da...@apple.com> wrote:
> >>
> >> On Jul 1, 2011, at 2:54 PM, Dirk Pranke wrote:
> >>
> >> > Does that apply to -expected.txt files in the base directories, or
> >> > just platform-specific exceptions?
> >>
> >> Base directories.
> >>
> >> Expected files contain output reflecting the behavior of WebKit at the
> >> time the test was checked in; they are the expected result when we
> >> re-run a test. Many expected files contain text that says “FAIL” in
> >> them. The fact that these expected results are not successes, but
> >> rather expected failures, does not seem to me to be a subtle point,
> >> but one of the basic things about how these tests are set up.
> >
> > Right, it helps us keep track of where we are, so that we don't
> > regress, and only make forward progress.
> >
> >> > I wonder how it is that I've been working (admittedly, mostly on
> >> > tooling) in WebKit for more than two years and this is the first
> >> > I'm hearing about this.
> >>
> >> I’m guessing it’s because you have been working on Chrome.
> >>
> >> The Chrome project came up with a different system for testing,
> >> layered on top of the original layout test machinery and based on
> >> different concepts. I don’t think anyone ever discussed that system
> >> with me; I was the one who created the original layout test system,
> >> to help Dave Hyatt originally, and then later the rest of the team
> >> started using it.
> >
> > The granular annotations (more than just SKIP) in test_expectations.txt
> > were something we introduced back when Chrome was failing a large
> > percentage of layout tests and we needed a system to help us triage
> > the failures. It was useful to distinguish tests that crash from tests
> > that generate bad results, for example. We then focused on the
> > crashing tests first.
> >
> > In addition, we wanted to understand how divergent we were from the
> > standard WebKit port, and we wanted to know whether we were failing to
> > match text results or just image results. This allowed us to measure
> > our degree of incompatibility with standard WebKit. We basically used
> > this mechanism to classify differences that mattered and differences
> > that didn't matter.
> >
> > I think that if we had just checked in a bunch of port-specific
> > "failure" expectations as -expected files, then we would have had a
> > hard time distinguishing failures we needed to fix for compat reasons
> > from failures that were expected (e.g., because we have
> > different-looking form controls).
> >
> > I'm not sure if we are at a point now where this mechanism isn't
> > useful, but I kind of suspect that it will always be useful. After
> > all, it is not uncommon for a code change to result in different
> > rendering behavior between the ports. I think it is valuable to have
> > a measure of divergence between the various WebKit ports. We want to
> > minimize such divergence from a web compat point of view, of course.
> > Maybe the count of SKIPPED tests is enough? But then we suffer from
> > not running the tests at all. At least by annotating expected IMAGE
> > failures, we get to know that the TEXT output is the same and that we
> > don't expect a CRASH.
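[For concreteness on Darin Adler's point about expected files that say
"FAIL": a text test's checked-in -expected.txt can itself record a
failure, and the harness passes the test as long as the output, failure
included, is unchanged. A sketch in the style of js-test harness output;
the test and values here are invented:

    This tests some DOM attribute the port does not implement yet.

    On success, you will see a series of "PASS" messages, followed by
    "TEST COMPLETE".

    PASS element.foo is 1
    FAIL element.bar should be 2. Was undefined.

    TEST COMPLETE

If a later change makes element.bar return 2, the test "fails" until the
expected file is updated, which is exactly the change-detection being
described.]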
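[And the granular annotations Darin Fisher describes looked roughly like
this; a sketch only, since the exact keywords and syntax of Chromium's
test_expectations.txt varied over time, and the bug numbers and paths
here are made up:

    // Triage notes: crashes first, then text mismatches, then image-only
    // differences (e.g., form controls that legitimately look different).
    BUGCR1001 WIN : fast/dom/some-crashing-test.html = CRASH
    BUGCR1002 : editing/selection/some-text-mismatch.html = TEXT
    BUGCR1003 MAC LINUX : fast/forms/select-appearance.html = IMAGE
    // SKIP means the test is not run at all, so it tells us nothing.
    BUGCR1004 SKIP : http/tests/some-hanging-test.html = TIMEOUT

An IMAGE-only line records precisely the signal mentioned above: the text
output still matches and the test is not expected to crash.]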
> There are at least two reasons for divergence: one is that the port is
> actually doing the wrong thing, and the other is that the port is doing
> the "right" thing but the output is different anyway (e.g., a control
> is rendered differently). We cannot easily separate the two if we have
> only a single convention (platform-specific -expected files), but
> SKIPPING tests seems wrong for either category.
>
> It seems like -failing gives you the control you would want, no?
> Obviously, it wouldn't help the thousands of -expected files that are
> "wrong", but at least it could keep things from getting worse.
>
> I will note that reftests might solve some issues but not all of them
> (since obviously code could render both pages "wrong").
>
> -- Dirk

I'm not sure. It makes me a bit uneasy adding even more heft to the
LayoutTests directory.

-Darin

> > I suspect this isn't the best solution to the problem though.
> >
> > -Darin
> >
> >> > Are there reasons we are doing things this way?
> >>
> >> Sure. The idea of the layout test framework is to check whether the
> >> code is still behaving as it did when the test was created and last
> >> run; we want to detect any changes in behavior that are not expected.
> >> When there are expected changes in behavior, we change the contents
> >> of the expected results files.
> >>
> >> It seems possibly helpful to augment the test system with editorial
> >> comments about which tests show bugs that we’d want to fix. But I
> >> wouldn’t want to stop running all regression tests where the output
> >> reflects the effects of a bug or missing feature.
> >>
> >> -- Darin
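[On the reftest aside in Dirk's message above: a reftest swaps the pixel
baseline for a reference page that must render identically to the test
page, so it fails only when the two renderings diverge. A minimal sketch;
the file names and markup are invented:

    <!-- fast/block/green-square.html: the test page -->
    <!DOCTYPE html>
    <div style="float: left; width: 100px; height: 100px;
                background: green;"></div>

    <!-- fast/block/green-square-expected.html: the reference, reaching
         the same rendering without using floats -->
    <!DOCTYPE html>
    <div style="width: 100px; height: 100px; background: green;"></div>

As Dirk notes, this only helps when the reference can be expressed
independently of the feature under test; if the engine renders both pages
wrong in the same way, the reftest still passes.]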
_______________________________________________
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev