Re: reftests
On May 9, 2011, at 7:07 PM, Benjamin Otte wrote:

>> I think this only makes sense if we want to continue to maintain this division between frequently and less frequently run tests.
>
> What is a frequently run test? I can tell you that no treeview test qualifies as frequently run to me when I'm hacking on GtkPaned. And I think the same applies to you vice versa. Which reduces the frequently run tests to a number very close to 0 I guess.

Good point. I agree.

-kris.
___
gtk-devel-list mailing list
gtk-devel-list@gnome.org
http://mail.gnome.org/mailman/listinfo/gtk-devel-list
Re: reftests
On May 9, 2011, at 7:57 PM, Matthias Clasen wrote:

> I don't have a fear of toplevel dirs. If people insist on having tests outside of the source tree (I personally like things in gtk/tests ...), then what is wrong with just having reftests/ as a toplevel directory? We already have perf/, and we can add unittests/ too.

Benjamin raised the issue that the tests are re-linked all the time if they are in gtk/tests/ (unless you teach yourself to use make all-am instead of a plain make like I did ...). If we decide to move the tests from gtk/tests/ to somewhere else, I wouldn't mind a toplevel unittests/ directory.

> As for testgtk, I think that would be really misplaced under demos/, since its code is the opposite of a demo of good GTK+ coding practices. If we want to clean this up some more, I'd instead propose to move the random other pixbuf test things from demos/ to tests/, and

Good idea! I've never understood why these pixbuf things were in demos/.

> move the new examples from examples/ to demos/. At the same time we can kill the old, dead examples, and the incomplete testgtk copy under demos.

Agreed.

regards,

-kris.
Re: reftests
On Fri, May 6, 2011 at 12:04 PM, Kristian Rietveld k...@gtk.org wrote:

> Which raises another question, would it be a good idea or make sense to merge the image differ into the GTK+ test utils so that other tests (e.g. the tree view scrolling test suite) can make use of it?

I did not spend any time thinking about generalizing code, but I guess it would certainly make sense to merge the image diffing or even the code that renders a glade file to a cairo surface containing the snapshot. So if you need something like that, better to generalize it than to copy it or invent your own nefarious scheme.

> 1) There seems to be some logic behind where frequently run and less frequently run tests are located. Quoting from [1]:
>
> ``
> 1) Figure a place for the test case. For this it's useful to keep in mind that make check will traverse CWD recursively. So tests that should be run often when glib, gdk or gtk changed should go into glib/glib/tests/, gtk+/gtk/tests/ or gtk+/gdk/tests/. Tests more thorough or planned to be run less frequently can go into glib/tests/ or gtk+/tests/. This is e.g. the case for the generic object property tester in gtk+/tests/objecttests.c. To sum up:
>
>   glib/tests/          # less frequently run GLib tests
>   glib/glib/tests/     # frequent GLib testing
>   glib/gobject/tests/  # frequent GObject testing
>   gtk+/tests/          # less frequently run Gdk & Gtk+ tests
>   gtk+/gdk/tests/      # frequent Gdk testing
>   gtk+/gtk/tests/      # frequent Gtk+ testing
> ''
>
> I think this only makes sense if we want to continue to maintain this division between frequently and less frequently run tests.

What is a frequently run test? I can tell you that no treeview test qualifies as frequently run to me when I'm hacking on GtkPaned. And I think the same applies to you vice versa. Which reduces the frequently run tests to a number very close to 0 I guess.

> 2) Tests that are maintained outside the normal source tree could be subject to becoming unmaintained quite quickly [2], which in turn leads to tests being run less often. However, if the tests will still live under gtk+/unittest/ and are part of the default full build, then this might not be that big of a problem.

I've heard that argument before and the only response it got from me was a confused look. Testsuite code is run regularly by make distcheck and buildbots. If you add to this the fact that the code will keep running unmodified until we break API again, this argument quickly stops making any sense. At least to me. So building tests only as part of make check (like everyone else does) seems like a very sound thing to do, in particular for code as stable as GTK.

> The only issue I have with tests/ is that non-automated and automated tests might be intermixed, which is IMHO confusing. Perhaps a subdirectory unittest/ under tests/ would work?

I would personally move testgtk and friends into demos/tests/ but I don't care either way. I guess Matthias is the one that gets to decide here.

Benjamin
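The "image differ" discussed at the top of this message can be illustrated with a small standalone sketch. This is not the gtk-reftest code: the function name `diff_images` and the representation of an image as nested lists of RGB tuples are assumptions made purely for illustration, written in Python so it runs without a GTK+ checkout. It only shows the general idea of comparing a reference rendering with an actual rendering and producing a difference image.

```python
from typing import List, Tuple

# Hypothetical in-memory image format: rows of (r, g, b) tuples.
Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]


def diff_images(reference: Image, actual: Image) -> Tuple[Image, int]:
    """Return (difference image, number of differing pixels).

    Each pixel of the difference image holds the absolute per-channel
    difference, so identical pixels come out black (0, 0, 0).
    """
    same_height = len(reference) == len(actual)
    same_widths = all(len(r) == len(a) for r, a in zip(reference, actual))
    if not (same_height and same_widths):
        raise ValueError("images must have identical dimensions")

    diff: Image = []
    mismatches = 0
    for ref_row, act_row in zip(reference, actual):
        diff_row: List[Pixel] = []
        for ref_px, act_px in zip(ref_row, act_row):
            delta = (
                abs(ref_px[0] - act_px[0]),
                abs(ref_px[1] - act_px[1]),
                abs(ref_px[2] - act_px[2]),
            )
            if delta != (0, 0, 0):
                mismatches += 1
            diff_row.append(delta)
        diff.append(diff_row)
    return diff, mismatches
```

A test would then pass when the mismatch count is zero, and otherwise save the difference image next to the reference and actual renderings so a developer can see where they diverge.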
Re: reftests
On May 5, 2011, at 6:24 PM, Benjamin Otte wrote:

> On Thu, May 5, 2011 at 10:18 AM, Kristian Rietveld k...@gtk.org wrote:
>
>> As we've already discussed on IRC some time ago, I would really like to see all GTK+ unit tests in one single place, instead of in several different places in the source code. We really need people to run the unit tests more often and thus this needs to be made easy (like you also mention in your enumeration above), I don't think putting different unit tests at different places makes this easier.
>
> I do agree with putting all things in one location. However, I do not agree with "people need to run tests more often". I think running tests is a job for machines.

Yes, ideally, we would have a buildbot that runs all tests after compiling a new revision of GTK+. Of course in addition to people that run the tests after refactoring a part of the source code before committing.

>> We can always distribute the unit tests as a separate tarball if that will help, can't we?
>
> Well yeah, but then it still requires compilation of something. The approach with "Here, open this in glade" and "Does it look like this screenshot?" is a lot better, because it's very easy.

For testing layout and rendering it will work very well, but I do not have the impression that all tests can be made as easy as this. For example, the tree view scrolling tests can definitely not be done in terms of creating a glade file and comparing screenshots. However, it would definitely be interesting to include reference screenshots for a few of the tests (not all, that would be too much).

Which raises another question, would it be a good idea or make sense to merge the image differ into the GTK+ test utils so that other tests (e.g. the tree view scrolling test suite) can make use of it?

>> Another question: why was gtk-reftest put in gtk+/tests/reftests/gtk-reftest instead of in gtk+/gtk/tests/gtk-reftest, with a subdirectory reftests containing the glade files? Then on make check for the GTK+ unit tests, the reftests would automatically be executed as well. Currently, you also need to compile a GTK+ checkout to use reftests, right?
>
> I never liked the idea of putting tests in the same directory tree as the actual source code, because in the good case it creates spam to stdout (about entering directories I don't care about) and in the worst case it actually compiles tests, so it not only spams stdout but it also takes lots of time relinking. If I am actively working on

The relinking annoys me as well, which is why I started to compile in the gtk/ subdirectory with make all-am to avoid that :)

> So I do think that tests, just like documentation should live outside of the normal source tree. Which is why I put it there.

Makes sense, and I have to add that I do not have a very strong opinion with regard to where exactly the tests are located. However, it looks like there are two main reasons the tests were put at this location:

1) There seems to be some logic behind where frequently run and less frequently run tests are located. Quoting from [1]:

``
1) Figure a place for the test case. For this it's useful to keep in mind that make check will traverse CWD recursively. So tests that should be run often when glib, gdk or gtk changed should go into glib/glib/tests/, gtk+/gtk/tests/ or gtk+/gdk/tests/. Tests more thorough or planned to be run less frequently can go into glib/tests/ or gtk+/tests/. This is e.g. the case for the generic object property tester in gtk+/tests/objecttests.c. To sum up:

  glib/tests/          # less frequently run GLib tests
  glib/glib/tests/     # frequent GLib testing
  glib/gobject/tests/  # frequent GObject testing
  gtk+/tests/          # less frequently run Gdk & Gtk+ tests
  gtk+/gdk/tests/      # frequent Gdk testing
  gtk+/gtk/tests/      # frequent Gtk+ testing
''

I think this only makes sense if we want to continue to maintain this division between frequently and less frequently run tests.

2) Tests that are maintained outside the normal source tree could be subject to becoming unmaintained quite quickly [2], which in turn leads to tests being run less often. However, if the tests will still live under gtk+/unittest/ and are part of the default full build, then this might not be that big of a problem.

>> These sound like numbers I would expect. What in GTest would need improvement to realize this?
>
> GTest mainly needs usability improvements such as those you pointed out by those error messages. (I'm sorry if I offended you by taking a test written by you as the bad example, I just took a random file from gtk/tests as an example of why our current approach is bad.) The problem is that currently running tests (or a single test) manually is complicated and oftentimes ends up with unparsable error messages that often are no help in actually figuring out what got broken.
Re: reftests
Hi Benjamin,

On May 3, 2011, at 10:01 PM, Benjamin Otte wrote:

> with the latest commits[1] I have added reftests to GTK. Reftests are my approach at getting layout and rendering behavior of gtk tested. I've added a bunch of tests already for the things I have fixed and will continue to add tests for bugs I fix. For what the test runner does, see the commit message in [1], for what reftests are, see [2]. The test runner works very well, even though it is still a bit rough around the edges, but that's mostly because gtester needs to be made better to cope with generic testing. (It's way too crash-happy as-is.)

Very nice to see that we are (finally) getting testing in place for layout and rendering code!

> In this mail, I want to go into the motivation for writing reftests and why I didn't want to make use of the previous test infrastructure. I tried to achieve the following goals (if you think I could achieve them better, please speak up):
> - It should be easy to create tests
> - It should be easy to run tests
> - It should be easy to understand tests
> - It should be easy to fix problems shown by tests
> - The test infrastructure should easily scale

As we've already discussed on IRC some time ago, I would really like to see all GTK+ unit tests in one single place, instead of in several different places in the source code. We really need people to run the unit tests more often and thus this needs to be made easy (like you also mention in your enumeration above), I don't think putting different unit tests at different places makes this easier. So I think it would be good to consolidate into one location. Some ideas below.

> - It should be easy to create tests
>
> Writing a test is something people hate to do. It's the #1 reason why Open Source projects don't write tests. Also, it's the #1 reason why bugs aren't fixed. If people would file bugs with easy to reproduce tests instead of saying "in my custom application, when I do X, Y happens and not Z", there'd be a much higher chance developers would be interested in looking at it. This is why the reftests use stock ui files that can be created in Glade. So everyone that is able to use Glade can create a test file. And we can just use it.

Agreed. For all different components of GTK+, we need to think about how to make it easy to write tests. I did this for the filter model in the past and I actually receive additional tests in bugzilla now (which I am in the process of reviewing).

> - It should be easy to run tests
>
> It's quite hard to get someone to run a test. It requires compilation of a GTK checkout. That is not good.

We can always distribute the unit tests as a separate tarball if that will help, can't we?

> For a developer, too, it's quite complicated to run a test from someone else, say from bugzilla or a pastebin. Either you have to invoke gcc manually or you have to integrate it into the testsuite infrastructure. With reftests, you dump the ui file somewhere and run
>
>   tests/reftests/gtk-reftest path/to/file.ui
>
> and that's it. You can then spend the rest of the day updating the testcase wherever you want, and pastebin or mail it back and forth with whoever you work on the test together.

Of course this will work fine with glade files, but I don't see how this makes it easier to run other kinds of tests.

Another question: why was gtk-reftest put in gtk+/tests/reftests/gtk-reftest instead of in gtk+/gtk/tests/gtk-reftest, with a subdirectory reftests containing the glade files? Then on make check for the GTK+ unit tests, the reftests would automatically be executed as well. Currently, you also need to compile a GTK+ checkout to use reftests, right?

> - It should be easy to understand tests
>
> Here's an example output from the current testsuite:
>
>   /FilterModel/filled/hide-root-level:
>   ** ERROR **: Signal queue empty
>   aborting...
>
> It's hard to understand what might be broken. The output from current tests is both sparse and not very informative. If somebody came into IRC and said he ran make check and got this, I doubt anybody would know how to fix it.

This error is very easy to improve; for example, 4 lines down in the source code are "expected this, got that" error messages. I think your actual point is that the output of GTest can be significantly improved. These filter model errors are just done with separate g_error() and g_assert_not_reached() calls, because GTest did not provide API for outputting more elaborate diagnostics about test failures. I have a similar case in the scrolling tests for tree view:

  g_assert (allocation.y == rect.y + ((rect.height - allocation.height) / 2));

The output of this failed assertion is not really nice to the eyes. It would be nice if the assertion macros could be improved to also accept a human-readable string of what's going wrong together with the expected and received value. But perhaps this is already present in the gtestutils and I missed it.
gtestutils Re: reftests
On 05/05/11 04:18, Kristian Rietveld wrote:

>   g_assert (allocation.y == rect.y + ((rect.height - allocation.height) / 2));
>
> The output of this failed assertion is not really nice to the eyes. It would be nice if the assertion macros could be improved to also accept a human-readable string of what's going wrong together with the expected and received value. But perhaps this is already present in the gtestutils and I missed it.

You can do:

  g_assert_cmpint (allocation.y, ==, rect.y + ((rect.height - allocation.height) / 2));

and that gives you the "expected ..., got ..." part, but not a message. What I do now in HarfBuzz is to call g_test_message() earlier in the code to say what's going on, and then run the test with --verbose and make sense of it. That is far from what other test frameworks provide, which is a message that is printed only when the test fails.

b
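To make the wish above concrete, here is a minimal sketch of an assertion helper that reports the expected value, the received value, and a human-readable message, but only when the check fails. It is written in Python so it runs standalone; `assert_cmpint` and `centered_y` are hypothetical names chosen for this illustration and are not part of GLib or gtestutils.

```python
import operator

# Comparison operators the hypothetical helper understands.
_OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


def assert_cmpint(got, op, expected, message=""):
    """Check `got op expected`; stay silent on success.

    On failure, raise AssertionError carrying the message plus both
    values, so the explanation appears only when something is wrong.
    """
    if _OPS[op](got, expected):
        return
    prefix = f"{message}: " if message else ""
    raise AssertionError(
        f"{prefix}expected value {op} {expected}, got {got}"
    )


def centered_y(rect_y, rect_height, allocation_height):
    """Y position that vertically centers an allocation in a rectangle,
    mirroring the expression used in the assertion quoted above."""
    return rect_y + (rect_height - allocation_height) // 2
```

A scrolling test could then call something like `assert_cmpint(allocation_y, "==", centered_y(rect_y, rect_height, allocation_height), "row is not vertically centered")`, where the variable names stand in for whatever values the test has at hand.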
Re: reftests
As an update: http://blogs.gnome.org/otte/2011/05/05/reftests/ has a tutorial for writing reftests. I put it in my blog as it's nicer to lay things out there, I didn't want to send large GIF attachments via email, and it's reasonably easy for me to show it to anyone in the future if I want them to write a reftest.

Benjamin

On Tue, May 3, 2011 at 10:01 PM, Benjamin Otte o...@gnome.org wrote:

> Hey,
>
> with the latest commits[1] I have added reftests to GTK. Reftests are my approach at getting layout and rendering behavior of gtk tested. I've added a bunch of tests already for the things I have fixed and will continue to add tests for bugs I fix. For what the test runner does, see the commit message in [1], for what reftests are, see [2]. The test runner works very well, even though it is still a bit rough around the edges, but that's mostly because gtester needs to be made better to cope with generic testing. (It's way too crash-happy as-is.)
>
> In this mail, I want to go into the motivation for writing reftests and why I didn't want to make use of the previous test infrastructure. I tried to achieve the following goals (if you think I could achieve them better, please speak up):
> - It should be easy to create tests
> - It should be easy to run tests
> - It should be easy to understand tests
> - It should be easy to fix problems shown by tests
> - The test infrastructure should easily scale
>
> That's the TL;DR version, here is the long one:
>
> - It should be easy to create tests
>
> Writing a test is something people hate to do. It's the #1 reason why Open Source projects don't write tests. Also, it's the #1 reason why bugs aren't fixed. If people would file bugs with easy to reproduce tests instead of saying "in my custom application, when I do X, Y happens and not Z", there'd be a much higher chance developers would be interested in looking at it. This is why the reftests use stock ui files that can be created in Glade. So everyone that is able to use Glade can create a test file. And we can just use it.
>
> - It should be easy to run tests
>
> It's quite hard to get someone to run a test. It requires compilation of a GTK checkout. That is not good. For a developer, too, it's quite complicated to run a test from someone else, say from bugzilla or a pastebin. Either you have to invoke gcc manually or you have to integrate it into the testsuite infrastructure. With reftests, you dump the ui file somewhere and run
>
>   tests/reftests/gtk-reftest path/to/file.ui
>
> and that's it. You can then spend the rest of the day updating the testcase wherever you want, and pastebin or mail it back and forth with whoever you work on the test together.
>
> - It should be easy to understand tests
>
> Here's an example output from the current testsuite:
>
>   /FilterModel/filled/hide-root-level:
>   ** ERROR **: Signal queue empty
>   aborting...
>
> It's hard to understand what might be broken. The output from current tests is both sparse and not very informative. If somebody came into IRC and said he ran make check and got this, I doubt anybody would know how to fix it. Or be interested in actually fixing what is wrong. So it is important that tests provide output that is easy to digest and get a hunch of what is actually wrong. Which is why gtk-reftest outputs images - the reference rendering of the expected output[3], the actual rendering[4] and the difference between those[5]. And it should be reasonably easy to find the difference between them and get an idea of what is wrong (Pango doesn't ellipsize every row, only the last one. Bad Pango - and Behdad hasn't even applied my patch for this, I need to poke him again as I've just committed that test, ooops.)
>
> - It should be easy to fix problems shown by tests
>
> This is really a combination of the previous points, but deserves separate mention: If a test regresses in a year or so and the original author has left to work on Libreoffice, Mozilla or other exciting jobs, it should be easy for the current developer to fix the problem.
>
> - The test infrastructure should easily scale
>
> This is mostly a question about how to organize a test suite so that people actually run it. Or at least run the parts that are relevant to them, while an automatic testing infrastructure can do the full run and actually produce useful output to developers if something fails. So far we're pretty bad at this. Our patented test runner named Dan Winship interacts with the developers by reopening bugs with a bit of output from stderr. That works for now, but I'm not sure that test runner wants to scale. To give everyone a clue for what I'm aiming at:
> * The Swfdec testsuite contains 2,500+ tests. It takes 3 minutes to run.
> * The cairo testsuite contains 350 tests. It takes about 10 minutes to run for a normal run. A full run easily takes an hour.
> * The Webkit testsuite contains 20,000+ tests. It takes 15-20 minutes to run them all.
>
> So from looking at those numbers (and I didn't include
Re: reftests
On Thu, May 5, 2011 at 10:18 AM, Kristian Rietveld k...@gtk.org wrote:

> As we've already discussed on IRC some time ago, I would really like to see all GTK+ unit tests in one single place, instead of in several different places in the source code. We really need people to run the unit tests more often and thus this needs to be made easy (like you also mention in your enumeration above), I don't think putting different unit tests at different places makes this easier.

I do agree with putting all things in one location. However, I do not agree with "people need to run tests more often". I think running tests is a job for machines.

> We can always distribute the unit tests as a separate tarball if that will help, can't we?

Well yeah, but then it still requires compilation of something. The approach with "Here, open this in glade" and "Does it look like this screenshot?" is a lot better, because it's very easy. I guess having tests written in Python would come closest to this.

> Another question: why was gtk-reftest put in gtk+/tests/reftests/gtk-reftest instead of in gtk+/gtk/tests/gtk-reftest, with a subdirectory reftests containing the glade files? Then on make check for the GTK+ unit tests, the reftests would automatically be executed as well. Currently, you also need to compile a GTK+ checkout to use reftests, right?

I never liked the idea of putting tests in the same directory tree as the actual source code, because in the good case it creates spam to stdout (about entering directories I don't care about) and in the worst case it actually compiles tests, so it not only spams stdout but it also takes lots of time relinking. If I am actively working on refactoring code I do not want to be annoyed by random test cases. Currently this is still bearable, but when there are 10,000 tests that each get relinked and that takes 5 minutes after every 1 line change in gtknotebook.c or so, I am going to be really annoyed. So I do think that tests, just like documentation, should live outside of the normal source tree. Which is why I put it there.

> These sound like numbers I would expect. What in GTest would need improvement to realize this?

GTest mainly needs usability improvements such as those you pointed out by those error messages. (I'm sorry if I offended you by taking a test written by you as the bad example, I just took a random file from gtk/tests as an example of why our current approach is bad.) The problem is that currently running tests (or a single test) manually is complicated and oftentimes ends up with unparsable error messages that often are no help in actually figuring out what got broken.

> Also, invocation of test runners should be a lot simpler (see below). About organization, I think for one all GTK+ unit tests should be in one place (and the GDK tests in another place).

Yes, I would very much suggest tests/ for this. I don't care a whole lot if we have tests/gdk and tests/gtk, because I think it's hard to test GDK without also testing GTK as GTK has all the niceties that you want to have when writing tests (like ui files) and it's very cumbersome to write to the GDK API.

> Secondly, we also need to develop a consistent naming scheme for tests. Unit tests currently have different ways of naming tests:
>
>   /FilterModel/self/verify-test-suite:
>   /expander/click-expander:
>   /recent-manager/get-default:
>   /tests/column-new: (these are for icon view)
>   /Builder/Window:

I would attribute that to GTester. The path naming idea is kind of useless, because what you actually need to remember is the name of the test binary (so that you can run it), not the name of the actual test. There is no way to run all GtkLabel tests by saying

  gtester -p /tests/gtk/widgets/label

and that way making it grab all tests in the testsuite and make it run those that match this path. Instead you run tests/gtk/widgets/label and that's it. You can also see that gtk-reftest basically ignores the test paths - or better: abuses them to store filenames in them so that you actually get useful output when running a test, because it tells you the actual file that failed. So for this case we'd actually first need to have a use for these test paths.

Benjamin
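The kind of suite-wide selection by test path that this message says is missing could look roughly like the sketch below. It is an illustration only, in standalone Python: `select_tests` is a hypothetical function, the registry is just a list of path strings, and nothing here describes how gtester actually handles its -p option.

```python
from typing import Iterable, List


def select_tests(registered: Iterable[str], prefix: str) -> List[str]:
    """Return the registered test paths that live under `prefix`.

    Matching is done on whole path components, so "/FilterModel"
    selects "/FilterModel/self/..." but not "/FilterModelExtra/...".
    """
    prefix = prefix.rstrip("/")
    return [
        path
        for path in registered
        if path == prefix or path.startswith(prefix + "/")
    ]
```

With a consistent naming scheme, a runner that collected every test path across all test binaries could use such a filter to run, say, only the paths under one widget's prefix.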
Re: reftests
On 05/03/11 16:01, Benjamin Otte wrote:

> (Pango doesn't ellipsize every row, only the last one. Bad Pango - and Behdad hasn't even applied my patch for this, I need to poke him again as I've just committed that test, ooops.)

You see. No Pango test suite means little confidence that an incoming patch doesn't break other stuff. So I have to find and spend a good 45min slot on it, to read the code, remember all the delicacies, walk through the change, and convince myself that all the corner cases will still work. Would have been much easier if it was "here's a new test that fails; here's the patch; now everything passes". I know you already know this. I was wondering if you're making any progress on the Pango test suite front! :D

Cheers,
behdad

PS. Thanks for the great writeup. I'm forking it to discuss gtest in general.