Re: [webkit-dev] run-webkit-tests is moving to parallel testing by default (this weekend)
All bots (except Qt) have now been switched to parallel testing by default. As expected, this was a big win for all the bots. (Gtk 32-bit -- as a randomly selected example -- went from 37 min cycle times to 18 min cycle times.) I expect there will be a few more flaky tests we'll need to fix or disable. Let me know if you see any trouble.

-eric

On Fri, Dec 2, 2011 at 3:55 PM, Eric Seidel e...@webkit.org wrote:
> I just moved Mac this afternoon. The SnowLeopard bot went from a 1 hr 4 min (!?!) cycle time to 38 min (still !?!).
>
> http://build.webkit.org/builders/SnowLeopard%20Intel%20Debug%20%28Tests%29/builds/3317
> http://build.webkit.org/builders/SnowLeopard%20Intel%20Debug%20%28Tests%29/builds/3318
>
> I will start working on the other bots once I believe the Mac ones to be stable. Let me know if you have any trouble!
>
> Thanks
> -eric
>
> p.s. I will not be moving the Qt bots, as Ossy has asked not to have them moved yet.

___
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev
Re: [webkit-dev] run-webkit-tests is moving to parallel testing by default (this weekend)
I should have said all platforms: run-webkit-tests will use parallel testing on all platforms (except Qt), not just on the buildbot machines.

-eric

On Mon, Dec 5, 2011 at 1:24 AM, Eric Seidel e...@webkit.org wrote:
> All bots (except Qt) are now transitioned to using parallel testing by default.
Re: [webkit-dev] run-webkit-tests is moving to parallel testing by default (this weekend)
On Dec 2, 2011, at 6:55 PM, Eric Seidel wrote:
> The SnowLeopard bot went from a 1 hr 4 min (!?!) cycle time to 38 min (still !?!).

I suspect our Mac test bots could use a dose of RAM. Many of them only have 3GB, since when you're running tests one by one you don't really need much more.

-Adam
Re: [webkit-dev] run-webkit-tests is moving to parallel testing by default (this weekend)
Had to revert this for Mac in http://trac.webkit.org/changeset/102013 due to 20+ tests timing out and NRWT exiting early:
http://build.webkit.org/builders/SnowLeopard%20Intel%20Release%20%28Tests%29?numbuilds=100

- Ryosuke

On Mon, Dec 5, 2011 at 6:55 AM, Adam Roben aro...@apple.com wrote:
> I suspect our Mac test bots could use a dose of RAM. Many of them only have 3GB, since when you're running tests one by one you don't really need much more.
Re: [webkit-dev] run-webkit-tests is moving to parallel testing by default (this weekend)
I looked at one example that didn't exit early:
http://build.webkit.org/builders/SnowLeopard%20Intel%20Release%20%28Tests%29/builds/35153/steps/layout-test/logs/stdio

In that case, the http tests were the long tail and took 6 minutes longer than all the other tests. We don't split the http tests up because every time we've tried, it has caused too much flakiness. It's unclear whether the flakiness points to a bug in the test harness (e.g. in how we set up Apache), to bugs in the tests themselves, or both. If someone has time to look into this, it is probably the biggest improvement still to be found in NRWT runtime when running tests in parallel.

FYI, NRWT outputs a log of the runtime after each run:

2011-12-03 03:09:30,018 58036 printing.py:462 INFO worker/9: 4696 tests, 1746.63 secs
2011-12-03 03:09:30,018 58036 printing.py:462 INFO worker/8: 1177 tests, 1693.47 secs
2011-12-03 03:09:30,018 58036 printing.py:462 INFO worker/3: 1408 tests, 2033.51 secs
2011-12-03 03:09:30,018 58036 printing.py:462 INFO worker/2: 941 tests, 2119.65 secs
2011-12-03 03:09:30,019 58036 printing.py:462 INFO worker/1: 1121 tests, 2041.97 secs
2011-12-03 03:09:30,019 58036 printing.py:462 INFO worker/0: 1453 tests, 2515.75 secs
2011-12-03 03:09:30,019 58036 printing.py:462 INFO worker/7: 1189 tests, 1731.12 secs
2011-12-03 03:09:30,019 58036 printing.py:462 INFO worker/6: 3556 tests, 2114.37 secs
2011-12-03 03:09:30,019 58036 printing.py:462 INFO worker/5: 948 tests, 2097.13 secs
2011-12-03 03:09:30,019 58036 printing.py:462 INFO worker/4: 1411 tests, 1716.66 secs
2011-12-03 03:09:30,019 58036 printing.py:462 INFO worker/15: 795 tests, 2027.16 secs
2011-12-03 03:09:30,019 58036 printing.py:462 INFO worker/14: 1123 tests, 1732.72 secs
2011-12-03 03:09:30,019 58036 printing.py:462 INFO worker/13: 425 tests, 2021.25 secs
2011-12-03 03:09:30,019 58036 printing.py:462 INFO worker/12: 1175 tests, 1710.09 secs
2011-12-03 03:09:30,020 58036 printing.py:462 INFO worker/11: 3462 tests, 2096.30 secs
2011-12-03 03:09:30,020 58036 printing.py:462 INFO worker/10: 1449 tests, 1722.68 secs
2011-12-03 03:09:30,020 58036 printing.py:462 INFO 31120.45 cumulative, 1945.03 optimal

That shows you that, if we fully sharded all the tests, they would in theory take 1945 seconds to run, but worker/0 (the worker that runs the http tests) took 2515 seconds to run.

Ojan

On Mon, Dec 5, 2011 at 6:55 AM, Adam Roben aro...@apple.com wrote:
> I suspect our Mac test bots could use a dose of RAM.
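The "cumulative" and "optimal" figures in the log above can be checked with a short script. This is only a sketch: it assumes "optimal" means the cumulative runtime divided evenly across the 16 workers, which is consistent with the numbers in the log but is not confirmed by the NRWT source here. The function name summarize is made up for illustration.

```python
# Per-worker runtimes in seconds, copied from the NRWT log quoted in this thread.
worker_secs = {
    0: 2515.75, 1: 2041.97, 2: 2119.65, 3: 2033.51,
    4: 1716.66, 5: 2097.13, 6: 2114.37, 7: 1731.12,
    8: 1693.47, 9: 1746.63, 10: 1722.68, 11: 2096.30,
    12: 1710.09, 13: 2021.25, 14: 1732.72, 15: 2027.16,
}


def summarize(secs_by_worker):
    """Return (cumulative, optimal, slowest worker id).

    Assumption: "optimal" is the runtime if the work were spread
    perfectly evenly, i.e. cumulative / number of workers.
    """
    cumulative = sum(secs_by_worker.values())
    optimal = cumulative / len(secs_by_worker)
    slowest = max(secs_by_worker, key=secs_by_worker.get)
    return cumulative, optimal, slowest


cumulative, optimal, slowest = summarize(worker_secs)
overshoot = worker_secs[slowest] - optimal

print("cumulative %.2f s, optimal %.2f s" % (cumulative, optimal))
print("worker/%d ran %.2f s, %.2f s over the optimal"
      % (slowest, worker_secs[slowest], overshoot))
```

Summing the rounded per-worker values can differ from the logged cumulative figure in the last digit, since the log presumably sums unrounded times. The gap between worker/0 and the even split is the time that splitting up the http tests could recover in the best case.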
Re: [webkit-dev] run-webkit-tests is moving to parallel testing by default (this weekend)
I believe there are some tests (copy/paste) that it would be very hard to fully shard due to how they work.

dave

On Mon, Dec 5, 2011 at 11:08 AM, Ojan Vafai o...@chromium.org wrote:
> That shows you that, if we fully sharded all the tests, they would in theory take 1945 seconds to run, but worker/0 (the worker that runs the http tests) took 2515 seconds to run.
Re: [webkit-dev] run-webkit-tests is moving to parallel testing by default (this weekend)
Why is that? I don't know about other ports, but AFAIK, Chromium writes to a mock clipboard and the Apple Mac port writes to a local OS clipboard instance instead of the global one, specifically to avoid copy/paste tests interacting. Even without running tests in parallel, that's probably a good idea, since it keeps copy-pastes the developer is doing from affecting the test run.

That said, I believe we have a general way of marking subdirectories as needing to run serially (which is what we do for http) if there are other reasons we need to.

On Mon, Dec 5, 2011 at 11:12 AM, David Levin le...@google.com wrote:
> I believe there are some tests (copy/paste) that it would be very hard to fully shard due to how they work.
Re: [webkit-dev] run-webkit-tests is moving to parallel testing by default (this weekend)
We never implemented the general way of marking subdirectories as needing to run serially, but it would be easy to do if we needed to [the 'http' dirs are still special-cased in the code].

There is code now (landed a few months ago) to control how many http tests run in parallel, separately from the main parallelism flag (so you can run 16 workers but only 2 http tests at a time). I implemented it but never flipped the switch because it was in the middle of Eric flipping the bots over to NRWT in the first place. We should experiment with this at some point, once everything else has stabilized (this is the max_locked_shards() call in manager.py; there's no command line flag).

-- Dirk

On Mon, Dec 5, 2011 at 11:17 AM, Ojan Vafai o...@chromium.org wrote:
> That said, I believe we have a general way of marking subdirectories as needing to run serially (which is what we do for http) if there are other reasons we need to.
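The idea of capping http concurrency separately from overall parallelism (16 workers, at most 2 http shards at once) can be sketched with a counting semaphore. This is a hypothetical illustration, not the code in manager.py: ShardRunner, run_all, MAX_WORKERS and MAX_LOCKED_SHARDS are names invented here, and only max_locked_shards() is mentioned in the thread.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 16        # overall parallelism (the example figure from the message)
MAX_LOCKED_SHARDS = 2   # hypothetical cap on shards that need the shared http server


class ShardRunner:
    """Runs shards, letting at most `max_locked` lock-requiring shards overlap."""

    def __init__(self, max_locked):
        self._lock_slots = threading.BoundedSemaphore(max_locked)
        self._count_lock = threading.Lock()
        self._active_locked = 0
        self.peak_locked = 0  # highest number of locked shards seen running at once

    def run(self, shard_name, requires_lock):
        if not requires_lock:
            self._run_tests(shard_name)
            return shard_name
        # Block until one of the limited "locked shard" slots is free.
        with self._lock_slots:
            with self._count_lock:
                self._active_locked += 1
                self.peak_locked = max(self.peak_locked, self._active_locked)
            try:
                self._run_tests(shard_name)
            finally:
                with self._count_lock:
                    self._active_locked -= 1
        return shard_name

    def _run_tests(self, shard_name):
        time.sleep(0.01)  # stand-in for actually running the shard's tests


def run_all(shards):
    """Run (name, requires_lock) shards on a pool; return (peak locked, finished names)."""
    runner = ShardRunner(MAX_LOCKED_SHARDS)
    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
        futures = [pool.submit(runner.run, name, locked) for name, locked in shards]
        finished = [future.result() for future in futures]
    return runner.peak_locked, finished
```

One design cost of this simple version is that a worker waiting for a slot sits idle instead of picking up an unlocked shard; a real scheduler would more likely hand locked shards out only when a slot is free.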
Re: [webkit-dev] run-webkit-tests is moving to parallel testing by default (this weekend)
On Mon, Dec 5, 2011 at 1:01 PM, Dirk Pranke dpra...@chromium.org wrote:
> We never implemented the general way of marking subdirectories as needing to run serially, but it would be easy to do if we needed to [the 'http' dirs are still special-cased in the code].

Does it also special-case the storage tests that may collide? I'm thinking particularly of the filesystem/filewriter tests, but anything that deals with quota may also have issues.
Re: [webkit-dev] run-webkit-tests is moving to parallel testing by default (this weekend)
On Mon, Dec 5, 2011 at 1:53 PM, Eric U er...@google.com wrote:
> Does it also special-case the storage tests that may collide? I'm thinking particularly of the filesystem/filewriter tests, but anything that deals with quota may also have issues.

It does not, although it would be easy to add such support. I will also note that by default all of the tests in the same directory are run sequentially in a single thread, so that may be why it hasn't been much of an issue.

-- Dirk
Re: [webkit-dev] run-webkit-tests is moving to parallell testing by default (this weekend)
Some http tests make use of stateful PHP scripts, with different tests using the same scripts in some cases. Does each 'worker' get a dedicated http server instance, or do they share the same http server?

On Mon, Dec 5, 2011 at 1:01 PM, Dirk Pranke dpra...@chromium.org wrote:

We never implemented the general way of marking subdirectories as needing to run serially, but it would be easy to do if we needed to [the 'http' dirs are still special-cased in the code].

There is code now (landed a few months ago) to control how many http tests run in parallel separately from the main parallelism flag (so you can run 16 workers but only 2 http tests at a time). I implemented it but never flipped the switch because it was in the middle of Eric flipping the bots over to NRWT in the first place. We should try experimenting with this at some point once everything else stabilizes (this is the max_locked_shards() call in manager.py; there's no command line flag).

-- Dirk

On Mon, Dec 5, 2011 at 11:17 AM, Ojan Vafai o...@chromium.org wrote:

Why is that? I don't know about other ports, but AFAIK, chromium writes to a mock clipboard and the Apple mac port writes to a local OS clipboard instance instead of the global one, specifically to avoid copy/paste tests interacting. Even without running tests in parallel, it's probably a good idea in order to avoid copy-pastes the developer is doing from affecting the test run.

That said, I believe we have a general way of marking subdirectories as needing to run serially (which is what we do for http) if there are other reasons we need to.

On Mon, Dec 5, 2011 at 11:12 AM, David Levin le...@google.com wrote:

I believe there are some tests (copy/paste) that it would be very hard to fully shard due to how they work.

dave

On Mon, Dec 5, 2011 at 11:08 AM, Ojan Vafai o...@chromium.org wrote:

I looked at one example that didn't exit early:
http://build.webkit.org/builders/SnowLeopard%20Intel%20Release%20%28Tests%29/builds/35153/steps/layout-test/logs/stdio

In that case, the http tests were the long tail and took 6 minutes longer than all the other tests. We don't split the http tests up because every time we've tried, it's caused too much flakiness. It's unclear if the flakiness points to a bug in the test harness (e.g. in how we set up apache), to bugs in the tests themselves, or both. If someone has time to look into this, this is probably the biggest benefit to be found in NRWT runtime when running tests in parallel.

FYI, NRWT outputs a log of the runtime after each run:

2011-12-03 03:09:30,018 58036 printing.py:462 INFO worker/9: 4696 tests, 1746.63 secs
2011-12-03 03:09:30,018 58036 printing.py:462 INFO worker/8: 1177 tests, 1693.47 secs
2011-12-03 03:09:30,018 58036 printing.py:462 INFO worker/3: 1408 tests, 2033.51 secs
2011-12-03 03:09:30,018 58036 printing.py:462 INFO worker/2: 941 tests, 2119.65 secs
2011-12-03 03:09:30,019 58036 printing.py:462 INFO worker/1: 1121 tests, 2041.97 secs
2011-12-03 03:09:30,019 58036 printing.py:462 INFO worker/0: 1453 tests, 2515.75 secs
2011-12-03 03:09:30,019 58036 printing.py:462 INFO worker/7: 1189 tests, 1731.12 secs
2011-12-03 03:09:30,019 58036 printing.py:462 INFO worker/6: 3556 tests, 2114.37 secs
2011-12-03 03:09:30,019 58036 printing.py:462 INFO worker/5: 948 tests, 2097.13 secs
2011-12-03 03:09:30,019 58036 printing.py:462 INFO worker/4: 1411 tests, 1716.66 secs
2011-12-03 03:09:30,019 58036 printing.py:462 INFO worker/15: 795 tests, 2027.16 secs
2011-12-03 03:09:30,019 58036 printing.py:462 INFO worker/14: 1123 tests, 1732.72 secs
2011-12-03 03:09:30,019 58036 printing.py:462 INFO worker/13: 425 tests, 2021.25 secs
2011-12-03 03:09:30,019 58036 printing.py:462 INFO worker/12: 1175 tests, 1710.09 secs
2011-12-03 03:09:30,020 58036 printing.py:462 INFO worker/11: 3462 tests, 2096.30 secs
2011-12-03 03:09:30,020 58036 printing.py:462 INFO worker/10: 1449 tests, 1722.68 secs
2011-12-03 03:09:30,020 58036 printing.py:462 INFO 31120.45 cumulative, 1945.03 optimal

That shows you that, if we fully sharded all the tests, they would in theory take 1945 seconds to run, but worker/0 (the worker that runs the http tests) took 2515 seconds to run.

Ojan

On Mon, Dec 5, 2011 at 6:55 AM, Adam Roben aro...@apple.com wrote:

On Dec 2, 2011, at 6:55 PM, Eric Seidel wrote:

The SnowLeopard bot went from a 1 hr 4 min (!?!) cycle time, to 38 min (still !?!).

I suspect our Mac test bots could use a dose of RAM. Many of them only have 3GB, since when you're running tests one by one you don't really need much more.

-Adam
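To make the "cumulative" and "optimal" figures in the runtime log Ojan quotes above concrete, here is a minimal Python sketch that recomputes them from the per-worker lines. It assumes (an inference from the numbers, not something the thread states) that "optimal" is simply the cumulative time divided by the number of workers: 31120.45 / 16 is about 1945.03. The function name summarize and the regular expression are illustrative only; they are not part of NRWT.

```python
import re

# Per-worker lines copied from the log above (timestamp/pid prefix dropped).
LOG = """\
worker/9: 4696 tests, 1746.63 secs
worker/8: 1177 tests, 1693.47 secs
worker/3: 1408 tests, 2033.51 secs
worker/2: 941 tests, 2119.65 secs
worker/1: 1121 tests, 2041.97 secs
worker/0: 1453 tests, 2515.75 secs
worker/7: 1189 tests, 1731.12 secs
worker/6: 3556 tests, 2114.37 secs
worker/5: 948 tests, 2097.13 secs
worker/4: 1411 tests, 1716.66 secs
worker/15: 795 tests, 2027.16 secs
worker/14: 1123 tests, 1732.72 secs
worker/13: 425 tests, 2021.25 secs
worker/12: 1175 tests, 1710.09 secs
worker/11: 3462 tests, 2096.30 secs
worker/10: 1449 tests, 1722.68 secs
"""

# Hypothetical pattern for this log format; not taken from NRWT itself.
WORKER_LINE = re.compile(r"worker/(\d+): (\d+) tests, ([\d.]+) secs")


def summarize(log_text):
    """Return (cumulative, optimal, slowest_worker, slowest_secs)."""
    secs_by_worker = {
        int(m.group(1)): float(m.group(3))
        for m in WORKER_LINE.finditer(log_text)
    }
    cumulative = sum(secs_by_worker.values())
    # Assumption: "optimal" means perfectly even sharding across all workers.
    optimal = cumulative / len(secs_by_worker)
    slowest_worker = max(secs_by_worker, key=secs_by_worker.get)
    return cumulative, optimal, slowest_worker, secs_by_worker[slowest_worker]


cumulative, optimal, slowest_worker, slowest_secs = summarize(LOG)
print(f"cumulative={cumulative:.2f} optimal={optimal:.2f}")
print(f"long tail: worker/{slowest_worker} ran {slowest_secs - optimal:.0f}s past optimal")
```

The gap between the slowest worker and the even-sharding figure is the time the whole run spends waiting on the http shard.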
Re: [webkit-dev] run-webkit-tests is moving to parallell testing by default (this weekend)
If http server instance == apache child process, then no. We don't do anything in particular to limit the number of apache children running, or to bind them, but given that in the normal case there is only ever one http test running at a time, you shouldn't see any issues from contention. (You would, of course, if we start running multiple http tests in parallel, so that's something to keep in mind, especially if we shard by-test instead of by-directory).

-- Dirk

On Mon, Dec 5, 2011 at 3:13 PM, Michael Nordman micha...@google.com wrote:

Some http tests make use of stateful PHP scripts, with different tests using the same scripts in some cases. Does each 'worker' get a dedicated http server instance, or do they share the same http server?
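The idea discussed in this thread, running many workers while capping how many http shards run at once, can be sketched with a counting semaphore. This is only an illustration under stated assumptions: the thread names max_locked_shards() in manager.py but does not show its implementation, so the function run_shards, its parameters, and the shard tuples below are all hypothetical and are not NRWT code.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_shards(shards, num_workers, max_locked_shards):
    """Run (name, needs_lock) shards; cap concurrent locked (http) shards.

    Hypothetical sketch, not NRWT's implementation.
    """
    lock_slots = threading.BoundedSemaphore(max_locked_shards)
    counter_lock = threading.Lock()
    counters = {"locked_now": 0, "peak_locked": 0}

    def run(shard):
        name, needs_lock = shard
        if not needs_lock:
            time.sleep(0.01)  # stand-in for running an ordinary shard
            return name
        with lock_slots:  # blocks while max_locked_shards are already running
            with counter_lock:
                counters["locked_now"] += 1
                counters["peak_locked"] = max(
                    counters["peak_locked"], counters["locked_now"]
                )
            time.sleep(0.01)  # stand-in for running an http shard
            with counter_lock:
                counters["locked_now"] -= 1
        return name

    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        finished = list(pool.map(run, shards))
    return finished, counters["peak_locked"]


# 16 workers, but at most 2 http shards in flight, as in the example above.
shards = [(f"http/{i}", True) for i in range(8)]
shards += [(f"fast/{i}", False) for i in range(24)]
finished, peak_locked = run_shards(shards, num_workers=16, max_locked_shards=2)
print(len(finished), "shards finished; peak concurrent http shards:", peak_locked)
```

A cap like this bounds contention on the shared http server, but it does not by itself address tests that share stateful server-side scripts; those would still need to land in the same shard or run serially.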
Re: [webkit-dev] run-webkit-tests is moving to parallell testing by default (this weekend)
On Fri, Dec 2, 2011 at 3:55 PM, Eric Seidel e...@webkit.org wrote:

run-webkit-tests is moving to parallell testing by default (this weekend)

I just moved Mac this afternoon. The SnowLeopard bot went from a 1 hr 4 min (!?!) cycle time, to 38 min (still !?!).

http://build.webkit.org/builders/SnowLeopard%20Intel%20Debug%20%28Tests%29/builds/3317
http://build.webkit.org/builders/SnowLeopard%20Intel%20Debug%20%28Tests%29/builds/3318

I will start working on the other bots once I believe the Mac ones to be stable. Let me know if you have any troubles!

I'm not sure we want to flip the switch on them just yet ... if you look at the log file:

2011-12-02 15:34:44,330 98936 printing.py:462 INFO Tests that timed out or crashed:
2011-12-02 15:34:44,330 98936 printing.py:462 INFO java/argument-to-object-type.html took 830.8 seconds
2011-12-02 15:34:44,331 98936 printing.py:462 INFO fast/frames/lots-of-iframes.html took 733.7 seconds
2011-12-02 15:34:44,331 98936 printing.py:462 INFO sputnik/Unicode/Unicode_320/S7.6_A3.1.html took 565.3 seconds
2011-12-02 15:34:44,331 98936 printing.py:462 INFO fast/frames/lots-of-objects.html took 498.0 seconds
2011-12-02 15:34:44,331 98936 printing.py:462 INFO java/array-return.html took 244.7 seconds
2011-12-02 15:34:44,331 98936 printing.py:462 INFO inspector/profiler/cpu-profiler-profiling.html took 161.0 seconds
2011-12-02 15:34:44,331 98936 printing.py:462 INFO

some of those tests are taking 10 minutes or more to complete ... there's clearly one or more bugs here keeping NRWT from timing out DRT properly. Some are almost certainly in NRWT, but I wonder if there are things in the o/s or in DRT's implementation that are being serialized as well? For contrast, in the single-threaded case, even the slowest test in the previous run only took 140 seconds.

Even in the absence of bugs, the mac port seems to be running with timeout values of 35 seconds and 175 seconds for slow tests. That seems awfully slow :)

For comparison, the single-threaded Apple SL Release build took 26 minutes to complete in the most recent not-horribly-broken build (#35132); the Chromium SL Release took 15 minutes to complete using two workers (which seems about right, since Chromium's DRT is somewhat slower than Apple's IIRC). The Chromium SL Debug build also took a cumulative hour to run, but completed in ~30 minutes with two workers.

-- Dirk

PS: As an aside, we should turn on the allow multiple http tests in parallel flag on at least some of the bots; the Chromium Linux Debug bot is running on a 48-core machine (IIRC) and most of the workers are completing in 2-4 minutes but the http tests are running all in one worker and taking 24 ... with optimal sharding the whole test run should take about 4 minutes according to the log for that bot:
http://build.chromium.org/p/chromium.webkit/builders/Webkit%20Linux%20%28dbg%29/builds/903/steps/webkit_tests/logs/stdio
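For readers unfamiliar with what "timing out DRT" involves, the sketch below shows one generic way a harness can enforce a per-test wall-clock limit on a child process, using Python's standard subprocess module. It is an assumption-laden illustration, not how NRWT or DRT actually work: the function run_with_timeout, the result strings, and the stand-in child commands are hypothetical, and the demo uses a sub-second limit rather than the 35 and 175 second values mentioned above so that it finishes quickly.

```python
import subprocess
import sys


def run_with_timeout(command, timeout_secs):
    """Run one child process under a wall-clock limit (hypothetical helper).

    subprocess.run() kills the child and raises TimeoutExpired once the
    limit passes, so a hung child cannot stall the worker indefinitely.
    """
    try:
        completed = subprocess.run(
            command, capture_output=True, timeout=timeout_secs
        )
    except subprocess.TimeoutExpired:
        return "TIMEOUT"
    return "PASS" if completed.returncode == 0 else "CRASH"


# Stand-ins for a hung test and a well-behaved one.
hung_child = [sys.executable, "-c", "import time; time.sleep(30)"]
quick_child = [sys.executable, "-c", "print('ok')"]

hung_result = run_with_timeout(hung_child, timeout_secs=0.5)
quick_result = run_with_timeout(quick_child, timeout_secs=30)
print(hung_result, quick_result)
```

With a limit enforced like this, a test cannot run for much longer than its timeout, which is why runtimes of 500 to 800 seconds against a 35 or 175 second limit suggest the limit is not being applied.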
Re: [webkit-dev] run-webkit-tests is moving to parallell testing by default (this weekend)
On Fri, Dec 2, 2011 at 6:44 PM, Dirk Pranke dpra...@chromium.org wrote:

some of those tests are taking 10 minutes or more to complete ... there's clearly one or more bugs here keeping NRWT from timing out DRT properly. Some are almost certainly in NRWT, but I wonder if there are things in the o/s or in DRT's implementation that are being serialized as well?

Are you suggesting this has something to do with using more than one child process? Mac has used NRWT by default for months now. :) The bots are showing faster times with parallel execution now enabled. So I suspect this 10-minute timeout problem has existed for a while.

Please file a bug.

-eric