Re: Parallel execution of unittests
On 5/1/14, 9:07 AM, Dicebot wrote: On Thursday, 1 May 2014 at 15:37:21 UTC, Andrei Alexandrescu wrote: On 5/1/14, 8:04 AM, Dicebot wrote: On Thursday, 1 May 2014 at 14:55:50 UTC, Andrei Alexandrescu wrote: On 5/1/14, 1:34 AM, Dicebot wrote: I have just recently gone through some of our internal projects removing all accidental I/O tests for the very reason that /tmp was full. Well, a bunch of stuff will not work on a full /tmp. Sorry, hard to elicit empathy with a full /tmp :o). -- Andrei So you are OK with your unit tests failing randomly with no clear diagnostics? I'm OK with my unit tests failing on a machine with a full /tmp. The machine needs fixing. -- Andrei It got full because of tests (surprise!). Your actions? Fix the machine and reduce the output created by the unittests. It's a simple engineering problem. -- Andrei
Re: Parallel execution of unittests
On 01/05/2014 13:44, w0rp wrote: On Thursday, 1 May 2014 at 11:05:55 UTC, Jacob Carlborg wrote: On 2014-04-30 23:35, Andrei Alexandrescu wrote: Agreed. I think we should look into parallelizing all unittests. -- Andrei I recommend running the tests in random order as well. This is a bad idea. Tests could fail only some of the time. Even if bugs are missed, I would prefer it if tests did exactly the same thing every time. I am in favor of randomized order, because it can help find real bugs.
Re: Parallel execution of unittests
On 01/05/2014 16:01, Atila Neves wrote: On Thursday, 1 May 2014 at 11:44:12 UTC, w0rp wrote: On Thursday, 1 May 2014 at 11:05:55 UTC, Jacob Carlborg wrote: On 2014-04-30 23:35, Andrei Alexandrescu wrote: Agreed. I think we should look into parallelizing all unittests. -- Andrei I recommend running the tests in random order as well. This is a bad idea. Tests could fail only some of the time. Even if bugs are missed, I would prefer it if tests did exactly the same thing every time. They _should_ do exactly the same thing every time. Which is why running in threads or at random is a great way to enforce that. Atila +1
Re: Parallel execution of unittests
On Thu, 01 May 2014 12:07:19 -0400, Dicebot pub...@dicebot.lv wrote: [...] It got full because of tests (surprise!). Your actions? It would be nice to have a uniform mechanism to get a unique system-dependent file location for each specific unit test. The file should automatically delete itself at the end of the test. -Steve
Re: Parallel execution of unittests
On Thu, 01 May 2014 10:01:19 -0400, Atila Neves atila.ne...@gmail.com wrote: [...] They _should_ do exactly the same thing every time. Which is why running in threads or at random is a great way to enforce that. But not a great way to debug it. If your test failure depends on ordering, then the next run will be random too. Proposal: a runtime parameter for pre-main consumption: ./myprog --rndunit[=seed] to run unit tests in random order. It prints the seed value, as its first order of business, before starting. That way, you can repeat the exact same ordering for debugging. -Steve
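A runner along those lines can be sketched with the hook druntime already exposes, at module granularity (parsing an actual --rndunit=seed flag from Runtime.args is omitted; the seed handling is the point, and the names here are only a sketch of the proposal, not anything that exists):

```d
import core.runtime : Runtime;
import std.random : Random, randomShuffle, unpredictableSeed;
import std.stdio : writefln;

shared static this()
{
    Runtime.moduleUnitTester = function bool()
    {
        immutable uint seed = unpredictableSeed;
        // Print the seed first, so a failing order can be replayed later
        // by substituting the printed value for unpredictableSeed above.
        writefln("unittest order seed: %s", seed);

        // Gather each module's unittest entry point (module-level
        // granularity is all ModuleInfo exposes at this time).
        void function()[] tests;
        foreach (m; ModuleInfo)
            if (m !is null && m.unitTest !is null)
                tests ~= m.unitTest;

        auto rng = Random(seed);
        randomShuffle(tests, rng);
        foreach (t; tests)
            t();
        return true; // proceed to main() as usual
    };
}
```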
Re: Parallel execution of unittests
On 5/1/14, 10:09 AM, Steven Schveighoffer wrote: [...] It would be nice to have a uniform mechanism to get a unique system-dependent file location for each specific unit test. The file should automatically delete itself at the end of the test. Looks like /tmp (%TEMP% or C:\TEMP in Windows) in conjunction with the likes of mkstemp is what you're looking for :o). Andrei
Re: Parallel execution of unittests
On Thursday, 1 May 2014 at 17:24:58 UTC, Andrei Alexandrescu wrote: [...] Looks like /tmp (%TEMP% or C:\TEMP in Windows) in conjunction with the likes of mkstemp is what you're looking for :o). Andrei It hasn't been C:\TEMP for almost 13 years (before Windows XP, which is now also end-of-life). Use GetTempPath. http://msdn.microsoft.com/en-us/library/windows/desktop/aa364992(v=vs.85).aspx
Re: Parallel execution of unittests
On 5/1/14, 10:32 AM, Brad Anderson wrote: It hasn't been C:\TEMP for almost 13 years About the time when I switched :o). -- Andrei
Re: Parallel execution of unittests
On Wednesday, 30 April 2014 at 16:19:48 UTC, Byron wrote: On Wed, 30 Apr 2014 09:02:54 -0700, Andrei Alexandrescu wrote: I think indeed a small number of unittests rely on order of execution. Maybe nested unittests? unittest OrderTests { // setup for all child tests? unittest a { } unittest b { } } I like my unit tests to be next to the element under test, and it seems like this nesting would impose some limits on that. Another idea might be to use the level of the unit as an indicator of order dependencies. If UTs for B call/depend on A, then we would assign A to level 0, run its UTs first, and assign B to level 1. All 0's run before all 1's. Could we use a template arg on the UT to indicate level? unittest!(0) UtA { // test A } unittest!(1) UtB { // test B } Or maybe some fancier compiler dependency analysis?
Re: Parallel execution of unittests
On Thu, 01 May 2014 13:25:00 -0400, Andrei Alexandrescu seewebsiteforem...@erdani.org wrote: [...] Looks like /tmp (%TEMP% or C:\TEMP in Windows) in conjunction with the likes of mkstemp is what you're looking for :o). No, I'm looking for unittest_getTempFile(Line = __LINE__, File = __FILE__)(), which handles all the magic of opening a temporary file, allowing me to use it for the unit test, and then closing and deleting it at the end, when the test passes. -Steve
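A helper along these lines could be sketched today on top of Phobos. The name unittestTempFile and its behavior are hypothetical (nothing like it exists in druntime or Phobos), and unlike Steve's wish it deletes the file whether or not the test passes, since a plain destructor cannot see the test's outcome:

```d
import std.conv : to;
import std.file : exists, remove, tempDir;
import std.path : baseName, buildPath;
import std.stdio : File;

/// Hypothetical helper: a per-call-site temporary file path, so no two
/// unittests (even running in parallel) collide; the file is removed when
/// the returned value goes out of scope at the end of the test.
auto unittestTempFile(string file = __FILE__, size_t line = __LINE__)()
{
    static struct ScopedFile
    {
        string path;
        ~this()
        {
            if (path.length && path.exists)
                remove(path);
        }
    }
    // Derive the name from the call site; a robust version would also mix
    // in the process id and a random component, as mkstemp does.
    return ScopedFile(buildPath(tempDir(),
        "unittest_" ~ file.baseName ~ "_" ~ line.to!string));
}

unittest
{
    auto tmp = unittestTempFile!();
    File(tmp.path, "w").writeln("scratch data");
    // ... exercise I/O code against tmp.path ...
}   // tmp's destructor deletes the file here
```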
Re: Parallel execution of unittests
On 5/1/14, 10:41 AM, Jason Spencer wrote: [...] Could we use a template arg on the UT to indicate level? unittest!(0) UtA { // test A } unittest!(1) UtB { // test B } Or maybe some fancier compiler dependency analysis? Well how complicated can we make it all? -- Andrei
Re: Parallel execution of unittests
On Thursday, 1 May 2014 at 17:04:53 UTC, Xavier Bigand wrote: [...] They _should_ do exactly the same thing every time. Which is why running in threads or at random is a great way to enforce that. Atila +1 Tests shouldn't be run in a random order all of the time; perhaps once in a while, manually. Having continuous integration randomly report build failures is crap. Either you should always see a build failure, or you shouldn't see it. You can only test things which are deterministic, at least as far as what you observe. Running tests in a random order should be something you do manually, only when you have some ability to figure out why the tests just failed.
Re: Parallel execution of unittests
On 2014-05-01 19:12, Steven Schveighoffer wrote: But not a great way to debug it. If your test failure depends on ordering, then the next run will be random too. Proposal runtime parameter for pre-main consumption: ./myprog --rndunit[=seed] To run unit tests randomly. Prints out as first order of business the seed value before starting. That way, you can repeat the exact same ordering for debugging. That's exactly what RSpec does. I think it works great. -- /Jacob Carlborg
Re: Parallel execution of unittests
On Thu, 01 May 2014 10:42:54 -0400 Steven Schveighoffer via Digitalmars-d digitalmars-d@puremagic.com wrote: On Thu, 01 May 2014 00:49:53 -0400, Jonathan M Davis via Digitalmars-d digitalmars-d@puremagic.com wrote: On Wed, 30 Apr 2014 20:33:06 -0400 Steven Schveighoffer via Digitalmars-d digitalmars-d@puremagic.com wrote: I do think there should be a way to mark a unit test as "don't parallelize this". Regardless of what our exact solution is, a key thing is that we need to be able to have both tests which are run in parallel and tests which are run in serial. Switching to parallel by default will break code, but that may be acceptable. And I'm somewhat concerned about automatically parallelizing unit tests which aren't pure just because it's still trivial to write unittest blocks that aren't safely parallelizable (even if most such examples typically aren't good practice), whereas they'd work just fine now. But ultimately, my main concern is that we not enforce that all unit tests be parallelized, because that precludes certain types of tests. A function may be impure, but run in a pure way. True. The idea behind using purity is that it guarantees that the unittest blocks would be safely parallelizable. But even if we were to go with purity, that doesn't preclude having some way to mark a unittest as parallelizable in spite of its lack of purity. It just wouldn't be automatic. Anything that requires using the local time zone should be done in a single unit test. Most everything in std.datetime should use a defined time zone instead of local time. Because LocalTime is the default time zone, most of the tests use it. In general, I think that that's fine and desirable, because LocalTime is what most everyone is going to be using. Where I think that it actually ends up being a problem (and will eventually necessitate that I rewrite a number of the tests - possibly most of them) is when tests end up making assumptions that can break in certain time zones.
So, in the long run, I expect that far fewer tests will use LocalTime than is currently the case, but I don't think that I agree that it should be avoided on quite the level that you seem to. It is on my todo list, though, to go over std.datetime's unit tests and make it stop using LocalTime where that will result in the tests failing in some time zones. Take, for example, std.datetime. The constructor for SysTime has this line in it: _timezone = tz is null ? LocalTime() : tz; All unit tests that pass in a specific tz (such as UTC) could be pure calls. But because of that line, they can't be! Pretty much nothing involving SysTime is pure, because adjTime can't be pure, because LocalTime's conversion functions can't be pure, because they call the system's functions to do the conversions. So, very few of SysTime's unit tests could be parallelized based on purity. The constructor is just one of many places where SysTime can't be pure. So, it's an example of the types of tests that would have to be marked as explicitly parallelizable if we used purity as a means of determining automatic parallelizability. - Jonathan M Davis
Re: Parallel execution of unittests
On 2014-05-01 17:15, Andrei Alexandrescu wrote: That's all nice, but I feel we're going gung ho with overengineering already. If we give unittests names and then offer people a button "parallelize unittests" to push (don't even specify the number of threads! let the system figure it out depending on cores), that's a good step to a better world. Sure. But on the other hand, why should D not have a great unit testing framework built-in? -- /Jacob Carlborg
Re: Parallel execution of unittests
On Thu, 01 May 2014 08:07:51 -0700, Andrei Alexandrescu seewebsiteforem...@erdani.org wrote: On 5/1/14, 4:31 AM, Johannes Pfau wrote: @Andrei do you think having to explicitly import modules to be tested is an issue? Well it kinda is. All that's written on the package is unittest, we should add no fine print to it. -- Andrei It'd be possible to make it work transparently, but that's much more work. We might need to do that at some point, as it's also necessary for std.benchmark and similar code, but for now that's probably over-engineering. Here's the revived pull request: https://github.com/D-Programming-Language/dmd/pull/3518 https://github.com/D-Programming-Language/druntime/pull/782
Re: Parallel execution of unittests
On 5/1/14, 11:49 AM, Jacob Carlborg wrote: On 2014-05-01 17:15, Andrei Alexandrescu wrote: That's all nice, but I feel we're going gung ho with overengineering already. If we give unittests names and then offer people a button parallelize unittests to push (don't even specify the number of threads! let the system figure it out depending on cores), that's a good step to a better world. Sure. But on the other hand, why should D not have a great unit testing framework built-in. It should. My focus is to get (a) unittest names and (b) parallel testing into the language ASAP. Andrei
Re: Parallel execution of unittests
On 5/1/14, 12:05 PM, Johannes Pfau wrote: Am Thu, 01 May 2014 08:07:51 -0700 schrieb Andrei Alexandrescu seewebsiteforem...@erdani.org: On 5/1/14, 4:31 AM, Johannes Pfau wrote: @Andrei do you think having to explicitly import modules to be tested is an issue? Well it kinda is. All that's written on the package is unittest, we should add no fine print to it. -- Andrei It'd be possible to make it work transparently, but that's much more work. We might need to do that at some point, as it's also necessary for std.benchmark and similar code but for now that's probably over-engineering. Here's the revived pull request: https://github.com/D-Programming-Language/dmd/pull/3518 https://github.com/D-Programming-Language/druntime/pull/782 I'm unclear what this work does even after having read the description. What does it require in addition to just sprinkling unittests around? -- Andrei
Re: Parallel execution of unittests
Last but not least, virtually nobody I know runs unittests and then main. This is quickly becoming an idiom: version(unittest) void main() {} else void main() { ... } I think it's time to change that. The current system of running unit tests prior to main is, in my opinion, fundamentally broken. Logically, the unit tests are a build step - something you do after compile to ensure things are good. Tying them to running main means I cannot have a build that passes unit tests that is also a production build. Granted, it is (as far as I know) impossible to actually compile a production version of code separately from the unittest code, and be able to run the one on the other. But it would be nice to move to something more in line with unittest-as-build-step, rather than -as-different-build. On named tests, I heartily support this. Especially if it comes with the ability to selectively run one test - such is incredibly useful for large projects, to quickly iterate on broken bits.
Re: Parallel execution of unittests
On Thu, 01 May 2014 12:26:07 -0700, Andrei Alexandrescu seewebsiteforem...@erdani.org wrote: On 5/1/14, 12:05 PM, Johannes Pfau wrote: Here's the revived pull request: https://github.com/D-Programming-Language/dmd/pull/3518 https://github.com/D-Programming-Language/druntime/pull/782 I'm unclear what this work does even after having read the description. What does it require in addition to just sprinkling unittests around? -- Andrei Nothing. This just changes the way druntime internally handles the unit tests; nothing changes for the user. Right now we have one function per module, which calls all unit tests in that module, but the test runner does not have access to the individual unittest functions. Instead of exposing one function per module, this now exposes every single unittest function. So you can now run every test individually, or pass the function pointer to a different thread and run it there. Additionally this provides information about the source location of every unit test. I thought the example is quite clear?

    bool tester()
    {
        import std.stdio;
        // iterate all modules
        foreach (info; ModuleInfo)
        {
            // iterate the unit tests in each module
            foreach (test; info.unitTests)
            {
                // access unittest information
                writefln("ver=%s file=%s:%s disabled=%s func=%s",
                    test.ver, test.file, test.line, test.disabled, test.func);
                // execute the unittest
                test.func()();
            }
        }
        return true;
    }

    shared static this()
    {
        Runtime.moduleUnitTester = &tester;
    }

You customize the test runner just like you did before, by setting Runtime.moduleUnitTester in a module constructor. Now you still have to implement a custom test runner to run unittests in parallel, but that's trivial. We can of course add an implementation of this to druntime, but we can't use std.parallelism there. What's a little more complicated is the versioning scheme, but that's an implementation detail. It ensures that we can change the exposed information (for example, if we want to add a name field or remove some field) without breaking anything.
Re: Parallel execution of unittests
On 5/1/14, 12:47 PM, Johannes Pfau wrote: [...] Nothing. This just changes the way druntime internally handles the unit tests, but nothing changes for the user: Great, thanks. Just making sure there are no unstated assumptions somewhere. Help with reviewing from the compiler and druntime folks would be appreciated! -- Andrei
Re: Parallel execution of unittests
On Thu, 01 May 2014 13:00:44 -0700, Andrei Alexandrescu seewebsiteforem...@erdani.org wrote: [...] Great, thanks. Just making sure there's no unstated assumptions somewhere. Help with reviewing from the compiler and druntime folks would be appreciated! -- Andrei I added a parallel test runner example: http://dpaste.dzfl.pl/69baabd83e68 Output:

    Test 2, Thread 7FE6F0EA4D00
    Test 4, Thread 7FE6F0EA4E00
    Test 7, Thread 7FE6F0EA4E00
    Test 8, Thread 7FE6F0EA4E00
    Test 9, Thread 7FE6F0EA4E00
    Test 10, Thread 7FE6F0EA4E00
    Test 1, Thread 7FE6F0EA4F00
    Test 5, Thread 7FE6F0EA4B00
    Test 3, Thread 7FE6F0EA4C00
    Test 6, Thread 7FE6F0EA4D00
Re: Parallel execution of unittests
On Thursday, 1 May 2014 at 17:57:05 UTC, Andrei Alexandrescu wrote: Well how complicated can we make it all? -- Andrei As simple as possible, but no simpler :) I've seen you favor this or that feature because it would make unit testing easier and more accessible, and eschew features that would cause folks to not bother. In truth, we could leave it how it is. But I surmise you started this thread to improve the feature and encourage more use of unit test. So we're looking for the sweet spot. I don't think it's important to support the sharing of state between unit tests. But I do see value in being able to influence the order of test execution, largely for debugging reasons. It's important for a module's tests to be able to depend on other modules--otherwise, unittest is not very enticing. If it does and there's a failure, it's hugely helpful to know the failure is caused by the unit-under-test, and not the dependency(s). The common way to do that is to run the tests in reverse order of dependency--i.e. levelize the design and test from the bottom up. See Large Scale C++ SW Design, Lakos, Chp. 3-4. I imagine there are other niche reasons for order, but for me, this is the driving reason. So it seems a nice middle ground. If order is important, it might be a workable approach to run unittests in the reverse module dependency order by default. A careful programmer could arrange those classes/functions in modules to take advantage of that order if it were important. Seems like we'd have the dependency information--building and traversing a tree shouldn't be that tough. To preserve it, you'd only be able to parallelize the UTs within a module at a time (unless there's a different flag or something). But it seems the key question is whether order can EVER be important for any reason. I for one would be willing to give up parallelization to get levelized tests. What are you seeing on your project? How do you allow tests to have dependencies and avoid order issues?
Why is parallelization more important than that?
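For what it's worth, the reverse-dependency order Jason describes can be approximated with the hooks druntime already exposes, since ModuleInfo records each module's imports. This is only a sketch under assumptions: module-level granularity (all that ModuleInfo offers at this time), and import cycles broken arbitrarily by the visited set:

```d
import core.runtime : Runtime;

shared static this()
{
    Runtime.moduleUnitTester = function bool()
    {
        bool[string] done;

        // Depth-first: run the tests of a module's imports before its own,
        // i.e. bottom-up ("levelized") through the dependency graph.
        void run(immutable(ModuleInfo)* m)
        {
            if (m is null || m.name in done)
                return;
            done[m.name] = true;
            foreach (dep; m.importedModules)
                run(dep);
            if (m.unitTest)
                m.unitTest();
        }

        foreach (m; ModuleInfo)
            run(cast(immutable(ModuleInfo)*) m);
        return true;
    };
}
```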
Re: Parallel execution of unittests
On 5/1/14, 2:28 PM, Jason Spencer wrote: [...] But it seems the key question is whether order can EVER be important for any reason. I for one would be willing to give up parallelization to get levelized tests. What are you seeing on your project? How do you allow tests to have dependencies and avoid order issues? Why is parallelization more important than that? I'll be blunt. What you say is technically sound (which is probably why you believe it is notable) but seems to me an unnecessarily complex engineering contraption that in all likelihood has more misuses than good uses. I fully understand you may think I'm a complete chowderhead for saying this; in the past I've been in your place and others have been in mine, and it took me years to appreciate both positions. -- Andrei
Re: Parallel execution of unittests
On 5/1/14, 4:22 PM, Andrei Alexandrescu wrote: On 5/1/14, 11:49 AM, Jacob Carlborg wrote: On 2014-05-01 17:15, Andrei Alexandrescu wrote: That's all nice, but I feel we're going gung ho with overengineering already. If we give unittests names and then offer people a button parallelize unittests to push (don't even specify the number of threads! let the system figure it out depending on cores), that's a good step to a better world. Sure. But on the other hand, why should D not have a great unit testing framework built-in. It should. My focus is to get (a) unittest names and (b) parallel testing into the language ASAP. Andrei What's the rush?
Re: Parallel execution of unittests
On Thu, 01 May 2014 14:40:41 -0700 Andrei Alexandrescu via Digitalmars-d digitalmars-d@puremagic.com wrote: [...] I'll be blunt. What you say is technically sound (which is probably why you believe it is notable) but seems to me an unnecessarily complex engineering contraption that in all likelihood has more misuses than good uses. [...] -- Andrei It's my understanding that, given how druntime is put together, it should be possible to override some of its behaviors such that you could control the order in which tests were run (the main thing lacking at this point is that you can currently only control it at module-level granularity) and that that's what existing third-party unit test frameworks for D do. So, I would think that we could make it so that the default test runner does things the sensible way that works for most everyone, and then anyone who really wants more control can choose to override the normal test runner to run the tests the way that they want to. That should be essentially the way that it is now. The main question then is which features we think are sensible for everyone, and I think that based on this discussion, at this point, it's primarily:

1. Make it possible for druntime to access unit test functions individually.
2. Make it so that druntime runs unit test functions in parallel unless they're marked as needing to be run in serial (probably with a UDA for that purpose).
3. Make it so that we can name unittest blocks so that stack traces have better function names in them.

With those sorted out, we can look at further features like whether we want to be able to run unit tests by name (or whatever other nice features we can come up with), but we might as well start there rather than trying to come up with a comprehensive list of the features that D's unit testing facilities should have (especially since we really should be erring on the side of simple). - Jonathan M Davis
Re: Parallel execution of unittests
On Thursday, 1 May 2014 at 21:40:38 UTC, Andrei Alexandrescu wrote: I'll be blunt. What you say is technically sound (which is probably why you believe it is notable)... Well, I suppose that's not the MOST insulting brush-off I could hope for, but it falls short of encouraging me to contribute ideas for the improvement of the language. I'll just add this: I happened to introduce a colleague to the D webpage the other day, and ran across this in the overview: "D ... doesn't come with a VM, a religion, or an overriding philosophy. It's a practical language for practical programmers who need to get the job done quickly, reliably, and leave behind maintainable, easy to understand code." This business that only inherently parallel tests that never access disk, share setup, etc. are TRUE unit tests smacks much more of religion than pragmatism. Indeed, Phobos demonstrates that sometimes, the practical thing to do is to violate these normally good rules. Another overriding principle of D is that the easy thing to do should be the safe thing to do, and dangerous things should take some work. I don't see that reflected in the proposal to turn parallelism on by default. This seems like a time bomb waiting to go off on unsuspecting acolytes of the cult of inherently-parallel-tests-onlyism. If we don't want to consider how we can accommodate both camps here, then I must at least support Jonathan's modest suggestion that parallel UTs require active engagement rather than being the default.
Re: Parallel execution of unittests
On Friday, 2 May 2014 at 03:04:39 UTC, Jason Spencer wrote: If we don't want to consider how we can accommodate both camps here, then I must at least support Jonathan's modest suggestion that parallel UTs require active engagement rather than being the default. Use chroot() and fork(). Solves all problems.
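On POSIX, the fork() half of that suggestion can indeed be sketched with the existing runner hook: run each module's tests in a child process, so a crash, leaked global state, or a littered /tmp in one test can't take down the rest. This is only a sketch; chroot() additionally requires root privileges and is left out, and module granularity is assumed since that is all ModuleInfo exposes:

```d
version (Posix)
{
    import core.runtime : Runtime;
    import core.sys.posix.sys.wait : WEXITSTATUS, WIFEXITED, waitpid;
    import core.sys.posix.unistd : _exit, fork;

    shared static this()
    {
        Runtime.moduleUnitTester = function bool()
        {
            bool allPassed = true;
            foreach (m; ModuleInfo)
            {
                if (m is null || m.unitTest is null)
                    continue;
                auto pid = fork();
                if (pid == 0)
                {
                    // Child: run this module's tests in isolation.
                    try
                    {
                        m.unitTest();
                        _exit(0);
                    }
                    catch (Throwable)
                    {
                        _exit(1); // failed assertion or other error
                    }
                }
                int status;
                waitpid(pid, &status, 0);
                if (!(WIFEXITED(status) && WEXITSTATUS(status) == 0))
                    allPassed = false; // child crashed or failed
            }
            return allPassed;
        };
    }
}
```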
Re: Parallel execution of unittests
On 5/1/2014 12:32 AM, bearophile wrote: This is just the basic idea, and perhaps people suggested something better than this. You've already got it working with version, that's what version is for. Why add yet another way to do it?
Parallel execution of unittests
Hello, A coworker mentioned the idea that unittests could be run in parallel (using e.g. a thread pool). I've rigged things to run unittests across modules in parallel, and that works well. However, this is too coarse-grained - it would be great if each individual unittest could be pooled across the thread pool. That's more difficult to implement. This brings up the issue of naming unittests. It's becoming increasingly obvious that anonymous unittests don't quite scale - coworkers are increasingly saying things like "the unittest at line 2035 is failing". With unittests executing in multiple threads and issuing e.g. logging output, this is only likely to become more exacerbated. We've resisted named unittests but I think there's enough evidence to make the change. Last but not least, virtually nobody I know runs unittests and then main. This is quickly becoming an idiom: version(unittest) void main() {} else void main() { ... } I think it's time to change that. We could do it the non-backward-compatible way by redefining -unittest to instruct the compiler to not run main. Or we could define another flag such as -unittest-only and then deprecate the existing one. Thoughts? Would anyone want to work on such stuff? Andrei
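The main-suppressing idiom mentioned above, written out as a complete file (a minimal sketch; the file name and message are made up):

```d
// app.d - compile with `dmd -unittest app.d` to run only the tests,
// or without -unittest to run the real program.
version (unittest)
{
    // Empty entry point: druntime runs all unittest blocks first,
    // then this main does nothing.
    void main() {}
}
else
{
    import std.stdio;

    void main()
    {
        writeln("normal program logic");
    }
}

unittest
{
    assert(1 + 1 == 2); // executes before main when built with -unittest
}
```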
Re: Parallel execution of unittests
Andrei Alexandrescu: A coworker mentioned the idea that unittests could be run in parallel In D we have strong purity to make it safer to run code in parallel: pure unittest {} We've resisted named unittests but I think there's enough evidence to make the change. Yes, the optional name for unittests is an improvement: unittest {} unittest foo {} I am very glad your coworker finds such usability problems :-) We could do it the non-backward-compatible way by redefining -unittest to instruct the compiler to not run main. Good. I'd also like some built-in way (or partially built-in) to use a module only as a main module (to run its demos) or as a module to be imported. This problem is solved in Python with the if __name__ == "__main__": idiom. Bye, bearophile
Re: Parallel execution of unittests
Am Wed, 30 Apr 2014 08:43:31 -0700 schrieb Andrei Alexandrescu seewebsiteforem...@erdani.org: However, this is too coarse-grained - it would be great if each unittest could be pooled across the thread pool. That's more difficult to implement. I filed a pull request which allowed running unit tests individually (and in different threads*) two years ago, but didn't pursue this further: https://github.com/D-Programming-Language/dmd/pull/1131 https://github.com/D-Programming-Language/druntime/pull/308 To summarize: It provides a function pointer for every unit test to druntime or user code. This is actually easy to do. Naming tests requires changes in the parser, but I guess that shouldn't be difficult either. * Some time ago there was a discussion whether unit tests can rely on other tests being executed first / execution order. AFAIK some phobos tests require this. That of course won't work if you run the tests in different threads.
Re: Parallel execution of unittests
On 4/30/14, 8:54 AM, bearophile wrote: Andrei Alexandrescu: A coworker mentioned the idea that unittests could be run in parallel In D we have strong purity to make more safe to run code in parallel: pure unittest {} This doesn't follow. All unittests should be executable concurrently. -- Andrei
Re: Parallel execution of unittests
On 4/30/14, 8:54 AM, Johannes Pfau wrote: Am Wed, 30 Apr 2014 08:43:31 -0700 schrieb Andrei Alexandrescu seewebsiteforem...@erdani.org: However, this is too coarse-grained - it would be great if each unittest could be pooled across the thread pool. That's more difficult to implement. I filed a pull request which allowed running unit tests individually (and in different threads*) two years ago, but didn't pursue this further: https://github.com/D-Programming-Language/dmd/pull/1131 https://github.com/D-Programming-Language/druntime/pull/308 To summarize: It provides a function pointer for every unit test to druntime or user code. This is actually easy to do. Naming tests requires changes in the parser, but I guess that shouldn't be difficult either. That's fantastic, would you be willing to reconsider that work? * Some time ago there was a discussion whether unit tests can rely on other tests being executed first / execution order. AFAIK some phobos tests require this. That of course won't work if you run the tests in different threads. I think indeed a small number of unittests rely on order of execution. Those will be still runnable with a fork factor of 1. We'd need a way to specify that - either a flag or: static shared this() { Runtime.unittestThreads = 1; } Andrei
Re: Parallel execution of unittests
On Wed, 30 Apr 2014 09:02:54 -0700, Andrei Alexandrescu wrote: I think indeed a small number of unittests rely on order of execution. Those will be still runnable with a fork factor of 1. We'd need a way to specify that - either a flag or: static shared this() { Runtime.unittestThreads = 1; } Andrei Named tests seem like a no-brainer to me. Maybe nested unittests? unittest OrderTests { // setup for all child tests? unittest a { } unittest b { } } I also wonder if it's just better to extend/expose the unittest API for more advanced things like order of execution, test reporting, and parallel execution. And we can just support an external unittesting library to do all the advanced testing options.
Re: Parallel execution of unittests
On Wednesday, 30 April 2014 at 15:43:35 UTC, Andrei Alexandrescu wrote: [...] Thoughts? Would anyone want to work on such stuff? Andrei An existing library implementation: https://github.com/atilaneves/unit-threaded
Re: Parallel execution of unittests
On 4/30/14, 9:19 AM, Byron wrote: On Wed, 30 Apr 2014 09:02:54 -0700, Andrei Alexandrescu wrote: I think indeed a small number of unittests rely on order of execution. Those will be still runnable with a fork factor of 1. We'd need a way to specify that - either a flag or: static shared this() { Runtime.unittestThreads = 1; } Andrei Named tests seem like a no-brainer to me. Maybe nested unittests? unittest OrderTests { // setup for all child tests? unittest a { } unittest b { } } I wouldn't want to get too excited about stuff without there being a need for it. We risk overcomplicating things (i.e. what happens inside loops etc.). I also wonder if it's just better to extend/expose the unittest API for more advanced things like order of execution, test reporting, and parallel execution. And we can just support an external unittesting library to do all the advanced testing options. That would be pretty rad. Andrei
Re: Parallel execution of unittests
On 4/30/14, 9:24 AM, QAston wrote: An existing library implementation: https://github.com/atilaneves/unit-threaded Nice! The "Warning: With dmd 2.064.2 and the gold linker on Linux 64-bit this code crashes." is hardly motivating though :o). I think this project is a confluence of a couple of others, such as logging and a collection of specialized assertions. But it's hard to tell without documentation, and the linked output https://github.com/atilaneves/unit-threaded/blob/master/unit_threaded/io.d does not exist. Andrei
Re: Parallel execution of unittests
Am Wed, 30 Apr 2014 09:28:18 -0700 schrieb Andrei Alexandrescu seewebsiteforem...@erdani.org: I also wonder if it's just better to extend/expose the unittest API for more advanced things like order of execution, test reporting, and parallel execution. And we can just support an external unittesting library to do all the advanced testing options. That would be pretty rad. We can kinda do that. I guess the main problem for a simple approach is that unittests are functions but advanced frameworks often have unittest classes/objects. We can't really emulate that on top of functions. What we can easily do is parse UDAs on unittests and provide access to these UDAs. For example: module my.testlib; struct Author { string _name; string serialize() { return _name; } // Must be evaluated in CTFE } module test; import my.testlib; @Author("The Author") unittest { // Code goes here } Then with the mentioned pull request we just add another field to the runtime unittest information struct: an associative array with string keys matching the qualified name of the UDA and as values the strings returned by serialize() (evaluated by CTFE). Then we have for the test runner: foreach (m; ModuleInfo) { foreach (test; m.unitTests) { if ("my.testlib.Author" in test.uda) writefln("Author: %s", test.uda["my.testlib.Author"]); } } This is some more work to implement though, but it's additive, so we can first implement the basic mechanism in pull #1131 and then add this UDA stuff later.
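As a compile-time alternative to the runtime RTInfo field described above, UDAs attached to unittest blocks can already be read with __traits; a minimal sketch (the module name and the runAll helper are made up for illustration):

```d
module test;

struct Author { string name; }

@Author("The Author")
unittest
{
    assert(true);
}

// Iterate this module's unittest functions at compile time and
// report any Author attribute before running each test.
void runAll()
{
    import std.stdio : writefln;

    foreach (t; __traits(getUnitTests, test))
    {
        foreach (uda; __traits(getAttributes, t))
            static if (is(typeof(uda) == Author))
                writefln("Author: %s", uda.name);
        t();
    }
}
```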
Re: Parallel execution of unittests
Am Wed, 30 Apr 2014 09:02:54 -0700 schrieb Andrei Alexandrescu seewebsiteforem...@erdani.org: https://github.com/D-Programming-Language/dmd/pull/1131 https://github.com/D-Programming-Language/druntime/pull/308 To summarize: It provides a function pointer for every unit test to druntime or user code. This is actually easy to do. Naming tests requires changes in the parser, but I guess that shouldn't be difficult either. That's fantastic, would you be willing to reconsider that work? Sure, I'll have a look later today.
Re: Parallel execution of unittests
On Wed, Apr 30, 2014 at 08:43:31AM -0700, Andrei Alexandrescu via Digitalmars-d wrote: [...] Last but not least, virtually nobody I know runs unittests and then main. This is quickly becoming an idiom: version(unittest) void main() {} else void main() { ... } I think it's time to change that. We could do it the non-backward-compatible way by redefining -unittest to instruct the compiler to not run main. Or we could define another flag such as -unittest-only and then deprecate the existing one. [...] Actually, I still run unittests before main. :) When I want to *not* run unittests, I just recompile with -release (and no -unittest). The nice thing about unittests running before main is that during the code-compile-test cycle I can have the unittests run *and* manually test the program afterwards -- usually in this case I only run the program once before modifying the code and recompiling, so it would be needless work to have to compile the program twice (once for unittests, once for main). An alternative, perhaps nicer, idea is to have a *runtime* switch to run unittests, recognized by druntime, perhaps something like: ./program --pragma-druntime-run-unittests Or something similarly unlikely to clash with real options accepted by the program. T -- Genius may have its limitations, but stupidity is not thus handicapped. -- Elbert Hubbard
Re: Parallel execution of unittests
On Wednesday, 30 April 2014 at 16:24:16 UTC, QAston wrote: [...] An existing library implementation: https://github.com/atilaneves/unit-threaded Beat me to it! :P The concurrency and naming aspects are exactly what drove me to write unit-threaded to begin with. I probably wouldn't have bothered if D already had the functionality I wanted. Atila
Re: Parallel execution of unittests
As a note, I'm one of those who have used the main function in addition to unittests. I use it in the unittest build mode of my JSON serialization library, using the unittests to ensure I didn't break anything, and then using main to run a performance test checking that my changes actually did make it faster. On 4/30/14, H. S. Teoh via Digitalmars-d digitalmars-d@puremagic.com wrote: [...]
Re: Parallel execution of unittests
On Wednesday, 30 April 2014 at 16:32:19 UTC, Andrei Alexandrescu wrote: On 4/30/14, 9:24 AM, QAston wrote: An existing library implementation: https://github.com/atilaneves/unit-threaded Nice! The Warning: With dmd 2.064.2 and the gold linker on Linux 64-bit this code crashes. is hardly motivating though :o). I think this project is a confluence of a couple others, such as logging and a collection of specialized assertions. But it's hard to tell without documentation, and the linked output https://github.com/atilaneves/unit-threaded/blob/master/unit_threaded/io.d does not exist. Andrei I'm thinking of removing the warning but I have no idea how many people are using dmd 2.064.2, and it does crash if used with ld.gold. It was a dmd bug that got fixed (I voted on it). I fixed the Markdown link. That's what happens when I move code around! If you want to see what the output is like you can check out https://travis-ci.org/atilaneves/cerealed or git clone it and run dub test. I think seeing failing output is just as interesting as well, so there's a failing example in there that can be executed. The README.md says how. When I first wrote this I tried using it on Phobos to see if I could run the unit tests a lot faster but didn't have a lot of luck. I think I ran out of memory trying to reflect on its modules but I can't remember. I should try that again. Atila
Re: Parallel execution of unittests
I believe the only missing step right now is propagation of UDAs to RTInfo on demand. Everything else can be done as a Phobos solution. And if the requirement to have all modules transitively accessible from the root one is acceptable, it can already be done with http://dlang.org/traits.html#getUnitTests Simplicity of D unit tests is their best feature.
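A minimal sketch of what this looks like with __traits(getUnitTests) (mymod is a hypothetical module name standing in for any module reachable from the root):

```d
module runner;

import mymod; // hypothetical module whose tests we want to run

void runModuleTests()
{
    // getUnitTests yields an alias for every unittest block in the
    // module; each one is callable like a plain function.
    foreach (test; __traits(getUnitTests, mymod))
        test();
}
```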
Re: Parallel execution of unittests
On Wed, 30 Apr 2014 08:59:42 -0700 Andrei Alexandrescu via Digitalmars-d digitalmars-d@puremagic.com wrote: On 4/30/14, 8:54 AM, bearophile wrote: Andrei Alexandrescu: A coworker mentioned the idea that unittests could be run in parallel In D we have strong purity to make more safe to run code in parallel: pure unittest {} This doesn't follow. All unittests should be executable concurrently. -- Andrei In general, I agree. In reality, there are times when having state across unit tests makes sense - especially when there's expensive setup required for the tests. While it's not something that I generally like to do, I know that we have instances of that where I work. Also, if the unit tests have to deal with shared resources, they may very well be theoretically independent but would run afoul of each other if run at the same time - a prime example of this would be std.file, which has to operate on the file system. I fully expect that if std.file's unit tests were run in parallel, they would break. Unit tests involving sockets would be another type of test which would be at high risk of breaking, depending on what sockets they need. Honestly, the idea of running unit tests in parallel makes me very nervous. In general, across modules, I'd expect it to work, but there will be occasional cases where it will break. Across the unittest blocks in a single module, I'd be _very_ worried about breakage. There is nothing whatsoever in the language which guarantees that running them in parallel will work or even makes sense. All that protects us is the convention that unit tests are usually independent of each other, and in my experience, it's common enough that they're not independent that I think that blindly enabling parallelization of unit tests across a single module is definitely a bad idea. - Jonathan M Davis
Re: Parallel execution of unittests
On Wednesday, 30 April 2014 at 17:50:34 UTC, Jonathan M Davis via Digitalmars-d wrote: [...] it's common enough that they're not independent that I think that blindly enabling parallelization of unit tests across a single module is definitely a bad idea. - Jonathan M Davis You're right; blindly enabling parallelisation after the fact is likely to cause problems. Unit tests though, by definition (and I'm aware there is more than one) have to be independent. They must not touch the filesystem or the network - only CPU and RAM. In my case, and since I had the luxury of implementing a framework first and only writing tests after it was done, running them in parallel was an extra check that they are in fact independent. Now, it does happen that you're testing code that isn't thread-safe itself, and yes, in that case you have to run them in a single thread. That's why I added the @SingleThreaded UDA to my library to enable that. As soon as I tried calling legacy C code... We could always make running in threads opt-in. Atila
Re: Parallel execution of unittests
On Wednesday, 30 April 2014 at 17:30:30 UTC, Dicebot wrote: [...] Simplicity of D unit tests is their best feature. IMHO this best feature is only useful when writing a small script-like program. The hassle of using anything more heavy-duty is likely to make one not want to write tests. The unittest blocks are simple, and that's good. But for me I wouldn't (and haven't) use them for real work. When tests pass, it doesn't really matter if they were written using only assert, or what the output was like, or any of those things. But when they fail, I want to:
- Run the failing test(s) in isolation, selecting them on the command line by name
- Have tests grouped in categories (I use packages) to run similar tests together
- Be able to enable debug output that is normally suppressed
- Know the name of the test to know which one is failing
- Have meaningful output from the failure without having to construct said meaningful output myself (assertEquals vs assert)
I don't know about anyone else, but I make my tests fail a lot. I also added threading, hidden tests, and tests expected to fail to that list, but those are nice-to-have features. I can't do without the rest though. Also, I like pretty colours in the output for failure and success, but that might be just me :P Atila
Re: Parallel execution of unittests
Andrei Alexandrescu wrote in message news:ljr6ld$1mft$2...@digitalmars.com... This doesn't follow. All unittests should be executable concurrently. -- Andrei That's like saying all makefiles should work with -j
Re: Parallel execution of unittests
On Wed, 30 Apr 2014 17:58:34 + Atila Neves via Digitalmars-d digitalmars-d@puremagic.com wrote: Unit tests though, by definition (and I'm aware there are more than one) have to be independent. Have to not touch the filesystem, or the network. Only CPU and RAM. I disagree with this. A unit test is a test that tests a single piece of functionality - generally a function - and there are functions which have to access the file system or network. And those tests are done in unittest blocks just like any other unit test. I would very much consider std.file's tests to be unit tests. But even if you don't want to call them unit tests, because they access the file system, the reality of the matter is that tests like them are going to be run in unittest blocks, and we have to take that into account when we decide how we want unittest blocks to be run (e.g. whether they're parallelizable or not). - Jonathan M Davis
Re: Parallel execution of unittests
On Wednesday, 30 April 2014 at 15:54:42 UTC, bearophile wrote: We've resisted named unittests but I think there's enough evidence to make the change. Yes, the optional name for unittests is an improvement: unittest {} unittest foo {} I am very glad your coworker finds such usability problems :-) If we do name the unittests, then can we name them with strings? No need to pollute the namespace with ugly symbols. Also: unittest "Sort: Non-Lvalue RA range" { ... } // vs unittest SortNonLvalueRARange { ... }
Re: Parallel execution of unittests
On 4/30/2014 8:54 AM, bearophile wrote: I'd also like some built-in way (or partially built-in) to use a module only as main module (to run its demos) or as module to be imported. This problem is solved in Python with the if __name__ == __main__: idiom. dmd foo.d -unittest -main
Re: Parallel execution of unittests
monarch_dodra: If we do name the unittests, then can we name them with strings? No need to pollute the namespace with ugly symbols. Are UDAs enough? @uname("foo") unittest {} What I'd like is to tie one or more unittests to other entities, like all the unittests of a specific function. Bye, bearophile
Re: Parallel execution of unittests
On Wednesday, 30 April 2014 at 18:19:34 UTC, Jonathan M Davis via Digitalmars-d wrote: On Wed, 30 Apr 2014 17:58:34 + Atila Neves via Digitalmars-d digitalmars-d@puremagic.com wrote: Unit tests though, by definition (and I'm aware there are more than one) have to be independent. Have to not touch the filesystem, or the network. Only CPU and RAM. I disagree with this. A unit test is a test that tests a single piece of functionality - generally a function - and there are functions which have to access the file system or network. They _use_ access to the file system or network, but it is _not_ their functionality. Unit testing is all about verifying small, perfectly separated pieces of functionality which don't depend on the correctness / stability of any other functions / programs. Doing I/O goes against that pretty much by definition and is unfortunately one of the most common testing antipatterns.
Re: Parallel execution of unittests
On 4/30/14, bearophile via Digitalmars-d digitalmars-d@puremagic.com wrote: What I'd like is to tie one or more unittests to other entities, like all the unittests of a specific function. This would also lead to a more stable documented unittest feature and the ability to declare documented unittests outside the scope of the target symbol. E.g. if you have a templated aggregate and a function inside it you may want to add a single documented unittest for the function /outside/ the aggregate, otherwise it will get compiled for every unique instance of that aggregate.
Re: Parallel execution of unittests
On Wednesday, 30 April 2014 at 18:04:43 UTC, Atila Neves wrote: [...] I don't know about anyone else, but I make my tests fail a lot. I think this is the key difference. For me a failing unit test is always an exceptional situation. And if a test group is complex enough to require categorization then either my code is not procedural enough or the module is just too big and needs to be split. There are of course always some tests with a complicated environment and/or I/O. But those are never unit tests and thus belong to a completely different framework.
Re: Parallel execution of unittests
On Wed, 2014-04-30 at 10:50 -0700, Jonathan M Davis via Digitalmars-d wrote: […] In general, I agree. In reality, there are times when having state across unit tests makes sense - especially when there's expensive setup required for the tests. While it's not something that I generally like to do, I know that we have instances of that where I work. Also, if the unit tests have to deal with shared resources, they may very well be theoretically independent but would run afoul of each other if run at the same time - a prime example of this would be std.file, which has to operate on the file system. I fully expect that if std.file's unit tests were run in parallel, they would break. Unit tests involving sockets would be another type of test which would be at high risk of breaking, depending on what sockets they need. Surely if there is expensive set up you are doing an integration or system test not a unit test. In a unit test all expensive set up should be mocked out. Honestly, the idea of running unit tests in parallel makes me very nervous. In general, across modules, I'd expect it to work, but there will be occasional cases where it will break. Across the unittest blocks in a single module, I'd be _very_ worried about breakage. There is nothing whatsoever in the language which guarantees that running them in parallel will work or even makes sense. All that protects us is the convention that unit tests are usually independent of each other, and in my experience, it's common enough that they're not independent that I think that blindly enabling parallelization of unit tests across a single module is definitely a bad idea. All tests should be independent, therefore there should be no problem executing all tests at the same time and/or in any order. If tests have to be executed in a specific order then they are not separate tests and should be merged into a single test — which likely means they are integration or system tests not unit tests. -- Russel. 
= Dr Russel Winder t: +44 20 7585 2200 voip: sip:russel.win...@ekiga.net 41 Buckmaster Road m: +44 7770 465 077 xmpp: rus...@winder.org.uk London SW11 1EN, UK w: www.russel.org.uk skype: russel_winder
Re: Parallel execution of unittests
On Wed, 2014-04-30 at 11:19 -0700, Jonathan M Davis via Digitalmars-d wrote: […] I disagree with this. A unit test is a test that tests a single piece of functionality - generally a function - and there are functions which have to access the file system or network. And those tests are done in These are integration/system tests not unit tests. For unit tests network activity should be mocked out. unittest blocks just like any other unit test. I would very much consider std.file's tests to be unit tests. But even if you don't want to call them unit tests, because they access the file system, the reality of the matter is that tests like them are going to be run in unittest blocks, and we have to take that into account when we decide how we want unittest blocks to be run (e.g. whether they're parallelizable or not). In which case D is wrong to allow them in the unittest blocks and should introduce a new way of handling these tests. And even then all tests can and should be parallelized. If they cannot be then there is an inappropriate dependency. -- Russel.
Re: Parallel execution of unittests
On 2014-04-30 17:43, Andrei Alexandrescu wrote: Hello, A coworker mentioned the idea that unittests could be run in parallel (using e.g. a thread pool). I've rigged things to run in parallel unittests across modules, and that works well. However, this is too coarse-grained - it would be great if each unittest could be pooled across the thread pool. That's more difficult to implement. Can't we just collect all unit tests with __traits(getUnitTests) and put them through std.parallelism: foreach (unitTest ; unitTests.parallel) unitTest(); This brings up the issue of naming unittests. It's becoming increasingly obvious that anonymous unittests don't quite scale - coworkers are increasingly talking about "the unittest at line 2035 is failing" and such. With unittests executing in multiple threads and issuing e.g. logging output, this is only likely to become more exacerbated. We've resisted named unittests but I think there's enough evidence to make the change. Named unit tests are already possible with the help of UDA's: @name("foo bar") unittest { assert(true); } I've tried several times here, in reviews, to get people to add some description to the unit tests. But so far no one has agreed. I'm using something quite similar to RSpec from the Ruby world: describe!"toMsec" in { it!"returns the time in milliseconds" in { assert(true); } } This uses the old syntax, with UDA's it becomes something like this: @describe("toMsec") { @it("returns the time in milliseconds") unittest { assert(true); } } Last but not least, virtually nobody I know runs unittests and then main. This is quickly becoming an idiom: version(unittest) void main() {} else void main() { ... } Or dmd -unittest -main -run foo.d I think it's time to change that. We could do it the non-backward-compatible way by redefining -unittest to instruct the compiler to not run main. Or we could define another flag such as -unittest-only and then deprecate the existing one.
Fine by me, I don't like that main is run after the unit tests. Thoughts? Would anyone want to work on such stuff? Are you thinking of built-in support or an external library? -- /Jacob Carlborg
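Jacob's suggestion can be sketched roughly as follows (a minimal sketch, not the eventual druntime implementation; the test bodies are arbitrary, and note that compiling with -unittest also makes druntime run the blocks serially before main unless the default tester is overridden via core.runtime.Runtime.moduleUnitTester):

```d
// Gather this module's unittest blocks with __traits(getUnitTests)
// and run them on std.parallelism's task pool.
import std.parallelism : parallel;

unittest { assert(1 + 1 == 2); }
unittest { assert("abc".length == 3); }

void runModuleTestsInParallel(alias mod)()
{
    void function()[] tests;
    // __traits(getUnitTests, mod) yields each unittest block as a function
    foreach (test; __traits(getUnitTests, mod))
        tests ~= &test;
    foreach (t; tests.parallel) // each block becomes a task on the pool
        t();
}

void main()
{
    runModuleTestsInParallel!(mixin(__MODULE__))();
}
```

Compiled with dmd -unittest, this runs the two blocks concurrently; the harder part Andrei mentions - pooling blocks from many modules - also needs every module to be reachable from the root, which is the restriction Dicebot raises later in the thread.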
Re: Parallel execution of unittests
On 2014-04-30 19:30, Dicebot wrote: I believe only missing step right now is propagation of UDA's to RTInfo when demanded. Everything else can be done as Phobos solution. I don't see why this is necessary for this case. -- /Jacob Carlborg
Re: Parallel execution of unittests
On 4/30/14, 10:58 AM, Atila Neves wrote: We could always make running in threads opt-in. Yah, great idea. -- Andrei
Re: Parallel execution of unittests
On 4/30/14, 10:50 AM, Jonathan M Davis via Digitalmars-d wrote: There is nothing whatsoever in the language which guarantees that running them in parallel will work or even makes sense. Default thread-local globals? -- Andrei
Re: Parallel execution of unittests
On 4/30/14, 11:13 AM, Daniel Murphy wrote: Andrei Alexandrescu wrote in message news:ljr6ld$1mft$2...@digitalmars.com... This doesn't follow. All unittests should be executable concurrently. -- Andrei That's like saying all makefiles should work with -j They should. -- Andrei
Re: Parallel execution of unittests
On 4/30/14, 11:53 AM, monarch_dodra wrote: On Wednesday, 30 April 2014 at 15:54:42 UTC, bearophile wrote: We've resisted named unittests but I think there's enough evidence to make the change. Yes, the optional name for unittests is an improvement: unittest {} unittest foo {} I am very glad your coworker finds such usability problems :-) If we do name the unittests, then can we name them with strings? No need to pollute the namespace with ugly symbols. Also: // unittest "Sort: Non-Lvalue RA range" { ... } // vs // unittest SortNonLvalueRARange { ... } // I'd argue for regular identifiers instead of strings - they can be seen in stack traces, accessed with __FUNCTION__ etc. -- Andrei
Re: Parallel execution of unittests
On 4/30/14, 1:09 PM, Russel Winder via Digitalmars-d wrote: And even then all tests can and should be parallelized. If they cannot be then there is an inappropriate dependency. Agreed. -- Andrei
Re: Parallel execution of unittests
On 4/30/14, 1:19 PM, Jacob Carlborg wrote: On 2014-04-30 17:43, Andrei Alexandrescu wrote: Hello, A coworker mentioned the idea that unittests could be run in parallel (using e.g. a thread pool). I've rigged things to run in parallel unittests across modules, and that works well. However, this is too coarse-grained - it would be great if each unittest could be pooled across the thread pool. That's more difficult to implement. Can't we just collect all unit tests with __traits(getUnitTests) and put them through std.parallelism: foreach (unitTest ; unitTests.parallel) unitTest(); I didn't know of that trait; I adapted code from druntime/src/test_runner.d. Named unit tests are already possible with the help of UDA's: @name("foo bar") unittest { assert(true); } I've tried several times here, in reviews, to get people to add some description to the unit tests. But so far no one has agreed. Yah I think that's possible but I'd like the name to be part of the function name as well e.g. unittest__%s. I'm using something quite similar to RSpec from the Ruby world: describe!"toMsec" in { it!"returns the time in milliseconds" in { assert(true); } } This uses the old syntax, with UDA's it becomes something like this: @describe("toMsec") { @it("returns the time in milliseconds") unittest { assert(true); } } That looks... interesting. Thoughts? Would anyone want to work on such stuff? Are you thinking of built-in support or an external library? Built in with possible help from druntime and/or std. Andrei
Re: Parallel execution of unittests
On Wednesday, 30 April 2014 at 19:20:20 UTC, Dicebot wrote: On Wednesday, 30 April 2014 at 18:04:43 UTC, Atila Neves wrote: On Wednesday, 30 April 2014 at 17:30:30 UTC, Dicebot wrote: I believe only missing step right now is propagation of UDA's to RTInfo when demanded. Everything else can be done as Phobos solution. And if requirement to have all modules transitively accessible from root one is acceptable it can be already done with http://dlang.org/traits.html#getUnitTests Simplicity of D unit tests is their best feature. IMHO this best feature is only useful when writing a small script-like program. The hassle of using anything more heavy-duty is likely to make one not want to write tests. The unittest blocks are simple, and that's good. But for me I wouldn't (and haven't) use them for real work. When tests pass, it doesn't really matter if they were written using only assert or what the output was like or any of those things. But when they fail, I want to:
. Run the failing test(s) in isolation, selecting them on the command-line by name
. Have tests grouped in categories (I use packages) to run similar tests together
. Be able to enable debug output that is normally suppressed
. Know the name of the test to know which one is failing
. Have meaningful output from the failure without having to construct said meaningful output myself (assertEquals vs assert)
I don't know about anyone else, but I make my tests fail a lot. I think this is the key difference. For me a failing unit test is always an exceptional situation. I TDD a lot. Tests failing are normal. Not only that, I refactor a lot as well. Which causes tests to fail. Fortunately I have tests failing to tell me I screwed up. Even if failing tests were exceptional, I still want everything I just mentioned. And if a test group is complex enough to require categorization then either my code is not procedural enough or the module is just too big and needs to be split. And when I split them I put them into a subcategory.
Re: Parallel execution of unittests
On Wednesday, 30 April 2014 at 18:19:34 UTC, Jonathan M Davis via Digitalmars-d wrote: On Wed, 30 Apr 2014 17:58:34 + Atila Neves via Digitalmars-d digitalmars-d@puremagic.com wrote: Unit tests though, by definition (and I'm aware there are more than one) have to be independent. Have to not touch the filesystem, or the network. Only CPU and RAM. I disagree with this. A unit test is a test that tests a single piece of functionality - generally a function - and there are functions which have to access the file system or network. And those tests are done in unittest blocks just like any other unit test. I would very much consider std.file's tests to be unit tests. But even if you don't want to call them unit tests, because they access the file system, the reality of the matter is that tests like them are going to be run in unittest blocks, and we have to take that into account when we decide how we want unittest blocks to be run (e.g. whether they're parallelizable or not). - Jonathan M Davis On what's a unit test: I +1 everything Dicebot and Russel Winder said. Of course there are functions with side effects. Of course they should be tested. But those tests aren't unit tests. Which won't stop people from using a unit test framework to run them. In fact, every test I've ever written using Python's unittest module was an integration test. But again, you're right. Whatever changes happen have to take into account the current status. And the current status makes it difficult if not impossible to run existing tests in multiple threads by default. One could argue that the Phobos tests should be changed too, but that won't help with the existing client codebase out there.
Re: Parallel execution of unittests
On Wed, 30 Apr 2014 13:26:40 -0700 Andrei Alexandrescu via Digitalmars-d digitalmars-d@puremagic.com wrote: On 4/30/14, 10:50 AM, Jonathan M Davis via Digitalmars-d wrote: There is nothing whatsoever in the language which guarantees that running them in parallel will work or even makes sense. Default thread-local globals? -- Andrei Sure, that helps, but it's trivial to write a unittest block which depends on a previous unittest block, and as soon as a unittest block uses an external resource such as a socket or file, then even if a unittest block doesn't directly depend on the end state of a previous unittest block, it still depends on external state which could be affected by other unittest blocks. So, ultimately, the language really doesn't ensure that running a unittest block can be parallelized. If it's pure as bearophile suggested, then it can be done, but as long as a unittest block is impure, then it can rely on global state - even inadvertently - (be it state directly in the program or state outside the program) and therefore not work when parallelized. So, I suppose that you could parallelize unittest blocks if they were marked as pure (though I'm not sure if that's currently a legal thing to do), but impure unittest blocks aren't guaranteed to be parallelizable. I'm all for making it possible to parallelize unittest block execution, but as it stands, doing so automatically would be a bad idea. We could make it so that a unittest block could be marked as parallelizable, or we could even move towards making parallelizable the default and require that a unittest block be marked as unparallelizable, but we'd have to be very careful with that, as it will break code if we're not careful about how we do that transition. I'm inclined to think that marking unittest blocks as pure to parallelize them is a good idea, because then the unittest blocks that are guaranteed to be parallelizable are run in parallel, whereas those that aren't wouldn't be.
The primary downside would be the cases where the programmer knew that they could be parallelized but they weren't pure, since those unittest blocks wouldn't be parallelized. - Jonathan M Davis
Re: Parallel execution of unittests
On Wed, 30 Apr 2014 18:53:22 + monarch_dodra via Digitalmars-d digitalmars-d@puremagic.com wrote: On Wednesday, 30 April 2014 at 15:54:42 UTC, bearophile wrote: We've resisted named unittests but I think there's enough evidence to make the change. Yes, the optional name for unittests is an improvement: unittest {} unittest foo {} I am very glad your coworker finds such usability problems :-) If we do name the unittests, then can we name them with strings? No need to pollute the namespace with ugly symbols. Also: // unittest "Sort: Non-Lvalue RA range" { ... } // vs // unittest SortNonLvalueRARange { ... } // It would be simple enough to avoid polluting the namespace. IIRC, right now, the unittest blocks get named after the line number that they're on. All we'd have to do is change it so that their name included the name given by the programmer rather than the line number. e.g. unittest(testFoo) { } results in a function called something like unittest_testFoo. - Jonathan M Davis
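Short of a language change, the UDA route mentioned earlier in the thread can get most of the way there today. A hypothetical sketch (the @name struct and labelOf helper are invented for illustration, not Phobos API):

```d
// A string UDA for labeling blocks, plus a compile-time helper that a
// custom runner could use to report a readable label instead of the
// line-number-based symbol the compiler generates.
struct name { string value; }

@name("Sort: Non-Lvalue RA range")
unittest
{
    assert(true);
}

string labelOf(alias test)()
{
    foreach (attr; __traits(getAttributes, test))
        static if (is(typeof(attr) == name))
            return attr.value;
    // fall back to the generated identifier, e.g. __unittest_L6_C1
    return __traits(identifier, test);
}
```

A runner built on __traits(getUnitTests) would print labelOf!test() on failure, which covers the string-versus-identifier debate at the library level while the compiler change is being discussed.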
Re: Parallel execution of unittests
On 4/30/14, 2:25 PM, Jonathan M Davis via Digitalmars-d wrote: Sure, that helps, but it's trivial to write a unittest block which depends on a previous unittest block, and as soon as a unittest block uses an external resource such as a socket or file, then even if a unittest block doesn't directly depend on the end state of a previous unittest block, it still depends on external state which could be affected by other unittest blocks. So, ultimately, the language really doesn't ensure that running a unittest block can be parallelized. If it's pure as bearophile suggested, then it can be done, but as long as a unittest block is impure, then it can rely on global state - even inadvertently - (be it state directly in the program or state outside the program) and therefore not work when parallelized. So, I suppose that you could parallelize unittest blocks if they were marked as pure (though I'm not sure if that's currently a legal thing to do), but impure unittest blocks aren't guaranteed to be parallelizable. Agreed. I think we should look into parallelizing all unittests. -- Andrei
Re: Parallel execution of unittests
On 4/30/14, Andrei Alexandrescu via Digitalmars-d digitalmars-d@puremagic.com wrote: This brings up the issue of naming unittests. See also this ER where I discuss why I wanted this recently: https://issues.dlang.org/show_bug.cgi?id=12473
Re: Parallel execution of unittests
On Wed, 30 Apr 2014 21:09:14 +0100 Russel Winder via Digitalmars-d digitalmars-d@puremagic.com wrote: On Wed, 2014-04-30 at 11:19 -0700, Jonathan M Davis via Digitalmars-d wrote: unittest blocks just like any other unit test. I would very much consider std.file's tests to be unit tests. But even if you don't want to call them unit tests, because they access the file system, the reality of the matter is that tests like them are going to be run in unittest blocks, and we have to take that into account when we decide how we want unittest blocks to be run (e.g. whether they're parallelizable or not). In which case D is wrong to allow them in the unittest blocks and should introduce a new way of handling these tests. And even then all tests can and should be parallelized. If they cannot be then there is an inappropriate dependency. Why? Because Andrei suddenly proposed that we parallelize unittest blocks? If I want to test a function, I'm going to put a unittest block after it to test it. If that means accessing I/O, then it means accessing I/O. If that means messing with mutable, global variables, then that means messing with mutable, global variables. Why should I have to put the tests elsewhere or make it so that they don't run when the -unittest flag is used just because they don't fall under your definition of unit test? There is nothing in the language which has ever mandated that unittest blocks be parallelizable or that they be pure (which is essentially what you're saying all unittest blocks should be). And restricting unittest blocks so that they have to be pure (be it conceptually pure or actually pure) would be a _loss_ of functionality. Sure, let's make it possible to parallelize unittest blocks where appropriate, but I contest that we should start requiring that unittest blocks be pure (which is what a function has to be in order to be parallelized whether it's actually marked as pure or not).
That would force us to come up with some other testing mechanism to run those tests when there is no need to do so (and I would argue that there is no compelling reason to do so other than ideology with regards to what is truly a unit test). On the whole, I think that unittest blocks work very well as they are. If we want to expand on their features, then great, but let's do so without adding new restrictions to them. - Jonathan M Davis
Re: Parallel execution of unittests
On Wed, Apr 30, 2014 at 02:48:38PM -0700, Jonathan M Davis via Digitalmars-d wrote: On Wed, 30 Apr 2014 21:09:14 +0100 Russel Winder via Digitalmars-d digitalmars-d@puremagic.com wrote: [...] In which case D is wrong to allow them in the unittest blocks and should introduce a new way of handling these tests. And even then all tests can and should be parallelized. If they cannot be then there is an inappropriate dependency. Why? Because Andrei suddenly proposed that we parallelize unittest blocks? If I want to test a function, I'm going to put a unittest block after it to test it. If that means accessing I/O, then it means accessing I/O. If that means messing with mutable, global variables, then that means messing with mutable, global variables. Why should I have to put the tests elsewhere or make it so that they don't run when the -unittest flag is used just because they don't fall under your definition of unit test? [...] What about allowing pure marking on unittests, and those unittests that are marked pure will be parallelized, and those that aren't marked will be run serially? T -- Amateurs built the Ark; professionals built the Titanic.
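The attribute is already accepted on unittest blocks today, so the split described here could look like this at the declaration site (a sketch; whether a runner actually exploits the distinction is the open question):

```d
// A pure block cannot touch global or external state, so a runner could
// safely schedule it on the task pool:
pure unittest
{
    int square(int x) pure { return x * x; }
    assert(square(4) == 16);
}

// An impure block like this one would stay on the serial path:
unittest
{
    import std.file : tempDir;
    assert(tempDir().length > 0); // touches the file system
}
```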
Re: Parallel execution of unittests
On Wed, Apr 30, 2014 at 02:25:22PM -0700, Jonathan M Davis via Digitalmars-d wrote: [...] Sure, that helps, but it's trivial to write a unittest block which depends on a previous unittest block, and as soon as a unittest block uses an external resource such as a socket or file, then even if a unittest block doesn't directly depend on the end state of a previous unittest block, it still depends on external state which could be affected by other unittest blocks. In this case I'd argue that the test was poorly-written. I can see multiple unittests using, say, the same temp filename for testing file I/O, in which case they shouldn't be parallelized; but if a unittest depends on a file created by a previous unittest, then something is very, very wrong with the unittest. [...] I'm inclined to think that marking unittest blocks as pure to parallelize them is a good idea, because then the unittest blocks that are guaranteed to be parallelizable are run in parallel, whereas those that aren't wouldn't be. Agreed. The primary downside would be the cases where the programmer knew that they could be parallelized but they weren't pure, since those unittest blocks wouldn't be parallelized. [...] Is it a big loss to have *some* unittests non-parallelizable? (I don't know, do we have hard data on this front?) T -- The two rules of success: 1. Don't tell everything you know. -- YHL
Re: Parallel execution of unittests
What about allowing pure marking on unittests, and those unittests that are marked pure will be parallelized, and those that aren't marked will be run serially? I guess that goes for inferred purity as well...
Re: Parallel execution of unittests
On Wed, 30 Apr 2014 13:50:10 -0400, Jonathan M Davis via Digitalmars-d digitalmars-d@puremagic.com wrote: On Wed, 30 Apr 2014 08:59:42 -0700 Andrei Alexandrescu via Digitalmars-d digitalmars-d@puremagic.com wrote: On 4/30/14, 8:54 AM, bearophile wrote: Andrei Alexandrescu: A coworker mentioned the idea that unittests could be run in parallel In D we have strong purity to make more safe to run code in parallel: pure unittest {} This doesn't follow. All unittests should be executable concurrently. -- Andrei In general, I agree. In reality, there are times when having state across unit tests makes sense - especially when there's expensive setup required for the tests.

int a;
unittest { /* set up a */ }
unittest { /* use a */ }

==

unittest
{
    int a;
    { /* set up a */ }
    { /* use a */ }
}

It makes no sense to do it the first way, you are not gaining anything. Honestly, the idea of running unit tests in parallel makes me very nervous. In general, across modules, I'd expect it to work, but there will be occasional cases where it will break. Then you didn't write your unit-tests correctly. True unit tests, anyway. In fact, the very quality that makes unit tests so valuable (that they are independent of other code) is ruined by sharing state across tests. If you are going to share state, it really is one unit test. Across the unittest blocks in a single module, I'd be _very_ worried about breakage. There is nothing whatsoever in the language which guarantees that running them in parallel will work or even makes sense. All that protects us is the convention that unit tests are usually independent of each other, and in my experience, it's common enough that they're not independent that I think that blindly enabling parallelization of unit tests across a single module is definitely a bad idea. I think that if we add the assumption, the resulting fallout would be easy to fix.
Note that we can't require unit tests to be pure -- non-pure functions need testing too :) I can imagine that even if you could only parallelize 90% of unit tests, that would be an effective optimization for a large project. In such a case, the rare (and I mean rare to the point that I can't think of a single use-case) need to deny parallelization could be marked. -Steve
Re: Parallel execution of unittests
On Wed, 30 Apr 2014 11:43:31 -0400, Andrei Alexandrescu seewebsiteforem...@erdani.org wrote: Hello, A coworker mentioned the idea that unittests could be run in parallel (using e.g. a thread pool). I've rigged things to run in parallel unittests across modules, and that works well. However, this is too coarse-grained - it would be great if each unittest could be pooled across the thread pool. That's more difficult to implement. I am not sure, but are unit-test blocks one function each, or one function per module? If the latter, that would have to be changed. This brings up the issue of naming unittests. It's becoming increasingly obvious that anonymous unittests don't quite scale - coworkers are increasingly talking about the unittest at line 2035 is failing and such. With unittests executing in multiple threads and issuing e.g. logging output, this is only likely to become more exacerbated. We've resisted named unittests but I think there's enough evidence to make the change. I would note this enhancement, which Walter agreed should be done at DConf '13 ;) https://issues.dlang.org/show_bug.cgi?id=10023 Jacob Carlborg has tried to make this work, but the PR has not been pulled yet (I think it needs some updating at least, and there were some unresolved questions IIRC). Last but not least, virtually nobody I know runs unittests and then main. This is quickly becoming an idiom: version(unittest) void main() {} else void main() { ... } I think it's time to change that. We could do it the non-backward-compatible way by redefining -unittest to instruct the compiler to not run main. Or we could define another flag such as -unittest-only and then deprecate the existing one. The runtime can intercept this parameter. I would like a mechanism to run main decided at runtime. We need no compiler modifications to effect this. Thoughts? Would anyone want to work on such stuff? I can probably take a look at changing the unittests to avoid main without a runtime parameter. 
I have a good grasp on how the pre-main runtime works, having rewritten the module constructor algorithm a while back. I am hesitant to run all unit tests in parallel without an opt-out mechanism. The above enhancement being implemented would give us some ways to play around, though. -Steve
Re: Parallel execution of unittests
Le 30/04/2014 17:43, Andrei Alexandrescu a écrit : Hello, A coworker mentioned the idea that unittests could be run in parallel (using e.g. a thread pool). I've rigged things to run in parallel unittests across modules, and that works well. However, this is too coarse-grained - it would be great if each unittest could be pooled across the thread pool. That's more difficult to implement. I think it's a great idea, mainly for TDD. I experimented with this in Java, and when execution time grows, TDD rapidly loses its efficiency. Some Eclipse plug-ins are able to run them in parallel if I remember correctly. This brings up the issue of naming unittests. It's becoming increasingly obvious that anonymous unittests don't quite scale - coworkers are increasingly talking about "the unittest at line 2035 is failing" and such. With unittests executing in multiple threads and issuing e.g. logging output, this is only likely to become more exacerbated. We've resisted named unittests but I think there's enough evidence to make the change. IMO naming is important for reporting tools (test status, benchmarks, ...). Unittests evolve with the rest of the code. Last but not least, virtually nobody I know runs unittests and then main. This is quickly becoming an idiom: version(unittest) void main() {} else void main() { ... } I think it's time to change that. We could do it the non-backward-compatible way by redefining -unittest to instruct the compiler to not run main. Or we could define another flag such as -unittest-only and then deprecate the existing one. Thoughts? Would anyone want to work on such stuff? Andrei
Re: Parallel execution of unittests
Le 30/04/2014 17:59, Andrei Alexandrescu a écrit : On 4/30/14, 8:54 AM, bearophile wrote: Andrei Alexandrescu: A coworker mentioned the idea that unittests could be run in parallel In D we have strong purity to make more safe to run code in parallel: pure unittest {} This doesn't follow. All unittests should be executable concurrently. -- Andrei But sometimes unittests have to use shared data that needs to be initialized before them. File system operations are generally a critical point: if many unittests are based on the same auto-generated file data, it's a good idea to run this generation once before all tests (they can then do a file copy, which is fast on a copy-on-write file system, or the data can be used as read-only by all tests). So for those kinds of situations some functions must be able to run before the unittests - isn't that the case for a module's static this() function?
Re: Parallel execution of unittests
Le 30/04/2014 19:50, Jonathan M Davis via Digitalmars-d a écrit : On Wed, 30 Apr 2014 08:59:42 -0700 Andrei Alexandrescu via Digitalmars-d digitalmars-d@puremagic.com wrote: On 4/30/14, 8:54 AM, bearophile wrote: Andrei Alexandrescu: A coworker mentioned the idea that unittests could be run in parallel In D we have strong purity to make more safe to run code in parallel: pure unittest {} This doesn't follow. All unittests should be executable concurrently. -- Andrei In general, I agree. In reality, there are times when having state across unit tests makes sense - especially when there's expensive setup required for the tests. While it's not something that I generally like to do, I know that we have instances of that where I work. Also, if the unit tests have to deal with shared resources, they may very well be theoretically independent but would run afoul of each other if run at the same time - a prime example of this would be std.file, which has to operate on the file system. I fully expect that if std.file's unit tests were run in parallel, they would break. Unit tests involving sockets would be another type of test which would be at high risk of breaking, depending on what sockets they need. Honestly, the idea of running unit tests in parallel makes me very nervous. In general, across modules, I'd expect it to work, but there will be occasional cases where it will break. Across the unittest blocks in a single module, I'd be _very_ worried about breakage. There is nothing whatsoever in the language which guarantees that running them in parallel will work or even makes sense. All that protects us is the convention that unit tests are usually independent of each other, and in my experience, it's common enough that they're not independent that I think that blindly enabling parallelization of unit tests across a single module is definitely a bad idea. - Jonathan M Davis I shared this kind of experience too. 
pure unittest name {} seems a good idea, and it's more intuitive for unittests to behave the same as other functions when their signatures are this similar.
Re: Parallel execution of unittests
Le 30/04/2014 19:58, Atila Neves a écrit : On Wednesday, 30 April 2014 at 17:50:34 UTC, Jonathan M Davis via Digitalmars-d wrote: On Wed, 30 Apr 2014 08:59:42 -0700 Andrei Alexandrescu via Digitalmars-d digitalmars-d@puremagic.com wrote: On 4/30/14, 8:54 AM, bearophile wrote: Andrei Alexandrescu: A coworker mentioned the idea that unittests could be run in parallel In D we have strong purity to make more safe to run code in parallel: pure unittest {} This doesn't follow. All unittests should be executable concurrently. -- Andrei In general, I agree. In reality, there are times when having state across unit tests makes sense - especially when there's expensive setup required for the tests. While it's not something that I generally like to do, I know that we have instances of that where I work. Also, if the unit tests have to deal with shared resources, they may very well be theoretically independent but would run afoul of each other if run at the same time - a prime example of this would be std.file, which has to operate on the file system. I fully expect that if std.file's unit tests were run in parallel, they would break. Unit tests involving sockets would be another type of test which would be at high risk of breaking, depending on what sockets they need. Honestly, the idea of running unit tests in parallel makes me very nervous. In general, across modules, I'd expect it to work, but there will be occasional cases where it will break. Across the unittest blocks in a single module, I'd be _very_ worried about breakage. There is nothing whatsoever in the language which guarantees that running them in parallel will work or even makes sense. All that protects us is the convention that unit tests are usually independent of each other, and in my experience, it's common enough that they're not independent that I think that blindly enabling parallelization of unit tests across a single module is definitely a bad idea. 
- Jonathan M Davis You're right; blindly enabling parallelisation after the fact is likely to cause problems. Unit tests though, by definition (and I'm aware there are more than one) have to be independent. Have to not touch the filesystem, or the network. Only CPU and RAM. In my case, and since I had the luxury of implementing a framework first and only writing tests after it was done, running them in parallel was an extra check that they are in fact independent. Why shouldn't a test touch the filesystem? That's really restrictive; you just can't get good code coverage on a lot of libraries with such a restriction. I worked on Source Control Management software, and all tests had to deal with a DB, which requires file system and network operations. IMO it's pretty much impossible to avoid testing how functions relate; simple integration tests are often needed to ensure that the application is working correctly. If D integrates features to support automated testing, it must not be too restrictive, especially since everybody will expect the commonly used features (like named tests, formatted result output, ...). Some of those common features should be added to Phobos instead of the language. Now, it does happen that you're testing code that isn't thread-safe itself, and yes, in that case you have to run them in a single thread. That's why I added the @SingleThreaded UDA to my library to enable that. As soon as I tried calling legacy C code... We could always make running in threads opt-in. Atila
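An opt-out marker in the spirit of Atila's @SingleThreaded could look roughly like this (all names here are hypothetical sketches of the idea, not his library's actual API):

```d
import std.parallelism : parallel;

// Marker UDA: blocks carrying it are kept off the parallel pass.
enum SingleThreaded;

@SingleThreaded unittest
{
    // e.g. exercises non-thread-safe legacy C code
    assert(true);
}

unittest { assert(2 * 2 == 4); }

void runAll(alias mod)()
{
    void function()[] parallelizable, serial;
    foreach (test; __traits(getUnitTests, mod))
    {
        bool single = false;
        foreach (attr; __traits(getAttributes, test))
            static if (__traits(isSame, attr, SingleThreaded))
                single = true;
        if (single)
            serial ~= &test;
        else
            parallelizable ~= &test;
    }
    foreach (t; parallelizable.parallel)
        t();
    foreach (t; serial) // one at a time, in declaration order
        t();
}
```

Making the marker opt-in versus opt-out is then just a question of which list is the default, which is exactly the transition concern Jonathan raises elsewhere in the thread.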
Re: Parallel execution of unittests
Le 30/04/2014 21:23, Dicebot a écrit : On Wednesday, 30 April 2014 at 18:19:34 UTC, Jonathan M Davis via Digitalmars-d wrote: On Wed, 30 Apr 2014 17:58:34 + Atila Neves via Digitalmars-d digitalmars-d@puremagic.com wrote: Unit tests though, by definition (and I'm aware there are more than one) have to be independent. Have to not touch the filesystem, or the network. Only CPU and RAM. I disagree with this. A unit test is a test that tests a single piece of functionality - generally a function - and there are functions which have to access the file system or network. They _use_ access to the file system or network, but it is _not_ their functionality. Unit testing is all about verifying small, perfectly separated pieces of functionality which don't depend on the correctness / stability of any other functions / programs. Doing I/O goes against that pretty much by definition and is unfortunately one of the most common testing antipatterns. Splitting all features at an absolutely atomic level can be achieved for open-source libraries, but it's pretty much impossible for industrial software. Why be so restrictive when it's possible to support both visions by extending the language a little with something already logical?
Re: Parallel execution of unittests
On 4/30/14, 6:20 PM, Xavier Bigand wrote: Le 30/04/2014 17:59, Andrei Alexandrescu a écrit : On 4/30/14, 8:54 AM, bearophile wrote: Andrei Alexandrescu: A coworker mentioned the idea that unittests could be run in parallel In D we have strong purity to make more safe to run code in parallel: pure unittest {} This doesn't follow. All unittests should be executable concurrently. -- Andrei But sometimes unittests have to use shared data that needs to be initialized before them. File system operations are generally a critical point: if many unittests are based on the same auto-generated file data, it's a good idea to run that generation once before all tests (they can each do a file copy, which is fast on a copy-on-write file system, or the data can be used read-only by all tests). So for those kinds of situations some functions must be able to run before the unittests; isn't that the case for modules' static this() functions? Yah version(unittest) shared static this() { ... } covers that. -- Andrei
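A runnable sketch of the module-constructor idiom Andrei suggests. The fixture contents are illustrative; in a real project the constructor and fixture would be wrapped in `version (unittest)` so they only exist in test builds:

```d
// Immutable shared fixture: initialized exactly once, then read-only,
// so concurrently running tests cannot race on it.
immutable string[] sharedFixture;

shared static this()
{
    // Runs once, in the main thread, before main() and before any
    // unittest block - a safe place for expensive, shared setup.
    sharedFixture = ["generated", "read-only", "test", "data"];
}

unittest
{
    // Every test sees the same immutable data.
    assert(sharedFixture.length == 4);
}

void main()
{
    assert(sharedFixture[0] == "generated");
}
```

Because the fixture is `immutable` and built in a `shared static this()`, the usual parallelization hazard (mutable shared setup state) disappears by construction.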
Re: Parallel execution of unittests
Le 30/04/2014 22:09, Russel Winder via Digitalmars-d a écrit : On Wed, 2014-04-30 at 11:19 -0700, Jonathan M Davis via Digitalmars-d wrote: […] I disagree with this. A unit test is a test that tests a single piece of functionality - generally a function - and there are functions which have to access the file system or network. And those tests are done in These are integration/system tests, not unit tests. For unit tests, network activity should be mocked out. And what do you do when your mock is buggy? You also risk having the mock up to date when changing the code but not the real application, because before the commit you'll only run your unittests. IMO, every test that can be automated and run quickly should be run before each commit, even if some are integration tests. unittest blocks just like any other unit test. I would very much consider std.file's tests to be unit tests. But even if you don't want to call them unit tests, because they access the file system, the reality of the matter is that tests like them are going to be run in unittest blocks, and we have to take that into account when we decide how we want unittest blocks to be run (e.g. whether they're parallelizable or not). In which case D is wrong to allow them in the unittest blocks and should introduce a new way of handling these tests. And even then all tests can and should be parallelized. If they cannot be then there is an inappropriate dependency.
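The mocking approach Russel describes usually means the unit under test depends on an interface, so the test exercises the logic with an in-memory fake and never opens a socket. A minimal sketch; all names here (`Transport`, `FakeTransport`, `serverAlive`) are invented for illustration:

```d
// The dependency is abstracted behind an interface...
interface Transport
{
    string fetch(string url);
}

// ...so the test can inject a fake that touches only CPU and RAM.
final class FakeTransport : Transport
{
    string fetch(string url) { return "pong"; }
}

// The "unit" being tested: pure logic over whatever the transport returns.
bool serverAlive(Transport t)
{
    return t.fetch("http://example.com/ping") == "pong";
}

unittest
{
    assert(serverAlive(new FakeTransport)); // no network, parallel-safe
}

void main()
{
    assert(serverAlive(new FakeTransport));
}
```

Xavier's objection still stands, of course: a fake only proves the logic against the fake's behavior, so a few integration tests against the real transport remain necessary.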
Re: Parallel execution of unittests
Le 30/04/2014 18:19, Byron a écrit : On Wed, 30 Apr 2014 09:02:54 -0700, Andrei Alexandrescu wrote: I think indeed a small number of unittests rely on order of execution. Those will be still runnable with a fork factor of 1. We'd need a way to specify that - either a flag or: shared static this() { Runtime.unittestThreads = 1; } Andrei Named tests seem like a no-brainer to me. Maybe nested unittests? unittest OrderTests { // setup for all child tests? unittest a { } unittest b { } } I also wonder if it's just better to extend/expose the unittest API for more advanced things like order of execution, test reporting, and parallel execution. And we can just support an external unittesting library to do all the advanced testing options. I don't see the use case. I'd find it nice enough if IDEs were able to put unittests in a tree, using the modules' names for the hierarchy.
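Byron's nested `unittest OrderTests { ... }` syntax would need a language change, but plain names can already be attached with a UDA, and a dotted naming convention gives the tree Xavier wants without new syntax. A sketch; the `Name` type and the dotted convention are invented for illustration:

```d
// Hypothetical Name UDA: a runner or IDE could read it back with
// __traits(getAttributes, test) and build a display hierarchy from it.
struct Name { string value; }

@Name("order.setup") unittest { /* setup-ish test */ }
@Name("order.use")   unittest { /* dependent test */ }

void main()
{
    import std.array : split;

    // Splitting on '.' yields the tree levels for display.
    auto parts = Name("order.use").value.split(".");
    assert(parts == ["order", "use"]);
}
```

This keeps the advanced features (naming, grouping, reporting) in library territory, which is the direction Byron's last paragraph points at.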
Re: Parallel execution of unittests
Le 01/05/2014 03:54, Andrei Alexandrescu a écrit : On 4/30/14, 6:20 PM, Xavier Bigand wrote: Le 30/04/2014 17:59, Andrei Alexandrescu a écrit : On 4/30/14, 8:54 AM, bearophile wrote: Andrei Alexandrescu: A coworker mentioned the idea that unittests could be run in parallel In D we have strong purity to make more safe to run code in parallel: pure unittest {} This doesn't follow. All unittests should be executable concurrently. -- Andrei But sometimes unittests have to use shared data that needs to be initialized before them. File system operations are generally a critical point: if many unittests are based on the same auto-generated file data, it's a good idea to run that generation once before all tests (they can each do a file copy, which is fast on a copy-on-write file system, or the data can be used read-only by all tests). So for those kinds of situations some functions must be able to run before the unittests; isn't that the case for modules' static this() functions? Yah version(unittest) shared static this() { ... } covers that. -- Andrei Then I am pretty much OK with the parallelization of all unittests. There remains the question of naming; I don't really know whether it has to be in the language or in Phobos like other test features (test logger, benchmark, ...).
Re: Parallel execution of unittests
On Wed, 30 Apr 2014 15:33:17 -0700 H. S. Teoh via Digitalmars-d digitalmars-d@puremagic.com wrote: On Wed, Apr 30, 2014 at 02:48:38PM -0700, Jonathan M Davis via Digitalmars-d wrote: On Wed, 30 Apr 2014 21:09:14 +0100 Russel Winder via Digitalmars-d digitalmars-d@puremagic.com wrote: [...] In which case D is wrong to allow them in the unittest blocks and should introduce a new way of handling these tests. And even then all tests can and should be parallelized. If they cannot be then there is an inappropriate dependency. Why? Because Andrei suddenly proposed that we parallelize unittest blocks? If I want to test a function, I'm going to put a unittest block after it to test it. If that means accessing I/O, then it means accessing I/O. If that means messing with mutable, global variables, then that means messing with mutable, global variables. Why should I have to put the tests elsewhere or make it so that they don't run when the -unittest flag is used just because they don't fall under your definition of unit test? [...] What about allowing pure marking on unittests, so that those unittests that are marked pure are parallelized, and those that aren't marked are run serially? I think that that would work, and if we added purity inference to unittest blocks as Nordlow suggests, then you wouldn't even have to mark them as pure unless you wanted to enforce that it be runnable in parallel. - Jonathan M Davis
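The guarantee being leaned on here can be seen directly: `pure` is a valid attribute on a unittest block, and it makes the compiler reject any access to global mutable state. A small sketch (build with -unittest to actually run the blocks):

```d
int hits; // global mutable state

pure unittest
{
    assert(1 + 1 == 2); // fine: touches nothing shared
    // ++hits;          // would not compile: a pure unittest cannot
                        // mutate global state
}

unittest
{
    ++hits; // impure, so a purity-aware runner would keep it serial
}

void main()
{
    // 0 when built without -unittest, 1 once the impure block has run.
    assert(hits == 0 || hits == 1);
}
```

This is why the pure subset is trivially safe to parallelize: the compiler, not a convention, rules out the shared-state races being worried about in this thread.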
Re: Parallel execution of unittests
On Wed, 30 Apr 2014 20:33:06 -0400 Steven Schveighoffer via Digitalmars-d digitalmars-d@puremagic.com wrote: On Wed, 30 Apr 2014 13:50:10 -0400, Jonathan M Davis via Digitalmars-d digitalmars-d@puremagic.com wrote: On Wed, 30 Apr 2014 08:59:42 -0700 Andrei Alexandrescu via Digitalmars-d digitalmars-d@puremagic.com wrote: On 4/30/14, 8:54 AM, bearophile wrote: Andrei Alexandrescu: A coworker mentioned the idea that unittests could be run in parallel In D we have strong purity to make more safe to run code in parallel: pure unittest {} This doesn't follow. All unittests should be executable concurrently. -- Andrei In general, I agree. In reality, there are times when having state across unit tests makes sense - especially when there's expensive setup required for the tests. int a; unittest { // set up a; } unittest { // use a; } == unittest { int a; { // set up a; } { // use a; } } It makes no sense to do it the first way; you are not gaining anything. It can make sense to do it the first way when it's more like LargeDocumentOrDatabase foo; unittest { // set up foo; } unittest { // test something using foo } unittest { // do other tests using foo which then take advantage of changes made // by the previous test rather than doing all of those changes to // foo in order to set up this test } In general, I agree that tests shouldn't be done that way, and I don't think that I've ever done it personally, but I've seen it done, and for stuff that requires a fair bit of initialization, it can save time to have each test build on the state of the last. But even if we all agree that that sort of testing is a horrible idea, the language supports it right now, and automatically parallelizing unit tests will break any code that does that. Honestly, the idea of running unit tests in parallel makes me very nervous. In general, across modules, I'd expect it to work, but there will be occasional cases where it will break. Then you didn't write your unit tests correctly.
True unit tests, anyway. In fact, the very quality that makes unit tests so valuable (that they are independent of other code) is ruined by sharing state across tests. If you are going to share state, it really is one unit test. All it takes is two tests in separate modules with separate functionality accessing the file system or sockets or some other system resource, and they could end up breaking because the other test is messing with the same resource. I'd expect that to be a relatively rare case, but it _can_ happen, so simply parallelizing tests across modules does risk test failures that would not have occurred otherwise. Across the unittest blocks in a single module, I'd be _very_ worried about breakage. There is nothing whatsoever in the language which guarantees that running them in parallel will work or even makes sense. All that protects us is the convention that unit tests are usually independent of each other, and in my experience, it's common enough that they're not independent that I think that blindly enabling parallelization of unit tests across a single module is definitely a bad idea. I think that if we add the assumption, the resulting fallout would be easy to fix. Note that we can't require unit tests to be pure -- non-pure functions need testing too :) Sure, they need testing. Just don't test them in parallel, because they're not guaranteed to work in parallel. That guarantee _does_ hold for pure functions, because they don't access global, mutable state. So, we can safely parallelize a unittest block that is pure, but we _can't_ safely parallelize one that isn't - not in a guaranteed way. I can imagine that even if you could only parallelize 90% of unit tests, that would be an effective optimization for a large project. In such a case, the rare (and I mean rare to the point that I can't think of a single use case) need to deny parallelization could be marked. std.file's unit tests would break immediately.
It wouldn't surprise me if std.socket's unit tests broke. std.datetime's unit tests would probably break on Posix systems, because some of them temporarily set the local time zone - which sets it for the whole program, not just the current thread (those tests aren't done on Windows, because Windows only lets you set it for the whole OS, not just the program). Any tests which aren't pure risk breakage due to changes in whatever global, mutable state they're accessing. I would strongly argue that automatically parallelizing any unittest block which isn't pure is a bad idea, because it's not guaranteed to work, and it _will_ result in bugs in at least some cases. If we make it so that unittest blocks have their purity inferred (and allow you to mark them as pure to enforce that they be pure if you want to require it), then the pure ones can be safely parallelized while the rest continue to run serially.
Re: Parallel execution of unittests
On Wed, 30 Apr 2014 14:35:45 -0700 Andrei Alexandrescu via Digitalmars-d digitalmars-d@puremagic.com wrote: Agreed. I think we should look into parallelizing all unittests. -- I'm all for parallelizing all unittest blocks that are pure, as doing so would be safe, but I think that we're making a big mistake if we try and insist that all unittest blocks be able to be run in parallel. Any that aren't pure are not guaranteed to be parallelizable, and any which access system resources or other global, mutable state stand a good chance of breaking. If we make it so that the functions generated from unittest blocks have their purity inferred, then any unittest block which can safely be parallelized could then be parallelized by the test runner based on their purity, and any impure unittest functions could then be safely run in serial. And if you want to make sure that a unittest block is parallelizable, then you can just explicitly mark it as pure. With that approach, we don't risk breaking existing unit tests, and it allows tests that need to not be run in parallel to work properly by guaranteeing that they're still run serially. And it even makes it so that many tests are automatically parallelizable without the programmer having to do anything special for it. - Jonathan M Davis
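The purity-based partitioning Jonathan describes can be sketched with today's reflection facilities. `__traits(getUnitTests, ...)` yields a module's generated unittest functions (only when built with -unittest; otherwise the tuple is empty), and `std.traits.functionAttributes` exposes each one's declared or inferred attributes. Everything else below is a hand-rolled sketch, not an existing runner:

```d
module purity_runner;

import std.parallelism : parallel;
import std.traits : functionAttributes, FunctionAttribute;

pure unittest { assert(2 + 2 == 4); } // would land in the parallel bucket

void main()
{
    void function()[] parallelSafe, serialOnly;

    // Partition this module's unittest functions by purity.
    foreach (test; __traits(getUnitTests, purity_runner))
    {
        static if (functionAttributes!test & FunctionAttribute.pure_)
            parallelSafe ~= &test;
        else
            serialOnly ~= &test;
    }

    foreach (t; parallel(parallelSafe))
        t(); // statically guaranteed not to touch global mutable state
    foreach (t; serialOnly)
        t(); // impure tests keep the old serial behavior
}
```

This is exactly the compromise in the post above: the pure subset is parallelized for free, and nothing is broken for tests that were never written with concurrency in mind.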
Re: Parallel execution of unittests
On Wednesday, 30 April 2014 at 15:43:35 UTC, Andrei Alexandrescu wrote: This brings up the issue of naming unittests. It's becoming increasingly obvious that anonymous unittests don't quite scale. A message structured like this would be awesome: Unittest Failed foo.d:345 Providing null input throws exception. Last but not least, virtually nobody I know runs unittests and then main. This is quickly becoming an idiom: version(unittest) void main() {} else void main() { ... } I think it's time to change that. We could do it the non-backward-compatible way by redefining -unittest to instruct the compiler to not run main. Or we could define another flag such as -unittest-only and then deprecate the existing one. I would like to see -unittest redefined.
Re: Parallel execution of unittests
On 4/30/14, 10:01 PM, Jonathan M Davis via Digitalmars-d wrote: I'm all for parallelizing all unittest blocks that are pure, as doing so would be safe, but I think that we're making a big mistake if we try and insist that all unittest blocks be able to be run in parallel. Any that aren't pure are not guaranteed to be parallelizable, and any which access system resources or other global, mutable state stand a good chance of breaking. There are a number of assumptions here: (a) most unittests that can be effectively parallelized can be actually inferred (or declared) as pure; (b) most unittests that cannot be inferred as pure are likely to break; (c) it's a big deal if unittests break. I question all of these assumptions. In particular I consider unittests that depend on one another an effective antipattern that needs to be eradicated. Andrei