Hi Chris:

I did draw attention to my change that removed the "wait;" statement from the loop in paratest.server that waits for all child processes to complete. That, combined with your observation that you are unable to create more threads, points at the problem: there are not enough physical threads to go around, and at least one of them is dying of starvation. There ought to be more than enough threads to go around, so perhaps the problem also involves mismatched priorities.
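For illustration only, a wait loop of the kind discussed here (check the children without blocking, then sleep until the next check) might look like the sketch below. It is written in Python rather than taken from paratest.server, and the names wait_for_workers, timeout_s, and poll_interval_s are hypothetical; it shows the general polling pattern, not the script's actual code.

```python
import subprocess
import sys
import time


def wait_for_workers(procs, timeout_s, poll_interval_s=1.0):
    """Poll child processes until all exit or the timeout expires.

    Returns the list of processes that are still running at that point.
    (Hypothetical helper; not part of paratest.server.)
    """
    deadline = time.monotonic() + timeout_s
    pending = list(procs)
    while pending and time.monotonic() < deadline:
        # poll() returns None while a child is still running; it never blocks.
        pending = [p for p in pending if p.poll() is None]
        if pending:
            # Sleeping suspends this process so the workers can be scheduled.
            time.sleep(poll_interval_s)
    # Final non-blocking check so the result reflects the current state.
    return [p for p in pending if p.poll() is None]


if __name__ == "__main__":
    # Stand-ins for the w workers: each child just sleeps briefly.
    workers = [
        subprocess.Popen([sys.executable, "-c", "import time; time.sleep(0.2)"])
        for _ in range(4)
    ]
    leftover = wait_for_workers(workers, timeout_s=10, poll_interval_s=0.1)
    print("workers still running:", len(leftover))
```

Because the loop never blocks on any single child, the timeout can be re-evaluated on every pass, which is the behavior the removal of the blocking wait was after. The cost is that progress depends on the sleeping parent and the workers all getting scheduled fairly.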
As it stands, paratest.server expects there to be at least w+1 threads available (for w workers and paratest.server itself) and for scheduling among those threads to be reasonably fair. I have assumed all along that the call to sleep() in that wait loop yields to waiting threads. If not, then we might need to find a different way to pass the time between checking for updates. I have not examined the paratest.client script to see if there are potential gotchas there.

I'll play with this a bit, and see if I can duplicate your problem on my workstation.

THH

________________________________
From: Chris Wailes [[email protected]]
Sent: Friday, April 04, 2014 3:43 PM
To: Brad Chamberlain
Cc: Tom Hildebrandt; Lydia Duncan; [email protected]
Subject: Re: [Chapel-developers] Paratest and TooManyThreads.chpl

After bisecting the commit log I found that commit 22715 is responsible for this issue. Oddly, before the script actually exits my system becomes unable to create new threads and grep says it is unable to find log files.

- Chris

On Mon, Mar 31, 2014 at 7:03 PM, Brad Chamberlain <[email protected]> wrote:

I don't have any insights, but will note that in our use cases, we tend not to use paratest to oversubscribe testing on a single machine; rather, we farm out across multiple machines, so there may be some race/conflict which only shows up in that situation?

Assuming any issue is in the paratest servers themselves, it shouldn't take you long to do the binary search -- I think there have only been five changes to it since Jan.

-Brad

On Mon, 31 Mar 2014, Tom Hildebrandt wrote:

Hi Chris:

The other change that I made in paratest.server was to remove the "wait" command on line 172 or thereabouts, so the timeout time is updated each second. I can't really see how this would cause the error messages you're seeing. On the other hand, I have never tested by forking a number of children equal to the number of processors available.
I'll give that a try (most likely this evening).

Tom H.

_____________________________________________________________________________
From: Chris Wailes [[email protected]]
Sent: Monday, March 31, 2014 3:34 PM
To: Tom Hildebrandt
Cc: Brad Chamberlain; Lydia Duncan; [email protected]
Subject: Re: [Chapel-developers] Paratest and TooManyThreads.chpl

I've been playing with this for a couple of days now, and even with skipif files for what I thought were the offending directories I end up getting the following output (https://gist.github.com/chriswailes/a1b0c4d8df4eb983607c) before the paratest.server script fails. Running start_test works just fine, but if I try to run even 4 tests at once on my quad-core, hyperthreaded machine, I get these error messages.

I haven't been as diligent with my rebasing as I should have been, so the last time I know the mainline's version of the scripts worked was on January 29th. Does anyone know what might have changed since then to have caused this problem? Before, I was able to run 10 tests at a time on this same machine.

I'm about to head home now, but tomorrow I'll run a binary search on the commit history to try and pin down the commit that caused this to stop working.

- Chris

On Fri, Mar 28, 2014 at 12:32 PM, Tom Hildebrandt <[email protected]> wrote:

That is correct. Note also that the .skipif file that skips a directory and its descendants is a sibling of the directory to be skipped, whereas the directory-wide SKIPIF file resides within the directory it affects. Compare

    test/chpldoc           <-- Skip testing here and in all descendants
    test/chpldoc.skipif    <-- if this script tests true.

vs.

    test/distributions/deitz/SKIPIF    <-- Skip testing in the containing directory (only) if this script tests true.
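As a rough illustration of the difference between the two placements, the traversal rule can be sketched as follows. This is a Python sketch and not the actual start_test logic; dirs_to_test and the tests_true predicate are hypothetical names, and how a skip file is evaluated is deliberately left to the caller.

```python
import os


def dirs_to_test(root, tests_true):
    """Yield the directories under root that should still be tested.

    tests_true(path) is a caller-supplied (hypothetical) predicate that
    evaluates a skip file and reports whether it tests true.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        # Sibling form: <dirname>.skipif in the parent prunes the whole
        # subtree. Editing dirnames in place stops os.walk from descending.
        dirnames[:] = sorted(
            d for d in dirnames
            if not (f"{d}.skipif" in filenames
                    and tests_true(os.path.join(dirpath, f"{d}.skipif")))
        )
        # In-directory form: SKIPIF skips this directory only; its
        # children are still visited on later iterations.
        if "SKIPIF" in filenames and tests_true(os.path.join(dirpath, "SKIPIF")):
            continue
        yield dirpath
```

With the layout above, a true test/chpldoc.skipif removes test/chpldoc and everything beneath it, while a true test/distributions/deitz/SKIPIF removes only test/distributions/deitz itself and leaves its subdirectories in the run.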
THH

________________________________________
From: Brad Chamberlain [[email protected]]
Sent: Friday, March 28, 2014 6:37 AM
To: Lydia Duncan; [email protected]
Subject: Re: [Chapel-developers] Paratest and TooManyThreads.chpl

IIRC, a difference between the two approaches is that putting it in the parent skips all recursive traversal below that directory as well, whereas putting it within the directory just skips that directory, but not its children?

-Brad

________________________________________
From: Lydia Duncan [[email protected]]
Sent: Thursday, March 27, 2014 2:57 PM
To: [email protected]
Subject: Re: [Chapel-developers] Paratest and TooManyThreads.chpl

On 03/27/2014 02:53 PM, Chris Wailes wrote:
> Do skipif files work for directories?

Yup! You can either make a SKIPIF within the directory, or make a <dirname>.skipif file in its parent directory.

Lydia

------------------------------------------------------------------------------
_______________________________________________
Chapel-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/chapel-developers
