Re: Linux testing on single-core VMs nowadays
We do talos testing on in-house machinery (iX machines with four cores). I'm not sure whether that would trigger some of the issues you are hoping to catch. In the future we should be able to have some jobs run on different EC2 instance types. See https://bugzilla.mozilla.org/show_bug.cgi?id=985650 It will require a lot of work, but it is possible.

cheers,
Armen

On 14-04-08 03:45 AM, ishikawa wrote:
> I run Thunderbird under valgrind from time to time. Valgrind slows down
> CPU execution by a very large factor, and it seems to open many windows
> for thread races.
> [...]
> If artificial CPU execution tweaking (by changing the number of cores,
> or even more advanced tweaking methods if available) can help, it is
> worth a try.

_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
Re: Linux testing on single-core VMs nowadays
On 2014-04-08 15:20, Gabriele Svelto wrote:
> One configuration that is particularly good at catching threading errors
> (especially narrow races) is constraining the software to run on two
> hardware threads on the same SMT-enabled core. This effectively forces
> the threads to share the L1 D$, which in turn can reveal some otherwise
> very-hard-to-find data synchronization issues.
> [...]

I run Thunderbird under valgrind from time to time.

Valgrind slows down CPU execution by a very large factor, and it seems to open many windows for thread races. (Sometimes a very short window is prolonged enough that events caused by, say, I/O can fall inside this usually short window.)

During valgrind execution, I have seen errors that were not reported anywhere, and many have happened only once :-(

If a VM (such as VirtualBox, VMware Player or something) can artificially change the execution speed of the CPU, or even of different cores slightly (maybe to 1/2, 1/3, 1/4), I am sure many thread-race issues will be caught.

I agree that this is a brute-force approach, but please recall that the first space shuttle launch had to be aborted due to a software glitch. It was a timing issue and, according to the analysis at the time, it could happen once in 72 (or was it 74?) cases. Even NASA, with its deep pockets, and its subcontractor could not catch it before launch.

I am afraid that the situation has not changed much (unless we use a computer language well suited to avoiding these thread-race issues). We need all the help we can get to track down visible and dormant thread races. If artificial CPU execution tweaking (by changing the number of cores, or even more advanced tweaking methods if available) can help, it is worth a try. Maybe not always, if such work costs extra money, but a prolonged (say, a week) testing run from time to time (each quarter or half a year, or maybe just prior to beta testing of a major release?) seems worthwhile.

TIA
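[Editor's note: the "slow execution to widen race windows" approach above can be made systematic with Valgrind's race-detection tools. The sketch below is a hypothetical harness; `./myapp` is a placeholder binary, not anything from this thread, and the wrapper simply skips when valgrind or the binary is unavailable.]

```python
# Sketch: run a test binary under Valgrind's race detectors (Helgrind/DRD).
# Besides reporting races directly, Valgrind's large slowdown widens narrow
# race windows, as described in the message above.
import shutil
import subprocess

def run_under_race_detector(binary="./myapp", tool="helgrind"):
    """Run `binary` under the given Valgrind tool.

    Returns the exit code (nonzero if races were reported, thanks to
    --error-exitcode), or None if valgrind is not installed.
    """
    if shutil.which("valgrind") is None:
        return None
    cmd = ["valgrind", f"--tool={tool}", "--error-exitcode=1", binary]
    try:
        return subprocess.run(cmd).returncode
    except FileNotFoundError:
        # valgrind disappeared between the which() check and the run
        return None

if __name__ == "__main__":
    for tool in ("helgrind", "drd"):
        print(tool, run_under_race_detector(tool=tool))
```

Helgrind and DRD find overlapping but not identical sets of races, so running both is cheap insurance on a periodic (say, weekly) schedule as suggested above.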
Re: Linux testing on single-core VMs nowadays
On 07/04/2014 23:13, Dave Hylands wrote:
> Personally, I think that the more ways we can test for threading issues
> the better. It seems to me that we should do some amount of testing on
> single core and multi-core.
>
> Then I suppose the question becomes how many cores? 2? 4? 8?
>
> Maybe we can cycle through some different numbers of cores so that we
> get coverage without duplicating everything?

One configuration that is particularly good at catching threading errors (especially narrow races) is constraining the software to run on two hardware threads on the same SMT-enabled core. This effectively forces the threads to share the L1 D$, which in turn can reveal some otherwise very-hard-to-find data synchronization issues.

I don't know if we have that level of control on our testing hardware, but if we do then that's a scenario we might want to include.

Gabriele
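[Editor's note: the two-siblings-on-one-core configuration Gabriele describes can be set up from a test harness on Linux via the sysfs CPU topology and `sched_setaffinity`. The sketch below is an illustration, not anything from Mozilla's infrastructure; it silently skips on systems without the topology file.]

```python
# Sketch: pin the current process (and hence its threads) to the SMT
# siblings of physical core 0, so all threads share that core's L1 D$.
# Linux-only; returns None where the topology info is unavailable.
import os

TOPOLOGY = "/sys/devices/system/cpu/cpu0/topology/thread_siblings_list"

def parse_cpu_list(text):
    """Parse a sysfs CPU list such as '0,4' or '0-1' into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus

def pin_to_smt_siblings():
    if not (os.path.exists(TOPOLOGY) and hasattr(os, "sched_setaffinity")):
        return None
    with open(TOPOLOGY) as f:
        siblings = parse_cpu_list(f.read())
    os.sched_setaffinity(0, siblings)  # restrict scheduling to the siblings
    return siblings

if __name__ == "__main__":
    print("pinned to:", pin_to_smt_siblings())
```

Note that on a machine without SMT the siblings list contains a single CPU, which degenerates into the single-core configuration discussed elsewhere in this thread.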
Re: Linux testing on single-core VMs nowadays
Hey Ted,

----- Original Message -----
> From: "Ted Mielczarek"
> To: "Mozilla Platform Development"
> Sent: Monday, April 7, 2014 11:11:22 AM
> Subject: Linux testing on single-core VMs nowadays
>
> Currently we run our Linux unit tests exclusively on Amazon EC2
> m1.medium[1] instances which have only one CPU core. Previously we used
> to run Linux tests on in-house multicore hardware. This means that
> we're testing different threading behavior now.
> [...]
> I'm not sure what the real impact of this is. Threading bugs can
> certainly manifest on single-core machines, but the scheduling behavior
> is different so they're likely to be different bugs. Is this an issue
> we should address?

Personally, I think that the more ways we can test for threading issues the better. It seems to me that we should do some amount of testing on single core and multi-core.

Then I suppose the question becomes how many cores? 2? 4? 8?

Maybe we can cycle through some different numbers of cores so that we get coverage without duplicating everything?

Threading issues probably don't happen all that often, but when they do happen they can be more difficult to track down. So being able to get some coverage on machines with different numbers of cores seems useful (especially if the number of cores is readily available and logged along with the TBPL failures).

Dave Hylands
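[Editor's note: Dave's suggestion of logging the core count alongside failures is cheap to implement at test-run startup. A minimal sketch, with a made-up log format:]

```python
# Sketch: record the machine's CPU shape at the start of a test run, so a
# failure log can later be correlated with the number of cores it ran on.
import os

def describe_cpus():
    # os.cpu_count() can in principle return None, so default to 1
    info = {"cpu_count": os.cpu_count() or 1}
    if hasattr(os, "sched_getaffinity"):  # Linux-only
        # cores actually usable by this process, which may be fewer than
        # cpu_count if the harness restricted affinity
        info["usable_cpus"] = sorted(os.sched_getaffinity(0))
    return info

if __name__ == "__main__":
    print("cpu-info:", describe_cpus())
```

The distinction between total and usable CPUs matters if the harness itself pins tests to a subset of cores, as in the cycling-through-core-counts idea above.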
Linux testing on single-core VMs nowadays
I wanted to post about this because I don't think it's common knowledge (I only just came to the realization today), and it has potential impact on the effectiveness of our unit tests.

Currently we run our Linux unit tests exclusively on Amazon EC2 m1.medium[1] instances, which have only one CPU core. Previously we used to run Linux tests on in-house multicore hardware. This means that we're testing different threading behavior now. In more concrete terms, a threading bug[2] was found recently by AddressSanitizer, but it only manifested on the build machines (conveniently, we still run some limited xpcshell testing as part of `make check` as well as during packaging) and not in our extensive unit tests running on the test machines. This seems unfortunate.

I'm not sure what the real impact of this is. Threading bugs can certainly manifest on single-core machines, but the scheduling behavior is different, so they're likely to be different bugs. Is this an issue we should address?

-Ted

1. http://aws.amazon.com/ec2/instance-types/#Instance_Types
2. https://bugzilla.mozilla.org/show_bug.cgi?id=990230
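[Editor's note: the class of bug discussed in this thread, a lost update whose visibility depends on scheduling, can be illustrated in a few lines. This toy is not the bug from the Bugzilla reports above; the sleep artificially widens the race window, standing in for valgrind's slowdown or for an unlucky multicore interleaving.]

```python
# Sketch: a deliberately racy read-modify-write versus a locked one.
# The racy version loses updates whenever two threads overlap inside the
# widened window; the locked version is deterministic.
import threading
import time

counter = 0
lock = threading.Lock()

def racy_increment():
    global counter
    v = counter        # read
    time.sleep(0.05)   # artificially widen the race window
    counter = v + 1    # write back: concurrent updates are lost

def safe_increment():
    global counter
    with lock:         # the whole read-modify-write is now exclusive
        v = counter
        time.sleep(0.05)
        counter = v + 1

def run(worker, n=5):
    global counter
    counter = 0
    threads = [threading.Thread(target=worker) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter

if __name__ == "__main__":
    print("racy:", run(racy_increment))   # usually < 5: lost updates
    print("safe:", run(safe_increment))   # always 5
```

On a single-core machine with a narrow window the racy version can pass for a very long time, which is exactly why the choice of test hardware in this thread matters.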