Continuing this investigation, I ran the following using the mono-3-2 branch as 
of a0fc6ba35b7454425b8ec772b2652730b8030a52.

I couldn't run a top-level "make check" because of this bug,

https://bugzilla.xamarin.com/show_bug.cgi?id=14049

Because of this limitation, I ran "make check" in the mono directory.

I saw 117 failures in 441 iterations (26.5%). Here's the count of the tests 
that failed,

    103 gsharing-valuetype-layout.exe
      8 sgen-bridge.exe|ms-conc
      2 gc-altstack.exe
      1 sgen-weakref-stress.exe|ms-par
      1 sgen-case-23400.exe|ms-par
      1 sgen-bridge.exe|plain
      1 bug-10127.exe
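
For anyone who wants to produce a similar tally, here is a minimal sketch. It 
assumes you have already extracted the failing test names, one per line, into 
a stream; the function name tally_failures and that input format are my own 
assumptions for illustration, not something the mono test harness provides.

```shell
# Hypothetical helper: reads failing test names from stdin, one per line,
# and prints a count per test, most frequent first.
tally_failures() {
    sort | uniq -c | sort -rn
}
```

E.g., "tally_failures < failed-tests.txt", where failed-tests.txt is a 
hypothetical file holding one failing test name per line.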

Looking only at the 103 failures of gsharing-valuetype-layout.exe, 81 of them 
hit the 120-second test timeout. A successful run of this test takes less than 
a second. In the timeout case, mono simply appears to hang.

When I run the test manually and it hangs, it stops using CPU and strace reports,

# strace -fp 4289
Process 4289 attached with 3 threads
[pid  4292] futex(0x7f6264000020, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid  4290] futex(0x967340, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid  4289] futex(0x1c84f1c, FUTEX_WAIT_PRIVATE, 3, NULL

Here's a gstack stack trace,

http://sprunge.us/CfjX

This is trivial to reproduce on my system,

# uname -a
Linux linux-mono.....com 3.7.10-1.1-desktop #1 SMP PREEMPT Thu Feb 28 15:06:29 
UTC 2013 (82d3f21) x86_64 x86_64 x86_64 GNU/Linux

Running as a VMware virtual machine, 4 CPU, 8 GB RAM.

I use this simple script to repeatedly run the commands,

http://sprunge.us/VKTS

E.g.,

        ./repeat.sh mono gsharing-valuetype-layout.exe
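
In case the link above goes away, here is a rough sketch of what such a loop 
could look like. This is not the actual script: the function name repeat_cmd, 
its arguments, and the summary format are assumptions for illustration. It 
relies on timeout(1) from GNU coreutils, which exits with status 124 when the 
time limit is reached.

```shell
# Hypothetical stand-in for a repeat script; the real repeat.sh may differ.
# Usage: repeat_cmd ITERATIONS TIMEOUT_SECONDS COMMAND [ARGS...]
repeat_cmd() {
    iterations=$1
    limit=$2
    shift 2
    failures=0
    timeouts=0
    i=0
    while [ "$i" -lt "$iterations" ]; do
        i=$((i + 1))
        # Run the command with a time limit; discard its output.
        timeout "$limit" "$@" >/dev/null 2>&1
        status=$?
        if [ "$status" -eq 124 ]; then
            # Status 124 means timeout(1) killed the command (a hang).
            timeouts=$((timeouts + 1))
            failures=$((failures + 1))
        elif [ "$status" -ne 0 ]; then
            failures=$((failures + 1))
        fi
    done
    echo "$failures failures ($timeouts timeouts) in $iterations iterations"
}
```

E.g., "repeat_cmd 100 120 mono gsharing-valuetype-layout.exe" would run the 
test 100 times with the same 120 second limit that "make check" uses. Note 
that a command which itself exits with status 124 would be miscounted as a 
timeout.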

Filed bug 14073 to track this,

https://bugzilla.xamarin.com/show_bug.cgi?id=14073

Looking back at previous failures, I realize that this hang can be worked 
around by disabling AOT using the mono option "-O=-aot". Ugh. Given that, this 
may be the same as bug 7564,

https://bugzilla.xamarin.com/show_bug.cgi?id=7564

-Charles

-----Original Message-----
From: Charles Randall 
Sent: Wednesday, August 14, 2013 2:36 PM
To: mono-devel-list@lists.ximian.com
Subject: RE: [Mono-dev] mono-3.2.1 "make check" failures & sgen assertion

Continuing to dig into these failures, here is what I've found so far.

The majority of the bug-10127 test failures were due to bug 13604. The fix for 
that bug resolves the assert in sgen-os-posix.c:60, is already in the mono 3.2 
branch, and should be included in the upcoming mono 3.2.2.
 
https://bugzilla.xamarin.com/show_bug.cgi?id=13604

The failures in sgen-weakref-stress were resolved in this fix which is planned 
to be in the upcoming 3.2.2,

https://github.com/mono/mono/commit/aef4b77ea79aa0a4c06e10bd5842da9df0d10973

The majority of delegate2 test failures are due to bug 7564. There is a 
workaround for this listed in the bug report.

https://bugzilla.xamarin.com/show_bug.cgi?id=7564

That bug is pretty disturbing. Once you've determined you need the workaround, 
your application has already hung. If your application is critical to your 
business that's a tough lesson to learn.

The discrepancy between my observed "make check" failures and the "all green" 
results of the monkey wrench automated tests appears to be because many tests 
are disabled for monkey wrench. See DISABLED_TESTS_WRENCH in 
mono/tests/Makefile for the details. Notably, bug-10127 is disabled.

Continuing these tests with the 3.2 branch and the second fix above.

-Charles


_______________________________________________
Mono-devel-list mailing list
Mono-devel-list@lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-devel-list
