What this is: The Mono team has a CI (continuous integration) system which 
builds and runs automated tests on every commit checked in to git (specifically 
the master branch). We have a test log on Jenkins that tracks the results
(currently only accessible to GitHub 
project admins, sorry). Once a week I sweep through and write an email with a 
list of the most frequently-failing automated tests.
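
For anyone who wants to repeat the tally on their own data, the counting step can be sketched roughly as below. This is only an illustration in Python, not the tooling actually used for this report; the record format, the build names, and the `top_failures` helper are all assumptions made up for the example.

```python
from collections import Counter


def top_failures(records, limit=5):
    """Rank tests by the number of distinct builds they failed in.

    `records` is an iterable of (build_id, test_name) pairs. This shape is
    hypothetical; it is not the format of the real Jenkins test log.
    """
    unique_pairs = set(records)  # a test reported twice in one build counts once
    counts = Counter(test for _build, test in unique_pairs)
    # Most failing builds first; ties broken by test name for stable output.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))[:limit]


# Made-up sample data, for illustration only.
sample_records = [
    ("build-1", "SocketTest.SendAsyncFile"),
    ("build-1", "SocketTest.SendAsyncFile"),  # duplicate report, counted once
    ("build-2", "SocketTest.SendAsyncFile"),
    ("build-2", "bug-415577.exe"),
    ("build-3", "generic-marshalbyref.2.exe"),
]

for test, failed_builds in top_failures(sample_records):
    print(f"{failed_builds} build(s): {test}")
```

Counting distinct builds rather than raw failure lines keeps one very noisy build from dominating the list.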

We have had a pretty rough couple of weeks on the master branch, and I have not 
even been able to create this list for the last two weeks. We’re starting to 
get back to stable, but now there are some new issues, some with strange 
signatures and frustratingly low frequencies. #s 1, 2, 3 and 6 in particular 
this week are new or effectively new and need someone to look at them.

Here are the top recurring failures currently ruining Jenkins builds:

0. System.Security.Cryptography.X509Certificates, various [New]

A couple of weeks ago we checked in a major set of changes (adding the 
BoringTLS system) to master without going through our normal PR process. This 
has led to a large number of consistent failures on some lanes. This is being 
worked on, but there are still every-build failures on Mac Intel64 and Mac 
Intel32. It is not totally clear yet whether the final failures are due to a 
bug or due to certificates not being registered properly. The main person 
working on fixing this is Martin Baulig.

1. Null reference exceptions in System.Text.StringBuilder.Append, etc [New]

Filed as https://bugzilla.xamarin.com/show_bug.cgi?id=45335 , examples in 
bugzilla. Null reference exceptions are turning up in various runtime tests, 
most often in StringBuilder.Append. It is not immediately clear if the problem 
is in the class libraries or if the runtime is turning references null.

2. AppDomain.InternalUnload crash [Existing, but has increased in frequency]

Filed as https://bugzilla.xamarin.com/show_bug.cgi?id=45337 , examples in 
bugzilla. We are seeing segfaults whose stacktraces have the signature:

  at (wrapper managed-to-native) System.AppDomain.InternalUnload (int) <0x00012>
  at System.AppDomain.Unload (System.AppDomain) <0x0002c>

We were seeing this a couple weeks ago when I last sent a CI weather report. A 
few people have told me this will possibly be fixed if we merge 
https://github.com/mono/mono/pull/3364 .

3. Hang doing thread join while closing process after ServiceModel tests 
[Existing, but has increased in frequency]

Hangs have been seen for the last few weeks while running the ServiceModel test 
suites. This has been seen on both Mac and Linux (I think it likes to crash on 
Mac and hang on Linux?). Nothing is filed. When the crash occurs, it happens in 
the test runner itself, waiting for the tests to finish.

An example from this week:


Some more from a couple weeks ago:


Managed stack looks like:

  at (wrapper managed-to-native) System.Threading.Thread.JoinInternal (System.Threading.Thread,int) <IL 0x00014, 0x00067>
  at System.Threading.Thread.Join () [0x00000] in 
  at NUnit.Core.TestRunnerThread.Wait () [0x00010] in 
  at NUnit.Core.ThreadedTestRunner.Wait () [0x0000b] in 
  at NUnit.Core.ThreadedTestRunner.EndRun () [0x00000] in 
  at NUnit.Core.ThreadedTestRunner.Run (NUnit.Core.EventListener,NUnit.Core.ITestFilter) [0x00008] in 
  at NUnit.Core.ProxyTestRunner.Run (NUnit.Core.EventListener,NUnit.Core.ITestFilter) [0x00007] in 
  at NUnit.Core.RemoteTestRunner.Run (NUnit.Core.EventListener,NUnit.Core.ITestFilter) [0x0002b] in 

Native stack, when we get one, looks like:

        0   mono                                0x00000001073a7d5a mono_handle_native_sigsegv + 282
        1   libsystem_platform.dylib            0x00007fff91ff152a _sigtramp + 
        2   ???                                 0x00000001081b9a00 0x0 + 
        3   mono                                0x000000010756f763 mono_os_cond_timedwait + 163
        4   mono                                0x000000010756e326 mono_w32handle_timedwait_signal_handle + 358
        5   mono                                0x000000010756e0e1 mono_w32handle_wait_one + 897
        6   mono                                0x00000001075535f9 wapi_WaitForSingleObjectEx + 9
        7   mono                                0x00000001074a2cfe ves_icall_System_Threading_Thread_Join_internal + 174

4. ThreadAbortException in System.Threading.Timer+Scheduler.SchedulerThread  
(the "List`1 issue")  [Existing]

Filed as https://bugzilla.xamarin.com/show_bug.cgi?id=43320 , currently 
assigned to Rodrigo. I thought this had gotten better for a week or so, but in 
the last 48 hours it’s occurred on at least one lane on almost every build, so 
maybe it was random fluctuations.

This occurs in many different places but the crash message always looks the 
same. It is believed to be existing bad behavior brought into the light by 
recent fixes by Vargaz around finalizers and VM shutdown.

Unhandled Exception:
System.TypeInitializationException: The type initializer for 'System.Collections.Generic.List`1' threw an exception. ---> 
   --- End of inner exception stack trace ---
  at System.Threading.Timer+Scheduler.SchedulerThread () [0x0000f] in <filename 
  at System.Threading.ThreadHelper.ThreadStart_Context (System.Object state) [0x00017] in <filename unknown>:0
  at System.Threading.ExecutionContext.RunInternal (System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, System.Object state, System.Boolean preserveSyncCtx) [0x0008d] in <filename unknown>:0
  at System.Threading.ExecutionContext.Run (System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, System.Object state, System.Boolean preserveSyncCtx) [0x00000] in <filename unknown>:0
  at System.Threading.ExecutionContext.Run (System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, System.Object state) [0x00031] in <filename unknown>:0
  at System.Threading.ThreadHelper.ThreadStart () [0x0000b] in <filename 
[MVID] 0deb57f9de664ff681556c641423618d 0,1,2,3,4,5
[ERROR] FATAL UNHANDLED EXCEPTION: Nested exception trying to figure out what went wrong

Some places this failure is seen include 
MonoTests.gshared.generic-marshalbyref.2.exe, MonoTests.runtime.bug-415577.exe, 
and as an unknown-test failure when a test suite (such as mcs/class/corlib) is 
shutting down.
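
Since the crash message always looks the same, one way to spot this failure in a console log is to search for its two characteristic strings. The Python sketch below is only an illustration under that assumption; the `has_list1_signature` helper and the sample log are made up for the example, and the two marker strings are copied from the trace quoted above.

```python
# Marker strings taken from the crash output quoted in this report.
LIST1_SIGNATURE = (
    "The type initializer for 'System.Collections.Generic.List`1' threw an exception",
    "System.Threading.Timer+Scheduler.SchedulerThread",
)


def has_list1_signature(log_text):
    """Return True if the log contains both markers of the "List`1 issue".

    Whitespace is collapsed first, because console logs often wrap long
    lines in the middle of a message.
    """
    normalized = " ".join(log_text.split())
    return all(marker in normalized for marker in LIST1_SIGNATURE)


# Hypothetical log excerpt, wrapped the way it tends to appear in mail.
sample_log = """System.TypeInitializationException: The type initializer for
'System.Collections.Generic.List`1' threw an exception. --->
  at System.Threading.Timer+Scheduler.SchedulerThread () [0x0000f] in <filename
"""

print(has_list1_signature(sample_log))
```

Requiring both strings avoids flagging unrelated TypeInitializationException failures that do not come from the timer scheduler thread.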

A recent example:


An old example:

 (test shutdown)

5. MonoTests.System.Net.Sockets.SocketTest.SendAsyncFile [Existing]

Filed as https://bugzilla.xamarin.com/show_bug.cgi?id=43172 , currently 
assigned to Marcos Henrich.

This has been failing for a pretty long time. It only occurs on Linux but on 
Linux it fails over 20% of the time. (It has also been seen on Android.) It is 
possible this is only an issue in CI (see akoeplinger note in bug).

The failure is consistent and looks like:

System.Exception : Could not abort registered blocking threads before closing socket.
Thread StackTrace:
  at System.Net.Sockets.SafeSocketHandle.RegisterForBlockingSyscall () [0x00057] in 
  at System.Net.Sockets.Socket.SendFile_internal (System.Net.Sockets.SafeSocketHandle safeHandle, System.String filename, System.Byte[] pre_buffer, System.Byte[] post_buffer, System.Net.Sockets.TransmitFileOptions flags) [0x00000] in 
  at System.Net.Sockets.Socket.SendFile (System.String fileName, System.Byte[] preBuffer, System.Byte[] postBuffer, System.Net.Sockets.TransmitFileOptions flags) [0x00028] in 




6. Hang(?) during MS acceptance tests [New]

Not filed. See 

The ms-test-suite normally takes six to nine minutes to run. At least four 
times in the last week, always on Mac, it has taken the full 15 minutes and 
then timed out. The failure occurs in different places, so it does not look 
like one single test is hanging. This seems a bit high to be normal variance.
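
To separate real timeouts from ordinary slow runs, the run durations could be bucketed against the numbers above (nine minutes as the top of the normal range, 15 minutes as the timeout). The sketch below is a rough Python illustration only; the `(lane, minutes)` record shape, the sample data, and the helper names are assumptions, not anything that exists in our CI scripts.

```python
from collections import Counter

NORMAL_MAX_MINUTES = 9   # top of the usual six-to-nine-minute range
TIMEOUT_MINUTES = 15     # the run is killed when it reaches this


def classify_run(minutes):
    """Bucket a single ms-test-suite run by its duration in minutes."""
    if minutes >= TIMEOUT_MINUTES:
        return "timeout"
    if minutes > NORMAL_MAX_MINUTES:
        return "slow"
    return "normal"


def timeouts_per_lane(runs):
    """Count timed-out runs per lane; `runs` holds (lane, minutes) pairs."""
    return Counter(
        lane for lane, minutes in runs if classify_run(minutes) == "timeout"
    )


# Made-up durations, for illustration only.
sample_runs = [
    ("mac", 7.5),
    ("mac", 15.0),
    ("linux", 8.0),
    ("mac", 15.0),
    ("linux", 6.2),
]

print(dict(timeouts_per_lane(sample_runs)))
```

A "slow" bucket between the two thresholds would show whether runs are drifting toward the limit or jumping straight from normal to timed out, which is what a hang would look like.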