What this is: The Mono team has a CI (continuous integration) system which 
builds and runs automated tests on every commit checked in to git (specifically 
the master branch). We have a test log viewer on Jenkins that tracks the
results:
https://jenkins.mono-project.com/view/All/job/jenkins-testresult-viewer/Test_Result_View/
Once a week I sweep through and write an 
email with a list of the most frequently-failing automated tests. This is both 
so that everyone on the team is aware of our current stability level, and so 
that when people see failures in the GitHub PR tests they know whether to treat 
them as known bugs or new failures. In the interest of making our development 
process more open, I’m going to start crossposting this weekly email on the 
public mailing list.
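
For anyone who wants to try a similar tally on their own results, here is a
rough Python sketch of ranking tests by how often they fail. It is not the
script behind this email: the input shape (one set of failed test names per
build), the function names, and the sample data are all assumptions made up
for illustration.

```python
from collections import Counter


def failure_rates(builds):
    """Return {test_name: fraction of builds in which that test failed}.

    `builds` is a list of sets; each set holds the names of the tests that
    failed in one build. This input shape is hypothetical.
    """
    if not builds:
        return {}
    counts = Counter(name for failed in builds for name in set(failed))
    return {name: n / len(builds) for name, n in counts.items()}


def top_failures(builds, limit=5):
    """Most frequently failing tests first; ties are broken by name."""
    rates = failure_rates(builds)
    return sorted(rates.items(), key=lambda kv: (-kv[1], kv[0]))[:limit]


# Made-up data for illustration only, not real Jenkins results.
sample_builds = [
    {"reference-loader.exe", "SocketTest.SendAsyncFile"},
    {"reference-loader.exe"},
    {"reference-loader.exe", "SocketTest.SendAsyncFile"},
    {"reference-loader.exe"},
]

for name, rate in top_failures(sample_builds):
    print(f"{name}: {rate:.0%}")
```

A test that fails in every build comes out at 100%, and one that fails in
half of them at 50%, which is the kind of figure quoted for each entry below.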

Here’s an overview of the top recurring failures currently ruining Jenkins 
builds:

1. MonoTests.runtime.reference-loader.exe failing 100%

This morning the fix for bug 42584 was reverted in order to fix a failure in 
the build of the GTK+ package. The corresponding unit test for 42584 did not 
get reverted, so that test is now failing on every single build. This will be 
fixed soon, either by removing the test or by re-applying the 42584 patch.

2. MonoTests.System.Net.Sockets.SocketTest.SendAsyncFile

Filed as https://bugzilla.xamarin.com/show_bug.cgi?id=43172 , currently 
assigned to Marcos Heinrich.

This has been failing for a pretty long time. It only occurs on Linux, but on 
Linux it fails over 20% of the time. (It has also been seen on Android.) It is 
possible this is only an issue in CI (see akoeplinger's note in the bug).

The failure is consistent and looks like:


MESSAGE:
System.Exception : Could not abort registered blocking threads before closing socket.
Thread StackTrace:
  at System.Net.Sockets.SafeSocketHandle.RegisterForBlockingSyscall () 
[0x00057] in 
/mnt/jenkins/workspace/test-mono-mainline-linux/label/ubuntu-1404-amd64/mcs/class/System/System.Net.Sockets/SafeSocketHandle.cs:114
  at System.Net.Sockets.Socket.SendFile_internal 
(System.Net.Sockets.SafeSocketHandle safeHandle, System.String filename, 
System.Byte[] pre_buffer, System.Byte[] post_buffer, 
System.Net.Sockets.TransmitFileOptions flags) [0x00000] in 
/mnt/jenkins/workspace/test-mono-mainline-linux/label/ubuntu-1404-amd64/mcs/class/System/System.Net.Sockets/Socket.cs:2944
  at System.Net.Sockets.Socket.SendFile (System.String fileName, System.Byte[] 
preBuffer, System.Byte[] postBuffer, System.Net.Sockets.TransmitFileOptions 
flags) [0x00028] in 
/mnt/jenkins/workspace/test-mono-mainline-linux/label/ubuntu-1404-amd64/mcs/class/System/System.Net.Sockets/Socket.cs:2893

[snip]

Examples:

https://jenkins.mono-project.com/job/test-mono-mainline-linux/label=ubuntu-1404-amd64/556/testReport/MonoTests.System.Net.Sockets/SocketTest/SendAsyncFile/
https://jenkins.mono-project.com/job/test-mono-mainline-linux/label=ubuntu-1404-i386/558/testReport/MonoTests.System.Net.Sockets/SocketTest/SendAsyncFile/

2.5. MonoTests.Remoting.RemotingServicesTest.MarshalThrowException

On ARM64 only, when this test calls ChannelServices.UnregisterChannel(), 
a KeyNotFoundException is sometimes thrown somewhere in the guts of 
Socket.Close. This is filed as 
https://bugzilla.xamarin.com/show_bug.cgi?id=43727 . It is possible this is the 
same issue as #2 above (see akoeplinger's note in the bug).

Examples:

https://jenkins.mono-project.com/job/test-mono-mainline-linux/label=debian-8-arm64/641/testReport/MonoTests.Remoting/RemotingServicesTest/MarshalThrowException/
https://jenkins.mono-project.com/job/test-mono-mainline-linux/label=debian-8-arm64/636/testReport/MonoTests.Remoting/RemotingServicesTest/MarshalThrowException/

3. ThreadAbortException in System.Threading.Timer+Scheduler.SchedulerThread

Filed as https://bugzilla.xamarin.com/show_bug.cgi?id=43320 , currently 
assigned to Rodrigo.

This occurs in many different places but the crash message always looks the 
same. It is believed to be existing bad behavior brought into the light by 
recent fixes by Vargaz around finalizers and VM shutdown.


Unhandled Exception:
System.TypeInitializationException: The type initializer for 
'System.Collections.Generic.List`1' threw an exception. ---> 
System.Threading.ThreadAbortException
   --- End of inner exception stack trace ---
  at System.Threading.Timer+Scheduler.SchedulerThread () [0x0000f] in <filename 
unknown>:0
  at System.Threading.ThreadHelper.ThreadStart_Context (System.Object state) 
[0x00017] in <filename unknown>:0
  at System.Threading.ExecutionContext.RunInternal 
(System.Threading.ExecutionContext executionContext, 
System.Threading.ContextCallback callback, System.Object state, System.Boolean 
preserveSyncCtx) [0x0008d] in <filename unknown>:0
  at System.Threading.ExecutionContext.Run (System.Threading.ExecutionContext 
executionContext, System.Threading.ContextCallback callback, System.Object 
state, System.Boolean preserveSyncCtx) [0x00000] in <filename unknown>:0
  at System.Threading.ExecutionContext.Run (System.Threading.ExecutionContext 
executionContext, System.Threading.ContextCallback callback, System.Object 
state) [0x00031] in <filename unknown>:0
  at System.Threading.ThreadHelper.ThreadStart () [0x0000b] in <filename 
unknown>:0
[MVID] 0deb57f9de664ff681556c641423618d 0,1,2,3,4,5
[ERROR] FATAL UNHANDLED EXCEPTION: Nested exception trying to figure out what 
went wrong


Some places this failure is seen include 
MonoTests.gshared.generic-marshalbyref.2.exe, MonoTests.runtime.bug-415577.exe, 
and as an unknown-test failure when a test suite (such as mcs/class/corlib) is 
shutting down.

Examples:

https://jenkins.mono-project.com/job/test-mono-mainline/label=osx-amd64/4606/testReport/MonoTests/gshared/generic_marshalbyref_2_exe_3/
https://jenkins.mono-project.com/job/test-mono-mainline/label=osx-amd64/4607/testReport/MonoTests/gshared/generic_marshalbyref_2_exe/
https://jenkins.mono-project.com/job/test-mono-mainline/label=osx-i386/4608/testReport/MonoTests/runtime/bug_415577_exe/
https://jenkins.mono-project.com/job/test-mono-mainline/label=osx-i386/4656/parsed_console/log_content.html#WARNING1
 (test shutdown)
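
Because the crash text is the same wherever this shows up, one way to spot it
in a console log is to look for the signature rather than the test name. The
Python below is only a sketch of that idea. The two substrings come from the
output quoted above, but treating "both present" as the signature is my
assumption, and the function names are made up.

```python
# Substrings taken from the crash output quoted above. Requiring both of
# them is an assumption about what identifies this particular failure.
SIGNATURE = (
    "System.Threading.ThreadAbortException",
    "System.Threading.Timer+Scheduler.SchedulerThread",
)


def normalize(log_text):
    """Collapse all whitespace runs so line wrapping in the log is ignored."""
    return " ".join(log_text.split())


def matches_timer_abort(log_text):
    """Return True if the log contains every fragment of the signature."""
    flat = normalize(log_text)
    return all(fragment in flat for fragment in SIGNATURE)


# Shortened excerpt of the crash output, used only as sample input.
sample_log = """Unhandled Exception:
System.TypeInitializationException: The type initializer for
'System.Collections.Generic.List`1' threw an exception. --->
System.Threading.ThreadAbortException
  at System.Threading.Timer+Scheduler.SchedulerThread () [0x0000f] in <filename
unknown>:0
"""

print(matches_timer_abort(sample_log))
```

A check like this would group the gshared, runtime, and test-shutdown cases
together even though Jenkins reports them under different test names.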

4. __icall_wrapper_mono_gc_alloc_vector crash during thread start / domain 
unload

Filed as https://bugzilla.xamarin.com/show_bug.cgi?id=43921 , currently 
assigned to Aleksey. We have started seeing SIGSEGVs in a range of tests 
that involve domain unloading, or thread creation at around the same time as 
the GC stops the world. So far this appears to be Mac-only (?).

https://jenkins.mono-project.com/job/test-mono-mainline/label=osx-i386/4742/testReport/MonoTests/sgen-regular-tests-ms-split-95/sgen_domain_unload_2_exe/
https://jenkins.mono-project.com/job/test-mono-mainline/label=osx-i386/4744/testReport/MonoTests/sgen-regular-tests-ms-split-clear-at-gc/sgen_new_threads_dont_join_stw_2_exe/
https://jenkins.mono-project.com/job/test-mono-mainline/label=osx-i386/4812/parsed_console/log_content.html#WARNING2

4.5 (?). AppDomain.internalUnload crash

This is also Mac-only and might be the same failure as #4. Aleksey is looking 
into it.

https://jenkins.mono-project.com/job/test-mono-mainline/label=osx-i386/4812/
https://jenkins.mono-project.com/job/test-mono-mainline/label=osx-i386/4811/parsed_console/log_content.html#WARNING1
https://jenkins.mono-project.com/job/test-mono-mainline/label=osx-amd64/4813/testReport/MonoTests/sgen-regular-tests-plain/sgen_domain_unload_exe_timedout/
https://jenkins.mono-project.com/job/test-mono-mainline/label=osx-amd64/4812/testReport/
 (both failures)
https://jenkins.mono-project.com/job/test-mono-mainline/label=osx-amd64/4811/testReport/MonoTests/runtime/remoting4_exe_timedout/

When it crashes, the managed stack looks like:

  at (wrapper managed-to-native) System.AppDomain.InternalUnload (int) <0x00012>
  at System.AppDomain.Unload (System.AppDomain) [0x00011] in 
/Users/builder/jenkins/workspace/test-mono-mainline/label/osx-i386/mcs/class/corlib/System/AppDomain.cs:1200
  at MonoTests.System.AppDomainTest.TearDown () [0x0000b] in 
/Users/builder/jenkins/workspace/test-mono-mainline/label/osx-i386/mcs/class/corlib/Test/System/AppDomainTest.cs:71
  at (wrapper runtime-invoke) object.runtime_invoke_void__this__ 
(object,intptr,intptr,intptr) <IL 0x0004f, 0x00092>


...

5. Crash doing thread join while closing process after ServiceModel tests

Both crashes and hangs have been seen recently while running the ServiceModel 
test suites. This has been seen on both Mac and Linux. Nothing is filed. When 
the crash occurs, it happens in the test runner itself, waiting for the tests 
to finish.

https://jenkins.mono-project.com/job/test-mono-mainline/label=osx-amd64/4808/parsed_console/log_content.html#WARNING1
https://jenkins.mono-project.com/job/test-mono-mainline/label=osx-amd64/4794/parsed_console/log_content.html#WARNING2
https://jenkins.mono-project.com/job/test-mono-mainline-linux/label=debian-8-arm64/739/parsed_console/log_content.html#WARNING1

Managed stack looks like:

  at (wrapper managed-to-native) System.Threading.Thread.JoinInternal 
(System.Threading.Thread,int) <IL 0x00014, 0x00067>
  at System.Threading.Thread.Join () [0x00000] in 
/Users/builder/jenkins/workspace/test-mono-mainline/label/osx-amd64/mcs/class/referencesource/mscorlib/system/threading/thread.cs:697
  at NUnit.Core.TestRunnerThread.Wait () [0x00010] in 
/Users/builder/jenkins/workspace/test-mono-mainline/label/osx-amd64/mcs/nunit24/NUnitCore/core/TestRunnerThread.cs:118
  at NUnit.Core.ThreadedTestRunner.Wait () [0x0000b] in 
/Users/builder/jenkins/workspace/test-mono-mainline/label/osx-amd64/mcs/nunit24/NUnitCore/core/ThreadedTestRunner.cs:63
  at NUnit.Core.ThreadedTestRunner.EndRun () [0x00000] in 
/Users/builder/jenkins/workspace/test-mono-mainline/label/osx-amd64/mcs/nunit24/NUnitCore/core/ThreadedTestRunner.cs:55
  at NUnit.Core.ThreadedTestRunner.Run 
(NUnit.Core.EventListener,NUnit.Core.ITestFilter) [0x00008] in 
/Users/builder/jenkins/workspace/test-mono-mainline/label/osx-amd64/mcs/nunit24/NUnitCore/core/ThreadedTestRunner.cs:36
  at NUnit.Core.ProxyTestRunner.Run 
(NUnit.Core.EventListener,NUnit.Core.ITestFilter) [0x00007] in 
/Users/builder/jenkins/workspace/test-mono-mainline/label/osx-amd64/mcs/nunit24/NUnitCore/core/ProxyTestRunner.cs:133
  at NUnit.Core.RemoteTestRunner.Run 
(NUnit.Core.EventListener,NUnit.Core.ITestFilter) [0x0002b] in 
/Users/builder/jenkins/workspace/test-mono-mainline/label/osx-amd64/mcs/nunit24/NUnitCore/core/RemoteTestRunner.cs:63


Native stack, when we get one, looks like:

        0   mono                                0x00000001073a7d5a 
mono_handle_native_sigsegv + 282
        1   libsystem_platform.dylib            0x00007fff91ff152a _sigtramp + 
26
        2   ???                                 0x00000001081b9a00 0x0 + 
4430993920
        3   mono                                0x000000010756f763 
mono_os_cond_timedwait + 163
        4   mono                                0x000000010756e326 
mono_w32handle_timedwait_signal_handle + 358
        5   mono                                0x000000010756e0e1 
mono_w32handle_wait_one + 897
        6   mono                                0x00000001075535f9 
wapi_WaitForSingleObjectEx + 9
        7   mono                                0x00000001074a2cfe 
ves_icall_System_Threading_Thread_Join_internal + 174


6. Tarjan GC bridge crashing in “major fragmentation” test

sgen-bridge-major-fragmentation.exe, which runs with a simulated version of the 
Android GC bridge, has in the last 24 hours started segfaulting about 1/3 of 
the time on ARM soft float (but never on any other platform). The stacks are 
consistent. Filed as https://bugzilla.xamarin.com/show_bug.cgi?id=44397 , 
currently assigned to me.

_______________________________________________
Mono-devel-list mailing list
Mono-devel-list@lists.dot.net
http://lists.dot.net/mailman/listinfo/mono-devel-list
