What this is: The Mono team has a CI (continuous integration) system that builds and runs automated tests on every commit checked in to git (specifically the master branch). We have a test log viewer <https://jenkins.mono-project.com/view/All/job/jenkins-testresult-viewer/Test_Result_View/> on Jenkins that tracks the results. Once a week I sweep through and write an email listing the most frequently failing automated tests. This is both so that everyone on the team is aware of our current stability level, and so that when people see failures in the GitHub PR tests they know whether to treat them as known bugs or new failures. In the interest of making our development process more open, I’m going to start crossposting this weekly email on the public mailing list.
Here’s an overview of the top recurring failures currently ruining Jenkins builds:

1. MonoTests.runtime.reference-loader.exe failing 100%

This morning the fix for bug 42584 was reverted in order to fix a failure in the build of the GTK+ package. The corresponding unit test for 42584 did not get reverted and that test is now failing on every single build. This will be fixed soon either by removing the test or by re-applying the 42584 patch but for the moment, this failure is seen in every build.

2. MonoTests.System.Net.Sockets.SocketTest.SendAsyncFile

Filed as https://bugzilla.xamarin.com/show_bug.cgi?id=43172 , currently assigned to Marcos Heinrich. This has been failing for a pretty long time. It only occurs on Linux but on Linux it fails over 20% of the time. (It has also been seen on Android.) It is possible this is only an issue in CI (see akoeplinger note in bug). The failure is consistent and looks like:

MESSAGE: System.Exception : Could not abort registered blocking threads before closing socket.
Thread StackTrace:
  at System.Net.Sockets.SafeSocketHandle.RegisterForBlockingSyscall () [0x00057] in /mnt/jenkins/workspace/test-mono-mainline-linux/label/ubuntu-1404-amd64/mcs/class/System/System.Net.Sockets/SafeSocketHandle.cs:114
  at System.Net.Sockets.Socket.SendFile_internal (System.Net.Sockets.SafeSocketHandle safeHandle, System.String filename, System.Byte[] pre_buffer, System.Byte[] post_buffer, System.Net.Sockets.TransmitFileOptions flags) [0x00000] in /mnt/jenkins/workspace/test-mono-mainline-linux/label/ubuntu-1404-amd64/mcs/class/System/System.Net.Sockets/Socket.cs:2944
  at System.Net.Sockets.Socket.SendFile (System.String fileName, System.Byte[] preBuffer, System.Byte[] postBuffer, System.Net.Sockets.TransmitFileOptions flags) [0x00028] in /mnt/jenkins/workspace/test-mono-mainline-linux/label/ubuntu-1404-amd64/mcs/class/System/System.Net.Sockets/Socket.cs:2893
  [snip]

Examples:
https://jenkins.mono-project.com/job/test-mono-mainline-linux/label=ubuntu-1404-amd64/556/testReport/MonoTests.System.Net.Sockets/SocketTest/SendAsyncFile/
https://jenkins.mono-project.com/job/test-mono-mainline-linux/label=ubuntu-1404-i386/558/testReport/MonoTests.System.Net.Sockets/SocketTest/SendAsyncFile/

2.5. MonoTests.Remoting.RemotingServicesTest.MarshalThrowException

On ARM64 only, when this test calls ChannelServices.UnregisterChannel(), sometimes a KeyNotFoundException is generated somewhere in the guts of Socket.Close. This is filed as https://bugzilla.xamarin.com/show_bug.cgi?id=43727 . It is possible this is the same issue as #2 above (see akoeplinger note in bug).

Examples:
https://jenkins.mono-project.com/job/test-mono-mainline-linux/label=debian-8-arm64/641/testReport/MonoTests.Remoting/RemotingServicesTest/MarshalThrowException/
https://jenkins.mono-project.com/job/test-mono-mainline-linux/label=debian-8-arm64/636/testReport/MonoTests.Remoting/RemotingServicesTest/MarshalThrowException/

3. ThreadAbortException in System.Threading.Timer+Scheduler.SchedulerThread

Filed as https://bugzilla.xamarin.com/show_bug.cgi?id=43320 , currently assigned to Rodrigo. This occurs in many different places but the crash message always looks the same. It is believed to be existing bad behavior brought into the light by recent fixes by Vargaz around finalizers and VM shutdown.

Unhandled Exception:
System.TypeInitializationException: The type initializer for 'System.Collections.Generic.List`1' threw an exception. ---> System.Threading.ThreadAbortException
  --- End of inner exception stack trace ---
  at System.Threading.Timer+Scheduler.SchedulerThread () [0x0000f] in <filename unknown>:0
  at System.Threading.ThreadHelper.ThreadStart_Context (System.Object state) [0x00017] in <filename unknown>:0
  at System.Threading.ExecutionContext.RunInternal (System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, System.Object state, System.Boolean preserveSyncCtx) [0x0008d] in <filename unknown>:0
  at System.Threading.ExecutionContext.Run (System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, System.Object state, System.Boolean preserveSyncCtx) [0x00000] in <filename unknown>:0
  at System.Threading.ExecutionContext.Run (System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, System.Object state) [0x00031] in <filename unknown>:0
  at System.Threading.ThreadHelper.ThreadStart () [0x0000b] in <filename unknown>:0
[MVID] 0deb57f9de664ff681556c641423618d 0,1,2,3,4,5
[ERROR] FATAL UNHANDLED EXCEPTION: Nested exception trying to figure out what went wrong

Some places this failure is seen include MonoTests.gshared.generic-marshalbyref.2.exe, MonoTests.runtime.bug-415577.exe, and as an unknown-test failure when a test suite (such as mcs/class/corlib) is shutting down.
Examples:
https://jenkins.mono-project.com/job/test-mono-mainline/label=osx-amd64/4606/testReport/MonoTests/gshared/generic_marshalbyref_2_exe_3/
https://jenkins.mono-project.com/job/test-mono-mainline/label=osx-amd64/4607/testReport/MonoTests/gshared/generic_marshalbyref_2_exe/
https://jenkins.mono-project.com/job/test-mono-mainline/label=osx-i386/4608/testReport/MonoTests/runtime/bug_415577_exe/
https://jenkins.mono-project.com/job/test-mono-mainline/label=osx-i386/4656/parsed_console/log_content.html#WARNING1 (test shutdown)

4. __icall_wrapper_mono_gc_alloc_vector crash during thread start / domain unload

Filed as https://bugzilla.xamarin.com/show_bug.cgi?id=43921 , currently assigned to Aleksey. We have started seeing SIGSEGVs in a range of tests related to domain unloading, or thread creation around the same time as the GC stopping the world. Mac only (?).

https://jenkins.mono-project.com/job/test-mono-mainline/label=osx-i386/4742/testReport/MonoTests/sgen-regular-tests-ms-split-95/sgen_domain_unload_2_exe/
https://jenkins.mono-project.com/job/test-mono-mainline/label=osx-i386/4744/testReport/MonoTests/sgen-regular-tests-ms-split-clear-at-gc/sgen_new_threads_dont_join_stw_2_exe/
https://jenkins.mono-project.com/job/test-mono-mainline/label=osx-i386/4812/parsed_console/log_content.html#WARNING2

4.5 (?). AppDomain.internalUnload crash

This is also mac-only and might be the same failure as #4? Aleksey is looking into it.
https://jenkins.mono-project.com/job/test-mono-mainline/label=osx-i386/4812/
https://jenkins.mono-project.com/job/test-mono-mainline/label=osx-i386/4811/ <https://jenkins.mono-project.com/job/test-mono-mainline/label=osx-i386/4811/parsed_console/log_content.html#WARNING1>
https://jenkins.mono-project.com/job/test-mono-mainline/label=osx-amd64/4813/testReport/MonoTests/sgen-regular-tests-plain/sgen_domain_unload_exe_timedout/
https://jenkins.mono-project.com/job/test-mono-mainline/label=osx-amd64/4812/testReport/ (both failures)
https://jenkins.mono-project.com/job/test-mono-mainline/label=osx-amd64/4811/testReport/MonoTests/runtime/remoting4_exe_timedout/

Crashes, managed stack looks like:
  at (wrapper managed-to-native) System.AppDomain.InternalUnload (int) <0x00012>
  at System.AppDomain.Unload (System.AppDomain) [0x00011] in /Users/builder/jenkins/workspace/test-mono-mainline/label/osx-i386/mcs/class/corlib/System/AppDomain.cs:1200
  at MonoTests.System.AppDomainTest.TearDown () [0x0000b] in /Users/builder/jenkins/workspace/test-mono-mainline/label/osx-i386/mcs/class/corlib/Test/System/AppDomainTest.cs:71
  at (wrapper runtime-invoke) object.runtime_invoke_void__this__ (object,intptr,intptr,intptr) <IL 0x0004f, 0x00092>
  ...

5. Crash doing thread join while closing process after ServiceModel tests

Both crashes and hangs have been seen recently while running the ServiceModel test suites. This has been seen on both Mac and Linux. Nothing is filed. When the crash occurs, it happens in the test runner itself, waiting for the tests to finish.
https://jenkins.mono-project.com/job/test-mono-mainline/label=osx-amd64/4808/parsed_console/log_content.html#WARNING1
https://jenkins.mono-project.com/job/test-mono-mainline/label=osx-amd64/4794/parsed_console/log_content.html#WARNING2
https://jenkins.mono-project.com/job/test-mono-mainline-linux/label=debian-8-arm64/739/parsed_console/log_content.html#WARNING1

Managed stack looks like:
  at (wrapper managed-to-native) System.Threading.Thread.JoinInternal (System.Threading.Thread,int) <IL 0x00014, 0x00067>
  at System.Threading.Thread.Join () [0x00000] in /Users/builder/jenkins/workspace/test-mono-mainline/label/osx-amd64/mcs/class/referencesource/mscorlib/system/threading/thread.cs:697
  at NUnit.Core.TestRunnerThread.Wait () [0x00010] in /Users/builder/jenkins/workspace/test-mono-mainline/label/osx-amd64/mcs/nunit24/NUnitCore/core/TestRunnerThread.cs:118
  at NUnit.Core.ThreadedTestRunner.Wait () [0x0000b] in /Users/builder/jenkins/workspace/test-mono-mainline/label/osx-amd64/mcs/nunit24/NUnitCore/core/ThreadedTestRunner.cs:63
  at NUnit.Core.ThreadedTestRunner.EndRun () [0x00000] in /Users/builder/jenkins/workspace/test-mono-mainline/label/osx-amd64/mcs/nunit24/NUnitCore/core/ThreadedTestRunner.cs:55
  at NUnit.Core.ThreadedTestRunner.Run (NUnit.Core.EventListener,NUnit.Core.ITestFilter) [0x00008] in /Users/builder/jenkins/workspace/test-mono-mainline/label/osx-amd64/mcs/nunit24/NUnitCore/core/ThreadedTestRunner.cs:36
  at NUnit.Core.ProxyTestRunner.Run (NUnit.Core.EventListener,NUnit.Core.ITestFilter) [0x00007] in /Users/builder/jenkins/workspace/test-mono-mainline/label/osx-amd64/mcs/nunit24/NUnitCore/core/ProxyTestRunner.cs:133
  at NUnit.Core.RemoteTestRunner.Run (NUnit.Core.EventListener,NUnit.Core.ITestFilter) [0x0002b] in /Users/builder/jenkins/workspace/test-mono-mainline/label/osx-amd64/mcs/nunit24/NUnitCore/core/RemoteTestRunner.cs:63

Native stack, when we get one, looks like:
  0 mono 0x00000001073a7d5a mono_handle_native_sigsegv + 282
  1 libsystem_platform.dylib 0x00007fff91ff152a _sigtramp + 26
  2 ??? 0x00000001081b9a00 0x0 + 4430993920
  3 mono 0x000000010756f763 mono_os_cond_timedwait + 163
  4 mono 0x000000010756e326 mono_w32handle_timedwait_signal_handle + 358
  5 mono 0x000000010756e0e1 mono_w32handle_wait_one + 897
  6 mono 0x00000001075535f9 wapi_WaitForSingleObjectEx + 9
  7 mono 0x00000001074a2cfe ves_icall_System_Threading_Thread_Join_internal + 174

6. Tarjan GC bridge crashing in “major fragmentation” test

sgen-bridge-major-fragmentation.exe, which runs with a simulated version of the Android GC bridge, has in the last 24 hours started segfaulting about 1/3 of the time on ARM soft float (but never on any other platform). The stacks are consistent. Filed as https://bugzilla.xamarin.com/show_bug.cgi?id=44397 , currently assigned to me.
