Re: [Mono-list] Looking for mono expert to help debug a hanging process

2015-11-11 Thread River Satya
Great, thanks for the helpful response Edward. My responses to your
comments inline, and updated info at the end:

On 10 November 2015 at 08:56, Edward Ned Harvey (mono) <
edward.harvey.m...@clevertrove.com> wrote:

> > From: mono-list-boun...@lists.ximian.com [mailto:mono-list-
> > boun...@lists.ximian.com] On Behalf Of River Satya
> >
> > We have a c# binary running under mono on Ubuntu 14.04 which hangs
> > periodically.
> >
> > When it hangs, SIGQUIT does not generate a thread dump, and all threads,
> > including one heartbeat thread that does very little but pulse the logs
> > once a minute, seem to stop.
>
> First and foremost, make sure you're running the latest version of mono.
> What version are you on?
>

Mono JIT compiler version 4.0.4 (Stable 4.0.4.1/5ab4c0d Tue Aug 25 23:11:51
UTC 2015)


>
> You should also be aware, that Xamarin has a list of 3rd party contractors
> for support work like this. You should be able to find that on their
> website.
>

Great, thanks!


> Sounds like (probably) a deadlock. But a deadlock between some other
> threads shouldn't affect your heartbeat thread - unless your heartbeat
> thread is dependent on something. How is your heartbeat thread written?
>

It's a bit of a stretch to call it a heartbeat thread. It's actually the
main thread, and writes a log line once per minute. It also watches a
CancellationTokenSource for cancellation via a Unix signal (to allow
graceful shutdown in case of SIGTERM/SIGINT). It also does cleanup of other
completed Tasks etc. It's certainly not impossible that it's blocking on
another thread.


> For example, if you have a heartbeat thread that uses a Timer, the Timer
> needs to raise an async event from the threadpool, so if the threadpool is
> drained by some other threads, then your Timer event might not occur. But
> if you created a managed instance of System.Threading.Thread, and then
> launched it into a while(true) loop, that uses
> System.Threading.Thread.Sleep(), you can be assured you don't have a
> dependency on the threadpool.


We don't use the threadpool from the main thread (apart from at startup),
though it is used elsewhere in the app. I'd be very surprised if we're
maxing out the threadpool (default number is 100?), unless there's a leak
somewhere, which is possible, though I think I'd have seen it. We never
seem to go above 40 total threads.
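
The pool limits and current usage can be read at runtime rather than guessed.
A minimal sketch, assuming it is called from the once-a-minute log line;
PoolStats is a hypothetical helper name, not part of the app or of mono:

```csharp
using System;
using System.Threading;

public static class PoolStats
{
    // Returns one log-friendly line with busy/max counts for the threadpool.
    public static string Describe()
    {
        int maxWorkers, maxIo, freeWorkers, freeIo;
        ThreadPool.GetMaxThreads(out maxWorkers, out maxIo);
        ThreadPool.GetAvailableThreads(out freeWorkers, out freeIo);

        // busy = max - available, for worker and I/O completion threads
        return string.Format(
            "threadpool workers busy={0}/{1}, io busy={2}/{3}",
            maxWorkers - freeWorkers, maxWorkers,
            maxIo - freeIo, maxIo);
    }
}
```

If the busy worker count sits at the maximum just before a hang, pool
exhaustion becomes a much stronger suspect; if it stays low, it can probably
be ruled out.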


> But if you accidentally drop reference to your heartbeat thread, some time
> later it will be collected by the GC (while it's still running) which is no
> bueno.


It's the main thread, so presumably this isn't a problem.


> If the heartbeat thread is using any locking, that's a possible issue. If
> it's writing to some log resource, or file, which is shared by other
> threads, that's a possible issue.
>

It writes logs using log4net, and there is definitely some locking code in
it.

[time passes]

Okay, so it turns out that the machine was low on memory at the time that
this happened (~ 80MiB). I'm not sure if this is a symptom of what was
happening or the cause. Either way, I spun up a new instance with double
the memory and retested.

Now I'm seeing different symptoms, but still concerning, and possibly
related.

Several times a day, we get segfaults printed to stdout, often with no
stacktrace, i.e.

Stacktrace:

Native stacktrace:


and sometimes with a stacktrace, e.g.

* Assertion at mono-internal-hash.c:125, condition `0' not met

Stacktrace:

Native stacktrace:

/usr/bin/mono() [0x4b23dc]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x10340) [0x7f9f06cbc340]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x39) [0x7f9f0691dcc9]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x148) [0x7f9f069210d8]
/usr/bin/mono() [0x629869]
/usr/bin/mono() [0x629a77]
/usr/bin/mono() [0x629bc6]
/usr/bin/mono() [0x6193ac]
/usr/bin/mono() [0x422086]
/usr/bin/mono() [0x5a6f02]
/usr/bin/mono() [0x5b0610]
/usr/bin/mono() [0x5a1c89]
/usr/bin/mono() [0x5a1cc0]
/usr/bin/mono() [0x5a215d]
/usr/bin/mono() [0x5874e8]
/usr/bin/mono() [0x623a36]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8182) [0x7f9f06cb4182]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f9f069e147d]

Debug info from gdb:

=
Got a SIGABRT while executing native code. This usually indicates
a fatal error in the mono runtime or one of the native libraries
used by your application.
=

Thanks again for your help!

Cheers,

River
___
Mono-list maillist  -  Mono-list@lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-list


Re: [Mono-list] Looking for mono expert to help debug a hanging process

2015-11-10 Thread Elmar Haneke

>
> I've tried building with debug symbols and attaching gdb to work out
> where it's stopped, but gdb segfaults when I try to get a stack trace
> for certain threads.

Did you try debugging with the MonoDevelop integrated debugger?
If the problem arises in managed (C#) code, you can debug it with the IDE.
I'm not sure that machine-level debugging with gdb is the right tool for that.

Elmar


Re: [Mono-list] Looking for mono expert to help debug a hanging process

2015-11-09 Thread Edward Ned Harvey (mono)
> From: mono-list-boun...@lists.ximian.com [mailto:mono-list-
> boun...@lists.ximian.com] On Behalf Of River Satya
> 
> We have a c# binary running under mono on Ubuntu 14.04 which hangs
> periodically.
> 
> When it hangs, SIGQUIT does not generate a thread dump, and all threads,
> including one heartbeat thread that does very little but pulse the logs once a
> minute, seem to stop.

First and foremost, make sure you're running the latest version of mono. What 
version are you on?

You should also be aware, that Xamarin has a list of 3rd party contractors for 
support work like this. You should be able to find that on their website.

Sounds like (probably) a deadlock. But a deadlock between some other threads 
shouldn't affect your heartbeat thread - unless your heartbeat thread is 
dependent on something. How is your heartbeat thread written?

For example, if you have a heartbeat thread that uses a Timer, the Timer needs 
to raise an async event from the threadpool, so if the threadpool is drained by 
some other threads, then your Timer event might not occur. But if you created a 
managed instance of System.Threading.Thread, and then launched it into a 
while(true) loop, that uses System.Threading.Thread.Sleep(), you can be assured 
you don't have a dependency on the threadpool. But if you accidentally drop 
reference to your heartbeat thread, some time later it will be collected by the 
GC (while it's still running) which is no bueno. If the heartbeat thread is 
using any locking, that's a possible issue. If it's writing to some log 
resource, or file, which is shared by other threads, that's a possible issue.
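
The dedicated-thread approach described above can be sketched as follows. This
is only an illustration under assumptions: Heartbeat, Start and the beat
callback are hypothetical names, and the sketch waits on the cancellation
token's WaitHandle instead of calling Thread.Sleep() so the loop can exit
promptly on shutdown.

```csharp
using System;
using System.Threading;

public static class Heartbeat
{
    // Runs `beat` on its own thread every `interval` until `token` is cancelled.
    // No Timer and no threadpool work item is involved.
    public static Thread Start(Action beat, TimeSpan interval, CancellationToken token)
    {
        var thread = new Thread(() =>
        {
            while (!token.IsCancellationRequested)
            {
                beat();

                // WaitOne returns true if cancellation was signalled
                // before the interval elapsed.
                if (token.WaitHandle.WaitOne(interval))
                    break;
            }
        });
        thread.Name = "heartbeat";
        thread.IsBackground = true; // do not keep the process alive on exit
        thread.Start();
        return thread;
    }
}
```

A call would look like Heartbeat.Start(() => log.Info("alive"),
TimeSpan.FromMinutes(1), cts.Token), where log and cts are whatever logger and
CancellationTokenSource the application already has. Note that if the callback
logs through a shared, locking logger, the heartbeat can still block on that
lock. Keeping the returned Thread in a field does no harm, although as far as
I know it is an unreferenced System.Threading.Timer, rather than a started
thread, that the GC can collect.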