Thanks to Ian and Alan for the replies. I have done some further elimination (by removing runtime components), and I don't think the new board interface is causing this. I think it is another component that isn't quite as new, but that I had forgotten is new in this context (Ubuntu server). This component periodically uses a graphing library (ZedGraph) to generate line graphs from the data collected from the input boards. I have included the entire capture of the stack trace that Mono sends to stdout. Note that this is a capture of the console, not a log file, so it also includes system status messages from my code. I left them in for context, whether it matters or not.

http://pastebin.com/kQFF4TUB

I currently have a test running that eliminates this graphing component but includes the new board component, and it seems promising so far. I'll feel better after it runs for a week, though, since a previous run lasted almost 5 days before it crashed.

At any rate, if this new component and the graphing library are causing the issue, I need to figure out how to fix it. I have used ZedGraph for a very long time to generate images like this, but the frequency used to be limited to once per day. Now it can be once per minute. The once-per-day generation is done in yet another component, so the two could be 'walking' on each other if the underlying code isn't thread-safe. I would expect some kind of time correlation if that were the case, though, and I just don't see one. I have some ideas on how to serialize all of these operations onto a single thread, but I'd need to be fairly sure of the problem before going to the effort of implementing that.
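
For reference, a minimal sketch of the single-thread idea, assuming .NET 4 style collections and tasks are available in the Mono in use. The RenderQueue class and its Invoke method are hypothetical names made up for illustration; nothing here is part of ZedGraph or Mono. Each component would hand its graph generation to the queue instead of rendering on its own thread:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: every queued job runs on one dedicated thread,
// so two components can never execute their jobs at the same time.
public sealed class RenderQueue : IDisposable
{
    private readonly BlockingCollection<Task> jobs = new BlockingCollection<Task>();
    private readonly Thread worker;

    public RenderQueue()
    {
        worker = new Thread(Run) { IsBackground = true, Name = "render-queue" };
        worker.Start();
    }

    // Queue a job and block the caller until the worker has finished it.
    // A failure inside the job surfaces here as an AggregateException.
    public void Invoke(Action job)
    {
        if (Thread.CurrentThread == worker)
        {
            job(); // already on the worker: run inline to avoid a deadlock
            return;
        }
        var task = new Task(job);
        jobs.Add(task);
        task.Wait();
    }

    // Worker loop: take jobs in arrival order and run each on this thread.
    private void Run()
    {
        foreach (Task task in jobs.GetConsumingEnumerable())
            task.RunSynchronously();
    }

    // Stop accepting jobs, let the worker drain the queue, then clean up.
    public void Dispose()
    {
        jobs.CompleteAdding();
        worker.Join();
        jobs.Dispose();
    }
}
```

Both the once-per-minute and the once-per-day components would then share one instance and call something like queue.Invoke(() => { /* build and save the graph */ }). One caveat: the trace in this thread shows Font.Finalize calling GdipDeleteFont, and finalizers run on the runtime's own thread, so a queue like this would only help if fonts and other System.Drawing objects created inside a job are also disposed inside that job.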

If I could get a good bead on what I'm doing that causes this error, I could work around it.

Thanks again for the help,
Danny


On 04/08/2013 06:21 AM, Alan wrote:
I'm not sure if fontconfig is threadsafe and the finalizer thread is
directly unreffing some fontconfig objects. This could easily be causing
the corruption you're seeing if that's the case. Can you paste the full
stacktrace of your crash (including all threads!) in a pastebin, or
attach it to your email in some way?

Alan


On 8 April 2013 08:42, Ian Norton <[email protected]> wrote:

    I'd be sure to check your struct packing and calling conventions
    properly. And perhaps be sure that you aren't passing in any
    "ref System.String" instead of StringBuilders.

    Ian

    On Mon, Apr 08, 2013 at 04:21:32AM +0100, Danny wrote:
     > Hello,
     >
     > I'm having a difficult time with an application I have written.  I
     > recently made some changes and I'm having a problem with it failing at
     > seemingly random times and locations (within the code), with sigsegv
     > errors.  This is a multithreaded plugin-style daemon/service (can be
     > launched from CLI) and I recently added a new component to it to poll a
     > data acquisition board via USB using FTDI.
     >
     > Almost all of our integrations like this use a shared library (or DLL on
     > Windows) and p/invoke to access hardware.  I have done dozens of these
     > integrations over USB without a persistent issue like this.  But still
     > at first I suspected this new component, as I had initially thought it
     > was trashing RAM because of the problems I had developing the shared
     > library.
     >
     > However, at the same time as I made this addition, I was also (somewhat)
     > forced to upgrade our base OS to the latest LTS Ubuntu 12.04 (was on
     > 10.04).  So unfortunately, I have more than one variable changing at a
     > time.  So I confirmed, with a configuration that eliminates the newly
     > developed component, that this problem occurs without that running.
     >
     > That's good and bad, since now it seems likely that the offending code
     > is out of my control.  I am hoping to get some information on the
     > error(s) I was able to capture, or some advice on how to debug the root
     > cause of this problem.
     >
     > I have a couple of stack traces captured and I'll include what I believe
     > is the crucial part of one here.  It's worth noting that not all of the
     > stack traces are the same.  It's also worth noting that I have seen
     > libgdiplus.so in other traces that I didn't get captured.
     >
     > I tried setting up a 10.04 machine to test with, but one of our newer
     > dependencies (ServiceStack) introduced a class that is not in the
     > default mono on that platform, giving a startup error trying to resolve
     > the IgnoreDataMemberAttribute class.  So I then got the latest mono set
     > up on that machine now, but fear that this will result in the same error
     > I am reporting (ie: I believe this to be a mono problem), since it
     > should be the same mono framework running there.
     >
     > Any help is greatly appreciated.
     >
     >
     >
     > <snip - a bunch of standard output msgs from the service />
     >
     > Stacktrace:
     >
     >    at (wrapper managed-to-native) System.Drawing.GDIPlus.GdipDeleteFont (intptr) <0xffffffff>
     >    at System.Drawing.Font.Dispose () <0x0002b>
     >    at (wrapper remoting-invoke-with-check) System.Drawing.Font.Dispose () <0xffffffff>
     >    at System.Drawing.Font.Finalize () <0x00013>
     >    at (wrapper runtime-invoke) object.runtime_invoke_virtual_void__this__ (object,intptr,intptr,intpt$
     >
     > Native stacktrace:
     >
     >          mono() [0x80e16fc]
     >          mono() [0x81209fc]
     >          mono() [0x806094d]
     >          [0xb770240c]
     >          /usr/lib/i386-linux-gnu/libfontconfig.so.1(FcCharSetDestroy+0x15) [0xb4b1b9b5]
     >          /usr/lib/i386-linux-gnu/libfontconfig.so.1(+0x17b43) [0xb4b29b43]
     >          /usr/lib/i386-linux-gnu/libfontconfig.so.1(FcPatternDestroy+0x82) [0xb4b29e12]
     >          /usr/lib/libgdiplus.so.0(GdipDeleteFontFamily+0x132) [0xb5004642]
     >          /usr/lib/libgdiplus.so.0(GdipDeleteFont+0x2c) [0xb500510c]
     >          [0xaf711940]
     >          [0xaf7118cc]
     >          [0xaf711870]
     >          [0xaf7117ec]
     >          [0xb5cddf41]
     >          mono() [0x8150107]
     >
     > <snip - 42 thread stack details>
     >
     > =================================================================
     > Got a SIGSEGV while executing native code. This usually indicates
     > a fatal error in the mono runtime or one of the native libraries
     > used by your application.
     > =================================================================
     >
     >
     >
     >
     > Danny
     > _______________________________________________
     > Mono-list maillist  - [email protected]
    <mailto:[email protected]>
     > http://lists.ximian.com/mailman/listinfo/mono-list

