#> > I found a segmentation violation in /var/dt/Xerrors that precedes each
#> > termination by what looks to be 1.5 - 2 hours (but it may also just not
#> > be time-stamped).  I'm not sure what process is dying.
#>
#> The Xerrors messages come from Sun Ray X servers that have crashed
#> because of a segmentation violation.  That is, they've tried to access a
#> memory address that is outside the process's address space, and that's
#> earned them a SIGSEGV (signal 11).  If you have a support contract then
#> you should open a call.  The support folks should be able to tell you
#> whether those dumps match a known bug and if so what the status of
#> the fix is.
#>

Support call has been opened.

#> > Oct 31 16:33:13 sr03c utauthd: [ID 794400 user.info] SessionManager0 NOTICE: EMPTY: ACTIVE session
#> > Oct 31 16:33:13 sr03c gconfd (foo-589): [ID 702911 user.info] Received signal 15, shutting down cleanly
#> > Oct 31 16:33:13 sr03c gconfd (foo-589): [ID 702911 user.info] Exiting
#>
#> These look like normal logout messages to me.
#>

Yes, they do, but the user wasn't touching the keyboard.  This is the result of
the crash below.

#>
#> > Signal 11 received! (pid 29924)
#> > pc = 0x3425C
#> > npc = 0x34260
#> > mem_catch at 0xFE18E200
#> > Machine context:
#> > FE901003 0003425C 00034260 00000000 000DE000 00033E5C 00033C00 C0000000
#> > 0000034E 00000000 FF212A00 2097FFFE 00401400 00000004 00000004 00818EA0
#> > 00000000 FFBFE670 00034260 00000000 00E1EE80 0086F498 00E1EE80 0086F498
#> > 0143C320 088A4BB0 04B2DEC0 0179E450 0086F498 00B48718 00E1E4D0 00E90F08
#> > 088A4BB0 044E2890 0179E450 0143C320 0251C7B8 04B2DEC0 41978000 00000000
#> > 40979797 97979798 3FB548B6 AB580104 00000000 00000000 40240000 00000000
#> > 40340000 00000000 40700000 00000000 00000000 00000820 00080100 00000000
#> > 78727300 FFBFE4F8 00000000 00000000 00000000 00000000 00000000 00000000
#> > 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
#> > 00000000 00000000 00000000 00000000 00000000 00000000
#>
#> This isn't normal, it's the X server crashing.  The data here is enough
#> to give someone a chance of figuring out what happened.  You're
#> correct that there's no time stamp, so there's no knowing how much
#> time passed since the previous message.
#>
#> I think (I'm not certain) the handler that writes this data then goes on
#> to try to leave a core file, but because the X server is a setgid process,
#> by default the system won't collect the core.  You can override that by
#> using the 'coreadm' command, and that'll let you collect a snapshot of
#> the failed process, which should make it easier to figure out what
#> happened than just having the hex dump above.  If the dump doesn't
#> match a known bug then I expect the service guys will ask you to try
#> 'coreadm'.
#>
#> OttoM.

Excellent suggestion, and thank you.  I'll look into coreadm and see if
we can't give them what they need before they ask for it :-)
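
A minimal sketch of the coreadm setup suggested above, for reference.  The
/var/cores directory and the core.%f.%p file pattern are assumptions (neither
comes from this thread), and the script has not been tried against this Sun
Ray server.  It only prints the commands unless DRY_RUN=0 is set, in which
case it needs to be run as root:

```shell
#!/bin/sh
# Sketch: enable global core dumps for setuid/setgid processes, such as the
# Sun Ray X server.  CORE_DIR is a hypothetical location; pick a filesystem
# with enough free space for X server cores.
CORE_DIR=${CORE_DIR:-/var/cores}

# Build the coreadm command line for a given directory.
# %f expands to the executable name and %p to the process ID.
coreadm_cmd() {
    echo "coreadm -g $1/core.%f.%p -e global -e global-setid -e log"
}

# Dry run by default: print the commands instead of running them.
if [ "${DRY_RUN:-1}" = "0" ]; then
    # Apply the settings, then print the resulting configuration.
    mkdir -p "$CORE_DIR" && $(coreadm_cmd "$CORE_DIR") && coreadm
else
    echo "mkdir -p $CORE_DIR"
    coreadm_cmd "$CORE_DIR"
    echo "coreadm"
fi
```

Running coreadm with no arguments afterwards prints the current
configuration, which should be a quick way to confirm that global setid core
dumps are enabled before the next crash.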

Thanks again!

-Bill


_______________________________________________
SunRay-Users mailing list
[email protected]
http://www.filibeto.org/mailman/listinfo/sunray-users
