Thanks, Ben.

We're having problems determining where the 11 is coming from.

On Fri, Nov 13, 2009 at 1:49 PM, Ben Chernys <
ben.cher...@softwaretoolhouse.com> wrote:

>
>  PS.  The 91 is a red herring.  It's the Sig 11 (SEGV) you need to worry
> about.   The 91 is another process not being able to communicate with the
> arserverd process.
>
> Cheers
> Ben
>
>  ------------------------------
> *From:* Ben Chernys [mailto:ben.cher...@softwaretoolhouse.com]
> *Sent:* November 13, 2009 8:42 PM
> *To:* 'arslist@ARSLIST.ORG'
> *Subject:* RE: Prod server down - services will not stay up
>
>   The signal 11 is bad code - simple as that.  It's a "segmentation
> violation" which means that the server (arserverd) attempted to read or
> write to an address not allocated to its virtual space.  It can also be
> caused by a double free or two pointers to one block which has been freed.
> In any event, you cannot fix this without the ARS source code which I expect
> you would find hard to get.
>
> That being said, the easiest way to determine (and then circumvent) these
> types of things is to turn on SQL logging on the server before the system
> starts (through the ar.conf file).  The exact settings are in the
> configuring ARS guide.
>
> Then, when the blow up happens, see what the server was attempting to do.
> You can usually spot some possible internal database inconsistencies (in ARS
> meta-data) in this way and then repair them manually through SQL before the
> ARS start-up.
>
> Additionally, there may be patches available that address the problem.
>
> Cheers
> Ben Chernys
>
>
>
>  ------------------------------
> *From:* Action Request System discussion list(ARSList) [mailto:
> arsl...@arslist.org] *On Behalf Of *Susan Palmer
> *Sent:* November 13, 2009 8:30 PM
> *To:* arslist@ARSLIST.ORG
> *Subject:* Prod server down - services will not stay up
>
> Help !!
>
> Working with support but could use anyone else's input.  I'm at WWRUG so
> it's somewhat limiting.
>
> We did a truss log, and when the services drop (arerror 91) we see the
> following:
> 167
> /11:    read(54, "\0FE\0\006\0\0\0\0\01017".., 2064)    = 254
> /11:    write(54, "\0A1\0\006\0\0\0\0\003 ^".., 161)    = 161
> /11:    read(54, "\0F7\0\006\0\0\0\0\01017".., 2064)    = 247
> /11:        Incurred fault #6, FLTBOUNDS  %pc = 0xFE6A3558
> /11:          siginfo: SIGSEGV SEGV_MAPERR addr=0xFB47FB4C
> /11:        Received signal #11, SIGSEGV [caught]
> /11:          siginfo: SIGSEGV SEGV_MAPERR addr=0xFB47FB4C
>
> The services do restart automatically so armonitor is doing its job.
> We've commented out everything from armonitor but the arserverd command.
>
> We stay up for between 2-10 minutes and then wham, we're down again.
> Obviously this just started this morning.
>
> unix sun solaris 10
> oracle 10g
> ars 7.0.1P2
>
> They did expand the database size last night if that has any bearing.  But
> we can connect to the database successfully when ar is down.
>
> Nothing else helpful in arerror.log, only 91 error.
>
> I'm at the Hardrock hotel, call room 30601 if you have questions or can
> help!
>
> Thanks,
> Susan
>
>
>
>
>
> _Platinum Sponsor: rmisoluti...@verizon.net ARSlist: "Where the Answers
> Are"_
>

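For anyone trying to work out where the 11 comes from in a longer truss log, here is a rough sketch that pulls only the fault and signal lines out of the output and groups the details per line. It assumes the "/11:" prefix is the LWP (thread) id that truss prints for multithreaded processes, so thread 11 and signal #11 are two different elevens that happen to coincide. The function name and the regular expressions are my own, written against the excerpt quoted above, and may need adjusting for other truss versions.

```python
import re

# Lines copied from the truss excerpt quoted in this thread.
TRUSS_SAMPLE = [
    r'/11:    read(54, "\0F7\0\006\0\0\0\0\01017".., 2064)    = 247',
    '/11:        Incurred fault #6, FLTBOUNDS  %pc = 0xFE6A3558',
    '/11:          siginfo: SIGSEGV SEGV_MAPERR addr=0xFB47FB4C',
    '/11:        Received signal #11, SIGSEGV [caught]',
    '/11:          siginfo: SIGSEGV SEGV_MAPERR addr=0xFB47FB4C',
]

# Assumed line shapes, based only on the excerpt above.
LWP_RE = re.compile(r'^/(\d+):\s+(.*)$')
FAULT_RE = re.compile(r'Incurred fault #(\d+), (\w+)\s+%pc = (0x[0-9A-Fa-f]+)')
SIGNAL_RE = re.compile(r'Received signal #(\d+), (\w+)')
SIGINFO_RE = re.compile(r'siginfo: (\w+) (\w+) addr=(0x[0-9A-Fa-f]+)')


def summarize_faults(lines):
    """Return one dict per fault, signal, or siginfo line; skip system calls."""
    events = []
    for line in lines:
        prefix = LWP_RE.match(line)
        if not prefix:
            continue  # not a per-LWP truss line
        lwp, rest = int(prefix.group(1)), prefix.group(2)

        fault = FAULT_RE.search(rest)
        if fault:
            events.append({"lwp": lwp, "kind": "fault",
                           "number": int(fault.group(1)),
                           "name": fault.group(2),
                           "pc": int(fault.group(3), 16)})
            continue

        sig = SIGNAL_RE.search(rest)
        if sig:
            events.append({"lwp": lwp, "kind": "signal",
                           "number": int(sig.group(1)),
                           "name": sig.group(2)})
            continue

        info = SIGINFO_RE.search(rest)
        if info:
            events.append({"lwp": lwp, "kind": "siginfo",
                           "name": info.group(1),
                           "code": info.group(2),
                           "addr": int(info.group(3), 16)})
    return events


if __name__ == "__main__":
    for event in summarize_faults(TRUSS_SAMPLE):
        print(event)
```

To run it against a real log, read the file into a list of lines and pass that to summarize_faults. The %pc value from the fault line is the instruction address that faulted, and the addr value is the unmapped address it tried to touch.
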
_______________________________________________________________________________
UNSUBSCRIBE or access ARSlist Archives at www.arslist.org