Hi John!

Just a thought:
Disable escalations in the AR config file (ar.cfg on Windows) and restart the 
server. What happens?
If the server runs fine (for, say, 15-30 min after starting up), back up all 
escalations to a .def file, delete them, stop the AR server, re-enable 
escalations, restart the server, and import the escalations one at a time to 
see which escalation(s) cause the problem.
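
The one-at-a-time import is essentially a simple isolation loop. Here is a 
minimal sketch in Python, only to illustrate the idea. The three helpers 
(import_one, remove_one, server_is_healthy) are hypothetical placeholders, not 
part of any BMC tool; you would have to wire them to your own import and 
monitoring steps. Removing a suspect again before continuing is my own 
assumption, and in practice the server may need a restart at that point.

```python
from typing import Callable, Iterable, List


def find_bad_escalations(
    names: Iterable[str],
    import_one: Callable[[str], None],      # hypothetical: import one escalation from the .def backup
    remove_one: Callable[[str], None],      # hypothetical: delete that escalation again
    server_is_healthy: Callable[[], bool],  # hypothetical: e.g. watch arserver CPU for 15-30 min
) -> List[str]:
    """Import escalations one at a time and return those after which the server misbehaves."""
    suspects: List[str] = []
    for name in names:
        import_one(name)
        if not server_is_healthy():
            # This escalation appears to trigger the problem: record it and
            # take it back out so the remaining ones are tested cleanly.
            suspects.append(name)
            remove_one(name)
    return suspects


if __name__ == "__main__":
    # Simulated run with made-up escalation names; no real server involved.
    installed = set()
    bad = {"Esc:NotifyOverdue"}
    result = find_bad_escalations(
        ["Esc:CloseResolved", "Esc:NotifyOverdue", "Esc:Reminder"],
        import_one=installed.add,
        remove_one=installed.discard,
        server_is_healthy=lambda: not (installed & bad),
    )
    print(result)
```

With many escalations, importing them in halves instead of singly would 
narrow things down faster, but one at a time is the simplest to reason about.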

Best Regards,
Theo

Sent from my Black/Silver Personal Computer ....
 
"Try not to become a person of success, but a person of value." - Albert 
Einstein

-----Original Message-----
From: Action Request System discussion list(ARSList) 
[mailto:arslist@ARSLIST.ORG] On Behalf Of Reiser, John J
Sent: 27 June 2011 23:27
To: arslist@ARSLIST.ORG
Subject: arserver.exe is consuming 100% cpu - possible DB corruption? (Long 
Post)

Hello Listers,
ARS 7.6.03
MS 2003 Enterprise
MS SQL 2005 (remote)
Total home grown system. No OOTB modules.


I have a real stumper here. It even has BMC scratching their heads.
I have a production system that is experiencing CPU overload: arserver.exe 
climbs to 99% in the process list and stays there.
The AR System server is a virtual machine. We thought it might be an MS "Patch 
Tuesday" issue, so we removed the 10 most recent MS patches one at a time and 
restarted the machine each time. The problem still appears after the arserver 
service starts. Sometimes it happens immediately, and sometimes the server will 
sit for 1-20 minutes before it starts to hog the CPUs.
To eliminate any other OS and file system issues we grabbed a two week old 
backup image of the server and restored it.
The system came back ok for a short while and then started to lock up the CPU 
again.
Working with BMC I set the logs on and restarted. We saw the system jump to 
100% within a minute and captured a 10MB arsql.log file.
I can force the overload at any time by firing filter workflow with a 
notification action in it.
I disabled this one filter but the system still loaded up. I added a Filter 
that ran at execution order 0, and its only action was a Goto 1000 to skip all 
Filter actions that fired on the change of the Status field in question.
Still no joy. 
I've disabled every piece of Notify workflow. That worked the best and kept the 
system alive for longer stretches but we can't run a system that way.

I've come to the realization that there may be corrupted information in the DB 
object tables and I wanted to get some feedback.
Using rrrChive I can pull a copy of every form's data since, say, two weeks 
ago, then have the DBA restore the entire system from that date. After the 
restore I would use rrrChive to reload the two weeks' data ('Modified Date' > 
"06/11/2011") and hope for the best.

Any workflow that was changed in the last two weeks is negligible and could be 
recreated/updated as needed.

Do you think this is a viable solution?
When I asked the BMC tech if I could dump the T, H & B tables, restore the DB, 
and reload the T, H & B tables, he reminded me that the arschema and other meta 
tables would probably be out of sync.
That's when I thought of using rrrChive.

Sorry to be so long-winded, but I need to get this back online. BMC can't find 
anything in the logs, and I don't want to lose the tickets we've taken in the 
last week.




--- 
John J. Reiser 
Remedy Developer/Administrator 
Senior Software Development Analyst 
Lockheed Martin - MS2 
The star that burns twice as bright burns half as long. 
Pay close attention and be illuminated by its brilliance. - paraphrased by me 

_______________________________________________________________________________
UNSUBSCRIBE or access ARSlist Archives at www.arslist.org
attend wwrug11 www.wwrug.com ARSList: "Where the Answers Are"
