Those two are actually in our short list of proposed fixes. Unfortunately we don't have definitive proof of what exactly the problem is.
Our proposed list of possible fixes is this: -Set Next-ID-Commit: T (<---default prior to 7.6.03 is F, default in newest versions is T, ours is old hence we need to specify it). -Set Next-ID-Block-Size: 10 -Purge the table the rows are being put into. The performance actually seems pretty good when things are working well but the data is transactional throw-away data once a person is authenticated. We are doing this one for sure this weekend just to be safe -Offload the IVR transactions to their own web server (not likely due to redundancy issues). William Rentfrow [email protected] http://www.stratacominc.com 715-204-3061 Office 715-498-5056 Cell -----Original Message----- From: Action Request System discussion list(ARSList) [mailto:[email protected]] On Behalf Of Longwing, LJ CTR MDA/IC Sent: Friday, November 09, 2012 9:29 AM To: [email protected] Subject: Re: XMLCE hangs, Oracle locks, arschema won't update (long, boring, but important...) William, If you feel you are dealing with a concurrency issue on arschema, please look into two ar.cfg parameters Next-ID-Commit Next-ID-Block-Size Either one, or a combination of both may help you with your situation. -----Original Message----- From: Action Request System discussion list(ARSList) [mailto:[email protected]] On Behalf Of William Rentfrow Sent: Friday, November 09, 2012 8:25 AM To: [email protected] Subject: XMLCE hangs, Oracle locks, arschema won't update (long, boring, but important...) ** Hi listers - We're seeing something peculiar. This is on a fairly high volume (5000 incidents/day) system. There's an integration with an IVR that does a lookup via web services to authenticate someone is actually an employee. That uses a web service that's been dependable for a couple of years (at least). All this particular WS does is create a new record in a table. The system was recently migrated from Solaris to Linux. It's using ARS 7.1 and ITSM 7.1x - the database is Oracle RAC, remote. I am wondering if there is a known bug because when we look at the log we see this when things are behaving fine (IP removed): Tue Nov 06 2012 10:23:51.7183 */+XMLCE ARXMLCreateEntry from Webservice (protocol 13) at IP address xx.xx.xx.xx Tue Nov 06 2012 10:23:51.7686 */-XMLCE OK But occasionally we see the first line and no corresponding -XMLCE. About 11 minutes (note milliseconds in the error below) later in our SystemOut.log from Websphere: [11/6/12 13:34:04:866 CST] 00000037 SystemOut O - Connects to ARServer (server name removew) through Java Rpc failed with: ERROR (91): RPC call failed; ONC/RPC call timed out [11/6/12 13:34:04:882 CST] 0000006c SystemOut O - getApiRecording: 0 [11/6/12 13:34:04:883 CST] 00000049 SystemOut O - getApiRecording: 0 [11/6/12 13:34:04:920 CST] 00000069 SystemOut O - getApiRecording: 0 [11/6/12 13:36:04:847 CST] 0000006c SystemOut O - getApiRecording: 0 [11/6/12 13:36:04:878 CST] 00000069 SystemOut O - getApiRecording: 0 [11/6/12 13:36:53:342 CST] 000000a2 SystemOut O - getApiRecording: 0 [11/6/12 13:36:53:343 CST] 000000a1 SystemOut O - getApiRecording: 0 [11/6/12 13:37:15:266 CST] 000000a4 SystemOut O - getApiRecording: 0 [11/6/12 13:37:15:266 CST] 000000a5 SystemOut O - getApiRecording: 0 [11/6/12 13:37:35:817 CST] 000000a7 SystemOut O - getApiRecording: 0 [11/6/12 13:38:04:820 CST] 00000037 SystemOut O - getApiRecording: 0 [11/6/12 13:38:04:822 CST] 00000053 SystemOut O - getApiRecording: 0 [11/6/12 13:38:04:826 CST] 0000006c SystemOut O - getApiRecording: 0 [11/6/12 13:38:04:843 CST] 0000002e SystemOut O - getApiRecording: 0 [11/6/12 13:38:04:852 CST] 00000069 SystemOut O - getApiRecording: 0 [11/6/12 13:38:53:307 CST] 000000a2 SystemOut O - getApiRecording: 0 [11/6/12 13:40:04:784 CST] 00000053 SystemOut O - getApiRecording: 0 [11/6/12 13:40:04:802 CST] 0000003c SystemOut O - getApiRecording: 0 [11/6/12 13:40:53:267 CST] 000000a2 SystemOut O - getApiRecording: 0 [11/6/12 13:41:09:066 CST] 00000032 ThreadMonitor W WSVR0605W: Thread "WebContainer : 51" (00000063) has been active for 675278 milliseconds and may be hung. There is/are 1 thread(s) in total in the server that may be hung. at com.remedy.arsys.goat.preferences.ARUserPreferences.getUserPreferences(Unknown Source) at com.remedy.arsys.stubs.SessionData.<init>(Unknown Source) at com.remedy.arsys.session.Login.initWebService(Unknown Source) at com.remedy.arsys.ws.services.ARService.login(Unknown Source) at com.remedy.arsys.ws.services.ARService.processRequest(Unknown Source) at sun.reflect.GeneratedMethodAccessor31.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:618) at org.apache.axis.providers.java.MsgProvider.processMessage(MsgProvider.java:141) at org.apache.axis.providers.java.JavaProvider.invoke(JavaProvider.java:319) at org.apache.axis.strategies.InvocationStrategy.visit(InvocationStrategy.java:32) at org.apache.axis.SimpleChain.doVisiting(SimpleChain.java:118) at org.apache.axis.SimpleChain.invoke(SimpleChain.java:83) at org.apache.axis.handlers.soap.SOAPService.invoke(SOAPService.java:453) at org.apache.axis.server.AxisServer.invoke(AxisServer.java:281) at org.apache.axis.transport.http.AxisServlet.doPost(AxisServlet.java:699) at javax.servlet.http.HttpServlet.service(HttpServlet.java:763) at org.apache.axis.transport.http.AxisServletBase.service(AxisServletBase.java:327) at javax.servlet.http.HttpServlet.service(HttpServlet.java:856) at com.ibm.ws.webcontainer.servlet.ServletWrapper.service(ServletWrapper.java:1213) at com.ibm.ws.webcontainer.servlet.ServletWrapper.service(ServletWrapper.java:1154) at com.ibm.ws.webcontainer.filter.WebAppFilterChain.doFilter(WebAppFilterChain.java:118) at com.ibm.ws.webcontainer.filter.WebAppFilterChain._doFilter(WebAppFilterChain.java:87) at com.ibm.ws.webcontainer.filter.WebAppFilterManager.doFilter(WebAppFilterManager.java:848) at com.ibm.ws.webcontainer.filter.WebAppFilterManager.doFilter(WebAppFilterManager.java:691) at com.ibm.ws.webcontainer.servlet.ServletWrapper.handleRequest(ServletWrapper.java:654) at com.ibm.ws.wswebcontainer.servlet.ServletWrapper.handleRequest(ServletWrapper.java:526) at com.ibm.ws.webcontainer.servlet.CacheServletWrapper.handleRequest(CacheServletWrapper.java:90) at com.ibm.ws.webcontainer.WebContainer.handleRequest(WebContainer.java:764) at com.ibm.ws.wswebcontainer.WebContainer.handleRequest(WebContainer.java:1478) at com.ibm.ws.webcontainer.channel.WCChannelLink.ready(WCChannelLink.java:133) at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.handleDiscrimination(HttpInboundLink.java:450) at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.handleNewRequest(HttpInboundLink.java:508) at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.processRequest(HttpInboundLink.java:296) at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.ready(HttpInboundLink.java:270) at com.ibm.ws.ssl.channel.impl.SSLConnectionLink.determineNextChannel(SSLConnectionLink.java:1069) at com.ibm.ws.ssl.channel.impl.SSLConnectionLink.readyInboundPostHandshake(SSLConnectionLink.java:728) at com.ibm.ws.ssl.channel.impl.SSLConnectionLink$MyHandshakeCompletedCallback.complete(SSLConnectionLink.java:415) at com.ibm.ws.ssl.channel.impl.SSLUtils.handleHandshake(SSLUtils.java:942) at com.ibm.ws.ssl.channel.impl.SSLHandshakeIOCallback.complete(SSLHandshakeIOCallback.java:70) at com.ibm.ws.tcp.channel.impl.AioReadCompletionListener.futureCompleted(AioReadCompletionListener.java:165) at com.ibm.io.async.AbstractAsyncFuture.invokeCallback(AbstractAsyncFuture.java:217) at com.ibm.io.async.AsyncChannelFuture.fireCompletionActions(AsyncChannelFuture.java:161) at com.ibm.io.async.AsyncFuture.completed(AsyncFuture.java:136) at com.ibm.io.async.ResultHandler.complete(ResultHandler.java:196) at com.ibm.io.async.ResultHandler.runEventProcessingLoop(ResultHandler.java:792) at com.ibm.io.async.ResultHandler$2.run(ResultHandler.java:881) at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java:1551) At this point we experience a database lock on the arschema table where Oracle has been trying to update the nextid for the table in the XMLCE and it has not released the lock for the row in question. This is the row for the table that the XMLCE is firing against. Our DBA has to manually clear the locked thread(s - usually more than one) and then the final -XMLCE entry is put into the arapi log file and the system frees up. I've never see this before - and there's not enough information in the logs to determine what exactly is happening. We have a case open with BMC but this has all of the hallmarks of an obscure bug that only happens on Tuesdays during a thunderstorm. Unfortunately it's becoming more frequent. William Rentfrow [email protected] http://www.stratacominc.com 715-204-3061 Office 715-498-5056 Cell _attend WWRUG12 www.wwrug.com ARSlist: "Where the Answers Are"_ _______________________________________________________________________________ UNSUBSCRIBE or access ARSlist Archives at www.arslist.org attend wwrug12 www.wwrug12.com ARSList: "Where the Answers Are" _______________________________________________________________________________ UNSUBSCRIBE or access ARSlist Archives at www.arslist.org attend wwrug12 www.wwrug12.com ARSList: "Where the Answers Are"

