The reality is that it could be something in the OS, but I don't like 'whack-a-mole' as a troubleshooting methodology. I prefer evidence-based actions.
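As a rough sanity check on the counter-reset idea Fred describes in the quoted message below, here is a small sketch of the arithmetic. It assumes (this is not stated anywhere in the thread) that the counter in question is a tick count held in a signed 32-bit integer, and that the OS ticks at either 100 Hz or 1000 Hz. The function name is made up for illustration. Under those assumptions the wrap points land just under the "<28 days" and "<280 days" figures he mentions.

```python
# Assumption: the counter is a tick count stored in a signed 32-bit integer,
# so 31 bits are usable before it overflows. Tick rates are also assumptions.
SECONDS_PER_DAY = 86_400


def days_until_wrap(ticks_per_second: int, usable_bits: int = 31) -> float:
    """Days of uptime before a tick counter with `usable_bits` bits overflows."""
    return (2 ** usable_bits) / ticks_per_second / SECONDS_PER_DAY


for hz in (100, 1000):
    print(f"{hz:>4} Hz tick: counter wraps after about {days_until_wrap(hz):.1f} days")

# The Solaris servers in question have been up for 169 days.
uptime_days = 169
print("169 days is below the 100 Hz wrap point:", uptime_days < days_until_wrap(100))
```

If those assumptions hold, 169 days of uptime is past a 1000 Hz wrap point but short of a 100 Hz one, so whether uptime matters would depend on how the OS and Oracle actually count time on these boxes. That is something to confirm, not assume.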
Sent from my iPhone

On Oct 18, 2011, at 3:19 PM, Fred James <[email protected]> wrote:

> Daniel Herrington wrote:
>> On Tue, Oct 18, 2011 at 1:35 PM, Paul Heinlein <[email protected]> wrote:
>>
>>> On Tue, 18 Oct 2011, Daniel Herrington wrote:
>>>
>>>> All,
>>>>
>>>> We're working an issue with extreme latency on one of our
>>>> application servers. The lead tech, who hasn't established much
>>>> credibility, keeps saying he wants to bounce the Sun Solaris
>>>> servers, as they have been up for 169 days. He feels that may be the
>>>> cause of the issue. I highly doubt it as the System Administrators
>>>> are saying that resources are available for the application. What's
>>>> the recommended reboot cycle for Sun Solaris servers?
>>>>
>>> I haven't administered any mission-critical Solaris boxes for several
>>> years, but I never had to reboot Solaris to solve an application-level
>>> problem. I suspect truss or dtrace can identify the source of the
>>> latency, if it actually is caused by Solaris.
>>>
>> I don't think the latency is coming from the OS, but who knows. At this
>> point the Scheduler (CA Inc) is seeing it in the OCI calls, but that just
>> means it's downstream from them. Oracle is saying they can't find anything
>> in the logs, and so I've got a whole lot of shrugging shoulders. The Storage
>> guys say the OS can tell if they are having resource issues, and the OS guys
>> are saying they don't see any resource spikes. The only thing I do know is
>> that no one has a clue. To solve the problem we 'roll around the horn' so to
>> speak on the RAC environment. Latency disappears after that. It smells like
>> an Oracle issue, but at this point I'm stuck.
>>
> Not sure this is going to help much, but just in case
> IRIX 6.4 (SGI MIPS)
> Oracle 8.0.4
> What appeared as a latency issue was that Oracle didn't like the way SVR4
> systems counted time, and processes (it was a known bug but Oracle's fix
> was to move to 9.x which wasn't supported on those offending OS's). The
> known bug, you see, was found on other SVR4 systems as well, and
> Oracle's known workaround was to periodically power cycle the
> machine(s), resetting the counters in question. In our case it was <28
> days (we made it 21 so it could always happen on Sundays), on some SVR4
> boxes it was <280 days ... that was dependent on how the OS counted time.
> Hope that helps
> Regards
> Fred James
>
> _______________________________________________
> PLUG mailing list
> [email protected]
> http://lists.pdxlinux.org/mailman/listinfo/plug
