RE: UV command failing mystery
We upgraded our system from a 16-processor AV2 to an 8-processor AV25000. Because each block of 4 CPUs can only support 4 GB of memory, our upgrade was going to cut our memory in half. So the main purpose of the changes was to cut the memory requirements of the UniVerse lock tables, and in fact we successfully cut the per-user memory usage in half. The other changes were made as part of the overall review of both uv.config and the kernel parameters prior to the upgrade.

Because several people suggested that the semaphore and/or shared memory changes could be the cause of the problem, I had them changed back to their original values yesterday. So far we have not had a failure, but it's a little too soon to celebrate.

Thanks for the advice on using the trace. I was not able to pick out the system calls, and before I make another attempt I think I will wait and see if the weekend's changes fixed the problem. If the problem is solved I will post in case someone else runs into the same problem in the future.

Thanks,
Vance Dailey

-Original Message-
From: [EMAIL PROTECTED] On Behalf Of Ken Wallis
Sent: Sunday, February 08, 2004 7:52 PM
To: 'U2 Users Discussion List'
Subject: RE: UV command failing mystery

From: Vance Dailey

It was suggested that I try to run dg_strace. I ran it on one of the failing uv processes. It generated a 1 MB file. I can see where it executes uvsh, and it fails just after the 7th occurrence of RUN APP.PROGS PACKAGE.INS. (The 7th run is just after the string SPECIAL.EDITOR.SELECT.DATA\OLONG.) I have no idea how to read this file but I thought it might help identify where the error occurred. I have included the very end of the output of dg_strace below:

...
close(3)= 0
sigaction_svr3(SIGQUIT, {...}, {...}) = 1253
sigaction_svr3(SIGNULL, {...}, {0xc0a0d, [XCPU XFSZ], SA_RESTART|SA_SIGINFO}) = 2130681856
...

This tool seems to be showing you the system calls that uvsh is making and the values returned from them (the bit after the =). The section you have shown is simply the program trying to tidy up and exit after detecting something it didn't like. You'll need to look higher up in the output for a system call which seems to return an error code. Unfortunately, you need to know which system calls should return 0 all the time and which ones regularly return other values.

I think I'd be looking at calls to sem...() or shm...() functions that return non-zero, then using errmsg (if DG/UX has that, or vi-ing /usr/include/sys/errno.h if it doesn't) to see what the error numbers returned mean, and man to interpret from that where the problem lies.

I can't remember the exact numbers you quoted earlier, but certainly with your user counts I'd be very suspicious of the reductions you made to the semaphore kernel parameters. Just as a matter of interest, why were these reductions made?

HTH,
Ken

--
u2-users mailing list
[EMAIL PROTECTED]
http://www.oliver.com/mailman/listinfo/u2-users
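Ken's suggestion of scanning the trace for sem...() or shm...() calls that report an error could be sketched roughly as follows. This is only an illustration under stated assumptions: it assumes dg_strace reports failures in the common strace style (`= -1 ENAME (text)`), it flags negative return values rather than every non-zero one (calls such as shmat legitimately return large addresses), the sample trace lines are hypothetical and not taken from the thread, and the errno descriptions come from the machine running the script, which may not match DG/UX.

```python
import errno
import os
import re

# Matches strace-style lines such as "close(3)= 0" or
# "semget(...) = -1 ENOSPC (No space left on device)".
# Assumption: dg_strace reports failed calls in this same form.
CALL_RE = re.compile(
    r"^(?P<name>\w+)\((?P<args>.*)\)\s*=\s*(?P<ret>-?\d+)"
    r"(?:\s+(?P<err>E[A-Z0-9]+))?"
)


def failed_ipc_calls(lines, prefixes=("sem", "shm")):
    """Return (line number, call name, return value, errno name) for each
    sem*/shm* call whose return value is negative."""
    hits = []
    for lineno, line in enumerate(lines, start=1):
        match = CALL_RE.match(line.strip())
        if match is None:
            continue
        name, ret = match.group("name"), int(match.group("ret"))
        if name.startswith(prefixes) and ret < 0:
            hits.append((lineno, name, ret, match.group("err")))
    return hits


def describe_errno(name):
    """Describe an errno name using the local platform's table."""
    number = getattr(errno, name)
    return f"{name} ({number}): {os.strerror(number)}"


# Hypothetical trace excerpt, invented for illustration only.
sample = [
    "close(3)= 0",
    "semget(1234, 1, 0) = -1 ENOSPC (No space left on device)",
    "shmat(5, 0, 0) = 2130681856",
    'open("/x", 0) = -1 ENOENT (No such file or directory)',
]

for lineno, name, ret, err in failed_ipc_calls(sample):
    detail = describe_errno(err) if err else "no errno name on this line"
    print(f"line {lineno}: {name} returned {ret}: {detail}")
```

On a real trace the lines would be read from the dg_strace output file instead of the `sample` list; the regular expression may need adjusting if the tool prints errors differently.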
RE: UV command failing mystery
I checked out the link you provided. It appears that the command dg_strace may be what you are suggesting. I have not had time to try it but it looks very interesting. Not having used a command like this before, I may have a bit of a learning curve. Once I do try it on a failure I will let you know what I find. Thanks for the tip.

Vance Dailey

-Original Message-
From: [EMAIL PROTECTED] On Behalf Of Dave Kimmel
Sent: Wednesday, February 04, 2004 11:08 PM
To: U2 Users Discussion List
Subject: Re: UV command failing mystery

On Feb 4, 2004, at 4:38 PM, Vance Dailey wrote:

Well, it did not take long to get a chance to test trying 'uvsh'. The problem reared its ugly head again today and we now know that uvsh also does not work. The command returns to the shell so quickly that whatever the problem is, it must occur near the start of the program. If you happen to know what uvsh does first, it may help me figure out where the problem may lie. I would have thought that an error of this sort would have been recorded in a log file but I have not stumbled across one yet.

Does your system have a command to trace system calls? You can use this to see what UniVerse is doing (at a very low level) - it may help you find the cause of this problem.

As for finding the command, the various Unix flavors all seem to call it something slightly different, but the Rosetta Stone may be able to help you:

http://bhami.com/rosetta.html

Look under the tracing utility item.

--
Dave Kimmel
[EMAIL PROTECTED]
RE: UV command failing mystery
I don't think it's a license issue. analyze.shm -c reports the correct number of licenses (1065) while we only have 700 users at peak times. Also, when the uv command fails it returns to the shell almost instantly with no message. echo $? returns a 1 (one). Still, that is a difference in the code executed by a terminal user and a phantom. Perhaps the command fails in the process of checking the licenses?

Thanks,
Vance Dailey

-Original Message-
From: [EMAIL PROTECTED] On Behalf Of Hennessey, Mark F.
Sent: Friday, February 06, 2004 11:51 AM
To: U2 Users Discussion List
Subject: RE: UV command failing mystery

<snip>
It appears that Phantoms can always get into Universe but Terminal sessions sometimes can not.
</snip>

For what it's worth, phantoms do not consume licenses, while terminals do...
RE: UV command failing mystery
It looks like the command is dg_strace for DG/UX. My problem is that I don't understand the output yet. It looks like it may be a good tool to know how to use.

Thanks,
Vance Dailey

-Original Message-
From: [EMAIL PROTECTED] On Behalf Of Gerry Maddock
Sent: Friday, February 06, 2004 11:42 AM
To: 'U2 Users Discussion List'
Subject: RE: UV command failing mystery

If you're running Red Hat, Mandrake, or Fedora, the command is strace.

-Original Message-
From: [EMAIL PROTECTED] On Behalf Of Vance Dailey
Sent: Friday, February 06, 2004 11:36 AM
To: 'U2 Users Discussion List'
Subject: RE: UV command failing mystery

I checked out the link you provided. It appears that the command dg_strace may be what you are suggesting. I have not had time to try it but it looks very interesting. Not having used a command like this before, I may have a bit of a learning curve. Once I do try it on a failure I will let you know what I find. Thanks for the tip.

Vance Dailey

-Original Message-
From: [EMAIL PROTECTED] On Behalf Of Dave Kimmel
Sent: Wednesday, February 04, 2004 11:08 PM
To: U2 Users Discussion List
Subject: Re: UV command failing mystery

On Feb 4, 2004, at 4:38 PM, Vance Dailey wrote:

Well, it did not take long to get a chance to test trying 'uvsh'. The problem reared its ugly head again today and we now know that uvsh also does not work. The command returns to the shell so quickly that whatever the problem is, it must occur near the start of the program. If you happen to know what uvsh does first, it may help me figure out where the problem may lie. I would have thought that an error of this sort would have been recorded in a log file but I have not stumbled across one yet.

Does your system have a command to trace system calls? You can use this to see what UniVerse is doing (at a very low level) - it may help you find the cause of this problem.

As for finding the command, the various Unix flavors all seem to call it something slightly different, but the Rosetta Stone may be able to help you:

http://bhami.com/rosetta.html

Look under the tracing utility item.

--
Dave Kimmel
[EMAIL PROTECTED]
RE: UV command failing mystery
I don't think it's a license issue. analyze.shm -c reports the correct number of licenses (1065) while we only have 700 users at peak times. Also, when the uv command fails it returns to the shell almost instantly with no message. echo $? returns a 1 (one). Still, that is a difference in the code executed by a terminal user and a phantom. Perhaps the command fails in the process of checking the licenses?

Thanks,
Vance Dailey

-Original Message-
From: [EMAIL PROTECTED] On Behalf Of Timothy Snyder
Sent: Friday, February 06, 2004 11:55 AM
To: U2 Users Discussion List
Subject: Re: UV command failing mystery

It appears that Phantoms can always get into Universe but Terminal sessions sometimes can not.

Is there more information besides the fact that they can't get in? It could mean that you're exhausting all of the available licenses. Terminal sessions will consume a license while phantoms will not.

Tim Snyder
IBM Data Management Solutions
Consulting I/T Specialist, U2 Professional Services
[EMAIL PROTECTED]
RE: UV command failing mystery
It was suggested that I try to run dg_strace. I ran it on one of the failing uv processes. It generated a 1 MB file. I can see where it executes uvsh, and it fails just after the 7th occurrence of RUN APP.PROGS PACKAGE.INS. (The 7th run is just after the string SPECIAL.EDITOR.SELECT.DATA\OLONG.) I have no idea how to read this file but I thought it might help identify where the error occurred. I have included the very end of the output of dg_strace below:

, {NULL, 0}, {NULL, 0}, {NULL, 0}, {NULL, 0}, {NULL, 0}, {NULL, 0}, {NULL, 0}, {NULL, 0}, {NULL, 0}, {NULL, 134688924}], 8192) = 8192
close(3)= 0
sigaction_svr3(SIGQUIT, {...}, {...}) = 1253
sigaction_svr3(SIGNULL, {...}, {0xc0a0d, [XCPU XFSZ], SA_RESTART|SA_SIGINFO}) = 2130681856
sigaction_svr3(SIGINT, {0xc0a0d, [XCPU XFSZ], SA_RESTART|SA_SIGINFO}, {0x74706f2f, [HUP INT QUIT ILL ABRT KILL SEGV PIPE ALRM TERM CLD PWR URG POLL STOP CONT TTIN TTOU VTALRM XCPU ??? XCPU XFSZ PROF ??? ??? ??? ??? ??? ??? ??? ??? ??? ??? ??? ??? DGTIMER3 DGTIMER4], SA_RESTART|SA_SIGINFO|SA_RESETHAND|SA_ONSTACK|SA_NOCLDSTOP|SA_NOCLDWAIT|0x72707520}) = 0
wait([WIFSIGNALED(s) WTERMSIG(s) == SIG???]) = 20325
wait4(134508240, <unfinished ...>
--- SIGCLD ---
<... wait4 resumed> [WIFSIGNALED(s) WTERMSIG(s) == SIG???], WEXITED|WTRAPPED, NULL) = 20325
_exit(1)= ?
+++ Exited with status 1 +++
Process 20284 detached
root imm #
Re: UV command failing mystery
The 'uv' command is basically a small front end to the 'uvsh' executable, which is really the guts of UniVerse. The uv command does little more (for UniVerse) than check ulimit settings, increase them when necessary, and then issue an execve() call to uvsh. If you bypass 'uv' and just use 'uvsh', does the problem still occur?

Your UniVerse configurables seem reasonable, as do your kernel params, but I do recall an issue whereby the OS would disallow execs (and/or forks) due to system resource exhaustion, although it has been a while and I cannot recall the details.

At 05:23 PM 02/03/2004, you wrote:

We are having a very strange intermittent problem with the UV command not working from Unix.
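The pattern described above (check resource limits, raise them where needed, then replace the process with another executable) can be illustrated with a small sketch. This is not the source of uv and makes no claim about which limits uv actually inspects; the choice of RLIMIT_NOFILE and the function names `raise_soft_limit` and `launch` are assumptions made purely for illustration.

```python
import os
import resource


def raise_soft_limit(which):
    """Try to raise the soft limit for `which` to the hard limit.

    Returns the (soft, hard) pair in effect afterwards. Some platforms
    refuse certain values, so a failed attempt leaves the limit as it was.
    """
    soft, hard = resource.getrlimit(which)
    if soft != hard:
        try:
            resource.setrlimit(which, (hard, hard))
        except (ValueError, OSError):
            pass  # keep the existing soft limit
    return resource.getrlimit(which)


def launch(path, argv):
    """Raise limits, then replace this process with `path`.

    os.execv does not return on success; on failure it raises OSError,
    which is the kind of early failure a front end could report.
    """
    # Assumption: the open-file limit stands in for whatever limits the
    # real front end checks.
    raise_soft_limit(resource.RLIMIT_NOFILE)
    os.execv(path, argv)


print(raise_soft_limit(resource.RLIMIT_NOFILE))
```

If the exec step fails because the system is short of some resource, the wrapper never reaches the real program, which would be consistent with a command that returns to the shell almost immediately.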
RE: UV command failing mystery
Thanks for the suggestion. We will try uvsh the next time we have the failure.

Last night we had the problem occur for 50 minutes. Because it continued for so long, we were able to have users in our various locations make multiple attempts to log in from multiple PCs. Surprisingly, some locations claimed they could not log in at all, while others claimed that some PCs could log in while others could not. Either our assumption that the problem affects all users is false, or the problem occurred many times but each time for only a short period.

Once again, users already in UniVerse noticed no problems, and no unusual locks were noted when we used the command list.readu every. analyze.shm -s does not show any unusually high numbers of Collisions. Are there any log files we should be checking? We are more puzzled than ever.

Vance Dailey

-Original Message-
From: [EMAIL PROTECTED] On Behalf Of Glenn Herbert
Sent: Wednesday, February 04, 2004 10:13 AM
To: U2 Users Discussion List
Subject: Re: UV command failing mystery

The 'uv' command is basically a small front end to the 'uvsh' executable, which is really the guts of UniVerse. The uv command does little more (for UniVerse) than check ulimit settings, increase them when necessary, and then issue an execve() call to uvsh. If you bypass 'uv' and just use 'uvsh', does the problem still occur?

Your UniVerse configurables seem reasonable, as do your kernel params, but I do recall an issue whereby the OS would disallow execs (and/or forks) due to system resource exhaustion, although it has been a while and I cannot recall the details.

At 05:23 PM 02/03/2004, you wrote:

We are having a very strange intermittent problem with the UV command not working from Unix.
RE: UV command failing mystery
Vance:

Is echo $? returning anything meaningful after uvsh fails?

Lee

--
Lee J. Leitner, Ph.D.
[EMAIL PROTECTED]
http://www.leitner.org/~leitnerl

The world can only be grasped by action, not by contemplation. The hand is the cutting edge of the mind. -- Jacob Bronowski
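For anyone who wants to capture the same information that echo $? gives, but from a script that can log every attempt, a minimal sketch follows. The helper name `run_and_report` is hypothetical, and the demonstration runs a tiny Python child process rather than uvsh, since the real command and its arguments depend on the installation.

```python
import subprocess
import sys


def run_and_report(argv):
    """Run a command and describe how it ended.

    A negative returncode from subprocess means the child was killed by
    that signal number; otherwise it is the ordinary exit status, the
    same value the shell exposes as $?.
    """
    proc = subprocess.run(argv, capture_output=True, text=True)
    if proc.returncode < 0:
        return f"killed by signal {-proc.returncode}"
    return f"exited with status {proc.returncode}"


# Stand-in child that exits with status 1, as the failing command is
# reported to do in this thread.
print(run_and_report([sys.executable, "-c", "import sys; sys.exit(1)"]))
```

Distinguishing a plain exit status from death by signal matters here because the two point to different kinds of failure: a deliberate early exit versus the process being terminated.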