Re: [U2] Tracing UV system calls, linux
-----Original Message-----
From: u2-users-boun...@listserver.u2ug.org [mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of John Hester
Sent: Saturday, June 06, 2009 6:51 PM
To: U2 Users List
Subject: Re: [U2] Tracing UV system calls, linux

-----Original Message-----
From: u2-users-boun...@listserver.u2ug.org [mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of John Hester
Sent: Friday, June 05, 2009 11:35 AM
To: U2 Users List
Subject: [U2] Tracing UV system calls, linux

Does anyone know of a way to trace system calls made by UV as a non-root user? Here's the problem we have: UV on RH ES 5.1 joined to a W2K3 native mode AD domain. We have an AD issue that causes UV to either fail to execute, or die before the user can enter the environment. This doesn't happen to all users, and appears to be random, but affects more users over time. Usually a user can get into UV eventually after repeated attempts. Once they're logged in, everything's fine.

---

Update - turns out this is not an AD issue after all (at least not directly, anyway), it's an issue with specific /dev/pts/... terminal device files. I wasn't seeing it with non-AD UOJ connections because they don't use a tty device. Once I realized it only affected a specific tty device, I was able to reproduce it with any login, including root. I noticed that 2 UV sessions were showing in PORT.STATUS with ? as their PID and a blank tty device. uvlictool clean_lic -a got back 2 seats. I restarted UV after that, and now the problem's gone. Unfortunately I didn't think to try to reproduce the problem before restarting, so I'm not sure if cleaning up the licensing alone fixed it.

Update #2 - problem resolved

Turns out uvdlockd normally cleans up any orphaned memory segments. The uvdlockd.log file in the uv home directory hit the 2GB file size limit sometime between mid April and mid May, preventing uvdlockd from doing its job.
When a user was assigned the same UV user # as the previous owner of the orphaned memory segment, UV would fail to execute.

IBM U2 support also discovered that UV 10.2.7 installed as root on Linux doesn't have the suid bit set on a number of executables that it should. When the suid bit is set on the correct files, a user's UV session will clean up a prior orphaned memory segment on its own if uvdlockd hasn't already (assuming uvdlockd.log isn't full - the process immediately dies in this case since it also tries to write to the log). Without the suid bit set on these files, the process will loop until uvdlockd removes the segment. The suid issue is resolved in the current 10.3.1 release (not sure if it was resolved in this release or a prior one).

These are the files in `cat /.uvhome`/bin that should have suid set:

-rwsr-x--x 1 root bin  1351096 Dec 5 2007 clean
-rwsr-x--x 1 root bin  3040420 Dec 5 2007 convchar
-rwsr-x--x 1 root bin  1378924 Dec 5 2007 list_readu
-rwsr-x--x 1 root bin    42016 Dec 5 2007 uv
-rwsr-x--x 1 root bin  3035716 Dec 5 2007 uvadmsh
-rwsr-x--x 2 root bin  3134936 Dec 5 2007 uvbackup
-rwsr-x--x 1 root bin  1371152 Dec 5 2007 uvdlockd
-rwsr-x--x 1 root root 5199327 Dec 5 2007 uvdlockd.d
-rwsr-x--x 1 root bin    41888 Dec 5 2007 uvdls
-rwsr-x--x 1 root bin  1358008 Dec 5 2007 uvlictool
-rwsr-x--x 1 root bin     3928 Dec 5 2007 uvpset
-rwsr-x--x 2 root bin  3134936 Dec 5 2007 uvrestore
-rwsr-x--x 1 root bin     3252 Dec 5 2007 uvsetacc
-rwsr-x--x 1 root bin  1354924 Dec 5 2007 uvtl_helper

The IBM U2 support engineer gave me this script to correct the suid problem:

#!/bin/sh
# Add suid bit to the necessary executables.
# Based on the executables in group MAIN from 10.3.1 release.
# ECase: 11963
Cmds='uvtl_helper uvsetacc uvdlockd uvdlockd.d uv uvpset convchar uvlictool uvadmsh list_readu uvdls convencfile uvrestore uvbackup clean'
for Cmd in $Cmds
do
    if [ -f `cat /.uvhome`/bin/$Cmd ]
    then
        chmod u+s `cat /.uvhome`/bin/$Cmd
    fi
done

-John

___
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users
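
For anyone wanting to check a server before or after running the script above, here is a minimal sketch of two read-only helpers. It is not from IBM support. The function names `missing_suid` and `log_near_limit` are made up for illustration, and the 90% threshold is an arbitrary assumption. The first reports which named files exist but lack the suid bit; the second tests whether a log file is close to a size limit such as the 2GB one that uvdlockd.log hit.

```shell
#!/bin/sh
# Sketch only: helper names and the threshold are illustrative assumptions.

# Print each named file under bin_dir that exists but has no setuid bit.
missing_suid() {
    bin_dir=$1; shift
    for name in "$@"; do
        f="$bin_dir/$name"
        # -f: regular file exists; -u: setuid bit is set
        if [ -f "$f" ] && [ ! -u "$f" ]; then
            echo "$name"
        fi
    done
}

# Return 0 when file is at or above pct percent of limit bytes.
# Uses integer division, so it is only meaningful for limits of 100 or more.
log_near_limit() {
    file=$1 limit=$2 pct=$3
    [ -f "$file" ] || return 1
    # Arithmetic expansion strips any padding wc may print.
    size=$(( $(wc -c < "$file") ))
    [ "$size" -ge $(( limit / 100 * pct )) ]
}

# Example calls (commented out; paths follow the description in this thread):
#   missing_suid "$(cat /.uvhome)/bin" uv uvdlockd uvlictool clean
#   log_near_limit "$(cat /.uvhome)/uvdlockd.log" 2147483647 90 \
#       && echo "uvdlockd.log is close to the 2GB limit"
```

Neither helper changes anything on disk, so both can be run as a non-root user, for example from cron, to catch the condition before uvdlockd stops cleaning up.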
Re: [U2] Tracing UV system calls, linux
Hi.

We had a similar issue with UniVerse under VMware Infrastructure, but it was a telnet issue. We solved it after:
- patching the UniVerse host to the latest OS patch level,
- installing the latest VMware Tools,
- AND making sure that the host operating system was exactly described in the ESX server.

HTH
Augusto
Re: [U2] Tracing UV system calls, linux
Can you provide details of the environment? e.g.
- production/dr/uat/development or all
- UniVerse release
- number of users, type of application (generic)
- guest operating system and release, kernel, glibc etc.
- VMware Tools
AND what you mean by the last part.

This would be useful information, as at the moment I am planning with a client a possible move from RHEL 4.0 onto a VMware platform for all environments: production, DR and dev.

Cheers,
Phil.

-----Original Message-----
From: u2-users-boun...@listserver.u2ug.org [mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of Augusto Alonso
Sent: Monday, 8 June 2009 11:16 p.m.
To: U2 Users List
Subject: Re: [U2] Tracing UV system calls, linux

Hi.

We had a similar issue with UniVerse under VMware Infrastructure, but it was a telnet issue. We solved it after:
- patching the UniVerse host to the latest OS patch level,
- installing the latest VMware Tools,
- AND making sure that the host operating system was exactly described in the ESX server.

HTH
Augusto
Re: [U2] Tracing UV system calls, linux
-----Original Message-----
From: u2-users-boun...@listserver.u2ug.org [mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of John Hester
Sent: Friday, June 05, 2009 11:35 AM
To: U2 Users List
Subject: [U2] Tracing UV system calls, linux

Does anyone know of a way to trace system calls made by UV as a non-root user? Here's the problem we have: UV on RH ES 5.1 joined to a W2K3 native mode AD domain. We have an AD issue that causes UV to either fail to execute, or die before the user can enter the environment. This doesn't happen to all users, and appears to be random, but affects more users over time. Usually a user can get into UV eventually after repeated attempts. Once they're logged in, everything's fine.

---

Update - turns out this is not an AD issue after all (at least not directly, anyway), it's an issue with specific /dev/pts/... terminal device files. I wasn't seeing it with non-AD UOJ connections because they don't use a tty device. Once I realized it only affected a specific tty device, I was able to reproduce it with any login, including root. I noticed that 2 UV sessions were showing in PORT.STATUS with ? as their PID and a blank tty device. uvlictool clean_lic -a got back 2 seats. I restarted UV after that, and now the problem's gone. Unfortunately I didn't think to try to reproduce the problem before restarting, so I'm not sure if cleaning up the licensing alone fixed it.

Anyone know what might suddenly cause UV tty sessions to hang in an ambiguous state on exit?

Thanks,
John
[U2] Tracing UV system calls, linux
Does anyone know of a way to trace system calls made by UV as a non-root user? Here's the problem we have: UV on RH ES 5.1 joined to a W2K3 native mode AD domain. We have an AD issue that causes UV to either fail to execute, or die before the user can enter the environment. This doesn't happen to all users, and appears to be random, but affects more users over time. Usually a user can get into UV eventually after repeated attempts. Once they're logged in, everything's fine.

The current UV server has been in production for over a year with no AD issues. Nothing has changed on the UV server. Prior to that we ran UV on RH AS 3.0 joined to the same domain for 3+ years without issue. We virtualized one of our domain controllers on VMware ESX in October, and have had no issues between then and now.

Kerberos authentication always works. The user logs in OK at the OS level, but UV will not execute. I suspect a user or group permission problem, but at the OS level, all of the various AD connectivity validation methods work OK (id, wbinfo -i, wbinfo -u, wbinfo -g, getent passwd, getent group). The permissions on the UV directory are rwxrwxr-x, and the group ownership is the AD domain users group. I tried adding world write permissions in our development account, but that didn't help.

When this issue first happened a few weeks ago, rebooting all 3 domain controllers made the problem disappear for a little over 2 weeks. When it recurred, rebooting only the domain controllers didn't work, but rebooting them along with the UV server got us by for 4 days. The Windows admin also fixed an AD replication problem at that time.

RH ES 5.1 doesn't have strace installed. It has autrace, which is supposedly similar, but looks like it can only be run as root. I've verified that if I run UV as a local user, it will work. Our web app server uses a local user ID for UOJ connections, and the UOJ connections always work. I need some way to determine at what point the UV executable is dying to determine which system call is being affected by AD, and that requires executing it as an AD user.

Another thought I had for a workaround was to change the ownership of the UV executable to a local /etc/passwd user who has the domain users group #, and use chmod +s to make uv run as that user. Does anyone know if that would cause problems? Also, would anything break if I copied the uv executable to something like uv_test so I could try this with a test login and not affect the entire server?

Thanks,
John
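
If strace can be installed on the server, a non-root user can normally trace a process they start themselves. Below is a minimal sketch; the wrapper name `trace_syscalls` and the output path are hypothetical and not from this thread. One caveat that matters here: when a non-root user traces a setuid executable, the kernel runs it without the elevated privileges, so a traced uv may behave differently from an untraced one.

```shell
#!/bin/sh
# Sketch only: wrapper name and output path are illustrative assumptions.

# Run a command under strace, writing the system call trace to out_file.
# Returns 127 if strace is not installed or not on the PATH.
trace_syscalls() {
    out_file=$1; shift
    if ! command -v strace >/dev/null 2>&1; then
        echo "strace not found on PATH" >&2
        return 127
    fi
    # -f follows child processes; -o writes the trace to a file
    # instead of mixing it into the program's own terminal output.
    strace -f -o "$out_file" "$@"
}

# Example call (commented out), run while logged in as an affected AD user:
#   trace_syscalls /tmp/uv.trace "$(cat /.uvhome)/bin/uv"
```

The last few lines of the trace file usually show the call that failed just before the process exited, which is the point the original question is trying to find.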