Hi there,

I've implemented the following script, which seems to work nicely for us. However, we've seen some cases where the client is hung but is reported as IDLE in the utsession -p output, so this will probably also need a script that restarts idle sessions from time to time (say, once every 2 hours); after cycling, the session should report the correct value. Odd, though.
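To make the detection rule explicit, here is the check the script below relies on, pulled out as a standalone function so it can be tried without a Sun Ray server. This is only a sketch: the function name is_hung is made up for the example, and the sample values ('???', U, IU) are the ones Greg mentioned for the utsession -p columns.

```shell
# is_hung LOGIN STATE
# Hypothetical helper (not part of the Sun Ray tools): succeeds when a row
# has no associated Unix login ('???') and its state does not start with
# 'I' (the idle flag), which is the rule the script below applies.
is_hung() {
    [ "$1" = '???' ] || return 1
    case "$2" in
        I*) return 1 ;;   # reported as idle: the rule skips it
        *)  return 0 ;;   # no login and not idle: treated as hung
    esac
}

is_hung '???' 'U'  && echo "U: would be restarted"
is_hung '???' 'IU' || echo "IU: skipped"
```

The second call shows the gap described above: a hung client that still reports an idle state is skipped by this rule, which is why cycling idle sessions periodically may still be needed.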
Again, thanks for your help!

#!/bin/bash

# Paths to commands
LOGGER=`which logger`
UTSESSIONCMD=/opt/SUNWut/sbin/utsession
UTDESKTOPCMD=/opt/SUNWut/sbin/utdesktop
UTLOADCMD=/opt/SUNWut/lib/utload

IFS=$'\n'

# The following loop returns every session in "SID Login Status" format
for dtu in `$UTSESSIONCMD -p | tr -s ' ' | cut -d' ' -f1,3,5`; do
    # The first returned row is empty, we discard it
    [ -z "$dtu" ] && continue

    # Acquire the needed values for each row
    sid=`echo "$dtu" | cut -d' ' -f1`
    login=`echo "$dtu" | cut -d' ' -f2`
    state=`echo "$dtu" | cut -d' ' -f3`

    # The utsession output doesn't seem to be meant for scripting, so some
    # rows carry no useful info (header and separator rows).
    [ "$sid" == "Token" ] && continue
    [[ $sid =~ ----* ]] && continue

    # If the DTU has no associated login and doesn't have the IDLE flag
    # either, it is probably hung. We restart it!
    if [ "$login" == '???' ] && ! [[ $state == I* ]]; then
        $UTDESKTOPCMD -d `echo "$sid" | cut -d'.' -f2`
        $UTLOADCMD -r -t "$sid"
        $UTSESSIONCMD -k -t "$sid"
        $LOGGER -t hung_session "Restarting $sid as it seems to be hung"
        echo "DTU $sid has been restarted"
    fi
done

exit 0

2014-10-03 19:42 GMT+01:00 James Michels <karma.sometimes.hu...@gmail.com>:

> Greg,
>
> 2014-10-03 19:32 GMT+01:00 Rodenhiser, Greg <grode...@holycross.edu>:
>
>> The other way I've seen, even before a power cycle, is via utsession -p.
>> If the Unix session column is ??? but the state is U (as opposed to IU) it
>> is the Sunray session(s) that are hung on 26B. At 4AM every day I run
>> utstop/utstart via cron, and check every morning for the hung session. In
>> our case it's always just a single display out of the 35 we use.
>>
> This seems very promising! I'll check on Monday and post some feedback
> about it, along with the script I've implemented to kill hung sessions.
>
> Thank you all, your tips will most probably solve my issue.
>
> Happy weekend,
>
> James
>
>
>> On Fri, Oct 3, 2014 at 2:14 PM, James Michels <
>> karma.sometimes.hu...@gmail.com> wrote:
>>
>>> Hello Greg,
>>>
>>> 2014-10-03 19:01 GMT+01:00 Rodenhiser, Greg <grode...@holycross.edu>:
>>>
>>>> I may have a way to auto detect a hung session (at least for our 26B
>>>> hangs it works). Run a utwho -a. Any session that is owned by root is a
>>>> hung session for us (provided root is not actually logged into any of our
>>>> Sunray sessions). The fact that it's owned by root seems to block Xnewt
>>>> from spinning up on that display, giving the dreaded 26B (and maybe D)?
>>>>
>>> This is the way we use, but it only works when the client has already
>>> been power-cycled :-( After that, the utwho -ca command shows that MAC
>>> being owned by root as you describe, but only after that fact. Until
>>> power-cycling, it appears to be an idle session. That's the reason why I
>>> was tinkering with utquery to see its parameters, but they're the same as
>>> for a "sane" session. Even the packets being sent from that hung session
>>> seem to be the same as those of a normal session, so its camouflage is
>>> brilliant :-)
>>>
>>> Thanks for the tip!
>>>
>>> James
>>>
>>>> On Fri, Oct 3, 2014 at 1:51 PM, James Michels <
>>>> karma.sometimes.hu...@gmail.com> wrote:
>>>>
>>>>> Hello all,
>>>>>
>>>>> As promised, I'm posting some feedback about your idea. In conclusion,
>>>>> a mix of both Scott's and Daniel's ideas has *almost*-worked for me.
>>>>> When using just the utload & utsession commands, the thing didn't seem to
>>>>> work. However, I tried adding/removing things and tracing their effects
>>>>> and the conclusions are the following:
>>>>>
>>>>>    - The command combination that seems to work for me is "utdesktop
>>>>>    -d XXXXXXXXXXXX" + "utload -r -t pseudo.XXXXXXXXXXXX" + "utsession -k -t
>>>>>    pseudo.XXXXXXXXXXXX".
>>>>>    - The strange thing is that, run just once, the client still seems
>>>>>    to be hung after reboot. Only after running the same commands 5-6
>>>>>    times does the client seem to start correctly. That seems pretty
>>>>>    strange to me, as the three commands are run exactly the same way
>>>>>    each of the 5-6 times.
>>>>>    - This combination implies Scott's solution, there's just one path
>>>>>    left: /tmp/SUNWut/kiosk/:DISPLAY..., so I made a script that
>>>>>    additionally removes that path and added it to the 3 commands above.
>>>>>    - As for my aim of detecting hung sessions without first restarting
>>>>>    the client, it seems that it won't work the way I meant. The utquery
>>>>>    command returns correct values on the cmdcachesize parameter before
>>>>>    rebooting, so this won't work. Now I'm still looking for a way to
>>>>>    detect those hung clients without telling our workers to power cycle
>>>>>    them manually.
>>>>>
>>>>> This has been a big help as at least I can reset the hung state of the
>>>>> clients now and I don't need to wait for the nightly utstart -c, so thank
>>>>> you Scott and Daniel.
>>>>>
>>>>> If someone has an idea on how to detect a hung client without needing
>>>>> to power cycle it, I'll be very grateful.
>>>>>
>>>>> James
>>>>>
>>>>>
>>>>> 2014-10-02 20:25 GMT+01:00 James Michels <
>>>>> karma.sometimes.hu...@gmail.com>:
>>>>>
>>>>>> Hello Daniel,
>>>>>>
>>>>>> 2014-10-02 18:51 GMT+01:00 Beckman, Daniel <d...@loc.gov>:
>>>>>>
>>>>>>> This may be a different issue, but with SRS 5.3.1 on Solaris 10,
>>>>>>> when we run "utdesktop -lw" we sometimes show units that can't get a
>>>>>>> session and are "stuck". I think they are 26Ds as well.
>>>>>>>
>>>>>>> On an individual DTU basis, to fix remotely we issue this:
>>>>>>>
>>>>>>> /opt/SUNWut/lib/utload -r -t pseudo.00144fd6d4c7 && utsession -k -t pseudo.00144fd6d4c7
>>>>>>>
>>>>>>> Where "pseudo.xxx.." is the identifier that shows up in output of
>>>>>>> "utdesktop -lw".
>>>>>>>
>>>>>>
>>>>>> In our case the 26D seems to be a bit more complicated to detect (not
>>>>>> sure if OL and Sun base might have different behaviors): when a session
>>>>>> gets hung, initially it's still recognized as a valid session and doesn't
>>>>>> appear in the -lw list until someone power cycles the client. Then the
>>>>>> 26D is shown again, but this time it appears in the -lw list. Our aim is
>>>>>> to find a way to discover those hung sessions in their 'initial' state,
>>>>>> where the SRSS doesn't catalogue them as hung yet.
>>>>>>
>>>>>> I have a possible idea, but I still have to test it quite a lot. It's
>>>>>> based on the output of the utquery command, specifically on the
>>>>>> cmdcachesize parameter, which seems to be 0 when the session is hung,
>>>>>> but as I said, I'm not quite sure of this yet.
>>>>>>
>>>>>> I've tried using the 'utsession -k -t' command, but it doesn't seem
>>>>>> to "unstick" the session by itself. However, I've not tried it in
>>>>>> combination with the 'utload' command, so I'll also try that tomorrow
>>>>>> and see its behavior.
>>>>>>
>>>>>> Thanks very much for that idea, too.
>>>>>>
>>>>>> James
>>>>>>
>>>>>>
>>>>>>> What that will do is "unstick" (for lack of a better term) the
>>>>>>> session associated with that DTU and then reboot it.
>>>>>>>
>>>>>>> To make things easier we have a simple script that asks for the
>>>>>>> identifier:
>>>>>>>
>>>>>>> #!/usr/bin/bash
>>>>>>>
>>>>>>> read -p "Enter the Desktop ID: " DesktopID
>>>>>>>
>>>>>>> /opt/SUNWut/lib/utload -r -t pseudo.$DesktopID && utsession -k -t pseudo.$DesktopID
>>>>>>>
>>>>>>> To make things automated we have a script that runs via cron by the
>>>>>>> hour:
>>>>>>>
>>>>>>> #!/usr/bin/bash
>>>>>>>
>>>>>>> # Set variable for "DesktopID" based on output of utdesktop -lw
>>>>>>>
>>>>>>> DesktopID=$(utdesktop -lw | awk 'NR>3 && NR<5 {print $1}' )
>>>>>>>
>>>>>>> # Only continue if "utdesktop -lw" reports a hung session, indicated
>>>>>>> # by existence of ID starting with 00
>>>>>>>
>>>>>>> if [[ "$DesktopID" == 00* ]]
>>>>>>> then
>>>>>>>     echo "There's hung sessions -- fixing them..."
>>>>>>>     /opt/SUNWut/lib/utload -r -t pseudo.$DesktopID && utsession -k -t pseudo.$DesktopID
>>>>>>> else
>>>>>>>     echo "No hung sessions -- we're done here!"
>>>>>>> fi
>>>>>>>
>>>>>>> I'm not a programmer so I apologize if the scripts are a bit crude,
>>>>>>> but they work for us.
>>>>>>>
>>>>>>> Hope that helps!
>>>>>>>
>>>>>>> Best,
>>>>>>>
>>>>>>> Daniel
>>>>>>>
>>>>>>>
>>>>>>> *From:* James Michels [mailto:karma.sometimes.hu...@gmail.com]
>>>>>>> *Sent:* Thursday, October 02, 2014 8:33 AM
>>>>>>> *To:* sunray-users@filibeto.org
>>>>>>> *Subject:* [SunRay-Users] 26D and ability to effectively erase
>>>>>>> sessions
>>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> We're getting spontaneous and unpredictable 26D screens sometimes.
>>>>>>> This doesn't happen very often, but what's worrying is that we're
>>>>>>> unable to reset the affected client's state.
>>>>>>>
>>>>>>> We've tried to reset the client using the utsession -k -t command,
>>>>>>> and also utdisplay -d, and both of them seem ineffective: when
>>>>>>> rebooted, the client remains in the same 26D state.
>>>>>>>
>>>>>>> The only thing that helps is a complete server reboot.
>>>>>>>
>>>>>>> When the client reconnects to the server we're seeing this in the
>>>>>>> log so maybe it's related:
>>>>>>>
>>>>>>> Oct 2 12:43:52 srss7 utauthd: search_for_entries(): Found multiple matching entries, was expecting a single match
>>>>>>>
>>>>>>> I deduce that the session is not being cleaned up entirely, so
>>>>>>> here's my question:
>>>>>>>
>>>>>>> Is there an *effective* way to completely wipe the information from
>>>>>>> a client? Something like this must be possible, otherwise a complete
>>>>>>> server restart wouldn't help either.
>>>>>>>
>>>>>>> I don't mind connecting to the local LDAP server and deleting
>>>>>>> 'something' by hand, but I'd like to know a way. We're running OL6.5.
>>>>>>>
>>>>>>> Thank you.
>>>>>>>
>>>>>>> James
>>>>
>>>> --
>>>> Greg Rodenhiser
>>>> Technical Services Engineer
>>>> College of the Holy Cross
_______________________________________________
SunRay-Users mailing list
SunRay-Users@filibeto.org
http://www.filibeto.org/mailman/listinfo/sunray-users