Greg,

2014-10-03 19:32 GMT+01:00 Rodenhiser, Greg <grode...@holycross.edu>:

> The other way I've seen, even before a power cycle, is via utsession -p.
> If the Unix session column is ??? but the state is U (as opposed to IU), it
> is the Sunray session(s) that are hung on 26B.  At 4 AM every day I run
> utstop/utstart via cron, and check every morning for the hung session.  In
> our case it's always just a single display out of the 35 we use.
>
>
This seems very promising! I'll check on Monday and post some feedback
about it, along with the script I've implemented to kill hung sessions.
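
A minimal sketch of how the utsession -p check described above could be
scripted. It assumes (unverified) that the state is the last
whitespace-separated column of the output and that the Unix session ID is
the third from last; the function name find_hung_sessions is made up for
illustration, so the field positions should be adjusted to whatever the
installed SRSS version actually prints.

```shell
#!/usr/bin/bash
# Sketch only: filter "utsession -p" output for sessions that look hung.
# ASSUMPTION: last column = state, third-from-last column = Unix session ID.
# find_hung_sessions is a hypothetical helper name, not an SRSS tool.

find_hung_sessions() {
    # Reads utsession -p output on stdin.
    # Prints "<token> <display>" for every row whose Unix session ID is
    # "???" while the state is exactly "U" (not "IU").
    awk 'NF >= 4 && $NF == "U" && $(NF-2) == "???" { print $1, $(NF-1) }'
}
```

It would be used as "utsession -p | find_hung_sessions", for example from
the same cron job that already runs the nightly restart.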

Thank you all, your tips will most probably solve my issue.
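
For reference, the three-command reset quoted further down in this thread
could be wrapped roughly like this. It is a sketch under assumptions: the
names reset_dtu, run and DRY_RUN are invented for illustration, the ID is
taken to be the 12-hex-digit value shown by utdesktop -lw, and the utload
path is the one from Daniel's message.

```shell
#!/usr/bin/bash
# Sketch only: reset one DTU with the command combination from this thread.
# reset_dtu, run and DRY_RUN are hypothetical names chosen for this example.

run() {
    # With DRY_RUN=1, print the command instead of executing it.
    if [[ ${DRY_RUN:-0} == 1 ]]; then
        echo "$@"
    else
        "$@"
    fi
}

reset_dtu() {
    local id=$1
    # ASSUMPTION: the desktop ID is a 12-digit hex MAC address.
    if [[ ! $id =~ ^[0-9a-fA-F]{12}$ ]]; then
        echo "invalid desktop ID: $id" >&2
        return 1
    fi
    # Stop at the first command that fails.
    run utdesktop -d "$id" &&
        run /opt/SUNWut/lib/utload -r -t "pseudo.$id" &&
        run utsession -k -t "pseudo.$id"
}
```

Running "DRY_RUN=1 reset_dtu 00144fd6d4c7" only prints the three commands,
which is a safe way to check them first. The sketch runs the sequence once;
as noted in the quoted message, several runs were needed in practice.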

Happy weekend,

James


> On Fri, Oct 3, 2014 at 2:14 PM, James Michels <
> karma.sometimes.hu...@gmail.com> wrote:
>
>> Hello Greg,
>>
>> 2014-10-03 19:01 GMT+01:00 Rodenhiser, Greg <grode...@holycross.edu>:
>>
>>> I may have a way to auto detect a hung session (at least for our 26B
>>> hangs it works).  Run a utwho -a.  Any session that is owned by root is a
>>> hung session for us (provided root is not actually logged into any of our
>>> Sunray sessions).  The fact that it's owned by root seems to block Xnewt
>>> from spinning up on that display, giving the dreaded 26B (and maybe D)?
>>>
>>>
>> This is the approach we use, but it only works once the client has already
>> been power-cycled :-( After that, the utwho -ca command shows that MAC as
>> owned by root, as you describe, but only after the fact. Until the
>> power cycle, it appears to be an idle session. That's why I was tinkering
>> with utquery to look at its parameters, but they're the same as for a
>> "sane" session. Even the packets sent from the hung session seem to be
>> the same as those of a normal session, so its camouflage is brilliant :-)
>>
>> Thanks for the tip!
>>
>> James
>>
>>
>>
>>> On Fri, Oct 3, 2014 at 1:51 PM, James Michels <
>>> karma.sometimes.hu...@gmail.com> wrote:
>>>
>>>> Hello all,
>>>>
>>>> As promised, I'm posting some feedback about your ideas. In short, a mix
>>>> of Scott's and Daniel's ideas has *almost* worked for me. Using just the
>>>> utload and utsession commands didn't seem to work. However, I tried
>>>> adding and removing steps and tracing their effects, and my conclusions
>>>> are the following:
>>>>
>>>>    - The command combination that seems to work for me is "utdesktop
>>>>    -d XXXXXXXXXXXX" + "utload -r -t pseudo.XXXXXXXXXXXX" + "utsession -k -t
>>>>    pseudo.XXXXXXXXXXXX".
>>>>    - The strange thing is that when run just once, the client still seems
>>>>    to be hung after reboot. Only after running the same commands 5-6 times
>>>>    does the client start correctly. That seems pretty strange to me, as
>>>>    the three commands are run in exactly the same way each time.
>>>>    - This combination already covers Scott's solution; there's just one
>>>>    path left: /tmp/SUNWut/kiosk/:DISPLAY..., so I made a script that
>>>>    additionally removes that path and added it to the 3 commands above.
>>>>    - As for my goal of detecting hung sessions before the client is
>>>>    restarted for the first time, it seems it won't work the way I
>>>>    intended. The utquery command returns correct values for the
>>>>    cmdcachesize parameter before rebooting, so that check is no use. I'm
>>>>    still looking for a way to detect those hung clients without asking
>>>>    our workers to power cycle them manually.
>>>>
>>>> This has been a big help as at least I can reset the hung state of the
>>>> clients now and I don't need to wait for the nightly utstart -c, so thank
>>>> you Scott and Daniel.
>>>>
>>>> If someone has an idea on how to detect a hung client without needing
>>>> to power cycle them, I'll be very grateful.
>>>>
>>>> James
>>>>
>>>>
>>>> 2014-10-02 20:25 GMT+01:00 James Michels <
>>>> karma.sometimes.hu...@gmail.com>:
>>>>
>>>>> Hello Daniel,
>>>>>
>>>>> 2014-10-02 18:51 GMT+01:00 Beckman, Daniel <d...@loc.gov>:
>>>>>
>>>>>> This may be a different issue, but with SRS 5.3.1 on Solaris 10, when
>>>>>> we run “utdesktop -lw” we sometimes see units that can’t get a session
>>>>>> and are “stuck”. I think they are 26Ds as well.
>>>>>>
>>>>>>
>>>>>>
>>>>>> On an individual DTU basis, to fix remotely we issue this:
>>>>>>
>>>>>>
>>>>>>
>>>>>> /opt/SUNWut/lib/utload -r -t pseudo.00144fd6d4c7 &&
>>>>>>     utsession -k -t pseudo.00144fd6d4c7
>>>>>>
>>>>>>
>>>>>>
>>>>>> Where “pseudo.xxx..” is the identifier that shows up in the output of
>>>>>> “utdesktop -lw”.
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> In our case the 26D seems to be a bit more complicated to detect (not
>>>>> sure if OL and Sun base might have different behaviors): when a session
>>>>> gets hung, it's initially still recognized as a valid session and doesn't
>>>>> appear in the -lw list until someone power cycles the client. Then the 26D
>>>>> is shown again, but this time it appears in the -lw list. Our aim is to
>>>>> find a way to discover those hung sessions in their 'initial' state,
>>>>> where the SRSS doesn't catalogue them as hung yet.
>>>>>
>>>>> I have a possible idea, but I still have to test it quite a lot. It's
>>>>> based on the output of the utquery command, specifically on the
>>>>> cmdcachesize parameter, which seems to be 0 when the session is hung, but
>>>>> as I said, I'm not quite sure of this yet.
>>>>>
>>>>> I've tried using the 'utsession -k -t' command, but it doesn't seem to
>>>>> "unstick" the session by itself. However, I haven't tried it in
>>>>> combination with the 'utload' command, so I'll try that tomorrow and see
>>>>> how it behaves.
>>>>>
>>>>> Thanks very much for that idea, too.
>>>>>
>>>>> James
>>>>>
>>>>>
>>>>>> What that will do is “unstick” (for lack of a better term) the
>>>>>> session associated with that DTU and then reboot it.
>>>>>>
>>>>>>
>>>>>>
>>>>>> To make things easier we have a simple script that asks for the
>>>>>> identifier:
>>>>>>
>>>>>>
>>>>>>
>>>>>> #!/usr/bin/bash
>>>>>>
>>>>>> read -p "Enter the Desktop ID: " DesktopID
>>>>>>
>>>>>> /opt/SUNWut/lib/utload -r -t "pseudo.$DesktopID" &&
>>>>>>     utsession -k -t "pseudo.$DesktopID"
>>>>>>
>>>>>>
>>>>>>
>>>>>> To make things automated we have a script that runs via cron by the
>>>>>> hour:
>>>>>>
>>>>>>
>>>>>>
>>>>>> #!/usr/bin/bash
>>>>>>
>>>>>> # Set "DesktopID" from the output of utdesktop -lw. Only line 4 of the
>>>>>> # output is read, so at most one DTU is handled per run.
>>>>>>
>>>>>> DesktopID=$(utdesktop -lw | awk 'NR==4 {print $1}')
>>>>>>
>>>>>> # Only continue if "utdesktop -lw" reports a hung session, indicated
>>>>>> # by the existence of an ID starting with 00
>>>>>>
>>>>>> if [[ "$DesktopID" == 00* ]]
>>>>>>
>>>>>> then
>>>>>>
>>>>>>     echo "Found a hung session -- fixing it..."
>>>>>>
>>>>>>     /opt/SUNWut/lib/utload -r -t "pseudo.$DesktopID" &&
>>>>>>         utsession -k -t "pseudo.$DesktopID"
>>>>>>
>>>>>> else
>>>>>>
>>>>>>     echo "No hung sessions -- we're done here!"
>>>>>>
>>>>>> fi
>>>>>>
>>>>>>
>>>>>>
>>>>>> I’m not a programmer so I apologize if the scripts are a bit crude –
>>>>>> but they work for us.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Hope that helps!
>>>>>>
>>>>>>
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Daniel
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> *From:* James Michels [mailto:karma.sometimes.hu...@gmail.com]
>>>>>> *Sent:* Thursday, October 02, 2014 8:33 AM
>>>>>> *To:* sunray-users@filibeto.org
>>>>>> *Subject:* [SunRay-Users] 26D and ability to effectively erase
>>>>>> sessions
>>>>>>
>>>>>>
>>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> We're sometimes getting spontaneous and unpredictable 26D screens.
>>>>>> This doesn't happen very often, but what's worrying is that we're unable
>>>>>> to reset the affected client's state.
>>>>>>
>>>>>> We've tried to reset the client using the utsession -k -t command and
>>>>>> also utdisplay -d, but both seem ineffective: when rebooted, the
>>>>>> client remains in the same 26D state.
>>>>>>
>>>>>> The only thing that helps is a complete server reboot.
>>>>>>
>>>>>> When the client reconnects to the server we're seeing this in the log
>>>>>> so maybe it's related:
>>>>>>
>>>>>> Oct  2 12:43:52 srss7 utauthd: search_for_entries(): Found multiple
>>>>>> matching entries, was expecting a single match
>>>>>>
>>>>>>
>>>>>>
>>>>>> I deduce that the session is not being cleaned up entirely, so here's
>>>>>> my question:
>>>>>>
>>>>>>
>>>>>>
>>>>>> Is there an *effective* way to completely wipe a client's
>>>>>> information? Something like this must be possible, otherwise a complete
>>>>>> server restart wouldn't help either.
>>>>>>
>>>>>>
>>>>>>
>>>>>> I don't mind connecting to the local LDAP server and deleting
>>>>>> 'something' by hand, but I'd like to know a way. We're running OL6.5.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Thank you.
>>>>>>
>>>>>>
>>>>>>
>>>>>> James
>>>>>>
>>>>>> _______________________________________________
>>>>>> SunRay-Users mailing list
>>>>>> SunRay-Users@filibeto.org
>>>>>> http://www.filibeto.org/mailman/listinfo/sunray-users
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>>
>>>
>>> Greg Rodenhiser
>>> Technical Services Engineer
>>> College of the Holy Cross
