Re: [SunRay-Users] Connection reset, reauthenticatingDuplicateTID

2014-09-23 Thread Rodenhiser, Greg
We are having the exact same issue; luckily it's a single session that
hangs (only a single root-owned session in utwho -ac, and it's been :11).  I
have an open ticket with Oracle about this issue.  Has anyone found a way to
clear those root-owned sessions?

On Tue, Sep 23, 2014 at 1:15 PM, Nicolás  wrote:

> On 23/09/2014 at #4, Nicolás wrote:
>
>> That's something I've not tried yet and it's a good idea. I'll trace the
>> traffic for that client and verify what kind of traffic is being sent
>> between client and server. I've seen that this issue already appeared in
>> 2010 but still has no clear diagnosis, and I don't know how to reproduce
>> it on demand, but once I do I'll post some updates.
>>
>> I've seen a similar issue on a blog which describes a problem with the
>> ARP table on the server side
>> (http://www.planetgeek.ch/2012/04/02/sunray-terminal-unexpected-reboot/).
>> I'll check the ARP table as well and see if it helps to determine the
>> problem.
>>
>
> Today we've been "lucky" and had about 8-9 clients hung with this issue,
> so I ran further tests, and here are the conclusions:
>
> * Yesterday we upgraded from 5.4.0 to 5.4.3; this didn't fix the issue.
> * The ARP table has nothing to do with this. The ARP table is OK.
> * Even after using 'utsession -k -t', the hung client keeps sending
> KeepAlive packets like these:
>
> keepAliveReq _=1 byteCount=0 connTime=5722.63
> fw=11.1.3.0_26_2013.10.28.09.53,Boot:MfgPkg_4.15_2006.07.20.16.57;\0402006.07.20-17:04:56-PDT
> hw=SunRayP8 idleTime=5308.54 latency=869 lossCount=0 namespace=IEEE802
> pktCount=0 pn=38837 sn=00144f3c8156 state=connected.
>
> * We discovered we can get a list of hung clients via the 'utwho'
> command, as these are marked as 'root' sessions.
>
>  33 pseudo.00144fXX  root
>  34 pseudo.00144fXX  root
>  36 pseudo.00144fXX  root
>  37 pseudo.00144fXX  root
>  38 pseudo.00144fXX  root
>  39 pseudo.00144fXX  root
>  40 pseudo.00144fXX  root
>
> * What is even more worrying: today I had an idle client right next to me
> for about 1 hour; it simply destroyed the GDM greeter and got stuck at the
> 26D screen, without any human action.
>
> * I guess a 'utsession -k -t' command would destroy the session, but
> despite this the client would keep sending the above keepalive packets.
> Could this be fixed by killing the corresponding GDM process for a client?
> I also don't know how to find out which gdm process is attached to a
> user session. Is there a way to tell?
>
> Regards.
> ___
> SunRay-Users mailing list
> SunRay-Users@filibeto.org
> http://www.filibeto.org/mailman/listinfo/sunray-users
>



-- 


Greg Rodenhiser
Technical Services Engineer
College of the Holy Cross


Re: [SunRay-Users] Connection reset, reauthenticatingDuplicateTID

2014-09-23 Thread Nicolás

On 23/09/2014 at #4, Nicolás wrote:
That's something I've not tried yet and it's a good idea. I'll trace 
the traffic for that client and verify what kind of traffic is being 
sent between client and server. I've seen that this issue already 
appeared in 2010 but still has no clear diagnosis, and I don't know 
how to reproduce it on demand, but once I do I'll post some updates.


I've seen a similar issue on a blog which describes a problem with 
the ARP table on the server side 
(http://www.planetgeek.ch/2012/04/02/sunray-terminal-unexpected-reboot/). 
I'll check the ARP table as well and see if it helps to determine 
the problem.


Today we've been "lucky" and had about 8-9 clients hung with this issue, 
so I ran further tests, and here are the conclusions:

* Yesterday we upgraded from 5.4.0 to 5.4.3; this didn't fix the issue.
* The ARP table has nothing to do with this. The ARP table is OK.
* Even after using 'utsession -k -t', the hung client keeps sending 
KeepAlive packets like these:


   keepAliveReq _=1 byteCount=0 connTime=5722.63 
fw=11.1.3.0_26_2013.10.28.09.53,Boot:MfgPkg_4.15_2006.07.20.16.57;\0402006.07.20-17:04:56-PDT 
hw=SunRayP8 idleTime=5308.54 latency=869 lossCount=0 namespace=IEEE802 
pktCount=0 pn=38837 sn=00144f3c8156 state=connected.


* We discovered we can get a list of hung clients via the 'utwho' 
command, as these are marked as 'root' sessions.


 33 pseudo.00144fXX  root
 34 pseudo.00144fXX  root
 36 pseudo.00144fXX  root
 37 pseudo.00144fXX  root
 38 pseudo.00144fXX  root
 39 pseudo.00144fXX  root
 40 pseudo.00144fXX  root

* What is even more worrying: today I had an idle client right next to me 
for about 1 hour; it simply destroyed the GDM greeter and got stuck at the 
26D screen, without any human action.


* I guess a 'utsession -k -t' command would destroy the session, but 
despite this the client would keep sending the above keepalive 
packets. Could this be fixed by killing the corresponding GDM process 
for a client? I also don't know how to find out which gdm process is 
attached to a user session. Is there a way to tell?
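
For anyone who wants to watch these two symptoms over time, here is a minimal Python sketch. It assumes the keepalive status line and the 'utwho' listing look exactly like the excerpts above (three columns: display number, token, user, with the user last); real output may carry extra columns. The function names parse_keepalive and hung_sessions are made up for this example and are not part of Sun Ray Software.

```python
def parse_keepalive(line):
    """Split a keepAliveReq status line into a dict of key=value fields."""
    fields = {}
    # Drop the trailing period that ends the status line in the excerpt.
    for token in line.strip().rstrip(".").split():
        key, sep, value = token.partition("=")
        if sep:  # skip bare words such as the leading "keepAliveReq"
            fields[key] = value
    return fields


def hung_sessions(utwho_output):
    """Return (display, token) pairs for sessions whose owner is root.

    Assumes the layout shown in the post: display, token, user.
    """
    hung = []
    for line in utwho_output.splitlines():
        parts = line.split()
        if len(parts) >= 3 and parts[0].isdigit() and parts[-1] == "root":
            hung.append((int(parts[0]), parts[1]))
    return hung


if __name__ == "__main__":
    sample = " 33 pseudo.00144fXX  root\n 34 pseudo.00144fXX  root\n"
    for display, token in hung_sessions(sample):
        print(display, token)
```

Comparing idleTime and connTime from successive keepalive lines for a token reported by hung_sessions might help confirm that a unit is still connected at the protocol level while its session is gone.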


Regards.


Re: [SunRay-Users] Connection reset, reauthenticatingDuplicateTID

2014-09-22 Thread Nicolás

Hi Nicolas,

First off, thanks for your kind help!

On 23/09/2014 at #4, Nicolas Schier wrote:

Dear Nicolás,


Recently we moved the Sun Ray clients to run against the production
server, resulting in some of them running a graphical session without
any faults, and some of them getting stuck at the '26D' screen. [...]
I read that this could be due to the inability to renew the DHCP
lease; however, the SRS server is *not* configured as a DHCP
server, we have a dedicated server for that.

Unfortunately I do not have a good idea of what might cause such
logs.  But I'll try asking some questions, hoping that one of them
might be helpful in searching further...

  - Did you check that your DHCP server is still working as expected?
Might some IPs be assigned non-exclusively?


Yes, it is. I must emphasize that we don't use the Sun Ray DHCP server 
(utadm) but a dedicated DHCP server, and it's working fine all the 
time. I don't think there is an IP conflict, as the IP addresses for 
our clients are all bound to their MAC addresses.



  - Does the 'realIP' (0adb08e8=10.219.8.232 in your log example) stay
the same or change according to your DHCP configuration?


It's always the same for each client, that's the weird thing. 
However, I've been looking for more information about this, and I see 
people having this problem in the middle of a user's session. In our 
case, it only happens when someone restarts the Sun Ray client or logs 
out (as far as we have seen up to now).
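
The hex-encoded 'realIP' value from the question above can be decoded with a few lines of Python, which may help when comparing log entries against the DHCP bindings. This is only a small sketch; the helper name hex_to_ipv4 is made up for the example.

```python
import ipaddress


def hex_to_ipv4(hex_addr):
    """Convert an 8-digit hex string such as '0adb08e8' to dotted-quad form."""
    return str(ipaddress.IPv4Address(int(hex_addr, 16)))


if __name__ == "__main__":
    print(hex_to_ipv4("0adb08e8"))  # realIP from the log: 10.219.8.232
    print(hex_to_ipv4("0adb08dc"))  # firstServer from the log: 10.219.8.220
```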



  - If you trace the network traffic for a particular 26D-client: can
you verify the connection loss as it is claimed by the server log?


That's something I've not tried yet and it's a good idea. I'll trace 
the traffic for that client and verify what kind of traffic is being 
sent between client and server. I've seen that this issue already 
appeared in 2010 but still has no clear diagnosis, and I don't know 
how to reproduce it on demand, but once I do I'll post some updates.


I've seen a similar issue on a blog which describes a problem with the 
ARP table on the server side 
(http://www.planetgeek.ch/2012/04/02/sunray-terminal-unexpected-reboot/). 
I'll check the ARP table as well and see if it helps to determine the 
problem.


Again, thanks for your help!

Regards,

Nicolás


Good luck and kind regards,
Nicolas






Re: [SunRay-Users] Connection reset, reauthenticatingDuplicateTID

2014-09-19 Thread Nicolás

On 12/09/2014 at #4, Nicolás wrote:

Hi,

I'm running SRS v. 5.4.0.0.44 together with SROS v. 11.1.3.0.26 on an 
Oracle Linux 6.3 distribution. Recently we moved the Sun Ray clients 
to run against the production server, resulting in some of them 
running a graphical session without any faults, and some of them 
getting stuck at the '26D' screen. I know what this state means, but 
what I can't figure out is the meaning of what the server's log shows:


Sep 12 13:29:49 srs utauthd: Worker6 NOTICE: DISCONNECT 
IEEE802.00144f797688, pseudo.00144f797688 reauthenticatingDuplicateTID
Sep 12 13:29:49 srs utauthd: Worker6 UNEXPECTED: during send to: 
java.net.SocketOutputStream@166a22b error=java.net.SocketException: 
Connection reset
Sep 12 13:29:49 srs utauthd: Worker6 NOTICE: DESTROY 
pseudo.00144f797688 lifetime=893640
Sep 12 13:29:49 srs utauthd: Worker6 NOTICE: whichServer 
pseudo.00144f797688:
Sep 12 13:29:49 srs utauthd: Worker6 NOTICE: CLAIMED by 
StartSession.m5 NAME: pseudo.00144f797688 PARAMETERS: 
{terminalIPA=CLIENT.IP.ADDR, type=pseudo, 
fw=11.1.3.0_26_2013.10.28.09.53,Boot:2.0; 2007.08.17-17:32:09-PDT, 
state=disconnected, cause=insert, doamgh=true, barrierLevel=451, 
rawId=00144f797688, terminalCID=IEEE802.00144f797688, MTU=1500, 
tokenSeq=1, firstServer=0adb08dc, namespace=IEEE802, 
keyTypes=dsa-sha1-x1,dsa-sha1, ddcconfig=1, 
clientRand=4izTSm/yqqCyqD4nwX6D1IM1Ng4FwZ5ysyzKx0qS0iy, 
id=00144f797688, realIP=0adb08e8, startRes=1280x1024:1280x1024, 
useReal=true, event=insert, sn=00144f797688, rawType=pseudo, 
clientKeyStatus=unconfirmed, hw=SunRayP8-270, initState=1, _=1}
Sep 12 13:29:49 srs utauthd: Worker6 NOTICE: CONNECT 
IEEE802.00144f797688, pseudo.00144f797688, all connections allowed

Sep 12 13:29:49 srs utauthd: Worker6 NOTICE: MTU = 1500

This kept happening all the time for the same clients, while some 
others were working without any issue. After this, we rebooted the 
server and afterwards all the clients were working, but this is still 
quite worrying because we don't know what is causing it.


In this link I read that this could be due to the inability to renew 
the DHCP lease; however, the SRS server is *not* configured as a DHCP 
server, we have a dedicated server for that.


Any thoughts about what could be causing this and how to solve it are 
welcome.


Regards,

Nicolás




Hi,

Any hint for this? This is starting to be a real pain. I can't find 
any pattern and, what's worse, I can't find a way to "reset" the state of 
a client that has this problem. It seems to happen upon some logout 
actions, and the same lines show up in the log. It doesn't happen for 
the same clients all the time; it seems to happen randomly.


I've tried the following:

* Reconnecting the client doesn't help.
* utsession -k -t pseudo. doesn't help.
* A warm restart doesn't help.
* A cold restart helps temporarily, until it starts happening again after 
some time.


Any hint is really appreciated.

Nicolás


[SunRay-Users] Connection reset, reauthenticatingDuplicateTID

2014-09-12 Thread Nicolás

Hi,

I'm running SRS v. 5.4.0.0.44 together with SROS v. 11.1.3.0.26 on an 
Oracle Linux 6.3 distribution. Recently we moved the Sun Ray clients to 
run against the production server, resulting in some of them running a 
graphical session without any faults, and some of them getting stuck at 
the '26D' screen. I know what this state means, but what I can't figure 
out is the meaning of what the server's log shows:


Sep 12 13:29:49 srs utauthd: Worker6 NOTICE: DISCONNECT 
IEEE802.00144f797688, pseudo.00144f797688 reauthenticatingDuplicateTID
Sep 12 13:29:49 srs utauthd: Worker6 UNEXPECTED: during send to: 
java.net.SocketOutputStream@166a22b error=java.net.SocketException: 
Connection reset
Sep 12 13:29:49 srs utauthd: Worker6 NOTICE: DESTROY pseudo.00144f797688 
lifetime=893640
Sep 12 13:29:49 srs utauthd: Worker6 NOTICE: whichServer 
pseudo.00144f797688:
Sep 12 13:29:49 srs utauthd: Worker6 NOTICE: CLAIMED by StartSession.m5 
NAME: pseudo.00144f797688 PARAMETERS: {terminalIPA=CLIENT.IP.ADDR, 
type=pseudo, fw=11.1.3.0_26_2013.10.28.09.53,Boot:2.0; 
2007.08.17-17:32:09-PDT, state=disconnected, cause=insert, doamgh=true, 
barrierLevel=451, rawId=00144f797688, terminalCID=IEEE802.00144f797688, 
MTU=1500, tokenSeq=1, firstServer=0adb08dc, namespace=IEEE802, 
keyTypes=dsa-sha1-x1,dsa-sha1, ddcconfig=1, 
clientRand=4izTSm/yqqCyqD4nwX6D1IM1Ng4FwZ5ysyzKx0qS0iy, id=00144f797688, 
realIP=0adb08e8, startRes=1280x1024:1280x1024, useReal=true, 
event=insert, sn=00144f797688, rawType=pseudo, 
clientKeyStatus=unconfirmed, hw=SunRayP8-270, initState=1, _=1}
Sep 12 13:29:49 srs utauthd: Worker6 NOTICE: CONNECT 
IEEE802.00144f797688, pseudo.00144f797688, all connections allowed

Sep 12 13:29:49 srs utauthd: Worker6 NOTICE: MTU = 1500

This kept happening all the time for the same clients, while some 
others were working without any issue. After this, we rebooted the 
server and afterwards all the clients were working, but this is still 
quite worrying because we don't know what is causing it.
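
To see which clients are affected and how often, the DISCONNECT lines can be counted per token from a saved copy of the log. The following Python sketch assumes the utauthd line layout shown in the excerpt above; the regular expression and the name count_disconnects are made up for this example and have not been checked against other Sun Ray Software versions.

```python
import re
from collections import Counter

# Matches entries such as:
#   ... utauthd: Worker6 NOTICE: DISCONNECT IEEE802.00144f797688,
#   pseudo.00144f797688 reauthenticatingDuplicateTID
# \s+ also spans line breaks, so mail-wrapped log text still matches.
DISCONNECT_RE = re.compile(
    r"utauthd: \S+ NOTICE: DISCONNECT\s+(?P<cid>\S+),\s+"
    r"(?P<token>\S+)\s+(?P<reason>\S+)"
)


def count_disconnects(log_text, reason="reauthenticatingDuplicateTID"):
    """Count DISCONNECT events with the given reason, keyed by session token."""
    counts = Counter()
    for match in DISCONNECT_RE.finditer(log_text):
        if match.group("reason") == reason:
            counts[match.group("token")] += 1
    return counts


if __name__ == "__main__":
    excerpt = (
        "Sep 12 13:29:49 srs utauthd: Worker6 NOTICE: DISCONNECT "
        "IEEE802.00144f797688, pseudo.00144f797688 "
        "reauthenticatingDuplicateTID\n"
    )
    for token, n in count_disconnects(excerpt).most_common():
        print(token, n)
```

Tokens with a high count are the ones cycling through DISCONNECT and CONNECT, which should correspond to the units sitting at the 26D screen.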


In this link I read that this could be due to the inability to renew 
the DHCP lease; however, the SRS server is *not* configured as a DHCP 
server, we have a dedicated server for that.


Any thoughts about what could be causing this and how to solve it are 
welcome.


Regards,

Nicolás