Ives, Keith-P59429 wrote:
Thanks Bob.
Can you clarify your position on NSCM? I have experimented with this
mode of operation but do not employ it. Why would the use of this
policy affect load balancing at all?
When NSCM is disabled a load-balancing decision is
made at the time the Sun Ray boots up, and a DTU
session is created on whatever server is least
loaded *at that time*. When the user eventually
logs in (without a smartcard), their desktop
session is created on the same server.
When NSCM is enabled, another load balancing
placement occurs *after the user enters their
username* and before the desktop session is
authenticated/created.
This difference in timing can be particularly
significant when a FOG has its servers rebooted.
Whichever servers come online first will likely
receive all the Sun Ray greeter sessions. This
means that without NSCM all non-smartcard users
will eventually log into that small number of
servers.
With NSCM, due to the delay before users will
typically authenticate (unless they're all crouched
over their keyboards ready to pounce and login
immediately ;-), all servers are likely to be
online at that time and proper FOG load balancing
will occur. The effect of the initial NSCM
greeters all being on a single server is typically
minimal, since the greeter is lightweight relative
to a full desktop session.
A similar argument can be made for smartcard
users, who will insert their cards at a later time
when ready to log in. The exception is users who
leave their cards inserted while the servers are
rebooted: that case is just as bad as the DTU case
I mentioned above, which is another reason to
encourage users to remove their cards when away
from the DTU.
Once users log out, another rebalancing occurs
before the new session is created, so over time
the load disperses properly across the FOG and the
effect disappears. It's just those initial
sessions that might not be balanced well for a
non-NSCM environment.
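The timing argument above can be illustrated with a toy model. This is only a sketch under stated assumptions: placement is modelled as "pick the least-loaded server that is online", one server is up when the DTUs boot, and all servers are up by login time. The function names and the server names are hypothetical and are not part of SRSS.

```python
def place(loads, online):
    """Pick the least-loaded server among those currently online.

    Ties are broken by server name so the result is deterministic.
    """
    return min(online, key=lambda s: (loads[s], s))


def simulate(num_dtus, servers, online_at_boot, nscm):
    """Return per-server desktop session counts after every user logs in once.

    online_at_boot: servers already up when the DTUs boot and get greeters.
    nscm: if True, placement is repeated at login, when all servers are up
          (assumption of this toy model); if False, the desktop session
          stays on the server that hosted the greeter.
    """
    greeter = {s: 0 for s in servers}
    desktop = {s: 0 for s in servers}
    homes = []
    for _ in range(num_dtus):
        s = place(greeter, online_at_boot)   # placement at DTU boot
        greeter[s] += 1
        homes.append(s)
    for home in homes:
        target = place(desktop, servers) if nscm else home
        desktop[target] += 1
    return desktop


if __name__ == "__main__":
    fog = ["a", "b", "c", "d", "e"]          # hypothetical 5-server FOG
    print(simulate(100, fog, ["a"], nscm=False))
    print(simulate(100, fog, ["a"], nscm=True))
```

In this model all 100 desktop sessions end up on the first server without NSCM, while with NSCM they are split 20 per server. Real placement depends on more than session counts, so treat the numbers as illustrative only.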
Before upgrading to 3.1, we would run a home-grown script to balance
sessions with a combination of utadm -f and killing sessions. We would
typically run this after rebooting the app servers. This script worked
fine with 2.0 but now it's ineffective... Most of the Sun Rays appear to
attach to more than one server, producing many "DI" entries in the state
field of the utsession output.
Ick. utadm -f seems quite heavy-weight for this
sort of use. You might try "utswitch -t" but I'm
not sure we bias the connection times for CAM like
we do for non-CAM sessions. I think we do but
if we don't it's just a NOP :-(.
However, the effect you describe is to be
expected. You've created a session for CAM, and
then you redirect the DTU to a new server, where
another CAM session is created. You could try
adding a "pkill Xsun" at the end of the CAM script
after rebalancing occurs, to kill off the X server
and terminate the session. I'm not sure how
permissions are managed though and whether this is
possible in CAM - I suspect not.
There's probably a client-side script that you could
kill instead to terminate your session but I don't have
a CAM system in front of me at the moment to tell
you which.
I'm also not sure whether there's a CAM Reaper to kill
the "DI" sessions after a period of inactivity
(which is what we do for non-CAM sessions but
that's managed by some dtlogin scripts which don't
pertain to 3.1 CAM). I don't see one in kiosk.start, but
perhaps there's one launched elsewhere in the CAM
scripts...
You *should* get the Reaper running in 4.0 Kiosk
mode, I think, since it leverages dtlogin for Kiosk.
So, I think you'd be a lot better off with 4.0 but
perhaps you can use my suggestions above to get
3.1 working how you want. I think that with a large
number of Sun Rays the load balancing should even
out over time, with no special effort on your part
required. The CAM non-smartcard situation
is much like the non-NSCM situation I described
above in terms of initial session placement after
a FOG restart. You could simply shut down SRSS
on the server or two that got all the sessions after
the rest of the FOG is up, if you have enough
servers, to force the load across the rest of the FOG.
-Bob
Message: 5
Date: Tue, 21 Aug 2007 17:14:12 -0400
From: Bob Doolittle <[EMAIL PROTECTED]>
Subject: Re: [SunRay-Users] Fail-over Group configuration questions
To: SunRay-Users mailing list <[email protected]>
Message-ID: <[EMAIL PROTECTED]>
Content-Type: text/plain; format=flowed; charset=ISO-8859-1
I don't think I've seen a response to this.
Ives, Keith-P59429 wrote:
I have 5 servers in a fail-over group; class C dedicated network; each
sharing 34 IPs;
1 of the servers happens to have a public LAN as well for a few
remote Sun Rays. Recently upgraded to both Trusted Solaris 2/04 and
SRSS 3.1 120879-06
What a shame you didn't wait for SRSS 4.0 and use Solaris 10 with
Trusted Extensions :-(. We had to drop support for TSol8 in this
release - too many platforms.
Issues with the Sun Rays connecting to servers following reboots have me
questioning my configuration. I also question the uninstall/install
of the SW... /etc/opt/SUNWut and /var/opt/SUNWut are left behind...
what else isn't cleaned up properly?
The only other directory we use is /tmp/SUNWut but since that's tmpfs
there should be no problem. Many of the files in /etc/opt/SUNWut are
editable and as such are not cleaned up during package removal. The
files in /var/opt/SUNWut are dynamic and thus don't belong to any
package either. These directories can be safely removed manually after
uninstallation, and it's a fair cop that we ought to clean them out from
utinstall I think.
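The manual cleanup described above could be scripted along these lines. This is a minimal sketch, not something shipped with SRSS: the function name and the dry-run default are assumptions, and only the two directory paths come from this thread. Run it only after the packages are uninstalled, and save any files you edited under /etc/opt/SUNWut first.

```python
import shutil
from pathlib import Path

# Directories named in this thread as left behind after package removal.
LEFTOVER_DIRS = ["/etc/opt/SUNWut", "/var/opt/SUNWut"]


def remove_leftovers(paths=LEFTOVER_DIRS, dry_run=True):
    """Remove leftover directories and return the paths that were found.

    With dry_run=True (the default) nothing is deleted; the function only
    reports which of the given directories exist.
    """
    found = []
    for p in map(Path, paths):
        if p.is_dir():
            found.append(str(p))
            if not dry_run:
                shutil.rmtree(p)
    return found


if __name__ == "__main__":
    # Report only; call remove_leftovers(dry_run=False) to actually delete.
    for path in remove_leftovers(dry_run=True):
        print("would remove", path)
```

Defaulting to a dry run is a deliberate choice here, since /etc/opt/SUNWut may hold site-specific edits that are worth keeping.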
Config questions:
1: When setting up the utadm -a <NIC>, does it matter if I have an
"auth server list" ?
Not really.
I have been operating without one. If I should be using one, would I
list only the other 4 servers, or all 5?
I'm assuming you're referring to this dialog:
Accept as is? ([Y]/N): n
new netmask: [255.255.255.0]
Do you want to offer IP addresses for this subnet? (Y/[N]):
auth server list: 172.16.0.29
To read auth server list from file, enter file name:
Auth server IP address (enter <CR> to end list):
I always enter all the addresses I'm interested in here, including the
one it already lists. You're building the AltAuth list here, which
takes precedence over AuthSrvr, which was the previous value. They are
not merged.
That said, this isn't critical if you have so many servers, since you
just need to enter enough addresses to meet your HA needs.
The entire group will load balance as appropriate when the session is
created. I'd be sure to use the NSCM policy so that non-card sessions
get load balanced upon login, if you allow pseudo sessions at all.
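The AltAuth-over-AuthSrvr precedence mentioned above can be pictured with a small sketch. It models only the rule as stated in this message (a non-empty AltAuth list replaces AuthSrvr and the two are not merged); the function name is hypothetical and the extra addresses in the example are made up for illustration.

```python
def effective_auth_servers(auth_srvr, alt_auth):
    """Return the server list implied by the precedence rule in this message.

    auth_srvr: the single previously configured address, or None.
    alt_auth:  the list entered in the utadm dialog (may be empty).
    """
    if alt_auth:
        # AltAuth replaces AuthSrvr outright; the two are never merged.
        return list(alt_auth)
    return [auth_srvr] if auth_srvr else []


if __name__ == "__main__":
    # 172.16.0.29 is the address shown in the dialog; the others are
    # hypothetical examples.
    print(effective_auth_servers("172.16.0.29", []))
    print(effective_auth_servers("172.16.0.29", ["172.16.0.30", "172.16.0.31"]))
    print(effective_auth_servers("172.16.0.29",
                                 ["172.16.0.29", "172.16.0.30", "172.16.0.31"]))
```

The second call shows why the already-listed address is worth re-entering: if it is left out of the list, it drops out of the result entirely.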
2: I use the default DHCP lease time ( 24 hr. on solaris ) Is this
good, bad, or indifferent?
That's the default I believe. I've not heard any issues regarding this.
Regards,
Bob
Disclaimer: Opinions expressed in this mail are my own, and are not
necessarily shared by my employer
------------------------------
_______________________________________________
SunRay-Users mailing list
[email protected]
http://www.filibeto.org/mailman/listinfo/sunray-users