Mike,

A few more questions. How many jumpstart servers do you have? How long does/would it take to set up a server that meets your current requirements? Is the ability to capture and replicate your jumpstart server setup important/interesting?
-Sanjay

Mike Gerdts wrote:
> On Fri, May 8, 2009 at 9:20 AM, Sarah Jelinek <Sarah.Jelinek at sun.com> wrote:
>> Hi Mike,
>>
>> Thank you for this data! I do have some comments inline...
>>
>>> I noticed in the AI Client Redesign Meeting Notes[1]:
>>>
>>>   Then there was a discussion about Derived Profiles. The outcome was
>>>   to gather requirements around the following:
>>>
>>>   | - What does deriving mean? That is, what aspects of the profile
>>>   |   may be derived? What problems are we trying to solve here?
>>>   | - Who derives a profile? Some clients or all the clients?
>>>   | - Should the client support substitution of certain fields in the
>>>   |   AI manifest? If yes, what problem will that solve?
>>>   | - How does this impact the criteria selection on the AI server?
>>>
>>> Currently I use derived profiles to do the following:
>>>
>>> - Customize partitioning based upon server model, disk size,
>>>   memory size, etc.
>>>
>> Can you be specific about the criteria you feel are requirements? What
>> I mean by criteria is what things do you believe must be included so
>> that the client can probe and effectively create the correct derived
>> profile?
>
> Things in my current begin scripts that derive the profile include:
>
> - Always create / and alt-/ of the same size and on the same disks
> - If enough disks are available, mirror everything
> - If the disks are big enough, create / and alt-/ as X GB, else X/2 GB
> - If running Solaris 10 or later, use leftover space for soft partitions
> - If running Solaris 9 or earlier, mount leftover space at /local
> - If running on a V240, V440, T2000, etc., root gets mirrored across
>   disks on the same controller
> - If running on a 6800, 15K, 25K, etc., find the two JBODs that are
>   attached and mirror across them
> - If running on a Thumper, be sure to mirror across the devices that
>   the BIOS has access to
> - If special device aliases (e.g. jsroot1, jsroot2) are found by
>   probing OBP, find the disks associated with them and install there
>   instead of using the rules above
> - Determine what site I am in (based on IP) and download the flash
>   archive from there
>
> Translated into the new way, this probably means:
>
> - Have the ability for the sysadmin - not the tool - to select which
>   disks to install onto.
> - Provide a means that is flexible enough that disk selection can be
>   done by physical path, as I do above with jsroot*. This is important
>   because the controller number can vary based on which PCI or PCIe
>   cards are installed. I would hate to install Solaris onto SAN disks
>   (overwriting application data) when I meant to write to local disk.
> - Have the ability to specify the size of rpool, which may be smaller
>   than a single disk.
> - Have the ability to specify where other zpools should reside.
> - Have the ability to tune the size and possibly the location of swap
>   & dump. That is, a system with small drives (old or SSD) might put
>   swap & dump in a separate pool - or may decide to use SVM for swap &
>   dump because ZFS increases the space requirements for them.
> - Specify which mirror(s) or repositories to install from, based on
>   locally defined location rules.
> - Specify the proxy based on locally defined location rules.
>
>>> - Select the appropriate flash archive based on server model
>>>   (primarily sun4u vs. sun4v vs. i86pc)
>>>
>>> I have a lot of logic in finish scripts (JASS) and third-party system
>>> management tools that does various other things based upon location
>>> (derived from IP address), OS revision, and other criteria that are
>>> very hard or impossible to acquire automatically. Arguably, the bulk
>>> of JASS is legacy baggage with secure by default.
>>>
>>> As I look forward, I would like to derive profiles that:
>>>
>>> - Lay out storage properly. The definition of "properly" will likely
>>>   be dependent on criteria that doesn't work for everyone.
>>>   That is, at MyCo we may boot from local disk and want two
>>>   compressed mirrors. At YourCo "properly" means to use the lowest
>>>   numbered LUN presented via iSCSI from storage array X.
>>> - Select software to install based on somewhat arbitrary rules. That
>>>   is, at site X I need the omniback package and at site Y I need
>>>   netbackup. If it's the primary ldom of a sun4v box, install LDoms
>>>   Manager 2.4.
>>>
>> What types of data would drive the rules for the software choices?
>
> The primary IP address along with a populated netmasks file and some
> home-brew logic drives site identification.
>
> Probing OBP for aliases (prtpicl -c aliases -v) is great for
> system-specific overrides.
>
> Querying the network for which subnets are available shows some
> promise for making decisions (e.g. snooping for EIGRP packets and
> seeing "VLAN#50 10.0.50.0 - 224.0.0.10", or eventually LLDP).
>
>>> - Select the repository (or mirror) based on location such that I
>>>   don't install across the Atlantic if I have a closer copy.
>>> - Select the repository based on location (lab installs experimental
>>>   bits).
>>> - Require production servers to have packages signed by the OS vendor
>>>   or by internal QA. That is, make it impossible to install
>>>   experimental third-party software on production.
>>>
>> How would we be able to determine it is a production server? I assume
>> the profile you would derive in this case would have its IPS repo set
>> for installation such that there wouldn't be experimental software. Is
>> this what you are thinking?
>
> This would likely feed off of subnet-based rules. Arguably, this is
> probably more easily dealt with by selecting a different base
> installation profile (prod vs. lab) on the AI server. The derived
> profile would probably just tweak this base profile for
> hardware-specific items and pick the closest appropriate repo.
>
>> A few more questions:
>>
>> 1. How easy is it for you to use and configure your current jumpstart
>> configuration to enable derived profiles? Are the user interfaces
>> easy to use?
>
> The current setup of a jumpstart client involves (as a non-privileged
> user) setting up a system-specific wanboot.conf and system.conf using
> a fairly simple script.
>
> jumpstartzone$ /jumpstart/<release>/add_client_wanboot -e <mac> -h <hostname> ...
> Run this at the OpenBoot prompt:
> ok setenv network-boot-arguments=...
> ok boot net - install
>
> Every client uses the same begin script to derive the profile. I
> tweak the Begin/derive-profile.beg script when I do a new image
> release (point it to the next flar) or something comes up that causes
> other problems (like Solaris becomes huge and needs more than 8 GB
> for /).
>
> The rules for site determination use a netmasks file and a "subnets"
> file, e.g.:
>
> 10.0.1.0 SiteA
> 10.0.2.0 SiteB
>
> The rules for selecting installation disks require a diskmap file that
> looks like:
>
> # 480R - note that they use a QLogic fibre channel chip just like our HBAs do
> root1 SUNW,Sun-Fire-480R c.t0d0s2 ../../devices/pci@9,600000/SUNW,qlc@2/fp@0,0
> root2 SUNW,Sun-Fire-480R c.t1d0s2 ../../devices/pci@9,600000/SUNW,qlc@2/fp@0,0
>
> # T5220
> root1 SUNW,SPARC-Enterprise-T5220 c.t0d0s2 ../../devices/pci@0/pci@0/pci@2/scsi@0
> root2 SUNW,SPARC-Enterprise-T5220 c.t1d0s2 ../../devices/pci@0/pci@0/pci@2/scsi@0
> data1 SUNW,SPARC-Enterprise-T5220 c.t2d0s2 ../../devices/pci@0/pci@0/pci@2/scsi@0
> data2 SUNW,SPARC-Enterprise-T5220 c.t3d0s2 ../../devices/pci@0/pci@0/pci@2/scsi@0
>
> If I just used the first two disks in $SI_DISK_LIST, the 480R might
> give me the disks I list above or something that is storing an Oracle
> database out on the SAN. Best to avoid overwriting the database.
>
>> 2. What do you like about the way it is currently implemented?
>
> - It works.
> - I can trust that the just-hired-last-week junior sysadmin armed with
>   a simple procedure can install Solaris per standards without risk of
>   breaking the jumpstart environment for everyone else. That is, since
>   there is no customization to perform on the jumpstart server, there
>   is no chance that someone who is not tasked with maintaining
>   jumpstart will break jumpstart.
> - Policy enforcement via scripting is much easier, more accurate, and
>   more cost-effective than policy enforcement via training, audits,
>   remediation, retraining, etc. (Sysadmins need to know what they are
>   doing, but need to be focused on value-add, not minutiae.)
>
>> 3. What don't you like?
>
> - It took way too much work to make all of this reliable and workable
>   for a single process to use on a global basis.
> - Making everything work equally well for network-based and DVD-based
>   installations was difficult.
> - When things don't work (which is extremely rare - see 2 above),
>   jumpstart is hard to debug because of the lack of documented ways to
>   observe the process, along with the lack of a documented way to
>   restart the process without enduring POST, slow wanboot download,
>   etc. I know many tricks to get around this, but learning them was
>   painful and often only possible because of the extensive use of
>   shell scripts during installation.
> - Automated installation has way too much of "every customer must
>   figure it out for themselves."
>
> It seems as though the last point is supposed to be addressed with
> JASS and/or JET. For me, JASS (now unmaintained and not yet open
> source) was a big help for automation of security hardening. JET came
> to my attention long after JASS was already working (including various
> custom-written modules). Almost every introduction I had to JET felt
> like a sales job for professional services. That is, the thing that I
> needed didn't come with JET, but if I paid for some professional
> services they would provide it. Well, the thing that I needed was
> typically easier to bolt onto JASS than it was to go through the
> requisition process for professional services.
>
> Striking the balance between everyone having to figure it out for
> themselves and lacking required flexibility is extremely difficult. I
> would prefer that we currently err in favor of giving too much
> flexibility. Having flexibility will allow sysadmins to come up with
> clever ways of accomplishing what they need to do, hopefully leading
> to contributions of those clever things back to the community. I
> worry that a lack of flexibility will hinder adoption at large sites -
> all of which will suddenly sing the praises of jumpstart.
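The site-determination and disk-selection lookups described in this thread could be sketched in a begin script roughly as follows. This is a minimal sketch, not Mike's actual script: the "subnets" and diskmap file formats follow the examples quoted above, but the function names and the sample-data wiring are hypothetical.

```shell
#!/bin/sh
# Sketch of two lookups a derived-profile begin script might perform:
# 1) map the client's network to a site via a "subnets" file
# 2) map a disk alias + machine model to a ctds pattern via a diskmap file
# (function names are hypothetical; file formats follow the thread)

# subnets file format: "<network> <site>"
site_for_net() {
    # $1 = network address, $2 = path to subnets file
    awk -v net="$1" '$1 == net { print $2 }' "$2"
}

# diskmap file format: "<alias> <model> <ctds-pattern> <physical-path>"
# Returns the ctds pattern(s) for a given alias and model; the begin
# script would then resolve them to real devices by physical path.
disks_for_model() {
    # $1 = alias (e.g. root1), $2 = model string, $3 = diskmap file
    awk -v a="$1" -v m="$2" '$1 == a && $2 == m { print $3 }' "$3"
}

# --- example data taken from the thread ---
cat > /tmp/subnets.$$ <<'EOF'
10.0.1.0 SiteA
10.0.2.0 SiteB
EOF

cat > /tmp/diskmap.$$ <<'EOF'
root1 SUNW,SPARC-Enterprise-T5220 c.t0d0s2 ../../devices/pci@0/pci@0/pci@2/scsi@0
root2 SUNW,SPARC-Enterprise-T5220 c.t1d0s2 ../../devices/pci@0/pci@0/pci@2/scsi@0
EOF

site_for_net 10.0.2.0 /tmp/subnets.$$                          # prints: SiteB
disks_for_model root1 SUNW,SPARC-Enterprise-T5220 /tmp/diskmap.$$  # prints: c.t0d0s2

rm -f /tmp/subnets.$$ /tmp/diskmap.$$
```

On a real client the network address would come from the primary IP plus the netmasks file, and the model string from something like `uname -i`; the point of the table-driven lookup is that junior sysadmins never edit the script, only the data files.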