Mike,

A few more questions. How many jumpstart servers do you have? How long does/would it take to set up a server that meets your current requirements? Is the ability to capture and replicate your jumpstart server setup important/interesting?
-Sanjay

Mike Gerdts wrote:
> On Fri, May 8, 2009 at 9:20 AM, Sarah Jelinek <Sarah.Jelinek at sun.com> wrote:
>> Hi Mike,
>>
>> Thank you for this data! I do have some comments inline...
>>
>>> I noticed in the AI Client Redesign Meeting Notes[1]:
>>>
>>>   Then there was a discussion about Derived Profiles. The outcome was
>>>   to gather requirements around the following:
>>>
>>>   | - What does deriving mean? That is, what aspects of the profile
>>>   |   may be derived? What problems are we trying to solve here?
>>>   | - Who derives a profile? Some clients or all the clients?
>>>   | - Should the client support substitution of certain fields in the
>>>   |   AI manifest? If yes, what problem will that solve?
>>>   | - How does this impact the criteria selection on the AI server?
>>>
>>> Currently I use derived profiles to do the following:
>>>
>>> - Customize partitioning based upon server model, disk size,
>>>   memory size, etc.
>>>
>> Can you be specific about the criteria you feel are requirements? What
>> I mean by criteria is what things do you believe must be included so
>> that the client can probe and effectively create the correct derived
>> profile?
>
> Things in my current begin scripts that derive the profile include:
>
> - Always create / and alt-/ of the same size and on the same disks
> - If enough disks are available, mirror everything
> - If the disks are big enough, create / and alt-/ as X GB, else X/2 GB
> - If running Solaris 10 or later, use leftover space for soft partitions
> - If running Solaris 9 or earlier, mount leftover space at /local
> - If running on a V240, V440, T2000, etc., root gets mirrored across
>   disks on the same controller
> - If running on a 6800, 15K, 25K, etc., find the two JBODs that are
>   attached and mirror across them
> - If running on a Thumper, be sure to mirror across the devices that
>   the BIOS has access to
> - If special device aliases (e.g. jsroot1, jsroot2) are found by
>   probing OBP, find the disks associated with them and install there
>   instead of using the rules above
> - Determine what site I am in (based on IP) and download the flash
>   archive from there
>
> Translated into the new way, this probably means:
>
> - Have the ability for the sysadmin - not the tool - to select which
>   disks to install onto.
> - Provide a means that is flexible enough that disk selection can be
>   done by physical path, as I do above with jsroot*. This is important
>   because the controller number can vary based on which PCI or PCIe
>   cards are installed. I would hate to install Solaris onto SAN disks
>   (overwriting application data) when I meant to write to local disk.
> - Have the ability to specify the size of rpool, which may be smaller
>   than a single disk.
> - Have the ability to specify where other zpools should reside.
> - Have the ability to tune the size and possibly the location of swap
>   & dump. That is, a system with small drives (old or SSD) might put
>   swap & dump in a separate pool - or may decide to use SVM for swap &
>   dump because ZFS increases the space requirements for them.
> - Specify which mirror(s) or repositories to install from, based on
>   locally defined location rules.
> - Specify the proxy based on locally defined location rules.
>
>>> - Select the appropriate flash archive based on server model
>>>   (primarily sun4u vs. sun4v vs. i86pc)
>>>
>>> I have a lot of logic in finish scripts (JASS) and third-party system
>>> management tools that does various other things based upon location
>>> (derived from IP address), OS revision, and other criteria that are
>>> very hard or impossible to acquire automatically. Arguably, the bulk
>>> of JASS is legacy baggage with secure by default.
>>>
>>> As I look forward, I would like to derive profiles that:
>>>
>>> - Lay out storage properly. The definition of "properly" will likely
>>>   be dependent on criteria that doesn't work for everyone.
>>>   That is, at MyCo we may boot from local disk and want two
>>>   compressed mirrors. At YourCo "properly" means to use the lowest
>>>   numbered LUN presented via iSCSI from storage array X.
>>> - Select software to install based on somewhat arbitrary rules. That
>>>   is, at site X I need the omniback package and at site Y I need
>>>   netbackup. If it's the primary ldom of a sun4v box, install LDoms
>>>   Manager 2.4.
>>>
>> What types of data would drive the rules for the software choices?
>
> The primary IP address along with a populated netmasks file and some
> home-brew logic drives site identification.
>
> Probing OBP for aliases (prtpicl -c aliases -v) is great for
> system-specific overrides.
>
> Querying the network for which subnets are available shows some
> promise for making decisions (e.g. snooping for EIGRP packets and
> seeing "VLAN#50 10.0.50.0 - 224.0.0.10", or eventually LLDP).
>
>>> - Select the repository (or mirror) based on location such that I
>>>   don't install across the Atlantic if I have a closer copy.
>>> - Select the repository based on location (lab installs experimental
>>>   bits).
>>> - Require production servers to have packages signed by the OS vendor
>>>   or by internal QA. That is, make it impossible to install
>>>   experimental third-party software on production.
>>>
>> How would we be able to determine it is a production server? I assume
>> the profile you would derive in this case would have its IPS repo set
>> for installation such that there wouldn't be experimental software. Is
>> this what you are thinking?
>
> This would likely feed off of subnet-based rules. Arguably, this is
> probably more easily dealt with by selecting a different base
> installation profile (prod vs. lab) on the AI server. The derived
> profile would probably just tweak this base profile for
> hardware-specific items and pick the closest appropriate repo.
>
>> A few more questions:
>>
>> 1. How easy is it for you to use and configure your current jumpstart
>> configuration to enable derived profiles? Are the user interfaces
>> easy to use?
>
> The current setup of a jumpstart client involves (as a non-privileged
> user) setting up a system-specific wanboot.conf and system.conf using
> a fairly simple script.
>
> jumpstartzone$ /jumpstart/<release>/add_client_wanboot -e <mac> -h <hostname> ...
> Run this at the OpenBoot prompt:
> ok setenv network-boot-arguments=...
> ok boot net - install
>
> Every client uses the same begin script to derive the profile. I
> tweak the Begin/derive-profile.beg script when I do a new image
> release (point it to the next flar) or something comes up that causes
> other problems (like Solaris becomes huge and needs more than 8 GB
> for /).
>
> The rules for site determination use a netmasks file and a "subnets"
> file, e.g.:
>
> 10.0.1.0 SiteA
> 10.0.2.0 SiteB
>
> The rules for selecting installation disks require a diskmap file that
> looks like:
>
> # 480R - note that they use a QLogic fibre channel chip just like our HBAs do
> root1 SUNW,Sun-Fire-480R c.t0d0s2 ../../devices/pci@9,600000/SUNW,qlc@2/fp@0,0
> root2 SUNW,Sun-Fire-480R c.t1d0s2 ../../devices/pci@9,600000/SUNW,qlc@2/fp@0,0
>
> # T5220
> root1 SUNW,SPARC-Enterprise-T5220 c.t0d0s2 ../../devices/pci@0/pci@0/pci@2/scsi@0
> root2 SUNW,SPARC-Enterprise-T5220 c.t1d0s2 ../../devices/pci@0/pci@0/pci@2/scsi@0
> data1 SUNW,SPARC-Enterprise-T5220 c.t2d0s2 ../../devices/pci@0/pci@0/pci@2/scsi@0
> data2 SUNW,SPARC-Enterprise-T5220 c.t3d0s2 ../../devices/pci@0/pci@0/pci@2/scsi@0
>
> If I just used the first two disks in $SI_DISK_LIST, the 480R might
> give me the disks I list above or something that is storing an Oracle
> database out on the SAN. Best to avoid overwriting the database.
>
>> 2. What do you like about the way it is currently implemented?
>
> - It works.
> - I can trust that the just-hired-last-week junior sysadmin armed with
>   a simple procedure can install Solaris per standards without risk of
>   breaking the jumpstart environment for everyone else. That is, since
>   there is no customization to perform on the jumpstart server, there
>   is no chance that someone who is not tasked with maintaining
>   jumpstart will break jumpstart.
> - Policy enforcement via scripting is much easier, more accurate, and
>   more cost-effective than policy enforcement via training, audits,
>   remediation, retraining, etc. (Sysadmins need to know what they are
>   doing, but need to be focused on value-add, not minutiae.)
>
>> 3. What don't you like?
>
> - It took way too much work to make all of this reliable and workable
>   for a single process to use on a global basis.
> - Making everything work equally well for network-based and DVD-based
>   installations was difficult.
> - When things don't work (which is extremely rare - see 2 above),
>   jumpstart is hard to debug because of the lack of documented ways to
>   observe the process, along with the lack of a documented way to
>   restart the process without enduring POST, slow wanboot download,
>   etc. I know many tricks to get around this, but learning them was
>   painful and often only possible because of the extensive use of
>   shell scripts during installation.
> - Automated installation has way too much of "every customer must
>   figure it out for themselves."
>
> It seems as though the last point is supposed to be addressed with
> JASS and/or JET. For me, JASS (now unmaintained and not yet open
> source) was a big help for automation of security hardening. JET came
> to my attention long after JASS was already working (including various
> custom-written modules). Almost every introduction I had to JET felt
> like a sales job for professional services. That is, the thing that I
> needed didn't come with JET, but if I paid for some professional
> services they would provide it. Well, the thing that I needed was
> typically easier to bolt onto JASS than it was to go through the
> requisition process for professional services.
>
> Striking the balance between everyone having to figure it out for
> themselves and lacking required flexibility is extremely difficult. I
> would prefer that we currently err in favor of giving too much
> flexibility. Having flexibility will allow sysadmins to come up with
> clever ways of accomplishing what they need to do, hopefully leading
> to contributions of those clever things back to the community. I
> worry that a lack of flexibility will hinder adoption at large sites -
> all of which will suddenly sing the praises of jumpstart.
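The site-determination and disk-selection lookups described in this thread could be sketched in a begin script roughly as follows. This is a minimal sketch, not Mike's actual script: the "subnets" and diskmap file formats follow the examples quoted above, but the function names and the sample-data wiring are hypothetical.

```shell
#!/bin/sh
# Sketch of two lookups a derived-profile begin script might perform:
# 1) map the client's network to a site via a "subnets" file
# 2) map a disk alias + machine model to a ctds pattern via a diskmap file
# (function names are hypothetical; file formats follow the thread)

# subnets file format: "<network> <site>"
site_for_net() {
    # $1 = network address, $2 = path to subnets file
    awk -v net="$1" '$1 == net { print $2 }' "$2"
}

# diskmap file format: "<alias> <model> <ctds-pattern> <physical-path>"
# Returns the ctds pattern(s) for a given alias and model; the begin
# script would then resolve them to real devices by physical path.
disks_for_model() {
    # $1 = alias (e.g. root1), $2 = model string, $3 = diskmap file
    awk -v a="$1" -v m="$2" '$1 == a && $2 == m { print $3 }' "$3"
}

# --- example data taken from the thread ---
cat > /tmp/subnets.$$ <<'EOF'
10.0.1.0 SiteA
10.0.2.0 SiteB
EOF

cat > /tmp/diskmap.$$ <<'EOF'
root1 SUNW,SPARC-Enterprise-T5220 c.t0d0s2 ../../devices/pci@0/pci@0/pci@2/scsi@0
root2 SUNW,SPARC-Enterprise-T5220 c.t1d0s2 ../../devices/pci@0/pci@0/pci@2/scsi@0
EOF

site_for_net 10.0.2.0 /tmp/subnets.$$                          # prints: SiteB
disks_for_model root1 SUNW,SPARC-Enterprise-T5220 /tmp/diskmap.$$  # prints: c.t0d0s2

rm -f /tmp/subnets.$$ /tmp/diskmap.$$
```

On a real client the network address would come from the primary IP plus the netmasks file, and the model string from something like `uname -i`; the point of the table-driven lookup is that junior sysadmins never edit the script, only the data files.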