jan damborsky wrote:
> I was investigating bug 8130 for a while in order to determine
> what the problem is and if this might be considered as a stopper
> for 2009.06 release. I would like to share my thoughts and observations,
> since it seems that the problem is partially related to the chosen
> implementation and at this point addressing it as a whole would be
> too risky.
> 
> Problem:
> --------
> In current implementation, AI client boot process contains several
> steps. Those interesting with respect to 8130 are
> 
> [1] locating and downloading boot_archive
> [2] locating and downloading additional compressed archives (solaris.zlib,
>    solarismisc.zlib).
> 
> In current implementation, it is required that [1] and [2] are taken
> from the same AI image. The issue here is that in specific configuration
> affecting Sparc client, mismatch between [1] and [2] could occur
> (boot_archive is taken from different AI image than compressed archives).
> 
> For x86, this mismatch doesn't occur, since both locations are specified
> at one place - (in GRUB menu.lst file) and are always updated at once.
> 
> For Sparc, those locations are separated and there are scenarios when
> they could currently become out of sync. Location of boot archive is
> specified in wanboot.conf file (as 'root_file' option) and location of
> compressed archives is provided as 'RootPath' option by DHCP server.
> 
> The mismatch doesn't occur if AI Sparc client is explicitly associated
> with given install service and AI image by using 'create-client'
> installadm(1M) subcommand. In that case, both DHCP server as well as
> wanboot.conf files are appropriately configured:
> 
> * client specific DHCP macro containing location of compressed archives
>  is (re)created. It takes precedence over service specific DHCP macro.
>  It assures that client is always provided with correct 'RootPath'
>  information.
> 
> * client specific wanboot.conf file containing location of boot_archive
>  is (re)created in /etc/netboot/<network_address>/<client_id> directory.
>  Again, it takes precedence over other wanboot.conf files stored in other
>  locations within /etc/netboot directory.
> 
> The problematic scenario is when Sparc AI client is not explicitly
> configured with 'create-client' command. In that case, it is provided
> with boot_archive picked up from location specified in
> /etc/netboot/wanboot.conf file and with RootPath option pointing to
> location of compressed archives which is taken from service-specific
> DHCP macro. Those are configured when 'create-service' command is used
> to create install service.
> 
> The problem is that /etc/netboot/wanboot.conf file is populated each time
> new install service is created, but service-specific DHCP macro is assigned
> to given pool of IP addresses (by calling pntadm(1M)) only when new pool
> of IP addresses is asked to be created (by providing -i and -c options).
> 
> e.g. the problem occurs when:
> 
> [1] first install service is created along with pool of IP addresses
> # installadm create-service -n service_1 -i <start_IP> -c <IP_pool_size> \
>  -s <ai_iso_image_1> <ai_image_1>
> 
> * /etc/netboot/wanboot.conf is created and points to boot_archive in
>  <ai_image_1>
> 
> * service specific DHCP macro dhcp_macro_service_1 is created with
>  'RootPath' pointing to <ai_image_1>
> 
> * created IP addresses are assigned to dhcp_macro_service_1 macro
>  using pntadm(1M) command
> 
> [2] second service is created
> # installadm create-service -n service_2 -s <ai_iso_image_2> <ai_image_2>
> 
> * /etc/netboot/wanboot.conf is (re)created and points to boot_archive in
>  <ai_image_2>
> 
> * service specific DHCP macro dhcp_macro_service_2 is created with
>  'RootPath' pointing to <ai_image_2>, but not associated with IP
>  addresses
> 
> Now when Sparc AI client is booted, it picks up boot archive from
> <ai_image_2> and compressed archives from <ai_image_1>.
> 
> [3] second service is deleted along with AI image
> # installadm delete-service -x service_2
> 
> * /etc/netboot/wanboot.conf is left untouched and points to boot_archive in
>  already deleted <ai_image_2>
> 
> Now when Sparc AI client is asked to boot, it fails when trying to obtain
> boot_archive.
> 
> Proposed final solution:
> ------------------------
> I think that the final solution here is to work out the set of requirements
> we would like to address and reconsider existing design and implementation
> with respect to
> 
> * what are desired install service scopes to be available
>  - currently for Sparc we can either explicitly associate install
>    service with particular client (identifying it by MAC address)
>    and use another one for rest of Sparc clients. More than one
>    service can't be created serving broader scope, since only
>    one /etc/netboot/wanboot.conf file can be created.
> 
> * how Sparc client obtains location of AI images
>  - now it is spread across two places - one for boot_archive,
>    one for compressed archives. It should be consolidated, so
>    that it is less error prone and easier to maintain.
> 
> Proposed fix for now:
> ---------------------
> For now any significant design changes are not appropriate,
> since they would be too risky. Based on this I am thinking about
> following temporary solution before final approach can be taken:
> 
> * when new service is created, don't touch /etc/netboot/wanboot.conf
>  if it contains pointer to existing boot archive. It makes sure
>  that once /etc/netboot/wanboot.conf is created for one service,
>  it is not accidentally overwritten by another service. So clients would
>  continue to use first service as a default (in cases 'create-client'
>  is not called) and mismatch would be avoided in this case.
> 
> * when service is deleted along with associated AI image (by passing
>  '-x' option) and if /etc/netboot/wanboot.conf file contains pointer
>  to boot archive in that image, /etc/netboot/wanboot.conf will be
>  deleted along with that AI image. It would avoid
>  /etc/netboot/wanboot.conf pointing to non-existent AI image.
> 
> When those changes are applied, behavior for Sparc clients would be similar
> to the one for x86 clients.
> 
> I have prepared preliminary fix with those changes and tested it for
> Sparc as well as x86 clients.
> 
> The preliminary webrev is available at following location:
> http://cr.opensolaris.org/~dambi/bug-8130/
> 
> Please let me know, if you think that this problem can be qualified
> as stopper for 2009.06, if there are other related issues I have
> not noticed and if solution mentioned above can be acceptable
> or different approach should be taken. Any comments are highly appreciated.
> 
> Thank you very much in advance,
> Jan
> 
> _______________________________________________
> caiman-discuss mailing list
> caiman-discuss at opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/caiman-discuss
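
For readers following the thread, the two rules in the "Proposed fix for
now" section quoted above can be sketched roughly as below. This is only an
illustration in Python, not the code from the webrev. The function names are
hypothetical, and the sketch assumes that wanboot.conf is a plain key=value
file whose 'root_file' value can be treated as a filesystem path (any
mapping through the web server's document root is left out).

```python
import os


def read_root_file(conf_path):
    """Return the value of the 'root_file' option, or None if absent."""
    try:
        with open(conf_path) as conf:
            for line in conf:
                key, sep, value = line.strip().partition("=")
                if sep and key.strip() == "root_file":
                    return value.strip()
    except FileNotFoundError:
        pass
    return None


def on_create_service(conf_path, image_dir):
    """Rule 1 (hypothetical helper): leave the default wanboot.conf alone
    if it already points to an existing boot archive.
    Returns True if the file was (re)written."""
    current = read_root_file(conf_path)
    if current is not None and os.path.exists(current):
        return False
    with open(conf_path, "w") as conf:
        conf.write("root_file=%s\n" % os.path.join(image_dir, "boot_archive"))
    return True


def on_delete_service(conf_path, image_dir):
    """Rule 2 (hypothetical helper): when an image is deleted, remove the
    default wanboot.conf if its boot archive lives inside that image.
    Returns True if the file was removed."""
    current = read_root_file(conf_path)
    if current is None:
        return False
    image_dir = os.path.abspath(image_dir)
    if os.path.commonpath([os.path.abspath(current), image_dir]) == image_dir:
        os.remove(conf_path)
        return True
    return False
```

With these two rules, creating service_2 after service_1 leaves the default
wanboot.conf pointing at <ai_image_1>, and deleting a service with its image
never leaves the file pointing at a boot archive that no longer exists.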


Jan,

Thanks for looking into this and the great description!

Please forgive me if my perspective is not correct. I'm learning how 
this works as I go.

Until the existing design can be reworked, it seems to me a safer 
approach might be to not allow a second invocation of "installadm 
create-service" on SPARC while a service already exists.

If a SPARC user issues "installadm create-service" and a service has 
already been created, issue an error message saying that only one service 
is currently allowed on SPARC and that "installadm delete-service" must 
be run before creating a second service.

OK I can imagine this is not flexible for customers but it might be a 
safe approach to help avoid customer problems/confusion until the design 
can be reworked.
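
As a rough illustration of that check, here is a small Python sketch. The 
function, the exception and the way existing services are passed in are 
all made up for the example; how installadm actually tracks services and 
detects the image architecture is not covered here.

```python
class ServiceExistsError(Exception):
    """Raised when a second SPARC install service is requested."""


def check_create_allowed(arch, existing_services):
    """Refuse to create another service on SPARC if one already exists.

    arch              -- architecture of the new image, e.g. "sparc" or
                         "i386" (hypothetical parameter)
    existing_services -- names of the services already configured
    """
    if arch == "sparc" and existing_services:
        raise ServiceExistsError(
            "only one install service is currently allowed on SPARC; "
            "run 'installadm delete-service' for %s first"
            % ", ".join(sorted(existing_services))
        )


if __name__ == "__main__":
    check_create_allowed("sparc", [])              # first service: allowed
    check_create_allowed("i386", ["service_1"])    # x86 is not restricted
    try:
        check_create_allowed("sparc", ["service_1"])
    except ServiceExistsError as err:
        print("error:", err)
```

The check would run before anything under /etc/netboot is touched, so a 
rejected request leaves the existing wanboot.conf as it was.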

Again, I apologize if my perspective is naive. I'm just thinking here and 
trying to help.

Joe

