I'm sponsoring the following fast-track for Sree Vemuri, with a
timeout of one week, 11/26/2008.

The project requests patch/minor binding, with the path name
of the command committed and the output of the command
not an interface.

-------


1. Introduction
   1.1. Project/Component Working Name:
        Device Remap Script for SPARC Enterprise T5440 Servers

   1.2. Name of Document Author/Supplier:
        Sree Vemuri

   1.3. Date of This Document:
        10/09/2008

   1.4. Name of Major Document Customer(s)/Consumer(s):
        1.4.1. The PAC or CPT you expect to review your project:
               Solaris PAC
        1.4.2. The ARC(s) you expect to review your project:
               PSARC
        1.4.3. The Director/VP who is "Sponsoring" this project:
               quresh.dhoon at sun.com
        1.4.4. The name of your business unit:
               Niagara Rock SW Engineering

   1.5. Email Aliases:
        1.5.1. Responsible Manager: Fadi Salem
        1.5.2. Responsible Engineer: Sree Vemuri
        1.5.3. Interest List: John Johnson

2. Project Summary
   2.1. Project Description:

        Certain multi-node sun4v platforms (Batoka or Maramba) have an
        integrated PCI topology that causes the IO device paths to change
        in a CMP node failover condition. If a CMP node fails, the system
        can still be booted to OBP with the remaining CMP nodes. With the
        current architecture, however, Solaris fails to boot because the
        PCIe device paths have changed. The device paths are hard-coded in
        the /etc/path_to_inst file and in the symlinks under /dev. When
        the same device is found at a different I/O path after a power
        cycle, it is treated as a new device. The customer cannot easily
        recover from the loss of I/O: a change to the boot path requires
        a Solaris reinstall.

        This project creates a process that allows the system administrator
        to modify Solaris's view of the device paths to match that of the
        hardware.
   
   2.2. Risks and Assumptions:

        The Solaris script should pose no risk.
        The script assumes /etc/path_to_inst doesn't change format.

3. Business Summary
   3.1. Problem Area:

        Current architecture prevents operating system boot if the PCIe
        device paths have changed due to a node failover condition.

        This case attempts to solve a long-standing problem in which I/O
        topology changes necessitate an operating system reinstall.

4. Technical Description:
   4.1. Details:

        The T5440 FW describes the possible device path aliases and current
        configuration in the MD (see FWARC 2008/349).  For example:

        When cpu0 exists, we use its path (/pci@400) to access IO; the other
        paths via other CPUs are listed as aliases:

        node ioalias pcie0 {
            current = "/pci@400";
            aliases = "/pci@500/pci@0/pci@1/pci@0/pci@1" +
                  "/pci@600/pci@0/pci@1" +
                  "/pci@700/pci@0/pci@1/pci@0/pci@1/pci@0/pci@1";
            back -> ioaliases;
        };
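
        One way to read the ioalias node above is as a prefix rewrite: a
        device path recorded under any alias is moved under the "current"
        path. The following is a minimal Python sketch of that idea, not
        the project's script. The function name remap_path and the
        scsi@1 leaf device are hypothetical and chosen only for
        illustration.

```python
# Illustrative sketch only; this is not the project's Perl script.
def remap_path(path, current, aliases):
    """Return path with a leading alias prefix replaced by current."""
    # Try longer aliases first so the most specific prefix wins.
    for alias in sorted(aliases, key=len, reverse=True):
        # Match only on a path-component boundary, so that ".../pci@1"
        # does not accidentally match ".../pci@10".
        if path == alias or path.startswith(alias + "/"):
            return current + path[len(alias):]
    # Already under the current path, or not covered by this ioalias node.
    return path


# Values copied from the pcie0 ioalias node above.
pcie0_current = "/pci@400"
pcie0_aliases = [
    "/pci@500/pci@0/pci@1/pci@0/pci@1",
    "/pci@600/pci@0/pci@1",
    "/pci@700/pci@0/pci@1/pci@0/pci@1/pci@0/pci@1",
]

# Hypothetical device that was last seen through the /pci@600 alias.
old = "/pci@600/pci@0/pci@1/pci@0/scsi@1"
print(remap_path(old, pcie0_current, pcie0_aliases))
```

        Matching on a component boundary is a design choice of this
        sketch; it keeps a path that merely shares leading characters
        with an alias from being rewritten.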

        This project involves a Perl script that combines the current Solaris
        device paths and the ioalias information from the FW to change
        /etc/path_to_inst and the /dev symlinks to reflect the new HW
        configuration.
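
        The rewrite described above might look roughly like the following
        Python sketch. It is not the project's script, and every function
        name in it is hypothetical. It assumes that each /etc/path_to_inst
        entry has the form "physical path" instance "driver", and that
        /dev symlink targets run through a devices/ directory; the sample
        entry and symlink are invented for illustration.

```python
import os
import re
import tempfile

# Assumed entry format: "physical path" instance "driver"
ENTRY = re.compile(r'^"([^"]+)"\s+(\d+)\s+"([^"]+)"\s*$')


def swap_prefix(path, old, new):
    """Replace old with new only when old ends on a path-component boundary."""
    if path == old or path.startswith(old + "/"):
        return new + path[len(old):]
    return path


def remap_path_to_inst(text, old, new):
    """Rewrite device paths in path_to_inst text; other lines pass through."""
    out = []
    for line in text.splitlines():
        m = ENTRY.match(line)
        if m:
            path, inst, drv = m.groups()
            line = '"%s" %s "%s"' % (swap_prefix(path, old, new), inst, drv)
        out.append(line)
    return "\n".join(out) + "\n"


def remap_symlink(link, old, new):
    """Repoint one symlink whose target runs through devices/.

    Returns True if the link was changed, False otherwise.
    """
    target = os.readlink(link)
    head, sep, dev = target.partition("devices")
    if not sep:
        return False
    fixed = head + sep + swap_prefix(dev, old, new)
    if fixed == target:
        return False
    os.remove(link)
    os.symlink(fixed, link)
    return True


# Hypothetical sample data, exercised in a scratch directory only.
sample = (
    "# comment lines pass through unchanged\n"
    '"/pci@600/pci@0/pci@1/pci@0/scsi@1" 0 "mpt"\n'
)
print(remap_path_to_inst(sample, "/pci@600/pci@0/pci@1", "/pci@400"), end="")

with tempfile.TemporaryDirectory() as root:
    link = os.path.join(root, "c0t0d0s0")
    os.symlink("../../devices/pci@600/pci@0/pci@1/pci@0/scsi@1/sd@0,0:a", link)
    remap_symlink(link, "/pci@600/pci@0/pci@1", "/pci@400")
    print(os.readlink(link))
```

        The sketch handles a single old/new prefix pair; a real tool would
        need to repeat this for every ioalias node and walk the whole /dev
        tree.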

        Note that since the device path of the root file system may be affected,
        the suggested procedure will be to boot the failsafe archive (which
        is on a ramdisk that is immune to device path changes) and run the
        script from there.

5. Reference Documents:

       [1] FWARC 2008/349 New MD Nodes to Support I/O Reconfiguration for
       Failover Conditions
       http://sac.sfbay.sun.com/Archives/CaseLog/arc/FWARC/2008/349/onepager.txt





System Administration Commands                  device_remap(1M)

NAME
     device_remap - administration program  for the  Solaris I/O
     remapping feature

SYNOPSIS
     /usr/platform/sun4v/sbin/device_remap
         [-v | -d dir]

DESCRIPTION
     Certain multi-node sun4v platforms, for example T5440 and
     T5240 servers, have an integrated PCI topology that causes
     the IO device paths to change in a CPU node failover
     condition. The Device Remapping Script for SPARC Enterprise
     T5440 servers remaps the device paths in the
     /etc/path_to_inst file and the symlinks under /dev.

OPTIONS
     The following options are supported:

     -v

        Prints the path_to_inst and /dev symlink changes

     -d dir

        Operate on the /etc/path_to_inst and /dev on the 
        root image at <dir>


USAGE
     
     The primary function of device_remap is to remap the device
     paths in the /etc/path_to_inst file and the symlinks under
     /dev in a CPU node failover condition.

     After adding or removing CPU nodes, boot the system to the
     OBP prompt and use the following procedure:

         1. Boot an install miniroot, either with "boot net -s" or
            "boot -F failsafe".

         2. Mount the root disk as /mnt.

         3. Change directory to the mounted root disk.
            cd /mnt

         4. Run the device_remap script.
            /mnt/usr/platform/sun4v/sbin/device_remap

         5. Reboot the system.

     All the error messages are self-explanatory, except for
     "missing ioaliases node", which means that the firmware on
     the system does not support device remapping.

EXAMPLES
     Example 1
     
     Prints the path_to_inst and /dev symlink changes

       # device_remap -v

     Example 2

     Changes directory to dir before making any changes

       # device_remap -d dir


ATTRIBUTES
     See attributes(5) for descriptions of the  following  attri-
     butes:

     ____________________________________________________________
    |       ATTRIBUTE TYPE        |       ATTRIBUTE VALUE       |
    |_____________________________|_____________________________|
    | Availability                | SUNWkvm                     |
    |_____________________________|_____________________________|
    | Interface Stability         | Committed                   |
    |_____________________________|_____________________________|

SEE ALSO
     boot(1M)

NOTES

