I am sponsoring this fasttrack for Dave Plauger and Steve Sistare. The timer is set for next Tuesday, August 18, 2009. Requesting Micro/Patch binding.
The one-pager, project specification and man page diffs are available
in the materials directory.

Sherry Moore

Template Version: @(#)sac_nextcase 1.68 02/23/09 SMI
This information is Copyright 2009 Sun Microsystems

1. Introduction
    1.1. Project/Component Working Name:
         Fast Crash Dump
    1.2. Name of Document Author/Supplier:
         Dave Plauger
         Steve Sistare
    1.3. Date of This Document:
         August 11, 2009

4. Technical Description:
    4.1. Details:

        New command line flags to dumpadm(1M) control how core files
        are to be saved. The setting is saved across reboots in
        /etc/dumpadm.conf. The new default behavior is to save files
        in compressed format instead of always uncompressing them, as
        savecore(1M) does currently.

        If saving compressed, savecore(1M) copies the core file from
        the dump device to vmdump.N, where N is the usual dump
        integer. Copying a core file is much faster than uncompressing
        it into unix.N and vmcore.N images, and it takes up much less
        disk space. On systems that dump to the swap area there is
        less risk that the core image will be overwritten by swap
        activity before it can be extracted.

        savecore(1M) performance has been improved by reading the dump
        file with fread(3C) instead of pread(2).

        The compressed core dump format is largely unchanged. The dump
        header and the dump version number are unchanged. However,
        there are changes to the way memory pages are saved within the
        compressed file.

        A compressed dump file must be uncompressed first, by manually
        running savecore(1M) a second time, before it can be used by
        other tools. Therefore, there is no impact on mdb(1) due to
        the changes in compression methods. Once the core dump has
        been uncompressed, the resulting unix.N and vmcore.N files are
        in the same format as before.

        This project has introduced a bzip2 compression library [2]
        into Solaris common code, where it is shared by savecore(1M)
        and the kernel. The current lzjb compression algorithm is much
        faster, but also much weaker. The bzip2 library requires much
        more memory and compute resources. If these resources are
        available during panic, the kernel will save memory pages with
        bzip2 instead of lzjb, and savecore(1M) will use the same
        bzip2 library in order to uncompress the pages.

        The kernel function dumpsys() does most of the work in
        creating core dump images. The section that saves memory pages
        has been expanded to support parallelism. Most kernel services
        are not available during panic. Instead, CPUs spinning in
        panic_idle() call into dumpsys() and coordinate via memory.
        These helper CPUs copy pages, compress them, and produce
        streams of compression data that savecore(1M) can uncompress.
        The panic CPU acts as the master and does all page mapping and
        I/O operations.

        There are two compression modes in this implementation. The
        older method, lzjb, is the default on smaller systems. With up
        to 4 CPUs, it usually speeds up dump by 2-4 times. The new
        bzip2 library is employed on large systems with many spare
        CPUs and memory. This can speed up dumps by 4-10 times. The
        mode is chosen at crash time based on processor type, number
        of CPUs, and available free memory for buffers.

        The existing savecore -L (live dump) option creates a dump
        image on a running system. This option is available only when
        there is a dedicated dump device. In this case, the dump
        helpers in the kernel run as system tasks.

        file(1) can detect the new compressed format. For example,

        # file vmcore.0
        vmcore.0: SunOS 5.11 snv_81 64-bit SPARC crash dump from 'oaf415'
        # file vmdump.0
        vmdump.0: SunOS 5.11 snv_81 64-bit SPARC compressed crash dump from 'oaf415'

        For more information, see the project specification in the
        materials directory.

    4.2. Bug/RFE Number(s):
        RFE 6828976 Fast Crash Dump

    4.5. Interfaces:
        New command line flags to dumpadm(1M), and additions to the
        meaning of existing flags to savecore(1M). Micro/patch binding
        requested.

        INTERFACE          COMMITMENT LEVEL   COMMENT
        dumpadm -z (1M)    Committed          Enables save compressed, or not.
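
        As an illustration of how the interfaces above fit together,
        the manual second step could be wrapped in a small shell
        function. This is only a sketch: the function name expand_dump
        and its two arguments are hypothetical and not part of this
        project, and it relies solely on the savecore -f dumpfile
        [directory] form shown in appendix A.

```shell
# Hypothetical helper (not part of this project): expand a compressed
# crash dump DIR/vmdump.N into DIR/vmcore.N and DIR/unix.N.
# Usage: expand_dump DIR N
expand_dump() {
    dir=$1
    n=$2

    # savecore -f needs an existing compressed dump file to read from.
    if [ ! -f "$dir/vmdump.$n" ]; then
        echo "expand_dump: $dir/vmdump.$n not found" >&2
        return 1
    fi

    # Per the savecore(1M) diff: -f names the dump file to read, and the
    # trailing operand is the directory that receives the output files.
    savecore -f "$dir/vmdump.$n" "$dir"
}
```

        With the example configuration from appendix A, calling
        expand_dump /var/crash/saturn 0 would run
        savecore -f /var/crash/saturn/vmdump.0 /var/crash/saturn.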
    4.6. Doc Impact:
        Man page changes for dumpadm(1M) and savecore(1M). See
        appendix A for diffs.

6. Resources and Schedule:
    6.4. Product Approval Committee requested information:
        6.4.1. Consolidation or Component Name:
               OS/Net
    6.5. ARC review type: Fasttrack
    6.6. ARC Exposure: Open.

A. Man pages
    A.1 dumpadm(1M)
    A.2 savecore(1M)

A.1 Man pages for dumpadm(1M)

System Administration Commands                          dumpadm(1M)

NAME
     dumpadm - configure operating system crash dump

SYNOPSIS
     /usr/sbin/dumpadm [-nuy] [-c content-type] [-d dump-device]
          [-m mink | minm | min%] [-s savecore-dir]
          [-r root-dir] [-z y | n]                                   |

DESCRIPTION
     The dumpadm program is an administrative command that
     manages the configuration of the operating system crash dump
     facility. A crash dump is a disk copy of the physical memory
     of the computer at the time of a fatal system error. When a
     fatal operating system error occurs, a message describing
     the error is printed to the console. The operating system
     then generates a crash dump by writing the contents of
     physical memory to a predetermined dump device, which is
     typically a local disk partition. The dump device can be
     configured by way of dumpadm. Once the crash dump has been
     written to the dump device, the system will reboot.

     Fatal operating system errors can be caused by bugs in the      |
     operating system, its associated device drivers and loadable    |
     modules, or by faulty hardware. Whatever the cause, the         |
     crash dump itself provides invaluable information to your       |
     support engineer to aid in diagnosing the problem. As such,     |
     it is vital that the crash dump be retrieved and given to       |
     your support provider. Following an operating system crash,     |
     the savecore(1M) utility is executed automatically during       |
     boot to retrieve the crash dump from the dump device, and       |
     write it to your file system in compressed form to a file       |
     named vmdump.X, where X is an integer identifying the dump.     |
     Afterwards, savecore(1M) can be invoked on the same or          |
     another system to expand the compressed crash dump to a pair    |
     of files named unix.X and vmcore.X. The directory in which      |
     the crash dump is saved on reboot can also be configured        |
     using dumpadm.

     For systems with a UFS root file system, the default dump       |
     device is configured to be an appropriate swap partition.       |
     Swap partitions are disk partitions reserved as virtual         |
     memory backing store for the operating system. Thus, no         |
     permanent information resides in swap to be overwritten by the  |
     dump. See swap(1M). For systems with a ZFS root file system,    |
     dedicated ZFS volumes are used for swap and dump areas. For     |
     further information about setting up a dump area with ZFS,      |
     see the ZFS Administration Guide. To view the current dump      |
     configuration, use the dumpadm command with no arguments:

     example# dumpadm

           Dump content: kernel pages
            Dump device: /dev/dsk/c0t0d0s1 (swap)
     Savecore directory: /var/crash/saturn
       Savecore enabled: yes
        Save compressed: yes                                         |

     When no options are specified, dumpadm prints the current       |
     crash dump configuration. The example shows the set of          |
     default values: the dump content is set to kernel memory        |
     pages only, the dump device is a swap disk partition, the       |
     directory for savecore files is set to /var/crash/hostname.     |
     savecore(1M) is set to run automatically on reboot and save     |
     the crash dump in a compressed format.                          |

     -z y | n                                                        |
         Modify the dump configuration to control the operation      |
         of savecore on reboot. The options are y (yes) to enable    |
         saving core files in a compressed format, and n (no) to     |
         automatically uncompress the crash dump file. The           |
         default is yes, because crash dump files can be very        |
         large and will require less file system space if saved      |
         in a compressed format.                                     |

EXAMPLES
     Example 1 Reconfiguring The Dump Device To A Dedicated Dump
     Device:

     The following command reconfigures the dump device to a
     dedicated dump device:

     example# dumpadm -d /dev/dsk/c0t2d0s2

           Dump content: kernel pages
            Dump device: /dev/dsk/c0t2d0s2 (dedicated)
     Savecore directory: /var/crash/saturn
       Savecore enabled: yes
        Save compressed: yes                                         |

A.2 Man pages for savecore(1M)

System Administration Commands                         savecore(1M)

NAME
     savecore - save a crash dump of the operating system

SYNOPSIS
     /usr/bin/savecore [-Lvd] [-f dumpfile] [directory]

DESCRIPTION
     The savecore utility saves a crash dump of the kernel
     (assuming that one was made) and writes a reboot message in
     the shutdown log. It is invoked by the dumpadm service each
     time the system boots.

     savecore can be configured by dumpadm(1M) to save crash dump    |
     data in either a compressed or uncompressed format. For the     |
     compressed format, savecore saves the crash dump data in the    |
     file directory/vmdump.N, where N in the name is replaced by     |
     a number which grows every time savecore is run in that         |
     directory. The compressed file can be uncompressed in a         |
     separate step using the -f dumpfile option. For the             |
     uncompressed format, savecore saves the crash dump data in      |
     the file directory/vmcore.N and the kernel's namelist in        |
     directory/unix.N.

OPTIONS
     -f dumpfile    Save a crash dump from the specified file        |
                    instead of from the system's current dump        |
                    device. When given directory/vmdump.N,           |
                    uncompress the file to vmcore.N and unix.N,      |
                    where N is the same number as in the             |
                    compressed name.                                 |

                    This option may also be useful if the            |
                    information stored on the dump device has been   |
                    copied to an on-disk file by means of the        |
                    dd(1M) command.

FILES
     directory/vmdump.n                                              |
     directory/vmcore.n
     directory/unix.n

--
Sherry Moore, Solaris Core Kernel    http://blogs.sun.com/sherrym