- Kdump ELF vmcores contain NT_PRSTATUS notes for online cpus only, so
  if cpus have been offlined prior to a crash, there will be fewer 
  notes than the number of cpus in the system, and therefore there will
  not be a one-to-one correlation between each cpu and its associated 
  NT_PRSTATUS note.  That causes backtrace failures for architectures 
  like ppc64 that depend upon the contents of the NT_PRSTATUS notes for
  gathering the starting stack location.
  (chan...@in.ibm.com, ander...@redhat.com)

- Fix and enhancement for the "dev" command.  When the command was run 
  against 2.6.26 or later kernels, it would fail with the error message
  "dev: invalid structure member offset: char_device_struct_fops".
  Additionally, even when the command did work, more often than not it
  would fail to determine the file_operations structure associated with 
  the block or character device, and erroneously display "(none)" or
  "(unused)".  This patch makes a more comprehensive search for the 
  file_operations structure, and instead of just displaying its address
  and symbolic translation, it will display the address of the data 
  structure that contains the pointer to the file_operations structure,
  along with the symbolic translation of the file_operations structure.
  For character devices, the containing structure is a "cdev", and for 
  block devices the containing structure is a "gendisk".  The command
  output adds new CDEV and GENDISK columns, and under the OPERATIONS
  column is the symbolic translation of its file_operations structure.
  (ander...@redhat.com, bob.montgom...@hp.com)

- Fix for a potential segmentation violation when running "foreach bt"
  on a very active live system with many processes starting and ending.
  Without the patch, a segmentation violation could occur when a "bt"
  was attempted on a task that had become non-existent.  This would
  happen on x86_64 or ppc64 machines, and was due to the usage of a 
  kernel stack pointer taken from a stale/invalid task_struct.  The 
  command will now recognize the bad stack pointer and display the 
  error message "bt: task no longer exists" or "bt: invalid/stale 
  stack pointer for this task: <address>".

- Fix to correctly read LKCD Version 8 and later x86 dumpfile headers.

- If a kdump NMI issued to a non-crashing x86_64 cpu was received while
  running in schedule(), after having set the next task as "current" in
  the cpu's runqueue, but prior to changing the kernel stack to that of 
  the next task, then a backtrace would fail to make the transition 
  from the NMI exception stack back to the process stack, with the 
  error message "bt: cannot transition from exception stack to current 
  process stack".  This patch will report inconsistencies found between
  a task marked as the current task in a cpu's runqueue, and the task
  found in the per-cpu x8664_pda "pcurrent" field (2.6.29 and earlier) 
  or the per-cpu "current_task" variable (2.6.30 and later).  If it can
  be safely determined that the runqueue setting (used by default) is
  premature, then the crash utility's internal per-cpu active task will
  be changed to be the task indicated by the appropriate architecture
  specific value.  Also, a new "set -a <task>" option has been added
  to manually set a task to be the "active" task on its cpu. 

- Fix for x86_64 "bt" command when transitioning from the IRQ stack 
  back to the process stack on 2.6.29 and later kernels.  Without the
  patch, the interrupt exception frame address on the process stack 
  would be incorrectly determined, and its display would typically be 
  preceded by "[exception RIP: unknown or invalid address]", and the
  backtrace would fail from that point on.

- Enhancement to the "runq" command to show the current task in each 
  cpu's runqueue, plus a few formatting changes to make the output
  easier to understand.  

- Fix for a memory leak when running on live systems, due to the 
  repetitive reallocation of the internal array of active tasks.  

- Fix for usage with vmlinux debuginfo files using Dwarf 3 format, 
  for example, the Fedora 2.6.31-0.24.rc0.git18.fc12 kernel.  Without
  the patch, the crash session fails during initialization with the
  error message: "Dwarf Error: wrong version in compilation unit header 
  (is 3, should be 2) [in module <path-to>/vmlinux]", followed by 
  the erroneous message "crash: <path-to>/vmlinux: no debugging 
  data available".  The patch simply accepts the Dwarf 3 header, and
  the embedded gdb-6.1 version still appears to work with the updated
  vmlinux debuginfo file format. 

- Fix for an erroneous invocation failure when a System.map file is used as
  an argument with a compressed diskdump or compressed kdump dumpfile.
  If the System.map argument appears after the vmcore file on the 
  command line, as in: "crash vmcore System.map vmlinux", the crash 
  session fails immediately with the error message: "crash: vmcore: 
  initialization failed".  With the patch, the arguments may be entered
  in any order.

- Fix for a potential segmentation violation during invocation if a 
  vmcore file, a System.map file, and a non-matching vmlinux file are 
  used as command line arguments.  The problem is that whenever a
  System.map file is used, it is presumed that the user knows what he
  is doing, and that the vmlinux file is not the same as the kernel 
  that generated the vmcore; therefore the vmlinux/vmcore matching and
  verification routines are not performed.  However, if the kernel data
  structures in the non-matching vmlinux vary widely enough from the 
  kernel that generated the vmcore, all manner of bogus data may be 
  read and consumed.  The reported segmentation violation occurred when 
  using a vmcore created from a "stock" Red Hat kernel with a vmlinux 
  file from a Red Hat "debug" kernel, where the kernel data structures 
  are significantly different.  The patch adds several new defensive 
  mechanisms, and displays additional warning messages, when invalid or
  questionable data is read, and as a result the crash session will fail 
  in a more reasonable manner.

- Adjusted several virtual and physical memory address definitions.
  Without the patch, when run against CONFIG_SPARSEMEM_VMEMMAP 2.6.31 kernels, 
  the "kmem -i" option would hang, and when run against CONFIG_SLUB and
  CONFIG_SPARSEMEM_VMEMMAP 2.6.31 kernels, the "kmem -s" option would 
  report numerous errors indicating "kmem: read error: kernel virtual 
  address: <address>  type: page inuse", where the <address> was
  a legitimate virtual-memmap page structure address.

- Improvement for CONFIG_SLUB "kmem -s" or "kmem -S" options when an
  invalid slab page link address is encountered.  Without the patch,
  the commands fail with a generic "invalid kernel virtual address" 
  read error message, and "kmem -s" would not display any previously 
  collected statistics.  With the patch, the error message displays 
  the slab cache name, the list type, and the invalid pointer found, 
  for example, "kmem: dentry: partial list: page.lru.next: 100100".
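
As an illustration of the NT_PRSTATUS item at the top of this list, here
is a minimal sketch of how notes that exist only for online cpus could be
matched back to cpu numbers.  It is not the crash utility's actual code:
the function name, the bitmask representation of the online cpus, and the
assumption that notes are stored in ascending online-cpu order are all
hypothetical and chosen only for this example.

```python
def map_notes_to_cpus(online_mask, num_cpus, notes):
    """Return a list indexed by cpu number.

    Entry i holds the note for cpu i, or None if cpu i was offline.
    Assumes (hypothetically) that notes are in ascending online-cpu order.
    """
    # Collect the cpu numbers whose bit is set in the online mask.
    online = [cpu for cpu in range(num_cpus) if online_mask & (1 << cpu)]

    # With one note per online cpu, the two counts must agree.
    if len(online) != len(notes):
        raise ValueError(
            "expected %d notes for online cpus, found %d"
            % (len(online), len(notes))
        )

    # Offline cpus keep None, so a lookup by cpu number never picks up
    # a note that belongs to a different cpu.
    per_cpu = [None] * num_cpus
    for cpu, note in zip(online, notes):
        per_cpu[cpu] = note
    return per_cpu


# Four cpus with cpu 2 offline: three notes map to cpus 0, 1 and 3.
print(map_notes_to_cpus(0b1011, 4, ["note-a", "note-b", "note-c"]))
```

Indexing the notes directly by cpu number, without a step like this,
would hand cpu 2 the note that belongs to cpu 3 in the example above.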

Download from: http://people.redhat.com/anderson
