Bug#723180: linux-image-3.2.0-4-rt-amd64: kernel oops with futexes and gdb reverse-next

2013-09-22 Thread Ben Hutchings
On Tue, 2013-09-17 at 13:04 -0700, Brian Silverman wrote:
> I can see (and download) them both at
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=723180.

Right, I replied to your original message and missed the follow-up.

> I'm just going to paste the backtrace in here, but that won't work for
> the executable, and I'm not sure what to do differently to get the
> attachments to work right.
>
> Also, I forgot to mention before: if it would help, I'd be happy to
> send the source code for the executable. I can also give details of
> how my code is misusing futexes once I figure that out.
[...]

Please do provide the source code.

Ben.

-- 
Ben Hutchings
Beware of programmers who carry screwdrivers. - Leonard Brandwein





2013-09-17 Thread Brian Silverman
Package: src:linux
Version: 3.2.46-1+deb7u1
Severity: normal

I was working on some custom mutex code (implemented using futexes), and
it wasn't working, so I started it up under GDB, waited until
it died, and then tried reverse stepping back to where it did something
wrong. I then got a kernel oops.

Here's exactly what I did:
In GDB, I set a breakpoint in the thread that dies, before the point
at which it dies, then ran `run`, `record`, `cont`, and (after it
crashed) `reverse-next` (it might have been `reverse-step`). My X11
server then went down and dropped me back at a virtual terminal with a
kernel backtrace on it.

I know that there is a bug in the way that the code
uses futexes, but it shouldn't lead to a kernel oops... I'm attaching
both the program that causes this problem (it's compiled for amd64 with
-m32) and the kernel backtrace.
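
For context, a futex-based mutex of the general kind described above can be sketched as follows. This is not the program attached to the report, whose source does not appear in this thread; it is only a minimal illustration, assuming Linux, C11 atomics, and the common three-state design (0 = unlocked, 1 = locked, 2 = locked with possible waiters). The names `futex_mutex`, `futex_mutex_lock`, `futex_mutex_unlock`, and `run_demo` are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical type: 0 = unlocked, 1 = locked, 2 = locked, maybe waiters. */
typedef struct { atomic_int state; } futex_mutex;

/* Thin wrapper around the raw futex system call (no timeout, no second address). */
static long futex(atomic_int *uaddr, int op, int val) {
    return syscall(SYS_futex, uaddr, op, val, NULL, NULL, 0);
}

static void futex_mutex_lock(futex_mutex *m) {
    int c = 0;
    /* Fast path: uncontended 0 -> 1 transition, no system call. */
    if (atomic_compare_exchange_strong(&m->state, &c, 1))
        return;
    /* Contended: mark the lock as having waiters, then sleep until it is free. */
    if (c != 2)
        c = atomic_exchange(&m->state, 2);
    while (c != 0) {
        /* Sleeps only if the state is still 2; otherwise returns at once. */
        futex(&m->state, FUTEX_WAIT_PRIVATE, 2);
        c = atomic_exchange(&m->state, 2);
    }
}

static void futex_mutex_unlock(futex_mutex *m) {
    /* If the old state was 2, someone may be sleeping: reset and wake one. */
    if (atomic_fetch_sub(&m->state, 1) != 1) {
        atomic_store(&m->state, 0);
        futex(&m->state, FUTEX_WAKE_PRIVATE, 1);
    }
}

enum { THREADS = 4, ITERS = 100000 };
static futex_mutex demo_lock;
static long demo_counter;

/* Each worker increments the shared counter under the lock. */
static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < ITERS; i++) {
        futex_mutex_lock(&demo_lock);
        demo_counter++;
        futex_mutex_unlock(&demo_lock);
    }
    return NULL;
}

/* Runs the workers and returns the final counter value. */
long run_demo(void) {
    pthread_t t[THREADS];
    atomic_store(&demo_lock.state, 0);
    demo_counter = 0;
    for (int i = 0; i < THREADS; i++) {
        int rc = pthread_create(&t[i], NULL, worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < THREADS; i++)
        pthread_join(t[i], NULL);
    return demo_counter;
}
```

Calling `run_demo()` from a `main` function and building with something like `gcc -m32 -pthread` would approximate the 32-bit-on-amd64 setup mentioned above. A correct lock should leave the counter equal to `THREADS * ITERS`.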

-- Package-specific info:
** Version:
Linux version 3.2.0-4-rt-amd64 (debian-kernel@lists.debian.org) (gcc version 4.6.3 (Debian 4.6.3-14) ) #1 SMP PREEMPT RT Debian 3.2.46-1+deb7u1

** Command line:
BOOT_IMAGE=/vmlinuz-3.2.0-4-rt-amd64 root=UUID=48656ad1-7e3b-437f-a3ef-74be4d26c33b ro quiet

** Not tainted

** Kernel log:
[7.712779] input: PC Speaker as /devices/platform/pcspkr/input/input6
[7.727162] ACPI: AC Adapter [ADP0] (on-line)
[7.741896] ACPI: Battery Slot [BAT0] (battery present)
[7.743360] iTCO_wdt: Intel TCO WatchDog Timer Driver v1.07
[7.743476] iTCO_wdt: Found a Cougar Point TCO device (Version=2, TCOBASE=0x0460)
[7.743569] iTCO_wdt: initialized. heartbeat=30 sec (nowayout=0)
[7.781701] ACPI: resource :00:1f.3 [io  0xefa0-0xefbf] conflicts with ACPI region SMBI [io 0xefa0-0xefaf]
[7.781704] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
[7.981546] input: Dell WMI hotkeys as /devices/virtual/input/input7
[8.069085] [drm] Initialized drm 1.1.0 20060810
[8.464370] i915 :00:02.0: setting latency timer to 64
[8.510530] i915 :00:02.0: irq 47 for MSI/MSI-X
[8.510537] [drm] Supports vblank timestamp caching Rev 1 (10.10.2010).
[8.510538] [drm] Driver supports precise vblank timestamp query.
[8.510601] vgaarb: device changed decodes: PCI::00:02.0,olddecodes=io+mem,decodes=io+mem:owns=io+mem
[8.551386] cfg80211: Calling CRDA to update world regulatory domain
[8.923083] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
[8.950671] psmouse serio1: synaptics: Touchpad model: 1, fw: 7.5, id: 0x1e0b1, caps: 0xd00073/0x24/0x8a0400
[8.994248] input: SynPS/2 Synaptics TouchPad as /devices/platform/i8042/serio1/input/input8
[9.063676] Intel(R) Wireless WiFi Link AGN driver for Linux, in-tree:
[9.063679] Copyright(c) 2003-2011 Intel Corporation
[9.063803] iwlwifi :01:00.0: setting latency timer to 64
[9.063848] iwlwifi :01:00.0: pci_resource_len = 0x2000
[9.063850] iwlwifi :01:00.0: pci_resource_base = c9000539
[9.063851] iwlwifi :01:00.0: HW Revision ID = 0x34
[9.064176] iwlwifi :01:00.0: irq 48 for MSI/MSI-X
[9.064299] iwlwifi :01:00.0: Detected Intel(R) Centrino(R) Wireless-N 1030 BGN, REV=0xB0
[9.064492] iwlwifi :01:00.0: L1 Enabled; Disabling L0S
[9.082008] iwlwifi :01:00.0: device EEPROM VER=0x716, CALIB=0x6
[9.082016] iwlwifi :01:00.0: Device SKU: 0X150
[9.082022] iwlwifi :01:00.0: Valid Tx ant: 0X1, Valid Rx ant: 0X3
[9.082061] iwlwifi :01:00.0: Tunable channels: 13 802.11bg, 0 802.11a channels
[9.097427] fbcon: inteldrmfb (fb0) is primary device
[9.196474] iwlwifi :01:00.0: firmware: agent loaded iwlwifi-6000g2b-6.ucode into memory
[9.196487] iwlwifi :01:00.0: loaded firmware version 18.168.6.1
[9.197654] Registered led device: phy0-led
[9.380257] Console: switching to colour frame buffer device 170x48
[9.383441] fb0: inteldrmfb frame buffer device
[9.383442] drm: registered panic notifier
[9.412884] ieee80211 phy0: Selected rate control algorithm 'iwl-agn-rs'
[9.419450] acpi device:28: registered as cooling_device4
[9.420391] input: Video Bus as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/LNXVIDEO:00/input/input9
[9.420618] ACPI: Video Device [GFX0] (multi-head: yes  rom: no  post: no)
[9.420662] [drm] Initialized i915 1.6.0 20080730 for :00:02.0 on minor 0
[9.420893] snd_hda_intel :00:1b.0: irq 49 for MSI/MSI-X
[9.421051] snd_hda_intel :00:1b.0: setting latency timer to 64
[9.433461] Bluetooth: Core ver 2.16
[9.433617] NET: Registered protocol family 31
[9.433624] Bluetooth: HCI device and connection manager initialized
[9.433631] Bluetooth: HCI socket layer initialized
[9.433637] Bluetooth: L2CAP socket layer initialized
[9.433669] Bluetooth: SCO socket layer initialized
[9.539877] Bluetooth: Generic Bluetooth USB driver ver 0.6
[9.540331] usbcore: registered new interface driver btusb
[   


2013-09-17 Thread Ben Hutchings
On Mon, 2013-09-16 at 23:13 -0700, Brian Silverman wrote:
> Package: src:linux
> Version: 3.2.46-1+deb7u1
> Severity: normal
>
> I was working on some custom mutex code (implemented using futexes), and
> it wasn't working, so I started it up under GDB, waited until
> it died, and then tried reverse stepping back to where it did something
> wrong. I then got a kernel oops.
>
> Here's exactly what I did:
> In GDB, I set a breakpoint in the thread that dies, before the point
> at which it dies, then ran `run`, `record`, `cont`, and (after it
> crashed) `reverse-next` (it might have been `reverse-step`). My X11
> server then went down and dropped me back at a virtual terminal with a
> kernel backtrace on it.
>
> I know that there is a bug in the way that the code
> uses futexes, but it shouldn't lead to a kernel oops...

Indeed.

> I'm attaching
> both the program that causes this problem (it's compiled for amd64 with
> -m32) and the kernel backtrace.
[...]

The attachments didn't arrive; please try again.

Ben.

-- 
Ben Hutchings
The two most common things in the universe are hydrogen and stupidity.





2013-09-17 Thread Brian Silverman
I can see (and download) them both at
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=723180. I'm just going to
paste the backtrace in here, but that won't work for the executable, and
I'm not sure what to do differently to get the attachments to work right.

Also, I forgot to mention before: if it would help, I'd be happy to send
the source code for the executable. I can also give details of how my code
is misusing futexes once I figure that out.

Here's the stacktrace:

Sep 16 22:14:41 dell-inspiron-linux kernel: [250336.445219] CPU 2
Sep 16 22:14:41 dell-inspiron-linux kernel: [250336.445220] Modules linked in: aes_x86_64 aes_generic parport_pc ppdev lp parport bnep rfcomm cpufreq_stats cpufreq_userspace cpufreq_conservative cpufreq_powersave binfmt_misc uinput fuse nfsd nfs nfs_acl auth_rpcgss fscache lockd sunrpc ext2 mbcache loop kvm_intel kvm uvcvideo videodev v4l2_compat_ioctl32 media arc4 snd_hda_codec_hdmi joydev btusb snd_hda_codec_realtek bluetooth crc16 iwlwifi coretemp i915 snd_hda_intel snd_hda_codec drm_kms_helper snd_hwdep drm snd_pcm snd_page_alloc snd_seq i2c_algo_bit snd_seq_device i2c_i801 dell_wmi snd_timer psmouse sparse_keymap serio_raw acpi_cpufreq mperf dell_laptop mac80211 cfg80211 crc32c_intel snd dcdbas i2c_core ghash_clmulni_intel wmi iTCO_wdt iTCO_vendor_support video evdev soundcore rfkill pcspkr ac battery power_supply cryptd processor button xfs dm_mod sg sr_mod sd_mod cdrom crc_t10dif usbhid hid thermal thermal_sys ahci libahci libata scsi_mod r8169 xhci_hcd mii ehci_hcd usbcore usb_common [last unloaded: scsi_wait_scan]
Sep 16 22:14:41 dell-inspiron-linux kernel: [250336.445277]
Sep 16 22:14:41 dell-inspiron-linux kernel: [250336.445279] Pid: 13661, comm: mutex_test Not tainted 3.2.0-4-rt-amd64 #1 Debian 3.2.46-1+deb7u1 Dell Inc.  Dell System Inspiron N4110/05TM8C
Sep 16 22:14:41 dell-inspiron-linux kernel: [250336.445283] RIP: 0010:[81368e30]  [81368e30] native_irq_enable_sysexit+0x10/0x10
Sep 16 22:14:41 dell-inspiron-linux kernel: [250336.445289] RSP: 0018:  EFLAGS: 00010146
Sep 16 22:14:41 dell-inspiron-linux kernel: [250336.445291] RAX: 00e0 RBX: f7d06cd4 RCX: 0d696910
Sep 16 22:14:41 dell-inspiron-linux kernel: [250336.445293] RDX: cd94 RSI: f7fe8da6 RDI: f7d13a30
Sep 16 22:14:41 dell-inspiron-linux kernel: [250336.445295] RBP: cd2c R08:  R09:
Sep 16 22:14:41 dell-inspiron-linux kernel: [250336.445296] R10:  R11:  R12:
Sep 16 22:14:41 dell-inspiron-linux kernel: [250336.445298] R13:  R14:  R15:
Sep 16 22:14:41 dell-inspiron-linux kernel: [250336.445300] FS:  () GS:88013f10(0063) knlGS:f7cff700
Sep 16 22:14:41 dell-inspiron-linux kernel: [250336.445302] CS:  0010 DS: 002b ES: 002b CR0: 8005003b
Sep 16 22:14:41 dell-inspiron-linux kernel: [250336.445304] CR2: fff8 CR3: a669a000 CR4: 000406e0
Sep 16 22:14:41 dell-inspiron-linux kernel: [250336.445306] DR0:  DR1:  DR2:
Sep 16 22:14:41 dell-inspiron-linux kernel: [250336.445308] DR3:  DR6: 4ff0 DR7: 0400
Sep 16 22:14:41 dell-inspiron-linux kernel: [250336.445310] Process mutex_test (pid: 13661, threadinfo 8800745ac000, task 880135b42440)
Sep 16 22:14:41 dell-inspiron-linux kernel: [250336.445312]  88013f105e40 81621298  88013f105f58
Sep 16 22:14:41 dell-inspiron-linux kernel: [250336.445316]   0ac0  0040
Sep 16 22:14:41 dell-inspiron-linux kernel: [250336.445318]  81010d8b 880135b42440  88013f105eb8
Sep 16 22:14:41 dell-inspiron-linux kernel: [250336.445327]  [81010d8b] ? show_registers+0xac/0x209
Sep 16 22:14:41 dell-inspiron-linux kernel: [250336.445331]  [8136325b] ? __die+0x99/0xd6
Sep 16 22:14:41 dell-inspiron-linux kernel: [250336.445333]  [81011a28] ? die+0x3f/0x5b
Sep 16 22:14:41 dell-inspiron-linux kernel: [250336.445336]  [8100fc75] ? do_double_fault+0x5a/0x5c
Sep 16 22:14:41 dell-inspiron-linux kernel: [250336.445338]  [81368a75] ? double_fault+0x25/0x30
Sep 16 22:14:41 dell-inspiron-linux kernel: [250336.445341]  [81368e30] ? native_irq_enable_sysexit+0x10/0x10
Sep 16 22:14:41 dell-inspiron-linux kernel: [250336.445367]  RSP
Sep 16 22:14:41 dell-inspiron-linux kernel: [250336.512886] ---[ end trace 0002 ]---
Sep 16 22:14:41 dell-inspiron-linux kernel: [250336.513030] [ cut here ]
Sep 16 22:14:41 dell-inspiron-linux kernel: [250336.513036] WARNING: at /build/linux-iWNI5S/linux-3.2.46/debian/build/source_rt/kernel/smp.c:325 smp_call_function_single+0x75/0x12e()
Sep 16 22:14:41