With those exact steps I managed to crash a real dual-core machine, so
we can rule out virtualization as the cause. The crash seems to depend
only on whatever that command does and, most likely, on having more
than one CPU. (It also crashed a local Xen guest, so I can reproduce
this locally.) The next step is to test the latest upstream kernels to
see whether the problem persists.
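
The testcase recorded in the description below (install the packages,
install chef, then run ohai in an endless loop) could be scripted
roughly as follows. This is only a sketch: repro_loop is a hypothetical
helper name, not part of ohai or chef, and the iteration count is an
addition so the loop can be bounded for a dry run.

```shell
#!/bin/sh
# Hypothetical helper: run a command repeatedly, as in step 3 of the testcase.
# Usage: repro_loop COUNT COMMAND [ARGS...]
# A COUNT of 0 means loop until interrupted (or until the machine crashes).
repro_loop() {
    max=$1
    shift
    i=0
    while [ "$max" -eq 0 ] || [ "$i" -lt "$max" ]; do
        # Output and exit status of the command are ignored; only the
        # repeated execution matters for triggering the crash.
        "$@" >/dev/null 2>&1 || true
        i=$((i + 1))
    done
    echo "completed $i iterations"
}

# Steps 1 and 2 of the testcase, run as root on a disposable test machine:
#   apt-get install build-essential ruby-1.9.3 screen
#   gem install chef
# Step 3, inside a screen session:
#   repro_loop 0 ohai
```

Running it inside screen, as the testcase does, keeps the loop alive if
the SSH session drops before the crash happens.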

** Summary changed:

- Kernel crash on EC2 & VirtualBox
+ Kernel crash in rb_next doin ohai loops

** Description changed:

+ Testcase:
+ 1. apt-get install build-essential ruby-1.9.3 screen
+ 2. gem install chef
+ 3. in screen session: while true; do ohai; done
+ 
+ ---
+ 
  We have a number of small and large instances running the release
  version of 12.04.  The small instances have been completely stable.
  However, every large instance we have has crashed at a seemingly random
  interval.  This is repeatable on individual systems, though not within a
  defined time period.  It appears to be triggered by our half hourly run
  of OpsCode's chef-client.  We tried running the client in a tight loop
  to recreate the crash but were unable to get it to do so in a short time
  period.  It still took two days to crash again.
  
  This was affecting the 3.2.0-23-virtual kernel, so we updated to the
  3.2.0-24-virtual kernel but still have found the same crash.  The only
  information available in the system logs is:
  
  [17605315.391128] BUG: unable to handle kernel NULL pointer dereference at 
0000000000000010
  [17605315.391148] IP: [<ffffffff8130d7f1>] rb_next+0x1/0x50
- [17605315.391163] PGD 1d2fdc067 PUD 1d0e3c067 PMD 0 
- [17605315.391172] Oops: 0000 [#1] SMP 
- [17605315.391179] CPU 1 
+ [17605315.391163] PGD 1d2fdc067 PUD 1d0e3c067 PMD 0
+ [17605315.391172] Oops: 0000 [#1] SMP
+ [17605315.391179] CPU 1
  [17605315.391182] Modules linked in: ipt_REJECT xt_tcpudp nf_conntrack_ipv4 
nf_defrag_ipv4 xt_state nf_conntrack iptable_filter ip_tables x_tables isofs 
acpiphp
- [17605315.391209] 
- [17605315.391214] Pid: 28794, comm: chef-client Not tainted 3.2.0-23-virtual 
#36-Ubuntu  
+ [17605315.391209]
+ [17605315.391214] Pid: 28794, comm: chef-client Not tainted 3.2.0-23-virtual 
#36-Ubuntu
  [17605315.391223] RIP: e030:[<ffffffff8130d7f1>]  [<ffffffff8130d7f1>] 
rb_next+0x1/0x50
  [17605315.391232] RSP: e02b:ffff8801d2659c18  EFLAGS: 00010046
  [17605315.391238] RAX: 0000000000000000 RBX: ffff8801d2eb5a00 RCX: 
0000000000000000
  [17605315.391244] RDX: fffffffffffffff0 RSI: 0000000000000000 RDI: 
0000000000000010
  [17605315.391250] RBP: ffff8801d2659c48 R08: 0000000000000000 R09: 
0000000000000000
  [17605315.391255] R10: ffff8801dff866c0 R11: 0000000000000001 R12: 
0000000000000000
  [17605315.391263] R13: 0000000000000000 R14: 0000000000000000 R15: 
00000000033b9e28
  [17605315.391274] FS:  00007fee8cc10700(0000) GS:ffff8801dff8f000(0000) 
knlGS:0000000000000000
  [17605315.391281] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
  [17605315.391287] CR2: 0000000000000010 CR3: 00000001d2a0b000 CR4: 
0000000000002660
  [17605315.391294] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
0000000000000000
  [17605315.391301] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 
0000000000000400
  [17605315.391308] Process chef-client (pid: 28794, threadinfo 
ffff8801d2658000, task ffff8801d0870000)
  [17605315.391315] Stack:
  [17605315.391319]  ffff8801d2659c48 ffffffff8104ece9 ffff8801d2eb5a00 
ffff8801dffa26c0
  [17605315.391331]  ffff8801d2eb5200 0000000000000000 ffff8801d2659c78 
ffffffff810544b8
  [17605315.391343]  ffff8801d2659c78 ffff8801dffa26c0 0000000000000001 
ffff8801d08703a8
  [17605315.391354] Call Trace:
  [17605315.391364]  [<ffffffff8104ece9>] ? pick_next_entity+0xb9/0xe0
  [17605315.391373]  [<ffffffff810544b8>] pick_next_task_fair+0x38/0x70
  [17605315.391382]  [<ffffffff81652ddc>] __schedule+0x14c/0x6f0
  [17605315.391391]  [<ffffffff816554ee>] ? 
_raw_spin_unlock_irqrestore+0x1e/0x30
  [17605315.391399]  [<ffffffff8165344f>] schedule+0x3f/0x60
  [17605315.391408]  [<ffffffff8117e119>] pipe_wait+0x59/0x80
  [17605315.391417]  [<ffffffff81089340>] ? add_wait_queue+0x60/0x60
  [17605315.391425]  [<ffffffff8117e87a>] pipe_read+0x1da/0x330
  [17605315.391433]  [<ffffffff81174522>] do_sync_read+0xd2/0x110
  [17605315.391443]  [<ffffffff8100a25d>] ? xen_force_evtchn_callback+0xd/0x10
  [17605315.391451]  [<ffffffff8100aa32>] ? check_events+0x12/0x20
  [17605315.391459]  [<ffffffff81298d33>] ? security_file_permission+0x93/0xb0
  [17605315.391466]  [<ffffffff811749a1>] ? rw_verify_area+0x61/0xf0
  [17605315.391473]  [<ffffffff81174e80>] vfs_read+0xb0/0x180
  [17605315.391479]  [<ffffffff81174f9a>] sys_read+0x4a/0x90
  [17605315.391488]  [<ffffffff8165d8c2>] system_call_fastpath+0x16/0x1b
- [17605315.391494] Code: 89 06 48 8b 47 08 48 89 46 08 48 8b 47 10 48 89 46 10 
c3 0f 1f 80 00 00 00 00 48 89 32 eb b2 0f 1f 00 48 89 70 10 eb a9 66 90 55 <48> 
8b 17 48 89 e5 48 89 d0 48 83 e0 fc 48 39 c7 74 34 48 8b 47 
+ [17605315.391494] Code: 89 06 48 8b 47 08 48 89 46 08 48 8b 47 10 48 89 46 10 
c3 0f 1f 80 00 00 00 00 48 89 32 eb b2 0f 1f 00 48 89 70 10 eb a9 66 90 55 <48> 
8b 17 48 89 e5 48 89 d0 48 83 e0 fc 48 39 c7 74 34 48 8b 47
  [17605315.391577] RIP  [<ffffffff8130d7f1>] rb_next+0x1/0x50
  [17605315.391583]  RSP <ffff8801d2659c18>
  [17605315.391587] CR2: 0000000000000010
  [17605315.391596] ---[ end trace 586cfae3c9e3e67e ]---
  
  The stack trace is identical between the two kernels.  I am unable to
  find any reference to this on Ubuntu, Xen, or kernel forums or mailing
  lists but it's repeatable even on freshly installed m1.large instances
  on EC2.
  
  ProblemType: Bug
  DistroRelease: Ubuntu 12.04
  Package: linux-image-3.2.0-24-virtual 3.2.0-24.37
  ProcVersionSignature: Ubuntu 3.2.0-24.37-virtual 3.2.14
  Uname: Linux 3.2.0-24-virtual x86_64
  AcpiTables:
-  
+ 
  AlsaDevices:
-  total 0
-  crw-rw---T 1 root audio 116,  1 May  7 09:58 seq
-  crw-rw---T 1 root audio 116, 33 May  7 09:58 timer
+  total 0
+  crw-rw---T 1 root audio 116,  1 May  7 09:58 seq
+  crw-rw---T 1 root audio 116, 33 May  7 09:58 timer
  AplayDevices: aplay: device_list:252: no soundcards found...
  ApportVersion: 2.0.1-0ubuntu7
  Architecture: amd64
  ArecordDevices: arecord: device_list:252: no soundcards found...
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', 
'/dev/snd/timer'] failed with exit code 1:
  CRDA: Error: command ['iw', 'reg', 'get'] failed with exit code 1: nl80211 
not found.
  Date: Tue May 15 15:23:54 2012
  Ec2AMI: ami-fd1c2789
  Ec2AMIManifest: 
ubuntu-eu-west-1/images-testing/ubuntu-precise-daily-amd64-desktop-20120420.manifest.xml
  Ec2AvailabilityZone: eu-west-1b
  Ec2InstanceType: m1.large
  Ec2Kernel: aki-62695816
  Ec2Ramdisk: unavailable
  IwConfig:
-  lo        no wireless extensions.
-  
-  eth0      no wireless extensions.
+  lo        no wireless extensions.
+ 
+  eth0      no wireless extensions.
  Lspci:
-  
+ 
  Lsusb: Error: command ['lsusb'] failed with exit code 1: unable to initialize 
libusb: -99
  PciMultimedia:
-  
+ 
  ProcEnviron:
-  TERM=xterm-256color
-  PATH=(custom, no user)
-  LANG=en_US.UTF-8
-  SHELL=/bin/bash
+  TERM=xterm-256color
+  PATH=(custom, no user)
+  LANG=en_US.UTF-8
+  SHELL=/bin/bash
  ProcFB:
-  
+ 
  ProcKernelCmdLine: root=LABEL=cloudimg-rootfs ro console=hvc0
  PulseList:
-  Error: command ['pacmd', 'list'] failed with exit code 1: Home directory 
/home/mydrive not ours.
-  No PulseAudio daemon running, or not running as session daemon.
+  Error: command ['pacmd', 'list'] failed with exit code 1: Home directory 
/home/mydrive not ours.
+  No PulseAudio daemon running, or not running as session daemon.
  RelatedPackageVersions:
-  linux-restricted-modules-3.2.0-24-virtual N/A
-  linux-backports-modules-3.2.0-24-virtual  N/A
-  linux-firmware                            1.79
+  linux-restricted-modules-3.2.0-24-virtual N/A
+  linux-backports-modules-3.2.0-24-virtual  N/A
+  linux-firmware                            1.79
  RfKill:
-  
+ 
  SourcePackage: linux
  UpgradeStatus: No upgrade log present (probably fresh install)
  WifiSyslog:

** Summary changed:

- Kernel crash in rb_next doin ohai loops
+ Kernel crash in rb_next doing ohai loops

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/999755

Title:
  Kernel crash in rb_next doing ohai loops

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/999755/+subscriptions

-- 
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
