[Kernel-packages] [Bug 1488035] Re: OSDs Linked list corruption causes kernel BUG at /build/buildd/linux-3.13.0/net/ceph/osd_client.c:892!

2015-09-28 Thread Launchpad Bug Tracker
This bug was fixed in the package linux - 3.13.0-65.105

---
linux (3.13.0-65.105) trusty; urgency=low

  [ Brad Figg ]

  * Release Tracking Bug
- LP: #1498108

  [ Upstream Kernel Changes ]

  * net: Fix skb_set_peeked use-after-free bug
  - LP: #1497184

linux (3.13.0-64.104) trusty; urgency=low

  [ Luis Henriques ]

  * Release Tracking Bug
- LP: #1493803

  [ Chris J Arges ]

  * [Config] DEFAULT_IOSCHED="deadline" for ppc64el
- LP: #1469829

  [ Upstream Kernel Changes ]

  * tcp: fix recv with flags MSG_WAITALL | MSG_PEEK
- LP: #1486146
  * libceph: abstract out ceph_osd_request enqueue logic
- LP: #1488035
  * libceph: resend lingering requests with a new tid
- LP: #1488035
  * n_tty: Refactor input_available_p() by call site
- LP: #1397976
  * tty: Fix pty master poll() after slave closes v2
- LP: #1397976
  * md: use kzalloc() when bitmap is disabled
- LP: #1493305
  * ata: pmp: add quirk for Marvell 4140 SATA PMP
- LP: #1493305
  * libata: add ATA_HORKAGE_BROKEN_FPDMA_AA quirk for HP 250GB SATA disk
    VB0250EAVER
- LP: #1493305
  * libata: add ATA_HORKAGE_NOTRIM
- LP: #1493305
  * libata: force disable trim for SuperSSpeed S238
- LP: #1493305
  * libata: increase the timeout when setting transfer mode
- LP: #1493305
  * libata: Do not blacklist M510DC
- LP: #1493305
  * mac80211: clear subdir_stations when removing debugfs
- LP: #1493305
  * ALSA: hda - Add new GPU codec ID 0x10de007d to snd-hda
- LP: #1493305
  * drm: Stop resetting connector state to unknown
- LP: #1493305
  * usb: dwc3: Reset the transfer resource index on SET_INTERFACE
- LP: #1493305
  * usb: xhci: Bugfix for NULL pointer deference in xhci_endpoint_init()
    function
- LP: #1493305
  * xhci: Calculate old endpoints correctly on device reset
- LP: #1493305
  * xhci: report U3 when link is in resume state
- LP: #1493305
  * xhci: prevent bus_suspend if SS port resuming in phase 1
- LP: #1493305
  * xhci: do not report PLC when link is in internal resume state
- LP: #1493305
  * USB: OHCI: Fix race between ED unlink and URB submission
- LP: #1493305
  * usb-storage: ignore ZTE MF 823 card reader in mode 0x1225
- LP: #1493305
  * blkcg: fix gendisk reference leak in blkg_conf_prep()
- LP: #1493305
  * tile: use free_bootmem_late() for initrd
- LP: #1493305
  * Input: usbtouchscreen - avoid unresponsive TSC-30 touch screen
- LP: #1493305
  * md/raid1: fix test for 'was read error from last working device'.
- LP: #1493305
  * mmc: omap_hsmmc: Fix DTO and DCRC handling
- LP: #1493305
  * isdn/gigaset: reset tty->receive_room when attaching ser_gigaset
- LP: #1493305
  * mmc: sdhci-pxav3: fix platform_data is not initialized
- LP: #1493305
  * mmc: block: Add missing mmc_blk_put() in power_ro_lock_show()
- LP: #1493305
  * mmc: sdhci-esdhc: Make 8BIT bus work
- LP: #1493305
  * bonding: correctly handle bonding type change on enslave failure
- LP: #1493305
  * net: Clone skb before setting peeked flag
- LP: #1493305
  * bridge: mdb: fix double add notification
- LP: #1493305
  * usb: gadget: mv_udc_core: fix phy_regs I/O memory leak
- LP: #1493305
  * inet: frags: fix defragmented packet's IP header for af_packet
- LP: #1493305
  * bonding: fix destruction of bond with devices different from
    arphrd_ether
- LP: #1493305
  * ARM: OMAP2+: hwmod: Fix _wait_target_ready() for hwmods without sysc
- LP: #1493305
  * ASoC: pcm1681: Fix setting de-emphasis sampling rate selection
- LP: #1493305
  * iscsi-target: Fix use-after-free during TPG session shutdown
- LP: #1493305
  * iscsi-target: Fix iscsit_start_kthreads failure OOPs
- LP: #1493305
  * iscsi-target: Fix iser explicit logout TX kthread leak
- LP: #1493305
  * ALSA: hda - Apply fixup for another Toshiba Satellite S50D
- LP: #1493305
  * vhost: actually track log eventfd file
- LP: #1493305
  * xfs: remote attributes need to be considered data
- LP: #1493305
  * ALSA: usb-audio: add dB range mapping for some devices
- LP: #1493305
  * drm/radeon/combios: add some validation of lvds values
- LP: #1493305
  * x86/efi: Use all 64 bit of efi_memmap in setup_e820()
- LP: #1493305
  * ipr: Fix locking for unit attention handling
- LP: #1493305
  * ipr: Fix incorrect trace indexing
- LP: #1493305
  * ipr: Fix invalid array indexing for HRRQ
- LP: #1493305
  * ALSA: hda - Fix MacBook Pro 5,2 quirk
- LP: #1493305
  * x86/xen: Probe target addresses in set_aliased_prot() before the
    hypercall
- LP: #1493305
  * netfilter: ctnetlink: put back references to master ct and expect
    objects
- LP: #1493305
  * bridge: mdb: fix delmdb state in the notification
- LP: #1493305
  * ipvs: fix crash with sync protocol v0 and FTP
- LP: #1493305
  * act_pedit: check binding before calling tcf_hash_release()
- LP: #1493305
  * netfilter: 

[Kernel-packages] [Bug 1488035] Re: OSDs Linked list corruption causes kernel BUG at /build/buildd/linux-3.13.0/net/ceph/osd_client.c:892!

2015-09-17 Thread Gavin Guo
** Tags removed: verification-needed-trusty
** Tags added: verification-done-trusty

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1488035

Title:
  OSDs Linked list corruption causes kernel BUG at
  /build/buildd/linux-3.13.0/net/ceph/osd_client.c:892!

Status in linux package in Ubuntu:
  Invalid
Status in linux source package in Trusty:
  Fix Committed

Bug description:
  [Impact]

  A node that has a ceph rbd volume mounted panics when the OSD daemons
  on all ceph nodes are restarted.

  [642981.871592] ------------[ cut here ]------------
  [642981.912255] kernel BUG at /build/buildd/linux-3.13.0/net/ceph/osd_client.c:892!
  [642981.994517] invalid opcode:  [#1] SMP
  [642982.037227] Modules linked in: xt_multiport iptable_mangle xt_nat
  xt_tcpudp veth xfs rbd libceph libcrc32c xt_addrtype xt_conntrack
  ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4
  iptable_filter ip_tables x_tables nf_nat nf_conntrack bridge aufs
  ipmi_devintf joydev gpio_ich x86_pkg_temp_thermal intel_powerclamp coretemp
  kvm_intel kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel
  aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd hid_generic mei_me
  ioatdma mei lpc_ich wmi ipmi_si 8021q garp stp mrp llc bonding
  acpi_power_meter mac_hid lp parport ixgbe usbhid dca tg3 ahci libahci hid
  ptp megaraid_sas mdio pps_core
  [642982.528519] CPU: 0 PID: 1062099 Comm: kworker/0:6 Not tainted 3.13.0-45-generic #74-Ubuntu
  [642982.648057] Hardware name: NEC Express5800/R120f-1M [N8100-2203Y]/MS-S0901, BIOS 5.0.4016 12/17/2014
  [642982.775433] Workqueue: ceph-msgr con_work [libceph]
  [642982.841300] task: 881028444800 ti: 880d92374000 task.ti: 880d92374000
  [642982.973255] RIP: 0010:[a025f5be] [a025f5be] osd_reset+0x22e/0x2c0 [libceph]
  [642983.114484] RSP: 0018:880d92375d80 EFLAGS: 00010283
  [642983.188540] RAX: 8800197f2ca8 RBX: 882028194750 RCX: 880036bcdc48
  [642983.334096] RDX: 8800197f2ca8 RSI: 8800197f2c10 RDI: 0286
  [642983.485552] RBP: 880d92375dd8 R08:  R09:
  [642983.643277] R10: 8160afcf R11: ea00710cae00 R12: 8800197f2c58
  [642983.805364] R13: 882028194810 R14: 880036bcdbf8 R15: 880036bcdc18
  [642983.968728] FS: () GS:88103fa0() knlGS:
  [642984.135368] CS: 0010 DS:  ES:  CR0: 80050033
  [642984.220577] CR2: 7f60d4cb7868 CR3: 01c0e000 CR4: 001407f0
  [642984.383051] Stack:
  [642984.459809] 8820281947a8 882028194760 8800197f2800 8800197f2ca8
  [642984.618038] 880d92375da0 880d92375da0 8800197f2c10 8800197f2830

  [Fix]

  A linked list that the kernel uses to manage OSDs is corrupted when all
  OSD daemons on all ceph nodes are restarted at almost the same time.

  The issue should be fixed by the following:

  libceph: must use new tid when watch is resent
  http://tracker.ceph.com/issues/8806

  The fix consists of two patches, both of which have already been released:

  http://comments.gmane.org/gmane.comp.file-systems.ceph.devel/20878
  [PATCH 1/2] libceph: abstract out ceph_osd_request enqueue logic
  [PATCH 2/2] libceph: resend lingering requests with a new tid

  The 3.18 kernel includes both fixes:

  libceph: abstract out ceph_osd_request enqueue logic
  https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=f671b581f1dac61354186b7373af5f97fe420584
  libceph: resend lingering requests with a new tid
  https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=2cc6128ab2afff7864dbdc33a73e2deaa935d9e0

  [Test Case]

  After setting up the ceph environment, repeatedly issue the following
  command from one node against all ceph nodes.

  rsh -i key -l ubuntu sn_hostname sudo service ceph-all restart

  Then verify whether any panics occur.

  A test kernel with this fix was verified to fix this problem.
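
The test case above could be scripted roughly as follows. This is only a sketch: HOSTS, ROUNDS, KEY and DRY_RUN are hypothetical settings that do not appear in the bug report, and the two function names are made up for illustration. Only the rsh command line is taken from the test case. By default the script just prints the commands it would run.

```shell
#!/bin/sh
# Sketch only. HOSTS, ROUNDS, KEY and DRY_RUN are hypothetical settings;
# the rsh command line itself is copied from the test case.
: "${HOSTS:=sn_hostname}"   # space-separated list of ceph nodes
: "${ROUNDS:=10}"           # how many times to restart every node
: "${KEY:=key}"             # identity file passed to rsh -i
: "${DRY_RUN:=1}"           # 1 = only print the commands

# Restart ceph-all on every host, ROUNDS times in a row.
restart_all() {
    round=1
    while [ "$round" -le "$ROUNDS" ]; do
        for host in $HOSTS; do
            if [ "$DRY_RUN" = 1 ]; then
                echo "rsh -i $KEY -l ubuntu $host sudo service ceph-all restart"
            else
                # Run in the background so all nodes restart at almost
                # the same time, which is the condition in the report.
                rsh -i "$KEY" -l ubuntu "$host" sudo service ceph-all restart &
            fi
        done
        wait    # let every restart in this round finish
        round=$((round + 1))
    done
}

# Reads kernel log text on stdin; succeeds if the BUG from this report appears.
saw_osd_client_bug() {
    grep -q 'kernel BUG at .*net/ceph/osd_client\.c'
}
```

With DRY_RUN=0 and a real host list, restart_all would issue the restarts. Afterwards, whatever console or kern.log capture is available from the client that has the rbd volume mounted can be piped into saw_osd_client_bug to look for the signature shown in [Impact].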

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1488035/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1488035] Re: OSDs Linked list corruption causes kernel BUG at /build/buildd/linux-3.13.0/net/ceph/osd_client.c:892!

2015-09-13 Thread Brad Figg
This bug is awaiting verification that the kernel in -proposed solves
the problem. Please test the kernel and update this bug with the
results. If the problem is solved, change the tag
'verification-needed-trusty' to 'verification-done-trusty'.

If verification is not done within 5 working days from today, this fix
will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on
how to enable and use -proposed. Thank you!
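
Before testing, it may help to confirm that the installed kernel package is new enough to carry the fix. The sketch below compares a version string against 3.13.0-64.104, the changelog entry in this thread that lists the two libceph patches. It relies on GNU `sort -V`, which only approximates Debian version ordering, so `dpkg --compare-versions` remains the authoritative check. The helper names are hypothetical.

```shell
#!/bin/sh
# Sketch only: version_ge and has_libceph_fix are hypothetical helper names.

# Succeeds when version $1 is greater than or equal to version $2,
# using GNU sort -V (an approximation of Debian version ordering).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# 3.13.0-64.104 is the changelog entry that lists the libceph patches.
has_libceph_fix() {
    version_ge "$1" "3.13.0-64.104"
}
```

One possible way to feed it the running kernel's package version is `has_libceph_fix "$(dpkg-query -W -f='${Version}' "linux-image-$(uname -r)")"`, assuming the image package follows the usual linux-image-<release> naming.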


** Tags added: verification-needed-trusty



[Kernel-packages] [Bug 1488035] Re: OSDs Linked list corruption causes kernel BUG at /build/buildd/linux-3.13.0/net/ceph/osd_client.c:892!

2015-08-27 Thread Brad Figg
** Also affects: linux (Ubuntu Trusty)
   Importance: Undecided
   Status: New

** Changed in: linux (Ubuntu Trusty)
   Status: New => Fix Committed

** Changed in: linux (Ubuntu)
   Status: Incomplete => Invalid



[Kernel-packages] [Bug 1488035] Re: OSDs Linked list corruption causes kernel BUG at /build/buildd/linux-3.13.0/net/ceph/osd_client.c:892!

2015-08-25 Thread Gavin Guo
** Description changed:

  [Impact]
  
  The node which mounts a ceph rbd volume causes a panic when all OSD
  daemons on the all ceph nodes are restarted.
  
- [642981.871592] [ cut here ] 
+ [642981.871592] [ cut here ]
  [642981.912255] kernel BUG at
- /build/buildd/linux-3.13.0/net/ceph/osd_client.c:892! 
- [642981.994517] invalid opcode:  [#1] SMP 
+ /build/buildd/linux-3.13.0/net/ceph/osd_client.c:892!
+ [642981.994517] invalid opcode:  [#1] SMP
  [642982.037227] Modules linked in: xt_multiport iptable_mangle xt_nat
  xt_tcpudp veth xfs rbd libceph libcrc32c xt_addrtype xt_conntrack
  ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4
  iptable_filter ip_tables x_tables nf_nat nf_conntrack bridge aufs
  ipmi_devintf joydev gpio_ich x86_pkg_temp_thermal intel_powerclamp coretemp
  kvm_intel kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel
  aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd hid_generic mei_me
  ioatdma mei lpc_ich wmi ipmi_si 8021q garp stp mrp llc bonding
  acpi_power_meter mac_hid lp parport ixgbe usbhid dca tg3 ahci libahci hid
- ptp megaraid_sas mdio pps_core 
+ ptp megaraid_sas mdio pps_core
  [642982.528519] CPU: 0 PID: 1062099 Comm: kworker/0:6 Not tainted
- 3.13.0-45-generic #74-Ubuntu 
- [642982.648057] Hardware name: NEC Express5800/R120f-1M 
- [N8100-2203Y]/MS-S0901, BIOS 5.0.4016 12/17/2014 
- [642982.775433] Workqueue: ceph-msgr con_work [libceph] 
+ 3.13.0-45-generic #74-Ubuntu
+ [642982.648057] Hardware name: NEC Express5800/R120f-1M
+ [N8100-2203Y]/MS-S0901, BIOS 5.0.4016 12/17/2014
+ [642982.775433] Workqueue: ceph-msgr con_work [libceph]
  [642982.841300] task: 881028444800 ti: 880d92374000 task.ti:
- 880d92374000 
+ 880d92374000
  [642982.973255] RIP: 0010:[a025f5be] [a025f5be]
- osd_reset+0x22e/0x2c0 [libceph] 
- [642983.114484] RSP: 0018:880d92375d80 EFLAGS: 00010283 
+ osd_reset+0x22e/0x2c0 [libceph]
+ [642983.114484] RSP: 0018:880d92375d80 EFLAGS: 00010283
  [642983.188540] RAX: 8800197f2ca8 RBX: 882028194750 RCX:
- 880036bcdc48 
+ 880036bcdc48
  [642983.334096] RDX: 8800197f2ca8 RSI: 8800197f2c10 RDI:
- 0286 
+ 0286
  [642983.485552] RBP: 880d92375dd8 R08:  R09:
-  
+ 
  [642983.643277] R10: 8160afcf R11: ea00710cae00 R12:
- 8800197f2c58 
+ 8800197f2c58
  [642983.805364] R13: 882028194810 R14: 880036bcdbf8 R15:
- 880036bcdc18 
+ 880036bcdc18
  [642983.968728] FS: () GS:88103fa0()
- knlGS: 
- [642984.135368] CS: 0010 DS:  ES:  CR0: 80050033 
+ knlGS:
+ [642984.135368] CS: 0010 DS:  ES:  CR0: 80050033
  [642984.220577] CR2: 7f60d4cb7868 CR3: 01c0e000 CR4:
- 001407f0 
- [642984.383051] Stack: 
+ 001407f0
+ [642984.383051] Stack:
  [642984.459809] 8820281947a8 882028194760 8800197f2800
- 8800197f2ca8 
+ 8800197f2ca8
  [642984.618038] 880d92375da0 880d92375da0 8800197f2c10
- 8800197f2830 
+ 8800197f2830
  
  [Fix]
  
  A linked list to manage OSDs in the kernel was corrupted when restarting
- all OSD daemons on all ceph nodes at the almost same time. 
+ all OSD daemons on all ceph nodes at the almost same time.
  
  The issues must be fixed by the following.
  
- libceph: must use new tid when watch is resent 
- http://tracker.ceph.com/issues/8806 
+ libceph: must use new tid when watch is resent
+ http://tracker.ceph.com/issues/8806
  
  This includes two patched and they has been already released.
  
- http://comments.gmane.org/gmane.comp.file-systems.ceph.devel/20878 
- [PATCH 1/2] libceph: abstract out ceph_osd_request enqueue logic 
- [PATCH 2/2] libceph: resend lingering requests with a new tid 
+ http://comments.gmane.org/gmane.comp.file-systems.ceph.devel/20878
+ [PATCH 1/2] libceph: abstract out ceph_osd_request enqueue logic
+ [PATCH 2/2] libceph: resend lingering requests with a new tid
  
  3.18 kernel adopts the fixes.
  
- libceph: abstract out ceph_osd_request enqueue logic 
- 
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=f671b581f1dac61354186b7373af5f97fe420584
 
- libceph: resend lingering requests with a new tid 
- 
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=2cc6128ab2afff7864dbdc33a73e2deaa935d9e0
 
+ libceph: abstract out ceph_osd_request enqueue logic
+ 
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=f671b581f1dac61354186b7373af5f97fe420584
+ libceph: resend lingering requests with a new tid
+ 
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=2cc6128ab2afff7864dbdc33a73e2deaa935d9e0
  
  [Test Case]
  
  After setting up the ceph environment, repeatedly issued the following
- command from a node to 

[Kernel-packages] [Bug 1488035] Re: OSDs Linked list corruption causes kernel BUG at /build/buildd/linux-3.13.0/net/ceph/osd_client.c:892!

2015-08-24 Thread Gavin Guo
** Description changed:

  [Impact]
+ 
+ The node which mounts a ceph rbd volume causes a panic when all OSD
+ daemons on the all ceph nodes are restarted.
+ 
+ [642981.871592] [ cut here ] 
+ [642981.912255] kernel BUG at
+ /build/buildd/linux-3.13.0/net/ceph/osd_client.c:892! 
+ [642981.994517] invalid opcode:  [#1] SMP 
+ [642982.037227] Modules linked in: xt_multiport iptable_mangle xt_nat
+ xt_tcpudp veth xfs rbd libceph libcrc32c xt_addrtype xt_conntrack
+ ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4
+ iptable_filter ip_tables x_tables nf_nat nf_conntrack bridge aufs
+ ipmi_devintf joydev gpio_ich x86_pkg_temp_thermal intel_powerclamp coretemp
+ kvm_intel kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel
+ aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd hid_generic mei_me
+ ioatdma mei lpc_ich wmi ipmi_si 8021q garp stp mrp llc bonding
+ acpi_power_meter mac_hid lp parport ixgbe usbhid dca tg3 ahci libahci hid
+ ptp megaraid_sas mdio pps_core 
+ [642982.528519] CPU: 0 PID: 1062099 Comm: kworker/0:6 Not tainted
+ 3.13.0-45-generic #74-Ubuntu 
+ [642982.648057] Hardware name: NEC Express5800/R120f-1M 
+ [N8100-2203Y]/MS-S0901, BIOS 5.0.4016 12/17/2014 
+ [642982.775433] Workqueue: ceph-msgr con_work [libceph] 
+ [642982.841300] task: 881028444800 ti: 880d92374000 task.ti:
+ 880d92374000 
+ [642982.973255] RIP: 0010:[a025f5be] [a025f5be]
+ osd_reset+0x22e/0x2c0 [libceph] 
+ [642983.114484] RSP: 0018:880d92375d80 EFLAGS: 00010283 
+ [642983.188540] RAX: 8800197f2ca8 RBX: 882028194750 RCX:
+ 880036bcdc48 
+ [642983.334096] RDX: 8800197f2ca8 RSI: 8800197f2c10 RDI:
+ 0286 
+ [642983.485552] RBP: 880d92375dd8 R08:  R09:
+  
+ [642983.643277] R10: 8160afcf R11: ea00710cae00 R12:
+ 8800197f2c58 
+ [642983.805364] R13: 882028194810 R14: 880036bcdbf8 R15:
+ 880036bcdc18 
+ [642983.968728] FS: () GS:88103fa0()
+ knlGS: 
+ [642984.135368] CS: 0010 DS:  ES:  CR0: 80050033 
+ [642984.220577] CR2: 7f60d4cb7868 CR3: 01c0e000 CR4:
+ 001407f0 
+ [642984.383051] Stack: 
+ [642984.459809] 8820281947a8 882028194760 8800197f2800
+ 8800197f2ca8 
+ [642984.618038] 880d92375da0 880d92375da0 8800197f2c10
+ 8800197f2830 
+ 
  [Fix]
+ 
+ A linked list to manage OSDs in the kernel was corrupted when restarting
+ all OSD daemons on all ceph nodes at the almost same time. 
+ 
+ The issues must be fixed by the following.
+ 
+ libceph: must use new tid when watch is resent 
+ http://tracker.ceph.com/issues/8806 
+ 
+ This includes two patched and they has been already released.
+ 
+ http://comments.gmane.org/gmane.comp.file-systems.ceph.devel/20878 
+ [PATCH 1/2] libceph: abstract out ceph_osd_request enqueue logic 
+ [PATCH 2/2] libceph: resend lingering requests with a new tid 
+ 
+ 3.18 kernel adopts the fixes.
+ 
+ libceph: abstract out ceph_osd_request enqueue logic 
+ 
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=f671b581f1dac61354186b7373af5f97fe420584
 
+ libceph: resend lingering requests with a new tid 
+ 
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=2cc6128ab2afff7864dbdc33a73e2deaa935d9e0
 
+ 
  [Test Case]
+ 
+ After setting up the ceph environment, repeatedly issued the following
+ command from a node to all ceph nodes. 
+ 
+ rsh -i key -l ubuntu sn_hostname sudo service ceph-all restart
+ 
+ And verify if there is panics.

** Changed in: linux (Ubuntu)
 Assignee: (unassigned) => Gavin Guo (mimi0213kimo)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1488035

Title:
  OSDs Linked list corruption causes kernel BUG at
  /build/buildd/linux-3.13.0/net/ceph/osd_client.c:892!

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  [Impact]

  The node which mounts a ceph rbd volume causes a panic when all OSD
  daemons on the all ceph nodes are restarted.

  [642981.871592] [ cut here ] 
  [642981.912255] kernel BUG at
  /build/buildd/linux-3.13.0/net/ceph/osd_client.c:892! 
  [642981.994517] invalid opcode:  [#1] SMP 
  [642982.037227] Modules linked in: xt_multiport iptable_mangle xt_nat
  xt_tcpudp veth xfs rbd libceph libcrc32c xt_addrtype xt_conntrack
  ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4
  iptable_filter ip_tables x_tables nf_nat nf_conntrack bridge aufs
  ipmi_devintf joydev gpio_ich x86_pkg_temp_thermal intel_powerclamp coretemp
  kvm_intel kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel
  aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd hid_generic mei_me
  ioatdma mei lpc_ich wmi ipmi_si 8021q garp stp mrp llc bonding