IBM, could you please test the kernel mentioned in comment #3?

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1678745

Title:
  Ubuntu17.04 KVM: Guest crashed @ xfs_perag_get_tag+0x6c

Status in The Ubuntu-power-systems project:
  Incomplete
Status in linux package in Ubuntu:
  Incomplete

Bug description:
  == Comment: #0 - Lata Kuntal <lakun...@in.ibm.com> - 2017-03-30 09:44:23 ==
  Ubuntu 17.04 KVM guest gusg8 had Ubuntu 16.04.2 installed and was running
  the IO, Base, TCP and NFS stress tests. The guest uses XFS as its root
  filesystem, and after a few hours of regression testing it dropped into xmon.

  Console logs :
  ============
  root@guskvm:~# virsh console gusg8 --force
  Connected to domain gusg8
  Escape character is ^]

  
  1:mon> r
  R00 = d00000000288edf4   R16 = 00000000024200ca
  R01 = c0000000378cb1f0   R17 = 0000000000000000
  R02 = d000000002936080   R18 = 0000000000000020
  R03 = 0000000000000001   R19 = c0000002734d1800
  R04 = c0000000378cb190   R20 = 0000000000000000
  R05 = 0000000000000000   R21 = 0000000000000000
  R06 = 3c000000d03fe056   R22 = c00000027e26ccf0
  R07 = 0000000000000000   R23 = 0000000000000000
  R08 = c0000000048492d0   R24 = 0000000000000000
  R09 = 3c000000d03fe056   R25 = 0000000000000000
  R10 = 3c000000d03fe062   R26 = 000000024df4cd49
  R11 = d0000000028fa360   R27 = 0000000000000000
  R12 = 0000000000000000   R28 = d0000000028ac7b0
  R13 = c00000000fb80900   R29 = c000000004849000
  R14 = 0000000000000000   R30 = 0000000000000000
  R15 = c00000000137ad08   R31 = 0000000000000000
  pc  = d00000000288ee0c xfs_perag_get_tag+0x6c/0x170 [xfs]
  cfar= c00000000096a494 perf_trace_mmc_request_start+0x104/0x440
  lr  = d00000000288edf4 xfs_perag_get_tag+0x54/0x170 [xfs]
  msr = 800000010280b033   cr  = 82428424
  ctr = c0000000005e4950   xer = 0000000020000000   trap =  300
  dar = 3c000000d03fe062   dsisr = 40000000
  1:mon> t
  [c0000000378cb250] d0000000028ac7b0 xfs_reclaim_inodes_count+0x70/0xa0 [xfs]
  [c0000000378cb290] d0000000028c0ea8 xfs_fs_nr_cached_objects+0x28/0x40 [xfs]
  [c0000000378cb2b0] c0000000003292d8 super_cache_count+0x68/0x120
  [c0000000378cb2f0] c000000000271530 shrink_slab.part.14+0x150/0x4f0
  [c0000000378cb430] c000000000276db8 shrink_node+0x158/0x3f0
  [c0000000378cb4f0] c000000000277178 do_try_to_free_pages+0x128/0x460
  [c0000000378cb590] c0000000002775ac try_to_free_pages+0xfc/0x280
  [c0000000378cb620] c000000000260158 __alloc_pages_nodemask+0x758/0xe30
  [c0000000378cb7e0] c0000000002dbb98 alloc_pages_vma+0x108/0x360
  [c0000000378cb880] c00000000029d080 wp_page_copy+0xf0/0x9d0
  [c0000000378cb920] c0000000002a0770 do_wp_page+0x210/0xb20
  [c0000000378cb9b0] c0000000002a656c handle_mm_fault+0x9cc/0x14c0
  [c0000000378cba60] c000000000b511a0 do_page_fault+0x260/0x7d0
  [c0000000378cbb10] c000000000008948 handle_page_fault+0x10/0x30
  --- Exception: 301 (Data Access) at c00000000010aec4 schedule_tail+0x84/0xb0
  [c0000000378cbe30] c000000000009844 ret_from_fork+0x4/0x54
  --- Exception: c00 (System Call) at 00003fffa2b5bf44
  1:mon> d
  0000000000000000 **************** ****************  |                |
  1:mon> c
  cpus stopped: 0x0-0x3
  1:mon>

  Kernel host build
  =============
  root@guskvm:~# uname -r
  4.10.0-13-generic
  root@guskvm:~#

  
  == Comment: #1 - Luciano Chavez <cha...@us.ibm.com> - 2017-03-30 10:42:15 ==
  At first glance, based on the following assembly from around the failure point:

  d00000000288edd4  38c00001      li      r6,1
  d00000000288edd8  7f8802a6      mflr    r28
  d00000000288eddc  78a70020      clrldi  r7,r5,32
  d00000000288ede0  7c7d1b78      mr      r29,r3
  d00000000288ede4  7c852378      mr      r5,r4
  d00000000288ede8  386302c8      addi    r3,r3,712
  d00000000288edec  38810020      addi    r4,r1,32
  d00000000288edf0  4806b571      bl      d0000000028fa360        # exit_xfs_fs+0x180c/0xfd44 [xfs]
  d00000000288edf4  e8410018      ld      r2,24(r1)
  d00000000288edf8  2f830000      cmpwi   cr7,r3,0
  d00000000288edfc  409d0104      ble     cr7,d00000000288ef00    # xfs_perag_get_tag+0x160/0x170 [xfs]
  d00000000288ee00  7c0004ac      sync
  d00000000288ee04  e9210020      ld      r9,32(r1)
  d00000000288ee08  3949000c      addi    r10,r9,12
  d00000000288ee0c  7fc05028      lwarx   r30,0,r10
  d00000000288ee10  33de0001      addic   r30,r30,1
  d00000000288ee14  7fc0512d      stwcx.  r30,0,r10

  I believe the crash in xfs_perag_get_tag() happens after we come back
  from the radix_tree_gang_lookup_tag() call, while attempting the
  atomic_inc_return(): struct xfs_perag *pag is in R09 =
  3c000000d03fe056, which is invalid.

      rcu_read_lock();
      found = radix_tree_gang_lookup_tag(&mp->m_perag_tree,
                      (void **)&pag, first, 1, tag);
      if (found <= 0) {
          rcu_read_unlock();
          return NULL;
      }
      ref = atomic_inc_return(&pag->pag_ref);

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/1678745/+subscriptions

