Re: kernel oops in 'typec_ucsi' due to commit 'drivers property: When no children in primary, try secondary'
On Thu, Jul 16, 2020 at 09:22:11PM +0300, Maxim Levitsky wrote:
> On Thu, 2020-07-16 at 21:21 +0300, Andy Shevchenko wrote:
> > On Thu, Jul 16, 2020 at 09:00:00PM +0300, Maxim Levitsky wrote:
> > > On Thu, 2020-07-16 at 18:47 +0300, Andy Shevchenko wrote:

...

> > > It works (no more oops)
> >
> > Thanks for testing. I'm about to send formal patch, can you give your
> > Tested-by tag there then?
>
> Of course.
>
> Tested-by: Maxim Levitsky

Thanks, I meant there [1] :-)

[1]: https://lore.kernel.org/lkml/20200716182747.54929-1-andriy.shevche...@linux.intel.com/T/#u

-- 
With Best Regards,
Andy Shevchenko
Re: kernel oops in 'typec_ucsi' due to commit 'drivers property: When no children in primary, try secondary'
On Thu, Jul 16, 2020 at 09:00:00PM +0300, Maxim Levitsky wrote:
> On Thu, 2020-07-16 at 18:47 +0300, Andy Shevchenko wrote:
> > On Thu, Jul 16, 2020 at 11:17:03AM +0300, Maxim Levitsky wrote:
> > > Hi!
> > >
> > > Few days ago I bisected a regression on 5.8 kernel:
> > >
> > > I have nvidia rtx 2070s and its USB type C port driver (which is open source)
> > > started to crash on load:
> >
> > ...
> >
> > > Reverting the commit helped fix this oops.
> > >
> > > My .config attached.
> > > If any more info is needed I'll be happy to provide it,
> > > and of course test patches.
> >
> > Can you test below?
> >
> > diff --git a/drivers/base/property.c b/drivers/base/property.c
> > index 1e6d75e65938..d58aa98fe964 100644
> > --- a/drivers/base/property.c
> > +++ b/drivers/base/property.c
> > @@ -721,7 +721,7 @@ struct fwnode_handle *device_get_next_child_node(struct device *dev,
> >  		return next;
> >
> >  	/* When no more children in primary, continue with secondary */
> > -	if (!IS_ERR_OR_NULL(fwnode->secondary))
> > +	if (fwnode && !IS_ERR_OR_NULL(fwnode->secondary))
> >  		next = fwnode_get_next_child_node(fwnode->secondary, child);
> >
> >  	return next;
>
> It works (no more oops)

Thanks for testing. I'm about to send formal patch, can you give your
Tested-by tag there then?

-- 
With Best Regards,
Andy Shevchenko
Re: kernel oops in 'typec_ucsi' due to commit 'drivers property: When no children in primary, try secondary'
On Thu, 2020-07-16 at 21:21 +0300, Andy Shevchenko wrote:
> On Thu, Jul 16, 2020 at 09:00:00PM +0300, Maxim Levitsky wrote:
> > On Thu, 2020-07-16 at 18:47 +0300, Andy Shevchenko wrote:
> > > On Thu, Jul 16, 2020 at 11:17:03AM +0300, Maxim Levitsky wrote:
> > > > Hi!
> > > >
> > > > Few days ago I bisected a regression on 5.8 kernel:
> > > >
> > > > I have nvidia rtx 2070s and its USB type C port driver (which is open source)
> > > > started to crash on load:
> > >
> > > ...
> > >
> > > > Reverting the commit helped fix this oops.
> > > >
> > > > My .config attached.
> > > > If any more info is needed I'll be happy to provide it,
> > > > and of course test patches.
> > >
> > > Can you test below?
> > >
> > > diff --git a/drivers/base/property.c b/drivers/base/property.c
> > > index 1e6d75e65938..d58aa98fe964 100644
> > > --- a/drivers/base/property.c
> > > +++ b/drivers/base/property.c
> > > @@ -721,7 +721,7 @@ struct fwnode_handle *device_get_next_child_node(struct device *dev,
> > >  		return next;
> > >
> > >  	/* When no more children in primary, continue with secondary */
> > > -	if (!IS_ERR_OR_NULL(fwnode->secondary))
> > > +	if (fwnode && !IS_ERR_OR_NULL(fwnode->secondary))
> > >  		next = fwnode_get_next_child_node(fwnode->secondary, child);
> > >
> > >  	return next;
> >
> > It works (no more oops)
>
> Thanks for testing. I'm about to send formal patch, can you give your
> Tested-by tag there then?

Of course.

Tested-by: Maxim Levitsky

Best regards,
	Maxim Levitsky
Re: kernel oops in 'typec_ucsi' due to commit 'drivers property: When no children in primary, try secondary'
On Thu, 2020-07-16 at 18:47 +0300, Andy Shevchenko wrote:
> On Thu, Jul 16, 2020 at 11:17:03AM +0300, Maxim Levitsky wrote:
> > Hi!
> >
> > Few days ago I bisected a regression on 5.8 kernel:
> >
> > I have nvidia rtx 2070s and its USB type C port driver (which is open source)
> > started to crash on load:
>
> ...
>
> > Reverting the commit helped fix this oops.
> >
> > My .config attached.
> > If any more info is needed I'll be happy to provide it,
> > and of course test patches.
>
> Can you test below?
>
> diff --git a/drivers/base/property.c b/drivers/base/property.c
> index 1e6d75e65938..d58aa98fe964 100644
> --- a/drivers/base/property.c
> +++ b/drivers/base/property.c
> @@ -721,7 +721,7 @@ struct fwnode_handle *device_get_next_child_node(struct device *dev,
>  		return next;
>
>  	/* When no more children in primary, continue with secondary */
> -	if (!IS_ERR_OR_NULL(fwnode->secondary))
> +	if (fwnode && !IS_ERR_OR_NULL(fwnode->secondary))
>  		next = fwnode_get_next_child_node(fwnode->secondary, child);
>
>  	return next;

It works (no more oops)

Best regards,
	Maxim Levitsky
Re: kernel oops in 'typec_ucsi' due to commit 'drivers property: When no children in primary, try secondary'
On Thu, 2020-07-16 at 17:34 +0300, Andy Shevchenko wrote: > On Thu, Jul 16, 2020 at 11:17:03AM +0300, Maxim Levitsky wrote: > > Hi! > > > > Few days ago I bisected a regression on 5.8 kernel: > > > > I have nvidia rtx 2070s and its USB type C port driver (which is open > > source) > > started to crash on load: > > I'm looking at this, but I have questions: > - any pointers to the device tree excerpt which this tries to iterate over > - can you provide full Code: line? > > Only way I see, why it happens, is that fwnode is not initialized properly > somewhere (means it has garbage in the secondary pointer). > > > [ +0.43] CPU: 19 PID: 31281 Comm: kworker/19:1 Tainted: PW O > > 5.8.0-rc3.stable #133 > > [ +0.45] Hardware name: Gigabyte Technology Co., Ltd. TRX40 > > DESIGNARE/TRX40 DESIGNARE, BIOS F4c 03/05/2020 > > [ +0.30] Workqueue: events_long ucsi_init_work [typec_ucsi] > > [ +0.48] RIP: 0010:device_get_next_child_node+0x5b/0xb0 > > [ +0.24] Code: 18 48 85 db 74 24 48 8b 43 08 48 85 c0 74 1b 48 8b 40 > > 50 48 85 c0 74 12 48 89 ee 48 89 df ff d0 48 85 c0 74 05 5b 5d 41 5c c3 > > <48> 8b 03 48 85 c0 74 f3 48> > > [ +0.65] RSP: 0018:c900038d7e08 EFLAGS: 00010246 > > [ +0.44] RAX: 889fb6b62f00 RBX: RCX: > > 0001 > > [ +0.27] RDX: 889fb6fd4a70 RSI: RDI: > > 889fb6b63608 > > [ +0.46] RBP: R08: 0001 R09: > > 7fff > > [ +0.24] R10: 2075ce282580 R11: 0062de3e R12: > > 889fb6b63608 > > [ +0.43] R13: 0001 R14: 889fb6b63018 R15: > > 0001 > > [ +0.44] FS: () GS:889fbe4c() > > knlGS: > > [ +0.24] CS: 0010 DS: ES: CR0: 80050033 > > [ +0.42] CR2: CR3: 00175621b000 CR4: > > 00340ea0 > > [ +0.46] Call Trace: > > [ +0.30] ucsi_init+0x213/0x530 [typec_ucsi] > > [ +0.28] ucsi_init_work+0x12/0x20 [typec_ucsi] > > [ +0.49] process_one_work+0x1d2/0x390 > > [ +0.27] worker_thread+0x4a/0x3b0 > > [ +0.25] ? process_one_work+0x390/0x390 > > [ +0.49] kthread+0xf9/0x130 > > [ +0.26] ? 
kthread_park+0x90/0x90 > > [ +0.28] ret_from_fork+0x1f/0x30 > > [ +0.48] Modules linked in: ucsi_ccg typec_ucsi typec hfsplus cdrom > > ntfs msdos vfio_pci vfio_virqfd vfio_iommu_type1 vfio vhost_net vhost > > vhost_iotlb tap xfs rfcomm xt_M> > > [ +0.39] usb_storage ext4 mbcache jbd2 amdgpu gpu_sched ttm > > drm_kms_helper syscopyarea sysfillrect ahci sysimgblt fb_sys_fops > > crc32_pclmul libahci crc32c_intel igb ccp > > > [ +0.000289] CR2: > > [ +0.26] ---[ end trace 38ebb9aebd55fbff ]--- > > [ +0.014201] RIP: 0010:device_get_next_child_node+0x5b/0xb0 > > [ +0.30] Code: 18 48 85 db 74 24 48 8b 43 08 48 85 c0 74 1b 48 8b 40 > > 50 48 85 c0 74 12 48 89 ee 48 89 df ff d0 48 85 c0 74 05 5b 5d 41 5c c3 > > <48> 8b 03 48 85 c0 74 f3 48> > > [ +0.75] RSP: 0018:c900038d7e08 EFLAGS: 00010246 > > [ +0.27] RAX: 889fb6b62f00 RBX: RCX: > > 0001 > > [ +0.48] RDX: 889fb6fd4a70 RSI: RDI: > > 889fb6b63608 > > [ +0.49] RBP: R08: 0001 R09: > > 7fff > > [ +0.27] R10: 2075ce282580 R11: 0062de3e R12: > > 889fb6b63608 > > [ +0.49] R13: 0001 R14: 889fb6b63018 R15: > > 0001 > > [ +0.50] FS: () GS:889fbe4c() > > knlGS: > > [ +0.27] CS: 0010 DS: ES: CR0: 80050033 > > [ +0.50] CR2: CR3: 00175621b000 CR4: > > 00340ea0 > > > > I bisected this, while passing the UCSI controller to a VM, and this > > is the result: > > > > git bisect start > > # good: [3d77e6a8804abcc0504c904bd6e5cdf3a5cf8162] Linux 5.7 > > git bisect good 3d77e6a8804abcc0504c904bd6e5cdf3a5cf8162 > > # bad: [48778464bb7d346b47157d21ffde2af6b2d39110] Linux 5.8-rc2 > > git bisect bad 48778464bb7d346b47157d21ffde2af6b2d39110 > > # good: [a98f670e41a99f53acb1fb33cee9c6abbb2e6f23] Merge tag 'media/v5.8-1' > > of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media > > git bisect good a98f670e41a99f53acb1fb33cee9c6abbb2e6f23 > > # good: [081096d98bb23946f16215357b141c5616b234bf] Merge tag 'tty-5.8-rc1' > > of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty > > git bisect good 
081096d98bb23946f16215357b141c5616b234bf > > # bad: [3a2a8751742133a7bbc49b9d1bcbd52e212edff6] Merge tag 'for-v5.8' of > > git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply > > git bisect bad 3a2a8751742133a7bbc49b9d1bcbd52e212edff6 > > # bad: [a1e81f9654eef650d3ee35c94a8cab00b5cd379c] m68k: implement > > flush_icache_user_range > > git bisect bad
Re: kernel oops in 'typec_ucsi' due to commit 'drivers property: When no children in primary, try secondary'
On Thu, Jul 16, 2020 at 11:17:03AM +0300, Maxim Levitsky wrote:
> Hi!
>
> Few days ago I bisected a regression on 5.8 kernel:
>
> I have nvidia rtx 2070s and its USB type C port driver (which is open source)
> started to crash on load:

...

> Reverting the commit helped fix this oops.
>
> My .config attached.
> If any more info is needed I'll be happy to provide it,
> and of course test patches.

Can you test below?

diff --git a/drivers/base/property.c b/drivers/base/property.c
index 1e6d75e65938..d58aa98fe964 100644
--- a/drivers/base/property.c
+++ b/drivers/base/property.c
@@ -721,7 +721,7 @@ struct fwnode_handle *device_get_next_child_node(struct device *dev,
 		return next;

 	/* When no more children in primary, continue with secondary */
-	if (!IS_ERR_OR_NULL(fwnode->secondary))
+	if (fwnode && !IS_ERR_OR_NULL(fwnode->secondary))
 		next = fwnode_get_next_child_node(fwnode->secondary, child);

 	return next;

-- 
With Best Regards,
Andy Shevchenko
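The one-line change in the patch guards the `fwnode->secondary` dereference for the case where the device has no firmware node at all. A minimal user-space sketch of that pattern follows. Everything in it is a hypothetical stand-in written for illustration: `struct node`, `next_child()`, `get_next_child_guarded()` and the local `is_err_or_null()` are not the kernel's definitions, and the iterator is simplified to at most one child per node.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: mimic the kernel's ERR_PTR convention, where the top 4095
 * addresses encode negative errno values rather than real pointers. */
#define MAX_ERRNO 4095

static int is_err_or_null(const void *p)
{
	return p == NULL || (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for struct fwnode_handle. */
struct node {
	struct node *secondary;   /* optional fallback node, may be an error pointer */
	struct node *first_child; /* simplified: at most one child */
};

/* Simplified iterator: yields the single child on the first call only. */
static struct node *next_child(const struct node *n, const struct node *prev)
{
	if (n == NULL || prev != NULL)
		return NULL;
	return n->first_child;
}

/* Try the primary node first, then fall back to its secondary. */
static struct node *get_next_child_guarded(const struct node *n,
					    const struct node *prev)
{
	struct node *next = next_child(n, prev);

	if (next)
		return next;

	/* Guard: n may be NULL when the device has no firmware node,
	 * so check it before touching n->secondary. */
	if (n && !is_err_or_null(n->secondary))
		next = next_child(n->secondary, prev);

	return next;
}
```

Without the `n &&` check, calling `get_next_child_guarded(NULL, NULL)` would read `secondary` through a NULL pointer, which is the kind of access the patch avoids.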
Re: kernel oops in 'typec_ucsi' due to commit 'drivers property: When no children in primary, try secondary'
On Thu, Jul 16, 2020 at 11:17:03AM +0300, Maxim Levitsky wrote: > Hi! > > Few days ago I bisected a regression on 5.8 kernel: > > I have nvidia rtx 2070s and its USB type C port driver (which is open source) > started to crash on load: I'm looking at this, but I have questions: - any pointers to the device tree excerpt which this tries to iterate over - can you provide full Code: line? Only way I see, why it happens, is that fwnode is not initialized properly somewhere (means it has garbage in the secondary pointer). > [ +0.43] CPU: 19 PID: 31281 Comm: kworker/19:1 Tainted: PW O > 5.8.0-rc3.stable #133 > [ +0.45] Hardware name: Gigabyte Technology Co., Ltd. TRX40 > DESIGNARE/TRX40 DESIGNARE, BIOS F4c 03/05/2020 > [ +0.30] Workqueue: events_long ucsi_init_work [typec_ucsi] > [ +0.48] RIP: 0010:device_get_next_child_node+0x5b/0xb0 > [ +0.24] Code: 18 48 85 db 74 24 48 8b 43 08 48 85 c0 74 1b 48 8b 40 50 > 48 85 c0 74 12 48 89 ee 48 89 df ff d0 48 85 c0 74 05 5b 5d 41 5c c3 <48> 8b > 03 48 85 c0 74 f3 48> > [ +0.65] RSP: 0018:c900038d7e08 EFLAGS: 00010246 > [ +0.44] RAX: 889fb6b62f00 RBX: RCX: > 0001 > [ +0.27] RDX: 889fb6fd4a70 RSI: RDI: > 889fb6b63608 > [ +0.46] RBP: R08: 0001 R09: > 7fff > [ +0.24] R10: 2075ce282580 R11: 0062de3e R12: > 889fb6b63608 > [ +0.43] R13: 0001 R14: 889fb6b63018 R15: > 0001 > [ +0.44] FS: () GS:889fbe4c() > knlGS: > [ +0.24] CS: 0010 DS: ES: CR0: 80050033 > [ +0.42] CR2: CR3: 00175621b000 CR4: > 00340ea0 > [ +0.46] Call Trace: > [ +0.30] ucsi_init+0x213/0x530 [typec_ucsi] > [ +0.28] ucsi_init_work+0x12/0x20 [typec_ucsi] > [ +0.49] process_one_work+0x1d2/0x390 > [ +0.27] worker_thread+0x4a/0x3b0 > [ +0.25] ? process_one_work+0x390/0x390 > [ +0.49] kthread+0xf9/0x130 > [ +0.26] ? 
kthread_park+0x90/0x90 > [ +0.28] ret_from_fork+0x1f/0x30 > [ +0.48] Modules linked in: ucsi_ccg typec_ucsi typec hfsplus cdrom ntfs > msdos vfio_pci vfio_virqfd vfio_iommu_type1 vfio vhost_net vhost vhost_iotlb > tap xfs rfcomm xt_M> > [ +0.39] usb_storage ext4 mbcache jbd2 amdgpu gpu_sched ttm > drm_kms_helper syscopyarea sysfillrect ahci sysimgblt fb_sys_fops > crc32_pclmul libahci crc32c_intel igb ccp > > [ +0.000289] CR2: > [ +0.26] ---[ end trace 38ebb9aebd55fbff ]--- > [ +0.014201] RIP: 0010:device_get_next_child_node+0x5b/0xb0 > [ +0.30] Code: 18 48 85 db 74 24 48 8b 43 08 48 85 c0 74 1b 48 8b 40 50 > 48 85 c0 74 12 48 89 ee 48 89 df ff d0 48 85 c0 74 05 5b 5d 41 5c c3 <48> 8b > 03 48 85 c0 74 f3 48> > [ +0.75] RSP: 0018:c900038d7e08 EFLAGS: 00010246 > [ +0.27] RAX: 889fb6b62f00 RBX: RCX: > 0001 > [ +0.48] RDX: 889fb6fd4a70 RSI: RDI: > 889fb6b63608 > [ +0.49] RBP: R08: 0001 R09: > 7fff > [ +0.27] R10: 2075ce282580 R11: 0062de3e R12: > 889fb6b63608 > [ +0.49] R13: 0001 R14: 889fb6b63018 R15: > 0001 > [ +0.50] FS: () GS:889fbe4c() > knlGS: > [ +0.27] CS: 0010 DS: ES: CR0: 80050033 > [ +0.50] CR2: CR3: 00175621b000 CR4: > 00340ea0 > > I bisected this, while passing the UCSI controller to a VM, and this > is the result: > > git bisect start > # good: [3d77e6a8804abcc0504c904bd6e5cdf3a5cf8162] Linux 5.7 > git bisect good 3d77e6a8804abcc0504c904bd6e5cdf3a5cf8162 > # bad: [48778464bb7d346b47157d21ffde2af6b2d39110] Linux 5.8-rc2 > git bisect bad 48778464bb7d346b47157d21ffde2af6b2d39110 > # good: [a98f670e41a99f53acb1fb33cee9c6abbb2e6f23] Merge tag 'media/v5.8-1' > of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media > git bisect good a98f670e41a99f53acb1fb33cee9c6abbb2e6f23 > # good: [081096d98bb23946f16215357b141c5616b234bf] Merge tag 'tty-5.8-rc1' of > git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty > git bisect good 081096d98bb23946f16215357b141c5616b234bf > # bad: [3a2a8751742133a7bbc49b9d1bcbd52e212edff6] Merge tag 'for-v5.8' of > 
git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply > git bisect bad 3a2a8751742133a7bbc49b9d1bcbd52e212edff6 > # bad: [a1e81f9654eef650d3ee35c94a8cab00b5cd379c] m68k: implement > flush_icache_user_range > git bisect bad a1e81f9654eef650d3ee35c94a8cab00b5cd379c > # good: [c336c022503d1be719ca06f2526c211709e3d2d3] staging: wfx: remove false > positive warning > git bisect good c336c022503d1be719ca06f2526c211709e3d2d3 > # good: [05c8a4fc44a916dd897769ca69b42381f9177ec4] habanalabs: correctly cast > u64
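One way to read the `Code:` line quoted above, offered as a sketch and not as a confirmed diagnosis: the marked bytes `<48> 8b 03` decode to a 64-bit load from `(%rbx)` with no displacement, while the earlier `48 8b 43 08` loads from `0x8(%rbx)`. If `struct fwnode_handle` begins with `secondary` followed by `ops` (an assumption here; the mirror struct below is a simplified copy written for illustration, and the offset of 8 assumes a 64-bit target), then the faulting load would be `fwnode->secondary` read through a NULL `fwnode`. The register values in the archived dump are incomplete, so this cannot be checked against RBX from the log alone.

```c
#include <assert.h>
#include <stddef.h>

struct fwnode_operations_mirror; /* opaque, only used as a pointer target */

/* Simplified mirror with an assumed layout: secondary first, then ops. */
struct fwnode_handle_mirror {
	struct fwnode_handle_mirror *secondary;
	const struct fwnode_operations_mirror *ops;
};

/* Offset a load of ->secondary would use (displacement in the instruction). */
static size_t secondary_offset(void)
{
	return offsetof(struct fwnode_handle_mirror, secondary);
}

/* Offset a load of ->ops would use; one pointer width past the start. */
static size_t ops_offset(void)
{
	return offsetof(struct fwnode_handle_mirror, ops);
}
```

Under these assumptions `secondary_offset()` is 0 and `ops_offset()` is one pointer width, which lines up with the two displacements seen in the byte stream.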
Re: kernel oops in 'typec_ucsi' due to commit 'drivers property: When no children in primary, try secondary'
On Thu, 2020-07-16 at 10:28 +0200, Greg KH wrote:
> On Thu, Jul 16, 2020 at 11:17:03AM +0300, Maxim Levitsky wrote:
> > Hi!
> >
> > Few days ago I bisected a regression on 5.8 kernel:
> >
> > I have nvidia rtx 2070s and its USB type C port driver (which is open source)
>
> Is that driver merged into the tree? If not, do you have a pointer to
> it somewhere?
>
> thanks,
>
> greg k-h
>

It is in the tree.

CONFIG_TYPEC_UCSI selects the generic UCSI driver.
CONFIG_UCSI_CCG selects the hardware driver, which is an i2c driver
that binds to an i2c device (I think with address 0x8) on an i2c
controller, which is exposed by function 3 of the NVIDIA card, and uses
the CONFIG_I2C_NVIDIA_GPU driver.

We also have CONFIG_TYPEC_NVIDIA_ALTMODE; I haven't researched what it does.

Best regards,
	Maxim Levitsky
Re: kernel oops in 'typec_ucsi' due to commit 'drivers property: When no children in primary, try secondary'
On Thu, Jul 16, 2020 at 11:17:03AM +0300, Maxim Levitsky wrote:
> Hi!
>
> Few days ago I bisected a regression on 5.8 kernel:
>
> I have nvidia rtx 2070s and its USB type C port driver (which is open source)

Is that driver merged into the tree? If not, do you have a pointer to
it somewhere?

thanks,

greg k-h
Re: Kernel Oops on 4.8.0-rc8 while running trinity tests
The kernel oops is still reproducible on 4.8.0-rc8 on PowerPC bare metal While running trinity system call fuzzer, I see these kernel oops messages: Unable to handle kernel paging request for data at address 0xe45f7702 Faulting instruction address: 0xc0055380 Oops: Kernel access of bad area, sig: 11 [#1] SMP NR_CPUS=32 NUMA PowerNV Modules linked in: torture leds_powernv led_class powernv_op_panel powernv_rng rng_core autofs4 [last unloaded: rcutorture] CPU: 28 PID: 19687 Comm: trinity-main Not tainted 4.8.0-rc8-autotest #1 task: c007dc61c600 task.stack: c007ddb2 NIP: c0055380 LR: c0234968 CTR: REGS: c007ddb23640 TRAP: 0300 Not tainted (4.8.0-rc8-autotest) MSR: 90009033 CR: 24002442 XER: CFAR: c00087d0 DAR: e45f7702 DSISR: 4000 SOFTE: 1 GPR00: 0007 c007ddb238c0 c0f7c100 c000 GPR04: 0009 GPR08: e45f7702 007f 0015 GPR12: c000 1000 GPR16: 0001 c2e02798 10034120 GPR20: 10034108 c007ddf842e0 c0ff0df8 GPR24: c1fff7ff c007ddb23a60 0100 GPR28: 0100 c2e02400 c2e02464 NIP [c0055380] __find_linux_pte_or_hugepte+0x1c0/0x330 LR [c0234968] __unmap_hugepage_range+0x178/0x670 Call Trace: [c007ddb23980] [c0234e80] __unmap_hugepage_range_final+0x20/0x50 [c007ddb239b0] [c020a52c] unmap_single_vma+0xcc/0x120 [c007ddb239f0] [c020a984] unmap_vmas+0x84/0x120 [c007ddb23a40] [c0212c00] unmap_region+0xd0/0x1a0 [c007ddb23b30] [c0214e8c] do_munmap+0x2dc/0x4a0 [c007ddb23ba0] [c0216800] mmap_region+0x1c0/0x6e0 [c007ddb23c90] [c02170fc] do_mmap+0x3dc/0x4c0 [c007ddb23d20] [c01f1034] vm_mmap_pgoff+0xc4/0x100 [c007ddb23d90] [c02144d0] SyS_mmap_pgoff+0x100/0x2a0 [c007ddb23e10] [c0012424] sys_mmap+0x44/0x70 [c007ddb23e30] [c00095e0] system_call+0x38/0x108 Instruction dump: 7d290030 79081764 3929 3860 7d2a07b4 7c895c36 7d494838 78630044 7908f5e6 79291f24 7d081b78 796b0020 <7d49402a> 7c694214 2eaa f941ffd0 ---[ end trace f4f25c6801290199 ]--- On Friday 26 August 2016 12:02 PM, Abdul Haleem wrote: Hi, Trinity tests failed on mainline4.8.0-rc3with the following error message: Machine Type : PowerPC Bare 
Metal & also reproducible on PowerVM LPAR config : attached 06:05:25 20:36:07 INFO | Test: running trinity tests 06:05:25 20:36:07 INFO | trinity 06:05:25 20:36:07 INFO | STARTtrinity trinity timestamp=1471912567localtime=Aug 22 20:36:07 06:06:19 Unable to handle kernel paging request for data at address 0xe475e1dc0700 06:06:19 Faulting instruction address: 0xc00553a0 06:06:19 Oops: Kernel access of bad area, sig: 11 [#1] 06:06:19 SMP NR_CPUS=32 NUMA PowerNV 06:06:19 Modules linked in: torture iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp tun bridge stp llc iptable_filter ip_tables x_tables binfmt_misc kvm_hv kvm leds_powernv led_class powernv_op_panel powernv_rng rng_core autofs4 btrfs xor raid6_pq [last unloaded: rcutorture] 06:06:19 CPU: 24 PID: 16309 Comm: trinity-main Not tainted 4.8.0-rc3-autotest #1 06:06:19 task: c007de33 task.stack: c007d85dc000 06:06:19 NIP: c00553a0 LR: c02345a8 CTR: 06:06:19 REGS: c007d85df640 TRAP: 0300 Not tainted (4.8.0-rc3-autotest) 06:06:19 MSR: 90009033 CR: 24002452 XER: 06:06:19 CFAR: c00087d0 DAR: e475e1dc0700 DSISR: 4000 SOFTE: 1 06:06:19 GPR00: 0007 c007d85df8c0 c0f7ad00 c000 06:06:19 GPR04: 0009 0700 06:06:19 GPR08: e475e1dc0700 007f 0015 06:06:19 GPR12: cfffe000 1000 06:06:19 GPR16: 0001 c007ddfa6798 100341e0 06:06:19 GPR20: 100341c8 c007dc336508 c0ff0df8 06:06:19 GPR24: c1fff7ff c007d85dfa60 0100 06:06:19 GPR28: 0100 c007ddfa6400 c007ddfa6464 0007 06:06:19 NIP [c00553a0] __find_linux_pte_or_hugepte+0x1c0/0x330 06:06:19 LR [c02345a8] __unmap_hugepage_range+0x178/0x670 06:06:19 Call Trace: 06:06:19 [c007d85df980] [c0234ac0] __unmap_hugepage_range_final+0x20/0x50 06:06:19 [c007d85df9b0] [c020a16c]
Re: kernel OOPS in MM(?)
Hello, On 2016-03-10 12:31, Evgenii Lepikhin wrote: > We need help to understand the source of the problem and may be to create a > bugreport. Here is crash report: > > Mar 10 04:03:51 l28 kernel: [2075560.434445] BUG: unable to handle kernel > paging request at 40008021 > Mar 10 04:03:51 l28 kernel: [2075560.434669] IP: [] > __kmalloc+0x69/0x100 > Mar 10 04:03:51 l28 kernel: [2075560.434800] PGD b7e462067 PUD 0 > Mar 10 04:03:51 l28 kernel: [2075560.434913] Oops: [#1] SMP > Mar 10 04:03:51 l28 kernel: [2075560.435044] Modules linked in: > tcm_loop iscsi_target_mod target_core_pscsi target_core_file > target_core_iblock target_core_mod ipt_NETFLOW(O) configfs iscsi_tcp > libis > csi_tcp libiscsi scsi_transport_iscsi fuse [last unloaded: ipfw_mod] > Mar 10 04:03:51 l28 kernel: [2075560.435539] CPU: 4 PID: 27141 Comm: rm > Tainted: G O 3.12.51-jl-2015-12-25 #1 > Mar 10 04:03:51 l28 kernel: [2075560.435734] Hardware name: Intel Corporation > S2600IP ../S2600IP, BIOS SE5C600.86B.01.08.0003.022620131521 > 02/26/2013 > Mar 10 04:03:51 l28 kernel: [2075560.435939] task: 880e622ccba0 ti: > 880eeb008000 task.ti: 880eeb008000 > Mar 10 04:03:51 l28 kernel: [2075560.436131] RIP: 0010:[] > [] __kmalloc+0x69/0x100 > Mar 10 04:03:51 l28 kernel: [2075560.436333] RSP: 0018:880eeb009b38 > EFLAGS: 00010282 > Mar 10 04:03:51 l28 kernel: [2075560.436439] RAX: RBX: > RCX: a8a73dc2 > Mar 10 04:03:51 l28 kernel: [2075560.436632] RDX: a8a73dc1 RSI: > RDI: 00013500 > Mar 10 04:03:51 l28 kernel: [2075560.438248] RBP: 880eeb009b58 R08: > 88103fc13500 R09: 811a0267 > Mar 10 04:03:51 l28 kernel: [2075560.438446] R10: 880eeb009d84 R11: > R12: 88081f803a00 > Mar 10 04:03:51 l28 kernel: [2075560.438656] R13: 40008021 R14: > 0250 R15: 880250e833b0 > Mar 10 04:03:51 l28 kernel: [2075560.438851] FS: 7fe2316dd700() > GS:88103fc0() knlGS: > Mar 10 04:03:51 l28 kernel: [2075560.439045] CS: 0010 DS: ES: CR0: > 80050033 > Mar 10 04:03:51 l28 kernel: [2075560.439152] CR2: 40008021 CR3: > 000a20736000 CR4: 
000407e0 > Mar 10 04:03:51 l28 kernel: [2075560.439343] Stack: > Mar 10 04:03:51 l28 kernel: [2075560.439439] > 0250 0060 > Mar 10 04:03:51 l28 kernel: [2075560.439663] 880eeb009b88 > 811a0267 881015fb7fe0 0060 > Mar 10 04:03:51 l28 kernel: [2075560.439898] 880250e83490 > 880eeb009ba8 811a02f8 > Mar 10 04:03:51 l28 kernel: [2075560.440153] Call Trace: > Mar 10 04:03:51 l28 kernel: [2075560.440257] [] > kmem_alloc+0x67/0xe0 > Mar 10 04:03:51 l28 kernel: [2075560.440365] [] > kmem_zalloc+0x18/0x40 > Mar 10 04:03:51 l28 kernel: [2075560.440473] [] > xfs_log_commit_cil+0x373/0x4c0 > Mar 10 04:03:51 l28 kernel: [2075560.440585] [] ? > xfs_bmap_search_multi_extents+0xe0/0x110 > Mar 10 04:03:51 l28 kernel: [2075560.440783] [] > xfs_trans_commit+0x6c/0x250 > Mar 10 04:03:51 l28 kernel: [2075560.440899] [] > xfs_bmap_finish+0xb7/0x1a0 Another issue on the same server, same instruction pointer: Mar 16 04:53:54 l28 kernel: [521052.387878] BUG: unable to handle kernel paging request at 40008021 Mar 16 04:53:54 l28 kernel: [521052.388022] IP: [] __kmalloc+0x69/0x100 Mar 16 04:53:54 l28 kernel: [521052.388171] PGD 0 Mar 16 04:53:54 l28 kernel: [521052.388289] Oops: [#1] SMP Mar 16 04:53:54 l28 kernel: [521052.388410] Modules linked in: tcm_loop iscsi_target_mod target_core_pscsi target_core_file target_core_iblock target_core_mod ipt_NETFLOW(O) configfs iscsi_tcp libis csi_tcp libiscsi scsi_transport_iscsi fuse Mar 16 04:53:54 l28 kernel: [521052.388913] CPU: 6 PID: 5947 Comm: iscsi_trx Tainted: G O 3.12.51-jl-2015-12-25 #1 Mar 16 04:53:54 l28 kernel: [521052.389125] Hardware name: Intel Corporation S2600IP ../S2600IP, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013 Mar 16 04:53:54 l28 kernel: [521052.389351] task: 88081a3a6720 ti: 8808162de000 task.ti: 8808162de000 Mar 16 04:53:54 l28 kernel: [521052.389566] RIP: 0010:[] [] __kmalloc+0x69/0x100 Mar 16 04:53:54 l28 kernel: [521052.389782] RSP: 0018:8808162dfd18 EFLAGS: 00010286 Mar 16 04:53:54 l28 kernel: [521052.389899] 
RAX: RBX: 880819a51800 RCX: 03b305d3 Mar 16 04:53:54 l28 kernel: [521052.390112] RDX: 03b305d2 RSI: RDI: 00013500 Mar 16 04:53:54 l28 kernel: [521052.390309] RBP: 8808162dfd38 R08: 88103fd13500 R09: a00e7072 Mar 16 04:53:54 l28 kernel: [521052.390503] R10: 0010 R11: 0030 R12: 88081f803a00 Mar 16 04:53:54 l28 kernel: [521052.390694] R13: 40008021 R14: 80d0 R15: 8808162dfdd0 Mar 16
Re: kernel OOPS in MM(?)
Hello,

On 2016-03-10 12:31, Evgenii Lepikhin wrote:
> We need help to understand the source of the problem and may be to create a
> bugreport. Here is crash report:
>
> Mar 10 04:03:51 l28 kernel: [2075560.434445] BUG: unable to handle kernel paging request at 40008021
> Mar 10 04:03:51 l28 kernel: [2075560.434669] IP: [] __kmalloc+0x69/0x100
> Mar 10 04:03:51 l28 kernel: [2075560.434800] PGD b7e462067 PUD 0
> Mar 10 04:03:51 l28 kernel: [2075560.434913] Oops: [#1] SMP
> Mar 10 04:03:51 l28 kernel: [2075560.435044] Modules linked in: tcm_loop iscsi_target_mod target_core_pscsi target_core_file target_core_iblock target_core_mod ipt_NETFLOW(O) configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi fuse [last unloaded: ipfw_mod]
> Mar 10 04:03:51 l28 kernel: [2075560.435539] CPU: 4 PID: 27141 Comm: rm Tainted: G O 3.12.51-jl-2015-12-25 #1
> Mar 10 04:03:51 l28 kernel: [2075560.435734] Hardware name: Intel Corporation S2600IP ../S2600IP, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013
> Mar 10 04:03:51 l28 kernel: [2075560.435939] task: 880e622ccba0 ti: 880eeb008000 task.ti: 880eeb008000
> Mar 10 04:03:51 l28 kernel: [2075560.436131] RIP: 0010:[] [] __kmalloc+0x69/0x100
> Mar 10 04:03:51 l28 kernel: [2075560.436333] RSP: 0018:880eeb009b38 EFLAGS: 00010282
> Mar 10 04:03:51 l28 kernel: [2075560.436439] RAX: RBX: RCX: a8a73dc2
> Mar 10 04:03:51 l28 kernel: [2075560.436632] RDX: a8a73dc1 RSI: RDI: 00013500
> Mar 10 04:03:51 l28 kernel: [2075560.438248] RBP: 880eeb009b58 R08: 88103fc13500 R09: 811a0267
> Mar 10 04:03:51 l28 kernel: [2075560.438446] R10: 880eeb009d84 R11: R12: 88081f803a00
> Mar 10 04:03:51 l28 kernel: [2075560.438656] R13: 40008021 R14: 0250 R15: 880250e833b0
> Mar 10 04:03:51 l28 kernel: [2075560.438851] FS: 7fe2316dd700() GS:88103fc0() knlGS:
> Mar 10 04:03:51 l28 kernel: [2075560.439045] CS: 0010 DS: ES: CR0: 80050033
> Mar 10 04:03:51 l28 kernel: [2075560.439152] CR2: 40008021 CR3: 000a20736000 CR4: 000407e0
> Mar 10 04:03:51 l28 kernel: [2075560.439343] Stack:
> Mar 10 04:03:51 l28 kernel: [2075560.439439] 0250 0060
> Mar 10 04:03:51 l28 kernel: [2075560.439663] 880eeb009b88 811a0267 881015fb7fe0 0060
> Mar 10 04:03:51 l28 kernel: [2075560.439898] 880250e83490 880eeb009ba8 811a02f8
> Mar 10 04:03:51 l28 kernel: [2075560.440153] Call Trace:
> Mar 10 04:03:51 l28 kernel: [2075560.440257] [] kmem_alloc+0x67/0xe0
> Mar 10 04:03:51 l28 kernel: [2075560.440365] [] kmem_zalloc+0x18/0x40
> Mar 10 04:03:51 l28 kernel: [2075560.440473] [] xfs_log_commit_cil+0x373/0x4c0
> Mar 10 04:03:51 l28 kernel: [2075560.440585] [] ? xfs_bmap_search_multi_extents+0xe0/0x110
> Mar 10 04:03:51 l28 kernel: [2075560.440783] [] xfs_trans_commit+0x6c/0x250
> Mar 10 04:03:51 l28 kernel: [2075560.440899] [] xfs_bmap_finish+0xb7/0x1a0

Another issue on the same server, same instruction pointer:

Mar 16 04:53:54 l28 kernel: [521052.387878] BUG: unable to handle kernel paging request at 40008021
Mar 16 04:53:54 l28 kernel: [521052.388022] IP: [] __kmalloc+0x69/0x100
Mar 16 04:53:54 l28 kernel: [521052.388171] PGD 0
Mar 16 04:53:54 l28 kernel: [521052.388289] Oops: [#1] SMP
Mar 16 04:53:54 l28 kernel: [521052.388410] Modules linked in: tcm_loop iscsi_target_mod target_core_pscsi target_core_file target_core_iblock target_core_mod ipt_NETFLOW(O) configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi fuse
Mar 16 04:53:54 l28 kernel: [521052.388913] CPU: 6 PID: 5947 Comm: iscsi_trx Tainted: G O 3.12.51-jl-2015-12-25 #1
Mar 16 04:53:54 l28 kernel: [521052.389125] Hardware name: Intel Corporation S2600IP ../S2600IP, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013
Mar 16 04:53:54 l28 kernel: [521052.389351] task: 88081a3a6720 ti: 8808162de000 task.ti: 8808162de000
Mar 16 04:53:54 l28 kernel: [521052.389566] RIP: 0010:[] [] __kmalloc+0x69/0x100
Mar 16 04:53:54 l28 kernel: [521052.389782] RSP: 0018:8808162dfd18 EFLAGS: 00010286
Mar 16 04:53:54 l28 kernel: [521052.389899] RAX: RBX: 880819a51800 RCX: 03b305d3
Mar 16 04:53:54 l28 kernel: [521052.390112] RDX: 03b305d2 RSI: RDI: 00013500
Mar 16 04:53:54 l28 kernel: [521052.390309] RBP: 8808162dfd38 R08: 88103fd13500 R09: a00e7072
Mar 16 04:53:54 l28 kernel: [521052.390503] R10: 0010 R11: 0030 R12: 88081f803a00
Mar 16 04:53:54 l28 kernel: [521052.390694] R13: 40008021 R14: 80d0 R15: 8808162dfdd0
Mar 16
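Both reports above fault at the same address (40008021) inside __kmalloc+0x69, and in both register dumps R13 holds that same value. A quick way to spot this kind of match when comparing oopses is to pull the fault address and the general-purpose registers out of the text and see which registers carry the faulting value. The following is only a rough sketch: the function names are made up for illustration, the excerpt is copied as the archive shows it (some digits were already lost upstream), and a match only hints at which pointer was dereferenced; it does not by itself identify the cause.

```python
import re

# Excerpt from the first report above, with the syslog prefix and quote
# markers dropped. Values are copied exactly as they appear in the archive.
OOPS = """\
BUG: unable to handle kernel paging request at 40008021
RDX: a8a73dc1 RSI: RDI: 00013500
RBP: 880eeb009b58 R08: 88103fc13500 R09: 811a0267
R13: 40008021 R14: 0250 R15: 880250e833b0
"""

FAULT_RE = re.compile(r"unable to handle kernel paging request at ([0-9a-f]+)")
# General-purpose registers only; RIP and RSP carry a segment prefix and
# are deliberately left out of this sketch.
REG_RE = re.compile(r"\b(R(?:AX|BX|CX|DX|SI|DI|BP|\d\d)):[ \t]+([0-9a-f]+)\b")


def fault_address(text):
    """Return the faulting address as an int, or None if no such line exists."""
    match = FAULT_RE.search(text)
    return int(match.group(1), 16) if match else None


def registers(text):
    """Map register name to value for every register that still has a value."""
    return {name: int(value, 16) for name, value in REG_RE.findall(text)}


def registers_holding_fault(text):
    """Names of registers whose value equals the faulting address."""
    addr = fault_address(text)
    if addr is None:
        return []
    return sorted(name for name, value in registers(text).items() if value == addr)


if __name__ == "__main__":
    print(registers_holding_fault(OOPS))
```

Registers whose value was stripped by the archive (for example RSI in the excerpt) are simply skipped rather than guessed.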
Re: kernel oops on mmotm-2015-10-15-15-20
On Thu, Nov 19, 2015 at 08:58:27AM +0200, Kirill A. Shutemov wrote: > On Thu, Nov 19, 2015 at 11:12:21AM +0900, Minchan Kim wrote: > > On Tue, Nov 17, 2015 at 11:32:13AM +0200, Kirill A. Shutemov wrote: > > > On Tue, Nov 17, 2015 at 04:35:39PM +0900, Minchan Kim wrote: > > > > On Mon, Nov 16, 2015 at 12:54:53PM +0200, Kirill A. Shutemov wrote: > > > > > On Mon, Nov 16, 2015 at 07:32:20PM +0900, Minchan Kim wrote: > > > > > > On Mon, Nov 16, 2015 at 10:45:22AM +0200, Kirill A. Shutemov wrote: > > > > > > > On Mon, Nov 16, 2015 at 10:45:21AM +0900, Minchan Kim wrote: > > > > > > > > During the test with MADV_FREE on kernel I applied your patches, > > > > > > > > I couldn't see any problem. > > > > > > > > > > > > > > > > However, in this round, I did another test which is same one > > > > > > > > I attached but a liitle bit different because it doesn't do > > > > > > > > (memcg things/kill/swapoff) for testing program long-live test. > > > > > > > > > > > > > > Could you share updated test? > > > > > > > > > > > > It's part of my testing suite so I should factor it out. > > > > > > I will send it when I go to office tomorrow. > > > > > > > > > > Thanks. > > > > > > > > > > > > And could you try to reproduce it on clean mmotm-2015-11-10-15-53? > > > > > > > > > > > > Befor leaving office, I queued it up and result is below. > > > > > > It seems you fixed already but didn't apply it to mmotm yet. Right? > > > > > > Anyway, please confirm and say to me what I should add more patches > > > > > > into mmotm-2015-11-10-15-53 for follow up your recent many bug > > > > > > fix patches. > > > > > > > > > > The two my patches which are not in the mmotm-2015-11-10-15-53 > > > > > release: > > > > > > > > > > http://lkml.kernel.org/g/1447236557-68682-1-git-send-email-kirill.shute...@linux.intel.com > > > > > http://lkml.kernel.org/g/1447236567-68751-1-git-send-email-kirill.shute...@linux.intel.com > > > > > > > > 1. mm: fix __page_mapcount() > > > > 2. 
thp: fix leak due split_huge_page() vs. exit race > > > > > > > > If I missed some patches, let me know it. > > > > > > > > I applied above two patches based on mmotm-2015-11-10-15-53 and tested > > > > again. > > > > But unfortunately, the result was below. > > > > > > > > Now, I am making test program I can send to you but it seems to be not > > > > easy > > > > because small changes for factoring it out from testing suite seems to > > > > change > > > > something(ex, timing) and makes hard to reproduce. I will try it again. > > > > > > Your test suite seems generate quite a few bug reports. Don't mind make > > > whole > > > suite public? > > > > It's tough due to including company internal stuffs. > > That's why I try to factor the part I can share out but unfortunatel, > > I couldn't grab a time for retrying until now. :( > > > > > > > > > page:ea240080 count:2 mapcount:1 mapping:88007eff3321 > > > > index:0x60e02 > > > > flags: 0x40040018(uptodate|dirty|swapbacked) > > > > page dumped because: VM_BUG_ON_PAGE(!PageLocked(page)) > > > > page->mem_cgroup:880077cf0c00 > > > > [ cut here ] > > > > kernel BUG at mm/huge_memory.c:3272! 
> > > > invalid opcode: [#1] SMP > > > > Dumping ftrace buffer: > > > >(ftrace buffer empty) > > > > Modules linked in: > > > > CPU: 8 PID: 59 Comm: khugepaged Not tainted 4.3.0-mm1-kirill+ #8 > > > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs > > > > 01/01/2011 > > > > task: 880073441a40 ti: 88007344c000 task.ti: 88007344c000 > > > > RIP: 0010:[] [] > > > > split_huge_page_to_list+0x8fb/0x910 > > > > RSP: 0018:88007344f968 EFLAGS: 00010286 > > > > RAX: 0021 RBX: ea240080 RCX: > > > > RDX: 0001 RSI: 0246 RDI: 821df4d8 > > > > RBP: 88007344f9e8 R08: R09: 880bc600 > > > > R10: 8163e2c0 R11: 4b47 R12: ea240080 > > > > R13: ea240088 R14: ea240080 R15: > > > > FS: () GS:88007830() > > > > knlGS: > > > > CS: 0010 DS: ES: CR0: 8005003b > > > > CR2: 7ffd59edcd68 CR3: 01808000 CR4: 06a0 > > > > Stack: > > > > cccd ea240080 88007344fa00 ea240088 > > > > 88007344fa00 88007344f9e8 810f0200 > > > > ea24 ea240080 > > > > Call Trace: > > > > [] ? __lock_page+0xa0/0xb0 > > > > [] deferred_split_scan+0x115/0x240 > > > > [] ? list_lru_count_one+0x1c/0x30 > > > > [] shrink_slab.part.42+0x1e3/0x350 > > > > [] shrink_zone+0x26a/0x280 > > > > [] do_try_to_free_pages+0x12d/0x3b0 > > > > [] try_to_free_pages+0xb4/0x140 > > > > [] __alloc_pages_nodemask+0x459/0x920 > > > > [] ? trace_event_raw_event_tick_stop+0xd0/0xd0 > > > > [] khugepaged+0x155/0x1b10 > > > >
Re: kernel oops on mmotm-2015-10-15-15-20
> On Nov 19, 2015, at 14:58, Kirill A. Shutemov wrote:
>
> uncharged

I also encountered this crash, and I also hit a crash like this in qemu:

[2.703436] [] do_execveat_common.isra.36+0x4f0/0x630
[2.703624] [] do_execve+0x24/0x30
[2.703767] [] SyS_execve+0x1c/0x2c
[2.703923] BUG: Bad page map in process init pte:604837ebd3 pmd:b29e7003
[2.704140] page:ffc07f00af80 count:2 mapcount:-1 mapping: (null) index:0x1
[2.704414] flags: 0x4014(referenced|dirty)
[2.704563] page dumped because: bad pte
[2.704666] addr:007fafb7e000 vm_flags:00100073 anon_vma:ffc0729bdb90 mapping: (null) index:7fafb7e
[2.704906] file: (null) fault: (null) mmap: (null) readpage: (null)
[2.705117] CPU: 0 PID: 84 Comm: init Tainted: GB 4.2.0ajb-5-g11a9bf3 #80
[2.705315] Hardware name: ranchu (DT)
[2.705408] Call trace:
[2.705488] [] dump_backtrace+0x0/0x124
[2.705657] [] show_stack+0x10/0x1c
[2.705797] [] dump_stack+0x78/0x98
[2.705971] [] print_bad_pte+0x154/0x1f0
[2.706102] [] unmap_single_vma+0x574/0x704
[2.706236] [] unmap_vmas+0x54/0x70
[2.706354] [] exit_mmap+0x88/0xfc
[2.706473] [] mmput+0x48/0xe8
[2.706584] [] flush_old_exec+0x30c/0x79c
[2.706719] [] load_elf_binary+0x21c/0x1098
[2.706856] [] search_binary_handler+0xa8/0x224
[2.706995] [] do_execveat_common.isra.36+0x4f0/0x630
[2.707144] [] do_execve+0x24/0x30
[2.707263] [] SyS_execve+0x1c/0x2c
[2.707392] BUG: Bad page map in process init pte:604837fbd3 pmd:b29e7003
[2.707752] page:ffc07f00afc0 count:2 mapcount:-1 mapping: (null) index:0x1
[2.708167] flags: 0x4014(referenced|dirty)
[2.708333] page dumped because: bad pte
[2.708501] addr:007fafb7f000 vm_flags:00100073 anon_vma:ffc0729bdb90 mapping: (null) index:7fafb7f
[2.709084] file: (null) fault: (null) mmap: (null) readpage: (null)
[2.709306] CPU: 0 PID: 84 Comm: init Tainted: GB 4.2.0ajb-5-g11a9bf3 #80
[2.709494] Hardware name: ranchu (DT)

It seems the page map count is not correct.
My build is based on mmotm-2015-10-21-14-41.

Thanks
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
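The remark about the page map count lines up with the dumps themselves: both "Bad page map" entries print mapcount:-1. As far as I understand, a page that is still mapped by a PTE should never show a negative mapcount, and that condition is what leads to this report; take that as an assumption on my side, the thread does not confirm it. A small sketch for pulling the counters out of such dump lines and flagging the suspicious pages (function names are made up for illustration, and the third sample line is the page dump from the khugepaged report earlier in this thread, included only for contrast):

```python
import re

# Page dump lines copied from the reports in this thread, timestamps dropped.
DUMPS = """\
page:ffc07f00af80 count:2 mapcount:-1 mapping: (null) index:0x1
page:ffc07f00afc0 count:2 mapcount:-1 mapping: (null) index:0x1
page:ea240080 count:2 mapcount:1 mapping:88007eff3321 index:0x60e02
"""

PAGE_RE = re.compile(r"page:([0-9a-f]+) count:(-?\d+) mapcount:(-?\d+)")


def parse_page_dumps(text):
    """Return one dict per 'page:... count:... mapcount:...' line found."""
    return [
        {"page": page, "count": int(count), "mapcount": int(mapcount)}
        for page, count, mapcount in PAGE_RE.findall(text)
    ]


def pages_with_negative_mapcount(text):
    """Page identifiers (as printed) whose mapcount is below zero."""
    return [d["page"] for d in parse_page_dumps(text) if d["mapcount"] < 0]


if __name__ == "__main__":
    for page in pages_with_negative_mapcount(DUMPS):
        print("negative mapcount:", page)
```

Only the two pages from the qemu report are flagged; the page from the VM_BUG_ON_PAGE report has a positive mapcount and is left alone.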
Re: kernel oops on mmotm-2015-10-15-15-20
On Thu, Nov 19, 2015 at 11:12:21AM +0900, Minchan Kim wrote: > On Tue, Nov 17, 2015 at 11:32:13AM +0200, Kirill A. Shutemov wrote: > > On Tue, Nov 17, 2015 at 04:35:39PM +0900, Minchan Kim wrote: > > > On Mon, Nov 16, 2015 at 12:54:53PM +0200, Kirill A. Shutemov wrote: > > > > On Mon, Nov 16, 2015 at 07:32:20PM +0900, Minchan Kim wrote: > > > > > On Mon, Nov 16, 2015 at 10:45:22AM +0200, Kirill A. Shutemov wrote: > > > > > > On Mon, Nov 16, 2015 at 10:45:21AM +0900, Minchan Kim wrote: > > > > > > > During the test with MADV_FREE on kernel I applied your patches, > > > > > > > I couldn't see any problem. > > > > > > > > > > > > > > However, in this round, I did another test which is same one > > > > > > > I attached but a liitle bit different because it doesn't do > > > > > > > (memcg things/kill/swapoff) for testing program long-live test. > > > > > > > > > > > > Could you share updated test? > > > > > > > > > > It's part of my testing suite so I should factor it out. > > > > > I will send it when I go to office tomorrow. > > > > > > > > Thanks. > > > > > > > > > > And could you try to reproduce it on clean mmotm-2015-11-10-15-53? > > > > > > > > > > Befor leaving office, I queued it up and result is below. > > > > > It seems you fixed already but didn't apply it to mmotm yet. Right? > > > > > Anyway, please confirm and say to me what I should add more patches > > > > > into mmotm-2015-11-10-15-53 for follow up your recent many bug > > > > > fix patches. > > > > > > > > The two my patches which are not in the mmotm-2015-11-10-15-53 release: > > > > > > > > http://lkml.kernel.org/g/1447236557-68682-1-git-send-email-kirill.shute...@linux.intel.com > > > > http://lkml.kernel.org/g/1447236567-68751-1-git-send-email-kirill.shute...@linux.intel.com > > > > > > 1. mm: fix __page_mapcount() > > > 2. thp: fix leak due split_huge_page() vs. exit race > > > > > > If I missed some patches, let me know it. 
> > > > > > I applied above two patches based on mmotm-2015-11-10-15-53 and tested > > > again. > > > But unfortunately, the result was below. > > > > > > Now, I am making test program I can send to you but it seems to be not > > > easy > > > because small changes for factoring it out from testing suite seems to > > > change > > > something(ex, timing) and makes hard to reproduce. I will try it again. > > > > Your test suite seems generate quite a few bug reports. Don't mind make > > whole > > suite public? > > It's tough due to including company internal stuffs. > That's why I try to factor the part I can share out but unfortunatel, > I couldn't grab a time for retrying until now. :( > > > > > > page:ea240080 count:2 mapcount:1 mapping:88007eff3321 > > > index:0x60e02 > > > flags: 0x40040018(uptodate|dirty|swapbacked) > > > page dumped because: VM_BUG_ON_PAGE(!PageLocked(page)) > > > page->mem_cgroup:880077cf0c00 > > > [ cut here ] > > > kernel BUG at mm/huge_memory.c:3272! > > > invalid opcode: [#1] SMP > > > Dumping ftrace buffer: > > >(ftrace buffer empty) > > > Modules linked in: > > > CPU: 8 PID: 59 Comm: khugepaged Not tainted 4.3.0-mm1-kirill+ #8 > > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs > > > 01/01/2011 > > > task: 880073441a40 ti: 88007344c000 task.ti: 88007344c000 > > > RIP: 0010:[] [] > > > split_huge_page_to_list+0x8fb/0x910 > > > RSP: 0018:88007344f968 EFLAGS: 00010286 > > > RAX: 0021 RBX: ea240080 RCX: > > > RDX: 0001 RSI: 0246 RDI: 821df4d8 > > > RBP: 88007344f9e8 R08: R09: 880bc600 > > > R10: 8163e2c0 R11: 4b47 R12: ea240080 > > > R13: ea240088 R14: ea240080 R15: > > > FS: () GS:88007830() > > > knlGS: > > > CS: 0010 DS: ES: CR0: 8005003b > > > CR2: 7ffd59edcd68 CR3: 01808000 CR4: 06a0 > > > Stack: > > > cccd ea240080 88007344fa00 ea240088 > > > 88007344fa00 88007344f9e8 810f0200 > > > ea24 ea240080 > > > Call Trace: > > > [] ? __lock_page+0xa0/0xb0 > > > [] deferred_split_scan+0x115/0x240 > > > [] ? 
list_lru_count_one+0x1c/0x30 > > > [] shrink_slab.part.42+0x1e3/0x350 > > > [] shrink_zone+0x26a/0x280 > > > [] do_try_to_free_pages+0x12d/0x3b0 > > > [] try_to_free_pages+0xb4/0x140 > > > [] __alloc_pages_nodemask+0x459/0x920 > > > [] ? trace_event_raw_event_tick_stop+0xd0/0xd0 > > > [] khugepaged+0x155/0x1b10 > > > [] ? prepare_to_wait_event+0xf0/0xf0 > > > [] ? __split_huge_pmd_locked+0x4e0/0x4e0 > > > [] kthread+0xc9/0xe0 > > > [] ? kthread_park+0x60/0x60 > > > [] ret_from_fork+0x3f/0x70 > > > [] ? kthread_park+0x60/0x60 > > > Code: ff ff 48 c7 c6 00 cd 77 81 4c 89 f7 e8 df ce fc ff 0f
Re: kernel oops on mmotm-2015-10-15-15-20
On Tue, Nov 17, 2015 at 11:32:13AM +0200, Kirill A. Shutemov wrote:
> On Tue, Nov 17, 2015 at 04:35:39PM +0900, Minchan Kim wrote:
> > On Mon, Nov 16, 2015 at 12:54:53PM +0200, Kirill A. Shutemov wrote:
> > > On Mon, Nov 16, 2015 at 07:32:20PM +0900, Minchan Kim wrote:
> > > > On Mon, Nov 16, 2015 at 10:45:22AM +0200, Kirill A. Shutemov wrote:
> > > > > On Mon, Nov 16, 2015 at 10:45:21AM +0900, Minchan Kim wrote:
> > > > > > During the test with MADV_FREE on kernel I applied your patches,
> > > > > > I couldn't see any problem.
> > > > > >
> > > > > > However, in this round, I did another test which is same one
> > > > > > I attached but a little bit different because it doesn't do
> > > > > > (memcg things/kill/swapoff) for testing program long-live test.
> > > > >
> > > > > Could you share updated test?
> > > >
> > > > It's part of my testing suite so I should factor it out.
> > > > I will send it when I go to office tomorrow.
> > >
> > > Thanks.
> > >
> > > > > And could you try to reproduce it on clean mmotm-2015-11-10-15-53?
> > > >
> > > > Before leaving office, I queued it up and result is below.
> > > > It seems you fixed already but didn't apply it to mmotm yet. Right?
> > > > Anyway, please confirm and say to me what I should add more patches
> > > > into mmotm-2015-11-10-15-53 for follow up your recent many bug
> > > > fix patches.
> > >
> > > The two my patches which are not in the mmotm-2015-11-10-15-53 release:
> > >
> > > http://lkml.kernel.org/g/1447236557-68682-1-git-send-email-kirill.shute...@linux.intel.com
> > > http://lkml.kernel.org/g/1447236567-68751-1-git-send-email-kirill.shute...@linux.intel.com
> >
> > 1. mm: fix __page_mapcount()
> > 2. thp: fix leak due split_huge_page() vs. exit race
> >
> > If I missed some patches, let me know it.
> >
> > I applied above two patches based on mmotm-2015-11-10-15-53 and tested again.
> > But unfortunately, the result was below.
> >
> > Now, I am making test program I can send to you but it seems to be not easy
> > because small changes for factoring it out from testing suite seems to change
> > something(ex, timing) and makes hard to reproduce. I will try it again.
>
> Your test suite seems generate quite a few bug reports. Don't mind make whole
> suite public?

It's tough because it includes company-internal stuff.
That's why I am trying to factor out the part I can share, but unfortunately
I couldn't find time to retry until now. :(

>
> > page:ea240080 count:2 mapcount:1 mapping:88007eff3321 index:0x60e02
> > flags: 0x40040018(uptodate|dirty|swapbacked)
> > page dumped because: VM_BUG_ON_PAGE(!PageLocked(page))
> > page->mem_cgroup:880077cf0c00
> > [ cut here ]
> > kernel BUG at mm/huge_memory.c:3272!
> > invalid opcode: [#1] SMP
> > Dumping ftrace buffer:
> >    (ftrace buffer empty)
> > Modules linked in:
> > CPU: 8 PID: 59 Comm: khugepaged Not tainted 4.3.0-mm1-kirill+ #8
> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> > task: 880073441a40 ti: 88007344c000 task.ti: 88007344c000
> > RIP: 0010:[] [] split_huge_page_to_list+0x8fb/0x910
> > RSP: 0018:88007344f968 EFLAGS: 00010286
> > RAX: 0021 RBX: ea240080 RCX:
> > RDX: 0001 RSI: 0246 RDI: 821df4d8
> > RBP: 88007344f9e8 R08: R09: 880bc600
> > R10: 8163e2c0 R11: 4b47 R12: ea240080
> > R13: ea240088 R14: ea240080 R15:
> > FS: () GS:88007830() knlGS:
> > CS: 0010 DS: ES: CR0: 8005003b
> > CR2: 7ffd59edcd68 CR3: 01808000 CR4: 06a0
> > Stack:
> > cccd ea240080 88007344fa00 ea240088
> > 88007344fa00 88007344f9e8 810f0200
> > ea24 ea240080
> > Call Trace:
> > [] ? __lock_page+0xa0/0xb0
> > [] deferred_split_scan+0x115/0x240
> > [] ? list_lru_count_one+0x1c/0x30
> > [] shrink_slab.part.42+0x1e3/0x350
> > [] shrink_zone+0x26a/0x280
> > [] do_try_to_free_pages+0x12d/0x3b0
> > [] try_to_free_pages+0xb4/0x140
> > [] __alloc_pages_nodemask+0x459/0x920
> > [] ? trace_event_raw_event_tick_stop+0xd0/0xd0
> > [] khugepaged+0x155/0x1b10
> > [] ? prepare_to_wait_event+0xf0/0xf0
> > [] ? __split_huge_pmd_locked+0x4e0/0x4e0
> > [] kthread+0xc9/0xe0
> > [] ? kthread_park+0x60/0x60
> > [] ret_from_fork+0x3f/0x70
> > [] ? kthread_park+0x60/0x60
> > Code: ff ff 48 c7 c6 00 cd 77 81 4c 89 f7 e8 df ce fc ff 0f 0b 48 83 e8 01 e9 94 f7 ff ff 48 c7 c6 80 bb 77 81 4c 89 f7 e8 c5 ce fc ff <0f> 0b 48 c7 c6 48 c9 77 81 4c 89 e7 e8 b4 ce fc ff 0f 0b 66 90
> > RIP [] split_huge_page_to_list+0x8fb/0x910
> > RSP
> > ---[ end trace 0ee39378e850d8de ]---
> > Kernel panic - not syncing: Fatal
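When reading the call trace quoted above, it helps to separate the frames marked with "?" from the rest. On x86 the "?" marks an address that was found on the stack but that the unwinder could not confirm as part of the current call chain, so those entries are usually ignored on a first pass. A rough sketch that keeps only the unmarked frames (the helper names are made up for illustration; the sample is the trace from this report with quote markers removed and the already-stripped addresses left empty):

```python
import re

# Call trace from the khugepaged report above, quote markers removed.
TRACE = """\
[] ? __lock_page+0xa0/0xb0
[] deferred_split_scan+0x115/0x240
[] ? list_lru_count_one+0x1c/0x30
[] shrink_slab.part.42+0x1e3/0x350
[] shrink_zone+0x26a/0x280
[] do_try_to_free_pages+0x12d/0x3b0
[] try_to_free_pages+0xb4/0x140
[] __alloc_pages_nodemask+0x459/0x920
[] ? trace_event_raw_event_tick_stop+0xd0/0xd0
[] khugepaged+0x155/0x1b10
[] ? prepare_to_wait_event+0xf0/0xf0
[] ? __split_huge_pmd_locked+0x4e0/0x4e0
[] kthread+0xc9/0xe0
[] ? kthread_park+0x60/0x60
[] ret_from_fork+0x3f/0x70
[] ? kthread_park+0x60/0x60
"""

# "[addr] [? ]symbol+0xoffset/0xsize"; the address may be empty in this archive.
FRAME_RE = re.compile(r"\[[^\]]*\]\s+(\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def frames(text):
    """Return (symbol, reliable) pairs, innermost frame first."""
    result = []
    for line in text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            result.append((match.group(2), match.group(1) is None))
    return result


def reliable_chain(text):
    """Symbols of the frames that are not marked with '?'."""
    return [symbol for symbol, reliable in frames(text) if reliable]


if __name__ == "__main__":
    print(" <- ".join(reliable_chain(TRACE)))
```

Read this way, the chain runs from khugepaged through __alloc_pages_nodemask and the reclaim path (try_to_free_pages, shrink_zone, shrink_slab) into deferred_split_scan, which is where split_huge_page_to_list hit the VM_BUG_ON_PAGE.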
Re: kernel oops on mmotm-2015-10-15-15-20
On Tue, Nov 17, 2015 at 04:35:39PM +0900, Minchan Kim wrote:
> On Mon, Nov 16, 2015 at 12:54:53PM +0200, Kirill A. Shutemov wrote:
> > On Mon, Nov 16, 2015 at 07:32:20PM +0900, Minchan Kim wrote:
> > > On Mon, Nov 16, 2015 at 10:45:22AM +0200, Kirill A. Shutemov wrote:
> > > > On Mon, Nov 16, 2015 at 10:45:21AM +0900, Minchan Kim wrote:
> > > > > During the test with MADV_FREE on kernel I applied your patches,
> > > > > I couldn't see any problem.
> > > > >
> > > > > However, in this round, I did another test which is same one
> > > > > I attached but a little bit different because it doesn't do
> > > > > (memcg things/kill/swapoff) for testing program long-live test.
> > > >
> > > > Could you share updated test?
> > >
> > > It's part of my testing suite so I should factor it out.
> > > I will send it when I go to office tomorrow.
> >
> > Thanks.
> >
> > > > And could you try to reproduce it on clean mmotm-2015-11-10-15-53?
> > >
> > > Before leaving office, I queued it up and result is below.
> > > It seems you fixed already but didn't apply it to mmotm yet. Right?
> > > Anyway, please confirm and say to me what I should add more patches
> > > into mmotm-2015-11-10-15-53 for follow up your recent many bug
> > > fix patches.
> >
> > The two my patches which are not in the mmotm-2015-11-10-15-53 release:
> >
> > http://lkml.kernel.org/g/1447236557-68682-1-git-send-email-kirill.shute...@linux.intel.com
> > http://lkml.kernel.org/g/1447236567-68751-1-git-send-email-kirill.shute...@linux.intel.com
>
> 1. mm: fix __page_mapcount()
> 2. thp: fix leak due split_huge_page() vs. exit race
>
> If I missed some patches, let me know it.
>
> I applied above two patches based on mmotm-2015-11-10-15-53 and tested again.
> But unfortunately, the result was below.
>
> Now, I am making test program I can send to you but it seems to be not easy
> because small changes for factoring it out from testing suite seems to change
> something(ex, timing) and makes hard to reproduce. I will try it again.

Your test suite seems generate quite a few bug reports. Don't mind make
whole suite public?

> page:ea240080 count:2 mapcount:1 mapping:88007eff3321 index:0x60e02
> flags: 0x40040018(uptodate|dirty|swapbacked)
> page dumped because: VM_BUG_ON_PAGE(!PageLocked(page))
> page->mem_cgroup:880077cf0c00
> [ cut here ]
> kernel BUG at mm/huge_memory.c:3272!
> invalid opcode: [#1] SMP
> Dumping ftrace buffer:
>    (ftrace buffer empty)
> Modules linked in:
> CPU: 8 PID: 59 Comm: khugepaged Not tainted 4.3.0-mm1-kirill+ #8
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> task: 880073441a40 ti: 88007344c000 task.ti: 88007344c000
> RIP: 0010:[] [] split_huge_page_to_list+0x8fb/0x910
> RSP: 0018:88007344f968 EFLAGS: 00010286
> RAX: 0021 RBX: ea240080 RCX:
> RDX: 0001 RSI: 0246 RDI: 821df4d8
> RBP: 88007344f9e8 R08: R09: 880bc600
> R10: 8163e2c0 R11: 4b47 R12: ea240080
> R13: ea240088 R14: ea240080 R15:
> FS: () GS:88007830() knlGS:
> CS: 0010 DS: ES: CR0: 8005003b
> CR2: 7ffd59edcd68 CR3: 01808000 CR4: 06a0
> Stack:
> cccd ea240080 88007344fa00 ea240088
> 88007344fa00 88007344f9e8 810f0200
> ea24 ea240080
> Call Trace:
> [] ? __lock_page+0xa0/0xb0
> [] deferred_split_scan+0x115/0x240
> [] ? list_lru_count_one+0x1c/0x30
> [] shrink_slab.part.42+0x1e3/0x350
> [] shrink_zone+0x26a/0x280
> [] do_try_to_free_pages+0x12d/0x3b0
> [] try_to_free_pages+0xb4/0x140
> [] __alloc_pages_nodemask+0x459/0x920
> [] ? trace_event_raw_event_tick_stop+0xd0/0xd0
> [] khugepaged+0x155/0x1b10
> [] ? prepare_to_wait_event+0xf0/0xf0
> [] ? __split_huge_pmd_locked+0x4e0/0x4e0
> [] kthread+0xc9/0xe0
> [] ? kthread_park+0x60/0x60
> [] ret_from_fork+0x3f/0x70
> [] ? kthread_park+0x60/0x60
> Code: ff ff 48 c7 c6 00 cd 77 81 4c 89 f7 e8 df ce fc ff 0f 0b 48 83 e8 01 e9
> 94 f7 ff ff 48 c7 c6 80 bb 77 81 4c 89 f7 e8 c5 ce fc ff <0f> 0b 48 c7 c6 48
> c9 77 81 4c 89 e7 e8 b4 ce fc ff 0f 0b 66 90
> RIP [] split_huge_page_to_list+0x8fb/0x910
> RSP
> ---[ end trace 0ee39378e850d8de ]---
> Kernel panic - not syncing: Fatal exception
> Dumping ftrace buffer:
>    (ftrace buffer empty)
> Kernel Offset: disabled

I looked more into it. It seems a race between split_huge_page() and
deferred_split_scan() as the dumped page is not huge.

Could you check if the patch below makes any difference to the situation?

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 91e2f4b7ca39..923c0f6eb50a 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3186,13 +3186,6 @@ static
Re: kernel oops on mmotm-2015-10-15-15-20
On Mon, Nov 16, 2015 at 12:54:53PM +0200, Kirill A. Shutemov wrote:
> On Mon, Nov 16, 2015 at 07:32:20PM +0900, Minchan Kim wrote:
> > On Mon, Nov 16, 2015 at 10:45:22AM +0200, Kirill A. Shutemov wrote:
> > > On Mon, Nov 16, 2015 at 10:45:21AM +0900, Minchan Kim wrote:
> > > > During the test with MADV_FREE on kernel I applied your patches,
> > > > I couldn't see any problem.
> > > >
> > > > However, in this round, I did another test which is same one
> > > > I attached but a little bit different because it doesn't do
> > > > (memcg things/kill/swapoff) for testing program long-live test.
> > >
> > > Could you share updated test?
> >
> > It's part of my testing suite so I should factor it out.
> > I will send it when I go to office tomorrow.
>
> Thanks.
>
> > > And could you try to reproduce it on clean mmotm-2015-11-10-15-53?
> >
> > Before leaving office, I queued it up and result is below.
> > It seems you fixed already but didn't apply it to mmotm yet. Right?
> > Anyway, please confirm and say to me what I should add more patches
> > into mmotm-2015-11-10-15-53 for follow up your recent many bug
> > fix patches.
>
> The two my patches which are not in the mmotm-2015-11-10-15-53 release:
>
> http://lkml.kernel.org/g/1447236557-68682-1-git-send-email-kirill.shute...@linux.intel.com
> http://lkml.kernel.org/g/1447236567-68751-1-git-send-email-kirill.shute...@linux.intel.com

1. mm: fix __page_mapcount()
2. thp: fix leak due split_huge_page() vs. exit race

If I missed some patches, let me know it.

I applied above two patches based on mmotm-2015-11-10-15-53 and tested again.
But unfortunately, the result was below.

Now, I am making test program I can send to you but it seems to be not easy
because small changes for factoring it out from testing suite seems to change
something(ex, timing) and makes hard to reproduce. I will try it again.

page:ea240080 count:2 mapcount:1 mapping:88007eff3321 index:0x60e02
flags: 0x40040018(uptodate|dirty|swapbacked)
page dumped because: VM_BUG_ON_PAGE(!PageLocked(page))
page->mem_cgroup:880077cf0c00
[ cut here ]
kernel BUG at mm/huge_memory.c:3272!
invalid opcode: [#1] SMP
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in:
CPU: 8 PID: 59 Comm: khugepaged Not tainted 4.3.0-mm1-kirill+ #8
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
task: 880073441a40 ti: 88007344c000 task.ti: 88007344c000
RIP: 0010:[] [] split_huge_page_to_list+0x8fb/0x910
RSP: 0018:88007344f968 EFLAGS: 00010286
RAX: 0021 RBX: ea240080 RCX:
RDX: 0001 RSI: 0246 RDI: 821df4d8
RBP: 88007344f9e8 R08: R09: 880bc600
R10: 8163e2c0 R11: 4b47 R12: ea240080
R13: ea240088 R14: ea240080 R15:
FS: () GS:88007830() knlGS:
CS: 0010 DS: ES: CR0: 8005003b
CR2: 7ffd59edcd68 CR3: 01808000 CR4: 06a0
Stack:
 cccd ea240080 88007344fa00 ea240088
 88007344fa00 88007344f9e8 810f0200
 ea24 ea240080
Call Trace:
 [] ? __lock_page+0xa0/0xb0
 [] deferred_split_scan+0x115/0x240
 [] ? list_lru_count_one+0x1c/0x30
 [] shrink_slab.part.42+0x1e3/0x350
 [] shrink_zone+0x26a/0x280
 [] do_try_to_free_pages+0x12d/0x3b0
 [] try_to_free_pages+0xb4/0x140
 [] __alloc_pages_nodemask+0x459/0x920
 [] ? trace_event_raw_event_tick_stop+0xd0/0xd0
 [] khugepaged+0x155/0x1b10
 [] ? prepare_to_wait_event+0xf0/0xf0
 [] ? __split_huge_pmd_locked+0x4e0/0x4e0
 [] kthread+0xc9/0xe0
 [] ? kthread_park+0x60/0x60
 [] ret_from_fork+0x3f/0x70
 [] ? kthread_park+0x60/0x60
Code: ff ff 48 c7 c6 00 cd 77 81 4c 89 f7 e8 df ce fc ff 0f 0b 48 83 e8 01 e9 94 f7 ff ff 48 c7 c6 80 bb 77 81 4c 89 f7 e8 c5 ce fc ff <0f> 0b 48 c7 c6 48 c9 77 81 4c 89 e7 e8 b4 ce fc ff 0f 0b 66 90
RIP [] split_huge_page_to_list+0x8fb/0x910
 RSP
---[ end trace 0ee39378e850d8de ]---
Kernel panic - not syncing: Fatal exception
Dumping ftrace buffer:
   (ftrace buffer empty)
Kernel Offset: disabled
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
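[Reader's note on the trace format] Each Call Trace entry in the dumps in this thread has the form symbol+offset/size: the offset is the distance in bytes from the start of the function, and the size is the function's total length. A leading "?" marks an address that was found on the stack but could not be confirmed as part of the call chain. The following is only a minimal user-space sketch of decoding one entry; the names trace_entry and parse_trace_entry are hypothetical and are not kernel APIs.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper type: one decoded "symbol+0xoff/0xsize" entry. */
struct trace_entry {
    char symbol[128];
    unsigned long offset;  /* bytes from the start of the function */
    unsigned long size;    /* total length of the function in bytes */
    int unreliable;        /* 1 if the entry carried a "?" prefix */
};

/*
 * Parse text such as "? __lock_page+0xa0/0xb0".
 * Returns 0 on success, -1 if the text does not match the expected shape.
 */
int parse_trace_entry(const char *text, struct trace_entry *out)
{
    assert(text != NULL && out != NULL);
    memset(out, 0, sizeof(*out));

    /* Skip leading blanks, then an optional "?" marker and its blanks. */
    while (*text == ' ')
        text++;
    if (*text == '?') {
        out->unreliable = 1;
        text++;
        while (*text == ' ')
            text++;
    }

    /* %127[^+] reads the symbol name up to the '+' separator. */
    if (sscanf(text, "%127[^+]+0x%lx/0x%lx",
               out->symbol, &out->offset, &out->size) != 3)
        return -1;
    return 0;
}
```

Applied to the faulting entry above, split_huge_page_to_list+0x8fb/0x910, this yields offset 0x8fb inside a function of 0x910 bytes, which places the BUG close to the end of the function.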
Re: kernel oops on mmotm-2015-10-15-15-20
On Mon, Nov 16, 2015 at 07:32:20PM +0900, Minchan Kim wrote:
> On Mon, Nov 16, 2015 at 10:45:22AM +0200, Kirill A. Shutemov wrote:
> > On Mon, Nov 16, 2015 at 10:45:21AM +0900, Minchan Kim wrote:
> > > During the test with MADV_FREE on kernel I applied your patches,
> > > I couldn't see any problem.
> > >
> > > However, in this round, I did another test which is same one
> > > I attached but a little bit different because it doesn't do
> > > (memcg things/kill/swapoff) for testing program long-live test.
> >
> > Could you share updated test?
>
> It's part of my testing suite so I should factor it out.
> I will send it when I go to office tomorrow.

Thanks.

> > And could you try to reproduce it on clean mmotm-2015-11-10-15-53?
>
> Before leaving office, I queued it up and result is below.
> It seems you fixed already but didn't apply it to mmotm yet. Right?
> Anyway, please confirm and say to me what I should add more patches
> into mmotm-2015-11-10-15-53 for follow up your recent many bug
> fix patches.

The two my patches which are not in the mmotm-2015-11-10-15-53 release:

http://lkml.kernel.org/g/1447236557-68682-1-git-send-email-kirill.shute...@linux.intel.com
http://lkml.kernel.org/g/1447236567-68751-1-git-send-email-kirill.shute...@linux.intel.com

--
 Kirill A. Shutemov
Re: kernel oops on mmotm-2015-10-15-15-20
On Mon, Nov 16, 2015 at 10:45:22AM +0200, Kirill A. Shutemov wrote:
> On Mon, Nov 16, 2015 at 10:45:21AM +0900, Minchan Kim wrote:
> > During the test with MADV_FREE on kernel I applied your patches,
> > I couldn't see any problem.
> >
> > However, in this round, I did another test which is same one
> > I attached but a little bit different because it doesn't do
> > (memcg things/kill/swapoff) for testing program long-live test.
>
> Could you share updated test?

It's part of my testing suite so I should factor it out.
I will send it when I go to office tomorrow.

>
> And could you try to reproduce it on clean mmotm-2015-11-10-15-53?

Before leaving office, I queued it up and result is below.
It seems you fixed already but didn't apply it to mmotm yet. Right?
Anyway, please confirm and say to me what I should add more patches
into mmotm-2015-11-10-15-53 for follow up your recent many bug
fix patches.

Thanks.

page:ea553fc0 count:3 mapcount:1 mapping:88007f717a01 index:0x602ff
flags: 0x40048019(locked|uptodate|dirty|swapcache|swapbacked)
page dumped because: VM_BUG_ON_PAGE(PageAnon(page) && !PageKsm(page) && !anon_vma)
page->mem_cgroup:880077cf0c00
[ cut here ]
kernel BUG at mm/migrate.c:889!
invalid opcode: [#1] SMP
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in:
CPU: 10 PID: 59 Comm: khugepaged Not tainted 4.3.0-mm1-kirill+ #7
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
task: 880073441a40 ti: 88007344c000 task.ti: 88007344c000
RIP: 0010:[] [] migrate_pages+0x8e6/0x950
RSP: 0018:88007344fa00 EFLAGS: 00010282
RAX: 0021 RBX: ea0001a0bbc0 RCX:
RDX: 0001 RSI: 0246 RDI: 821df4d8
RBP: 88007344fa80 R08: R09: 880b9540
R10: 8163e2c0 R11: 02c2 R12:
R13: ea553f80 R14: ea553fc0 R15: 8189db40
FS: () GS:88007834() knlGS:
CS: 0010 DS: ES: CR0: 8005003b
CR2: 7f45cc0091d8 CR3: 7eba7000 CR4: 06a0
Stack:
 880073441a40 81114880 81116420 ea553fe0 88007344fb30 88007344fb20 88007344fb20
Call Trace:
 [] ? trace_raw_output_mm_compaction_defer_template+0xc0/0xc0
 [] ? isolate_freepages_block+0x3d0/0x3d0
 [] compact_zone+0x2bb/0x720
 [] ? list_del+0xd/0x30
 [] compact_zone_order+0x6d/0xa0
 [] try_to_compact_pages+0xed/0x200
 [] __alloc_pages_direct_compact+0x3b/0xd4
 [] __alloc_pages_nodemask+0x3fb/0x920
 [] khugepaged+0x155/0x1b10
 [] ? prepare_to_wait_event+0xf0/0xf0
 [] ? __split_huge_pmd_locked+0x4e0/0x4e0
 [] kthread+0xc9/0xe0
 [] ? kthread_park+0x60/0x60
 [] ret_from_fork+0x3f/0x70
 [] ? kthread_park+0x60/0x60
Code: 44 c6 48 8b 40 08 83 e0 03 48 83 f8 03 0f 84 fd fa ff ff 4d 85 e4 0f 85 f4 fa ff ff 48 c7 c6 b8 f6 77 81 4c 89 f7 e8 fa 36 fd ff <0f> 0b 48 83 e8 01 e9 d0 fa ff ff f6 40 07 01 0f 84 5b fd ff ff
RIP [] migrate_pages+0x8e6/0x950
 RSP
---[ end trace 337555313b7e45be ]---
Kernel panic - not syncing: Fatal exception
Dumping ftrace buffer:
   (ftrace buffer empty)
Kernel Offset: disabled
Re: kernel oops on mmotm-2015-10-15-15-20
On Mon, Nov 16, 2015 at 10:45:21AM +0900, Minchan Kim wrote:
> During the test with MADV_FREE on kernel I applied your patches,
> I couldn't see any problem.
>
> However, in this round, I did another test which is same one
> I attached but a little bit different because it doesn't do
> (memcg things/kill/swapoff) for testing program long-live test.

Could you share updated test?

And could you try to reproduce it on clean mmotm-2015-11-10-15-53?

> With that, I encountered this problem.
>
> page:eaf60080 count:1 mapcount:0 mapping:88007f584691 index:0x62a02
> flags: 0x4006a028(uptodate|lru|writeback|swapcache|reclaim|swapbacked)
> page dumped because: VM_BUG_ON_PAGE(!PageLocked(page))
> page->mem_cgroup:880077cf0c00
> [ cut here ]
> kernel BUG at mm/huge_memory.c:3340!
> invalid opcode: [#1] SMP
> Dumping ftrace buffer:
>    (ftrace buffer empty)
> Modules linked in:
> CPU: 7 PID: 1657 Comm: memhog Not tainted 4.3.0-rc5-mm1-madv-free+ #4
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> task: 88006b0f1a40 ti: 88004ced4000 task.ti: 88004ced4000
> RIP: 0010:[] [] split_huge_page_to_list+0x907/0x920
> RSP: 0018:88004ced7a38 EFLAGS: 00010296
> RAX: 0021 RBX: eaf60080 RCX: 81830db8
> RDX: 0001 RSI: 0246 RDI: 821df4d8
> RBP: 88004ced7ab8 R08: R09: 880bc560
> R10: 8163d880 R11: 00014f25 R12: eaf60080
> R13: eaf60088 R14: eaf60080 R15:
> FS: 7f43d3ced740() GS:8800782e() knlGS:
> CS: 0010 DS: ES: CR0: 80050033
> CR2: 7ff1f6fcdb98 CR3: 4cf56000 CR4: 06a0
> Stack:
> cccd eaf60080 88004ced7ad0 eaf60088
> 88004ced7ad0 88004ced7ab8 810ef9d0
> eaf6 eaf60080
> Call Trace:
> [] ? __lock_page+0xa0/0xb0
> [] deferred_split_scan+0x11c/0x260
> [] ? list_lru_count_one+0x1c/0x30
> [] shrink_slab.part.42+0x1e3/0x350
> [] shrink_zone+0x26a/0x280
> [] do_try_to_free_pages+0x12d/0x3b0
> [] try_to_free_pages+0xb4/0x140
> [] __alloc_pages_nodemask+0x459/0x920
> [] handle_mm_fault+0xc77/0x1000
> [] ? retint_kernel+0x10/0x10
> [] __do_page_fault+0x189/0x400
> [] do_page_fault+0xc/0x10
> [] page_fault+0x22/0x30
> Code: ff ff 48 c7 c6 f0 b2 77 81 4c 89 f7 e8 13 c3 fc ff 0f 0b 48 83 e8 01 e9
> 88 f7 ff ff 48 c7 c6 70 a1 77 81 4c 89 f7 e8 f9 c2 fc ff <0f> 0b 48 c7 c6 38
> af 77 81 4c 89 e7 e8 e8 c2 fc ff 0f 0b 66 0f
> RIP [] split_huge_page_to_list+0x907/0x920
> RSP
> ---[ end trace c9a60522e3a296e4 ]---

I don't see how it's possible: call lock_page() just before
split_huge_page() in deferred_split_scan().

> So, I reverted all MADV_FREE patches and changed it with MADV_DONTNEED.
> In this time, I saw below oops in this time.
> If I miss somethings, please let me know it.
>
> [ cut here ]
> kernel BUG at include/linux/swapops.h:129!

Looks similar to what I fixed by inserting smp_wmb() just before
clear_compound_head() in __split_huge_page_tail().

Do you have this in place? Like in last -mm tree?

> Another hit:
>
> page:ea520080 count:2 mapcount:0 mapping:880072b38a51 index:0x62602
> flags: 0x40048028(uptodate|lru|swapcache|swapbacked)
> page dumped because: VM_BUG_ON_PAGE(!PageLocked(page))
> page->mem_cgroup:880077cf0c00
> [ cut here ]
> kernel BUG at mm/huge_memory.c:3306!

The same as the first one: no idea.

--
 Kirill A. Shutemov
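[Reader's note on the smp_wmb() remark] The fix mentioned above relies on publish ordering: every store that sets up a tail page has to be visible before the store that makes the page look like a standalone page. What follows is only a user-space analogue of that ordering, written with C11 atomics. It is not the kernel code: struct model_page, model_split_tail() and model_read_if_split() are hypothetical names, compound_head is simplified to a plain pointer, and release/acquire ordering stands in for the kernel's smp_wmb() and the reader-side ordering.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a tail page; not the kernel's struct page. */
struct model_page {
    unsigned long flags;   /* plain field set up during the split */
    unsigned long index;   /* plain field set up during the split */
    /* Non-NULL while the page still belongs to a compound page. */
    _Atomic(struct model_page *) compound_head;
};

/* Writer side: finish setting up the page, then publish it as standalone. */
void model_split_tail(struct model_page *tail,
                      unsigned long flags, unsigned long index)
{
    tail->flags = flags;
    tail->index = index;
    /*
     * The release store plays the role of smp_wmb() followed by
     * clear_compound_head(): the two stores above become visible
     * before the head pointer is seen as cleared.
     */
    atomic_store_explicit(&tail->compound_head,
                          (struct model_page *)NULL,
                          memory_order_release);
}

/* Reader side: returns 1 and fills the outputs only for a standalone page. */
int model_read_if_split(struct model_page *page,
                        unsigned long *flags, unsigned long *index)
{
    if (atomic_load_explicit(&page->compound_head,
                             memory_order_acquire) != NULL)
        return 0;  /* still a tail page; its fields may be half set up */
    *flags = page->flags;
    *index = page->index;
    return 1;
}
```

Without the ordering on the writer side, a reader on another CPU could observe the cleared head pointer and then read stale flags or index values, which is the kind of inconsistency the barrier is meant to rule out.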
Re: kernel oops on mmotm-2015-10-15-15-20
On Thu, Nov 12, 2015 at 09:36:14AM +0900, Minchan Kim wrote:
> > > mmotm-2015-10-15-15-20-no-madvise_free, IOW it means git head for
> > > 54bad5da4834 arm64: add pmd_[dirty|mkclean] for THP so there is no
> > > MADV_FREE code in there
> > > + pte_mkdirty patch
> > > + freeze/unfreeze patch
> > > + do_page_add_anon_rmap patch
> > > + above split_huge_pmd
> > >
> > >
> > > Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 across:4191228k
> > > FS
> > > Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 across:4191228k
> > > FS
> > > Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 across:4191228k
> > > FS
> > > BUG: Bad rss-counter state mm:88007fa3bb80 idx:1 val:512
> >
> > With the patch below my test setup run for 2+ days without triggering the
> > bug. split_huge_pmd patch should be dropped.
> >
> > Please test.
> >
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index 14cbbad54a3e..7aa0a3fef2aa 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -2841,9 +2841,6 @@ static void __split_huge_pmd_locked(struct
> > vm_area_struct *vma, pmd_t *pmd,
> > write = pmd_write(*pmd);
> > young = pmd_young(*pmd);
> >
> > - /* leave pmd empty until pte is filled */
> > - pmdp_huge_clear_flush_notify(vma, haddr, pmd);
> > -
> > pgtable = pgtable_trans_huge_withdraw(mm, pmd);
> > pmd_populate(mm, &_pmd, pgtable);
> >
> > @@ -2893,6 +2890,28 @@ static void __split_huge_pmd_locked(struct
> > vm_area_struct *vma, pmd_t *pmd,
> > }
> >
> > smp_wmb(); /* make pte visible before pmd */
> > + /*
> > +* Up to this point the pmd is present and huge and userland has the
> > +* whole access to the hugepage during the split (which happens in
> > +* place). If we overwrite the pmd with the not-huge version pointing
> > +* to the pte here (which of course we could if all CPUs were bug
> > +* free), userland could trigger a small page size TLB miss on the
> > +* small sized TLB while the hugepage TLB entry is still established in
> > +* the huge TLB. Some CPU doesn't like that.
> > +* See http://support.amd.com/us/Processor_TechDocs/41322.pdf, Erratum
> > +* 383 on page 93. Intel should be safe but is also warns that it's
> > +* only safe if the permission and cache attributes of the two entries
> > +* loaded in the two TLB is identical (which should be the case here).
> > +* But it is generally safer to never allow small and huge TLB entries
> > +* for the same virtual address to be loaded simultaneously. So instead
> > +* of doing "pmd_populate(); flush_pmd_tlb_range();" we first mark the
> > +* current pmd notpresent (atomically because here the pmd_trans_huge
> > +* and pmd_trans_splitting must remain set at all times on the pmd
> > +* until the split is complete for this pmd), then we flush the SMP TLB
> > +* and finally we write the non-huge version of the pmd entry with
> > +* pmd_populate.
> > +*/
> > + pmdp_invalidate(vma, haddr, pmd);
> > pmd_populate(mm, pmd, pgtable);
> >
> > if (freeze) {
>
> I have been tested this patch with MADV_DONTNEED for a few days and
> I couldn't see the problem any more. And I will continue to test it
> with MADV_FREE.

During the test with MADV_FREE on kernel I applied your patches,
I couldn't see any problem.

However, in this round, I did another test which is same one
I attached but a little bit different because it doesn't do
(memcg things/kill/swapoff) for testing program long-live test.

With that, I encountered this problem.

page:eaf60080 count:1 mapcount:0 mapping:88007f584691 index:0x62a02
flags: 0x4006a028(uptodate|lru|writeback|swapcache|reclaim|swapbacked)
page dumped because: VM_BUG_ON_PAGE(!PageLocked(page))
page->mem_cgroup:880077cf0c00
[ cut here ]
kernel BUG at mm/huge_memory.c:3340!
invalid opcode: [#1] SMP
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in:
CPU: 7 PID: 1657 Comm: memhog Not tainted 4.3.0-rc5-mm1-madv-free+ #4
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
task: 88006b0f1a40 ti: 88004ced4000 task.ti: 88004ced4000
RIP: 0010:[] [] split_huge_page_to_list+0x907/0x920
RSP: 0018:88004ced7a38 EFLAGS: 00010296
RAX: 0021 RBX: eaf60080 RCX: 81830db8
RDX: 0001 RSI: 0246 RDI: 821df4d8
RBP: 88004ced7ab8 R08: R09: 880bc560
R10: 8163d880 R11: 00014f25 R12: eaf60080
R13: eaf60088 R14: eaf60080 R15:
FS: 7f43d3ced740() GS:8800782e() knlGS:
CS: 0010 DS: ES: CR0: 80050033
CR2: 7ff1f6fcdb98 CR3: 4cf56000 CR4: 06a0
Stack:
 cccd eaf60080 88004ced7ad0 eaf60088 88004ced7ad0 88004ced7ab8
Re: kernel oops on mmotm-2015-10-15-15-20
On Mon, Nov 09, 2015 at 12:55:22AM +0200, Kirill A. Shutemov wrote: > On Thu, Nov 05, 2015 at 09:19:22AM +0900, Minchan Kim wrote: > > On Wed, Nov 04, 2015 at 04:21:35PM +0200, Kirill A. Shutemov wrote: > > > On Wed, Nov 04, 2015 at 12:20:19AM +0900, Minchan Kim wrote: > > > > On Tue, Nov 03, 2015 at 04:33:29PM +0900, Minchan Kim wrote: > > > > > On Tue, Nov 03, 2015 at 09:16:50AM +0200, Kirill A. Shutemov wrote: > > > > > > On Tue, Nov 03, 2015 at 12:02:58PM +0900, Minchan Kim wrote: > > > > > > > Hello Kirill, > > > > > > > > > > > > > > On Mon, Nov 02, 2015 at 02:57:49PM +0200, Kirill A. Shutemov > > > > > > > wrote: > > > > > > > > On Fri, Oct 30, 2015 at 04:03:50PM +0900, Minchan Kim wrote: > > > > > > > > > On Thu, Oct 29, 2015 at 11:52:06AM +0200, Kirill A. Shutemov > > > > > > > > > wrote: > > > > > > > > > > On Thu, Oct 29, 2015 at 04:58:29PM +0900, Minchan Kim wrote: > > > > > > > > > > > On Thu, Oct 29, 2015 at 02:25:24AM +0200, Kirill A. > > > > > > > > > > > Shutemov wrote: > > > > > > > > > > > > On Thu, Oct 22, 2015 at 06:00:51PM +0900, Minchan Kim > > > > > > > > > > > > wrote: > > > > > > > > > > > > > On Thu, Oct 22, 2015 at 10:21:36AM +0900, Minchan Kim > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > Hello Hugh, > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Oct 21, 2015 at 05:59:59PM -0700, Hugh > > > > > > > > > > > > > > Dickins wrote: > > > > > > > > > > > > > > > On Thu, 22 Oct 2015, Minchan Kim wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I added the code to check it and queued it > > > > > > > > > > > > > > > > again but I had another oops > > > > > > > > > > > > > > > > in this time but symptom is related to > > > > > > > > > > > > > > > > anon_vma, too. 
> > > > > > > > > > > > > > > > (kernel is based on recent mmotm + > > > > > > > > > > > > > > > > unconditional mkdirty for bug fix) > > > > > > > > > > > > > > > > It seems page_get_anon_vma returns NULL since > > > > > > > > > > > > > > > > the page was not page_mapped > > > > > > > > > > > > > > > > at that time but second check of page_mapped > > > > > > > > > > > > > > > > right before try_to_unmap seems > > > > > > > > > > > > > > > > to be true. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Adding 4191228k swap on /dev/vda5. Priority:-1 > > > > > > > > > > > > > > > > extents:1 across:4191228k FS > > > > > > > > > > > > > > > > Adding 4191228k swap on /dev/vda5. Priority:-1 > > > > > > > > > > > > > > > > extents:1 across:4191228k FS > > > > > > > > > > > > > > > > page:ea0001cfbfc0 count:3 mapcount:1 > > > > > > > > > > > > > > > > mapping:88007f1b5f51 index:0x60aff > > > > > > > > > > > > > > > > flags: > > > > > > > > > > > > > > > > 0x40048019(locked|uptodate|dirty|swapcache|swapbacked) > > > > > > > > > > > > > > > > page dumped because: > > > > > > > > > > > > > > > > VM_BUG_ON_PAGE(PageAnon(page) && !PageKsm(page) > > > > > > > > > > > > > > > > && !anon_vma) > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > That's interesting, that's one I added in my page > > > > > > > > > > > > > > > migration series. > > > > > > > > > > > > > > > Let me think on it, but it could well relate to > > > > > > > > > > > > > > > the one you got before. > > > > > > > > > > > > > > > > > > > > > > > > > > > > I will roll back to > > > > > > > > > > > > > > mm/madv_free-v4.3-rc5-mmotm-2015-10-15-15-20 > > > > > > > > > > > > > > instead of next-20151021 to remove noise from your > > > > > > > > > > > > > > migration cleanup > > > > > > > > > > > > > > series and will test it again. > > > > > > > > > > > > > > If it is fixed, I will test again with your > > > > > > > > > > > > > > migration patchset, then. 
> > > > > > > > > > > > > > > > > > > > > > > > > > I tested mmotm-2015-10-15-15-20 with test program I > > > > > > > > > > > > > attach for a long time. > > > > > > > > > > > > > Therefore, there is no patchset from Hugh's migration > > > > > > > > > > > > > patch in there. > > > > > > > > > > > > > And I added below debug code with request from Kirill > > > > > > > > > > > > > to all test kernels. > > > > > > > > > > > > > > > > > > > > > > > > It took too long time (and a lot of printk()), but I > > > > > > > > > > > > think I track it down > > > > > > > > > > > > finally. > > > > > > > > > > > > > > > > > > > > > > > > The patch below seems fixes issue for me. It's not yet > > > > > > > > > > > > properly tested, but > > > > > > > > > > > > looks like it works. > > > > > > > > > > > > > > > > > > > > > > > > The problem was my wrong assumption on how migration > > > > > > > > > > > > works: I thought that > > > > > > > > > > > > kernel would wait migration to finish on before > > > > > > > > > > > > deconstruction mapping. > > > > > > > > > > > > > > > > > > > > > > > > But turn out that's not true. > > > > > > > > > > > > > > > > > > > > > > > > As result if zap_pte_range() races with
Re: kernel oops on mmotm-2015-10-15-15-20
On Thu, Nov 05, 2015 at 09:19:22AM +0900, Minchan Kim wrote: > On Wed, Nov 04, 2015 at 04:21:35PM +0200, Kirill A. Shutemov wrote: > > On Wed, Nov 04, 2015 at 12:20:19AM +0900, Minchan Kim wrote: > > > On Tue, Nov 03, 2015 at 04:33:29PM +0900, Minchan Kim wrote: > > > > On Tue, Nov 03, 2015 at 09:16:50AM +0200, Kirill A. Shutemov wrote: > > > > > On Tue, Nov 03, 2015 at 12:02:58PM +0900, Minchan Kim wrote: > > > > > > Hello Kirill, > > > > > > > > > > > > On Mon, Nov 02, 2015 at 02:57:49PM +0200, Kirill A. Shutemov wrote: > > > > > > > On Fri, Oct 30, 2015 at 04:03:50PM +0900, Minchan Kim wrote: > > > > > > > > On Thu, Oct 29, 2015 at 11:52:06AM +0200, Kirill A. Shutemov > > > > > > > > wrote: > > > > > > > > > On Thu, Oct 29, 2015 at 04:58:29PM +0900, Minchan Kim wrote: > > > > > > > > > > On Thu, Oct 29, 2015 at 02:25:24AM +0200, Kirill A. > > > > > > > > > > Shutemov wrote: > > > > > > > > > > > On Thu, Oct 22, 2015 at 06:00:51PM +0900, Minchan Kim > > > > > > > > > > > wrote: > > > > > > > > > > > > On Thu, Oct 22, 2015 at 10:21:36AM +0900, Minchan Kim > > > > > > > > > > > > wrote: > > > > > > > > > > > > > Hello Hugh, > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Oct 21, 2015 at 05:59:59PM -0700, Hugh > > > > > > > > > > > > > Dickins wrote: > > > > > > > > > > > > > > On Thu, 22 Oct 2015, Minchan Kim wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I added the code to check it and queued it again > > > > > > > > > > > > > > > but I had another oops > > > > > > > > > > > > > > > in this time but symptom is related to anon_vma, > > > > > > > > > > > > > > > too. 
> > > > > > > > > > > > > > > (kernel is based on recent mmotm + unconditional > > > > > > > > > > > > > > > mkdirty for bug fix) > > > > > > > > > > > > > > > It seems page_get_anon_vma returns NULL since the > > > > > > > > > > > > > > > page was not page_mapped > > > > > > > > > > > > > > > at that time but second check of page_mapped > > > > > > > > > > > > > > > right before try_to_unmap seems > > > > > > > > > > > > > > > to be true. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Adding 4191228k swap on /dev/vda5. Priority:-1 > > > > > > > > > > > > > > > extents:1 across:4191228k FS > > > > > > > > > > > > > > > Adding 4191228k swap on /dev/vda5. Priority:-1 > > > > > > > > > > > > > > > extents:1 across:4191228k FS > > > > > > > > > > > > > > > page:ea0001cfbfc0 count:3 mapcount:1 > > > > > > > > > > > > > > > mapping:88007f1b5f51 index:0x60aff > > > > > > > > > > > > > > > flags: > > > > > > > > > > > > > > > 0x40048019(locked|uptodate|dirty|swapcache|swapbacked) > > > > > > > > > > > > > > > page dumped because: > > > > > > > > > > > > > > > VM_BUG_ON_PAGE(PageAnon(page) && !PageKsm(page) > > > > > > > > > > > > > > > && !anon_vma) > > > > > > > > > > > > > > > > > > > > > > > > > > > > That's interesting, that's one I added in my page > > > > > > > > > > > > > > migration series. > > > > > > > > > > > > > > Let me think on it, but it could well relate to the > > > > > > > > > > > > > > one you got before. > > > > > > > > > > > > > > > > > > > > > > > > > > I will roll back to > > > > > > > > > > > > > mm/madv_free-v4.3-rc5-mmotm-2015-10-15-15-20 > > > > > > > > > > > > > instead of next-20151021 to remove noise from your > > > > > > > > > > > > > migration cleanup > > > > > > > > > > > > > series and will test it again. > > > > > > > > > > > > > If it is fixed, I will test again with your migration > > > > > > > > > > > > > patchset, then. 
> > > > > > > > > > > > > > > > > > > > > > > > I tested mmotm-2015-10-15-15-20 with test program I > > > > > > > > > > > > attach for a long time. > > > > > > > > > > > > Therefore, there is no patchset from Hugh's migration > > > > > > > > > > > > patch in there. > > > > > > > > > > > > And I added below debug code with request from Kirill > > > > > > > > > > > > to all test kernels. > > > > > > > > > > > > > > > > > > > > > > It took too long time (and a lot of printk()), but I > > > > > > > > > > > think I track it down > > > > > > > > > > > finally. > > > > > > > > > > > > > > > > > > > > > > The patch below seems fixes issue for me. It's not yet > > > > > > > > > > > properly tested, but > > > > > > > > > > > looks like it works. > > > > > > > > > > > > > > > > > > > > > > The problem was my wrong assumption on how migration > > > > > > > > > > > works: I thought that > > > > > > > > > > > kernel would wait migration to finish on before > > > > > > > > > > > deconstruction mapping. > > > > > > > > > > > > > > > > > > > > > > But turn out that's not true. > > > > > > > > > > > > > > > > > > > > > > As result if zap_pte_range() races with > > > > > > > > > > > split_huge_page(), we can end up > > > > > > > > > > > with page which is not mapped anymore but has _count and > > > > > > > > > > > _mapcount > > > > > > > > > > > elevated. The page is on LRU too. So it's still reachable > >
Re: kernel oops on mmotm-2015-10-15-15-20
On Wed, Nov 04, 2015 at 04:21:35PM +0200, Kirill A. Shutemov wrote:
> On Wed, Nov 04, 2015 at 12:20:19AM +0900, Minchan Kim wrote:
> > On Tue, Nov 03, 2015 at 04:33:29PM +0900, Minchan Kim wrote:
> > > On Tue, Nov 03, 2015 at 09:16:50AM +0200, Kirill A. Shutemov wrote:
> > > > On Tue, Nov 03, 2015 at 12:02:58PM +0900, Minchan Kim wrote:
> > > > > Hello Kirill,
> > > > >
> > > > > On Mon, Nov 02, 2015 at 02:57:49PM +0200, Kirill A. Shutemov wrote:
> > > > > > On Fri, Oct 30, 2015 at 04:03:50PM +0900, Minchan Kim wrote:
> > > > > > > On Thu, Oct 29, 2015 at 11:52:06AM +0200, Kirill A. Shutemov
> > > > > > > wrote:
> > > > > > > > On Thu, Oct 29, 2015 at 04:58:29PM +0900, Minchan Kim wrote:
> > > > > > > > > On Thu, Oct 29, 2015 at 02:25:24AM +0200, Kirill A. Shutemov
> > > > > > > > > wrote:
> > > > > > > > > > On Thu, Oct 22, 2015 at 06:00:51PM +0900, Minchan Kim wrote:
> > > > > > > > > > > On Thu, Oct 22, 2015 at 10:21:36AM +0900, Minchan Kim
> > > > > > > > > > > wrote:
> > > > > > > > > > > > Hello Hugh,
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, Oct 21, 2015 at 05:59:59PM -0700, Hugh Dickins
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > On Thu, 22 Oct 2015, Minchan Kim wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I added the code to check it and queued it again
> > > > > > > > > > > > > > but I had another oops
> > > > > > > > > > > > > > in this time but symptom is related to anon_vma,
> > > > > > > > > > > > > > too.
> > > > > > > > > > > > > > (kernel is based on recent mmotm + unconditional
> > > > > > > > > > > > > > mkdirty for bug fix)
> > > > > > > > > > > > > > It seems page_get_anon_vma returns NULL since the
> > > > > > > > > > > > > > page was not page_mapped
> > > > > > > > > > > > > > at that time but second check of page_mapped right
> > > > > > > > > > > > > > before try_to_unmap seems
> > > > > > > > > > > > > > to be true.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Adding 4191228k swap on /dev/vda5. Priority:-1
> > > > > > > > > > > > > > extents:1 across:4191228k FS
> > > > > > > > > > > > > > Adding 4191228k swap on /dev/vda5. Priority:-1
> > > > > > > > > > > > > > extents:1 across:4191228k FS
> > > > > > > > > > > > > > page:ea0001cfbfc0 count:3 mapcount:1
> > > > > > > > > > > > > > mapping:88007f1b5f51 index:0x60aff
> > > > > > > > > > > > > > flags:
> > > > > > > > > > > > > > 0x40048019(locked|uptodate|dirty|swapcache|swapbacked)
> > > > > > > > > > > > > > page dumped because: VM_BUG_ON_PAGE(PageAnon(page)
> > > > > > > > > > > > > > && !PageKsm(page) && !anon_vma)
> > > > > > > > > > > > >
> > > > > > > > > > > > > That's interesting, that's one I added in my page
> > > > > > > > > > > > > migration series.
> > > > > > > > > > > > > Let me think on it, but it could well relate to the
> > > > > > > > > > > > > one you got before.
> > > > > > > > > > > >
> > > > > > > > > > > > I will roll back to
> > > > > > > > > > > > mm/madv_free-v4.3-rc5-mmotm-2015-10-15-15-20
> > > > > > > > > > > > instead of next-20151021 to remove noise from your
> > > > > > > > > > > > migration cleanup
> > > > > > > > > > > > series and will test it again.
> > > > > > > > > > > > If it is fixed, I will test again with your migration
> > > > > > > > > > > > patchset, then.
> > > > > > > > > > >
> > > > > > > > > > > I tested mmotm-2015-10-15-15-20 with test program I
> > > > > > > > > > > attach for a long time.
> > > > > > > > > > > Therefore, there is no patchset from Hugh's migration
> > > > > > > > > > > patch in there.
> > > > > > > > > > > And I added below debug code with request from Kirill to
> > > > > > > > > > > all test kernels.
> > > > > > > > > >
> > > > > > > > > > It took too long time (and a lot of printk()), but I think
> > > > > > > > > > I track it down
> > > > > > > > > > finally.
> > > > > > > > > >
> > > > > > > > > > The patch below seems fixes issue for me. It's not yet
> > > > > > > > > > properly tested, but
> > > > > > > > > > looks like it works.
> > > > > > > > > >
> > > > > > > > > > The problem was my wrong assumption on how migration works:
> > > > > > > > > > I thought that
> > > > > > > > > > kernel would wait migration to finish on before
> > > > > > > > > > deconstruction mapping.
> > > > > > > > > >
> > > > > > > > > > But turn out that's not true.
> > > > > > > > > >
> > > > > > > > > > As result if zap_pte_range() races with split_huge_page(),
> > > > > > > > > > we can end up
> > > > > > > > > > with page which is not mapped anymore but has _count and
> > > > > > > > > > _mapcount
> > > > > > > > > > elevated. The page is on LRU too. So it's still reachable
> > > > > > > > > > by vmscan and by
> > > > > > > > > > pfn scanners (Sasha showed few similar traces from
> > > > > > > > > > compaction too).
> > > > > > > > > > It's likely that page->mapping in this case would point to
> > > > > > > > > > freed anon_vma.
> > > > > > > > > >
Re: kernel oops on mmotm-2015-10-15-15-20
On Wed, Nov 04, 2015 at 12:20:19AM +0900, Minchan Kim wrote:
> On Tue, Nov 03, 2015 at 04:33:29PM +0900, Minchan Kim wrote:
> > On Tue, Nov 03, 2015 at 09:16:50AM +0200, Kirill A. Shutemov wrote:
> > > On Tue, Nov 03, 2015 at 12:02:58PM +0900, Minchan Kim wrote:
> > > > Hello Kirill,
> > > >
> > > > On Mon, Nov 02, 2015 at 02:57:49PM +0200, Kirill A. Shutemov wrote:
> > > > > On Fri, Oct 30, 2015 at 04:03:50PM +0900, Minchan Kim wrote:
> > > > > > On Thu, Oct 29, 2015 at 11:52:06AM +0200, Kirill A. Shutemov wrote:
> > > > > > > On Thu, Oct 29, 2015 at 04:58:29PM +0900, Minchan Kim wrote:
> > > > > > > > On Thu, Oct 29, 2015 at 02:25:24AM +0200, Kirill A. Shutemov
> > > > > > > > wrote:
> > > > > > > > > On Thu, Oct 22, 2015 at 06:00:51PM +0900, Minchan Kim wrote:
> > > > > > > > > > On Thu, Oct 22, 2015 at 10:21:36AM +0900, Minchan Kim wrote:
> > > > > > > > > > > Hello Hugh,
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Oct 21, 2015 at 05:59:59PM -0700, Hugh Dickins
> > > > > > > > > > > wrote:
> > > > > > > > > > > > On Thu, 22 Oct 2015, Minchan Kim wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > I added the code to check it and queued it again but
> > > > > > > > > > > > > I had another oops
> > > > > > > > > > > > > in this time but symptom is related to anon_vma, too.
> > > > > > > > > > > > > (kernel is based on recent mmotm + unconditional
> > > > > > > > > > > > > mkdirty for bug fix)
> > > > > > > > > > > > > It seems page_get_anon_vma returns NULL since the
> > > > > > > > > > > > > page was not page_mapped
> > > > > > > > > > > > > at that time but second check of page_mapped right
> > > > > > > > > > > > > before try_to_unmap seems
> > > > > > > > > > > > > to be true.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Adding 4191228k swap on /dev/vda5. Priority:-1
> > > > > > > > > > > > > extents:1 across:4191228k FS
> > > > > > > > > > > > > Adding 4191228k swap on /dev/vda5. Priority:-1
> > > > > > > > > > > > > extents:1 across:4191228k FS
> > > > > > > > > > > > > page:ea0001cfbfc0 count:3 mapcount:1
> > > > > > > > > > > > > mapping:88007f1b5f51 index:0x60aff
> > > > > > > > > > > > > flags:
> > > > > > > > > > > > > 0x40048019(locked|uptodate|dirty|swapcache|swapbacked)
> > > > > > > > > > > > > page dumped because: VM_BUG_ON_PAGE(PageAnon(page) &&
> > > > > > > > > > > > > !PageKsm(page) && !anon_vma)
> > > > > > > > > > > >
> > > > > > > > > > > > That's interesting, that's one I added in my page
> > > > > > > > > > > > migration series.
> > > > > > > > > > > > Let me think on it, but it could well relate to the one
> > > > > > > > > > > > you got before.
> > > > > > > > > > >
> > > > > > > > > > > I will roll back to
> > > > > > > > > > > mm/madv_free-v4.3-rc5-mmotm-2015-10-15-15-20
> > > > > > > > > > > instead of next-20151021 to remove noise from your
> > > > > > > > > > > migration cleanup
> > > > > > > > > > > series and will test it again.
> > > > > > > > > > > If it is fixed, I will test again with your migration
> > > > > > > > > > > patchset, then.
> > > > > > > > > >
> > > > > > > > > > I tested mmotm-2015-10-15-15-20 with test program I attach
> > > > > > > > > > for a long time.
> > > > > > > > > > Therefore, there is no patchset from Hugh's migration patch
> > > > > > > > > > in there.
> > > > > > > > > > And I added below debug code with request from Kirill to
> > > > > > > > > > all test kernels.
> > > > > > > > >
> > > > > > > > > It took too long time (and a lot of printk()), but I think I
> > > > > > > > > track it down
> > > > > > > > > finally.
> > > > > > > > >
> > > > > > > > > The patch below seems fixes issue for me. It's not yet
> > > > > > > > > properly tested, but
> > > > > > > > > looks like it works.
> > > > > > > > >
> > > > > > > > > The problem was my wrong assumption on how migration works: I
> > > > > > > > > thought that
> > > > > > > > > kernel would wait migration to finish on before
> > > > > > > > > deconstruction mapping.
> > > > > > > > >
> > > > > > > > > But turn out that's not true.
> > > > > > > > >
> > > > > > > > > As result if zap_pte_range() races with split_huge_page(), we
> > > > > > > > > can end up
> > > > > > > > > with page which is not mapped anymore but has _count and
> > > > > > > > > _mapcount
> > > > > > > > > elevated. The page is on LRU too. So it's still reachable by
> > > > > > > > > vmscan and by
> > > > > > > > > pfn scanners (Sasha showed few similar traces from compaction
> > > > > > > > > too).
> > > > > > > > > It's likely that page->mapping in this case would point to
> > > > > > > > > freed anon_vma.
> > > > > > > > >
> > > > > > > > > BOOM!
> > > > > > > > >
> > > > > > > > > The patch modify freeze/unfreeze_page() code to match normal
> > > > > > > > > migration
> > > > > > > > > entries logic: on setup we remove page from rmap and drop
> > > > > > > > > pin, on removing
> > > > > > > > > we get pin back and put page on rmap.
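The freeze/unfreeze ordering described in the quoted text can be illustrated with a small user-space toy model. This is only a sketch under stated assumptions, not the kernel patch: `toy_page`, `toy_pte`, `toy_freeze()`, `toy_unfreeze()` and `toy_zap()` are hypothetical names invented here, the two counters are plain integers rather than the kernel's `_count` and `_mapcount` encoding, and the model assumes that clearing a migration entry does not touch either counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in the kernel. */

/* The two per-page counters discussed in the thread, as plain integers. */
struct toy_page {
	int count;     /* stands in for _count (references held on the page) */
	int mapcount;  /* stands in for _mapcount, here "number of mappings" */
};

/* State of the single page-table slot that maps the page. */
enum toy_pte { TOY_PTE_NONE, TOY_PTE_PRESENT, TOY_PTE_MIGRATION };

/* Setup: replace the mapping with a migration entry, remove the page
 * from rmap and drop the pin that the mapping held. */
static void toy_freeze(struct toy_page *page, enum toy_pte *pte)
{
	if (*pte != TOY_PTE_PRESENT)
		return;
	*pte = TOY_PTE_MIGRATION;
	page->mapcount--;
	page->count--;
}

/* Removal: only if the migration entry is still there, take the pin
 * back and put the page on rmap again. */
static void toy_unfreeze(struct toy_page *page, enum toy_pte *pte)
{
	if (*pte != TOY_PTE_MIGRATION)
		return;
	page->count++;
	page->mapcount++;
	*pte = TOY_PTE_PRESENT;
}

/* A concurrent unmap that does not wait for the split to finish.
 * Assumption of this model: a migration entry holds no reference,
 * so clearing it leaves both counters alone. */
static void toy_zap(struct toy_page *page, enum toy_pte *pte)
{
	if (*pte == TOY_PTE_PRESENT) {
		page->mapcount--;
		page->count--;
	}
	*pte = TOY_PTE_NONE;
}

/* Returns true if the counters are consistent after the interleaving
 * freeze -> zap -> unfreeze. */
bool toy_race_is_consistent(void)
{
	/* one mapping plus one extra reference held by the splitting caller */
	struct toy_page page = { .count = 2, .mapcount = 1 };
	enum toy_pte pte = TOY_PTE_PRESENT;

	toy_freeze(&page, &pte);   /* count 1, mapcount 0, migration entry */
	toy_zap(&page, &pte);      /* entry cleared, counters untouched */
	toy_unfreeze(&page, &pte); /* nothing left to restore */

	return pte == TOY_PTE_NONE && page.mapcount == 0 && page.count == 1;
}
```

In this model the page ends up unmapped with only the caller's own reference left, because the counters were already dropped at setup time. If the drop were instead deferred to the removal step, a zap in between would leave an unmapped page with both counters still elevated, which is the state the quoted explanation describes.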
Re: kernel oops on mmotm-2015-10-15-15-20
On Tue, Nov 03, 2015 at 04:33:29PM +0900, Minchan Kim wrote:
> On Tue, Nov 03, 2015 at 09:16:50AM +0200, Kirill A. Shutemov wrote:
> > On Tue, Nov 03, 2015 at 12:02:58PM +0900, Minchan Kim wrote:
> > > Hello Kirill,
> > >
> > > On Mon, Nov 02, 2015 at 02:57:49PM +0200, Kirill A. Shutemov wrote:
> > > > On Fri, Oct 30, 2015 at 04:03:50PM +0900, Minchan Kim wrote:
> > > > > On Thu, Oct 29, 2015 at 11:52:06AM +0200, Kirill A. Shutemov wrote:
> > > > > > On Thu, Oct 29, 2015 at 04:58:29PM +0900, Minchan Kim wrote:
> > > > > > > On Thu, Oct 29, 2015 at 02:25:24AM +0200, Kirill A. Shutemov
> > > > > > > wrote:
> > > > > > > > On Thu, Oct 22, 2015 at 06:00:51PM +0900, Minchan Kim wrote:
> > > > > > > > > On Thu, Oct 22, 2015 at 10:21:36AM +0900, Minchan Kim wrote:
> > > > > > > > > > Hello Hugh,
> > > > > > > > > >
> > > > > > > > > > On Wed, Oct 21, 2015 at 05:59:59PM -0700, Hugh Dickins
> > > > > > > > > > wrote:
> > > > > > > > > > > On Thu, 22 Oct 2015, Minchan Kim wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > I added the code to check it and queued it again but I
> > > > > > > > > > > > had another oops
> > > > > > > > > > > > in this time but symptom is related to anon_vma, too.
> > > > > > > > > > > > (kernel is based on recent mmotm + unconditional
> > > > > > > > > > > > mkdirty for bug fix)
> > > > > > > > > > > > It seems page_get_anon_vma returns NULL since the page
> > > > > > > > > > > > was not page_mapped
> > > > > > > > > > > > at that time but second check of page_mapped right
> > > > > > > > > > > > before try_to_unmap seems
> > > > > > > > > > > > to be true.
> > > > > > > > > > > >
> > > > > > > > > > > > Adding 4191228k swap on /dev/vda5. Priority:-1
> > > > > > > > > > > > extents:1 across:4191228k FS
> > > > > > > > > > > > Adding 4191228k swap on /dev/vda5. Priority:-1
> > > > > > > > > > > > extents:1 across:4191228k FS
> > > > > > > > > > > > page:ea0001cfbfc0 count:3 mapcount:1
> > > > > > > > > > > > mapping:88007f1b5f51 index:0x60aff
> > > > > > > > > > > > flags:
> > > > > > > > > > > > 0x40048019(locked|uptodate|dirty|swapcache|swapbacked)
> > > > > > > > > > > > page dumped because: VM_BUG_ON_PAGE(PageAnon(page) &&
> > > > > > > > > > > > !PageKsm(page) && !anon_vma)
> > > > > > > > > > >
> > > > > > > > > > > That's interesting, that's one I added in my page
> > > > > > > > > > > migration series.
> > > > > > > > > > > Let me think on it, but it could well relate to the one
> > > > > > > > > > > you got before.
> > > > > > > > > >
> > > > > > > > > > I will roll back to
> > > > > > > > > > mm/madv_free-v4.3-rc5-mmotm-2015-10-15-15-20
> > > > > > > > > > instead of next-20151021 to remove noise from your
> > > > > > > > > > migration cleanup
> > > > > > > > > > series and will test it again.
> > > > > > > > > > If it is fixed, I will test again with your migration
> > > > > > > > > > patchset, then.
> > > > > > > > >
> > > > > > > > > I tested mmotm-2015-10-15-15-20 with test program I attach
> > > > > > > > > for a long time.
> > > > > > > > > Therefore, there is no patchset from Hugh's migration patch
> > > > > > > > > in there.
> > > > > > > > > And I added below debug code with request from Kirill to all
> > > > > > > > > test kernels.
> > > > > > > >
> > > > > > > > It took too long time (and a lot of printk()), but I think I
> > > > > > > > track it down
> > > > > > > > finally.
> > > > > > > >
> > > > > > > > The patch below seems fixes issue for me. It's not yet properly
> > > > > > > > tested, but
> > > > > > > > looks like it works.
> > > > > > > >
> > > > > > > > The problem was my wrong assumption on how migration works: I
> > > > > > > > thought that
> > > > > > > > kernel would wait migration to finish on before deconstruction
> > > > > > > > mapping.
> > > > > > > >
> > > > > > > > But turn out that's not true.
> > > > > > > >
> > > > > > > > As result if zap_pte_range() races with split_huge_page(), we
> > > > > > > > can end up
> > > > > > > > with page which is not mapped anymore but has _count and
> > > > > > > > _mapcount
> > > > > > > > elevated. The page is on LRU too. So it's still reachable by
> > > > > > > > vmscan and by
> > > > > > > > pfn scanners (Sasha showed few similar traces from compaction
> > > > > > > > too).
> > > > > > > > It's likely that page->mapping in this case would point to
> > > > > > > > freed anon_vma.
> > > > > > > >
> > > > > > > > BOOM!
> > > > > > > >
> > > > > > > > The patch modify freeze/unfreeze_page() code to match normal
> > > > > > > > migration
> > > > > > > > entries logic: on setup we remove page from rmap and drop pin,
> > > > > > > > on removing
> > > > > > > > we get pin back and put page on rmap. This way even if
> > > > > > > > migration entry
> > > > > > > > will be removed under us we don't corrupt page's state.
> > > > > > > >
> > > > > > > > Please, test.
> > > > > > > >
> > > > > > >
> > > > > > > kernel: On mmotm-2015-10-15-15-20
Re: kernel oops on mmotm-2015-10-15-15-20
On Tue, Nov 03, 2015 at 09:16:50AM +0200, Kirill A. Shutemov wrote:
> On Tue, Nov 03, 2015 at 12:02:58PM +0900, Minchan Kim wrote:
> > Hello Kirill,
> >
> > On Mon, Nov 02, 2015 at 02:57:49PM +0200, Kirill A. Shutemov wrote:
> > > On Fri, Oct 30, 2015 at 04:03:50PM +0900, Minchan Kim wrote:
> > > > On Thu, Oct 29, 2015 at 11:52:06AM +0200, Kirill A. Shutemov wrote:
> > > > > On Thu, Oct 29, 2015 at 04:58:29PM +0900, Minchan Kim wrote:
> > > > > > On Thu, Oct 29, 2015 at 02:25:24AM +0200, Kirill A. Shutemov wrote:
> > > > > > > On Thu, Oct 22, 2015 at 06:00:51PM +0900, Minchan Kim wrote:
> > > > > > > > On Thu, Oct 22, 2015 at 10:21:36AM +0900, Minchan Kim wrote:
> > > > > > > > > Hello Hugh,
> > > > > > > > >
> > > > > > > > > On Wed, Oct 21, 2015 at 05:59:59PM -0700, Hugh Dickins wrote:
> > > > > > > > > > On Thu, 22 Oct 2015, Minchan Kim wrote:
> > > > > > > > > > >
> > > > > > > > > > > I added the code to check it and queued it again but I
> > > > > > > > > > > had another oops
> > > > > > > > > > > in this time but symptom is related to anon_vma, too.
> > > > > > > > > > > (kernel is based on recent mmotm + unconditional mkdirty
> > > > > > > > > > > for bug fix)
> > > > > > > > > > > It seems page_get_anon_vma returns NULL since the page
> > > > > > > > > > > was not page_mapped
> > > > > > > > > > > at that time but second check of page_mapped right before
> > > > > > > > > > > try_to_unmap seems
> > > > > > > > > > > to be true.
> > > > > > > > > > >
> > > > > > > > > > > Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1
> > > > > > > > > > > across:4191228k FS
> > > > > > > > > > > Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1
> > > > > > > > > > > across:4191228k FS
> > > > > > > > > > > page:ea0001cfbfc0 count:3 mapcount:1
> > > > > > > > > > > mapping:88007f1b5f51 index:0x60aff
> > > > > > > > > > > flags:
> > > > > > > > > > > 0x40048019(locked|uptodate|dirty|swapcache|swapbacked)
> > > > > > > > > > > page dumped because: VM_BUG_ON_PAGE(PageAnon(page) &&
> > > > > > > > > > > !PageKsm(page) && !anon_vma)
> > > > > > > > > >
> > > > > > > > > > That's interesting, that's one I added in my page migration
> > > > > > > > > > series.
> > > > > > > > > > Let me think on it, but it could well relate to the one you
> > > > > > > > > > got before.
> > > > > > > > >
> > > > > > > > > I will roll back to
> > > > > > > > > mm/madv_free-v4.3-rc5-mmotm-2015-10-15-15-20
> > > > > > > > > instead of next-20151021 to remove noise from your migration
> > > > > > > > > cleanup
> > > > > > > > > series and will test it again.
> > > > > > > > > If it is fixed, I will test again with your migration
> > > > > > > > > patchset, then.
> > > > > > > >
> > > > > > > > I tested mmotm-2015-10-15-15-20 with test program I attach for
> > > > > > > > a long time.
> > > > > > > > Therefore, there is no patchset from Hugh's migration patch in
> > > > > > > > there.
> > > > > > > > And I added below debug code with request from Kirill to all
> > > > > > > > test kernels.
> > > > > > >
> > > > > > > It took too long time (and a lot of printk()), but I think I
> > > > > > > track it down
> > > > > > > finally.
> > > > > > >
> > > > > > > The patch below seems fixes issue for me. It's not yet properly
> > > > > > > tested, but
> > > > > > > looks like it works.
> > > > > > >
> > > > > > > The problem was my wrong assumption on how migration works: I
> > > > > > > thought that
> > > > > > > kernel would wait migration to finish on before deconstruction
> > > > > > > mapping.
> > > > > > >
> > > > > > > But turn out that's not true.
> > > > > > >
> > > > > > > As result if zap_pte_range() races with split_huge_page(), we can
> > > > > > > end up
> > > > > > > with page which is not mapped anymore but has _count and _mapcount
> > > > > > > elevated. The page is on LRU too. So it's still reachable by
> > > > > > > vmscan and by
> > > > > > > pfn scanners (Sasha showed few similar traces from compaction
> > > > > > > too).
> > > > > > > It's likely that page->mapping in this case would point to freed
> > > > > > > anon_vma.
> > > > > > >
> > > > > > > BOOM!
> > > > > > >
> > > > > > > The patch modify freeze/unfreeze_page() code to match normal
> > > > > > > migration
> > > > > > > entries logic: on setup we remove page from rmap and drop pin, on
> > > > > > > removing
> > > > > > > we get pin back and put page on rmap. This way even if migration
> > > > > > > entry
> > > > > > > will be removed under us we don't corrupt page's state.
> > > > > > >
> > > > > > > Please, test.
> > > > > > >
> > > > > >
> > > > > > kernel: On mmotm-2015-10-15-15-20 + pte_mkdirty patch + your new
> > > > > > patch, I tested
> > > > > > one I sent to you(ie, oops.c + memcg_test.sh)
> > > > > >
> > > > > > page:ea00016a count:3 mapcount:0 mapping:88007f49d001
> > > > > > index:0x61800 compound_mapcount: 0
> > > > > > flags:
Re: kernel oops on mmotm-2015-10-15-15-20
On Tue, Nov 03, 2015 at 12:02:58PM +0900, Minchan Kim wrote:
> Hello Kirill,
>
> On Mon, Nov 02, 2015 at 02:57:49PM +0200, Kirill A. Shutemov wrote:
> > On Fri, Oct 30, 2015 at 04:03:50PM +0900, Minchan Kim wrote:
> > > On Thu, Oct 29, 2015 at 11:52:06AM +0200, Kirill A. Shutemov wrote:
> > > > On Thu, Oct 29, 2015 at 04:58:29PM +0900, Minchan Kim wrote:
> > > > > On Thu, Oct 29, 2015 at 02:25:24AM +0200, Kirill A. Shutemov wrote:
> > > > > > On Thu, Oct 22, 2015 at 06:00:51PM +0900, Minchan Kim wrote:
> > > > > > > On Thu, Oct 22, 2015 at 10:21:36AM +0900, Minchan Kim wrote:
> > > > > > > > Hello Hugh,
> > > > > > > >
> > > > > > > > On Wed, Oct 21, 2015 at 05:59:59PM -0700, Hugh Dickins wrote:
> > > > > > > > > On Thu, 22 Oct 2015, Minchan Kim wrote:
> > > > > > > > > >
> > > > > > > > > > I added the code to check it and queued it again but I had another oops
> > > > > > > > > > in this time but symptom is related to anon_vma, too.
> > > > > > > > > > (kernel is based on recent mmotm + unconditional mkdirty for bug fix)
> > > > > > > > > > It seems page_get_anon_vma returns NULL since the page was not page_mapped
> > > > > > > > > > at that time but second check of page_mapped right before try_to_unmap seems
> > > > > > > > > > to be true.
> > > > > > > > > >
> > > > > > > > > > Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 across:4191228k FS
> > > > > > > > > > Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 across:4191228k FS
> > > > > > > > > > page:ea0001cfbfc0 count:3 mapcount:1 mapping:88007f1b5f51 index:0x60aff
> > > > > > > > > > flags: 0x40048019(locked|uptodate|dirty|swapcache|swapbacked)
> > > > > > > > > > page dumped because: VM_BUG_ON_PAGE(PageAnon(page) && !PageKsm(page) && !anon_vma)
> > > > > > > > >
> > > > > > > > > That's interesting, that's one I added in my page migration series.
> > > > > > > > > Let me think on it, but it could well relate to the one you got before.
> > > > > > > >
> > > > > > > > I will roll back to mm/madv_free-v4.3-rc5-mmotm-2015-10-15-15-20
> > > > > > > > instead of next-20151021 to remove noise from your migration cleanup
> > > > > > > > series and will test it again.
> > > > > > > > If it is fixed, I will test again with your migration patchset, then.
> > > > > > >
> > > > > > > I tested mmotm-2015-10-15-15-20 with test program I attach for a long time.
> > > > > > > Therefore, there is no patchset from Hugh's migration patch in there.
> > > > > > > And I added below debug code with request from Kirill to all test kernels.
> > > > > >
> > > > > > It took too long time (and a lot of printk()), but I think I track it down
> > > > > > finally.
> > > > > >
> > > > > > The patch below seems fixes issue for me. It's not yet properly tested, but
> > > > > > looks like it works.
> > > > > >
> > > > > > The problem was my wrong assumption on how migration works: I thought that
> > > > > > kernel would wait migration to finish on before deconstruction mapping.
> > > > > >
> > > > > > But turn out that's not true.
> > > > > >
> > > > > > As result if zap_pte_range() races with split_huge_page(), we can end up
> > > > > > with page which is not mapped anymore but has _count and _mapcount
> > > > > > elevated. The page is on LRU too. So it's still reachable by vmscan and by
> > > > > > pfn scanners (Sasha showed few similar traces from compaction too).
> > > > > > It's likely that page->mapping in this case would point to freed anon_vma.
> > > > > >
> > > > > > BOOM!
> > > > > >
> > > > > > The patch modify freeze/unfreeze_page() code to match normal migration
> > > > > > entries logic: on setup we remove page from rmap and drop pin, on removing
> > > > > > we get pin back and put page on rmap. This way even if migration entry
> > > > > > will be removed under us we don't corrupt page's state.
> > > > > >
> > > > > > Please, test.
> > > > > >
> > > > >
> > > > > kernel: On mmotm-2015-10-15-15-20 + pte_mkdirty patch + your new patch, I tested
> > > > > one I sent to you(ie, oops.c + memcg_test.sh)
> > > > >
> > > > > page:ea00016a count:3 mapcount:0 mapping:88007f49d001 index:0x61800 compound_mapcount: 0
> > > > > flags: 0x40044009(locked|uptodate|head|swapbacked)
> > > > > page dumped because: VM_BUG_ON_PAGE(!page_mapcount(page))
> > > > > page->mem_cgroup:88007f613c00
> > > >
> > > > Ignore my previous answer. Still sleeping.
> > > >
> > > > The right way to fix I think is something like:
> > > >
> > > > diff --git a/mm/rmap.c
Re: kernel oops on mmotm-2015-10-15-15-20
Hello Kirill,

On Mon, Nov 02, 2015 at 02:57:49PM +0200, Kirill A. Shutemov wrote:
> On Fri, Oct 30, 2015 at 04:03:50PM +0900, Minchan Kim wrote:
> > On Thu, Oct 29, 2015 at 11:52:06AM +0200, Kirill A. Shutemov wrote:
> > > On Thu, Oct 29, 2015 at 04:58:29PM +0900, Minchan Kim wrote:
> > > > On Thu, Oct 29, 2015 at 02:25:24AM +0200, Kirill A. Shutemov wrote:
> > > > > On Thu, Oct 22, 2015 at 06:00:51PM +0900, Minchan Kim wrote:
> > > > > > On Thu, Oct 22, 2015 at 10:21:36AM +0900, Minchan Kim wrote:
> > > > > > > Hello Hugh,
> > > > > > >
> > > > > > > On Wed, Oct 21, 2015 at 05:59:59PM -0700, Hugh Dickins wrote:
> > > > > > > > On Thu, 22 Oct 2015, Minchan Kim wrote:
> > > > > > > > >
> > > > > > > > > I added the code to check it and queued it again but I had another oops
> > > > > > > > > in this time but symptom is related to anon_vma, too.
> > > > > > > > > (kernel is based on recent mmotm + unconditional mkdirty for bug fix)
> > > > > > > > > It seems page_get_anon_vma returns NULL since the page was not page_mapped
> > > > > > > > > at that time but second check of page_mapped right before try_to_unmap seems
> > > > > > > > > to be true.
> > > > > > > > >
> > > > > > > > > Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 across:4191228k FS
> > > > > > > > > Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 across:4191228k FS
> > > > > > > > > page:ea0001cfbfc0 count:3 mapcount:1 mapping:88007f1b5f51 index:0x60aff
> > > > > > > > > flags: 0x40048019(locked|uptodate|dirty|swapcache|swapbacked)
> > > > > > > > > page dumped because: VM_BUG_ON_PAGE(PageAnon(page) && !PageKsm(page) && !anon_vma)
> > > > > > > >
> > > > > > > > That's interesting, that's one I added in my page migration series.
> > > > > > > > Let me think on it, but it could well relate to the one you got before.
> > > > > > >
> > > > > > > I will roll back to mm/madv_free-v4.3-rc5-mmotm-2015-10-15-15-20
> > > > > > > instead of next-20151021 to remove noise from your migration cleanup
> > > > > > > series and will test it again.
> > > > > > > If it is fixed, I will test again with your migration patchset, then.
> > > > > >
> > > > > > I tested mmotm-2015-10-15-15-20 with test program I attach for a long time.
> > > > > > Therefore, there is no patchset from Hugh's migration patch in there.
> > > > > > And I added below debug code with request from Kirill to all test kernels.
> > > > >
> > > > > It took too long time (and a lot of printk()), but I think I track it down
> > > > > finally.
> > > > >
> > > > > The patch below seems fixes issue for me. It's not yet properly tested, but
> > > > > looks like it works.
> > > > >
> > > > > The problem was my wrong assumption on how migration works: I thought that
> > > > > kernel would wait migration to finish on before deconstruction mapping.
> > > > >
> > > > > But turn out that's not true.
> > > > >
> > > > > As result if zap_pte_range() races with split_huge_page(), we can end up
> > > > > with page which is not mapped anymore but has _count and _mapcount
> > > > > elevated. The page is on LRU too. So it's still reachable by vmscan and by
> > > > > pfn scanners (Sasha showed few similar traces from compaction too).
> > > > > It's likely that page->mapping in this case would point to freed anon_vma.
> > > > >
> > > > > BOOM!
> > > > >
> > > > > The patch modify freeze/unfreeze_page() code to match normal migration
> > > > > entries logic: on setup we remove page from rmap and drop pin, on removing
> > > > > we get pin back and put page on rmap. This way even if migration entry
> > > > > will be removed under us we don't corrupt page's state.
> > > > >
> > > > > Please, test.
> > > > >
> > > >
> > > > kernel: On mmotm-2015-10-15-15-20 + pte_mkdirty patch + your new patch, I tested
> > > > one I sent to you(ie, oops.c + memcg_test.sh)
> > > >
> > > > page:ea00016a count:3 mapcount:0 mapping:88007f49d001 index:0x61800 compound_mapcount: 0
> > > > flags: 0x40044009(locked|uptodate|head|swapbacked)
> > > > page dumped because: VM_BUG_ON_PAGE(!page_mapcount(page))
> > > > page->mem_cgroup:88007f613c00
> > >
> > > Ignore my previous answer. Still sleeping.
> > >
> > > The right way to fix I think is something like:
> > >
> > > diff --git a/mm/rmap.c b/mm/rmap.c
> > > index 35643176bc15..f2d46792a554 100644
> > > --- a/mm/rmap.c
> > > +++ b/mm/rmap.c
> > > @@ -1173,20 +1173,12 @@ void do_page_add_anon_rmap(struct page *page,
> > >  	bool compound = flags & RMAP_COMPOUND;
> > >  	bool first;
> > >
> > > -	if (PageTransCompound(page)) {
Re: kernel oops on mmotm-2015-10-15-15-20
On Fri, Oct 30, 2015 at 04:03:50PM +0900, Minchan Kim wrote:
> On Thu, Oct 29, 2015 at 11:52:06AM +0200, Kirill A. Shutemov wrote:
> > On Thu, Oct 29, 2015 at 04:58:29PM +0900, Minchan Kim wrote:
> > > On Thu, Oct 29, 2015 at 02:25:24AM +0200, Kirill A. Shutemov wrote:
> > > > On Thu, Oct 22, 2015 at 06:00:51PM +0900, Minchan Kim wrote:
> > > > > On Thu, Oct 22, 2015 at 10:21:36AM +0900, Minchan Kim wrote:
> > > > > > Hello Hugh,
> > > > > >
> > > > > > On Wed, Oct 21, 2015 at 05:59:59PM -0700, Hugh Dickins wrote:
> > > > > > > On Thu, 22 Oct 2015, Minchan Kim wrote:
> > > > > > > >
> > > > > > > > I added the code to check it and queued it again but I had another oops
> > > > > > > > in this time but symptom is related to anon_vma, too.
> > > > > > > > (kernel is based on recent mmotm + unconditional mkdirty for bug fix)
> > > > > > > > It seems page_get_anon_vma returns NULL since the page was not page_mapped
> > > > > > > > at that time but second check of page_mapped right before try_to_unmap seems
> > > > > > > > to be true.
> > > > > > > >
> > > > > > > > Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 across:4191228k FS
> > > > > > > > Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 across:4191228k FS
> > > > > > > > page:ea0001cfbfc0 count:3 mapcount:1 mapping:88007f1b5f51 index:0x60aff
> > > > > > > > flags: 0x40048019(locked|uptodate|dirty|swapcache|swapbacked)
> > > > > > > > page dumped because: VM_BUG_ON_PAGE(PageAnon(page) && !PageKsm(page) && !anon_vma)
> > > > > > >
> > > > > > > That's interesting, that's one I added in my page migration series.
> > > > > > > Let me think on it, but it could well relate to the one you got before.
> > > > > >
> > > > > > I will roll back to mm/madv_free-v4.3-rc5-mmotm-2015-10-15-15-20
> > > > > > instead of next-20151021 to remove noise from your migration cleanup
> > > > > > series and will test it again.
> > > > > > If it is fixed, I will test again with your migration patchset, then.
> > > > >
> > > > > I tested mmotm-2015-10-15-15-20 with test program I attach for a long time.
> > > > > Therefore, there is no patchset from Hugh's migration patch in there.
> > > > > And I added below debug code with request from Kirill to all test kernels.
> > > >
> > > > It took too long time (and a lot of printk()), but I think I track it down
> > > > finally.
> > > >
> > > > The patch below seems fixes issue for me. It's not yet properly tested, but
> > > > looks like it works.
> > > >
> > > > The problem was my wrong assumption on how migration works: I thought that
> > > > kernel would wait migration to finish on before deconstruction mapping.
> > > >
> > > > But turn out that's not true.
> > > >
> > > > As result if zap_pte_range() races with split_huge_page(), we can end up
> > > > with page which is not mapped anymore but has _count and _mapcount
> > > > elevated. The page is on LRU too. So it's still reachable by vmscan and by
> > > > pfn scanners (Sasha showed few similar traces from compaction too).
> > > > It's likely that page->mapping in this case would point to freed anon_vma.
> > > >
> > > > BOOM!
> > > >
> > > > The patch modify freeze/unfreeze_page() code to match normal migration
> > > > entries logic: on setup we remove page from rmap and drop pin, on removing
> > > > we get pin back and put page on rmap. This way even if migration entry
> > > > will be removed under us we don't corrupt page's state.
> > > >
> > > > Please, test.
> > > >
> > >
> > > kernel: On mmotm-2015-10-15-15-20 + pte_mkdirty patch + your new patch, I tested
> > > one I sent to you(ie, oops.c + memcg_test.sh)
> > >
> > > page:ea00016a count:3 mapcount:0 mapping:88007f49d001 index:0x61800 compound_mapcount: 0
> > > flags: 0x40044009(locked|uptodate|head|swapbacked)
> > > page dumped because: VM_BUG_ON_PAGE(!page_mapcount(page))
> > > page->mem_cgroup:88007f613c00
> >
> > Ignore my previous answer. Still sleeping.
> >
> > The right way to fix I think is something like:
> >
> > diff --git a/mm/rmap.c b/mm/rmap.c
> > index 35643176bc15..f2d46792a554 100644
> > --- a/mm/rmap.c
> > +++ b/mm/rmap.c
> > @@ -1173,20 +1173,12 @@ void do_page_add_anon_rmap(struct page *page,
> >  	bool compound = flags & RMAP_COMPOUND;
> >  	bool first;
> >
> > -	if (PageTransCompound(page)) {
> > +	if (PageTransCompound(page) && compound) {
> > +		atomic_t *mapcount;
> >  		VM_BUG_ON_PAGE(!PageLocked(page), page);
> > -		if (compound) {
> > -			atomic_t *mapcount;
> > -
> > -			VM_BUG_ON_PAGE(!PageTransHuge(page), page);
> > -			mapcount =
Re: kernel oops on mmotm-2015-10-15-15-20
On Thu, Oct 29, 2015 at 11:52:06AM +0200, Kirill A. Shutemov wrote:
> On Thu, Oct 29, 2015 at 04:58:29PM +0900, Minchan Kim wrote:
> > On Thu, Oct 29, 2015 at 02:25:24AM +0200, Kirill A. Shutemov wrote:
> > > On Thu, Oct 22, 2015 at 06:00:51PM +0900, Minchan Kim wrote:
> > > > On Thu, Oct 22, 2015 at 10:21:36AM +0900, Minchan Kim wrote:
> > > > > Hello Hugh,
> > > > >
> > > > > On Wed, Oct 21, 2015 at 05:59:59PM -0700, Hugh Dickins wrote:
> > > > > > On Thu, 22 Oct 2015, Minchan Kim wrote:
> > > > > > >
> > > > > > > I added the code to check it and queued it again but I had another oops
> > > > > > > in this time but symptom is related to anon_vma, too.
> > > > > > > (kernel is based on recent mmotm + unconditional mkdirty for bug fix)
> > > > > > > It seems page_get_anon_vma returns NULL since the page was not page_mapped
> > > > > > > at that time but second check of page_mapped right before try_to_unmap seems
> > > > > > > to be true.
> > > > > > >
> > > > > > > Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 across:4191228k FS
> > > > > > > Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 across:4191228k FS
> > > > > > > page:ea0001cfbfc0 count:3 mapcount:1 mapping:88007f1b5f51 index:0x60aff
> > > > > > > flags: 0x40048019(locked|uptodate|dirty|swapcache|swapbacked)
> > > > > > > page dumped because: VM_BUG_ON_PAGE(PageAnon(page) && !PageKsm(page) && !anon_vma)
> > > > > >
> > > > > > That's interesting, that's one I added in my page migration series.
> > > > > > Let me think on it, but it could well relate to the one you got before.
> > > > >
> > > > > I will roll back to mm/madv_free-v4.3-rc5-mmotm-2015-10-15-15-20
> > > > > instead of next-20151021 to remove noise from your migration cleanup
> > > > > series and will test it again.
> > > > > If it is fixed, I will test again with your migration patchset, then.
> > > >
> > > > I tested mmotm-2015-10-15-15-20 with test program I attach for a long time.
> > > > Therefore, there is no patchset from Hugh's migration patch in there.
> > > > And I added below debug code with request from Kirill to all test kernels.
> > >
> > > It took too long time (and a lot of printk()), but I think I track it down
> > > finally.
> > >
> > > The patch below seems fixes issue for me. It's not yet properly tested, but
> > > looks like it works.
> > >
> > > The problem was my wrong assumption on how migration works: I thought that
> > > kernel would wait migration to finish on before deconstruction mapping.
> > >
> > > But turn out that's not true.
> > >
> > > As result if zap_pte_range() races with split_huge_page(), we can end up
> > > with page which is not mapped anymore but has _count and _mapcount
> > > elevated. The page is on LRU too. So it's still reachable by vmscan and by
> > > pfn scanners (Sasha showed few similar traces from compaction too).
> > > It's likely that page->mapping in this case would point to freed anon_vma.
> > >
> > > BOOM!
> > >
> > > The patch modify freeze/unfreeze_page() code to match normal migration
> > > entries logic: on setup we remove page from rmap and drop pin, on removing
> > > we get pin back and put page on rmap. This way even if migration entry
> > > will be removed under us we don't corrupt page's state.
> > >
> > > Please, test.
> > >
> >
> > kernel: On mmotm-2015-10-15-15-20 + pte_mkdirty patch + your new patch, I tested
> > one I sent to you(ie, oops.c + memcg_test.sh)
> >
> > page:ea00016a count:3 mapcount:0 mapping:88007f49d001 index:0x61800 compound_mapcount: 0
> > flags: 0x40044009(locked|uptodate|head|swapbacked)
> > page dumped because: VM_BUG_ON_PAGE(!page_mapcount(page))
> > page->mem_cgroup:88007f613c00
>
> Ignore my previous answer. Still sleeping.
>
> The right way to fix I think is something like:
>
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 35643176bc15..f2d46792a554 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1173,20 +1173,12 @@ void do_page_add_anon_rmap(struct page *page,
>  	bool compound = flags & RMAP_COMPOUND;
>  	bool first;
>
> -	if (PageTransCompound(page)) {
> +	if (PageTransCompound(page) && compound) {
> +		atomic_t *mapcount;
>  		VM_BUG_ON_PAGE(!PageLocked(page), page);
> -		if (compound) {
> -			atomic_t *mapcount;
> -
> -			VM_BUG_ON_PAGE(!PageTransHuge(page), page);
> -			mapcount = compound_mapcount_ptr(page);
> -			first = atomic_inc_and_test(mapcount);
> -		} else {
> -			/* Anon THP always mapped first with PMD */
> -			first = 0;
> -			VM_BUG_ON_PAGE(!page_mapcount(page), page);
> -			atomic_inc(&page->_mapcount);
> -
Re: kernel oops on mmotm-2015-10-15-15-20
On Thu, Oct 29, 2015 at 04:58:29PM +0900, Minchan Kim wrote:
> On Thu, Oct 29, 2015 at 02:25:24AM +0200, Kirill A. Shutemov wrote:
> > On Thu, Oct 22, 2015 at 06:00:51PM +0900, Minchan Kim wrote:
> > > On Thu, Oct 22, 2015 at 10:21:36AM +0900, Minchan Kim wrote:
> > > > Hello Hugh,
> > > >
> > > > On Wed, Oct 21, 2015 at 05:59:59PM -0700, Hugh Dickins wrote:
> > > > > On Thu, 22 Oct 2015, Minchan Kim wrote:
> > > > > >
> > > > > > I added the code to check it and queued it again but I had another oops
> > > > > > in this time but symptom is related to anon_vma, too.
> > > > > > (kernel is based on recent mmotm + unconditional mkdirty for bug fix)
> > > > > > It seems page_get_anon_vma returns NULL since the page was not page_mapped
> > > > > > at that time but second check of page_mapped right before try_to_unmap seems
> > > > > > to be true.
> > > > > >
> > > > > > Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 across:4191228k FS
> > > > > > Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 across:4191228k FS
> > > > > > page:ea0001cfbfc0 count:3 mapcount:1 mapping:88007f1b5f51 index:0x60aff
> > > > > > flags: 0x40048019(locked|uptodate|dirty|swapcache|swapbacked)
> > > > > > page dumped because: VM_BUG_ON_PAGE(PageAnon(page) && !PageKsm(page) && !anon_vma)
> > > > >
> > > > > That's interesting, that's one I added in my page migration series.
> > > > > Let me think on it, but it could well relate to the one you got before.
> > > >
> > > > I will roll back to mm/madv_free-v4.3-rc5-mmotm-2015-10-15-15-20
> > > > instead of next-20151021 to remove noise from your migration cleanup
> > > > series and will test it again.
> > > > If it is fixed, I will test again with your migration patchset, then.
> > >
> > > I tested mmotm-2015-10-15-15-20 with test program I attach for a long time.
> > > Therefore, there is no patchset from Hugh's migration patch in there.
> > > And I added below debug code with request from Kirill to all test kernels.
> >
> > It took too long time (and a lot of printk()), but I think I track it down
> > finally.
> >
> > The patch below seems fixes issue for me. It's not yet properly tested, but
> > looks like it works.
> >
> > The problem was my wrong assumption on how migration works: I thought that
> > kernel would wait migration to finish on before deconstruction mapping.
> >
> > But turn out that's not true.
> >
> > As result if zap_pte_range() races with split_huge_page(), we can end up
> > with page which is not mapped anymore but has _count and _mapcount
> > elevated. The page is on LRU too. So it's still reachable by vmscan and by
> > pfn scanners (Sasha showed few similar traces from compaction too).
> > It's likely that page->mapping in this case would point to freed anon_vma.
> >
> > BOOM!
> >
> > The patch modify freeze/unfreeze_page() code to match normal migration
> > entries logic: on setup we remove page from rmap and drop pin, on removing
> > we get pin back and put page on rmap. This way even if migration entry
> > will be removed under us we don't corrupt page's state.
> >
> > Please, test.
> >
>
> kernel: On mmotm-2015-10-15-15-20 + pte_mkdirty patch + your new patch, I tested
> one I sent to you(ie, oops.c + memcg_test.sh)
>
> page:ea00016a count:3 mapcount:0 mapping:88007f49d001 index:0x61800 compound_mapcount: 0
> flags: 0x40044009(locked|uptodate|head|swapbacked)
> page dumped because: VM_BUG_ON_PAGE(!page_mapcount(page))
> page->mem_cgroup:88007f613c00

Ignore my previous answer. Still sleeping.

The right way to fix I think is something like:

diff --git a/mm/rmap.c b/mm/rmap.c
index 35643176bc15..f2d46792a554 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1173,20 +1173,12 @@ void do_page_add_anon_rmap(struct page *page,
 	bool compound = flags & RMAP_COMPOUND;
 	bool first;
 
-	if (PageTransCompound(page)) {
+	if (PageTransCompound(page) && compound) {
+		atomic_t *mapcount;
 		VM_BUG_ON_PAGE(!PageLocked(page), page);
-		if (compound) {
-			atomic_t *mapcount;
-
-			VM_BUG_ON_PAGE(!PageTransHuge(page), page);
-			mapcount = compound_mapcount_ptr(page);
-			first = atomic_inc_and_test(mapcount);
-		} else {
-			/* Anon THP always mapped first with PMD */
-			first = 0;
-			VM_BUG_ON_PAGE(!page_mapcount(page), page);
-			atomic_inc(&page->_mapcount);
-		}
+		VM_BUG_ON_PAGE(!PageTransHuge(page), page);
+		mapcount = compound_mapcount_ptr(page);
+		first = atomic_inc_and_test(mapcount);
 	} else {
 		VM_BUG_ON_PAGE(compound, page);
 		first
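
For anyone following the counting logic, here is a minimal userspace sketch of the decision the patched do_page_add_anon_rmap() seems to make. It is not kernel code. It assumes the truncated else branch of the diff finishes as first = atomic_inc_and_test(&page->_mapcount), and it relies on the kernel convention that a mapcount of -1 means "not mapped". struct model_page, model_page_init(), model_inc_and_test() and model_add_anon_rmap() are hypothetical names invented for this illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; NOT the kernel layout. */
struct model_page {
	atomic_int mapcount;          /* models page->_mapcount, -1 == unmapped */
	atomic_int compound_mapcount; /* models the compound (PMD) mapcount, -1 == unmapped */
	bool trans_compound;          /* models PageTransCompound(page) */
};

static void model_page_init(struct model_page *page, bool trans_compound)
{
	atomic_init(&page->mapcount, -1);
	atomic_init(&page->compound_mapcount, -1);
	page->trans_compound = trans_compound;
}

/* Same contract as the kernel's atomic_inc_and_test(): true if the result is 0. */
static bool model_inc_and_test(atomic_int *v)
{
	return atomic_fetch_add(v, 1) + 1 == 0;
}

/* Returns the value the patch would assign to "first". */
static bool model_add_anon_rmap(struct model_page *page, bool compound)
{
	if (page->trans_compound && compound) {
		/* PMD mapping of the whole huge page: count it on the compound mapcount. */
		return model_inc_and_test(&page->compound_mapcount);
	}

	/* Models VM_BUG_ON_PAGE(compound, page) in the else branch. */
	assert(!compound);

	/*
	 * Assumed completion of the truncated hunk: a PTE mapping, including a
	 * PTE mapping of a THP subpage, goes through the ordinary per-page path,
	 * so a mapcount of -1 is legal here.
	 */
	return model_inc_and_test(&page->mapcount);
}
```

In this model, adding a PTE mapping to a THP subpage whose mapcount is back at -1 (which, as I read the thread, is the state unfreeze_page() sees once freeze has removed the page from rmap) simply returns first == true rather than tripping the old !page_mapcount(page) check.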
Re: kernel oops on mmotm-2015-10-15-15-20
On Thu, Oct 29, 2015 at 04:58:29PM +0900, Minchan Kim wrote:
> On Thu, Oct 29, 2015 at 02:25:24AM +0200, Kirill A. Shutemov wrote:
> > On Thu, Oct 22, 2015 at 06:00:51PM +0900, Minchan Kim wrote:
> > > On Thu, Oct 22, 2015 at 10:21:36AM +0900, Minchan Kim wrote:
> > > > Hello Hugh,
> > > >
> > > > On Wed, Oct 21, 2015 at 05:59:59PM -0700, Hugh Dickins wrote:
> > > > > On Thu, 22 Oct 2015, Minchan Kim wrote:
> > > > > >
> > > > > > I added the code to check it and queued it again but I had another oops
> > > > > > in this time but symptom is related to anon_vma, too.
> > > > > > (kernel is based on recent mmotm + unconditional mkdirty for bug fix)
> > > > > > It seems page_get_anon_vma returns NULL since the page was not page_mapped
> > > > > > at that time but second check of page_mapped right before try_to_unmap seems
> > > > > > to be true.
> > > > > >
> > > > > > Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 across:4191228k FS
> > > > > > Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 across:4191228k FS
> > > > > > page:ea0001cfbfc0 count:3 mapcount:1 mapping:88007f1b5f51 index:0x60aff
> > > > > > flags: 0x40048019(locked|uptodate|dirty|swapcache|swapbacked)
> > > > > > page dumped because: VM_BUG_ON_PAGE(PageAnon(page) && !PageKsm(page) && !anon_vma)
> > > > >
> > > > > That's interesting, that's one I added in my page migration series.
> > > > > Let me think on it, but it could well relate to the one you got before.
> > > >
> > > > I will roll back to mm/madv_free-v4.3-rc5-mmotm-2015-10-15-15-20
> > > > instead of next-20151021 to remove noise from your migration cleanup
> > > > series and will test it again.
> > > > If it is fixed, I will test again with your migration patchset, then.
> > >
> > > I tested mmotm-2015-10-15-15-20 with test program I attach for a long time.
> > > Therefore, there is no patchset from Hugh's migration patch in there.
> > > And I added below debug code with request from Kirill to all test kernels.
> >
> > It took too long time (and a lot of printk()), but I think I track it down
> > finally.
> >
> > The patch below seems fixes issue for me. It's not yet properly tested, but
> > looks like it works.
> >
> > The problem was my wrong assumption on how migration works: I thought that
> > kernel would wait migration to finish on before deconstruction mapping.
> >
> > But turn out that's not true.
> >
> > As result if zap_pte_range() races with split_huge_page(), we can end up
> > with page which is not mapped anymore but has _count and _mapcount
> > elevated. The page is on LRU too. So it's still reachable by vmscan and by
> > pfn scanners (Sasha showed few similar traces from compaction too).
> > It's likely that page->mapping in this case would point to freed anon_vma.
> >
> > BOOM!
> >
> > The patch modify freeze/unfreeze_page() code to match normal migration
> > entries logic: on setup we remove page from rmap and drop pin, on removing
> > we get pin back and put page on rmap. This way even if migration entry
> > will be removed under us we don't corrupt page's state.
> >
> > Please, test.
> >
>
> kernel: On mmotm-2015-10-15-15-20 + pte_mkdirty patch + your new patch, I tested
> one I sent to you(ie, oops.c + memcg_test.sh)
>
> page:ea00016a count:3 mapcount:0 mapping:88007f49d001 index:0x61800 compound_mapcount: 0
> flags: 0x40044009(locked|uptodate|head|swapbacked)
> page dumped because: VM_BUG_ON_PAGE(!page_mapcount(page))

The VM_BUG_ON_PAGE() is bogus after the patch. Just drop it.

> page->mem_cgroup:88007f613c00
> [ cut here ]
> kernel BUG at mm/rmap.c:1156!
> invalid opcode: [#1] SMP
> Dumping ftrace buffer:
>    (ftrace buffer empty)
> Modules linked in:
> CPU: 7 PID: 3312 Comm: oops Not tainted 4.3.0-rc5-mm1-madv-free-no-lazy-thp+ #1573
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> task: 8800b8804ec0 ti: 8805c000 task.ti: 8805c000
> RIP: 0010:[] [] do_page_add_anon_rmap+0x323/0x360
> RSP: :8805f758 EFLAGS: 00010292
> RAX: 0021 RBX: ea00016a RCX: 81830db8
> RDX: 0001 RSI: 0246 RDI: 821df4d8
> RBP: 8805f780 R08: R09: 880b8be0
> R10: 8163d7c0 R11: 01a5 R12: 88007e85ddc0
> R13: 6180 R14: R15: 88007e85ddc0
> FS: 7f5cd5fea740() GS:8800bfae() knlGS:
> CS: 0010 DS: ES: CR0: 8005003b
> CR2: 64c03000 CR3: 7f017000 CR4: 06a0
> Stack:
>  88007f351000 88007f352000 ea00016a 6180
>  88007e85ddc0 8805f790 81128278 8805f800
>  81146dbb 000619ff
Re: kernel oops on mmotm-2015-10-15-15-20
On Thu, Oct 29, 2015 at 02:25:24AM +0200, Kirill A. Shutemov wrote:
> On Thu, Oct 22, 2015 at 06:00:51PM +0900, Minchan Kim wrote:
> > On Thu, Oct 22, 2015 at 10:21:36AM +0900, Minchan Kim wrote:
> > > Hello Hugh,
> > >
> > > On Wed, Oct 21, 2015 at 05:59:59PM -0700, Hugh Dickins wrote:
> > > > On Thu, 22 Oct 2015, Minchan Kim wrote:
> > > > >
> > > > > I added the code to check it and queued it again but I had another oops
> > > > > in this time but symptom is related to anon_vma, too.
> > > > > (kernel is based on recent mmotm + unconditional mkdirty for bug fix)
> > > > > It seems page_get_anon_vma returns NULL since the page was not page_mapped
> > > > > at that time but second check of page_mapped right before try_to_unmap seems
> > > > > to be true.
> > > > >
> > > > > Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 across:4191228k FS
> > > > > Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 across:4191228k FS
> > > > > page:ea0001cfbfc0 count:3 mapcount:1 mapping:88007f1b5f51 index:0x60aff
> > > > > flags: 0x40048019(locked|uptodate|dirty|swapcache|swapbacked)
> > > > > page dumped because: VM_BUG_ON_PAGE(PageAnon(page) && !PageKsm(page) && !anon_vma)
> > > >
> > > > That's interesting, that's one I added in my page migration series.
> > > > Let me think on it, but it could well relate to the one you got before.
> > >
> > > I will roll back to mm/madv_free-v4.3-rc5-mmotm-2015-10-15-15-20
> > > instead of next-20151021 to remove noise from your migration cleanup
> > > series and will test it again.
> > > If it is fixed, I will test again with your migration patchset, then.
> >
> > I tested mmotm-2015-10-15-15-20 with test program I attach for a long time.
> > Therefore, there is no patchset from Hugh's migration patch in there.
> > And I added below debug code with request from Kirill to all test kernels.
>
> It took too long time (and a lot of printk()), but I think I track it down
> finally.
>
> The patch below seems fixes issue for me. It's not yet properly tested, but
> looks like it works.
>
> The problem was my wrong assumption on how migration works: I thought that
> kernel would wait migration to finish on before deconstruction mapping.
>
> But turn out that's not true.
>
> As result if zap_pte_range() races with split_huge_page(), we can end up
> with page which is not mapped anymore but has _count and _mapcount
> elevated. The page is on LRU too. So it's still reachable by vmscan and by
> pfn scanners (Sasha showed few similar traces from compaction too).
> It's likely that page->mapping in this case would point to freed anon_vma.
>
> BOOM!
>
> The patch modify freeze/unfreeze_page() code to match normal migration
> entries logic: on setup we remove page from rmap and drop pin, on removing
> we get pin back and put page on rmap. This way even if migration entry
> will be removed under us we don't corrupt page's state.
>
> Please, test.
>

kernel: On mmotm-2015-10-15-15-20 + pte_mkdirty patch + your new patch, I tested
one I sent to you(ie, oops.c + memcg_test.sh)

page:ea00016a count:3 mapcount:0 mapping:88007f49d001 index:0x61800 compound_mapcount: 0
flags: 0x40044009(locked|uptodate|head|swapbacked)
page dumped because: VM_BUG_ON_PAGE(!page_mapcount(page))
page->mem_cgroup:88007f613c00
[ cut here ]
kernel BUG at mm/rmap.c:1156!
invalid opcode: [#1] SMP
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in:
CPU: 7 PID: 3312 Comm: oops Not tainted 4.3.0-rc5-mm1-madv-free-no-lazy-thp+ #1573
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
task: 8800b8804ec0 ti: 8805c000 task.ti: 8805c000
RIP: 0010:[] [] do_page_add_anon_rmap+0x323/0x360
RSP: :8805f758 EFLAGS: 00010292
RAX: 0021 RBX: ea00016a RCX: 81830db8
RDX: 0001 RSI: 0246 RDI: 821df4d8
RBP: 8805f780 R08: R09: 880b8be0
R10: 8163d7c0 R11: 01a5 R12: 88007e85ddc0
R13: 6180 R14: R15: 88007e85ddc0
FS: 7f5cd5fea740() GS:8800bfae() knlGS:
CS: 0010 DS: ES: CR0: 8005003b
CR2: 64c03000 CR3: 7f017000 CR4: 06a0
Stack:
 88007f351000 88007f352000 ea00016a 6180
 88007e85ddc0 8805f790 81128278 8805f800
 81146dbb 000619ff 00061800 1600
Call Trace:
 [] page_add_anon_rmap+0x18/0x20
 [] unfreeze_page+0x24b/0x330
 [] split_huge_page_to_list+0x3df/0x920
 [] ? scan_swap_map+0x37f/0x550
 [] add_to_swap+0xb6/0x100
 [] shrink_page_list+0x3b7/0xdc0
 [] shrink_inactive_list+0x18c/0x4b0
 [] shrink_lruvec+0x58f/0x730
 [] shrink_zone+0xd4/0x280
 [] do_try_to_free_pages+0x12d/0x3b0
 []
Re: kernel oops on mmotm-2015-10-15-15-20
On Thu, Oct 29, 2015 at 04:58:29PM +0900, Minchan Kim wrote:
> On Thu, Oct 29, 2015 at 02:25:24AM +0200, Kirill A. Shutemov wrote:
> > On Thu, Oct 22, 2015 at 06:00:51PM +0900, Minchan Kim wrote:
> > > On Thu, Oct 22, 2015 at 10:21:36AM +0900, Minchan Kim wrote:
> > > > Hello Hugh,
> > > >
> > > > On Wed, Oct 21, 2015 at 05:59:59PM -0700, Hugh Dickins wrote:
> > > > > On Thu, 22 Oct 2015, Minchan Kim wrote:
> > > > > >
> > > > > > I added the code to check it and queued it again but I had another oops
> > > > > > in this time but symptom is related to anon_vma, too.
> > > > > > (kernel is based on recent mmotm + unconditional mkdirty for bug fix)
> > > > > > It seems page_get_anon_vma returns NULL since the page was not page_mapped
> > > > > > at that time but second check of page_mapped right before try_to_unmap seems
> > > > > > to be true.
> > > > > >
> > > > > > Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 across:4191228k FS
> > > > > > Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 across:4191228k FS
> > > > > > page:ea0001cfbfc0 count:3 mapcount:1 mapping:88007f1b5f51 index:0x60aff
> > > > > > flags: 0x40048019(locked|uptodate|dirty|swapcache|swapbacked)
> > > > > > page dumped because: VM_BUG_ON_PAGE(PageAnon(page) && !PageKsm(page) && !anon_vma)
> > > > >
> > > > > That's interesting, that's one I added in my page migration series.
> > > > > Let me think on it, but it could well relate to the one you got before.
> > > >
> > > > I will roll back to mm/madv_free-v4.3-rc5-mmotm-2015-10-15-15-20
> > > > instead of next-20151021 to remove noise from your migration cleanup
> > > > series and will test it again.
> > > > If it is fixed, I will test again with your migration patchset, then.
> > >
> > > I tested mmotm-2015-10-15-15-20 with test program I attach for a long time.
> > > Therefore, there is no patchset from Hugh's migration patch in there.
> > > And I added below debug code with request from Kirill to all test kernels.
> >
> > It took too long time (and a lot of printk()), but I think I track it down
> > finally.
> >
> > The patch below seems fixes issue for me. It's not yet properly tested, but
> > looks like it works.
> >
> > The problem was my wrong assumption on how migration works: I thought that
> > kernel would wait migration to finish on before deconstruction mapping.
> >
> > But turn out that's not true.
> >
> > As result if zap_pte_range() races with split_huge_page(), we can end up
> > with page which is not mapped anymore but has _count and _mapcount
> > elevated. The page is on LRU too. So it's still reachable by vmscan and by
> > pfn scanners (Sasha showed few similar traces from compaction too).
> > It's likely that page->mapping in this case would point to freed anon_vma.
> >
> > BOOM!
> >
> > The patch modify freeze/unfreeze_page() code to match normal migration
> > entries logic: on setup we remove page from rmap and drop pin, on removing
> > we get pin back and put page on rmap. This way even if migration entry
> > will be removed under us we don't corrupt page's state.
> >
> > Please, test.
> >
>
> kernel: On mmotm-2015-10-15-15-20 + pte_mkdirty patch + your new patch, I tested
> one I sent to you(ie, oops.c + memcg_test.sh)
>
> page:ea00016a count:3 mapcount:0 mapping:88007f49d001 index:0x61800 compound_mapcount: 0
> flags: 0x40044009(locked|uptodate|head|swapbacked)
> page dumped because: VM_BUG_ON_PAGE(!page_mapcount(page))
> page->mem_cgroup:88007f613c00

Ignore my previous answer. Still sleeping.
The right way to fix it, I think, is something like:

diff --git a/mm/rmap.c b/mm/rmap.c
index 35643176bc15..f2d46792a554 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1173,20 +1173,12 @@ void do_page_add_anon_rmap(struct page *page,
 	bool compound = flags & RMAP_COMPOUND;
 	bool first;
 
-	if (PageTransCompound(page)) {
+	if (PageTransCompound(page) && compound) {
+		atomic_t *mapcount;
 		VM_BUG_ON_PAGE(!PageLocked(page), page);
-		if (compound) {
-			atomic_t *mapcount;
-
-			VM_BUG_ON_PAGE(!PageTransHuge(page), page);
-			mapcount = compound_mapcount_ptr(page);
-			first = atomic_inc_and_test(mapcount);
-		} else {
-			/* Anon THP always mapped first with PMD */
-			first = 0;
-			VM_BUG_ON_PAGE(!page_mapcount(page), page);
-			atomic_inc(&page->_mapcount);
-		}
+		VM_BUG_ON_PAGE(!PageTransHuge(page), page);
+		mapcount = compound_mapcount_ptr(page);
+		first = atomic_inc_and_test(mapcount);
 	} else {
 		VM_BUG_ON_PAGE(compound, page);
 		first
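A note on the idiom used in the patch: `first = atomic_inc_and_test(mapcount)` relies on the kernel convention that map counters start at -1, so the increment that brings the counter to 0 marks the first mapping. The sketch below is a userspace illustration of that convention only, written with C11 atomics. The `toy_page` structure and every `toy_` function are hypothetical names made up for this note; they are not kernel code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy stand-in for a page's map counter; NOT the kernel's struct page. */
struct toy_page {
	atomic_int mapcount;	/* -1 means "not mapped anywhere" */
};

static void toy_page_init(struct toy_page *p)
{
	atomic_init(&p->mapcount, -1);
}

/* Userspace analogue of atomic_inc_and_test(): true if the new value is 0. */
static bool toy_inc_and_test(atomic_int *v)
{
	return atomic_fetch_add(v, 1) + 1 == 0;
}

/* Account one more mapping; returns true only for the very first one. */
static bool toy_add_rmap(struct toy_page *p)
{
	return toy_inc_and_test(&p->mapcount);
}

/* Number of mappings, in the same "+1" style as page_mapcount(). */
static int toy_page_mapcount(struct toy_page *p)
{
	return atomic_load(&p->mapcount) + 1;
}
```

With a freshly initialised `toy_page`, the first `toy_add_rmap()` call returns true and later calls return false, while `toy_page_mapcount()` reports the number of calls made so far.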
Re: kernel oops on mmotm-2015-10-15-15-20
On Thu, Oct 22, 2015 at 06:00:51PM +0900, Minchan Kim wrote:
> On Thu, Oct 22, 2015 at 10:21:36AM +0900, Minchan Kim wrote:
> > Hello Hugh,
> >
> > On Wed, Oct 21, 2015 at 05:59:59PM -0700, Hugh Dickins wrote:
> > > On Thu, 22 Oct 2015, Minchan Kim wrote:
> > > >
> > > > I added the code to check it and queued it again but I had another oops
> > > > in this time but symptom is related to anon_vma, too.
> > > > (kernel is based on recent mmotm + unconditional mkdirty for bug fix)
> > > > It seems page_get_anon_vma returns NULL since the page was not page_mapped
> > > > at that time but second check of page_mapped right before try_to_unmap seems
> > > > to be true.
> > > >
> > > > Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 across:4191228k FS
> > > > Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 across:4191228k FS
> > > > page:ea0001cfbfc0 count:3 mapcount:1 mapping:88007f1b5f51 index:0x60aff
> > > > flags: 0x40048019(locked|uptodate|dirty|swapcache|swapbacked)
> > > > page dumped because: VM_BUG_ON_PAGE(PageAnon(page) && !PageKsm(page) && !anon_vma)
> > >
> > > That's interesting, that's one I added in my page migration series.
> > > Let me think on it, but it could well relate to the one you got before.
> >
> > I will roll back to mm/madv_free-v4.3-rc5-mmotm-2015-10-15-15-20
> > instead of next-20151021 to remove noise from your migration cleanup
> > series and will test it again.
> > If it is fixed, I will test again with your migration patchset, then.
>
> I tested mmotm-2015-10-15-15-20 with test program I attach for a long time.
> Therefore, there is no patchset from Hugh's migration patch in there.
> And I added below debug code with request from Kirill to all test kernels.

It took too long (and a lot of printk()), but I think I've finally tracked
it down.

The patch below seems to fix the issue for me. It's not yet properly tested,
but it looks like it works.

The problem was my wrong assumption about how migration works: I thought that
the kernel would wait for migration to finish before tearing down the mapping.

But it turns out that's not true.

As a result, if zap_pte_range() races with split_huge_page(), we can end up
with a page which is not mapped anymore but has _count and _mapcount
elevated. The page is on the LRU too, so it's still reachable by vmscan and by
pfn scanners (Sasha showed a few similar traces from compaction too).
It's likely that page->mapping in this case would point to a freed anon_vma.

BOOM!

The patch modifies the freeze/unfreeze_page() code to match the normal
migration entries logic: on setup we remove the page from rmap and drop the
pin; on removal we get the pin back and put the page back on rmap. This way,
even if the migration entry is removed under us, we don't corrupt the page's
state.

Please test.

Not-Yet-Signed-off-by: Kirill A. Shutemov

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 5e0fe82a0fae..192b50c7526c 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2934,6 +2934,13 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 
 	smp_wmb(); /* make pte visible before pmd */
 	pmd_populate(mm, pmd, pgtable);
+
+	if (freeze) {
+		for (i = 0; i < HPAGE_PMD_NR; i++, haddr += PAGE_SIZE) {
+			page_remove_rmap(page + i, false);
+			put_page(page + i);
+		}
+	}
 }
 
 void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
@@ -3079,6 +3086,8 @@ static void freeze_page_vma(struct vm_area_struct *vma, struct page *page,
 		if (pte_soft_dirty(entry))
 			swp_pte = pte_swp_mksoft_dirty(swp_pte);
 		set_pte_at(vma->vm_mm, address, pte + i, swp_pte);
+		page_remove_rmap(page, false);
+		put_page(page);
 	}
 	pte_unmap_unlock(pte, ptl);
 }
@@ -3117,8 +3126,6 @@ static void unfreeze_page_vma(struct vm_area_struct *vma, struct page *page,
 		return;
 	pte = pte_offset_map_lock(vma->vm_mm, pmd, address, &ptl);
 	for (i = 0; i < HPAGE_PMD_NR; i++, address += PAGE_SIZE, page++) {
-		if (!page_mapped(page))
-			continue;
 		if (!is_swap_pte(pte[i]))
 			continue;
 
@@ -3128,6 +3135,9 @@ static void unfreeze_page_vma(struct vm_area_struct *vma, struct page *page,
 		if (migration_entry_to_page(swp_entry) != page)
 			continue;
 
+		get_page(page);
+		page_add_anon_rmap(page, vma, address, false);
+
 		entry = pte_mkold(mk_pte(page, vma->vm_page_prot));
 		entry = pte_mkdirty(entry);
 		if (is_write_migration_entry(swp_entry))
-- 
 Kirill A. Shutemov
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at
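The accounting rule described in this message (drop the rmap entry and the pin when the migration entry is set up, take both back only if the entry is still there on removal) can be illustrated with a small userspace model. This is only a sketch under simplifying assumptions: one page, one PTE slot, no locking, and plain integers instead of atomics. The `toy_page`, `toy_pte`, `toy_freeze()`, `toy_unfreeze()` and `toy_zap()` names are hypothetical and do not correspond to kernel functions.

```c
#include <assert.h>

/* Toy model only: these are not the kernel's structures or helpers. */
struct toy_page {
	int count;	/* references (pins) held on the page */
	int mapcount;	/* number of page-table mappings */
};

enum toy_pte { TOY_PTE_NONE, TOY_PTE_PRESENT, TOY_PTE_MIGRATION };

/* Setup: swap the mapping for a migration entry, drop rmap and pin. */
static void toy_freeze(struct toy_page *page, enum toy_pte *pte)
{
	if (*pte != TOY_PTE_PRESENT)
		return;
	*pte = TOY_PTE_MIGRATION;
	page->mapcount--;
	page->count--;
}

/* Removal: restore pin and rmap only if the migration entry survived. */
static void toy_unfreeze(struct toy_page *page, enum toy_pte *pte)
{
	if (*pte != TOY_PTE_MIGRATION)
		return;
	page->count++;
	page->mapcount++;
	*pte = TOY_PTE_PRESENT;
}

/*
 * Concurrent teardown: a present mapping gives up its rmap entry and pin,
 * while a migration entry holds neither, so it is simply cleared.
 */
static void toy_zap(struct toy_page *page, enum toy_pte *pte)
{
	if (*pte == TOY_PTE_PRESENT) {
		page->mapcount--;
		page->count--;
	}
	*pte = TOY_PTE_NONE;
}
```

In this model, running `toy_freeze()`, then `toy_zap()`, then `toy_unfreeze()` leaves the page with no mapping and no stale pin, which is the property the message is after. If the migration entry kept its pin and rmap entry instead, the zap step would leave both counters elevated on an unmapped page.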
Re: kernel oops on mmotm-2015-10-15-15-20
On Wed, 21 Oct 2015, Hugh Dickins wrote:
> On Wed, 21 Oct 2015, Hugh Dickins wrote:
> > On Thu, 22 Oct 2015, Minchan Kim wrote:
> > > Hello Hugh,
> > >
> > > On Wed, Oct 21, 2015 at 05:59:59PM -0700, Hugh Dickins wrote:
> > > > On Thu, 22 Oct 2015, Minchan Kim wrote:
> > > > >
> > > > > I added the code to check it and queued it again but I had another oops
> > > > > in this time but symptom is related to anon_vma, too.
> > > > > (kernel is based on recent mmotm + unconditional mkdirty for bug fix)
> > > > > It seems page_get_anon_vma returns NULL since the page was not page_mapped
> > > > > at that time but second check of page_mapped right before try_to_unmap seems
> > > > > to be true.
> > > > >
> > > > > Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 across:4191228k FS
> > > > > Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 across:4191228k FS
> > > > > page:ea0001cfbfc0 count:3 mapcount:1 mapping:88007f1b5f51 index:0x60aff
> > > > > flags: 0x40048019(locked|uptodate|dirty|swapcache|swapbacked)
> > > > > page dumped because: VM_BUG_ON_PAGE(PageAnon(page) && !PageKsm(page) && !anon_vma)
> > > >
> > > > That's interesting, that's one I added in my page migration series.
> > > > Let me think on it, but it could well relate to the one you got before.
>
> I think I have introduced a bug there; or rather, made more evident
> a pre-existing bug. But I'm not sure yet: the stacktrace was from
> compaction (called by khugepaged, but that may not be relevant at all),
> and thinking through the races with isolate_migratepages_block() is
> never easy.
>
> What's certain is that I was not giving any thought to
> isolate_migratepages_block() when I added that VM_BUG_ON_PAGE():
> I was thinking about "stable" anonymous pages, and how they get
> faulted back in from swapcache while holding page lock.
>
> It looks to me now as if a page might not yet be PageAnon when it's
> first tested in __unmap_and_move(), when going to page_get_anon_vma();
> but is page_mapped() and PageAnon() by time of calling try_to_unmap(),
> where I inserted the VM_BUG_ON_PAGE().
>
> If so, the code would always have been wrong (trying to unmap the
> anonymous page, and later remap its replacement, without a hold on
> the anon_vma needed to guide both lookups); but I'll have made it
> more glaringly wrong with the VM_BUG_ON_PAGE() - let me pretend
> that's a good step forward :)
>
> There's a reference count check in isolated_migratepages_block()
> before this, which would make it unlikely, but I doubt rules it out.
>
> However... you did hit an anon_vma reference counting problem before
> my migration changes went in, and Kirill had a vague suspicion that
> he might be screwing up anon_vma refcounting in split_huge_page():
> if he confirms that, I'd say it's more likely to be the cause of
> your crash on this occasion.
>
> Not hard to fix mine (though we'll probably have to lose the
> VM_BUG_ON_PAGE on the way, so the real fix will be hidden by that
> trivial fix), I just want to give the races more thought.

And after giving it more thought, I realize that I was wrong yesterday,
and the new VM_BUG_ON_PAGE() should be good as is: my guess is that it
is simply alerting you to the same anon_vma reference counting issue
as you had already hit without that patch.

What I was forgetting yesterday, is that isolate_migratepages_block()
can only take the page for migration when it's PageLRU(): and
do_anonymous_page() only adds a page to the LRU after it has been
marked as mapped and PageAnon.

So the window that worried me yesterday, that __unmap_and_move() might
see !PageAnon, then reach try_to_unmap() with it page_mapped and PageAnon:
that window does not exist, with or without my changes.

Hugh

>
> However it turns out, I think you have a very useful test there.
>
> (And I've observed no PageDirty problems with your recent patchsets,
> though I don't use MADV_FREE at all myself.)
>
> Hugh
Re: kernel oops on mmotm-2015-10-15-15-20
On Thu, Oct 22, 2015 at 10:21:36AM +0900, Minchan Kim wrote:
> Hello Hugh,
>
> On Wed, Oct 21, 2015 at 05:59:59PM -0700, Hugh Dickins wrote:
> > On Thu, 22 Oct 2015, Minchan Kim wrote:
> > >
> > > I added the code to check it and queued it again but I had another oops
> > > in this time but symptom is related to anon_vma, too.
> > > (kernel is based on recent mmotm + unconditional mkdirty for bug fix)
> > > It seems page_get_anon_vma returns NULL since the page was not page_mapped
> > > at that time but second check of page_mapped right before try_to_unmap seems
> > > to be true.
> > >
> > > Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 across:4191228k FS
> > > Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 across:4191228k FS
> > > page:ea0001cfbfc0 count:3 mapcount:1 mapping:88007f1b5f51 index:0x60aff
> > > flags: 0x40048019(locked|uptodate|dirty|swapcache|swapbacked)
> > > page dumped because: VM_BUG_ON_PAGE(PageAnon(page) && !PageKsm(page) && !anon_vma)
> >
> > That's interesting, that's one I added in my page migration series.
> > Let me think on it, but it could well relate to the one you got before.
>
> I will roll back to mm/madv_free-v4.3-rc5-mmotm-2015-10-15-15-20
> instead of next-20151021 to remove noise from your migration cleanup
> series and will test it again.
> If it is fixed, I will test again with your migration patchset, then.

I tested mmotm-2015-10-15-15-20 for a long time with the test program I attach.
Therefore, there is no patchset from Hugh's migration patch in there.
And, at Kirill's request, I added the debug code below to all test kernels.

diff --git a/mm/rmap.c b/mm/rmap.c
index ddfb9be72366..1c23b70b1f57 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -513,6 +513,13 @@ struct anon_vma *page_lock_anon_vma_read(struct page *page)
 
 	anon_vma = (struct anon_vma *) (anon_mapping - PAGE_MAPPING_ANON);
 	root_anon_vma = READ_ONCE(anon_vma->root);
+
+	if (root_anon_vma == NULL) {
+		printk("anon_vma %p refcount %d\n", anon_vma,
+			atomic_read(&anon_vma->refcount));
+		VM_BUG_ON_PAGE(1, page);
+	}
+
 	if (down_read_trylock(&root_anon_vma->rwsem)) {
 		/*
 		 * If the page is still mapped, then this anon_vma is still

1. mmotm-2015-10-15-15-20 + kirill's pte_mkdirty

1st trial:
Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 across:4191228k FS
Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 across:4191228k FS
BUG: Bad rss-counter state mm:88007f1ed780 idx:1 val:488
BUG: Bad rss-counter state mm:88007f1ed780 idx:2 val:24

2nd trial:
Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 across:4191228k FS
BUG: Bad rss-counter state mm:8800a5cca680 idx:1 val:512
Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 across:4191228k FS

2. mmotm-2015-10-15-15-20-no-madvise_free, IOW it means git head for
54bad5da4834 arm64: add pmd_[dirty|mkclean] for THP.

1st trial:
Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 across:4191228k FS
BUG: Bad rss-counter state mm:88007f4c2d80 idx:1 val:511
BUG: Bad rss-counter state mm:88007f4c2d80 idx:2 val:1

2nd trial:
Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 across:4191228k FS
Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 across:4191228k FS
anon_vma 88089aa0 refcount 0
page:ea0001a2ea40 count:3 mapcount:1 mapping:88089aa1 index:0x647a9

I tested it with KVM; the guest system has 12 cores and 3G of memory.
In mmotm-2015-10-15-15-20-no-madvise_free, I tweaked the test program to do
madvise_dontneed instead of madvise_free via the patch below.

For the testing:

	gcc -o oops oops.c
	./memcg_test.sh

I will be off from now on, so please understand a late response, but I hope
my test program will reproduce it on your machine.

diff --git a/oops.c b/oops.c
index e50330a..c8298f8 100644
--- a/oops.c
+++ b/oops.c
@@ -8,7 +8,7 @@
 #include
 #include
 
-#define MADV_FREE 5
+#define MADV_FREE 4
 
 int pid;

memcg_move_task.sh
Description: Bourne shell script

memcg_test.sh
Description: Bourne shell script

#include
#include
#include
#include
#include
#include
#include
#include
#include

#define MADV_FREE 4

int pid;

void sig_handler(int signo)
{
	printf("pid %d sig received %d\n", pid, signo);
	exit(1);
}

void free_bufs(void **bufs, unsigned long buf_count, unsigned long buf_size)
{
	int i;

	for (i = 0; i < buf_count; i++) {
		if (bufs[i] != NULL) {
			munmap(bufs[i], buf_size);
			bufs[i] = NULL;
		}
	}
}

void alloc_bufs(void **bufs, unsigned long buf_count, unsigned long buf_size)
{
	int i;
	time_t rawtime;
	struct tm * timeinfo;
	void *addr = (void*)0x6000;

	for (i = 0; i < buf_count; i++) {
		void *ptr = NULL;

		ptr = mmap(addr, buf_size, PROT_READ|PROT_WRITE,
			   MAP_ANON|MAP_PRIVATE|MAP_FIXED, 0, 0);
		if (ptr == MAP_FAILED) {
			char bufs[64];
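For context on the `#define MADV_FREE 4` tweak: on most Linux architectures, including x86, 4 is the value of MADV_DONTNEED, so the modified test ends up calling madvise() with MADV_DONTNEED. The sketch below is not part of the attached test program; it is a minimal, Linux-only illustration of what MADV_DONTNEED does to a private anonymous mapping. The helper name `dontneed_zero_fills()` is made up for this note.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Map `len` bytes of private anonymous memory, dirty them, discard them
 * with MADV_DONTNEED, and report whether the range reads back as zero.
 * Returns 1 if zero-filled, 0 if old data survived, -1 on error.
 */
static int dontneed_zero_fills(size_t len)
{
	unsigned char *buf;
	int ret;

	buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;

	memset(buf, 0xaa, len);		/* fault the pages in and dirty them */

	if (madvise(buf, len, MADV_DONTNEED) != 0) {
		munmap(buf, len);
		return -1;
	}

	/* On Linux the next access gets fresh zero-filled pages. */
	ret = (buf[0] == 0 && buf[len - 1] == 0);

	munmap(buf, len);
	return ret;
}
```

Calling `dontneed_zero_fills(1 << 20)` on a Linux system is expected to return 1, because the discarded pages are replaced by zero-filled ones on the next access.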
Re: kernel oops on mmotm-2015-10-15-15-20
On Wed, 21 Oct 2015, Hugh Dickins wrote: > On Thu, 22 Oct 2015, Minchan Kim wrote: > > Hello Hugh, > > > > On Wed, Oct 21, 2015 at 05:59:59PM -0700, Hugh Dickins wrote: > > > On Thu, 22 Oct 2015, Minchan Kim wrote: > > > > > > > > I added the code to check it and queued it again but I had another oops > > > > in this time but symptom is related to anon_vma, too. > > > > (kernel is based on recent mmotm + unconditional mkdirty for bug fix) > > > > It seems page_get_anon_vma returns NULL since the page was not > > > > page_mapped > > > > at that time but second check of page_mapped right before try_to_unmap > > > > seems > > > > to be true. > > > > > > > > Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 > > > > across:4191228k FS > > > > Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 > > > > across:4191228k FS > > > > page:ea0001cfbfc0 count:3 mapcount:1 mapping:88007f1b5f51 > > > > index:0x60aff > > > > flags: 0x40048019(locked|uptodate|dirty|swapcache|swapbacked) > > > > page dumped because: VM_BUG_ON_PAGE(PageAnon(page) && !PageKsm(page) && > > > > !anon_vma) > > > > > > That's interesting, that's one I added in my page migration series. > > > Let me think on it, but it could well relate to the one you got before. I think I have introduced a bug there; or rather, made more evident a pre-existing bug. But I'm not sure yet: the stacktrace was from compaction (called by khugepaged, but that may not be relevant at all), and thinking through the races with isolate_migratepages_block() is never easy. What's certain is that I was not giving any thought to isolate_migratepages_block() when I added that VM_BUG_ON_PAGE(): I was thinking about "stable" anonymous pages, and how they get faulted back in from swapcache while holding page lock. 
It looks to me now as if a page might not yet be PageAnon when it's first tested in __unmap_and_move(), when going to page_get_anon_vma(); but is page_mapped() and PageAnon() by the time of calling try_to_unmap(), where I inserted the VM_BUG_ON_PAGE(). If so, the code would always have been wrong (trying to unmap the anonymous page, and later remap its replacement, without a hold on the anon_vma needed to guide both lookups); but I'll have made it more glaringly wrong with the VM_BUG_ON_PAGE() - let me pretend that's a good step forward :) There's a reference count check in isolate_migratepages_block() before this, which would make it unlikely, but I doubt it rules it out. However... you did hit an anon_vma reference counting problem before my migration changes went in, and Kirill had a vague suspicion that he might be screwing up anon_vma refcounting in split_huge_page(): if he confirms that, I'd say it's more likely to be the cause of your crash on this occasion. Not hard to fix mine (though we'll probably have to lose the VM_BUG_ON_PAGE on the way, so the real fix will be hidden by that trivial fix), I just want to give the races more thought. However it turns out, I think you have a very useful test there. (And I've observed no PageDirty problems with your recent patchsets, though I don't use MADV_FREE at all myself.) Hugh
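[Editor's illustration, not part of the original message.] The ordering Hugh describes can be sketched as a small user-space model. Everything here is hypothetical: `struct toy_page`, `toy_get_anon_vma()`, `toy_concurrent_fault()` and `toy_unmap_and_move()` are invented stand-ins, not kernel code, and the real __unmap_and_move() does a good deal more. The only point is the sequence: the anon_vma is looked up while the page is not yet anonymous and mapped, the page then becomes both, and the later assertion condition evaluates true.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real layout. */
struct anon_vma {
    int refcount;
};

struct toy_page {
    bool anon;            /* models PageAnon(page) */
    bool ksm;             /* models PageKsm(page) */
    int mapcount;         /* models page_mapped(page) when > 0 */
    struct anon_vma *av;  /* anon_vma the page would resolve to */
};

/* Models page_get_anon_vma(): only an anon, mapped page yields a reference. */
static struct anon_vma *toy_get_anon_vma(struct toy_page *p)
{
    assert(p != NULL);
    if (!p->anon || p->mapcount == 0 || p->av == NULL)
        return NULL;
    p->av->refcount++;
    return p->av;
}

/* Models another CPU faulting the page in as a mapped anonymous page. */
static void toy_concurrent_fault(struct toy_page *p, struct anon_vma *av)
{
    p->anon = true;
    p->av = av;
    p->mapcount++;
}

/*
 * Models the two checks in the migration path. Returns true when the
 * condition PageAnon && !PageKsm && !anon_vma would have tripped the
 * assertion placed just before the unmap step.
 */
static bool toy_unmap_and_move(struct toy_page *p, struct anon_vma *av,
                               bool race)
{
    struct anon_vma *held = NULL;
    bool fired = false;

    /* First test: take the anon_vma only if the page looks anon now. */
    if (p->anon && !p->ksm)
        held = toy_get_anon_vma(p);

    /* Window in which the page state may change underneath us. */
    if (race)
        toy_concurrent_fault(p, av);

    /* Second test: the page is mapped, so we are about to unmap it. */
    if (p->mapcount > 0)
        fired = p->anon && !p->ksm && held == NULL;

    /* Drop the reference taken above, if any. */
    if (held != NULL)
        held->refcount--;

    return fired;
}
```

In this toy model, a page that starts out unmapped and non-anonymous makes `toy_unmap_and_move()` return true only when `race` is set; a page that is already anonymous and mapped at the first test holds its anon_vma throughout and never trips the condition.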
Re: kernel oops on mmotm-2015-10-15-15-20
On Thu, 22 Oct 2015, Minchan Kim wrote: > Hello Hugh, > > On Wed, Oct 21, 2015 at 05:59:59PM -0700, Hugh Dickins wrote: > > On Thu, 22 Oct 2015, Minchan Kim wrote: > > > > > > I added the code to check it and queued it again but I had another oops > > > in this time but symptom is related to anon_vma, too. > > > (kernel is based on recent mmotm + unconditional mkdirty for bug fix) > > > It seems page_get_anon_vma returns NULL since the page was not page_mapped > > > at that time but second check of page_mapped right before try_to_unmap > > > seems > > > to be true. > > > > > > Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 across:4191228k > > > FS > > > Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 across:4191228k > > > FS > > > page:ea0001cfbfc0 count:3 mapcount:1 mapping:88007f1b5f51 > > > index:0x60aff > > > flags: 0x40048019(locked|uptodate|dirty|swapcache|swapbacked) > > > page dumped because: VM_BUG_ON_PAGE(PageAnon(page) && !PageKsm(page) && > > > !anon_vma) > > > > That's interesting, that's one I added in my page migration series. > > Let me think on it, but it could well relate to the one you got before. > > I will roll back to mm/madv_free-v4.3-rc5-mmotm-2015-10-15-15-20 > instead of next-20151021 to remove noise from your migration cleanup > series and will test it again. > If it is fixed, I will test again with your migration patchset, then. Not a good use of your time, I think. It's sure to be fixed in the rc5-mmotm because that VM_BUG_ON_PAGE(blah) just does not exist in that tree: I added it to verify my reasoning in changing the comments about page_get_anon_vma() and PageSwapCache in mm/migrate.c. > > > > > > page->mem_cgroup:88007f3dcc00 > > > [ cut here ] > > > kernel BUG at mm/migrate.c:889! 
> > > invalid opcode: [#1] SMP > > > Dumping ftrace buffer: > > >(ftrace buffer empty) > > > Modules linked in: > > > CPU: 11 PID: 59 Comm: khugepaged Not tainted > > > 4.3.0-rc6-next-20151021-THP-ref-madv_free+ #1557 > > > > Hmm, it might be me to blame, or it might be Kirill, don't know yet. > > It might be me, either. > > > > > Oh, hold on, I think Andrew has just posted a new mmotm, and it includes > > an update to Kirill's migrate_pages-try-to-split-pages-on-queueing.patch: > > I haven't digested yet, but it might turn out to be relevant. Sorry, I think that was an irrelevant suggestion: today's new rc6-mmotm is identical to yesterday's there, and the patch that was removed appears to be identical to the one added. Hugh
Re: kernel oops on mmotm-2015-10-15-15-20
Hello Hugh, On Wed, Oct 21, 2015 at 05:59:59PM -0700, Hugh Dickins wrote: > On Thu, 22 Oct 2015, Minchan Kim wrote: > > > > I added the code to check it and queued it again but I had another oops > > in this time but symptom is related to anon_vma, too. > > (kernel is based on recent mmotm + unconditional mkdirty for bug fix) > > It seems page_get_anon_vma returns NULL since the page was not page_mapped > > at that time but second check of page_mapped right before try_to_unmap seems > > to be true. > > > > Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 across:4191228k FS > > Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 across:4191228k FS > > page:ea0001cfbfc0 count:3 mapcount:1 mapping:88007f1b5f51 > > index:0x60aff > > flags: 0x40048019(locked|uptodate|dirty|swapcache|swapbacked) > > page dumped because: VM_BUG_ON_PAGE(PageAnon(page) && !PageKsm(page) && > > !anon_vma) > > That's interesting, that's one I added in my page migration series. > Let me think on it, but it could well relate to the one you got before. I will roll back to mm/madv_free-v4.3-rc5-mmotm-2015-10-15-15-20 instead of next-20151021 to remove noise from your migration cleanup series and will test it again. If it is fixed, I will test again with your migration patchset, then. > > > page->mem_cgroup:88007f3dcc00 > > [ cut here ] > > kernel BUG at mm/migrate.c:889! > > invalid opcode: [#1] SMP > > Dumping ftrace buffer: > >(ftrace buffer empty) > > Modules linked in: > > CPU: 11 PID: 59 Comm: khugepaged Not tainted > > 4.3.0-rc6-next-20151021-THP-ref-madv_free+ #1557 > > Hmm, it might be me to blame, or it might be Kirill, don't know yet. It might be me, either. > > Oh, hold on, I think Andrew has just posted a new mmotm, and it includes > an update to Kirill's migrate_pages-try-to-split-pages-on-queueing.patch: > I haven't digested yet, but it might turn out to be relevant. 
> > Hugh
Re: kernel oops on mmotm-2015-10-15-15-20
On Thu, 22 Oct 2015, Minchan Kim wrote: > > I added the code to check it and queued it again but I had another oops > in this time but symptom is related to anon_vma, too. > (kernel is based on recent mmotm + unconditional mkdirty for bug fix) > It seems page_get_anon_vma returns NULL since the page was not page_mapped > at that time but second check of page_mapped right before try_to_unmap seems > to be true. > > Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 across:4191228k FS > Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 across:4191228k FS > page:ea0001cfbfc0 count:3 mapcount:1 mapping:88007f1b5f51 > index:0x60aff > flags: 0x40048019(locked|uptodate|dirty|swapcache|swapbacked) > page dumped because: VM_BUG_ON_PAGE(PageAnon(page) && !PageKsm(page) && > !anon_vma) That's interesting, that's one I added in my page migration series. Let me think on it, but it could well relate to the one you got before. > page->mem_cgroup:88007f3dcc00 > [ cut here ] > kernel BUG at mm/migrate.c:889! > invalid opcode: [#1] SMP > Dumping ftrace buffer: >(ftrace buffer empty) > Modules linked in: > CPU: 11 PID: 59 Comm: khugepaged Not tainted > 4.3.0-rc6-next-20151021-THP-ref-madv_free+ #1557 Hmm, it might be me to blame, or it might be Kirill, don't know yet. Oh, hold on, I think Andrew has just posted a new mmotm, and it includes an update to Kirill's migrate_pages-try-to-split-pages-on-queueing.patch: I haven't digested yet, but it might turn out to be relevant. Hugh
Re: kernel oops on mmotm-2015-10-15-15-20
On Wed, Oct 21, 2015 at 02:07:23PM +0300, Kirill A. Shutemov wrote: > On Wed, Oct 21, 2015 at 02:28:36PM +0900, Minchan Kim wrote: > > I detach this report from my patchset thread because I see below > > problem with removing MADV_FREE related code and I can reproduce > > same oops with MADV_FREE + recent patches(both my SetPageDirty > > and Kirill's pte_mkdirty) within 7 hours. > > Could you share code for your workload? It's part of test suite so I need time to factor it out. I will do/test and send it. > > > I can not be sure it's THP refcount redesign's problem but it was > > one of big change in MM between mmotm-2015-10-15-15-20 and > > mmotm-2015-10-06-16-30 so it could be a culprit. > > > > In page_lock_anon_vma_read, anon_vma_root was NULL. > > I added VM_BUG_ON_PAGE(!root_anon_vma, page) in there and got the result. > > Hm. That's tricky.. :-/ > > Could you please dump anon_vma->refcount too? I added the code to check it and queued it again but I had another oops in this time but symptom is related to anon_vma, too. (kernel is based on recent mmotm + unconditional mkdirty for bug fix) It seems page_get_anon_vma returns NULL since the page was not page_mapped at that time but second check of page_mapped right before try_to_unmap seems to be true. Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 across:4191228k FS Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 across:4191228k FS page:ea0001cfbfc0 count:3 mapcount:1 mapping:88007f1b5f51 index:0x60aff flags: 0x40048019(locked|uptodate|dirty|swapcache|swapbacked) page dumped because: VM_BUG_ON_PAGE(PageAnon(page) && !PageKsm(page) && !anon_vma) page->mem_cgroup:88007f3dcc00 [ cut here ] kernel BUG at mm/migrate.c:889! 
invalid opcode: [#1] SMP Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 11 PID: 59 Comm: khugepaged Not tainted 4.3.0-rc6-next-20151021-THP-ref-madv_free+ #1557 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 task: 8800b9851a40 ti: 8800b985c000 task.ti: 8800b985c000 RIP: 0010:[] [] migrate_pages+0x8e6/0x950 RSP: 0018:8800b985fa00 EFLAGS: 00010286 RAX: 0021 RBX: ea0002dd7fc0 RCX: 81830db8 RDX: 0001 RSI: 0246 RDI: 821df4d8 RBP: 8800b985fa80 R08: R09: 880bb160 R10: 8163e000 R11: 01e0 R12: R13: ea0001cfbf80 R14: ea0001cfbfc0 R15: 8189de80 FS: () GS:8800bfb6() knlGS: CS: 0010 DS: ES: CR0: 8005003b CR2: 5594f9d7e578 CR3: 01808000 CR4: 06a0 Stack: 8800b9851a40 811144b0 81115fb0 ea0001cfbfe0 8800b985fb30 8800b985fb20 8800b985fb20 Call Trace: [] ? trace_raw_output_mm_compaction_defer_template+0xc0/0xc0 [] ? isolate_freepages_block+0x3d0/0x3d0 [] compact_zone+0x2bb/0x720 [] ? retint_kernel+0x10/0x10 [] ? list_del+0xd/0x30 [] compact_zone_order+0x6d/0xa0 [] try_to_compact_pages+0xed/0x200 [] __alloc_pages_direct_compact+0x3b/0xd4 [] __alloc_pages_nodemask+0x3fb/0x920 [] khugepaged+0x158/0x1b90 [] ? hrtick_update+0x51/0x70 [] ? prepare_to_wait_event+0xf0/0xf0 [] ? unfreeze_page+0x320/0x320 [] kthread+0xc9/0xe0 [] ? kthread_park+0x60/0x60 [] ret_from_fork+0x3f/0x70 [] ? kthread_park+0x60/0x60 Code: 44 c6 48 8b 40 08 83 e0 03 48 83 f8 03 0f 84 fd fa ff ff 4d 85 e4 0f 85 f4 fa ff ff 48 c7 c6 58 e9 77 81 4c 89 f7 e8 fa 2a fd ff <0f> 0b 48 83 e8 01 e9 d0 fa ff ff f6 40 07 01 0f 84 5b fd ff ff RIP [] migrate_pages+0x8e6/0x950 RSP ---[ end trace 59eb35cc15af8a53 ]--- Kernel panic - not syncing: Fatal exception Dumping ftrace buffer: (ftrace buffer empty) Kernel Offset: disabled > > I have vage suspicion that I'm screwing up anon_vma refcounting during > split_huge_page. > > It would be great to see if the page was part of THP before. > > > > > .. > > .. > > Adding 4191228k swap on /dev/vda5. 
Priority:-1 extents:1 across:4191228k FS > > page:ea0001b81140 count:3 mapcount:1 mapping:88007e806461 > > index:0x61445 > > page:ea0001b87bc0 count:3 mapcount:1 mapping:88007e806461 > > index:0x615ef > > flags: 0x40048019(locked|uptodate|dirty|swapcache|swapbacked) > > page dumped because: VM_BUG_ON_PAGE(1) > > page->mem_cgroup:88007f2de000 > > [ cut here ] > > kernel BUG at mm/rmap.c:517! > > invalid opcode: [#1] SMP > > Dumping ftrace buffer: > >(ftrace buffer empty) > > Modules linked in: > > CPU: 0 PID: 24935 Comm: madvise_test Not tainted > > 4.3.0-rc5-mm1-THP-ref-madv_free+ #1555 > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 > > task: 88ce8000 ti: 8800ada28000 task.ti: 8800ada28000 > > RIP: 0010:[] [] > >
Re: kernel oops on mmotm-2015-10-15-15-20
On Wed, Oct 21, 2015 at 02:28:36PM +0900, Minchan Kim wrote: > I detach this report from my patchset thread because I see below > problem with removing MADV_FREE related code and I can reproduce > same oops with MADV_FREE + recent patches(both my SetPageDirty > and Kirill's pte_mkdirty) within 7 hours. Could you share code for your workload? > I can not be sure it's THP refcount redesign's problem but it was > one of big change in MM between mmotm-2015-10-15-15-20 and > mmotm-2015-10-06-16-30 so it could be a culprit. > > In page_lock_anon_vma_read, anon_vma_root was NULL. > I added VM_BUG_ON_PAGE(!root_anon_vma, page) in there and got the result. Hm. That's tricky.. :-/ Could you please dump anon_vma->refcount too? I have a vague suspicion that I'm screwing up anon_vma refcounting during split_huge_page. It would be great to see if the page was part of THP before. > > .. > .. > Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 across:4191228k FS > page:ea0001b81140 count:3 mapcount:1 mapping:88007e806461 > index:0x61445 > page:ea0001b87bc0 count:3 mapcount:1 mapping:88007e806461 > index:0x615ef > flags: 0x40048019(locked|uptodate|dirty|swapcache|swapbacked) > page dumped because: VM_BUG_ON_PAGE(1) > page->mem_cgroup:88007f2de000 > [ cut here ] > kernel BUG at mm/rmap.c:517! 
> invalid opcode: [#1] SMP > Dumping ftrace buffer: >(ftrace buffer empty) > Modules linked in: > CPU: 0 PID: 24935 Comm: madvise_test Not tainted > 4.3.0-rc5-mm1-THP-ref-madv_free+ #1555 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 > task: 88ce8000 ti: 8800ada28000 task.ti: 8800ada28000 > RIP: 0010:[] [] > page_lock_anon_vma_read+0x18e/0x190 > RSP: :8800ada2b868 EFLAGS: 00010296 > RAX: 0021 RBX: ea0001b87bc0 RCX: > RDX: 0001 RSI: 0282 RDI: 81830db0 > RBP: 8800ada2b888 R08: 0021 R09: 8800ba40eb75 > R10: 01ff14bc R11: R12: 88007e806461 > R13: 88007e806460 R14: R15: 818464c0 > FS: 7f6d93212740() GS:8800bfa0() knlGS: > CS: 0010 DS: ES: CR0: 8005003b > CR2: 63c14000 CR3: a674b000 CR4: 06b0 > Stack: > ea0001b87bc0 8800ada2b8f8 88007f2de000 > 8800ada2b8d0 81129593 8800 8105f8c0 > ea0001b87bc0 8800ada2b9f8 88007f2de000 > Call Trace: > [] rmap_walk+0x1b3/0x3f0 > [] ? finish_task_switch+0x70/0x260 > [] page_referenced+0x1a3/0x220 > [] ? __page_check_address+0x1d0/0x1d0 > [] ? page_get_anon_vma+0xd0/0xd0 > [] ? anon_vma_ctor+0x40/0x40 > [] shrink_page_list+0x5ce/0xdc0 > [] shrink_inactive_list+0x18c/0x4b0 > [] shrink_lruvec+0x58f/0x730 > [] shrink_zone+0xd4/0x280 > [] do_try_to_free_pages+0x12d/0x3b0 > [] try_to_free_mem_cgroup_pages+0x9d/0x120 > [] try_charge+0x175/0x720 > [] ? __activate_page+0x230/0x230 > [] mem_cgroup_try_charge+0x85/0x1d0 > [] handle_mm_fault+0xc9a/0x1000 > [] ? 
__set_cpus_allowed_ptr+0x9b/0x1a0 > [] __do_page_fault+0x189/0x400 > [] do_page_fault+0xc/0x10 > [] page_fault+0x22/0x30 > Code: c9 0f 84 b9 fe ff ff 8d 51 01 89 c8 f0 0f b1 16 39 c1 0f 84 11 ff ff ff > 89 c1 eb e3 48 c7 c6 88 02 78 81 48 89 df e8 02 f3 fe ff <0f> 0b 0f 1f 44 00 > 00 55 48 89 e5 41 57 41 56 45 31 f6 > 41 55 4c > RIP [] page_lock_anon_vma_read+0x18e/0x190 > RSP > ---[ end trace cfbb87f54f12290e ]--- > Kernel panic - not syncing: Fatal exception > Dumping ftrace buffer: >(ftrace buffer empty) > Kernel Offset: disabled > > On Tue, Oct 20, 2015 at 10:38:54AM +0900, Minchan Kim wrote: > > On Mon, Oct 19, 2015 at 07:01:50PM +0900, Minchan Kim wrote: > > > On Mon, Oct 19, 2015 at 03:31:42PM +0900, Minchan Kim wrote: > > > > Hello, it's too late since I sent previous patch. > > > > https://lkml.org/lkml/2015/6/3/37 > > > > > > > > This patch is almost new compared to previous approach. > > > > I think this is more simple, clear and easy to review. > > > > > > > > One thing I should notice is that I have tested this patch > > > > and couldn't find any critical problem so I rebased patchset > > > > onto recent mmotm(ie, mmotm-2015-10-15-15-20) to send formal > > > > patchset. Unfortunately, I start to see sudden discarding of > > > > the page we shouldn't do. IOW, application's valid anonymous page > > > > was disappeared suddenly. > > > > > > > > When I look through THP changes, I think we could lose > > > > dirty bit of pte between freeze_page and unfreeze_page > > > > when we mark it as migration entry and restore it. > > > > So, I added below simple code without enough considering > > > > and cannot see the problem any more. > > > > I hope it's good hint to find right fix this problem. > > > > > > > > diff --git a/mm/huge_memory.c b/mm/huge_memory.c > > > >
Re: kernel oops on mmotm-2015-10-15-15-20
On Wed, Oct 21, 2015 at 02:28:36PM +0900, Minchan Kim wrote:
> I detach this report from my patchset thread because I see below
> problem with removing MADV_FREE related code and I can reproduce
> same oops with MADV_FREE + recent patches(both my SetPageDirty
> and Kirill's pte_mkdirty) within 7 hours.

Could you share code for your workload?

> I can not be sure it's THP refcount redesign's problem but it was
> one of big change in MM between mmotm-2015-10-15-15-20 and
> mmotm-2015-10-06-16-30 so it could be a culprit.
>
> In page_lock_anon_vma_read, anon_vma_root was NULL.
> I added VM_BUG_ON_PAGE(!root_anon_vma, page) in there and got the result.

Hm. That's tricky.. :-/

Could you please dump anon_vma->refcount too? I have a vague suspicion that
I'm screwing up anon_vma refcounting during split_huge_page.

It would be great to see if the page was part of THP before.

> ..
> ..
> Adding 4191228k swap on /dev/vda5. Priority:-1 extents:1 across:4191228k FS
> page:ea0001b81140 count:3 mapcount:1 mapping:88007e806461 index:0x61445
> page:ea0001b87bc0 count:3 mapcount:1 mapping:88007e806461 index:0x615ef
> flags: 0x40048019(locked|uptodate|dirty|swapcache|swapbacked)
> page dumped because: VM_BUG_ON_PAGE(1)
> page->mem_cgroup:88007f2de000
> [ cut here ]
> kernel BUG at mm/rmap.c:517!
> invalid opcode: [#1] SMP > Dumping ftrace buffer: >(ftrace buffer empty) > Modules linked in: > CPU: 0 PID: 24935 Comm: madvise_test Not tainted > 4.3.0-rc5-mm1-THP-ref-madv_free+ #1555 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 > task: 88ce8000 ti: 8800ada28000 task.ti: 8800ada28000 > RIP: 0010:[] [] > page_lock_anon_vma_read+0x18e/0x190 > RSP: :8800ada2b868 EFLAGS: 00010296 > RAX: 0021 RBX: ea0001b87bc0 RCX: > RDX: 0001 RSI: 0282 RDI: 81830db0 > RBP: 8800ada2b888 R08: 0021 R09: 8800ba40eb75 > R10: 01ff14bc R11: R12: 88007e806461 > R13: 88007e806460 R14: R15: 818464c0 > FS: 7f6d93212740() GS:8800bfa0() knlGS: > CS: 0010 DS: ES: CR0: 8005003b > CR2: 63c14000 CR3: a674b000 CR4: 06b0 > Stack: > ea0001b87bc0 8800ada2b8f8 88007f2de000 > 8800ada2b8d0 81129593 8800 8105f8c0 > ea0001b87bc0 8800ada2b9f8 88007f2de000 > Call Trace: > [] rmap_walk+0x1b3/0x3f0 > [] ? finish_task_switch+0x70/0x260 > [] page_referenced+0x1a3/0x220 > [] ? __page_check_address+0x1d0/0x1d0 > [] ? page_get_anon_vma+0xd0/0xd0 > [] ? anon_vma_ctor+0x40/0x40 > [] shrink_page_list+0x5ce/0xdc0 > [] shrink_inactive_list+0x18c/0x4b0 > [] shrink_lruvec+0x58f/0x730 > [] shrink_zone+0xd4/0x280 > [] do_try_to_free_pages+0x12d/0x3b0 > [] try_to_free_mem_cgroup_pages+0x9d/0x120 > [] try_charge+0x175/0x720 > [] ? __activate_page+0x230/0x230 > [] mem_cgroup_try_charge+0x85/0x1d0 > [] handle_mm_fault+0xc9a/0x1000 > [] ? 
__set_cpus_allowed_ptr+0x9b/0x1a0 > [] __do_page_fault+0x189/0x400 > [] do_page_fault+0xc/0x10 > [] page_fault+0x22/0x30 > Code: c9 0f 84 b9 fe ff ff 8d 51 01 89 c8 f0 0f b1 16 39 c1 0f 84 11 ff ff ff > 89 c1 eb e3 48 c7 c6 88 02 78 81 48 89 df e8 02 f3 fe ff <0f> 0b 0f 1f 44 00 > 00 55 48 89 e5 41 57 41 56 45 31 f6 > 41 55 4c > RIP [] page_lock_anon_vma_read+0x18e/0x190 > RSP > ---[ end trace cfbb87f54f12290e ]--- > Kernel panic - not syncing: Fatal exception > Dumping ftrace buffer: >(ftrace buffer empty) > Kernel Offset: disabled > > On Tue, Oct 20, 2015 at 10:38:54AM +0900, Minchan Kim wrote: > > On Mon, Oct 19, 2015 at 07:01:50PM +0900, Minchan Kim wrote: > > > On Mon, Oct 19, 2015 at 03:31:42PM +0900, Minchan Kim wrote: > > > > Hello, it's too late since I sent previos patch. > > > > https://lkml.org/lkml/2015/6/3/37 > > > > > > > > This patch is alomost new compared to previos approach. > > > > I think this is more simple, clear and easy to review. > > > > > > > > One thing I should notice is that I have tested this patch > > > > and couldn't find any critical problem so I rebased patchset > > > > onto recent mmotm(ie, mmotm-2015-10-15-15-20) to send formal > > > > patchset. Unfortunately, I start to see sudden discarding of > > > > the page we shouldn't do. IOW, application's valid anonymous page > > > > was disappeared suddenly. > > > > > > > > When I look through THP changes, I think we could lose > > > > dirty bit of pte between freeze_page and unfreeze_page > > > > when we mark it as migration entry and restore it. > > > > So, I added below simple code without enough considering > > > > and cannot see the problem any more. > > > > I hope it's good hint to find right fix this problem. > > > > > > > > diff --git a/mm/huge_memory.c b/mm/huge_memory.c > > > >
Re: Kernel Oops: btusb: 4.2rc1 System lockup with BT dongle insert - log attached
>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=6f957724b94cb19f5c1c97efd01dd4df8ced323c >> > > Certainly looks like a plausible solution, will build kernel tonight to > confirm. Just to confirm; 4.2rc1 + above patch, and 4.2rc2 both function correctly and I no longer see the lock up/Oops. Thanks to all who helped out, Simon. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Kernel Oops: btusb: 4.2rc1 System lockup with BT dongle insert - log attached
> On 07/17/2015 08:14 AM, si...@mungewell.org wrote:
>>>> So in summary this problem is showing up now as the 'User Helper
>>>> Fallback' is now forced on, obviously the underlying problem needs
>>>> to be fixed - but I don't know when it crept in.
>>>
>>> The 'CONFIG_FW_LOADER_USER_HELPER_FALLBACK' enables loading firmware
>>> data manually by accessing /sys/class/firmware//data. It runs in
>>> case the firmware file is missing.
>>> This user helper fallback will be enabled if one of the LP55xx drivers
>>> is included in your dot config. Please see my patch below.
>>>
>>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/leds?id=b67893206fc0a0e8af87130e67f3d8ae553fc87c
>>>
>>> However, I'm not sure why this affects your system lockup. Can I have
>>> more details?
>>
>> Hi Milo,
>> I'm not suggesting that your patch is the cause, just that it is an
>> 'enabler' and explains why the problem (system lockup when I plug USB
>> Bluetooth dongle in) appears now.
>>
>> A full Oops log is further back in this thread:
>> http://www.spinics.net/lists/linux-bluetooth/msg63090.html
>>
>> Will try building 4.1 with this option to see if it fails.
>>
>> A very quick test as I was leaving the house this morning shows that 4.1
>> with 'CONFIG_FW_LOADER_USER_HELPER_FALLBACK' does not show the problem.
>>
>> So at least we know the 'real' problem is a recent change to the code.
>> Simon
>>
>
> I think this was reported and fixed
>
> https://lkml.org/lkml/2015/7/8/858
> https://lkml.org/lkml/2015/7/8/1199
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=6f957724b94cb19f5c1c97efd01dd4df8ced323c
>

Certainly looks like a plausible solution, will build kernel tonight to
confirm.

If Shuah is still looking for the trigger, see above note regarding
'CONFIG_FW_LOADER_USER_HELPER_FALLBACK'.

Thanks, and have an awesome weekend. :-)
Simon
Re: Kernel Oops: btusb: 4.2rc1 System lockup with BT dongle insert - log attached
On 07/17/2015 08:14 AM, si...@mungewell.org wrote:
>>> So in summary this problem is showing up now as the 'User Helper
>>> Fallback' is now forced on, obviously the underlying problem needs
>>> to be fixed - but I don't know when it crept in.
>>
>> The 'CONFIG_FW_LOADER_USER_HELPER_FALLBACK' enables loading firmware
>> data manually by accessing /sys/class/firmware//data. It runs in
>> case the firmware file is missing.
>> This user helper fallback will be enabled if one of the LP55xx drivers
>> is included in your dot config. Please see my patch below.
>>
>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/leds?id=b67893206fc0a0e8af87130e67f3d8ae553fc87c
>>
>> However, I'm not sure why this affects your system lockup. Can I have
>> more details?
>
> Hi Milo,
> I'm not suggesting that your patch is the cause, just that it is an
> 'enabler' and explains why the problem (system lockup when I plug USB
> Bluetooth dongle in) appears now.
>
> A full Oops log is further back in this thread:
> http://www.spinics.net/lists/linux-bluetooth/msg63090.html
>
> Will try building 4.1 with this option to see if it fails.
>
> A very quick test as I was leaving the house this morning shows that 4.1
> with 'CONFIG_FW_LOADER_USER_HELPER_FALLBACK' does not show the problem.
>
> So at least we know the 'real' problem is a recent change to the code.
> Simon
>

I think this was reported and fixed

https://lkml.org/lkml/2015/7/8/858
https://lkml.org/lkml/2015/7/8/1199
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=6f957724b94cb19f5c1c97efd01dd4df8ced323c

Thanks,
Laura
Re: Kernel Oops: btusb: 4.2rc1 System lockup with BT dongle insert - log attached
>> So in summary this problem is showing up now as the 'User Helper
>> Fallback' is now forced on, obviously the underlying problem needs to
>> be fixed - but I don't know when it crept in.
>>
>
> The 'CONFIG_FW_LOADER_USER_HELPER_FALLBACK' enables loading firmware
> data manually by accessing /sys/class/firmware//data. It runs in
> case the firmware file is missing.
> This user helper fallback will be enabled if one of the LP55xx drivers is
> included in your dot config. Please see my patch below.
>
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/leds?id=b67893206fc0a0e8af87130e67f3d8ae553fc87c
>
> However, I'm not sure why this affects your system lockup. Can I have
> more details?

Hi Milo,
I'm not suggesting that your patch is the cause, just that it is an
'enabler' and explains why the problem (system lockup when I plug USB
Bluetooth dongle in) appears now.

A full Oops log is further back in this thread:
http://www.spinics.net/lists/linux-bluetooth/msg63090.html

>> Will try building 4.1 with this option to see if it fails.

A very quick test as I was leaving the house this morning shows that 4.1
with 'CONFIG_FW_LOADER_USER_HELPER_FALLBACK' does not show the problem.

So at least we know the 'real' problem is a recent change to the code.
Simon
Re: Kernel Oops: btusb: 4.2rc1 System lockup with BT dongle insert - log attached
> It looks like the firmware 'opt_flags' must be different, so this may be a
> contributing factor.

Plot thickens: the kernel config has changed since I built 4.1.0rc7, but I
don't recall doing it or starting afresh.

/boot/config-4.1.0-rc7+
--
CONFIG_PREVENT_FIRMWARE_BUILD=y
CONFIG_FW_LOADER=y
CONFIG_FIRMWARE_IN_KERNEL=y
CONFIG_EXTRA_FIRMWARE=""
CONFIG_FW_LOADER_USER_HELPER=y
# CONFIG_FW_LOADER_USER_HELPER_FALLBACK is not set
CONFIG_WANT_DEV_COREDUMP=y
CONFIG_ALLOW_DEV_COREDUMP=y
CONFIG_DEV_COREDUMP=y
--

/boot/config-4.2.0-rc1+
--
CONFIG_PREVENT_FIRMWARE_BUILD=y
CONFIG_FW_LOADER=y
CONFIG_FIRMWARE_IN_KERNEL=y
CONFIG_EXTRA_FIRMWARE=""
CONFIG_FW_LOADER_USER_HELPER=y
CONFIG_FW_LOADER_USER_HELPER_FALLBACK=y !!!
CONFIG_WANT_DEV_COREDUMP=y
CONFIG_ALLOW_DEV_COREDUMP=y
CONFIG_DEV_COREDUMP=y
--

Has a kconfig forced a change? Grrr

--
$ git blame ./drivers/leds/Kconfig
--
c93d08fa7 (Milo(Woogyom) Kim 2013-02-05 18:01:23 +0900 228) config LEDS_LP55XX_COMMON
33b3a561f (Kim, Milo 2013-07-09 02:11:37 -0700 229) tristate Common Driver for TI/National LP5521/5523/55231/5562/8501
33b3a561f (Kim, Milo 2013-07-09 02:11:37 -0700 230) depends on LEDS_LP5521 || LEDS_LP5523 || LEDS_LP5562 || LEDS_LP8501
10c06d178 (Milo(Woogyom) Kim 2013-02-05 19:17:20 +0900 231) select FW_LOADER
b67893206 (Milo Kim 2015-06-28 17:39:14 -0700 232) select FW_LOADER_USER_HELPER_FALLBACK -
c93d08fa7 (Milo(Woogyom) Kim 2013-02-05 18:01:23 +0900 233) help
33b3a561f (Kim, Milo 2013-07-09 02:11:37 -0700 234) This option supports common operations for LP5521/5523/55231/5562/8501
c93d08fa7 (Milo(Woogyom) Kim 2013-02-05 18:01:23 +0900 235) devices.
--

So in summary this problem is showing up now as the 'User Helper Fallback'
is now forced on, obviously the underlying problem needs to be fixed - but
I don't know when it crept in.

Will try building 4.1 with this option to see if it fails.
Simon
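The comparison of the two /boot/config-* listings above can be automated. What follows is only a minimal sketch in Python, not something posted in the thread: the names `parse_config` and `diff_configs` are hypothetical, and it assumes the usual .config conventions (`CONFIG_X=value` lines and `# CONFIG_X is not set` comments). An option that is absent and one that is "not set" are treated as the same state.

```python
import re

# Matches the "# CONFIG_FOO is not set" comment lines found in a .config file.
_NOT_SET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Return a dict mapping option name to value; unset options map to None."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        match = _NOT_SET.match(line)
        if match:
            options[match.group(1)] = None
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def diff_configs(old_text, new_text):
    """Return {option: (old, new)} for every option whose value differs."""
    old, new = parse_config(old_text), parse_config(new_text)
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(set(old) | set(new))
        if old.get(name) != new.get(name)
    }
```

Feeding it the contents of the two config files from the message should report only the options that changed between the builds, which makes a silently `select`ed option easier to spot than reading the listings by eye.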
Re: Kernel Oops: btusb: 4.2rc1 System lockup with BT dongle insert - log attached
Hi Simon,

On 7/17/2015 3:14 PM, si...@mungewell.org wrote:
>> It looks like the firmware 'opt_flags' must be different, so this may be a
>> contributing factor.
>
> Plot thickens: the kernel config has changed since I built 4.1.0rc7, but I
> don't recall doing it or starting afresh.
>
> /boot/config-4.1.0-rc7+
> --
> CONFIG_PREVENT_FIRMWARE_BUILD=y
> CONFIG_FW_LOADER=y
> CONFIG_FIRMWARE_IN_KERNEL=y
> CONFIG_EXTRA_FIRMWARE=""
> CONFIG_FW_LOADER_USER_HELPER=y
> # CONFIG_FW_LOADER_USER_HELPER_FALLBACK is not set
> CONFIG_WANT_DEV_COREDUMP=y
> CONFIG_ALLOW_DEV_COREDUMP=y
> CONFIG_DEV_COREDUMP=y
> --
>
> /boot/config-4.2.0-rc1+
> --
> CONFIG_PREVENT_FIRMWARE_BUILD=y
> CONFIG_FW_LOADER=y
> CONFIG_FIRMWARE_IN_KERNEL=y
> CONFIG_EXTRA_FIRMWARE=""
> CONFIG_FW_LOADER_USER_HELPER=y
> CONFIG_FW_LOADER_USER_HELPER_FALLBACK=y !!!
> CONFIG_WANT_DEV_COREDUMP=y
> CONFIG_ALLOW_DEV_COREDUMP=y
> CONFIG_DEV_COREDUMP=y
> --
>
> Has a kconfig forced a change? Grrr
>
> --
> $ git blame ./drivers/leds/Kconfig
> --
> c93d08fa7 (Milo(Woogyom) Kim 2013-02-05 18:01:23 +0900 228) config LEDS_LP55XX_COMMON
> 33b3a561f (Kim, Milo 2013-07-09 02:11:37 -0700 229) tristate Common Driver for TI/National LP5521/5523/55231/5562/8501
> 33b3a561f (Kim, Milo 2013-07-09 02:11:37 -0700 230) depends on LEDS_LP5521 || LEDS_LP5523 || LEDS_LP5562 || LEDS_LP8501
> 10c06d178 (Milo(Woogyom) Kim 2013-02-05 19:17:20 +0900 231) select FW_LOADER
> b67893206 (Milo Kim 2015-06-28 17:39:14 -0700 232) select FW_LOADER_USER_HELPER_FALLBACK -
> c93d08fa7 (Milo(Woogyom) Kim 2013-02-05 18:01:23 +0900 233) help
> 33b3a561f (Kim, Milo 2013-07-09 02:11:37 -0700 234) This option supports common operations for LP5521/5523/55231/5562/8501
> c93d08fa7 (Milo(Woogyom) Kim 2013-02-05 18:01:23 +0900 235) devices.
> --
>
> So in summary this problem is showing up now as the 'User Helper Fallback'
> is now forced on, obviously the underlying problem needs to be fixed - but
> I don't know when it crept in.

The 'CONFIG_FW_LOADER_USER_HELPER_FALLBACK' enables loading firmware
data manually by accessing /sys/class/firmware/name/data. It runs in
case the firmware file is missing.
This user helper fallback will be enabled if one of the LP55xx drivers is
included in your dot config. Please see my patch below.

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/leds?id=b67893206fc0a0e8af87130e67f3d8ae553fc87c

However, I'm not sure why this affects your system lockup. Can I have
more details?

Best regards,
Milo
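For readers unfamiliar with the manual loading Milo describes: the kernel's firmware fallback interface exposes `loading` and `data` files in a per-request sysfs directory, and user space writes 1 to `loading`, copies the image into `data`, then writes 0 to `loading` (or -1 to abort). The sketch below, in Python, illustrates that sequence only; it is not code from the thread. The function name `load_firmware_via_sysfs` and its `sysfs_dir` parameter are hypothetical, and the directory is passed in by the caller because it exists only while a request is pending.

```python
import os


def load_firmware_via_sysfs(sysfs_dir, firmware_path):
    """Feed a firmware image to a pending request through the fallback files.

    sysfs_dir is the per-request directory under /sys/class/firmware/;
    it is supplied by the caller rather than guessed here (assumption).
    Returns the number of bytes written.
    """
    loading = os.path.join(sysfs_dir, "loading")
    data = os.path.join(sysfs_dir, "data")

    with open(firmware_path, "rb") as src:
        image = src.read()

    # "1" tells the kernel that a firmware upload is starting.
    with open(loading, "w") as f:
        f.write("1")
    try:
        with open(data, "wb") as f:
            f.write(image)
    except OSError:
        # "-1" aborts the request so the kernel stops waiting for data.
        with open(loading, "w") as f:
            f.write("-1")
        raise
    # "0" marks the upload as complete.
    with open(loading, "w") as f:
        f.write("0")
    return len(image)
```

On a real system this needs root and a driver that is currently waiting on the fallback; against an ordinary directory it simply creates the two files, which is enough to check the order of writes.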
Re: Kernel Oops: btusb: 4.2rc1 System lockup with BT dongle insert - log attached
> [ 117.236007] [814d2ccf] device_del+0x18f/0x270
> [ 117.236007] [8109b340] ? wake_up_q+0x70/0x70
> [ 117.236007] [814e97da] _request_firmware+0x5aa/0xaf0
> [ 117.236007] [814e9d55] request_firmware+0x35/0x50
> [ 117.236007] [c00fb881] btbcm_setup_patchram+0x191/0x910 [btbcm]
> [ 117.236007] [814e0994] ? rpm_idle+0xc4/0x200
> [ 117.236007] [c0e28488] hci_dev_do_open+0xd8/0x500

Comparing the log from 3.19
--
Jul 7 21:42:57 retrobox kernel: [ 107.562441] bluetooth hci0: Direct firmware load for brcm/BCM20702A0-0a5c-21e8.hcd failed with error -2
Jul 7 21:42:57 retrobox kernel: [ 107.562452] Bluetooth: hci0: BCM: patch brcm/BCM20702A0-0a5c-21e8.hcd not found
--

And the log of the lockup:
https://www.flickr.com/photos/24244464@N03/19375918529/sizes/o/

It looks like the firmware 'opt_flags' must be different, so this may be a
contributing factor.

In fact I found a log from 4.1.0rc7, which shows they recently changed!!
--
Jul 15 21:17:40 blind-fury kernel: [0.00] Linux version 4.1.0-rc7+ (root@blind-fury) (gcc version 4.9.2 (Ubuntu 4.9.2-10ubuntu13) ) #2 SMP Wed Jun 10 21:25:17 MDT 2015
Jul 15 21:17:40 blind-fury kernel: [0.00] Command line: BOOT_IMAGE=/boot/vmlinuz-4.1.0-rc7+ root=UUID=56684438-bf61-422a-9c47-e0d7e405f4e7 ro quiet splash
...
Jul 15 21:20:04 blind-fury kernel: [ 173.591327] usbcore: registered new interface driver btusb
Jul 15 21:20:04 blind-fury kernel: [ 173.604148] Bluetooth: hci0: BCM: chip id 63
Jul 15 21:20:04 blind-fury kernel: [ 173.606079] Bluetooth: hci0: BCM20702A1 (001.002.014) build
Jul 15 21:20:04 blind-fury kernel: [ 173.628434] bluetooth hci0: Direct firmware load for brcm/BCM20702A1-0a5c-21e8.hcd failed with error -2
Jul 15 21:20:04 blind-fury kernel: [ 173.628439] Bluetooth: hci0: BCM: Patch brcm/BCM20702A1-0a5c-21e8.hcd not found
--

These are checked here, but code hasn't changed recently:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/base/firmware_class.c?id=6593d9245bc66e6e3cf4ba6d365a7833110c1402#n1135

There have been changes to the btbcm.c code wrt firmware loading:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/bluetooth/btbcm.c?id=18aeb4445aa00f6f402ba3a92a2e9ff3d13882b4

Simon.
Re: Kernel OOPS when reloading i915 after resume from suspend
Hi!

> >>1) Shut down X,
> >>2) Unbind the consoles:
> >>
> >>echo 0 > /sys/class/vtconsole/vtcon1/bind
> >>echo 0 > /sys/class/vtconsole/vtcon0/bind
> >>
> >>3) Remove the i915
> >>
> >>rmmod i915
> >>
> >>4) Suspend the system
> >>
> >>pm-suspend
> >
> >Does it also break with plain echo mem > /sys/power/state?
>
> No. If the modules stay resident, any user driven shutdown/resume
> events by writing into /sys/power/state work just fine.
>
> >Does it break when you don't suspend at all?
>
> No. Unloading and reloading i915 under normal conditions also works fine.
>
> >>5) Resume the system by pressing on the power-button.
> >>6) Reload the i915 module with
> >>
> >>modprobe i915.
> >>
> >>Result is a kernel-Oops:
> >
> >Sounds like problem in suspend/resume framework, actually :-(.
>
> Good guess, but wrong. The BIOS leaves the 830GM in a state the i915
> does not seem to like.

Ok, good, something for Intel people to solve, then...

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: Kernel OOPS when reloading i915 after resume from suspend
On 02.07.2014 23:22, Pavel Machek wrote:
> Hi!
>
>> still experimenting with the resume from suspend on the Fujitsu
>> S6010. I can, however, still create a kernel oops. The kernel source
>> comes from alm_fixes5, kernel 3.15.0-rc7+. For that, do the
>> following:
>>
>> 1) Shut down X,
>> 2) Unbind the consoles:
>>
>> echo 0 > /sys/class/vtconsole/vtcon1/bind
>> echo 0 > /sys/class/vtconsole/vtcon0/bind
>>
>> 3) Remove the i915
>>
>> rmmod i915
>>
>> 4) Suspend the system
>>
>> pm-suspend
>
> Does it also break with plain echo mem > /sys/power/state?

No. If the modules stay resident, any user driven shutdown/resume
events by writing into /sys/power/state work just fine.

> Does it break when you don't suspend at all?

No. Unloading and reloading i915 under normal conditions also works fine.

>> 5) Resume the system by pressing on the power-button.
>> 6) Reload the i915 module with
>>
>> modprobe i915.
>>
>> Result is a kernel-Oops:
>
> Sounds like problem in suspend/resume framework, actually :-(.

Good guess, but wrong. The BIOS leaves the 830GM in a state the i915
does not seem to like.

Greetings,
Thomas
Re: Kernel OOPS when reloading i915 after resume from suspend
Am 02.07.2014 23:22, schrieb Pavel Machek: Hi! still experimenting with the resume from suspend on the Fujitsu S6010. I can, however, still create a kernel oops. The kernel source comes from alm_fixes5, kernel 3.15.0-rc7+. For that, do the following: 1) Shut down X, 2) Unbind the consoles: echo 0 /sys/class/vtconsole/vtcon1/bind echo 0 /sys/class/vtconsole/vtcon0/bind 3) Remove the i915 rmmod i915 4) Suspend the system pm-suspend Does it also break with plain echo mem /sys/power/state? No. If the modules stay resident, any user driven shutdown/resume events by writing into /sys/power/state work just fine. Does it break when you don't suspend at all? No. Unloading and reloading i915 under normal conditions also works fine. 5) Resume the system by pressing on the power-button. 6) Reload the i915 module with modprobe i915. Result is a kernel-Oops: Sounds like problem in suspend/resume framework, actually :-(. Good guess, but wrong. The BIOS leaves the 830GM in a state the i915 does not seem to like. Greetings, Thomas -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Kernel OOPS when reloading i915 after resume from suspend
Hi!

> > > 1) Shut down X,
> > > 2) Unbind the consoles:
> > >
> > > echo 0 > /sys/class/vtconsole/vtcon1/bind
> > > echo 0 > /sys/class/vtconsole/vtcon0/bind
> > >
> > > 3) Remove the i915
> > >
> > > rmmod i915
> > >
> > > 4) Suspend the system
> > >
> > > pm-suspend
> >
> > Does it also break with plain echo mem > /sys/power/state?
>
> No. If the modules stay resident, any user driven shutdown/resume events by writing into /sys/power/state work just fine.
>
> > Does it break when you don't suspend at all?
>
> No. Unloading and reloading i915 under normal conditions also works fine.
>
> > > 5) Resume the system by pressing on the power-button.
> > > 6) Reload the i915 module with
> > >
> > > modprobe i915.
> > >
> > > Result is a kernel-Oops:
> >
> > Sounds like problem in suspend/resume framework, actually :-(.
>
> Good guess, but wrong. The BIOS leaves the 830GM in a state the i915 does not seem to like.

Ok, good, something for Intel people to solve, then...

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: Kernel OOPS when reloading i915 after resume from suspend
Hi!

> still experimenting with the resume from suspend on the Fujitsu
> S6010. I can, however, still create a kernel oops. The kernel source
> comes from alm_fixes5, kernel 3.15.0-rc7+. For that, do the
> following:
>
> 1) Shut down X,
> 2) Unbind the consoles:
>
> echo 0 > /sys/class/vtconsole/vtcon1/bind
> echo 0 > /sys/class/vtconsole/vtcon0/bind
>
> 3) Remove the i915
>
> rmmod i915
>
> 4) Suspend the system
>
> pm-suspend

Does it also break with plain echo mem > /sys/power/state?

Does it break when you don't suspend at all?

> 5) Resume the system by pressing on the power-button.
> 6) Reload the i915 module with
>
> modprobe i915.
>
> Result is a kernel-Oops:

Sounds like problem in suspend/resume framework, actually :-(.

Pavel

>
> Jun 29 19:34:00 tyleet kernel: [ 321.283072] [drm] Memory usable by
> graphics device = 128M
> Jun 29 19:34:00 tyleet kernel: [ 321.286770] [drm] Supports vblank
> timestamp caching Rev 2 (21.10.2013).
> Jun 29 19:34:00 tyleet kernel: [ 321.286782] [drm] Driver supports
> precise vblank timestamp query.
> Jun 29 19:34:00 tyleet kernel: [ 321.286959] [drm] applying pipe a
> force quirk
> Jun 29 19:34:00 tyleet kernel: [ 321.286965] [drm] applying pipe b
> force quirk
> Jun 29 19:34:00 tyleet kernel: [ 321.307436] *pde =
> Jun 29 19:34:00 tyleet kernel: [ 321.307568] Oops: [#1]
> Jun 29 19:34:00 tyleet kernel: [ 321.307751] Modules linked in:
> i915(+) michael_mic arc4 ecb lib80211_crypt_tkip lib80211_crypt_ccmp
> binfmt_misc fuse netconsole loop firewire_sbp2 hid_generic usbhid
> hid snd_intel8x0 sg snd_ac97_codec ac97_bus snd_pcm sr_mod snd_seq

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: Kernel oops without Backtrace
On Thu 27-03-14 13:47:21, Celestino Martinez Lopez wrote: > Hi All, > > I am running a SUSE linux 3.0.13-0.27 and after printing an opps the system > hangs (though it only happened once). > In the opps there is only Code: information, no stack trace or processor > registers. > Analyzing the code most of the cores are in intel_idle function, but there is > also appears native_read_tsc and delay_tsc. > > Any help would be more than welcome. http://bugzilla.novell.com/ would be a more appropriate channel IMO. > Attached the console output: > > [2718229.308036] Stack: > [2718229.334021] Call Trace: > [2718229.365203] Code: 01 e8 5d 41 5c c3 0f 1f 40 00 31 c9 48 89 f0 48 89 ca > 0f 01 c8 0f ae f0 48 8b 87 38 e0 ff ff a8 08 75 88 b1 > 01 48 89 e8 0f 01 c9 7b ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 48 8d 74 > 24 2c bf > [2718229.592831] Stack: > [2718229.618815] Call Trace: > [2718229.619910] Leftover inexact backtrace: > [2718229.619910] > [2718229.619911] > [2718229.619913] > [2718229.619914] Code: 98 45 89 ef 48 89 04 24 eb 16 66 0f 1f 44 00 00 f3 90 > 65 44 8b 34 25 70 c5 00 00 45 39 fe 75 2b 66 66 90 0f > ae e8 e8 cd 9a db ff > [2718229.620741] Stack: > [2718229.620748] Call Trace: > [2718229.620761] Code: 01 e8 5d 41 5c c3 0f 1f 40 00 31 c9 48 89 f0 48 89 ca > 0f 01 c8 0f ae f0 48 8b 87 38 e0 ff ff a8 08 75 88 b1 > 01 48 89 e8 0f 01 c9 7b ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 48 8d 74 > 24 2c bf > [2718229.620825] Stack: > [2718229.620833] Call Trace: > [2718229.620842] Code: 01 e8 5d 41 5c c3 0f 1f 40 00 31 c9 48 89 f0 48 89 ca > 0f 01 c8 0f ae f0 48 8b 87 38 e0 ff ff a8 08 75 88 b1 > 01 48 89 e8 0f 01 c9 7b ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 48 8d 74 > 24 2c bf > [2718229.621012] Stack: > [2718229.621020] Call Trace: > [2718229.621036] Code: 01 e8 5d 41 5c c3 0f 1f 40 00 31 c9 48 89 f0 48 89 ca > 0f 01 c8 0f ae f0 48 8b 87 38 e0 ff ff a8 08 75 88 b1 > 01 48 89 e8 0f 01 c9 7b ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 48 8d 74 > 24 2c bf > [2718229.621115] Stack: > 
[2718229.621122] Call Trace: > [2718229.621132] Code: 01 e8 5d 41 5c c3 0f 1f 40 00 31 c9 48 89 f0 48 89 ca > 0f 01 c8 0f ae f0 48 8b 87 38 e0 ff ff a8 08 75 88 b1 > 01 48 89 e8 0f 01 c9 7b ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 48 8d 74 > 24 2c bf > [2718229.621205] Stack: > [2718229.621212] Call Trace: > [2718229.621222] Code: 01 e8 5d 41 5c c3 0f 1f 40 00 31 c9 48 89 f0 48 89 ca > 0f 01 c8 0f ae f0 48 8b 87 38 e0 ff ff a8 08 75 88 b1 > 01 48 89 e8 0f 01 c9 7b ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 48 8d 74 > 24 2c bf > [2718229.621422] Stack: > [2718229.621429] Call Trace: > [2718229.621446] Code: 01 e8 5d 41 5c c3 0f 1f 40 00 31 c9 48 89 f0 48 89 ca > 0f 01 c8 0f ae f0 48 8b 87 38 e0 ff ff a8 08 75 88 b1 > 01 48 89 e8 0f 01 c9 7b ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 48 8d 74 > 24 2c bf > [2718229.621551] Stack: > [2718229.621564] Call Trace: > [2718229.621579] Code: 01 e8 5d 41 5c c3 0f 1f 40 00 31 c9 48 89 f0 48 89 ca > 0f 01 c8 0f ae f0 48 8b 87 38 e0 ff ff a8 08 75 88 b1 > 01 48 89 e8 0f 01 c9 7b ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 48 8d 74 > 24 2c bf > [4194339.646298] Stack: > [4194339.672279] Call Trace: > [4194339.704539] Leftover inexact backtrace: > [4194339.704539] > [4194339.772037] > [4194339.796922] > [4194339.826073] Code: 70 e4 71 c3 0f 1f 80 00 00 00 00 40 0f b6 c6 e6 70 40 > 0f b6 c7 e6 71 c3 0f 1f 00 0f 31 89 c1 48 89 d0 48 c1 > e0 20 89 ca 48 09 d0 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 48 83 > ec 18 48 > [4194340.054808] Stack: > [4194340.080786] Call Trace: > [4194340.111957] Code: 01 e8 5d 41 5c c3 0f 1f 40 00 31 c9 48 89 f0 48 89 ca > 0f 01 c8 0f ae f0 48 8b 87 38 e0 ff ff a8 08 75 88 b1 > 01 48 89 e8 0f 01 c9 7b ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 48 8d 74 > 24 2c bf > [4194340.339554] Stack: > [4194340.365534] Call Trace: > [4194340.396700] Code: 01 e8 5d 41 5c c3 0f 1f 40 00 31 c9 48 89 f0 48 89 ca > 0f 01 c8 0f ae f0 48 8b 87 38 e0 ff ff a8 08 75 88 b1 > 01 48 89 e8 0f 01 c9 7b ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 48 
8d 74 > 24 2c bf > [4194340.624288] Stack: > [4194340.650267] Call Trace: > [4194340.681435] Code: 01 e8 5d 41 5c c3 0f 1f 40 00 31 c9 48 89 f0 48 89 ca > 0f 01 c8 0f ae f0 48 8b 87 38 e0 ff ff a8 08 75 88 b1 > 01 48 89 e8 0f 01 c9 7b ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 48 8d 74 > 24 2c bf > [4194340.909073] Stack: > [4194340.935054] Call Trace: > [4194340.966222] Code: 01 e8 5d 41 5c c3 0f 1f 40 00 31 c9 48 89 f0 48 89 ca > 0f 01 c8 0f ae f0 48 8b 87 38 e0 ff ff a8 08 75 88 b1 > 01 48 89 e8 0f 01 c9 7b ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 48 8d 74 > 24 2c bf > [4194341.193816] Stack: > [4194341.219796] Call Trace: > [4194341.250969] Code: 01 e8 5d 41 5c c3 0f 1f 40 00 31 c9 48 89 f0 48 89 ca > 0f 01 c8 0f ae f0 48 8b 87 38 e0 ff ff a8 08 75 88 b1 > 01 48 89 e8 0f 01 c9 7b ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 48 8d 74 > 24 2c bf > [4194341.478549] Stack: > [4194341.504530] Call Trace: > [4194341.535696] Code: 01 e8 5d 41
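Since the dump above contains only repeated "Code:" lines, one way to see how many CPUs were stopped at the same instruction bytes is to tally identical lines. The following is a minimal sketch, assuming the console output was saved to a file with one log record per line (the mail-wrapped copy quoted above would need its wrapped lines rejoined first); the function name `tally_code_lines` and the file name in the usage comment are invented for illustration.

```sh
#!/bin/sh
# Hypothetical helper: count how often each distinct "Code:" byte string
# occurs in a saved console log and list the most frequent first.
tally_code_lines() {
    # $1: path to the saved log, one record per line (assumption)
    sed -n 's/^.*Code: //p' "$1" | sort | uniq -c | sort -rn |
        awk '{ n = $1; $1 = ""; printf "%dx%s\n", n, $0 }'
}

# Usage (file name is an example):
#   tally_code_lines console.log
```

To turn one of the byte strings into instructions, the kernel source tree ships `scripts/decodecode`, which reads an oops from standard input; it relies on the marked faulting byte in the "Code:" line, so it needs the unmodified console text.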
Re: Kernel OOPS using auditd Debian 3.2 on a /sys file audit
On Tue 14-05-13 20:36:59, Javier Domingo wrote:
> I didn't get any reply, is this mailing list still valid, or should I
> ask elsewhere? (this is a ping message)

This is a high traffic list so your message is likely to just go unnoticed. It is good to find maintainers of the subsystem (MAINTAINERS file in kernel sources) and CC them as well - in this case these are Al Viro and Eric Paris. Also searching in the archives shows that the first message likely didn't even get it into the list (too large attachment?).

Finally, kernel 3.2 is rather old which reduces the enthusiasm of people here to look into the problem (although you still might be lucky). So you might have better luck with reporting this in Debian bug tracking system...

Honza

> 2013/5/2 Javier Domingo:
> > Hi,
> >
> > I am currently having problems with the cpu scaling (something is
> > touching the /sys/devices/system/cpu/cpu?/cpufreq/scaling_max_freq),
> > and I decided to use inotify to seek if something was being changed or
> > not.
> >
> > This gave no problems to me, but something is writing just before me,
> > and I can't know from the inotifywait cmd who. Then, I found auditdl,
> > so I put this command onto it:
> >
> > # auditctl -w /sys/devices/system/cpu/cpu?/cpufreq/scaling_max_freq
> >
> > And just after this, this totally blocking OOPS was triggered:
> > [ 5072.685483] BUG: unable to handle kernel NULL pointer dereference
> > at 0038
> > [ 5072.714302] IP: [] sysfs_dentry_revalidate+0x9/0xa2
> > [ 5072.741249] PGD 12eca4067 PUD 12ec62067 PMD 0
> > [ 5072.768102] Oops: [#1] SMP
> >
> > I tried to make sysrq, but didn't work. I have repeated several times
> > this and it always crashes. I am attaching the photo of the second
> > time because it is much longer than the second one.
--
Jan Kara
SUSE Labs, CR
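The advice above about finding the subsystem maintainers can be sketched with the `scripts/get_maintainer.pl` helper that ships in the kernel source tree. The wrapper function `list_maintainers` and the example path `kernel/audit.c` are assumptions chosen for illustration; pick the file that matches your own trace.

```sh
#!/bin/sh
# Hypothetical wrapper: list the people and lists to CC for one source file.
# Must be run from the top of a kernel source tree.
list_maintainers() {
    # $1: path of a source file related to the report (assumption)
    if [ -x scripts/get_maintainer.pl ]; then
        scripts/get_maintainer.pl -f "$1"
    else
        echo "scripts/get_maintainer.pl not found; run from a kernel tree" >&2
        return 1
    fi
}

# Usage (path is an example):
#   list_maintainers kernel/audit.c
```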