Re: dd: page allocation failure: order:0, mode:0x1080020(GFP_ATOMIC), nodemask=(null)

2017-11-30 Thread Michal Hocko
On Thu 30-11-17 22:01:03, Wu Fengguang wrote:
> On Thu, Nov 30, 2017 at 02:50:16PM +0100, Michal Hocko wrote:
> > On Thu 30-11-17 21:38:40, Wu Fengguang wrote:
> > > Hello,
> > > 
> > > It looks like a regression in 4.15.0-rc1 -- the test case simply runs a
> > > set of parallel dd's and there seems to be no reason to run into memory problems.
> > > 
> > > It occurs in 1 out of 4 tests.
> > 
> > This is an atomic allocation. So the failure really depends on the
> > state of the free memory and that can vary between runs depending on
> > timing I guess. So I am not really sure this is a regression. But maybe
> > there is something reclaim related going on here.
> 
> Yes, it does depend on how heavily the drivers rely on atomic allocations.
> I just wonder if any changes make the pressure tighter than before.
> It may not even be an MM change -- in theory drivers might also use atomic
> allocations more aggressively than before.
> 
[...]
> Attached are the JSON-format per-second vmstat records.
> They feel more readable than the raw dumps.

Well, from a quick check it seems that there is just legitimate memory
pressure where kswapd doesn't keep up with the allocation pace. If we
just check the overall kswapd reclaim efficiency
(proc-vmstat.pgsteal_kswapd/proc-vmstat.pgscan_kswapd)
13311631/1331957
then we are at ~99%, which means that kswapd did a good job reclaiming and
didn't stumble over anything. We reclaim _a lot_ from the direct reclaim
context, which means that kswapd doesn't keep up at all
(proc-vmstat.pgsteal_direct/proc-vmstat.pgscan_direct)
107767391/108058968, again ~99%, but 8x what kswapd does.
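
For reference, these efficiency numbers are just ratios of the cumulative
/proc/vmstat counters. A minimal sketch of the calculation (assuming the
pgsteal_*/pgscan_* counter names quoted above and the usual "name value"
/proc/vmstat format):

#!/usr/bin/env python3
# Reclaim efficiency = pages stolen / pages scanned, per reclaim context.
# Values close to 100% mean reclaim is not stumbling over unreclaimable pages.

def read_vmstat(path="/proc/vmstat"):
    """Parse 'name value' lines into a dict of ints."""
    stats = {}
    with open(path) as f:
        for line in f:
            name, value = line.split()
            stats[name] = int(value)
    return stats

if __name__ == "__main__":
    vm = read_vmstat()
    for ctx in ("kswapd", "direct"):
        steal = vm.get("pgsteal_" + ctx, 0)
        scan = vm.get("pgscan_" + ctx, 0)
        eff = 100.0 * steal / scan if scan else 0.0
        print(f"{ctx:6s}: pgsteal={steal} pgscan={scan} efficiency={eff:.1f}%")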

If you look at the diffs in the pgsteal numbers, we are at ~2M pages reclaimed
per second from the direct reclaim and ~200k pages/s from kswapd. kswapd is
naturally slower as it is a single thread, compared to the many reclaimers
hitting the direct reclaim path at once.

The allocstall numbers show that this was not a single peak but rather a
continual direct reclaim storm, starting with dozens of direct reclaim
invocations per second and reaching hundreds on zone Normal and even worse
on the Movable zone, where we are in the thousands (just have a look at the
diffs between the respective numbers).
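
The per-second rates and the allocstall storm are just deltas between
consecutive snapshots. A rough sketch of that diffing (the snapshot file
names are examples, and the allocstall_normal/allocstall_movable field names
are an assumption based on the per-zone allocstall split mentioned above):

#!/usr/bin/env python3
# Diff consecutive /proc/vmstat snapshots (one file per second) to get
# pages-reclaimed-per-second and direct-reclaim-stalls-per-second.
import sys

# Cumulative counters whose per-second deltas we want to see.
FIELDS = ("pgsteal_kswapd", "pgsteal_direct",
          "allocstall_normal", "allocstall_movable")

def read_vmstat(path):
    stats = {}
    with open(path) as f:
        for line in f:
            name, value = line.split()
            stats[name] = int(value)
    return stats

snapshots = [read_vmstat(p) for p in sys.argv[1:]]  # e.g. vmstat.0001 vmstat.0002 ...

for prev, cur in zip(snapshots, snapshots[1:]):
    deltas = [f"{f}={cur.get(f, 0) - prev.get(f, 0)}/s" for f in FIELDS]
    print("  ".join(deltas))

Run against the raw per-second dumps in chronological order (the file names
above are purely illustrative).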

So from a quick look it really seems like heavy memory pressure rather
than some reclaim deficiency to me.
-- 
Michal Hocko
SUSE Labs


Re: dd: page allocation failure: order:0, mode:0x1080020(GFP_ATOMIC), nodemask=(null)

2017-11-30 Thread Fengguang Wu

On Thu, Nov 30, 2017 at 10:08:04PM +0800, Fengguang Wu wrote:

[   78.848629] dd: page allocation failure: order:0, mode:0x1080020(GFP_ATOMIC), nodemask=(null)
[   78.857841] dd cpuset=/ mems_allowed=0-1
[   78.862502] CPU: 0 PID: 6131 Comm: dd Tainted: G   O 4.15.0-rc1 #1
[   78.870437] Call Trace:
[   78.873610]  
[   78.876342]  dump_stack+0x5c/0x7b
[   78.880414]  warn_alloc+0xbe/0x150
[   78.884550]  __alloc_pages_slowpath+0xda7/0xdf0
[   78.889822]  ? xhci_urb_enqueue+0x23d/0x580
[   78.894713]  __alloc_pages_nodemask+0x269/0x280
[   78.899891]  page_frag_alloc+0x11c/0x150
[   78.904471]  __netdev_alloc_skb+0xa0/0x110
[   78.909277]  rx_submit+0x3b/0x2e0
[   78.913256]  rx_complete+0x196/0x2d0
[   78.917560]  __usb_hcd_giveback_urb+0x86/0x100
[   78.922681]  xhci_giveback_urb_in_irq+0x86/0x100
[   78.928769]  ? ip_rcv+0x261/0x390
[   78.932739]  xhci_td_cleanup+0xe7/0x170
[   78.937308]  handle_tx_event+0x297/0x1190
[   78.941990]  xhci_irq+0x300/0xb80
[   78.945968]  ? pciehp_isr+0x46/0x320
[   78.950870]  __handle_irq_event_percpu+0x3a/0x1a0
[   78.956311]  handle_irq_event_percpu+0x20/0x50
[   78.961466]  handle_irq_event+0x3d/0x60
[   78.965962]  handle_edge_irq+0x71/0x190
[   78.970480]  handle_irq+0xa5/0x100
[   78.974565]  do_IRQ+0x41/0xc0
[   78.978206]  ? pagevec_move_tail_fn+0x350/0x350
[   78.983412]  common_interrupt+0x96/0x96


Unfortunately we are missing the most important information, the
meminfo. We cannot tell much without it. Maybe collecting /proc/vmstat
during the test will tell us more.


Attached are the JSON-format per-second vmstat records.
They feel more readable than the raw dumps.
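
For what it's worth, such records could be collected with a loop along these
lines (just a sketch; the output file name is an example and the sample
interval is one second):

#!/usr/bin/env python3
# Sample /proc/vmstat once per second for the duration of the test and
# append one JSON record per sample; JSON records diff more easily than raw dumps.
import json
import time

def read_vmstat(path="/proc/vmstat"):
    stats = {}
    with open(path) as f:
        for line in f:
            name, value = line.split()
            stats[name] = int(value)
    return stats

with open("proc-vmstat.json", "a") as out:  # example output name
    while True:  # run for the duration of the test, Ctrl-C to stop
        record = {"time": time.time()}
        record.update(read_vmstat())
        out.write(json.dumps(record) + "\n")
        out.flush()
        time.sleep(1)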


And here are the meminfo lines.


Here are the raw proc file dumps, taken every second during the test.

Thanks,
Fengguang


meminfo.gz
Description: application/gzip


proc-vmstat.gz
Description: application/gzip


Re: dd: page allocation failure: order:0, mode:0x1080020(GFP_ATOMIC), nodemask=(null)

2017-11-30 Thread Fengguang Wu

[   78.848629] dd: page allocation failure: order:0, mode:0x1080020(GFP_ATOMIC), nodemask=(null)
[   78.857841] dd cpuset=/ mems_allowed=0-1
[   78.862502] CPU: 0 PID: 6131 Comm: dd Tainted: G   O 4.15.0-rc1 #1
[   78.870437] Call Trace:
[   78.873610]  
[   78.876342]  dump_stack+0x5c/0x7b
[   78.880414]  warn_alloc+0xbe/0x150
[   78.884550]  __alloc_pages_slowpath+0xda7/0xdf0
[   78.889822]  ? xhci_urb_enqueue+0x23d/0x580
[   78.894713]  __alloc_pages_nodemask+0x269/0x280
[   78.899891]  page_frag_alloc+0x11c/0x150
[   78.904471]  __netdev_alloc_skb+0xa0/0x110
[   78.909277]  rx_submit+0x3b/0x2e0
[   78.913256]  rx_complete+0x196/0x2d0
[   78.917560]  __usb_hcd_giveback_urb+0x86/0x100
[   78.922681]  xhci_giveback_urb_in_irq+0x86/0x100
[   78.928769]  ? ip_rcv+0x261/0x390
[   78.932739]  xhci_td_cleanup+0xe7/0x170
[   78.937308]  handle_tx_event+0x297/0x1190
[   78.941990]  xhci_irq+0x300/0xb80
[   78.945968]  ? pciehp_isr+0x46/0x320
[   78.950870]  __handle_irq_event_percpu+0x3a/0x1a0
[   78.956311]  handle_irq_event_percpu+0x20/0x50
[   78.961466]  handle_irq_event+0x3d/0x60
[   78.965962]  handle_edge_irq+0x71/0x190
[   78.970480]  handle_irq+0xa5/0x100
[   78.974565]  do_IRQ+0x41/0xc0
[   78.978206]  ? pagevec_move_tail_fn+0x350/0x350
[   78.983412]  common_interrupt+0x96/0x96


Unfortunately we are missing the most important information, the
meminfo. We cannot tell much without it. Maybe collecting /proc/vmstat
during the test will tell us more.


Attached are the JSON-format per-second vmstat records.
They feel more readable than the raw dumps.


And here are the meminfo lines.

Thanks,
Fengguang



meminfo.json.gz
Description: application/gzip


Re: dd: page allocation failure: order:0, mode:0x1080020(GFP_ATOMIC), nodemask=(null)

2017-11-30 Thread Fengguang Wu

On Thu, Nov 30, 2017 at 02:50:16PM +0100, Michal Hocko wrote:

On Thu 30-11-17 21:38:40, Wu Fengguang wrote:

Hello,

It looks like a regression in 4.15.0-rc1 -- the test case simply runs a
set of parallel dd's and there seems to be no reason to run into memory problems.

It occurs in 1 out of 4 tests.


This is an atomic allocation. So the failure really depends on the
state of the free memory and that can vary between runs depending on
timing I guess. So I am not really sure this is a regression. But maybe
there is something reclaim related going on here.


Yes, it does depend on how heavily the drivers rely on atomic allocations.
I just wonder if any changes make the pressure tighter than before.
It may not even be an MM change -- in theory drivers might also use atomic
allocations more aggressively than before.


[...]

[   71.088242] dd: page allocation failure: order:0, mode:0x1080020(GFP_ATOMIC), nodemask=(null)
[   71.098654] dd cpuset=/ mems_allowed=0-1
[   71.104460] CPU: 0 PID: 6016 Comm: dd Tainted: G   O 4.15.0-rc1 #1
[   71.113553] Call Trace:
[   71.117886]  
[   71.121749]  dump_stack+0x5c/0x7b:
                dump_stack at lib/dump_stack.c:55
[   71.126785]  warn_alloc+0xbe/0x150:
                preempt_count at arch/x86/include/asm/preempt.h:23
                (inlined by) should_suppress_show_mem at mm/page_alloc.c:3244
                (inlined by) warn_alloc_show_mem at mm/page_alloc.c:3254
                (inlined by) warn_alloc at mm/page_alloc.c:3293
[   71.131939]  __alloc_pages_slowpath+0xda7/0xdf0:
                __alloc_pages_slowpath at mm/page_alloc.c:4151
[   71.138110]  ? xhci_urb_enqueue+0x23d/0x580:
                xhci_urb_enqueue at drivers/usb/host/xhci.c:1389
[   71.143941]  __alloc_pages_nodemask+0x269/0x280:
                __alloc_pages_nodemask at mm/page_alloc.c:4245
[   71.150167]  page_frag_alloc+0x11c/0x150:
                __page_frag_cache_refill at mm/page_alloc.c:4335
                (inlined by) page_frag_alloc at mm/page_alloc.c:4364
[   71.155668]  __netdev_alloc_skb+0xa0/0x110:
                __netdev_alloc_skb at net/core/skbuff.c:415
[   71.161386]  rx_submit+0x3b/0x2e0:
                rx_submit at drivers/net/usb/usbnet.c:488
[   71.166232]  rx_complete+0x196/0x2d0:
                rx_complete at drivers/net/usb/usbnet.c:659
[   71.171354]  __usb_hcd_giveback_urb+0x86/0x100:
                arch_local_irq_restore at arch/x86/include/asm/paravirt.h:777
                (inlined by) __usb_hcd_giveback_urb at drivers/usb/core/hcd.c:1769
[   71.177281]  xhci_giveback_urb_in_irq+0x86/0x100
[   71.184107]  xhci_td_cleanup+0xe7/0x170:
                xhci_td_cleanup at drivers/usb/host/xhci-ring.c:1924
[   71.189457]  handle_tx_event+0x297/0x1190:
                process_bulk_intr_td at drivers/usb/host/xhci-ring.c:2267
                (inlined by) handle_tx_event at drivers/usb/host/xhci-ring.c:2598
[   71.194905]  ? reweight_entity+0x145/0x180:
                enqueue_runnable_load_avg at kernel/sched/fair.c:2742
                (inlined by) reweight_entity at kernel/sched/fair.c:2810
[   71.200466]  xhci_irq+0x300/0xb80:
                xhci_handle_event at drivers/usb/host/xhci-ring.c:2676
                (inlined by) xhci_irq at drivers/usb/host/xhci-ring.c:2777
[   71.205195]  ? scheduler_tick+0xb2/0xe0:
                rq_last_tick_reset at kernel/sched/sched.h:1643
                (inlined by) scheduler_tick at kernel/sched/core.c:3036
[   71.210407]  ? run_timer_softirq+0x73/0x460:
                __collect_expired_timers at kernel/time/timer.c:1375
                (inlined by) collect_expired_timers at kernel/time/timer.c:1609
                (inlined by) __run_timers at kernel/time/timer.c:1656
                (inlined by) run_timer_softirq at kernel/time/timer.c:1688
[   71.215905]  __handle_irq_event_percpu+0x3a/0x1a0:
                __handle_irq_event_percpu at kernel/irq/handle.c:147
[   71.221975]  handle_irq_event_percpu+0x20/0x50:

Re: dd: page allocation failure: order:0, mode:0x1080020(GFP_ATOMIC), nodemask=(null)

2017-11-30 Thread Michal Hocko
On Thu 30-11-17 21:38:40, Wu Fengguang wrote:
> Hello,
> 
> It looks like a regression in 4.15.0-rc1 -- the test case simply runs a
> set of parallel dd's and there seems to be no reason to run into memory problems.
> 
> It occurs in 1 out of 4 tests.

This is an atomic allocation. So the failure really depends on the
state of the free memory and that can vary between runs depending on
timing I guess. So I am not really sure this is a regression. But maybe
there is something reclaim related going on here.
[...]
> [   71.088242] dd: page allocation failure: order:0, mode:0x1080020(GFP_ATOMIC), nodemask=(null)
> [   71.098654] dd cpuset=/ mems_allowed=0-1
> [   71.104460] CPU: 0 PID: 6016 Comm: dd Tainted: G   O 4.15.0-rc1 #1
> [   71.113553] Call Trace:
> [   71.117886]  
> [   71.121749]  dump_stack+0x5c/0x7b:
>                 dump_stack at lib/dump_stack.c:55
> [   71.126785]  warn_alloc+0xbe/0x150:
>                 preempt_count at arch/x86/include/asm/preempt.h:23
>                 (inlined by) should_suppress_show_mem at mm/page_alloc.c:3244
>                 (inlined by) warn_alloc_show_mem at mm/page_alloc.c:3254
>                 (inlined by) warn_alloc at mm/page_alloc.c:3293
> [   71.131939]  __alloc_pages_slowpath+0xda7/0xdf0:
>                 __alloc_pages_slowpath at mm/page_alloc.c:4151
> [   71.138110]  ? xhci_urb_enqueue+0x23d/0x580:
>                 xhci_urb_enqueue at drivers/usb/host/xhci.c:1389
> [   71.143941]  __alloc_pages_nodemask+0x269/0x280:
>                 __alloc_pages_nodemask at mm/page_alloc.c:4245
> [   71.150167]  page_frag_alloc+0x11c/0x150:
>                 __page_frag_cache_refill at mm/page_alloc.c:4335
>                 (inlined by) page_frag_alloc at mm/page_alloc.c:4364
> [   71.155668]  __netdev_alloc_skb+0xa0/0x110:
>                 __netdev_alloc_skb at net/core/skbuff.c:415
> [   71.161386]  rx_submit+0x3b/0x2e0:
>                 rx_submit at drivers/net/usb/usbnet.c:488
> [   71.166232]  rx_complete+0x196/0x2d0:
>                 rx_complete at drivers/net/usb/usbnet.c:659
> [   71.171354]  __usb_hcd_giveback_urb+0x86/0x100:
>                 arch_local_irq_restore at arch/x86/include/asm/paravirt.h:777
>                 (inlined by) __usb_hcd_giveback_urb at drivers/usb/core/hcd.c:1769
> [   71.177281]  xhci_giveback_urb_in_irq+0x86/0x100
> [   71.184107]  xhci_td_cleanup+0xe7/0x170:
>                 xhci_td_cleanup at drivers/usb/host/xhci-ring.c:1924
> [   71.189457]  handle_tx_event+0x297/0x1190:
>                 process_bulk_intr_td at drivers/usb/host/xhci-ring.c:2267
>                 (inlined by) handle_tx_event at drivers/usb/host/xhci-ring.c:2598
> [   71.194905]  ? reweight_entity+0x145/0x180:
>                 enqueue_runnable_load_avg at kernel/sched/fair.c:2742
>                 (inlined by) reweight_entity at kernel/sched/fair.c:2810
> [   71.200466]  xhci_irq+0x300/0xb80:
>                 xhci_handle_event at drivers/usb/host/xhci-ring.c:2676
>                 (inlined by) xhci_irq at drivers/usb/host/xhci-ring.c:2777
> [   71.205195]  ? scheduler_tick+0xb2/0xe0:
>                 rq_last_tick_reset at kernel/sched/sched.h:1643
>                 (inlined by) scheduler_tick at kernel/sched/core.c:3036
> [   71.210407]  ? run_timer_softirq+0x73/0x460:
>                 __collect_expired_timers at kernel/time/timer.c:1375
>                 (inlined by) collect_expired_timers at kernel/time/timer.c:1609
>                 (inlined by) __run_timers at kernel/time/timer.c:1656
>                 (inlined by) run_timer_softirq at kernel/time/timer.c:1688
> [   71.215905]  __handle_irq_event_percpu+0x3a/0x1a0:
>                 __handle_irq_event_percpu at kernel/irq/handle.c:147
> [   71.221975]  handle_irq_event_percpu+0x20/0x50:
>                 handle_irq_event_percpu at kernel/irq/handle.c:189
> [   71.227641]  handle_irq_event+0x3d/0x60:
>                 handle_irq_event at kernel/irq/handle.c:206
> [   
