Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0

2017-05-26 Thread Luis R. Rodriguez
On Fri, May 19, 2017 at 10:35 AM, Catalin Marinas wrote:
> On Fri, May 19, 2017 at 05:40:16PM +0200, Luis R. Rodriguez wrote:
>> If the following is a legitimate way to force the kernel to tell us who
>> owns a page, then perhaps this technique can be used in the future to
>> figure out who the hell caused this. Catalin, can you confirm? In this
>> case this is perhaps not a leaked page, but I am trying to abuse the
>> kmemleak debugfs API to query who allocated the page. Is that fine?
>>
>> [0.916771] WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:235 note_page+0x63c/0x7e0
>> [0.917636] x86/mm: Found insecure W+X mapping at address c03d5000/0xc03d5000
>> [0.918502] Modules linked in:
>> [0.918819] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.11.0-mcgrof-force-config #340
>> [0.919631] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
>> [0.920011] Call Trace:
>> [0.920011]  dump_stack+0x63/0x81
>> [0.920011]  __warn+0xcb/0xf0
>> [0.920011]  warn_slowpath_fmt+0x5a/0x80
>> [0.920011]  note_page+0x63c/0x7e0
>> [0.920011]  ptdump_walk_pgd_level_core+0x3b1/0x460
>> [0.920011]  ? 0x86c0
>> [0.920011]  ptdump_walk_pgd_level_checkwx+0x17/0x20
>> [0.920011]  mark_rodata_ro+0xf4/0x100
>> [0.920011]  ? rest_init+0x80/0x80
>> [0.920011]  kernel_init+0x2a/0x100
>> [0.920011]  ret_from_fork+0x2c/0x40
>> [0.925474] ---[ end trace dca00cd779490a2b ]---
>> [0.925959] x86/mm: Checked W+X mappings: FAILED, 1 W+X pages found.
>>
>> echo dump=0xc03d5000 > /sys/kernel/debug/kmemleak
>> dmesg | tail
>>
>> [   49.209565] kmemleak: Object 0xc03d5000 (size 335):
>> [   49.210814] kmemleak:   comm "swapper/0", pid 1, jiffies 4294892440
>> [   49.212148] kmemleak:   min_count = 2
>> [   49.212852] kmemleak:   count = 0
>> [   49.213363] kmemleak:   flags = 0x1
>> [   49.213363] kmemleak:   checksum = 0
>> [   49.213363] kmemleak:   backtrace:
>> [   49.213363]  kmemleak_alloc+0x4a/0xa0
>> [   49.213363]  __vmalloc_node_range+0x20a/0x2b0
>> [   49.213363]  module_alloc+0x67/0xc0
>> [   49.213363]  arch_ftrace_update_trampoline+0xba/0x260
>> [   49.213363]  ftrace_startup+0x90/0x210
>> [   49.213363]  register_ftrace_function+0x4b/0x60
>> [   49.213363]  arm_kprobe+0x84/0xe0
>> [   49.213363]  register_kprobe+0x56e/0x5b0
>> [   49.213363]  init_test_probes+0x61/0x560
>> [   49.213363]  init_kprobes+0x1e3/0x206
>> [   49.213363]  do_one_initcall+0x52/0x1a0
>> [   49.213363]  kernel_init_freeable+0x178/0x200
>> [   49.213363]  kernel_init+0xe/0x100
>> [   49.213363]  ret_from_fork+0x2c/0x40
>> [   49.213363]  0x
>
> You could as well use kmemleak this way since it tracks the memory
> allocations.

Great!

> However, it doesn't track alloc_pages and also doesn't
> track mapping existing pages (vmap etc.)

Can we verify that? If so, the splat from the above complaint could
include a follow-up dump of the allocation trace, no? That's *much* more
useful.

 Luis
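
A minimal Python sketch of the lookup discussed above, for illustration only.
It assumes the report line looks like the "Found insecure W+X mapping at
address" line quoted in this thread, and that debugfs is mounted with kmemleak
enabled. The helper names (find_wx_addresses, dump_request, request_dump) are
hypothetical, not an existing tool.

```python
import re

# Pattern based on the report line quoted in this thread (an assumption about
# the exact message format; adjust if your kernel prints it differently).
WX_RE = re.compile(
    r"Found insecure W\+X mapping at address\s+(?:0x)?([0-9a-fA-F]+)"
)

# debugfs file used in the thread; requires root and kmemleak support.
KMEMLEAK = "/sys/kernel/debug/kmemleak"


def find_wx_addresses(dmesg_text):
    """Return the W+X addresses reported in dmesg_text, in order, as ints."""
    return [int(m.group(1), 16) for m in WX_RE.finditer(dmesg_text)]


def dump_request(addr):
    """Build the string written to the kmemleak file to dump one object."""
    return "dump=0x%x" % addr


def request_dump(addr, path=KMEMLEAK):
    """Ask kmemleak to log the object containing addr; read dmesg afterwards."""
    with open(path, "w") as f:
        f.write(dump_request(addr))
```

With the log above saved to a string, find_wx_addresses() gives the address to
pass to request_dump(); the object details then show up in dmesg, as in the
quoted output. As Catalin notes, this only helps for allocations kmemleak
actually tracks.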


Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0

2017-05-24 Thread Luis R. Rodriguez
On Tue, May 23, 2017 at 04:48:50PM +0200, Luis R. Rodriguez wrote:
> On Sat, May 20, 2017 at 11:38:50AM +0900, Masami Hiramatsu wrote:
> > Hi Luis,
> > 
> > On Fri, 19 May 2017 19:28:54 +0200
> > "Luis R. Rodriguez"  wrote:
> > > 
> > > Aha! And the winner is:
> > > 
> > > CONFIG_KPROBES_SANITY_TEST
> > > 
> > > I confirm disabling it on 4.3.0-rc3 and on linux-next next-20170519
> > > avoids the WARN.
> > > I can also confirm that using 'echo dump=mem-area >
> > > /sys/kernel/debug/kmemleak' yields the same trace for both of these
> > > kernels.
> > > 
> > > So -- the above kmemleak hack seems to actually work to find who owns
> > > that page.
> > > 
> > > Now to figure out how the hell kernel/test_kprobes.c screws around with
> > > things.
> > 
> > Ah, that was fixed recently;
> > 
> > https://marc.info/?l=linux-kernel&m=149076389011850
> > 
> > Note that this patch depends on another patch in the series;
> > 
> > https://marc.info/?l=linux-kernel&m=149076370111812&w=2
> 
> I actually boot tested linux-next tag next-20170519, which carries these
> patches, and the WARNING is still there. Please note the issue is with
> CONFIG_KPROBES_SANITY_TEST enabled.
> 
> [1.025601] x86/mm: Found insecure W+X mapping at address c01e7000/0xc01e7000
> [1.026429] ------------[ cut here ]------------
> [1.026885] WARNING: CPU: 1 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
> [1.027711] Modules linked in:
> [1.028032] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.12.0-rc1-next-20170519 #151
> [1.028788] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
> [1.029928] task: 9fd47a5ccc80 task.stack: b6bcc063
> [1.030509] RIP: 0010:note_page+0x630/0x7e0
> [1.030917] RSP: :b6bcc0633df0 EFLAGS: 00010286
> [1.031425] RAX: 0051 RBX: b6bcc0633e88 RCX: bb656708
> [1.032132] RDX:  RSI: 0096 RDI: 0246
> [1.032834] RBP: b6bcc0633e28 R08: 203a6d6d2f363878 R09: 0161
> [1.033539] R10: b6bcc0633dd8 R11: 736e6920646e756f R12:
> [1.034235] R13: 0004 R14:  R15:
> [1.034927] FS:  () GS:9fd47fc8() knlGS:
> [1.035722] CS:  0010 DS:  ES:  CR0: 80050033
> [1.036290] CR2: b6bcc073c000 CR3: 53209000 CR4: 06e0
> [1.036839] Call Trace:
> [1.037034]  ptdump_walk_pgd_level_core+0x3e7/0x490
> [1.037367]  ? 0xbaa0
> [1.037705]  ptdump_walk_pgd_level_checkwx+0x17/0x20
> [1.038187]  mark_rodata_ro+0xf4/0x100
> [1.038559]  ? rest_init+0x80/0x80
> [1.038890]  kernel_init+0x2f/0x100
> [1.039235]  ret_from_fork+0x2c/0x40
> [1.039582] Code: 48 c7 43 28 00 00 00 00 48 89 43 20 e9 05 fd ff ff 48 8b 73 10 48 c7 c7 f0 3d 3e bb c6 05 f8 eb bc 00 01 48 89 f2 e8 1d 02 12 00 <0f> ff e9 1f fa ff ff 48 8b 70 20 48 c7 c7 3c ba 3e bb e8 06 02
> [1.041416] ---[ end trace e726c1b63e5a81a9 ]---
> [1.041872] x86/mm: Checked W+X mappings: FAILED, 1 W+X pages found.
> 
> root@piggy:~# echo dump=0xc01e7000 > /sys/kernel/debug/kmemleak
> 
> On dmesg:
> 
> May 23 07:44:51 piggy kernel: kmemleak: Object 0xc01e7000 (size 335):
> May 23 07:44:51 piggy kernel: kmemleak:   comm "swapper/0", pid 1, jiffies 4294892451
> May 23 07:44:51 piggy kernel: kmemleak:   min_count = 2
> May 23 07:44:51 piggy kernel: kmemleak:   count = 2
> May 23 07:44:51 piggy kernel: kmemleak:   flags = 0x1
> May 23 07:44:51 piggy kernel: kmemleak:   checksum = 0
> May 23 07:44:51 piggy kernel: kmemleak:   backtrace:
> May 23 07:44:51 piggy kernel:  kmemleak_alloc+0x4a/0xa0
> May 23 07:44:51 piggy kernel:  __vmalloc_node_range+0x20c/0x2b0
> May 23 07:44:51 piggy kernel:  module_alloc+0x67/0xc0
> May 23 07:44:51 piggy kernel:  arch_ftrace_update_trampoline+0xc1/0x240
> May 23 07:44:51 piggy kernel:  ftrace_startup+0x92/0x210
> May 23 07:44:51 piggy kernel:  register_ftrace_function+0x4b/0x60
> May 23 07:44:51 piggy kernel:  arm_kprobe+0x84/0xc0
> May 23 07:44:51 piggy kernel:  register_kprobe+0x59c/0x5e0
> May 23 07:44:51 piggy kernel:  init_test_probes+0x61/0x560
> May 23 07:44:51 piggy kernel:  init_kprobes+0x1ea/0x20d
> May 23 07:44:51 piggy kernel:  do_one_initcall+0x52/0x1a0
> May 23 07:44:51 piggy kernel:  kernel_init_freeable+0x17d/0x205
> May 23 07:44:51 piggy kernel:  kernel_init+0xe/0x100
> May 23 07:44:51 piggy kernel:  ret_from_fork+0x2c/0x40
> May 23 07:44:51 piggy kernel:  0x

Turns out that Thomas Gleixner's patch from today [0] fixes this as the same
module_alloc() path was the culprit of the issue. Steven Rostedt however just
reported that this patch crashes on his ftracetests, so it would seem we just
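
For reading a kmemleak object dump like the one quoted above, a small Python
sketch may help (an illustration, not something posted in this thread). It
assumes frames are printed one per line as symbol+0xoff/0xsize after the
"backtrace:" line, as in the quoted log. The names backtrace_frames and
callers_of are hypothetical helpers.

```python
import re

# One stack frame as printed in the quoted dump: symbol+0xoffset/0xsize.
FRAME_RE = re.compile(r"([A-Za-z_][A-Za-z0-9_.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def backtrace_frames(log_text):
    """Return function names listed after the kmemleak 'backtrace:' line."""
    frames = []
    in_trace = False
    for line in log_text.splitlines():
        if not in_trace:
            # The trace starts right after the "kmemleak:   backtrace:" line.
            in_trace = "kmemleak:" in line and "backtrace:" in line
            continue
        match = FRAME_RE.search(line)
        if match is None:
            break  # first non-frame line ends the trace
        frames.append(match.group(1))
    return frames


def callers_of(frames, func):
    """Return the frames printed after func, i.e. its callers, nearest first."""
    if func not in frames:
        return []
    return frames[frames.index(func) + 1:]
```

Applied to the dump above, callers_of(frames, "module_alloc") would start with
arch_ftrace_update_trampoline, which is the path the kprobes sanity test takes
through ftrace.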

Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0

2017-05-23 Thread Luis R. Rodriguez
On Sat, May 20, 2017 at 11:38:50AM +0900, Masami Hiramatsu wrote:
> Hi Luis,
> 
> On Fri, 19 May 2017 19:28:54 +0200
> "Luis R. Rodriguez"  wrote:
> 
> > On Fri, May 19, 2017 at 05:40:16PM +0200, Luis R. Rodriguez wrote:
> > > On Fri, May 19, 2017 at 05:08:02AM +0200, Luis R. Rodriguez wrote:
> > > > On Fri, May 19, 2017 at 02:44:14AM +0200, Luis R. Rodriguez wrote:
> > > > > On Wed, May 17, 2017 at 10:53:06AM -0700, Kees Cook wrote:
> > > > > > On Wed, May 17, 2017 at 9:40 AM, Luis R. Rodriguez wrote:
> > > > > > > Yes, but I had killed that boot session again, so upon my next boot
> > > > > > > I had a different layout, the ASLR gap was much larger:
> > > > > > >
> > > > > > > ---[ Modules ]---
> > > > > > > 0xc000-0xc01b           1728K               pte
> > > > > > > 0xc01b-0xc01b1000          4K RW     GLB x  pte
> > > > > > > 0xc01b1000-0xc01b2000      4K               pte
> > > > > > > 0xc01b2000-0xc01c6000     80K ro     GLB x  pte
> > > > > > > 0xc01c6000-0xc01cc000     24K ro     GLB NX pte
> > > > > > > 0xc01cc000-0xc01d5000     36K RW     GLB NX pte
> > > > > > >
> > > > > > > As you can guess, if we follow a similar pattern, the RW hole is
> > > > > > > the one this boot warned about:
> > > > > > >
> > > > > > > [1.450483] x86/mm: Found insecure W+X mapping at address c01b/0xc01b
> > > > > > > [1.451280] ------------[ cut here ]------------
> > > > > > > [1.451721] WARNING: CPU: 1 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
> > > > > > > [1.452499] Modules linked in:
> > > > > > > [1.452791] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.12.0-rc1-next-20170515+ #145
> > > > > > >
> > > > > > > I checked and indeed 0xc01b2000 is part of a module. It was not
> > > > > > > the first one on the /proc/modules list, but then again
> > > > > > > /proc/modules does not seem to have a specific order other than
> > > > > > > perhaps being pegged into a linked list of modules once they go
> > > > > > > live, and it seems it's typically output backwards from when that
> > > > > > > happened. Sorting that by address we get:
> > > > > > 
> > > > > > Right, sorry, I'd expect it at the bottom of the list in
> > > > > > /proc/modules, but that's fine, it's there.
> > > > > > 
> > > > > > >
> > > > > > > root@piggy:~# cat /proc/modules | sort -k 6 | head -3
> > > > > > > e1000 143360 0 - Live 0xc01b2000 (E)
> > > > > > > mbcache 16384 1 ext4, Live 0xc01d6000 (E)
> > > > > > > scsi_mod 217088 4 sg,sr_mod,sd_mod,libata, Live 0xc01df000 (E)
> > > > > > >
> > > > > > > And this then seems to be the first module loaded:
> > > > > > >
> > > > > > > e1000 143360 0 - Live 0xc01b2000 (E)
> > > > > > >
> > > > > > > The output of dmesg seems to confirm this, as per the list of
> > > > > > > modules sorted above.
> > > > > > >
> > > > > > >> Something touched the module gap and left it RW+x...
> > > > > > >
> > > > > > > Lemme try booting with e1000 renamed to e1000.ko.ignore and see
> > > > > > > how that goes.
> > > > > > 
> > > > > > Is it possible a module got loaded before e1000 and then unloaded?
> > > > > > That seems odd, but maybe unload isn't cleaning up?
> > > > > > 
> > > > > > >> Are you able to bisect this?
> > > > > > >
> > > > > > > This issue has been present for a while, so since I recall this I
> > > > > > > might be able to reduce the number of target kernels needed to
> > > > > > > bisect. Lemme tinker a bit and if no clear culprit comes up then
> > > > > > > will try bisect.
> > > > > > 
> > > > > > Okay, thanks!
> > > > > 
> > > > > Sorry to report that this issue has been present since the feature
> > > > > was added and is still present today. *But* it may also be a
> > > > > configuration issue, given I have booted this guest *without*
> > > > > this issue ...
> > > > > 
> > > > > So:
> > > > > 
> > > > > git checkout -b WX e1a58320a38dfa72be48a0f1a3a92273663ba6db
> > > > > 
> > > > > That boots with the warning. To help debug further I've minimized my
> > > > > modules to only a few: scsi_mod, e1000, libata.
> > > > > 
> > > > > I suspect at this point this is not the fault of a particular module
> > > > > but instead just an accounting semantic (>= or <= on an edge), but
> > > > > let's see.
> > > > > 
> > > > > I now boot on 4.3.0-rc3 on commit e1a58320a38df ("x86/mm: Warn on W^X
> > > > > mappings") and I get:
> > 
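
A note on the /proc/modules sorting quoted above: `sort -k 6` compares the
address column as text, which gives the right order only while all addresses
have the same width. A small Python sketch that sorts numerically instead,
assuming the column layout shown in the quoted output (name, size, refcount,
users, state, address); modules_by_address is a hypothetical helper name.

```python
def modules_by_address(proc_modules_text):
    """Return (address, name) pairs from /proc/modules text, lowest first."""
    modules = []
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or truncated lines
        # Field 6 is the load address, printed in hex with a 0x prefix.
        modules.append((int(fields[5], 16), fields[0]))
    return sorted(modules)


if __name__ == "__main__":
    # Addresses read as zero unless run with sufficient privileges.
    with open("/proc/modules") as f:
        for addr, name in modules_by_address(f.read())[:3]:
            print("0x%x %s" % (addr, name))
```

The first entry is then the module with the lowest load address, which is the
one of interest when looking at the start of the module area.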

Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0

2017-05-19 Thread Masami Hiramatsu
Hi Luis,

On Fri, 19 May 2017 19:28:54 +0200
"Luis R. Rodriguez"  wrote:

> On Fri, May 19, 2017 at 05:40:16PM +0200, Luis R. Rodriguez wrote:
> > On Fri, May 19, 2017 at 05:08:02AM +0200, Luis R. Rodriguez wrote:
> > > On Fri, May 19, 2017 at 02:44:14AM +0200, Luis R. Rodriguez wrote:
> > > > On Wed, May 17, 2017 at 10:53:06AM -0700, Kees Cook wrote:
> > > > > On Wed, May 17, 2017 at 9:40 AM, Luis R. Rodriguez wrote:
> > > > > > Yes, but I had killed that boot session again, so upon my next boot
> > > > > > I had a different layout, the ASLR gap was much larger:
> > > > > >
> > > > > > ---[ Modules ]---
> > > > > > 0xc000-0xc01b           1728K               pte
> > > > > > 0xc01b-0xc01b1000          4K RW     GLB x  pte
> > > > > > 0xc01b1000-0xc01b2000      4K               pte
> > > > > > 0xc01b2000-0xc01c6000     80K ro     GLB x  pte
> > > > > > 0xc01c6000-0xc01cc000     24K ro     GLB NX pte
> > > > > > 0xc01cc000-0xc01d5000     36K RW     GLB NX pte
> > > > > >
> > > > > > As you can guess, if we follow a similar pattern, the RW hole is
> > > > > > the one this boot warned about:
> > > > > >
> > > > > > [1.450483] x86/mm: Found insecure W+X mapping at address c01b/0xc01b
> > > > > > [1.451280] ------------[ cut here ]------------
> > > > > > [1.451721] WARNING: CPU: 1 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
> > > > > > [1.452499] Modules linked in:
> > > > > > [1.452791] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.12.0-rc1-next-20170515+ #145
> > > > > >
> > > > > > I checked and indeed 0xc01b2000 is part of a module. It was not
> > > > > > the first one on the /proc/modules list, but then again
> > > > > > /proc/modules does not seem to have a specific order other than
> > > > > > perhaps being pegged into a linked list of modules once they go
> > > > > > live, and it seems it's typically output backwards from when that
> > > > > > happened. Sorting that by address we get:
> > > > > 
> > > > > Right, sorry, I'd expect it at the bottom of the list in
> > > > > /proc/modules, but that's fine, it's there.
> > > > > 
> > > > > >
> > > > > > root@piggy:~# cat /proc/modules | sort -k 6 | head -3
> > > > > > e1000 143360 0 - Live 0xc01b2000 (E)
> > > > > > mbcache 16384 1 ext4, Live 0xc01d6000 (E)
> > > > > > scsi_mod 217088 4 sg,sr_mod,sd_mod,libata, Live 0xc01df000 (E)
> > > > > >
> > > > > > And this then seems to be the first module loaded:
> > > > > >
> > > > > > e1000 143360 0 - Live 0xc01b2000 (E)
> > > > > >
> > > > > > The output of dmesg seems to confirm this, as per the list of
> > > > > > modules sorted above.
> > > > > >
> > > > > >> Something touched the module gap and left it RW+x...
> > > > > >
> > > > > > Lemme try booting with e1000 renamed to e1000.ko.ignore and see
> > > > > > how that goes.
> > > > > 
> > > > > Is it possible a module got loaded before e1000 and then unloaded?
> > > > > That seems odd, but maybe unload isn't cleaning up?
> > > > > 
> > > > > >> Are you able to bisect this?
> > > > > >
> > > > > > This issue has been present for a while, so since I recall this I
> > > > > > might be able to reduce the number of target kernels needed to
> > > > > > bisect. Lemme tinker a bit and if no clear culprit comes up then
> > > > > > will try bisect.
> > > > > 
> > > > > Okay, thanks!
> > > > 
> > > > Sorry to report that this issue has been present since the feature
> > > > was added and is still present today. *But* it may also be a
> > > > configuration issue, given I have booted this guest *without*
> > > > this issue ...
> > > > 
> > > > So:
> > > > 
> > > > git checkout -b WX e1a58320a38dfa72be48a0f1a3a92273663ba6db
> > > > 
> > > > That boots with the warning. To help debug further I've minimized my
> > > > modules to only a few: scsi_mod, e1000, libata.
> > > > 
> > > > I suspect at this point this is not the fault of a particular module
> > > > but instead just an accounting semantic (>= or <= on an edge), but
> > > > let's see.
> > > > 
> > > > I now boot on 4.3.0-rc3 on commit e1a58320a38df ("x86/mm: Warn on W^X
> > > > mappings") and I get:
> > > > 
> > > > [0.949435] ------------[ cut here ]------------
> > > > [0.949992] WARNING: CPU: 2 PID: 1 at arch/x86/mm/dump_pagetables.c:225 note_page+0x635/0x7e0()
> > > > [0.950996] x86/mm: Found insecure W+X mapping at 
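
The warning in this thread is about a mapping that is writable and executable
at the same time. A rough Python sketch of the same test applied to page table
dump text, assuming each entry sits on one line in the layout quoted above
(address range, size, then flags such as RW, ro, x, NX). The sample addresses
in the usage lines are made up for illustration and wx_ranges is a hypothetical
helper; this is not how the kernel implements the check.

```python
def is_wx(line):
    """True if a page table dump line is flagged both writable and executable."""
    fields = line.split()
    # Expect "<start>-<end> <size> <flags...>"; skip headers and blank lines.
    if len(fields) < 3 or not fields[0].startswith("0x"):
        return False
    flags = fields[2:]
    # "RW" marks a writable entry; "x" marks executable ("NX" is the opposite).
    return "RW" in flags and "x" in flags


def wx_ranges(dump_text):
    """Return the address ranges of all W+X entries in dump_text."""
    return [line.split()[0] for line in dump_text.splitlines() if is_wx(line)]


# Illustrative input only; the addresses are invented for this example.
sample = """---[ Modules ]---
0xffffffffa0000000-0xffffffffa0001000   4K RW GLB x  pte
0xffffffffa0001000-0xffffffffa0002000   4K           pte
0xffffffffa0002000-0xffffffffa0016000  80K ro GLB x  pte
0xffffffffa0016000-0xffffffffa001f000  36K RW GLB NX pte
"""
print(wx_ranges(sample))
```

Only the first sample entry is reported, since it is the only one carrying both
RW and x; the ro/x and RW/NX entries are the expected combinations for module
text and data.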

Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0

2017-05-19 Thread Masami Hiramatsu
Hi Luis,

On Fri, 19 May 2017 19:28:54 +0200
"Luis R. Rodriguez"  wrote:

> On Fri, May 19, 2017 at 05:40:16PM +0200, Luis R. Rodriguez wrote:
> > On Fri, May 19, 2017 at 05:08:02AM +0200, Luis R. Rodriguez wrote:
> > > On Fri, May 19, 2017 at 02:44:14AM +0200, Luis R. Rodriguez wrote:
> > > > On Wed, May 17, 2017 at 10:53:06AM -0700, Kees Cook wrote:
> > > > > On Wed, May 17, 2017 at 9:40 AM, Luis R. Rodriguez 
> > > > >  wrote:
> > > > > > Yes, but I had killed that boot session again, so upon my next boot
> > > > > > I had a different layout, the ASLR gap was much larger:
> > > > > >
> > > > > > ---[ Modules ]---
> > > > > > 0xc000-0xc01b1728K  
> > > > > >  pte
> > > > > > 0xc01b-0xc01b1000   4K RW   
> > > > > >   GLB x  pte
> > > > > > 0xc01b1000-0xc01b2000   4K  
> > > > > >  pte
> > > > > > 0xc01b2000-0xc01c6000  80K ro   
> > > > > >   GLB x  pte
> > > > > > 0xc01c6000-0xc01cc000  24K ro   
> > > > > >   GLB NX pte
> > > > > > 0xc01cc000-0xc01d5000  36K RW   
> > > > > >   GLB NX pte
> > > > > >
> > > > > > As you can guess if we follow similar pattern the RW hole is the 
> > > > > > one this boot
> > > > > > warned about:
> > > > > >
> > > > > > [1.450483] x86/mm: Found insecure W+X mapping at address 
> > > > > > c01b/0xc01b
> > > > > > [1.451280] [ cut here ]
> > > > > > [1.451721] WARNING: CPU: 1 PID: 1 at 
> > > > > > arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
> > > > > > [1.452499] Modules linked in:
> > > > > > [1.452791] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 
> > > > > > 4.12.0-rc1-next-20170515+ #145
> > > > > >
> > > > > > I checked and indeed 0xc01b2000 is part of a module, it was 
> > > > > > not the first one
> > > > > > on the /proc/modules list but then again /proc/modules does not 
> > > > > > seem to have a specific
> > > > > > order other than perhaps being pegged into a linked list of modules 
> > > > > > once they go live,
> > > > > > and it seems its typically output backwards from when that 
> > > > > > happened, sorting that
> > > > > > by address we get:
> > > > > 
> > > > > Right, sorry, I'd expect it at the bottom of the list in
> > > > > /proc/modules, but that's fine, it's there.
> > > > > 
> > > > > >
> > > > > > root@piggy:~# cat /proc/modules | sort -k 6 | head -3
> > > > > > e1000 143360 0 - Live 0xc01b2000 (E)
> > > > > > mbcache 16384 1 ext4, Live 0xc01d6000 (E)
> > > > > > scsi_mod 217088 4 sg,sr_mod,sd_mod,libata, Live 0xc01df000 
> > > > > > (E)
> > > > > >
> > > > > > And this then seems to be the first module loaded:
> > > > > >
> > > > > > e1000 143360 0 - Live 0xc01b2000 (E)
> > > > > >
> > > > > > The output of dmesg seems to confirm this as per the list of 
> > > > > > modules sorted
> > > > > > as per above.
> > > > > >
> > > > > > >> Something touched the module gap and left it RW+x...
> > > > > >
> > > > > > Lemme try booting with e1000 renamed to e1000.ko.ignore and see how 
> > > > > > that goes.
> > > > > 
> > > > > Is it possible a module got loaded before e1000 and then unloaded?
> > > > > That seems odd, but maybe unload isn't cleaning up?
> > > > > 
> > > > > >> Are you able to bisect this?
> > > > > >
> > > > > > This issue has been present for a while so since I recall this I 
> > > > > > might be
> > > > > > able to reduce the number of needed target kernels to bisect. Lemme 
> > > > > > tinker
> > > > > > a bit and if no clear culprit comes up then will try bisect.
> > > > > 
> > > > > Okay, thanks!
> > > > 
> > > > Sorry to report that this issue is present since the feature's 
> > > > addition. So
> > > > the issue is there since its addition and is still present today. *But* 
> > > > it
> > > > may also be a configuration issue, given I have booted this guest 
> > > > *without*
> > > > this issue ...
> > > > 
> > > > So:
> > > > 
> > > > git checkout -b WX e1a58320a38dfa72be48a0f1a3a92273663ba6db
> > > > 
> > > > That boots with the warning. To help debug further I've minimized my 
> > > > modules
> > > > to only a few: scsi_mod, e1000, libata.
> > > > 
> > > > I suspect at this point this is not the fault of a particular module but
> > > > instead just an accounting semantic (>= or <= on an edge) but let's see.
> > > > 
> > > > I now boot on 4.3.0-rc3 on commit (e1a58320a38df ("x86/mm: Warn on W^X
> > > > mappings") and I with:
> > > > 
> > > > [0.949435] ------------[ cut here ]------------
> > > > [0.949992] WARNING: CPU: 2 PID: 1 at 
> > > > arch/x86/mm/dump_pagetables.c:225 note_page+0x635/0x7e0()
> > > > [0.950996] x86/mm: Found insecure W+X mapping at address 
> > > > 

Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0

2017-05-19 Thread Kees Cook
On Fri, May 19, 2017 at 12:18 PM, Andy Lutomirski  wrote:
> On Fri, May 19, 2017 at 12:16 PM, Kees Cook  wrote:
>> On Fri, May 19, 2017 at 11:27 AM, Andy Lutomirski  wrote:
>>> One thing I've pondered: can we make some debugging mode (kmemleak,
>>> perhaps?) check that freed memory is RW at the time it's freed?  I
>>> once wrote some buggy code that freed an R page and caused an OOPS
>>> much later, and this bug here seems likely to be some code that frees
>>> RWX memory.
>>
>> Which begs for even more checks: nothing should ever make a page RWX.
>> Either R, RW, or RX only... (or X too I guess, in the future).
>
> I could see pages being RWX temporarily during boot.  OTOH if we ban
> RWX outright (after very early boot, anyway), then catching code that
> messes up and leaves pages RWX gets much easier.

Right, early boot is kind of special. It'd be nice to have the check
there too, but I meant during normal runtime. We'd probably need to
adjust the set_memory_rw/ro/nx/x helpers so they have the correct
side-effects, instead of just controlling specific bits:

set_memory_rw() (RW_)
set_memory_ro() (R__)
set_memory_rx() (R_X)
set_memory_x() (__X)

That kind of refactoring might not be _too_ bad:

- add set_memory_rx()
- s/\bset_memory_x\b/set_memory_rx/g
- fix what breaks from expecting writable-executable memory
- adjust set_memory_rw() to drop x
- fix what breaks from expecting writable-executable memory
- adjust set_memory_ro() to drop x
- fix what breaks from expecting executable memory
- add set_memory_x() some day...
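
To make the difference between "controlling specific bits" and "setting
the whole state" concrete, here is a minimal sketch in Python rather than
kernel C. The Perm enum, toggle_x() and this set_memory_rx() are
hypothetical stand-ins for illustration only, not the kernel API; the four
states are the ones from the list above.

```python
from enum import Enum


class Perm(Enum):
    """Whole-state permissions from the list above; there is no RWX member."""
    RW = "RW_"
    RO = "R__"
    RX = "R_X"
    X = "__X"


def toggle_x(bits: frozenset) -> frozenset:
    """Bit-style helper (assumed model): only adds X and leaves W untouched."""
    return bits | {"X"}


def set_memory_rx(_old: Perm) -> Perm:
    """State-style helper (hypothetical): the result ignores the old state."""
    return Perm.RX


def is_wx(bits: frozenset) -> bool:
    """True when a bit set is both writable and executable."""
    return {"W", "X"} <= bits


if __name__ == "__main__":
    rw_bits = frozenset({"R", "W"})
    print(is_wx(toggle_x(rw_bits)))      # bit toggling can leave W+X behind
    print(set_memory_rx(Perm.RW).value)  # whole-state call ends at R_X
```

With whole-state helpers a W+X page cannot be expressed at all, which is
why each step in the list is followed by "fix what breaks".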

-Kees

-- 
Kees Cook
Pixel Security


Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0

2017-05-19 Thread Andy Lutomirski
On Fri, May 19, 2017 at 12:16 PM, Kees Cook  wrote:
> On Fri, May 19, 2017 at 11:27 AM, Andy Lutomirski  wrote:
>> One thing I've pondered: can we make some debugging mode (kmemleak,
>> perhaps?) check that freed memory is RW at the time it's freed?  I
>> once wrote some buggy code that freed an R page and caused an OOPS
>> much later, and this bug here seems likely to be some code that frees
>> RWX memory.
>
> Which begs for even more checks: nothing should ever make a page RWX.
> Either R, RW, or RX only... (or X too I guess, in the future).

I could see pages being RWX temporarily during boot.  OTOH if we ban
RWX outright (after very early boot, anyway), then catching code that
messes up and leaves pages RWX gets much easier.

--Andy


Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0

2017-05-19 Thread Kees Cook
On Fri, May 19, 2017 at 11:27 AM, Andy Lutomirski  wrote:
> One thing I've pondered: can we make some debugging mode (kmemleak,
> perhaps?) check that freed memory is RW at the time it's freed?  I
> once wrote some buggy code that freed an R page and caused an OOPS
> much later, and this bug here seems likely to be some code that frees
> RWX memory.

Which begs for even more checks: nothing should ever make a page RWX.
Either R, RW, or RX only... (or X too I guess, in the future).
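
As a rough illustration of that rule, the sketch below scans page table
dump text in the style of the "---[ Modules ]---" listing quoted in this
thread and reports ranges that are both RW and x. The addresses in SAMPLE
are made up (the archived ones are truncated), and the function names are
hypothetical; this is not how note_page() is implemented.

```python
def is_wx_line(line: str) -> bool:
    """True if a dump line carries both the RW flag and the x flag."""
    tokens = line.split()
    return "RW" in tokens and "x" in tokens


def find_wx_ranges(dump: str) -> list:
    """Return the address-range column of every W+X line."""
    return [line.split()[0] for line in dump.splitlines() if is_wx_line(line)]


# Made-up addresses; only the flag columns mirror the quoted listing.
SAMPLE = """\
0xf0000000-0xf0001000   4K RW   GLB x  pte
0xf0001000-0xf0002000   4K               pte
0xf0002000-0xf0016000  80K ro   GLB x  pte
0xf0016000-0xf001c000  24K ro   GLB NX pte
0xf001c000-0xf0025000  36K RW   GLB NX pte
"""

if __name__ == "__main__":
    for rng in find_wx_ranges(SAMPLE):
        print("W+X:", rng)
```

Only the first range is flagged: "ro ... x" is the allowed RX case and
"RW ... NX" is plain RW.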

-Kees

-- 
Kees Cook
Pixel Security


Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0

2017-05-19 Thread Andy Lutomirski
On Fri, May 19, 2017 at 10:35 AM, Catalin Marinas
 wrote:
> On Fri, May 19, 2017 at 05:40:16PM +0200, Luis R. Rodriguez wrote:
>> If the following is a legit forced way to get query the kernel to ask it
>> who owns a page then perhaps this technique can be used in the future to
>> figure out who the hell caused this. Catalin, can you confirm? In this
>> case this is perhaps not a leaked page but I am trying to abuse the
>> kmemleak debugfs API to query who allocated the page. Is that fine?
>>
>> [0.916771] WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:235 
>> note_page+0x63c/0x7e0
>> [0.917636] x86/mm: Found insecure W+X mapping at address 
>> c03d5000/0xc03d5000
>> [0.918502] Modules linked in:
>> [0.918819] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 
>> 4.11.0-mcgrof-force-config #340
>> [0.919631] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
>> rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
>> [0.920011] Call Trace:
>> [0.920011]  dump_stack+0x63/0x81
>> [0.920011]  __warn+0xcb/0xf0
>> [0.920011]  warn_slowpath_fmt+0x5a/0x80
>> [0.920011]  note_page+0x63c/0x7e0
>> [0.920011]  ptdump_walk_pgd_level_core+0x3b1/0x460
>> [0.920011]  ? 0x86c0
>> [0.920011]  ptdump_walk_pgd_level_checkwx+0x17/0x20
>> [0.920011]  mark_rodata_ro+0xf4/0x100
>> [0.920011]  ? rest_init+0x80/0x80
>> [0.920011]  kernel_init+0x2a/0x100
>> [0.920011]  ret_from_fork+0x2c/0x40
>> [0.925474] ---[ end trace dca00cd779490a2b ]---
>> [0.925959] x86/mm: Checked W+X mappings: FAILED, 1 W+X pages found.
>>
>> echo dump=0xc03d5000 > /sys/kernel/debug/kmemleak
>> dmesg | tail
>>
>> [   49.209565] kmemleak: Object 0xc03d5000 (size 335):
>> [   49.210814] kmemleak:   comm "swapper/0", pid 1, jiffies 4294892440
>> [   49.212148] kmemleak:   min_count = 2
>> [   49.212852] kmemleak:   count = 0
>> [   49.213363] kmemleak:   flags = 0x1
>> [   49.213363] kmemleak:   checksum = 0
>> [   49.213363] kmemleak:   backtrace:
>> [   49.213363]  kmemleak_alloc+0x4a/0xa0
>> [   49.213363]  __vmalloc_node_range+0x20a/0x2b0
>> [   49.213363]  module_alloc+0x67/0xc0
>> [   49.213363]  arch_ftrace_update_trampoline+0xba/0x260
>> [   49.213363]  ftrace_startup+0x90/0x210
>> [   49.213363]  register_ftrace_function+0x4b/0x60
>> [   49.213363]  arm_kprobe+0x84/0xe0
>> [   49.213363]  register_kprobe+0x56e/0x5b0
>> [   49.213363]  init_test_probes+0x61/0x560
>> [   49.213363]  init_kprobes+0x1e3/0x206
>> [   49.213363]  do_one_initcall+0x52/0x1a0
>> [   49.213363]  kernel_init_freeable+0x178/0x200
>> [   49.213363]  kernel_init+0xe/0x100
>> [   49.213363]  ret_from_fork+0x2c/0x40
>> [   49.213363]  0x
>
> You could as well use kmemleak this way since it tracks the memory
> allocations. However, it doesn't track alloc_pages and also doesn't
> track mapping existing pages (vmap etc.)

One thing I've pondered: can we make some debugging mode (kmemleak,
perhaps?) check that freed memory is RW at the time it's freed?  I
once wrote some buggy code that freed an R page and caused an OOPS
much later, and this bug here seems likely to be some code that frees
RWX memory.
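
A toy model of what such a debugging mode would do, in Python for
brevity. PageTracker and its methods are hypothetical and say nothing
about how kmemleak or the page allocator are actually structured; the
only idea taken from the paragraph above is "complain when a page is
freed while it is not plain RW".

```python
class PageTracker:
    """Toy allocator model; not kmemleak's real data structures."""

    def __init__(self):
        self.perms = {}        # page address -> permission string
        self.violations = []   # (address, permission) pairs seen at free time

    def alloc(self, addr: int) -> None:
        self.perms[addr] = "RW"  # fresh pages start out plain read-write

    def set_perm(self, addr: int, perm: str) -> None:
        if addr not in self.perms:
            raise KeyError(hex(addr))
        self.perms[addr] = perm

    def free(self, addr: int) -> bool:
        """Free a page; return False and record it if it was not RW."""
        perm = self.perms.pop(addr)
        if perm != "RW":
            self.violations.append((addr, perm))
            return False
        return True


if __name__ == "__main__":
    tracker = PageTracker()
    tracker.alloc(0x1000)
    tracker.set_perm(0x1000, "RWX")
    tracker.free(0x1000)
    print(tracker.violations)
```

The point of checking at free time is that the report names the code
that freed the page, instead of an unrelated user that hits it later.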

--Andy


Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0

2017-05-19 Thread Catalin Marinas
On Fri, May 19, 2017 at 05:40:16PM +0200, Luis R. Rodriguez wrote:
> If the following is a legit forced way to get query the kernel to ask it 
> who owns a page then perhaps this technique can be used in the future to
> figure out who the hell caused this. Catalin, can you confirm? In this
> case this is perhaps not a leaked page but I am trying to abuse the
> kmemleak debugfs API to query who allocated the page. Is that fine?
> 
> [0.916771] WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:235 
> note_page+0x63c/0x7e0
> [0.917636] x86/mm: Found insecure W+X mapping at address 
> c03d5000/0xc03d5000
> [0.918502] Modules linked in:
> [0.918819] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 
> 4.11.0-mcgrof-force-config #340
> [0.919631] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
> rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
> [0.920011] Call Trace:
> [0.920011]  dump_stack+0x63/0x81
> [0.920011]  __warn+0xcb/0xf0
> [0.920011]  warn_slowpath_fmt+0x5a/0x80
> [0.920011]  note_page+0x63c/0x7e0
> [0.920011]  ptdump_walk_pgd_level_core+0x3b1/0x460
> [0.920011]  ? 0x86c0
> [0.920011]  ptdump_walk_pgd_level_checkwx+0x17/0x20
> [0.920011]  mark_rodata_ro+0xf4/0x100
> [0.920011]  ? rest_init+0x80/0x80
> [0.920011]  kernel_init+0x2a/0x100
> [0.920011]  ret_from_fork+0x2c/0x40
> [0.925474] ---[ end trace dca00cd779490a2b ]---
> [0.925959] x86/mm: Checked W+X mappings: FAILED, 1 W+X pages found.
> 
> echo dump=0xc03d5000 > /sys/kernel/debug/kmemleak
> dmesg | tail
> 
> [   49.209565] kmemleak: Object 0xc03d5000 (size 335):
> [   49.210814] kmemleak:   comm "swapper/0", pid 1, jiffies 4294892440
> [   49.212148] kmemleak:   min_count = 2
> [   49.212852] kmemleak:   count = 0
> [   49.213363] kmemleak:   flags = 0x1
> [   49.213363] kmemleak:   checksum = 0
> [   49.213363] kmemleak:   backtrace:
> [   49.213363]  kmemleak_alloc+0x4a/0xa0
> [   49.213363]  __vmalloc_node_range+0x20a/0x2b0
> [   49.213363]  module_alloc+0x67/0xc0
> [   49.213363]  arch_ftrace_update_trampoline+0xba/0x260
> [   49.213363]  ftrace_startup+0x90/0x210
> [   49.213363]  register_ftrace_function+0x4b/0x60
> [   49.213363]  arm_kprobe+0x84/0xe0
> [   49.213363]  register_kprobe+0x56e/0x5b0
> [   49.213363]  init_test_probes+0x61/0x560
> [   49.213363]  init_kprobes+0x1e3/0x206
> [   49.213363]  do_one_initcall+0x52/0x1a0
> [   49.213363]  kernel_init_freeable+0x178/0x200
> [   49.213363]  kernel_init+0xe/0x100
> [   49.213363]  ret_from_fork+0x2c/0x40
> [   49.213363]  0x

You could just as well use kmemleak this way, since it tracks memory
allocations. However, it doesn't track alloc_pages(), nor does it track
mappings of existing pages (vmap etc.).
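
For anyone repeating the "echo dump=<addr> > /sys/kernel/debug/kmemleak"
trick, a small sketch of pulling the object and its allocation backtrace
out of the resulting dmesg text. The line shapes are taken from the output
quoted above; the function name and regular expressions are my own
assumptions and may need adjusting for other kernel versions.

```python
import re

# Patterns assume the dmesg layout quoted in this thread.
OBJECT_RE = re.compile(r"kmemleak: Object (0x[0-9a-f]+) \(size (\d+)\):")
BACKTRACE_RE = re.compile(r"kmemleak:\s+backtrace:")
FRAME_RE = re.compile(r"\]\s+([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+\s*$")


def parse_kmemleak_dump(log: str) -> dict:
    """Extract object address, size and backtrace symbols from dmesg text."""
    result = {"address": None, "size": None, "backtrace": []}
    in_backtrace = False
    for line in log.splitlines():
        obj = OBJECT_RE.search(line)
        if obj:
            result["address"] = obj.group(1)
            result["size"] = int(obj.group(2))
            in_backtrace = False
            continue
        if BACKTRACE_RE.search(line):
            in_backtrace = True
            continue
        if in_backtrace:
            frame = FRAME_RE.search(line)
            if frame:
                result["backtrace"].append(frame.group(1))
    return result


# Excerpt of the dump quoted above.
SAMPLE = """\
[   49.209565] kmemleak: Object 0xc03d5000 (size 335):
[   49.210814] kmemleak:   comm "swapper/0", pid 1, jiffies 4294892440
[   49.213363] kmemleak:   backtrace:
[   49.213363]  kmemleak_alloc+0x4a/0xa0
[   49.213363]  __vmalloc_node_range+0x20a/0x2b0
[   49.213363]  module_alloc+0x67/0xc0
[   49.213363]  arch_ftrace_update_trampoline+0xba/0x260
"""

if __name__ == "__main__":
    info = parse_kmemleak_dump(SAMPLE)
    print(info["address"], info["size"])
    print(" <- ".join(info["backtrace"]))
```

This only helps for objects kmemleak knows about, so the limits mentioned
above (no alloc_pages, no vmap of existing pages) still apply.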

-- 
Catalin


Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0

2017-05-19 Thread Luis R. Rodriguez
On Fri, May 19, 2017 at 05:40:16PM +0200, Luis R. Rodriguez wrote:
> On Fri, May 19, 2017 at 05:08:02AM +0200, Luis R. Rodriguez wrote:
> > On Fri, May 19, 2017 at 02:44:14AM +0200, Luis R. Rodriguez wrote:
> > > On Wed, May 17, 2017 at 10:53:06AM -0700, Kees Cook wrote:
> > > > On Wed, May 17, 2017 at 9:40 AM, Luis R. Rodriguez  
> > > > wrote:
> > > > > Yes, but I had killed that boot session again, so upon my next boot
> > > > > I had a different layout, the ASLR gap was much larger:
> > > > >
> > > > > ---[ Modules ]---
> > > > > 0xc000-0xc01b1728K
> > > > >pte
> > > > > 0xc01b-0xc01b1000   4K RW 
> > > > > GLB x  pte
> > > > > 0xc01b1000-0xc01b2000   4K
> > > > >pte
> > > > > 0xc01b2000-0xc01c6000  80K ro 
> > > > > GLB x  pte
> > > > > 0xc01c6000-0xc01cc000  24K ro 
> > > > > GLB NX pte
> > > > > 0xc01cc000-0xc01d5000  36K RW 
> > > > > GLB NX pte
> > > > >
> > > > > As you can guess if we follow similar pattern the RW hole is the one 
> > > > > this boot
> > > > > warned about:
> > > > >
> > > > > [1.450483] x86/mm: Found insecure W+X mapping at address 
> > > > > c01b/0xc01b
> > > > > > [1.451280] ------------[ cut here ]------------
> > > > > [1.451721] WARNING: CPU: 1 PID: 1 at 
> > > > > arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
> > > > > [1.452499] Modules linked in:
> > > > > [1.452791] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 
> > > > > 4.12.0-rc1-next-20170515+ #145
> > > > >
> > > > > I checked and indeed 0xc01b2000 is part of a module, it was 
> > > > > not the first one
> > > > > on the /proc/modules list but then again /proc/modules does not seem 
> > > > > to have a specific
> > > > > order other than perhaps being pegged into a linked list of modules 
> > > > > once they go live,
> > > > > and it seems its typically output backwards from when that happened, 
> > > > > sorting that
> > > > > by address we get:
> > > > 
> > > > Right, sorry, I'd expect it at the bottom of the list in
> > > > /proc/modules, but that's fine, it's there.
> > > > 
> > > > >
> > > > > root@piggy:~# cat /proc/modules | sort -k 6 | head -3
> > > > > e1000 143360 0 - Live 0xc01b2000 (E)
> > > > > mbcache 16384 1 ext4, Live 0xc01d6000 (E)
> > > > > scsi_mod 217088 4 sg,sr_mod,sd_mod,libata, Live 0xc01df000 (E)
> > > > >
> > > > > And this then seems to be the first module loaded:
> > > > >
> > > > > e1000 143360 0 - Live 0xc01b2000 (E)
> > > > >
> > > > > The output of dmesg seems to confirm this as per the list of modules 
> > > > > sorted
> > > > > as per above.
> > > > >
> > > > > >> Something touched the module gap and left it RW+x...
> > > > >
> > > > > Lemme try booting with e1000 renamed to e1000.ko.ignore and see how 
> > > > > that goes.
> > > > 
> > > > Is it possible a module got loaded before e1000 and then unloaded?
> > > > That seems odd, but maybe unload isn't cleaning up?
> > > > 
> > > > >> Are you able to bisect this?
> > > > >
> > > > > This issue has been present for a while so since I recall this I 
> > > > > might be
> > > > > able to reduce the number of needed target kernels to bisect. Lemme 
> > > > > tinker
> > > > > a bit and if no clear culprit comes up then will try bisect.
> > > > 
> > > > Okay, thanks!
> > > 
> > > Sorry to report that this issue is present since the feature's addition. 
> > > So
> > > the issue is there since its addition and is still present today. *But* it
> > > may also be a configuration issue, given I have booted this guest 
> > > *without*
> > > this issue ...
> > > 
> > > So:
> > > 
> > > git checkout -b WX e1a58320a38dfa72be48a0f1a3a92273663ba6db
> > > 
> > > That boots with the warning. To help debug further I've minimized my 
> > > modules
> > > to only a few: scsi_mod, e1000, libata.
> > > 
> > > I suspect at this point this is not the fault of a particular module but
> > > instead just an accounting semantic (>= or <= on an edge) but let's see.
> > > 
> > > I now boot on 4.3.0-rc3 on commit (e1a58320a38df ("x86/mm: Warn on W^X
> > > mappings") and I with:
> > > 
> > > > [0.949435] ------------[ cut here ]------------
> > > [0.949992] WARNING: CPU: 2 PID: 1 at 
> > > arch/x86/mm/dump_pagetables.c:225 note_page+0x635/0x7e0()
> > > [0.950996] x86/mm: Found insecure W+X mapping at address 
> > > c000/0xc000
> > > [0.951814] Modules linked in: 
> > >   
> > > [0.952123] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 
> > > 4.3.0-rc3-FINAL-TEST-WITH-WX-NOFLOPPY+ #365
> > > [0.952929] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), 
> > > BIOS 

Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0

2017-05-19 Thread Luis R. Rodriguez
On Fri, May 19, 2017 at 05:08:02AM +0200, Luis R. Rodriguez wrote:
> On Fri, May 19, 2017 at 02:44:14AM +0200, Luis R. Rodriguez wrote:
> > On Wed, May 17, 2017 at 10:53:06AM -0700, Kees Cook wrote:
> > > On Wed, May 17, 2017 at 9:40 AM, Luis R. Rodriguez  
> > > wrote:
> > > > Yes, but I had killed that boot session again, so upon my next boot
> > > > I had a different layout, the ASLR gap was much larger:
> > > >
> > > > ---[ Modules ]---
> > > > 0xc000-0xc01b1728K  
> > > >  pte
> > > > 0xc01b-0xc01b1000   4K RW   
> > > >   GLB x  pte
> > > > 0xc01b1000-0xc01b2000   4K  
> > > >  pte
> > > > 0xc01b2000-0xc01c6000  80K ro   
> > > >   GLB x  pte
> > > > 0xc01c6000-0xc01cc000  24K ro   
> > > >   GLB NX pte
> > > > 0xc01cc000-0xc01d5000  36K RW   
> > > >   GLB NX pte
> > > >
> > > > > As you can guess, if we follow a similar pattern the RW hole is the
> > > > > one this boot warned about:
> > > >
> > > > [1.450483] x86/mm: Found insecure W+X mapping at address 
> > > > c01b/0xc01b
> > > > > [1.451280] ------------[ cut here ]------------
> > > > [1.451721] WARNING: CPU: 1 PID: 1 at 
> > > > arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
> > > > [1.452499] Modules linked in:
> > > > [1.452791] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 
> > > > 4.12.0-rc1-next-20170515+ #145
> > > >
> > > > > I checked and indeed 0xc01b2000 is part of a module. It was not the
> > > > > first one on the /proc/modules list, but then again /proc/modules
> > > > > does not seem to have a specific order other than perhaps being
> > > > > pegged into a linked list of modules once they go live, and it seems
> > > > > it's typically output backwards from when that happened. Sorting
> > > > > that by address we get:
> > > 
> > > Right, sorry, I'd expect it at the bottom of the list in
> > > /proc/modules, but that's fine, it's there.
> > > 
> > > >
> > > > root@piggy:~# cat /proc/modules | sort -k 6 | head -3
> > > > e1000 143360 0 - Live 0xc01b2000 (E)
> > > > mbcache 16384 1 ext4, Live 0xc01d6000 (E)
> > > > scsi_mod 217088 4 sg,sr_mod,sd_mod,libata, Live 0xc01df000 (E)
> > > >
> > > > And this then seems to be the first module loaded:
> > > >
> > > > e1000 143360 0 - Live 0xc01b2000 (E)
> > > >
> > > > The output of dmesg seems to confirm this as per the list of modules 
> > > > sorted
> > > > as per above.
> > > >
> > > >> Something touched the module gap and left is RW+x...
> > > >
> > > > Lemme try booting with e1000 renamed to e1000.ko.ignore and see how 
> > > > that goes.
> > > 
> > > Is it possible a module got loaded before e1000 and then unloaded?
> > > That seems odd, but maybe unload isn't cleaning up?
> > > 
> > > >> Are you able to bisect this?
> > > >
> > > > This issue has been present for a while so since I recall this I might 
> > > > be
> > > > able to reduce the number of needed target kernels to bisect. Lemme 
> > > > tinker
> > > > a bit and if no clear culprit comes up then will try bisect.
> > > 
> > > Okay, thanks!
> > 
> > Sorry to report that this issue is present since the feature's addition. So
> > the issue is there since its addition and is still present today. *But* it
> > may also be a configuration issue, given I have booted this guest *without*
> > this issue ...
> > 
> > So:
> > 
> > git checkout -b WX e1a58320a38dfa72be48a0f1a3a92273663ba6db
> > 
> > That boots with the warning. To help debug further I've minimized my modules
> > to only a few: scsi_mod, e1000, libata.
> > 
> > I suspect at this point this is not the fault of a particular module but
> > instead just an accounting semantic (>= or <= on an edge) but let's see.
> > 
> > I now boot on 4.3.0-rc3 on commit (e1a58320a38df ("x86/mm: Warn on W^X
> > mappings") and I with:
> > 
> > [0.949435] [ cut here ] 
> > 
> > [0.949992] WARNING: CPU: 2 PID: 1 at arch/x86/mm/dump_pagetables.c:225 
> > note_page+0x635/0x7e0()
> > [0.950996] x86/mm: Found insecure W+X mapping at address 
> > c000/0xc000
> > [0.951814] Modules linked in:   
> > 
> > [0.952123] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 
> > 4.3.0-rc3-FINAL-TEST-WITH-WX-NOFLOPPY+ #365
> > [0.952929] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
> > rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
> > [0.954033]   1f722925 88013a5d7d40 
> > 812ff335
> > [0.954742]  88013a5d7d88 88013a5d7d78 81079be2 
> > 88013a5d7e90
> > [0.955522]   

Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0

2017-05-18 Thread Luis R. Rodriguez
On Fri, May 19, 2017 at 02:44:14AM +0200, Luis R. Rodriguez wrote:
> On Wed, May 17, 2017 at 10:53:06AM -0700, Kees Cook wrote:
> > On Wed, May 17, 2017 at 9:40 AM, Luis R. Rodriguez  
> > wrote:
> > > Yes, but I had killed that boot session again, so upon my next boot
> > > I had a different layout, the ASLR gap was much larger:
> > >
> > > ---[ Modules ]---
> > > 0xc000-0xc01b1728K
> > >pte
> > > 0xc01b-0xc01b1000   4K RW 
> > > GLB x  pte
> > > 0xc01b1000-0xc01b2000   4K
> > >pte
> > > 0xc01b2000-0xc01c6000  80K ro 
> > > GLB x  pte
> > > 0xc01c6000-0xc01cc000  24K ro 
> > > GLB NX pte
> > > 0xc01cc000-0xc01d5000  36K RW 
> > > GLB NX pte
> > >
> > > As you can guess if we follow similar pattern the RW hole is the one this 
> > > boot
> > > warned about:
> > >
> > > [1.450483] x86/mm: Found insecure W+X mapping at address 
> > > c01b/0xc01b
> > > [1.451280] [ cut here ]
> > > [1.451721] WARNING: CPU: 1 PID: 1 at 
> > > arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
> > > [1.452499] Modules linked in:
> > > [1.452791] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 
> > > 4.12.0-rc1-next-20170515+ #145
> > >
> > > I checked and indeed 0xc01b2000 is part of a module, it was not 
> > > the first one
> > > on the /proc/modules list but then again /proc/modules does not seem to 
> > > have a specific
> > > order other than perhaps being pegged into a linked list of modules once 
> > > they go live,
> > > and it seems its typically output backwards from when that happened, 
> > > sorting that
> > > by address we get:
> > 
> > Right, sorry, I'd expect it at the bottom of the list in
> > /proc/modules, but that's fine, it's there.
> > 
> > >
> > > root@piggy:~# cat /proc/modules | sort -k 6 | head -3
> > > e1000 143360 0 - Live 0xc01b2000 (E)
> > > mbcache 16384 1 ext4, Live 0xc01d6000 (E)
> > > scsi_mod 217088 4 sg,sr_mod,sd_mod,libata, Live 0xc01df000 (E)
> > >
> > > And this then seems to be the first module loaded:
> > >
> > > e1000 143360 0 - Live 0xc01b2000 (E)
> > >
> > > The output of dmesg seems to confirm this as per the list of modules 
> > > sorted
> > > as per above.
> > >
> > >> Something touched the module gap and left is RW+x...
> > >
> > > Lemme try booting with e1000 renamed to e1000.ko.ignore and see how that 
> > > goes.
> > 
> > Is it possible a module got loaded before e1000 and then unloaded?
> > That seems odd, but maybe unload isn't cleaning up?
> > 
> > >> Are you able to bisect this?
> > >
> > > This issue has been present for a while so since I recall this I might be
> > > able to reduce the number of needed target kernels to bisect. Lemme tinker
> > > a bit and if no clear culprit comes up then will try bisect.
> > 
> > Okay, thanks!
> 
> Sorry to report that this issue is present since the feature's addition. So
> the issue is there since its addition and is still present today. *But* it
> may also be a configuration issue, given I have booted this guest *without*
> this issue ...
> 
> So:
> 
> git checkout -b WX e1a58320a38dfa72be48a0f1a3a92273663ba6db
> 
> That boots with the warning. To help debug further I've minimized my modules
> to only a few: scsi_mod, e1000, libata.
> 
> I suspect at this point this is not the fault of a particular module but
> instead just an accounting semantic (>= or <= on an edge) but let's see.
> 
> I now boot on 4.3.0-rc3 on commit (e1a58320a38df ("x86/mm: Warn on W^X
> mappings") and I with:
> 
> [0.949435] [ cut here ]   
>   
> [0.949992] WARNING: CPU: 2 PID: 1 at arch/x86/mm/dump_pagetables.c:225 
> note_page+0x635/0x7e0()
> [0.950996] x86/mm: Found insecure W+X mapping at address 
> c000/0xc000
> [0.951814] Modules linked in: 
>   
> [0.952123] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 
> 4.3.0-rc3-FINAL-TEST-WITH-WX-NOFLOPPY+ #365
> [0.952929] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
> rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
> [0.954033]   1f722925 88013a5d7d40 
> 812ff335
> [0.954742]  88013a5d7d88 88013a5d7d78 81079be2 
> 88013a5d7e90
> [0.955522]   0004  
> 
> [0.956256] Call Trace:
>   
> [0.956496]  [] dump_stack+0x44/0x5f 
>   
> [0.956953]  [] warn_slowpath_common+0x82/0xc0   
>   
> [0.957519]  [] 

Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0

2017-05-18 Thread Luis R. Rodriguez
On Wed, May 17, 2017 at 10:53:06AM -0700, Kees Cook wrote:
> On Wed, May 17, 2017 at 9:40 AM, Luis R. Rodriguez  wrote:
> > Yes, but I had killed that boot session again, so upon my next boot
> > I had a different layout, the ASLR gap was much larger:
> >
> > ---[ Modules ]---
> > 0xc000-0xc01b1728K  
> >  pte
> > 0xc01b-0xc01b1000   4K RW 
> > GLB x  pte
> > 0xc01b1000-0xc01b2000   4K  
> >  pte
> > 0xc01b2000-0xc01c6000  80K ro 
> > GLB x  pte
> > 0xc01c6000-0xc01cc000  24K ro 
> > GLB NX pte
> > 0xc01cc000-0xc01d5000  36K RW 
> > GLB NX pte
> >
> > As you can guess if we follow similar pattern the RW hole is the one this 
> > boot
> > warned about:
> >
> > [1.450483] x86/mm: Found insecure W+X mapping at address 
> > c01b/0xc01b
> > [1.451280] [ cut here ]
> > [1.451721] WARNING: CPU: 1 PID: 1 at arch/x86/mm/dump_pagetables.c:236 
> > note_page+0x630/0x7e0
> > [1.452499] Modules linked in:
> > [1.452791] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 
> > 4.12.0-rc1-next-20170515+ #145
> >
> > I checked and indeed 0xc01b2000 is part of a module, it was not the 
> > first one
> > on the /proc/modules list but then again /proc/modules does not seem to 
> > have a specific
> > order other than perhaps being pegged into a linked list of modules once 
> > they go live,
> > and it seems its typically output backwards from when that happened, 
> > sorting that
> > by address we get:
> 
> Right, sorry, I'd expect it at the bottom of the list in
> /proc/modules, but that's fine, it's there.
> 
> >
> > root@piggy:~# cat /proc/modules | sort -k 6 | head -3
> > e1000 143360 0 - Live 0xc01b2000 (E)
> > mbcache 16384 1 ext4, Live 0xc01d6000 (E)
> > scsi_mod 217088 4 sg,sr_mod,sd_mod,libata, Live 0xc01df000 (E)
> >
> > And this then seems to be the first module loaded:
> >
> > e1000 143360 0 - Live 0xc01b2000 (E)
> >
> > The output of dmesg seems to confirm this as per the list of modules sorted
> > as per above.
> >
> >> Something touched the module gap and left is RW+x...
> >
> > Lemme try booting with e1000 renamed to e1000.ko.ignore and see how that 
> > goes.
> 
> Is it possible a module got loaded before e1000 and then unloaded?
> That seems odd, but maybe unload isn't cleaning up?
> 
> >> Are you able to bisect this?
> >
> > This issue has been present for a while so since I recall this I might be
> > able to reduce the number of needed target kernels to bisect. Lemme tinker
> > a bit and if no clear culprit comes up then will try bisect.
> 
> Okay, thanks!

Sorry to report that this issue has been present since the feature was added,
and it is still present today. *But* it may also be a configuration issue,
given I have booted this guest *without* this issue ...

So:

git checkout -b WX e1a58320a38dfa72be48a0f1a3a92273663ba6db

That boots with the warning. To help debug further I've minimized my modules
to only a few: scsi_mod, e1000, libata.

I suspect at this point this is not the fault of a particular module but
instead just an accounting semantic (>= or <= on an edge) but let's see.
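
For reference, the condition behind this warning can be sketched as below.
This is a minimal Python illustration, not the kernel code: it assumes the
x86-64 page-table entry layout where bit 1 is the writable (RW) bit and bit 63
is the no-execute (NX) bit, and the helper name is_wx is made up for this
example. A mapping counts as W+X when RW is set and NX is clear.

```python
# Sketch of a W+X test on x86-64 page-table entry flags.
# Assumption: bit positions follow the x86-64 PTE layout; is_wx is a
# hypothetical helper, not a kernel function.
PAGE_PRESENT = 1 << 0   # entry is present
PAGE_RW = 1 << 1        # writable bit
PAGE_NX = 1 << 63       # no-execute bit


def is_wx(prot: int) -> bool:
    """Return True if the flags describe a writable and executable mapping."""
    return bool(prot & PAGE_RW) and not (prot & PAGE_NX)


if __name__ == "__main__":
    # The three flag combinations that show up in the dumps in this thread.
    rows = {
        "RW x": PAGE_PRESENT | PAGE_RW,
        "ro x": PAGE_PRESENT,
        "RW NX": PAGE_PRESENT | PAGE_RW | PAGE_NX,
    }
    for label, prot in rows.items():
        print(label, "->", "W+X" if is_wx(prot) else "ok")
```

Only the "RW ... x" row in the page-table dump satisfies this test, which is
consistent with a single W+X page being reported.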

I now boot 4.3.0-rc3 at commit e1a58320a38df ("x86/mm: Warn on W^X
mappings") and I get:

[0.949435] [ cut here ]
[0.949992] WARNING: CPU: 2 PID: 1 at arch/x86/mm/dump_pagetables.c:225 note_page+0x635/0x7e0()
[0.950996] x86/mm: Found insecure W+X mapping at address c000/0xc000
[0.951814] Modules linked in:
[0.952123] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.3.0-rc3-FINAL-TEST-WITH-WX-NOFLOPPY+ #365
[0.952929] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
[0.954033]   1f722925 88013a5d7d40 812ff335
[0.954742]  88013a5d7d88 88013a5d7d78 81079be2 88013a5d7e90
[0.955522]   0004
[0.956256] Call Trace:
[0.956496]  [] dump_stack+0x44/0x5f
[0.956953]  [] warn_slowpath_common+0x82/0xc0
[0.957519]  [] warn_slowpath_fmt+0x5c/0x80
[0.958066]  [] note_page+0x635/0x7e0
[0.958595]  [] ptdump_walk_pgd_level_core+0x2eb/0x410
[0.959219]  [] ptdump_walk_pgd_level_checkwx+0x17/0x20
[0.959856]  [] mark_rodata_ro+0xed/0x100
[0.960372]  [] ? rest_init+0x80/0x80

Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0

2017-05-17 Thread Kees Cook
On Wed, May 17, 2017 at 9:40 AM, Luis R. Rodriguez  wrote:
> Yes, but I had killed that boot session again, so upon my next boot
> I had a different layout, the ASLR gap was much larger:
>
> ---[ Modules ]---
> 0xc000-0xc01b1728K
>pte
> 0xc01b-0xc01b1000   4K RW GLB 
> x  pte
> 0xc01b1000-0xc01b2000   4K
>pte
> 0xc01b2000-0xc01c6000  80K ro GLB 
> x  pte
> 0xc01c6000-0xc01cc000  24K ro GLB 
> NX pte
> 0xc01cc000-0xc01d5000  36K RW GLB 
> NX pte
>
> As you can guess if we follow similar pattern the RW hole is the one this boot
> warned about:
>
> [1.450483] x86/mm: Found insecure W+X mapping at address 
> c01b/0xc01b
> [1.451280] [ cut here ]
> [1.451721] WARNING: CPU: 1 PID: 1 at arch/x86/mm/dump_pagetables.c:236 
> note_page+0x630/0x7e0
> [1.452499] Modules linked in:
> [1.452791] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 
> 4.12.0-rc1-next-20170515+ #145
>
> I checked and indeed 0xc01b2000 is part of a module, it was not the 
> first one
> on the /proc/modules list but then again /proc/modules does not seem to have 
> a specific
> order other than perhaps being pegged into a linked list of modules once they 
> go live,
> and it seems its typically output backwards from when that happened, sorting 
> that
> by address we get:

Right, sorry, I'd expect it at the bottom of the list in
/proc/modules, but that's fine, it's there.

>
> root@piggy:~# cat /proc/modules | sort -k 6 | head -3
> e1000 143360 0 - Live 0xc01b2000 (E)
> mbcache 16384 1 ext4, Live 0xc01d6000 (E)
> scsi_mod 217088 4 sg,sr_mod,sd_mod,libata, Live 0xc01df000 (E)
>
> And this then seems to be the first module loaded:
>
> e1000 143360 0 - Live 0xc01b2000 (E)
>
> The output of dmesg seems to confirm this as per the list of modules sorted
> as per above.
>
>> Something touched the module gap and left is RW+x...
>
> Lemme try booting with e1000 renamed to e1000.ko.ignore and see how that goes.

Is it possible a module got loaded before e1000 and then unloaded?
That seems odd, but maybe unload isn't cleaning up?

>> Are you able to bisect this?
>
> This issue has been present for a while so since I recall this I might be
> able to reduce the number of needed target kernels to bisect. Lemme tinker
> a bit and if no clear culprit comes up then will try bisect.

Okay, thanks!

-Kees


-- 
Kees Cook
Pixel Security
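
A note on the address-sorted /proc/modules listing quoted above: `sort -k 6`
compares the address field as text, which agrees with numeric order only while
every address has the same width and case. A minimal Python sketch of a
numeric sort follows. The function name parse_modules is hypothetical, and the
sample lines use made-up full-width addresses, since the addresses in this
archive are truncated.

```python
# Sketch: sort /proc/modules entries by load address, numerically.
# Assumption: each line has the usual fields
#   name size refcount deps state address [taint flags]
# parse_modules and SAMPLE are illustrative, not from the thread.

def parse_modules(text):
    """Return (address, name, size) tuples sorted by ascending address."""
    mods = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        name = fields[0]
        size = int(fields[1])
        addr = int(fields[5], 16)  # base 16 accepts the "0x" prefix
        mods.append((addr, name, size))
    return sorted(mods)


# Made-up addresses for illustration only.
SAMPLE = """\
scsi_mod 217088 4 sg,sr_mod,sd_mod,libata, Live 0xffffffffc01df000 (E)
e1000 143360 0 - Live 0xffffffffc01b2000 (E)
mbcache 16384 1 ext4, Live 0xffffffffc01d6000 (E)
"""

if __name__ == "__main__":
    for addr, name, size in parse_modules(SAMPLE):
        print(f"{addr:#x} {name} {size}")
```

On a live system the text would come from reading /proc/modules instead of
SAMPLE.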


Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0

2017-05-17 Thread Luis R. Rodriguez
On Mon, May 15, 2017 at 05:12:18PM -0700, Kees Cook wrote:
> On Mon, May 15, 2017 at 4:45 PM, Luis R. Rodriguez  wrote:
> > On Mon, May 15, 2017 at 3:57 PM, Kees Cook  wrote:
> >> On Mon, May 15, 2017 at 3:15 PM, Luis R. Rodriguez  
> >> wrote:
> >>> On Tue, May 16, 2017 at 12:06:50AM +0200, Luis R. Rodriguez wrote:
> >>>> Using QEMU emulator version 2.7.94 (v2.8.0-rc4-dirty)
> >>>>
> >>>> I will try updating my distro package for qemu and see if perhaps its
> >>>> this
> >>>> and for the other odd fork issue I reported [0].
> >>>>
> >>>> [0]
> >>>> https://lkml.kernel.org/r/CAB=NE6VZXq3y-3pfouYTBUco2Cq2xqoLZrgDFdVx+_=_=sw...@mail.gmail.com
> >>>
> >>> Yeah nope, using my distribution latest:
> >>>
> >>> QEMU emulator version 2.8.0(openSUSE Tumbleweed)
> >>>
> >>> And still both issues are present.
> >>>
> >>>   Luis
> >>
> >> Can you enable CONFIG_X86_PTDUMP=y and then find out what is located
> >> at c0288000 via /sys/kernel/debug/kernel_page_tables ?
> >
> > Sure thing.
> >
> > Recompiled with this enabled, new warning:
> >
> > [0.891559] x86/mm: Found insecure W+X mapping at address
> > c00e4000/0xc00e4000
> > [0.892394] [ cut here ]
> > [0.892834] WARNING: CPU: 0 PID: 1 at
> > arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
> > [0.893674] Modules linked in:
> > [0.893972] CPU: 0 PID: 1 Comm: swapper/0 Not tainted
> > 4.12.0-rc1-next-20170515+ #145
> > [0.894687] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> > BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
> > [0.895828] task: 8ed7fa5ccc80 task.stack: ae390063
> > [0.896403] RIP: 0010:note_page+0x630/0x7e0
> > [0.896780] RSP: 0018:ae3900633df0 EFLAGS: 00010286
> > [0.897271] RAX: 0051 RBX: ae3900633e88 RCX: 
> > 9b456708
> > [0.897940] RDX:  RSI: 0096 RDI: 
> > 0246
> > [0.898624] RBP: ae3900633e28 R08: 203a6d6d2f363878 R09: 
> > 0165
> > [0.899314] R10: ae3900633dd8 R11: 736e6920646e756f R12: 
> > 
> > [0.899987] R13: 0004 R14:  R15: 
> > 
> > [0.900629] FS:  () GS:8ed7ffc0()
> > knlGS:
> > [0.901398] CS:  0010 DS:  ES:  CR0: 80050033
> > [0.901908] CR2:  CR3: 000118009000 CR4: 
> > 06f0
> > [0.902590] Call Trace:
> > [0.902827]  ptdump_walk_pgd_level_core+0x3e7/0x490
> > [0.903274]  ? 0x9a80
> > [0.903595]  ptdump_walk_pgd_level_checkwx+0x17/0x20
> > [0.904064]  mark_rodata_ro+0xf4/0x100
> > [0.904423]  ? rest_init+0x80/0x80
> > [0.904744]  kernel_init+0x2f/0x100
> > [0.905068]  ret_from_fork+0x2c/0x40
> > [0.905393] Code: 48 c7 43 28 00 00 00 00 48 89 43 20 e9 05 fd ff
> > ff 48 8b 73 10 48 c7 c7 28 36 1e 9b c6 05 c8 eb bc 00 01 48 89 f2 e8
> > cd fc 11 00 <0f> ff e9 1f fa ff ff 48 8b 70 20 48 c7 c7 65 b2 1e 9b e8
> > b6 fc
> > [0.907173] ---[ end trace 878b39cb0c248e66 ]---
> > [0.907655] x86/mm: Checked W+X mappings: FAILED, 1 W+X pages found.
> >
> > And c00e4000 is:
> >
> > ---[ Modules ]---
> > 0xc000-0xc00e4000 912K
> >   pte
> > 0xc00e4000-0xc00e5000   4K RW
> >GLB x  pte
> >
> > In case someone needs the full /sys/kernel/debug/kernel_page_tables file:
> >
> > http://drvbp1.linux-foundation.org/~mcgrof/2017/05/15/kernel_page_tables/piggy-4.12.0-rc1-next-20170515-page-tables.txt
> 
> ---[ Modules ]---
> 0xc000-0xc00e4000 912K
>   pte
> 
> This should be the modules ASLR gap
> 
> 0xc00e4000-0xc00e5000   4K RW
>GLB x  pte
> 
> This is part of the same gap, but it's RW+x strangely?
> 
> 0xc00e5000-0xc00e6000   4K
>   pte
> 
> This is more of the gap?
> 
> 0xc00e6000-0xc00fa000  80K ro
>GLB x  pte
> 0xc00fa000-0xc010c000  72K ro
>GLB NX pte
> 0xc010c000-0xc011b000  60K RW
>GLB NX pte
> 
> This should be the first loaded module. Can you check that
> 0xc00e6000 matches the first module in /proc/modules?

Yes, but I had already killed that boot session, so on my next boot
I had a different layout; the ASLR gap was much larger:

---[ Modules ]---
0xc000-0xc01b         1728K           pte
0xc01b-0xc01b1000        4K RW GLB x  pte
0xc01b1000-0xc01b2000    4K           pte
0xc01b2000-0xc01c6000   80K ro GLB x  pte
0xc01c6000-0xc01cc000   24K ro GLB NX pte
0xc01cc000-0xc01d5000   36K

Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0

2017-05-15 Thread Kees Cook
On Mon, May 15, 2017 at 4:45 PM, Luis R. Rodriguez  wrote:
> On Mon, May 15, 2017 at 3:57 PM, Kees Cook  wrote:
>> On Mon, May 15, 2017 at 3:15 PM, Luis R. Rodriguez  wrote:
>>> On Tue, May 16, 2017 at 12:06:50AM +0200, Luis R. Rodriguez wrote:
>>>> Using QEMU emulator version 2.7.94 (v2.8.0-rc4-dirty)
>>>>
>>>> I will try updating my distro package for qemu and see if perhaps its this
>>>> and for the other odd fork issue I reported [0].
>>>>
>>>> [0]
>>>> https://lkml.kernel.org/r/CAB=NE6VZXq3y-3pfouYTBUco2Cq2xqoLZrgDFdVx+_=_=sw...@mail.gmail.com
>>>
>>> Yeah nope, using my distribution latest:
>>>
>>> QEMU emulator version 2.8.0(openSUSE Tumbleweed)
>>>
>>> And still both issues are present.
>>>
>>>   Luis
>>
>> Can you enable CONFIG_X86_PTDUMP=y and then find out what is located
>> at c0288000 via /sys/kernel/debug/kernel_page_tables ?
>
> Sure thing.
>
> Recompiled with this enabled, new warning:
>
> [0.891559] x86/mm: Found insecure W+X mapping at address
> c00e4000/0xc00e4000
> [0.892394] [ cut here ]
> [0.892834] WARNING: CPU: 0 PID: 1 at
> arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
> [0.893674] Modules linked in:
> [0.893972] CPU: 0 PID: 1 Comm: swapper/0 Not tainted
> 4.12.0-rc1-next-20170515+ #145
> [0.894687] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
> [0.895828] task: 8ed7fa5ccc80 task.stack: ae390063
> [0.896403] RIP: 0010:note_page+0x630/0x7e0
> [0.896780] RSP: 0018:ae3900633df0 EFLAGS: 00010286
> [0.897271] RAX: 0051 RBX: ae3900633e88 RCX: 
> 9b456708
> [0.897940] RDX:  RSI: 0096 RDI: 
> 0246
> [0.898624] RBP: ae3900633e28 R08: 203a6d6d2f363878 R09: 
> 0165
> [0.899314] R10: ae3900633dd8 R11: 736e6920646e756f R12: 
> 
> [0.899987] R13: 0004 R14:  R15: 
> 
> [0.900629] FS:  () GS:8ed7ffc0()
> knlGS:
> [0.901398] CS:  0010 DS:  ES:  CR0: 80050033
> [0.901908] CR2:  CR3: 000118009000 CR4: 
> 06f0
> [0.902590] Call Trace:
> [0.902827]  ptdump_walk_pgd_level_core+0x3e7/0x490
> [0.903274]  ? 0x9a80
> [0.903595]  ptdump_walk_pgd_level_checkwx+0x17/0x20
> [0.904064]  mark_rodata_ro+0xf4/0x100
> [0.904423]  ? rest_init+0x80/0x80
> [0.904744]  kernel_init+0x2f/0x100
> [0.905068]  ret_from_fork+0x2c/0x40
> [0.905393] Code: 48 c7 43 28 00 00 00 00 48 89 43 20 e9 05 fd ff
> ff 48 8b 73 10 48 c7 c7 28 36 1e 9b c6 05 c8 eb bc 00 01 48 89 f2 e8
> cd fc 11 00 <0f> ff e9 1f fa ff ff 48 8b 70 20 48 c7 c7 65 b2 1e 9b e8
> b6 fc
> [0.907173] ---[ end trace 878b39cb0c248e66 ]---
> [0.907655] x86/mm: Checked W+X mappings: FAILED, 1 W+X pages found.
>
> And c00e4000 is:
>
> ---[ Modules ]---
> 0xc000-0xc00e4000 912K
>   pte
> 0xc00e4000-0xc00e5000   4K RW
>GLB x  pte
>
> In case someone needs the full /sys/kernel/debug/kernel_page_tables file:
>
> http://drvbp1.linux-foundation.org/~mcgrof/2017/05/15/kernel_page_tables/piggy-4.12.0-rc1-next-20170515-page-tables.txt

---[ Modules ]---
0xc000-0xc00e4000      912K           pte

This should be the modules ASLR gap

0xc00e4000-0xc00e5000    4K RW GLB x  pte

This is part of the same gap, but it's RW+x strangely?

0xc00e5000-0xc00e6000    4K           pte

This is more of the gap?

0xc00e6000-0xc00fa000   80K ro GLB x  pte
0xc00fa000-0xc010c000   72K ro GLB NX pte
0xc010c000-0xc011b000   60K RW GLB NX pte

This should be the first loaded module. Can you check that
0xc00e6000 matches the first module in /proc/modules?

Something touched the module gap and left it RW+x...

Are you able to bisect this?

-Kees

-- 
Kees Cook
Pixel Security
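
A rough way to do the /proc/modules comparison asked for above, as a
Python sketch. Assumptions: each /proc/modules line has the form
"name size refcount deps state address", the size field is treated as the
extent of the module mapping (an approximation), and the sample line,
including the module name example_mod, is invented for illustration. The
address is written in the shortened form this archive shows.

```python
def parse_modules(text):
    """Parse /proc/modules-style text into (name, size, base) tuples."""
    modules = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        name = fields[0]
        size = int(fields[1])      # size in bytes, decimal
        base = int(fields[5], 16)  # load address, hex with 0x prefix
        modules.append((name, size, base))
    return modules


def module_at(modules, address):
    """Return the name of the module whose range covers address, or None."""
    for name, size, base in modules:
        if base <= address < base + size:
            return name
    return None


# Hypothetical sample line; a real run would read /proc/modules instead.
SAMPLE = "example_mod 217088 0 - Live 0xc00e6000\n"

modules = parse_modules(SAMPLE)
print(module_at(modules, 0xc00e6000))
print(module_at(modules, 0xc00e4000))
```

On a live system the text would come from open("/proc/modules").read(),
normally as root, since unprivileged reads may show zeroed addresses.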


Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0

2017-05-15 Thread Luis R. Rodriguez
On Mon, May 15, 2017 at 3:57 PM, Kees Cook  wrote:
> On Mon, May 15, 2017 at 3:15 PM, Luis R. Rodriguez  wrote:
>> On Tue, May 16, 2017 at 12:06:50AM +0200, Luis R. Rodriguez wrote:
>>> Using QEMU emulator version 2.7.94 (v2.8.0-rc4-dirty)
>>>
>>> I will try updating my distro package for qemu and see if perhaps its this
>>> and for the other odd fork issue I reported [0].
>>>
>>> [0] 
>>> https://lkml.kernel.org/r/CAB=NE6VZXq3y-3pfouYTBUco2Cq2xqoLZrgDFdVx+_=_=sw...@mail.gmail.com
>>
>> Yeah nope, using my distribution latest:
>>
>> QEMU emulator version 2.8.0(openSUSE Tumbleweed)
>>
>> And still both issues are present.
>>
>>   Luis
>
> Can you enable CONFIG_X86_PTDUMP=y and then find out what is located
> at c0288000 via /sys/kernel/debug/kernel_page_tables ?

Sure thing.

Recompiled with this enabled, new warning:

[0.891559] x86/mm: Found insecure W+X mapping at address c00e4000/0xc00e4000
[0.892394] [ cut here ]
[0.892834] WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
[0.893674] Modules linked in:
[0.893972] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.12.0-rc1-next-20170515+ #145
[0.894687] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
[0.895828] task: 8ed7fa5ccc80 task.stack: ae390063
[0.896403] RIP: 0010:note_page+0x630/0x7e0
[0.896780] RSP: 0018:ae3900633df0 EFLAGS: 00010286
[0.897271] RAX: 0051 RBX: ae3900633e88 RCX: 9b456708
[0.897940] RDX:  RSI: 0096 RDI: 0246
[0.898624] RBP: ae3900633e28 R08: 203a6d6d2f363878 R09: 0165
[0.899314] R10: ae3900633dd8 R11: 736e6920646e756f R12: 
[0.899987] R13: 0004 R14:  R15: 
[0.900629] FS:  () GS:8ed7ffc0() knlGS:
[0.901398] CS:  0010 DS:  ES:  CR0: 80050033
[0.901908] CR2:  CR3: 000118009000 CR4: 06f0
[0.902590] Call Trace:
[0.902827]  ptdump_walk_pgd_level_core+0x3e7/0x490
[0.903274]  ? 0x9a80
[0.903595]  ptdump_walk_pgd_level_checkwx+0x17/0x20
[0.904064]  mark_rodata_ro+0xf4/0x100
[0.904423]  ? rest_init+0x80/0x80
[0.904744]  kernel_init+0x2f/0x100
[0.905068]  ret_from_fork+0x2c/0x40
[0.905393] Code: 48 c7 43 28 00 00 00 00 48 89 43 20 e9 05 fd ff ff 48 8b 73 10 48 c7 c7 28 36 1e 9b c6 05 c8 eb bc 00 01 48 89 f2 e8 cd fc 11 00 <0f> ff e9 1f fa ff ff 48 8b 70 20 48 c7 c7 65 b2 1e 9b e8 b6 fc
[0.907173] ---[ end trace 878b39cb0c248e66 ]---
[0.907655] x86/mm: Checked W+X mappings: FAILED, 1 W+X pages found.

And c00e4000 is:

---[ Modules ]---
0xc000-0xc00e4000      912K           pte
0xc00e4000-0xc00e5000    4K RW GLB x  pte

In case someone needs the full /sys/kernel/debug/kernel_page_tables file:

http://drvbp1.linux-foundation.org/~mcgrof/2017/05/15/kernel_page_tables/piggy-4.12.0-rc1-next-20170515-page-tables.txt

 Luis
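
For anyone repeating this lookup on their own dump, a minimal Python
sketch follows. It is not part of any kernel tooling: the helper names
(parse_rows, lookup, is_writable_and_executable) are made up for
illustration, the row layout is assumed from the excerpts quoted in this
thread, and the sample addresses are copied in the shortened form this
archive shows.

```python
import re

# Matches rows such as "0xc00e4000-0xc00e5000   4K RW GLB x  pte".
ROW_RE = re.compile(
    r"^0x([0-9a-fA-F]+)-0x([0-9a-fA-F]+)\s+(\d+[BKMGTPE])\s*(.*)$"
)


def parse_rows(text):
    """Yield (start, end, size, flags) for every range row in a dump."""
    for line in text.splitlines():
        match = ROW_RE.match(line.strip())
        if match is None:
            continue  # section headers such as "---[ Modules ]---"
        start, end, size, rest = match.groups()
        yield int(start, 16), int(end, 16), size, rest.split()


def lookup(text, address):
    """Return the row whose [start, end) range contains address, or None."""
    for row in parse_rows(text):
        if row[0] <= address < row[1]:
            return row
    return None


def is_writable_and_executable(flags):
    """True when a row carries both the RW flag and the x (not NX) flag."""
    return "RW" in flags and "x" in flags


SAMPLE = """\
---[ Modules ]---
0xc00e4000-0xc00e5000   4K RW GLB x  pte
0xc00e5000-0xc00e6000   4K pte
0xc00e6000-0xc00fa000  80K ro GLB x  pte
0xc00fa000-0xc010c000  72K ro GLB NX pte
"""

row = lookup(SAMPLE, 0xc00e4000)
print(row)
print(is_writable_and_executable(row[3]))
```

On a real system the text would be read from
/sys/kernel/debug/kernel_page_tables (root only, CONFIG_X86_PTDUMP=y).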


Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0

2017-05-15 Thread Luis R. Rodriguez
On Mon, May 15, 2017 at 3:15 PM, Luis R. Rodriguez  wrote:
> On Tue, May 16, 2017 at 12:06:50AM +0200, Luis R. Rodriguez wrote:
>> Using QEMU emulator version 2.7.94 (v2.8.0-rc4-dirty)
>>
>> I will try updating my distro package for qemu and see if perhaps its this
>> and for the other odd fork issue I reported [0].
>>
>> [0] 
>> https://lkml.kernel.org/r/CAB=NE6VZXq3y-3pfouYTBUco2Cq2xqoLZrgDFdVx+_=_=sw...@mail.gmail.com
>
> Yeah nope, using my distribution latest:
>
> QEMU emulator version 2.8.0(openSUSE Tumbleweed)
>
> And still both issues are present.

FWIW, I also compiled the latest qemu, v2.9.0-rc5, and tried to boot with
it. It also shows both issues, so I don't think this is caused by the
version of qemu.

 Luis


Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0

2017-05-15 Thread Kees Cook
On Mon, May 15, 2017 at 3:15 PM, Luis R. Rodriguez  wrote:
> On Tue, May 16, 2017 at 12:06:50AM +0200, Luis R. Rodriguez wrote:
>> Using QEMU emulator version 2.7.94 (v2.8.0-rc4-dirty)
>>
>> I will try updating my distro package for qemu and see if perhaps its this
>> and for the other odd fork issue I reported [0].
>>
>> [0] 
>> https://lkml.kernel.org/r/CAB=NE6VZXq3y-3pfouYTBUco2Cq2xqoLZrgDFdVx+_=_=sw...@mail.gmail.com
>
> Yeah nope, using my distribution latest:
>
> QEMU emulator version 2.8.0(openSUSE Tumbleweed)
>
> And still both issues are present.
>
>   Luis

Can you enable CONFIG_X86_PTDUMP=y and then find out what is located
at c0288000 via /sys/kernel/debug/kernel_page_tables ?

-Kees

-- 
Kees Cook
Pixel Security


Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0

2017-05-15 Thread Luis R. Rodriguez
On Tue, May 16, 2017 at 12:06:50AM +0200, Luis R. Rodriguez wrote:
> Using QEMU emulator version 2.7.94 (v2.8.0-rc4-dirty)
> 
> I will try updating my distro package for qemu and see if perhaps its this
> and for the other odd fork issue I reported [0].
> 
> [0] 
> https://lkml.kernel.org/r/CAB=NE6VZXq3y-3pfouYTBUco2Cq2xqoLZrgDFdVx+_=_=sw...@mail.gmail.com

Yeah nope, using my distribution latest:

QEMU emulator version 2.8.0(openSUSE Tumbleweed)

And still both issues are present.

  Luis


next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0

2017-05-15 Thread Luis R. Rodriguez
For a few kernel releases now I have managed to trigger the warning added via
commit e1a58320a38dfa ("x86/mm: Warn on W^X mappings", upstream since
v4.4) on my KVM qemu x86_64 system. Since I just booted into the shiny new
linux-next tag next-20170515 (based on v4.12-rc1) and this is still triggering,
I figured it's time to tackle this.

Let me know if this is already known or what can be done to try to fix this.

Using QEMU emulator version 2.7.94 (v2.8.0-rc4-dirty)

I will try updating my distro package for qemu to see whether that is the
cause of this and of the other odd fork issue I reported [0].

[0] https://lkml.kernel.org/r/CAB=NE6VZXq3y-3pfouYTBUco2Cq2xqoLZrgDFdVx+_=_=sw...@mail.gmail.com

My config:

http://drvbp1.linux-foundation.org/~mcgrof/2017/05/15/configs/piggy-x86_64_qemu_fork_kmemleak.config

The splat:

[0.911209] x86/mm: Found insecure W+X mapping at address c0288000/0xc0288000
[0.912066] [ cut here ]
[0.912544] WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
[0.913381] Modules linked in:
[0.913672] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.12.0-rc1-next-20170515+ #144
[0.914434] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.1-0-g8891697-prebuilt.qemu-project.org 04/01/2014
[0.915595] task: 98d43a5eac80 task.stack: ad22c063
[0.916174] RIP: 0010:note_page+0x630/0x7e0
[0.916595] RSP: 0018:ad22c0633df0 EFLAGS: 00010286
[0.917101] RAX: 0051 RBX: ad22c0633e88 RCX: 91256708
[0.917805] RDX:  RSI: 0096 RDI: 0246
[0.918511] RBP: ad22c0633e28 R08: 6678 R09: 0160
[0.919214] R10: ad22c0633dd8 R11: 3030303838323063 R12: 
[0.919917] R13: 0004 R14:  R15: 
[0.920615] FS:  () GS:98d43fc0() knlGS:
[0.921384] CS:  0010 DS:  ES:  CR0: 80050033
[0.921943] CR2:  CR3: a3a09000 CR4: 06f0
[0.922657] Call Trace:
[0.922901]  ptdump_walk_pgd_level_core+0x3e7/0x490
[0.923354]  ? 0x9060
[0.923662]  ptdump_walk_pgd_level_checkwx+0x17/0x20
[0.924145]  mark_rodata_ro+0xf4/0x100
[0.924536]  ? rest_init+0x80/0x80
[0.924862]  kernel_init+0x2f/0x100
[0.925197]  ret_from_fork+0x2c/0x40
[0.925552] Code: 48 c7 43 28 00 00 00 00 48 89 43 20 e9 05 fd ff ff 48 8b 73 10 48 c7 c7 c8 34 fe 90 c6 05 c8 eb bc 00 01 48 89 f2 e8 8d fc 11 00 <0f> ff e9 1f fa ff ff 48 8b 70 20 48 c7 c7 05 b1 fe 90 e8 76 fc
[0.927368] ---[ end trace 97137ae213b9cb25 ]---
[0.927830] x86/mm: Checked W+X mappings: FAILED, 1 W+X pages found.

  Luis

