Re: KASAN: use-after-free Read in remove_wait_queue (2)

2018-05-23 Thread Guillaume Nault
On Tue, May 22, 2018 at 08:29:58PM -0700, Eric Biggers wrote:
> On Fri, May 18, 2018 at 06:02:23PM +0200, Guillaume Nault wrote:
> > On Sun, May 13, 2018 at 11:11:55PM -0700, Eric Biggers wrote:
> > > [+ppp list and maintainer]
> > > 
> > > This is a bug in ppp_generic.c; it still happens on Linus' tree and it's 
> > > easily
> > > reproducible, see program below.  The bug is that the PPPIOCDETACH ioctl 
> > > doesn't
> > > consider that the file can still be attached to epoll instances even when
> > > ->f_count == 1.
> > 
> > Right. What would it take to remove the file for the epoll instances?
> > Sorry for the naive question, but I'm not familiar with VFS and didn't
> > find a helper function we could call.
> > 
> 
> There is eventpoll_release_file(), but it's not exported to modules.  It might
> work to call it, but it seems like a hack.
> 
> > > Also, the reproducer doesn't test this but I think ppp_poll(),
> > > ppp_read(), and ppp_write() can all race with PPPIOCDETACH, causing
> > > use-after-frees as well.
> > 
> > I also believe so. ppp_release() resets ->private_data, and even though
> > functions like ppp_read() test ->private_data before executing, there's
> > no synchronisation mechanism to ensure that the update is visible
> > before the unit or channel is destroyed. Is that the kind of race you
> > had in mind?
> 
> Yes, though after looking into it more I *think* these additional races aren't
> actually possible, due to the 'f_count < 2' check.  These races could only
> happen with a shared fd table, but in that case fdget() would increment 
> f_count
> for the duration of each operation, resulting in 'f_count >= 2' if both 
> ioctl()
> and something else is running on the same file concurrently.
> 
> Note that this also means PPPIOCDETACH doesn't work at all if called from a
> multithreaded application...
> 
> > 
> > > Any chance that PPPIOCDETACH can simply be removed,
> > > given that it's apparently been "deprecated" for 16 years?
> > > Does anyone use it?
> > 
> > The only users I'm aware of are pppd versions older than ppp-2.4.2
> > (released in November 2003). But even at that time, there were issues
> > with PPPIOCDETACH as pppd didn't seem to react properly when this call
> > failed. An Internet search on the "PPPIOCDETACH file->f_count=" kernel
> > log string, or on the "Couldn't release PPP unit: Invalid argument"
> > error message of pppd, returns several related bug reports.
> > 
> > Originally, PPPIOCDETACH never failed, but testing ->f_count was
> > later introduced to fix crashes when the file descriptor had been
> > duplicated. It seems that this was motivated by polling issues too.
> > 
> > Long story short, it looks like PPPIOCDETACH never has worked well
> > and we have at least two more bugs to fix. Given how it has proven
> > fragile, I wouldn't be surprised if there were even more lurking
> > around. I'd say that it's probably safer to drop it than to add more
> > workarounds and playing wack-a-mole with those bugs.
> 
> IMO, if we can get away with removing it without any users noticing, that 
> would
> be much better than trying to fix it with a VFS-level hack, and probably 
> missing
> some cases.  I'll send a patch to get things started...
> 
Yes, I fully agree. That looks much safer, and given the track record
of this ioctl I very much doubt anyone would depend on it.
--
To unsubscribe from this list: send the line "unsubscribe linux-ppp" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: KASAN: use-after-free Read in remove_wait_queue (2)

2018-05-22 Thread Eric Biggers
On Fri, May 18, 2018 at 06:02:23PM +0200, Guillaume Nault wrote:
> On Sun, May 13, 2018 at 11:11:55PM -0700, Eric Biggers wrote:
> > [+ppp list and maintainer]
> > 
> > This is a bug in ppp_generic.c; it still happens on Linus' tree and it's 
> > easily
> > reproducible, see program below.  The bug is that the PPPIOCDETACH ioctl 
> > doesn't
> > consider that the file can still be attached to epoll instances even when
> > ->f_count == 1.
> 
> Right. What would it take to remove the file for the epoll instances?
> Sorry for the naive question, but I'm not familiar with VFS and didn't
> find a helper function we could call.
> 

There is eventpoll_release_file(), but it's not exported to modules.  It might
work to call it, but it seems like a hack.

> > Also, the reproducer doesn't test this but I think ppp_poll(),
> > ppp_read(), and ppp_write() can all race with PPPIOCDETACH, causing
> > use-after-frees as well.
> 
> I also believe so. ppp_release() resets ->private_data, and even though
> functions like ppp_read() test ->private_data before executing, there's
> no synchronisation mechanism to ensure that the update is visible
> before the unit or channel is destroyed. Is that the kind of race you
> had in mind?

Yes, though after looking into it more I *think* these additional races aren't
actually possible, due to the 'f_count < 2' check.  These races could only
happen with a shared fd table, but in that case fdget() would increment f_count
for the duration of each operation, resulting in 'f_count >= 2' if both ioctl()
and something else is running on the same file concurrently.

Note that this also means PPPIOCDETACH doesn't work at all if called from a
multithreaded application...

> 
> > Any chance that PPPIOCDETACH can simply be removed,
> > given that it's apparently been "deprecated" for 16 years?
> > Does anyone use it?
> 
> The only users I'm aware of are pppd versions older than ppp-2.4.2
> (released in November 2003). But even at that time, there were issues
> with PPPIOCDETACH as pppd didn't seem to react properly when this call
> failed. An Internet search on the "PPPIOCDETACH file->f_count=" kernel
> log string, or on the "Couldn't release PPP unit: Invalid argument"
> error message of pppd, returns several related bug reports.
> 
> Originally, PPPIOCDETACH never failed, but testing ->f_count was
> later introduced to fix crashes when the file descriptor had been
> duplicated. It seems that this was motivated by polling issues too.
> 
> Long story short, it looks like PPPIOCDETACH never has worked well
> and we have at least two more bugs to fix. Given how it has proven
> fragile, I wouldn't be surprised if there were even more lurking
> around. I'd say that it's probably safer to drop it than to add more
> workarounds and playing wack-a-mole with those bugs.

IMO, if we can get away with removing it without any users noticing, that would
be much better than trying to fix it with a VFS-level hack, and probably missing
some cases.  I'll send a patch to get things started...

- Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-ppp" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: KASAN: use-after-free Read in remove_wait_queue (2)

2018-05-18 Thread Guillaume Nault
On Sun, May 13, 2018 at 11:11:55PM -0700, Eric Biggers wrote:
> [+ppp list and maintainer]
> 
> This is a bug in ppp_generic.c; it still happens on Linus' tree and it's 
> easily
> reproducible, see program below.  The bug is that the PPPIOCDETACH ioctl 
> doesn't
> consider that the file can still be attached to epoll instances even when
> ->f_count == 1.

Right. What would it take to remove the file for the epoll instances?
Sorry for the naive question, but I'm not familiar with VFS and didn't
find a helper function we could call.

> Also, the reproducer doesn't test this but I think ppp_poll(),
> ppp_read(), and ppp_write() can all race with PPPIOCDETACH, causing
> use-after-frees as well.

I also believe so. ppp_release() resets ->private_data, and even though
functions like ppp_read() test ->private_data before executing, there's
no synchronisation mechanism to ensure that the update is visible
before the unit or channel is destroyed. Is that the kind of race you
had in mind?

> Any chance that PPPIOCDETACH can simply be removed,
> given that it's apparently been "deprecated" for 16 years?
> Does anyone use it?

The only users I'm aware of are pppd versions older than ppp-2.4.2
(released in November 2003). But even at that time, there were issues
with PPPIOCDETACH as pppd didn't seem to react properly when this call
failed. An Internet search on the "PPPIOCDETACH file->f_count=" kernel
log string, or on the "Couldn't release PPP unit: Invalid argument"
error message of pppd, returns several related bug reports.

Originally, PPPIOCDETACH never failed, but testing ->f_count was
later introduced to fix crashes when the file descriptor had been
duplicated. It seems that this was motivated by polling issues too.

Long story short, it looks like PPPIOCDETACH never has worked well
and we have at least two more bugs to fix. Given how it has proven
fragile, I wouldn't be surprised if there were even more lurking
around. I'd say that it's probably safer to drop it than to add more
workarounds and playing wack-a-mole with those bugs.
--
To unsubscribe from this list: send the line "unsubscribe linux-ppp" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: KASAN: use-after-free Read in remove_wait_queue (2)

2018-05-14 Thread Eric Biggers
[+ppp list and maintainer]

On Wed, Feb 28, 2018 at 08:59:02AM -0800, syzbot wrote:
> Hello,
> 
> syzbot hit the following crash on upstream commit
> f3afe530d644488a074291da04a69a296ab63046 (Tue Feb 27 22:02:39 2018 +)
> Merge branch 'fixes-v4.16-rc4' of
> git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
> 
> So far this crash happened 3 times on upstream.
> C reproducer is attached.
> syzkaller reproducer is attached.
> Raw console output is attached.
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached.
> 
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+16363c99d4134717c...@syzkaller.appspotmail.com
> It will help syzbot understand when the bug is fixed. See footer for
> details.
> If you forward the report, please keep this part and the footer.
> 
> audit: type=1400 audit(1519800493.311:7): avc:  denied  { map } for
> pid=4238 comm="syzkaller305740" path="/root/syzkaller305740266" dev="sda1"
> ino=16481 scontext=unconfined_u:system_r:insmod_t:s0-s0:c0.c1023
> tcontext=unconfined_u:object_r:user_home_t:s0 tclass=file permissive=1
> ==
> BUG: KASAN: use-after-free in __lock_acquire+0x3d4d/0x3e00
> kernel/locking/lockdep.c:3310
> Read of size 8 at addr 8801afa039c0 by task syzkaller305740/4238
> 
> CPU: 1 PID: 4238 Comm: syzkaller305740 Not tainted 4.16.0-rc3+ #332
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:17 [inline]
>  dump_stack+0x194/0x24d lib/dump_stack.c:53
>  print_address_description+0x73/0x250 mm/kasan/report.c:256
>  kasan_report_error mm/kasan/report.c:354 [inline]
>  kasan_report+0x23b/0x360 mm/kasan/report.c:412
>  __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
>  __lock_acquire+0x3d4d/0x3e00 kernel/locking/lockdep.c:3310
>  lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:3920
>  __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
>  _raw_spin_lock_irqsave+0x96/0xc0 kernel/locking/spinlock.c:152
>  remove_wait_queue+0x81/0x350 kernel/sched/wait.c:50
>  ep_remove_wait_queue fs/eventpoll.c:596 [inline]
>  ep_unregister_pollwait.isra.7+0x18c/0x590 fs/eventpoll.c:614
>  ep_free+0x13f/0x320 fs/eventpoll.c:832
>  ep_eventpoll_release+0x44/0x60 fs/eventpoll.c:864
>  __fput+0x327/0x7e0 fs/file_table.c:209
>  fput+0x15/0x20 fs/file_table.c:243
>  task_work_run+0x199/0x270 kernel/task_work.c:113
>  exit_task_work include/linux/task_work.h:22 [inline]
>  do_exit+0x9bb/0x1ad0 kernel/exit.c:865
>  do_group_exit+0x149/0x400 kernel/exit.c:968
>  SYSC_exit_group kernel/exit.c:979 [inline]
>  SyS_exit_group+0x1d/0x20 kernel/exit.c:977
>  do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
>  entry_SYSCALL_64_after_hwframe+0x42/0xb7
> RIP: 0033:0x43e958
> RSP: 002b:7ffe63a11468 EFLAGS: 0246 ORIG_RAX: 00e7
> RAX: ffda RBX:  RCX: 0043e958
> RDX:  RSI: 003c RDI: 
> RBP: 004be300 R08: 00e7 R09: ffd0
> R10: 004002c8 R11: 0246 R12: 0001
> R13: 006cc160 R14:  R15: 
> 
> Allocated by task 4238:
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:447
>  set_track mm/kasan/kasan.c:459 [inline]
>  kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:552
>  __do_kmalloc_node mm/slab.c:3669 [inline]
>  __kmalloc_node+0x47/0x70 mm/slab.c:3676
>  kmalloc_node include/linux/slab.h:554 [inline]
>  kvmalloc_node+0x99/0xd0 mm/util.c:419
>  kvmalloc include/linux/mm.h:541 [inline]
>  kvzalloc include/linux/mm.h:549 [inline]
>  alloc_netdev_mqs+0x16d/0xfb0 net/core/dev.c:8299
>  ppp_create_interface drivers/net/ppp/ppp_generic.c:3018 [inline]
>  ppp_unattached_ioctl drivers/net/ppp/ppp_generic.c:866 [inline]
>  ppp_ioctl+0x1715/0x2a50 drivers/net/ppp/ppp_generic.c:602
>  vfs_ioctl fs/ioctl.c:46 [inline]
>  do_vfs_ioctl+0x1b1/0x1520 fs/ioctl.c:686
>  SYSC_ioctl fs/ioctl.c:701 [inline]
>  SyS_ioctl+0x8f/0xc0 fs/ioctl.c:692
>  do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
>  entry_SYSCALL_64_after_hwframe+0x42/0xb7
> 
> Freed by task 4238:
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:447
>  set_track mm/kasan/kasan.c:459 [inline]
>  __kasan_slab_free+0x11a/0x170 mm/kasan/kasan.c:520
>  kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:527
>  __cache_free mm/slab.c:3485 [inline]
>  kfree+0xd9/0x260 mm/slab.c:3800
>  kvfree+0x36/0x60 mm/util.c:438
>  netdev_freemem+0x4c/0x60 net/core/dev.c:8253
>  netdev_release+0x10a/0x160 net/core/net-sysfs.c:1502
>  device_release+0x7c/0x210 drivers/base/core.c:814
>  kobject_cleanup lib/kobject.c:646 [inline]
>  kobject_release lib/kobject.c:675 [inline]
>  kref_put include/linux/kref.h:70 [inline]
>  kobject_put+0x14c/0x250 lib/kobject.c:692
>  put_device+0x20/0x30 drivers/base/core.c:1931
>  free_netdev+0x2f5/0x400 net/core/dev.c:8410
>