Re: scsi: use-after-free in bio_copy_from_iter

2016-12-06 Thread Dmitry Vyukov
On Tue, Dec 6, 2016 at 4:38 PM, Johannes Thumshirn  wrote:
> On Tue, Dec 06, 2016 at 10:43:57AM +0100, Dmitry Vyukov wrote:
>> On Tue, Dec 6, 2016 at 10:32 AM, Johannes Thumshirn  
>> wrote:
>> > On Mon, Dec 05, 2016 at 07:03:39PM +, Al Viro wrote:
>> >> On Mon, Dec 05, 2016 at 04:17:53PM +0100, Johannes Thumshirn wrote:
>> >> > 633 hp = >header;
>> >> > [...]
>> >> > 646 hp->dxferp = (char __user *)buf + cmd_size;
>> >>
>> >> > So the memory for hp->dxferp comes from:
>> >> > 633 hp = >header;
>> >>
>> >> 
>> >>
>> >> > >From my debug instrumentation I see that the dxferp ends up in the
>> >> > iovec_iter's kvec->iov_base and the faulting address is always dxferp + 
>> >> > n *
>> >> > 4k with n in [1, 16] (and we're copying 16 4k pages from the iovec into 
>> >> > the
>> >> > bio).
>> >>
>> >> _Address_ of hp->dxferp comes from that assignment; the value is 'buf'
>> >> argument of sg_write() + small offset.  In this case, it should point
>> >> inside a pipe buffer, which is, indeed, at a kernel address.  Who'd
>> >> allocated srp is irrelevant.
>> >
>> > Yes I realized that as well when I had enough distance between me and the
>> > code...
>> >
>> >>
>> >> And if you end up dereferencing more than one page worth there, you do 
>> >> have
>> >> a problem - pipe buffers are not going to be that large.  Could you slap
>> >>   WARN_ON((size_t)input_size > count);
>> >> right after the calculation of input_size in sg_write() and see if it 
>> >> triggers
>> >> on your reproducer?
>> >
>> > I did and it didn't trigger. What triggers is (as expected) a
>> > WARN_ON((size_t)mxsize > count);
>> > We have count at 80 and mxsize (which ends in hp->dxfer_len) at 65499. But 
>> > the
>> > 65499 bytes are the len of the data we're suppost to be copying in via the
>> > iov. I'm still rather confused what's happening here, sorry.
>>
>>
>> I think the critical piece here is some kind of race or timing
>> condition. Note that the test program executes all of
>> memfd_create/write/open/sendfile twice. Second time the calls race
>> with each other, but they also can race with the first execution of
>> the calls.
>
> FWIW I've just run the reproducer once instead of looping it to check how it
> would normally behave and it bailes out at:
>
> 604 if (count < (SZ_SG_HEADER + 6))
> 605 return -EIO;/* The minimum scsi command length is 6 
> bytes. */
>
> That means, weren't going down the copy_form_iter() road at all. Usually, but
> sometimes we do. And then we try to copy 16 pages from the pipe buffer (is
> this correct?).
> The reproducer does: sendfile("/dev/sg0", memfd, offset_in_memfd, 0x1);
>
> I don't see how we get there? Could it be random data from the mmap() we point
> the memfd to?
>
> This bug is confusing to be honest.


Where does this count come from? What address in the user program? Is
it 0x20012fxx?
One possibility for non-deterministically changing inputs is that this part:

  case 2:
NONFAILING(*(uint32_t*)0x20012fd8 = (uint32_t)0x28);
NONFAILING(*(uint32_t*)0x20012fdc = (uint32_t)0x);
NONFAILING(*(uint64_t*)0x20012fe0 = (uint64_t)0x0);
NONFAILING(*(uint64_t*)0x20012fe8 = (uint64_t)0x993f);
NONFAILING(*(uint64_t*)0x20012ff0 = (uint64_t)0xa8b);
NONFAILING(*(uint32_t*)0x20012ff8 = (uint32_t)0xff);
r[9] = syscall(__NR_write, r[2], 0x20012fd8ul, 0x28ul, 0, 0,
   0, 0, 0, 0);

runs concurrently with this part:

  case 0:
r[0] =
syscall(__NR_mmap, 0x2000ul, 0x13000ul, 0x3ul,
0x32ul, 0xul, 0x0ul, 0, 0, 0);

So all of the input data to the write, or a subset of the input data,
can be zeros.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: scsi: use-after-free in bio_copy_from_iter

2016-12-06 Thread Johannes Thumshirn
On Tue, Dec 06, 2016 at 10:43:57AM +0100, Dmitry Vyukov wrote:
> On Tue, Dec 6, 2016 at 10:32 AM, Johannes Thumshirn  
> wrote:
> > On Mon, Dec 05, 2016 at 07:03:39PM +, Al Viro wrote:
> >> On Mon, Dec 05, 2016 at 04:17:53PM +0100, Johannes Thumshirn wrote:
> >> > 633 hp = >header;
> >> > [...]
> >> > 646 hp->dxferp = (char __user *)buf + cmd_size;
> >>
> >> > So the memory for hp->dxferp comes from:
> >> > 633 hp = >header;
> >>
> >> 
> >>
> >> > >From my debug instrumentation I see that the dxferp ends up in the
> >> > iovec_iter's kvec->iov_base and the faulting address is always dxferp + 
> >> > n *
> >> > 4k with n in [1, 16] (and we're copying 16 4k pages from the iovec into 
> >> > the
> >> > bio).
> >>
> >> _Address_ of hp->dxferp comes from that assignment; the value is 'buf'
> >> argument of sg_write() + small offset.  In this case, it should point
> >> inside a pipe buffer, which is, indeed, at a kernel address.  Who'd
> >> allocated srp is irrelevant.
> >
> > Yes I realized that as well when I had enough distance between me and the
> > code...
> >
> >>
> >> And if you end up dereferencing more than one page worth there, you do have
> >> a problem - pipe buffers are not going to be that large.  Could you slap
> >>   WARN_ON((size_t)input_size > count);
> >> right after the calculation of input_size in sg_write() and see if it 
> >> triggers
> >> on your reproducer?
> >
> > I did and it didn't trigger. What triggers is (as expected) a
> > WARN_ON((size_t)mxsize > count);
> > We have count at 80 and mxsize (which ends in hp->dxfer_len) at 65499. But 
> > the
> > 65499 bytes are the len of the data we're suppost to be copying in via the
> > iov. I'm still rather confused what's happening here, sorry.
> 
> 
> I think the critical piece here is some kind of race or timing
> condition. Note that the test program executes all of
> memfd_create/write/open/sendfile twice. Second time the calls race
> with each other, but they also can race with the first execution of
> the calls.

FWIW I've just run the reproducer once instead of looping it to check how it
would normally behave and it bailes out at:

604 if (count < (SZ_SG_HEADER + 6))
605 return -EIO;/* The minimum scsi command length is 6 
bytes. */

That means, weren't going down the copy_form_iter() road at all. Usually, but
sometimes we do. And then we try to copy 16 pages from the pipe buffer (is
this correct?).
The reproducer does: sendfile("/dev/sg0", memfd, offset_in_memfd, 0x1);

I don't see how we get there? Could it be random data from the mmap() we point
the memfd to?

This bug is confusing to be honest.

-- 
Johannes Thumshirn  Storage
jthumsh...@suse.de+49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: scsi: use-after-free in bio_copy_from_iter

2016-12-06 Thread Dmitry Vyukov
On Tue, Dec 6, 2016 at 10:32 AM, Johannes Thumshirn  wrote:
> On Mon, Dec 05, 2016 at 07:03:39PM +, Al Viro wrote:
>> On Mon, Dec 05, 2016 at 04:17:53PM +0100, Johannes Thumshirn wrote:
>> > 633 hp = >header;
>> > [...]
>> > 646 hp->dxferp = (char __user *)buf + cmd_size;
>>
>> > So the memory for hp->dxferp comes from:
>> > 633 hp = >header;
>>
>> 
>>
>> > >From my debug instrumentation I see that the dxferp ends up in the
>> > iovec_iter's kvec->iov_base and the faulting address is always dxferp + n *
>> > 4k with n in [1, 16] (and we're copying 16 4k pages from the iovec into the
>> > bio).
>>
>> _Address_ of hp->dxferp comes from that assignment; the value is 'buf'
>> argument of sg_write() + small offset.  In this case, it should point
>> inside a pipe buffer, which is, indeed, at a kernel address.  Who'd
>> allocated srp is irrelevant.
>
> Yes I realized that as well when I had enough distance between me and the
> code...
>
>>
>> And if you end up dereferencing more than one page worth there, you do have
>> a problem - pipe buffers are not going to be that large.  Could you slap
>>   WARN_ON((size_t)input_size > count);
>> right after the calculation of input_size in sg_write() and see if it 
>> triggers
>> on your reproducer?
>
> I did and it didn't trigger. What triggers is (as expected) a
> WARN_ON((size_t)mxsize > count);
> We have count at 80 and mxsize (which ends in hp->dxfer_len) at 65499. But the
> 65499 bytes are the len of the data we're suppost to be copying in via the
> iov. I'm still rather confused what's happening here, sorry.


I think the critical piece here is some kind of race or timing
condition. Note that the test program executes all of
memfd_create/write/open/sendfile twice. Second time the calls race
with each other, but they also can race with the first execution of
the calls.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: scsi: use-after-free in bio_copy_from_iter

2016-12-06 Thread Johannes Thumshirn
On Mon, Dec 05, 2016 at 07:03:39PM +, Al Viro wrote:
> On Mon, Dec 05, 2016 at 04:17:53PM +0100, Johannes Thumshirn wrote:
> > 633 hp = >header;
> > [...]
> > 646 hp->dxferp = (char __user *)buf + cmd_size;
> 
> > So the memory for hp->dxferp comes from:
> > 633 hp = >header;
> 
> 
> 
> > >From my debug instrumentation I see that the dxferp ends up in the
> > iovec_iter's kvec->iov_base and the faulting address is always dxferp + n *
> > 4k with n in [1, 16] (and we're copying 16 4k pages from the iovec into the
> > bio).
> 
> _Address_ of hp->dxferp comes from that assignment; the value is 'buf'
> argument of sg_write() + small offset.  In this case, it should point
> inside a pipe buffer, which is, indeed, at a kernel address.  Who'd
> allocated srp is irrelevant.

Yes I realized that as well when I had enough distance between me and the
code...

> 
> And if you end up dereferencing more than one page worth there, you do have
> a problem - pipe buffers are not going to be that large.  Could you slap
>   WARN_ON((size_t)input_size > count);
> right after the calculation of input_size in sg_write() and see if it triggers
> on your reproducer?

I did and it didn't trigger. What triggers is (as expected) a
WARN_ON((size_t)mxsize > count);
We have count at 80 and mxsize (which ends in hp->dxfer_len) at 65499. But the
65499 bytes are the len of the data we're suppost to be copying in via the
iov. I'm still rather confused what's happening here, sorry.

-- 
Johannes Thumshirn  Storage
jthumsh...@suse.de+49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: scsi: use-after-free in bio_copy_from_iter

2016-12-05 Thread Al Viro
On Mon, Dec 05, 2016 at 04:17:53PM +0100, Johannes Thumshirn wrote:
> 633 hp = >header;
> [...]
> 646 hp->dxferp = (char __user *)buf + cmd_size;

> So the memory for hp->dxferp comes from:
> 633 hp = >header;



> >From my debug instrumentation I see that the dxferp ends up in the
> iovec_iter's kvec->iov_base and the faulting address is always dxferp + n *
> 4k with n in [1, 16] (and we're copying 16 4k pages from the iovec into the
> bio).

_Address_ of hp->dxferp comes from that assignment; the value is 'buf'
argument of sg_write() + small offset.  In this case, it should point
inside a pipe buffer, which is, indeed, at a kernel address.  Who'd
allocated srp is irrelevant.

And if you end up dereferencing more than one page worth there, you do have
a problem - pipe buffers are not going to be that large.  Could you slap
WARN_ON((size_t)input_size > count);
right after the calculation of input_size in sg_write() and see if it triggers
on your reproducer?
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: scsi: use-after-free in bio_copy_from_iter

2016-12-05 Thread Johannes Thumshirn
On Mon, Dec 05, 2016 at 03:31:43PM +0100, Dmitry Vyukov wrote:
> On Sat, Dec 3, 2016 at 7:19 PM, Johannes Thumshirn  wrote:
> > On Sat, Dec 03, 2016 at 04:22:39PM +0100, Dmitry Vyukov wrote:
> >> On Sat, Dec 3, 2016 at 11:38 AM, Johannes Thumshirn  
> >> wrote:
> >> > On Fri, Dec 02, 2016 at 05:50:39PM +0100, Dmitry Vyukov wrote:
> >> >> On Fri, Nov 25, 2016 at 8:08 PM, Dmitry Vyukov  
> >> >> wrote:
> >
> > [...]
> >
> > Hi Dmitry,
> >
> >>
> >> Thanks for looking into this!
> >>
> >> As I noted I don't think this is use-after-free, more likely it is an
> >> out-of-bounds access against non-slab range.
> >>
> >> Report says that we are copying 0x1000 bytes starting at 
> >> 0x880062c6e02a.
> >> The first bad address is 0x880062c6f000, this address was freed
> >> previously and that's why KASAN reports UAF.
> >
> > We're copying 65499 bytes (65535 - sizeof(sg_header)) and we've got 2 order 
> > 3
> > page allocations to do this. It fails somewhere in there. I have seen fails 
> > at
> > 0x2000, 0xe000 and all (0x1000 aligned) offsets inbetween.
> >
> >> But this is already next page, and KASAN does not insert redzones
> >> around pages (only around slab allocations).
> >> So most likely the code should have not touch 0x880062c6f000 as it
> >> is not his memory.
> >> Also I noticed that the report happens after few minutes of repeatedly
> >> running this program, so I would expect that this is some kind of race
> >> -- either between kernel threads, or maybe between user space threads
> >> and kernel.
> >
> > I somehow think it's a race as well, especially as I have to run the
> > reproducer in an endless loop and break out of it once I have the 1st
> > stacktrace in dmesg. This takes between some minutes up to one hour on my
> > setup.
> >
> > But the race against a userspace thread... Could it be that the reproducer 
> > has
> > already exited it's threads while the copy_from_iter() is still running?
> > Normally I'd say no, as user-space shouldn't run while the kernel is doing
> > things in it's address space, but this is highly suspicious.
> >
> >> Or maybe it's just that the next page is not always marked
> >> as free, so we just don't detect the bad access.
> >
> > Could be, but I lack the memory management knowledge to say more than a 
> > 'could
> > be'.
> >
> >>
> >> Does it all make any sense to you?
> >> Can you think of any additional sanity checks that will ensure that
> >> this code copies only memory it owns?
> >
> > Given that we pass the 0x as dxfer_len it thinks it owns all memory, so
> > this is OK, kinda. All that could be would be that user-space has already
> > exited and thus it's memory is already freed.
> 
> 
> The crash happens in the context of sendfile syscall, we the address
> space should be alive. But the crash happens on address
> 0x880062c6f000 which is not a user-space address, right? This is
> something that kernel has allocated previously.
> Do you have CONFIG_DEBUG_PAGEALLOC enabled? I have it enabled. Maybe
> it increases changes of triggering the bug.
> 
> Do we know where that memory that we are copying was allocated? Is it
> slab or page alloc? We could extend KASAN output with more details.
> E.g. print allocation stack for the _first_ byte of memcpy, or
> memorize page alloc/free stacks in page struct using lib/stackdepot.c.

It comes in this way:

drivers/scsi/sg.c:
sg_write(struct file *filp, const char __user *buf, size_t count, loff_t * ppos)
580 struct sg_header old_hdr;
581 sg_io_hdr_t *hp;
582 unsigned char cmnd[SG_MAX_CDB_SIZE];
[...]
598 if (__copy_from_user(_hdr, buf, SZ_SG_HEADER))
599 return -EFAULT;
[...]
612 buf += SZ_SG_HEADER;
613 __get_user(opcode, buf);
[...]
614 if (sfp->next_cmd_len > 0) {
615 cmd_size = sfp->next_cmd_len;
616 sfp->next_cmd_len = 0;  /* reset so only this write() 
effected */
617 } else {
618 cmd_size = COMMAND_SIZE(opcode);/* based on SCSI 
command group */
619 if ((opcode >= 0xc0) && old_hdr.twelve_byte)
620 cmd_size = 12;
621 }
[...]
625 input_size = count - cmd_size;
626 mxsize = (input_size > old_hdr.reply_len) ? input_size : 
old_hdr.reply_len;
627 mxsize -= SZ_SG_HEADER;
633 hp = >header;
[...]
646 hp->dxferp = (char __user *)buf + cmd_size;
[...]
654 if (__copy_from_user(cmnd, buf, cmd_size))
655 return -EFAULT;
[...]
675 k = sg_common_write(sfp, srp, cmnd, sfp->timeout, blocking);
[...]
752 static int
753 sg_common_write(Sg_fd * sfp, Sg_request * srp,
754 unsigned char *cmnd, int timeout, int blocking)
755 {
[...]
772 k = sg_start_req(srp, cmnd);
[...]
1653 static int
1654 sg_start_req(Sg_request *srp, unsigned char *cmd)
1655 {
[...]
1757 

Re: scsi: use-after-free in bio_copy_from_iter

2016-12-05 Thread Dmitry Vyukov
On Sat, Dec 3, 2016 at 7:19 PM, Johannes Thumshirn  wrote:
> On Sat, Dec 03, 2016 at 04:22:39PM +0100, Dmitry Vyukov wrote:
>> On Sat, Dec 3, 2016 at 11:38 AM, Johannes Thumshirn  
>> wrote:
>> > On Fri, Dec 02, 2016 at 05:50:39PM +0100, Dmitry Vyukov wrote:
>> >> On Fri, Nov 25, 2016 at 8:08 PM, Dmitry Vyukov  wrote:
>
> [...]
>
> Hi Dmitry,
>
>>
>> Thanks for looking into this!
>>
>> As I noted I don't think this is use-after-free, more likely it is an
>> out-of-bounds access against non-slab range.
>>
>> Report says that we are copying 0x1000 bytes starting at 0x880062c6e02a.
>> The first bad address is 0x880062c6f000, this address was freed
>> previously and that's why KASAN reports UAF.
>
> We're copying 65499 bytes (65535 - sizeof(sg_header)) and we've got 2 order 3
> page allocations to do this. It fails somewhere in there. I have seen fails at
> 0x2000, 0xe000 and all (0x1000 aligned) offsets inbetween.
>
>> But this is already next page, and KASAN does not insert redzones
>> around pages (only around slab allocations).
>> So most likely the code should have not touch 0x880062c6f000 as it
>> is not his memory.
>> Also I noticed that the report happens after few minutes of repeatedly
>> running this program, so I would expect that this is some kind of race
>> -- either between kernel threads, or maybe between user space threads
>> and kernel.
>
> I somehow think it's a race as well, especially as I have to run the
> reproducer in an endless loop and break out of it once I have the 1st
> stacktrace in dmesg. This takes between some minutes up to one hour on my
> setup.
>
> But the race against a userspace thread... Could it be that the reproducer has
> already exited it's threads while the copy_from_iter() is still running?
> Normally I'd say no, as user-space shouldn't run while the kernel is doing
> things in it's address space, but this is highly suspicious.
>
>> Or maybe it's just that the next page is not always marked
>> as free, so we just don't detect the bad access.
>
> Could be, but I lack the memory management knowledge to say more than a 'could
> be'.
>
>>
>> Does it all make any sense to you?
>> Can you think of any additional sanity checks that will ensure that
>> this code copies only memory it owns?
>
> Given that we pass the 0x as dxfer_len it thinks it owns all memory, so
> this is OK, kinda. All that could be would be that user-space has already
> exited and thus it's memory is already freed.


The crash happens in the context of sendfile syscall, we the address
space should be alive. But the crash happens on address
0x880062c6f000 which is not a user-space address, right? This is
something that kernel has allocated previously.
Do you have CONFIG_DEBUG_PAGEALLOC enabled? I have it enabled. Maybe
it increases changes of triggering the bug.

Do we know where that memory that we are copying was allocated? Is it
slab or page alloc? We could extend KASAN output with more details.
E.g. print allocation stack for the _first_ byte of memcpy, or
memorize page alloc/free stacks in page struct using lib/stackdepot.c.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: scsi: use-after-free in bio_copy_from_iter

2016-12-03 Thread Johannes Thumshirn
On Sat, Dec 03, 2016 at 04:22:39PM +0100, Dmitry Vyukov wrote:
> On Sat, Dec 3, 2016 at 11:38 AM, Johannes Thumshirn  
> wrote:
> > On Fri, Dec 02, 2016 at 05:50:39PM +0100, Dmitry Vyukov wrote:
> >> On Fri, Nov 25, 2016 at 8:08 PM, Dmitry Vyukov  wrote:

[...]

Hi Dmitry,

> 
> Thanks for looking into this!
> 
> As I noted I don't think this is use-after-free, more likely it is an
> out-of-bounds access against non-slab range.
> 
> Report says that we are copying 0x1000 bytes starting at 0x880062c6e02a.
> The first bad address is 0x880062c6f000, this address was freed
> previously and that's why KASAN reports UAF.

We're copying 65499 bytes (65535 - sizeof(sg_header)) and we've got 2 order 3
page allocations to do this. It fails somewhere in there. I have seen fails at
0x2000, 0xe000 and all (0x1000 aligned) offsets inbetween.

> But this is already next page, and KASAN does not insert redzones
> around pages (only around slab allocations).
> So most likely the code should have not touch 0x880062c6f000 as it
> is not his memory.
> Also I noticed that the report happens after few minutes of repeatedly
> running this program, so I would expect that this is some kind of race
> -- either between kernel threads, or maybe between user space threads
> and kernel.

I somehow think it's a race as well, especially as I have to run the
reproducer in an endless loop and break out of it once I have the 1st
stacktrace in dmesg. This takes between some minutes up to one hour on my
setup.

But the race against a userspace thread... Could it be that the reproducer has
already exited it's threads while the copy_from_iter() is still running?
Normally I'd say no, as user-space shouldn't run while the kernel is doing
things in it's address space, but this is highly suspicious.

> Or maybe it's just that the next page is not always marked
> as free, so we just don't detect the bad access.

Could be, but I lack the memory management knowledge to say more than a 'could
be'.

> 
> Does it all make any sense to you?
> Can you think of any additional sanity checks that will ensure that
> this code copies only memory it owns?

Given that we pass the 0x as dxfer_len it thinks it owns all memory, so
this is OK, kinda. All that could be would be that user-space has already
exited and thus it's memory is already freed.

Byte,
Johannes
-- 
Johannes Thumshirn  Storage
jthumsh...@suse.de+49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: scsi: use-after-free in bio_copy_from_iter

2016-12-03 Thread Dmitry Vyukov
On Sat, Dec 3, 2016 at 11:38 AM, Johannes Thumshirn  wrote:
> On Fri, Dec 02, 2016 at 05:50:39PM +0100, Dmitry Vyukov wrote:
>> On Fri, Nov 25, 2016 at 8:08 PM, Dmitry Vyukov  wrote:
>
> [...]
>
>>
>> +David did some debugging of a similar case. His 0x400 at location
>> 0x2000efdc refers to 0x at 0x20012fdc in the provided reproducer:
>> NONFAILING(*(uint32_t*)0x20012fdc = (uint32_t)0x);
>> Here is his explanation:
>> =
>> That's not nice, it's passing a reply_len mismatch of 0x400 which is going
>> to cause an sg_write warning (in 988 bytes, out 38 bytes).  input_size
>> will be 74 since count == 80, the mxsize is 0x400 (!) so after subtracting
>> SZ_SG_HEADER we get the 988 bytes in.  This forces SG_DXFER_TO_FROM_DEV
>> when we really want SG_DXFER_TO_DEV.  If you set the value at 0x2000efdc
>> to 0x24 rather than 0x400 so there's a legitimate reply_len equal to
>> SG_DXFER_TO_FROM_DEV, I'd assume this passes just fine, assuming your
>> container can get enough memory.
>> =
>>
>> I've tried to replace 0x in the provided reproducer with 0x24 and
>> it indeed does not crash.
>
> This is somewhat expected, AFAICS the value at 0x20012fdc ends up as
> hp->dxfer_len in the SG driver. I did a lot of debugging (actually I spend the
> whole last work week soley on this) but I can't find where it does the actual
> use-after-free, so if you have anything to share I'd be glad.

Hi Johannes,

Thanks for looking into this!

As I noted I don't think this is use-after-free, more likely it is an
out-of-bounds access against non-slab range.

Report says that we are copying 0x1000 bytes starting at 0x880062c6e02a.
The first bad address is 0x880062c6f000, this address was freed
previously and that's why KASAN reports UAF.
But this is already next page, and KASAN does not insert redzones
around pages (only around slab allocations).
So most likely the code should have not touch 0x880062c6f000 as it
is not his memory.
Also I noticed that the report happens after few minutes of repeatedly
running this program, so I would expect that this is some kind of race
-- either between kernel threads, or maybe between user space threads
and kernel. Or maybe it's just that the next page is not always marked
as free, so we just don't detect the bad access.

Does it all make any sense to you?
Can you think of any additional sanity checks that will ensure that
this code copies only memory it owns?
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: scsi: use-after-free in bio_copy_from_iter

2016-12-03 Thread Johannes Thumshirn
On Fri, Dec 02, 2016 at 05:50:39PM +0100, Dmitry Vyukov wrote:
> On Fri, Nov 25, 2016 at 8:08 PM, Dmitry Vyukov  wrote:

[...]

> 
> +David did some debugging of a similar case. His 0x400 at location
> 0x2000efdc refers to 0x at 0x20012fdc in the provided reproducer:
> NONFAILING(*(uint32_t*)0x20012fdc = (uint32_t)0x);
> Here is his explanation:
> =
> That's not nice, it's passing a reply_len mismatch of 0x400 which is going
> to cause an sg_write warning (in 988 bytes, out 38 bytes).  input_size
> will be 74 since count == 80, the mxsize is 0x400 (!) so after subtracting
> SZ_SG_HEADER we get the 988 bytes in.  This forces SG_DXFER_TO_FROM_DEV
> when we really want SG_DXFER_TO_DEV.  If you set the value at 0x2000efdc
> to 0x24 rather than 0x400 so there's a legitimate reply_len equal to
> SG_DXFER_TO_FROM_DEV, I'd assume this passes just fine, assuming your
> container can get enough memory.
> =
> 
> I've tried to replace 0x in the provided reproducer with 0x24 and
> it indeed does not crash.

This is somewhat expected, AFAICS the value at 0x20012fdc ends up as
hp->dxfer_len in the SG driver. I did a lot of debugging (actually I spend the
whole last work week soley on this) but I can't find where it does the actual
use-after-free, so if you have anything to share I'd be glad.

Byte,
Johannes

-- 
Johannes Thumshirn  Storage
jthumsh...@suse.de+49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: scsi: use-after-free in bio_copy_from_iter

2016-12-02 Thread Dmitry Vyukov
On Fri, Nov 25, 2016 at 8:08 PM, Dmitry Vyukov  wrote:
> Hello,
>
> The following program triggers use-after-free in bio_copy_from_iter:
> https://gist.githubusercontent.com/dvyukov/80cd94b4e4c288f16ee4c787d404118b/raw/10536069562444da51b758bb39655b514ff93b45/gistfile1.txt
>
>
> ==
> BUG: KASAN: use-after-free in copy_from_iter+0xf30/0x15e0 at addr
> 880062c6e02a
> Read of size 4096 by task a.out/8529
> page:ea00018b1b80 count:2 mapcount:0 mapping:88006c80e9d0 index:0x1695
> flags: 0x5fffc000864(referenced|lru|active|private)
> page dumped because: kasan: bad access detected
> page->mem_cgroup:88003ebcea00
> CPU: 1 PID: 8529 Comm: a.out Not tainted 4.9.0-rc6+ #55
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>  880039ca6770 834c3bb9 0001 110007394c81
>  ed0007394c79 41b58ab3 89575c30 834c38cb
>    0001 
> Call Trace:
>  [< inline >] __dump_stack lib/dump_stack.c:15
>  [] dump_stack+0x2ee/0x3f5 lib/dump_stack.c:51
>  [< inline >] print_address_description mm/kasan/report.c:211
>  [< inline >] kasan_report_error mm/kasan/report.c:285
>  [] kasan_report+0x418/0x440 mm/kasan/report.c:305
>  [< inline >] check_memory_region_inline mm/kasan/kasan.c:308
>  [] check_memory_region+0x139/0x190 mm/kasan/kasan.c:315
>  [] memcpy+0x23/0x50 mm/kasan/kasan.c:350
>  [] copy_from_iter+0xf30/0x15e0 lib/iov_iter.c:559
>  [] copy_page_from_iter+0x474/0x860 lib/iov_iter.c:614
>  [< inline >] bio_copy_from_iter block/bio.c:1025
>  [] bio_copy_user_iov+0xb50/0xf10 block/bio.c:1226
>  [< inline >] __blk_rq_map_user_iov block/blk-map.c:56
>  [] blk_rq_map_user_iov+0x295/0x930 block/blk-map.c:130
>  [] blk_rq_map_user+0x139/0x1e0 block/blk-map.c:159
>  [< inline >] sg_start_req drivers/scsi/sg.c:1757
>  [] sg_common_write.isra.20+0x12da/0x1b20
> drivers/scsi/sg.c:772
>  [] sg_write+0x78a/0xda0 drivers/scsi/sg.c:675
>  [] __vfs_write+0x65e/0x830 fs/read_write.c:510
>  [] __kernel_write+0xec/0x340 fs/read_write.c:532
>  [] write_pipe_buf+0x19c/0x260 fs/splice.c:814
>  [< inline >] splice_from_pipe_feed fs/splice.c:519
>  [] __splice_from_pipe+0x31f/0x750 fs/splice.c:643
>  [] splice_from_pipe+0x1dc/0x300 fs/splice.c:678
>  [] default_file_splice_write+0x45/0x90 fs/splice.c:826
>  [< inline >] do_splice_from fs/splice.c:868
>  [] direct_splice_actor+0x12a/0x190 fs/splice.c:1035
>  [] splice_direct_to_actor+0x2c6/0x840 fs/splice.c:990
>  [] do_splice_direct+0x2ce/0x470 fs/splice.c:1078
>  [] do_sendfile+0x73f/0x1060 fs/read_write.c:1401
>  [< inline >] SYSC_sendfile64 fs/read_write.c:1456
>  [] SyS_sendfile64+0xee/0x1a0 fs/read_write.c:1448
>  [] entry_SYSCALL_64_fastpath+0x23/0xc6
> arch/x86/entry/entry_64.S:209
> Memory state around the buggy address:
>  880062c6ef00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>  880062c6ef80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>880062c6f000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>^
>  880062c6f080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>  880062c6f100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> ==
> Disabling lock debugging due to kernel taint
> BUG: unable to handle kernel paging request at 880062c6f000
> IP: [] __memcpy+0x12/0x20 arch/x86/lib/memcpy_64.S:37
> PGD c53d067 [  494.351750] PUD c540067
> PMD 7fdea067 [  494.351750] PTE 800062c6f060
>
> Oops:  [#1] SMP DEBUG_PAGEALLOC KASAN
> Modules linked in:
> CPU: 1 PID: 8529 Comm: a.out Tainted: GB   4.9.0-rc6+ #55
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> task: 88003c54e3c0 task.stack: 880039ca
> RIP: 0010:[]  []
> __memcpy+0x12/0x20 arch/x86/lib/memcpy_64.S:37
> RSP: 0018:880039ca6838  EFLAGS: 00010246
> RAX: 880031b99000 RBX: 1000 RCX: 0006
> RDX:  RSI: 880062c6effa RDI: 880031b99fd0
> RBP: 880039ca6858 R08:  R09: 
> R10: 0200 R11: ed00063733ff R12: 880031b99000
> R13: 880062c6e02a R14: 880039ca6f70 R15: 880039ca6d18
> FS:  7f7a78013700() GS:88003ed0() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2: 880062c6f000 CR3: 6411f000 CR4: 06e0
> Stack:
>  819f0a65 880039ca71a0 1000 2000
>  880039ca6d40 83525d10 41b58ab3 89576240
>  0001  41b58ab3 894d07b0
> Call Trace:
>  [] copy_from_iter+0xf30/0x15e0 lib/iov_iter.c:559
>  [] copy_page_from_iter+0x474/0x860 lib/iov_iter.c:614
>  [< inline >] 

scsi: use-after-free in bio_copy_from_iter

2016-11-25 Thread Dmitry Vyukov
Hello,

The following program triggers use-after-free in bio_copy_from_iter:
https://gist.githubusercontent.com/dvyukov/80cd94b4e4c288f16ee4c787d404118b/raw/10536069562444da51b758bb39655b514ff93b45/gistfile1.txt


==
BUG: KASAN: use-after-free in copy_from_iter+0xf30/0x15e0 at addr
880062c6e02a
Read of size 4096 by task a.out/8529
page:ea00018b1b80 count:2 mapcount:0 mapping:88006c80e9d0 index:0x1695
flags: 0x5fffc000864(referenced|lru|active|private)
page dumped because: kasan: bad access detected
page->mem_cgroup:88003ebcea00
CPU: 1 PID: 8529 Comm: a.out Not tainted 4.9.0-rc6+ #55
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
 880039ca6770 834c3bb9 0001 110007394c81
 ed0007394c79 41b58ab3 89575c30 834c38cb
   0001 
Call Trace:
 [< inline >] __dump_stack lib/dump_stack.c:15
 [] dump_stack+0x2ee/0x3f5 lib/dump_stack.c:51
 [< inline >] print_address_description mm/kasan/report.c:211
 [< inline >] kasan_report_error mm/kasan/report.c:285
 [] kasan_report+0x418/0x440 mm/kasan/report.c:305
 [< inline >] check_memory_region_inline mm/kasan/kasan.c:308
 [] check_memory_region+0x139/0x190 mm/kasan/kasan.c:315
 [] memcpy+0x23/0x50 mm/kasan/kasan.c:350
 [] copy_from_iter+0xf30/0x15e0 lib/iov_iter.c:559
 [] copy_page_from_iter+0x474/0x860 lib/iov_iter.c:614
 [< inline >] bio_copy_from_iter block/bio.c:1025
 [] bio_copy_user_iov+0xb50/0xf10 block/bio.c:1226
 [< inline >] __blk_rq_map_user_iov block/blk-map.c:56
 [] blk_rq_map_user_iov+0x295/0x930 block/blk-map.c:130
 [] blk_rq_map_user+0x139/0x1e0 block/blk-map.c:159
 [< inline >] sg_start_req drivers/scsi/sg.c:1757
 [] sg_common_write.isra.20+0x12da/0x1b20
drivers/scsi/sg.c:772
 [] sg_write+0x78a/0xda0 drivers/scsi/sg.c:675
 [] __vfs_write+0x65e/0x830 fs/read_write.c:510
 [] __kernel_write+0xec/0x340 fs/read_write.c:532
 [] write_pipe_buf+0x19c/0x260 fs/splice.c:814
 [< inline >] splice_from_pipe_feed fs/splice.c:519
 [] __splice_from_pipe+0x31f/0x750 fs/splice.c:643
 [] splice_from_pipe+0x1dc/0x300 fs/splice.c:678
 [] default_file_splice_write+0x45/0x90 fs/splice.c:826
 [< inline >] do_splice_from fs/splice.c:868
 [] direct_splice_actor+0x12a/0x190 fs/splice.c:1035
 [] splice_direct_to_actor+0x2c6/0x840 fs/splice.c:990
 [] do_splice_direct+0x2ce/0x470 fs/splice.c:1078
 [] do_sendfile+0x73f/0x1060 fs/read_write.c:1401
 [< inline >] SYSC_sendfile64 fs/read_write.c:1456
 [] SyS_sendfile64+0xee/0x1a0 fs/read_write.c:1448
 [] entry_SYSCALL_64_fastpath+0x23/0xc6
arch/x86/entry/entry_64.S:209
Memory state around the buggy address:
 880062c6ef00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 880062c6ef80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>880062c6f000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
   ^
 880062c6f080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 880062c6f100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
==
Disabling lock debugging due to kernel taint
BUG: unable to handle kernel paging request at 880062c6f000
IP: [] __memcpy+0x12/0x20 arch/x86/lib/memcpy_64.S:37
PGD c53d067 [  494.351750] PUD c540067
PMD 7fdea067 [  494.351750] PTE 800062c6f060

Oops:  [#1] SMP DEBUG_PAGEALLOC KASAN
Modules linked in:
CPU: 1 PID: 8529 Comm: a.out Tainted: GB   4.9.0-rc6+ #55
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
task: 88003c54e3c0 task.stack: 880039ca
RIP: 0010:[]  []
__memcpy+0x12/0x20 arch/x86/lib/memcpy_64.S:37
RSP: 0018:880039ca6838  EFLAGS: 00010246
RAX: 880031b99000 RBX: 1000 RCX: 0006
RDX:  RSI: 880062c6effa RDI: 880031b99fd0
RBP: 880039ca6858 R08:  R09: 
R10: 0200 R11: ed00063733ff R12: 880031b99000
R13: 880062c6e02a R14: 880039ca6f70 R15: 880039ca6d18
FS:  7f7a78013700() GS:88003ed0() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 880062c6f000 CR3: 6411f000 CR4: 06e0
Stack:
 819f0a65 880039ca71a0 1000 2000
 880039ca6d40 83525d10 41b58ab3 89576240
 0001  41b58ab3 894d07b0
Call Trace:
 [] copy_from_iter+0xf30/0x15e0 lib/iov_iter.c:559
 [] copy_page_from_iter+0x474/0x860 lib/iov_iter.c:614
 [< inline >] bio_copy_from_iter block/bio.c:1025
 [] bio_copy_user_iov+0xb50/0xf10 block/bio.c:1226
 [< inline >] __blk_rq_map_user_iov block/blk-map.c:56
 [] blk_rq_map_user_iov+0x295/0x930 block/blk-map.c:130
 [] blk_rq_map_user+0x139/0x1e0