Re: [PATCH v2] [SCSI] sg: Fix user memory corruption when SG_IO is interrupted by a signal

2013-08-15 Thread Roland Dreier
Jens / James, do you guys plan to send this to Linus for 3.11?
Triggering this bug is a bit esoteric but the impact is pretty nasty
(corrupting an unrelated process).

Thanks,
  Roland
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] [SCSI] sg: Fix user memory corruption when SG_IO is interrupted by a signal

2013-08-15 Thread Douglas Gilbert

On 13-08-15 12:45 PM, Roland Dreier wrote:

Jens / James, do you guys plan to send this to Linus for 3.11?
Triggering this bug is a bit esoteric but the impact is pretty nasty
(corrupting an unrelated process).


The patch is fine with me. Even though the sg driver is
named in the patch title, I note that the v2 patch is
only against the block layer. Hence it does not need an
ack from me.

Doug Gilbert


--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] [SCSI] sg: Fix user memory corruption when SG_IO is interrupted by a signal

2013-08-08 Thread Roland Dreier
On Wed, Aug 7, 2013 at 9:31 AM, Douglas Gilbert dgilb...@interlog.com wrote:
 So what kind of signal was leading to your stomping on the memory?
 Was it user generated or something like SIGIO, SIGPIPE or a RT signal?

It was sometimes SIGHUP (for reopening log files) and sometimes
SIGALARM (for various periodic things).

 To get around the SG_IO ioctl restart problem (for non idempotent
 SCSI commands) could we replace a -ERESTARTSYS return value
 with -EINTR ?

 As I noted in a previous post, for robust user space code using the
 SG_IO ioctl, masking signals during the IO may help.

Yes, absolutely.  But process A should be able to keep its memory
uncorrupted even if process B is coded wrong :)

 And what about bsg? Is it any better or worse than sg in the case
 of interrupted SG_IO ioctls? Apart from the interface (sg_io_hdr
 v3 versus v4) it should be a drop in replacement for sg.

As far as I can tell bsg looks much better w.r.t. signals -- I don't
see anywhere that it schedules work onto a workqueue or other kernel
thread, and it looks like the SG_IO ioctl there actually has nowhere
that checks for signals.  All sleeps will be uninterruptible, which I
guess may be better or worse depending on your perspective.

 - R.
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] [SCSI] sg: Fix user memory corruption when SG_IO is interrupted by a signal

2013-08-07 Thread David Milburn

Roland Dreier wrote:

From: Roland Dreier rol...@purestorage.com

There is a nasty bug in the SCSI SG_IO ioctl that in some circumstances
leads to one process writing data into the address space of some other
random unrelated process if the ioctl is interrupted by a signal.
What happens is the following:

 - A process issues an SG_IO ioctl with direction DXFER_FROM_DEV (ie the
   underlying SCSI command will transfer data from the SCSI device to
   the buffer provided in the ioctl)

 - Before the command finishes, a signal is sent to the process waiting
   in the ioctl.  This will end up waking up the sg_ioctl() code:

result = wait_event_interruptible(sfp-read_wait,
(srp_done(sfp, srp) || sdp-detached));

   but neither srp_done() nor sdp-detached is true, so we end up just
   setting srp-orphan and returning to userspace:

srp-orphan = 1;
write_unlock_irq(sfp-rq_list_lock);
return result;  /* -ERESTARTSYS because signal hit process */

   At this point the original process is done with the ioctl and
   blithely goes ahead handling the signal, reissuing the ioctl, etc.

 - Eventually, the SCSI command issued by the first ioctl finishes and
   ends up in sg_rq_end_io().  At the end of that function, we run through:

write_lock_irqsave(sfp-rq_list_lock, iflags);
if (unlikely(srp-orphan)) {
if (sfp-keep_orphan)
srp-sg_io_owned = 0;
else
done = 0;
}
srp-done = done;
write_unlock_irqrestore(sfp-rq_list_lock, iflags);

if (likely(done)) {
/* Now wake up any sg_read() that is waiting for this
 * packet.
 */
wake_up_interruptible(sfp-read_wait);
kill_fasync(sfp-async_qp, SIGPOLL, POLL_IN);
kref_put(sfp-f_ref, sg_remove_sfp);
} else {
INIT_WORK(srp-ew.work, sg_rq_end_io_usercontext);
schedule_work(srp-ew.work);
}

   Since srp-orphan *is* set, we set done to 0 (assuming the
   userspace app has not set keep_orphan via an SG_SET_KEEP_ORPHAN
   ioctl), and therefore we end up scheduling sg_rq_end_io_usercontext()
   to run in a workqueue.

 - In workqueue context we go through sg_rq_end_io_usercontext() -
   sg_finish_rem_req() - blk_rq_unmap_user() - ... -
   bio_uncopy_user() - __bio_copy_iov() - copy_to_user().

   The key point here is that we are doing copy_to_user() on a
   workqueue -- that is, we're on a kernel thread with current-mm
   equal to whatever random previous user process was scheduled before
   this kernel thread.  So we end up copying whatever data the SCSI
   command returned to the virtual address of the buffer passed into
   the original ioctl, but it's quite likely we do this copying into a
   different address space!

As suggested by James Bottomley james.bottom...@hansenpartnership.com,
add a check for current-mm (which is NULL if we're on a kernel thread
without a real userspace address space) in bio_uncopy_user(), and skip
the copy if we're on a kernel thread.

There's no reason that I can think of for any caller of bio_uncopy_user()
to want to do copying on a kernel thread with a random active userspace
address space.

Huge thanks to Costa Sapuntzakis co...@purestorage.com for the
original pointer to this bug in the sg code.

Signed-off-by: Roland Dreier rol...@purestorage.com
Cc: sta...@vger.kernel.org
---
 fs/bio.c | 20 +++-
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/fs/bio.c b/fs/bio.c
index 94bbc04..c5eae72 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -1045,12 +1045,22 @@ static int __bio_copy_iov(struct bio *bio, struct 
bio_vec *iovecs,
 int bio_uncopy_user(struct bio *bio)
 {
struct bio_map_data *bmd = bio-bi_private;
-   int ret = 0;
+   struct bio_vec *bvec;
+   int ret = 0, i;
 
-	if (!bio_flagged(bio, BIO_NULL_MAPPED))

-   ret = __bio_copy_iov(bio, bmd-iovecs, bmd-sgvecs,
-bmd-nr_sgvecs, bio_data_dir(bio) == READ,
-0, bmd-is_our_pages);
+   if (!bio_flagged(bio, BIO_NULL_MAPPED)) {
+   /*
+* if we're in a workqueue, the request is orphaned, so
+* don't copy into a random user address space, just free.
+*/
+   if (current-mm)
+   ret = __bio_copy_iov(bio, bmd-iovecs, bmd-sgvecs,
+bmd-nr_sgvecs, bio_data_dir(bio) 
== READ,
+0, bmd-is_our_pages);
+   else if (bmd-is_our_pages)
+   bio_for_each_segment_all(bvec, bio, i)
+   __free_page(bvec-bv_page);
+   }
bio_free_map_data(bmd);
bio_put(bio);
return ret;


Hi Roland,

I 

Re: [PATCH v2] [SCSI] sg: Fix user memory corruption when SG_IO is interrupted by a signal

2013-08-07 Thread Roland Dreier
On Wed, Aug 7, 2013 at 7:38 AM, David Milburn dmilb...@redhat.com wrote:
 I was able to succesfully test this patch overnight, I had been experimenting 
 with the
 sg driver setting the BIO_NULL_MAPPED flag in sg_rq_end_io_usercontext for a 
 orphan process
 which prevented the corruption, but your solution seems much better.

Very cool, thanks for the testing.

I actually looked at using BIO_NULL_MAPPED as well, but it seemed a
bit too fragile to me -- it had the right effect of skipping
__bio_copy_iov(), and skipping the __free_pages() stuff in there is OK
because sg owns its pages rather than the bio layer, but all that
seemed vulnerable to being broken by an unrelated change.

Out of curiousity, were you already working on this bug?  Because if
you had fixed it a few weeks earlier we might not have spent so long
wondering WTF was stomping on the memory of one of our processes :)

 - R.
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] [SCSI] sg: Fix user memory corruption when SG_IO is interrupted by a signal

2013-08-07 Thread David Milburn

Roland Dreier wrote:

On Wed, Aug 7, 2013 at 7:38 AM, David Milburn dmilb...@redhat.com wrote:

I was able to succesfully test this patch overnight, I had been experimenting 
with the
sg driver setting the BIO_NULL_MAPPED flag in sg_rq_end_io_usercontext for a 
orphan process
which prevented the corruption, but your solution seems much better.


Very cool, thanks for the testing.

I actually looked at using BIO_NULL_MAPPED as well, but it seemed a
bit too fragile to me -- it had the right effect of skipping
__bio_copy_iov(), and skipping the __free_pages() stuff in there is OK
because sg owns its pages rather than the bio layer, but all that
seemed vulnerable to being broken by an unrelated change.

Out of curiousity, were you already working on this bug?  Because if
you had fixed it a few weeks earlier we might not have spent so long
wondering WTF was stomping on the memory of one of our processes :)



Hi Roland,

Actually, I was waiting for confirmation from the field which I
recently received, I was getting ready to bring this up on linux-scsi,
sorry I should have brought it up sooner. I wasn't positive that setting
BIO_NULL_MAPPED flag from sg driver was the fix. David Jeffery
came up with a reproducer which I ran overnight on the latest
upstream kernel with your patch.

Thanks,
David



--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] [SCSI] sg: Fix user memory corruption when SG_IO is interrupted by a signal

2013-08-07 Thread Douglas Gilbert

On 13-08-07 11:50 AM, Roland Dreier wrote:

On Wed, Aug 7, 2013 at 7:38 AM, David Milburn dmilb...@redhat.com wrote:

I was able to succesfully test this patch overnight, I had been experimenting 
with the
sg driver setting the BIO_NULL_MAPPED flag in sg_rq_end_io_usercontext for a 
orphan process
which prevented the corruption, but your solution seems much better.


Very cool, thanks for the testing.

I actually looked at using BIO_NULL_MAPPED as well, but it seemed a
bit too fragile to me -- it had the right effect of skipping
__bio_copy_iov(), and skipping the __free_pages() stuff in there is OK
because sg owns its pages rather than the bio layer, but all that
seemed vulnerable to being broken by an unrelated change.

Out of curiousity, were you already working on this bug?  Because if
you had fixed it a few weeks earlier we might not have spent so long
wondering WTF was stomping on the memory of one of our processes :)


Roland,
So what kind of signal was leading to your stomping on the memory?
Was it user generated or something like SIGIO, SIGPIPE or a RT signal?

To get around the SG_IO ioctl restart problem (for non idempotent
SCSI commands) could we replace a -ERESTARTSYS return value
with -EINTR ?

As I noted in a previous post, for robust user space code using the
SG_IO ioctl, masking signals during the IO may help.


And what about bsg? Is it any better or worse than sg in the case
of interrupted SG_IO ioctls? Apart from the interface (sg_io_hdr
v3 versus v4) it should be a drop in replacement for sg.

Doug Gilbert


--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2] [SCSI] sg: Fix user memory corruption when SG_IO is interrupted by a signal

2013-08-05 Thread Roland Dreier
From: Roland Dreier rol...@purestorage.com

There is a nasty bug in the SCSI SG_IO ioctl that in some circumstances
leads to one process writing data into the address space of some other
random unrelated process if the ioctl is interrupted by a signal.
What happens is the following:

 - A process issues an SG_IO ioctl with direction DXFER_FROM_DEV (ie the
   underlying SCSI command will transfer data from the SCSI device to
   the buffer provided in the ioctl)

 - Before the command finishes, a signal is sent to the process waiting
   in the ioctl.  This will end up waking up the sg_ioctl() code:

result = wait_event_interruptible(sfp-read_wait,
(srp_done(sfp, srp) || sdp-detached));

   but neither srp_done() nor sdp-detached is true, so we end up just
   setting srp-orphan and returning to userspace:

srp-orphan = 1;
write_unlock_irq(sfp-rq_list_lock);
return result;  /* -ERESTARTSYS because signal hit process */

   At this point the original process is done with the ioctl and
   blithely goes ahead handling the signal, reissuing the ioctl, etc.

 - Eventually, the SCSI command issued by the first ioctl finishes and
   ends up in sg_rq_end_io().  At the end of that function, we run through:

write_lock_irqsave(sfp-rq_list_lock, iflags);
if (unlikely(srp-orphan)) {
if (sfp-keep_orphan)
srp-sg_io_owned = 0;
else
done = 0;
}
srp-done = done;
write_unlock_irqrestore(sfp-rq_list_lock, iflags);

if (likely(done)) {
/* Now wake up any sg_read() that is waiting for this
 * packet.
 */
wake_up_interruptible(sfp-read_wait);
kill_fasync(sfp-async_qp, SIGPOLL, POLL_IN);
kref_put(sfp-f_ref, sg_remove_sfp);
} else {
INIT_WORK(srp-ew.work, sg_rq_end_io_usercontext);
schedule_work(srp-ew.work);
}

   Since srp-orphan *is* set, we set done to 0 (assuming the
   userspace app has not set keep_orphan via an SG_SET_KEEP_ORPHAN
   ioctl), and therefore we end up scheduling sg_rq_end_io_usercontext()
   to run in a workqueue.

 - In workqueue context we go through sg_rq_end_io_usercontext() -
   sg_finish_rem_req() - blk_rq_unmap_user() - ... -
   bio_uncopy_user() - __bio_copy_iov() - copy_to_user().

   The key point here is that we are doing copy_to_user() on a
   workqueue -- that is, we're on a kernel thread with current-mm
   equal to whatever random previous user process was scheduled before
   this kernel thread.  So we end up copying whatever data the SCSI
   command returned to the virtual address of the buffer passed into
   the original ioctl, but it's quite likely we do this copying into a
   different address space!

As suggested by James Bottomley james.bottom...@hansenpartnership.com,
add a check for current-mm (which is NULL if we're on a kernel thread
without a real userspace address space) in bio_uncopy_user(), and skip
the copy if we're on a kernel thread.

There's no reason that I can think of for any caller of bio_uncopy_user()
to want to do copying on a kernel thread with a random active userspace
address space.

Huge thanks to Costa Sapuntzakis co...@purestorage.com for the
original pointer to this bug in the sg code.

Signed-off-by: Roland Dreier rol...@purestorage.com
Cc: sta...@vger.kernel.org
---
 fs/bio.c | 20 +++-
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/fs/bio.c b/fs/bio.c
index 94bbc04..c5eae72 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -1045,12 +1045,22 @@ static int __bio_copy_iov(struct bio *bio, struct 
bio_vec *iovecs,
 int bio_uncopy_user(struct bio *bio)
 {
struct bio_map_data *bmd = bio-bi_private;
-   int ret = 0;
+   struct bio_vec *bvec;
+   int ret = 0, i;
 
-   if (!bio_flagged(bio, BIO_NULL_MAPPED))
-   ret = __bio_copy_iov(bio, bmd-iovecs, bmd-sgvecs,
-bmd-nr_sgvecs, bio_data_dir(bio) == READ,
-0, bmd-is_our_pages);
+   if (!bio_flagged(bio, BIO_NULL_MAPPED)) {
+   /*
+* if we're in a workqueue, the request is orphaned, so
+* don't copy into a random user address space, just free.
+*/
+   if (current-mm)
+   ret = __bio_copy_iov(bio, bmd-iovecs, bmd-sgvecs,
+bmd-nr_sgvecs, bio_data_dir(bio) 
== READ,
+0, bmd-is_our_pages);
+   else if (bmd-is_our_pages)
+   bio_for_each_segment_all(bvec, bio, i)
+   __free_page(bvec-bv_page);
+   }
bio_free_map_data(bmd);
bio_put(bio);
return ret;
-- 
1.8.3.2

--
To unsubscribe from