Re: SG does not ignore dxferp (direct io + mmap)

2016-12-02 Thread Ewan D. Milne
On Thu, 2016-12-01 at 08:40 -0500, Martin K. Petersen wrote:
> > "Ewan" == Ewan D Milne  writes:
...
> Specifically, the problem appears to be caused by the removal of
> the setting of bio->bi_bdev, which would previously be set to NULL.
> If I add:
> 
> diff --git a/block/bio.c b/block/bio.c
> index 0723d4c..ecac37b 100644
> --- a/block/bio.c
> +++ b/block/bio.c
> @@ -1351,6 +1351,7 @@ struct bio *bio_map_user_iov(struct request_queue
> *q,
> if (iter->type & WRITE)
> bio->bi_rw |= REQ_WRITE;
>  
> +   bio->bi_bdev = NULL;
> bio->bi_flags |= (1 << BIO_USER_MAPPED);
>  
> /*
> 
> Ewan> The test passes (no zero byte corruption).
> 

Adding this seemed to work on top of the commit commit id mentioned
above (37f19e57a0de3c4a3417aa13ff4d04f1e0dee4b3)
but does not help on 4.7, because the test case now takes the
bio_copy_user_iov() code path rather than bio_map_user_iov().

-Ewan



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: SG does not ignore dxferp (direct io + mmap)

2016-12-02 Thread Ewan D. Milne
On Fri, 2016-12-02 at 15:10 +0100, Hannes Reinecke wrote:
> On 12/02/2016 02:29 PM, Ewan D. Milne wrote:
> > On Fri, 2016-12-02 at 04:21 -0800, Christoph Hellwig wrote:
> >> On Thu, Dec 01, 2016 at 08:40:31AM -0500, Martin K. Petersen wrote:
> >>> Specifically, the problem appears to be caused by the removal of
> >>> the setting of bio->bi_bdev, which would previously be set to NULL.
> >>> If I add:
> >>
> >> Very odd.  For one I would expect it to be NULL anyway, second
> >> I don't see why the behavior changed.  But given that this reverts
> >> to the original assignment and makes things work I'll happily hack it
> >> to get things working again:
> >>
> >> Acked-by: Christoph Hellwig 
> >
> > Yeah, I'm not sure I understand this either, apart from the change
> > adjusting the code to effectively do what it used to and making the
> > test case work.  I'm reluctant to cc: stable yet, let me look at this
> > a bit more and I'll post the actual patch soon.
> >
> Plus we found that this is basically a timing issue; we've found that 
> supposedly fixed bugs will crop up after ~4k iterations.
> (Johannes did a _lot_ of testing here :-)
> So just because the bug failed to materialize can also mean that you 
> simply didn't test long enough.
> 
Yes, and following the code paths it isn't completely clear how this
leads to the single zero-byte corruption, I am continuing to investigate.
There may very well be more than one problem.

On kernel versions I tested where I got a failure it was a solid
failure, it never worked no matter how many times I tried, but I
did not exhaustively test apparently successful kernel versions.
Not thousands, of times, anyway.

-Ewan


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: SG does not ignore dxferp (direct io + mmap)

2016-12-02 Thread Laurence Oberman


- Original Message -
> From: "Hannes Reinecke" <h...@suse.de>
> To: emi...@redhat.com, "Christoph Hellwig" <h...@infradead.org>
> Cc: "Martin K. Petersen" <martin.peter...@oracle.com>, "Johannes Thumshirn" 
> <jthumsh...@suse.de>, "Laurence Oberman"
> <lober...@redhat.com>, "Eyal Ben David" <bde...@gmail.com>, 
> dgilb...@interlog.com, linux-scsi@vger.kernel.org
> Sent: Friday, December 2, 2016 9:10:01 AM
> Subject: Re: SG does not ignore dxferp (direct io + mmap)
> 
> On 12/02/2016 02:29 PM, Ewan D. Milne wrote:
> > On Fri, 2016-12-02 at 04:21 -0800, Christoph Hellwig wrote:
> >> On Thu, Dec 01, 2016 at 08:40:31AM -0500, Martin K. Petersen wrote:
> >>> Specifically, the problem appears to be caused by the removal of
> >>> the setting of bio->bi_bdev, which would previously be set to NULL.
> >>> If I add:
> >>
> >> Very odd.  For one I would expect it to be NULL anyway, second
> >> I don't see why the behavior changed.  But given that this reverts
> >> to the original assignment and makes things work I'll happily hack it
> >> to get things working again:
> >>
> >> Acked-by: Christoph Hellwig <h...@lst.de>
> >
> > Yeah, I'm not sure I understand this either, apart from the change
> > adjusting the code to effectively do what it used to and making the
> > test case work.  I'm reluctant to cc: stable yet, let me look at this
> > a bit more and I'll post the actual patch soon.
> >
> Plus we found that this is basically a timing issue; we've found that
> supposedly fixed bugs will crop up after ~4k iterations.
> (Johannes did a _lot_ of testing here :-)
> So just because the bug failed to materialize can also mean that you
> simply didn't test long enough.
> 
> Cheers,
> 
> Hannes
> --
> Dr. Hannes Reinecke  Teamlead Storage & Networking
> h...@suse.de +49 911 74053 688
> SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
> GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
> HRB 21284 (AG Nürnberg)
> 
Hannes, 
Just to clarify, my original test reverting  commit 
3fa6c507319c897598512da91c010a4ad2ed682c,
what Ewan originally bisected to and running > 10 iterations never failed.

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: SG does not ignore dxferp (direct io + mmap)

2016-12-02 Thread Hannes Reinecke

On 12/02/2016 02:29 PM, Ewan D. Milne wrote:

On Fri, 2016-12-02 at 04:21 -0800, Christoph Hellwig wrote:

On Thu, Dec 01, 2016 at 08:40:31AM -0500, Martin K. Petersen wrote:

Specifically, the problem appears to be caused by the removal of
the setting of bio->bi_bdev, which would previously be set to NULL.
If I add:


Very odd.  For one I would expect it to be NULL anyway, second
I don't see why the behavior changed.  But given that this reverts
to the original assignment and makes things work I'll happily hack it
to get things working again:

Acked-by: Christoph Hellwig 


Yeah, I'm not sure I understand this either, apart from the change
adjusting the code to effectively do what it used to and making the
test case work.  I'm reluctant to cc: stable yet, let me look at this
a bit more and I'll post the actual patch soon.

Plus we found that this is basically a timing issue; we've found that 
supposedly fixed bugs will crop up after ~4k iterations.

(Johannes did a _lot_ of testing here :-)
So just because the bug failed to materialize can also mean that you 
simply didn't test long enough.


Cheers,

Hannes
--
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: SG does not ignore dxferp (direct io + mmap)

2016-12-02 Thread Ewan D. Milne
On Fri, 2016-12-02 at 04:21 -0800, Christoph Hellwig wrote:
> On Thu, Dec 01, 2016 at 08:40:31AM -0500, Martin K. Petersen wrote:
> > Specifically, the problem appears to be caused by the removal of
> > the setting of bio->bi_bdev, which would previously be set to NULL.
> > If I add:
> 
> Very odd.  For one I would expect it to be NULL anyway, second
> I don't see why the behavior changed.  But given that this reverts
> to the original assignment and makes things work I'll happily hack it
> to get things working again:
> 
> Acked-by: Christoph Hellwig 

Yeah, I'm not sure I understand this either, apart from the change
adjusting the code to effectively do what it used to and making the
test case work.  I'm reluctant to cc: stable yet, let me look at this
a bit more and I'll post the actual patch soon.

-Ewan


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: SG does not ignore dxferp (direct io + mmap)

2016-12-02 Thread Christoph Hellwig
On Thu, Dec 01, 2016 at 08:40:31AM -0500, Martin K. Petersen wrote:
> Specifically, the problem appears to be caused by the removal of
> the setting of bio->bi_bdev, which would previously be set to NULL.
> If I add:

Very odd.  For one I would expect it to be NULL anyway, second
I don't see why the behavior changed.  But given that this reverts
to the original assignment and makes things work I'll happily hack it
to get things working again:

Acked-by: Christoph Hellwig 
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: SG does not ignore dxferp (direct io + mmap)

2016-12-01 Thread Martin K. Petersen
> "Ewan" == Ewan D Milne  writes:

>> I think what we need to understand is what caused the regression in
>> the first place, I probably should have been bisecting the original
>> failure rather than trying to find where it started working.

Ewan> Bisecting leads to this commit:

commit 37f19e57a0de3c4a3417aa13ff4d04f1e0dee4b3
Author: Christoph Hellwig 
Date:   Sun Jan 18 16:16:33 2015 +0100

block: merge __bio_map_user_iov into bio_map_user_iov

And also remove the unused bdev argument.

Signed-off-by: Christoph Hellwig 
Reviewed-by: Ming Lei 
Signed-off-by: Jens Axboe 

Specifically, the problem appears to be caused by the removal of
the setting of bio->bi_bdev, which would previously be set to NULL.
If I add:

diff --git a/block/bio.c b/block/bio.c
index 0723d4c..ecac37b 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1351,6 +1351,7 @@ struct bio *bio_map_user_iov(struct request_queue
*q,
if (iter->type & WRITE)
bio->bi_rw |= REQ_WRITE;
 
+   bio->bi_bdev = NULL;
bio->bi_flags |= (1 << BIO_USER_MAPPED);
 
/*

Ewan> The test passes (no zero byte corruption).

Ewan> Setting dxferp would cause map_data.null_mapped to be set before
Ewan> it is passed to blk_rq_map_user(_iov) which would cause a
Ewan> difference in behavior.

Christoph?

-- 
Martin K. Petersen  Oracle Linux Engineering
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: SG does not ignore dxferp (direct io + mmap)

2016-11-30 Thread Ewan D. Milne
On Fri, 2016-11-25 at 12:56 -0500, Ewan Milne wrote:
> I think what we need to understand is what caused the regression in the
> first place, I probably should have been bisecting the original failure
> rather than trying to find where it started working.
> 

Bisecting leads to this commit:

commit 37f19e57a0de3c4a3417aa13ff4d04f1e0dee4b3
Author: Christoph Hellwig 
Date:   Sun Jan 18 16:16:33 2015 +0100

block: merge __bio_map_user_iov into bio_map_user_iov

And also remove the unused bdev argument.

Signed-off-by: Christoph Hellwig 
Reviewed-by: Ming Lei 
Signed-off-by: Jens Axboe 

Specifically, the problem appears to be caused by the removal of
the setting of bio->bi_bdev, which would previously be set to NULL.
If I add:

diff --git a/block/bio.c b/block/bio.c
index 0723d4c..ecac37b 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1351,6 +1351,7 @@ struct bio *bio_map_user_iov(struct request_queue
*q,
if (iter->type & WRITE)
bio->bi_rw |= REQ_WRITE;
 
+   bio->bi_bdev = NULL;
bio->bi_flags |= (1 << BIO_USER_MAPPED);
 
/*

The test passes (no zero byte corruption).

Setting dxferp would cause map_data.null_mapped to be set before it
is passed to blk_rq_map_user(_iov) which would cause a difference in
behavior.

-Ewan



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: SG does not ignore dxferp (direct io + mmap)

2016-11-28 Thread Johannes Thumshirn
On Fri, Nov 25, 2016 at 09:46:05AM -0500, Laurence Oberman wrote:

[...]

> 
> Johannes, you are reproducing another race in your test I think.
> 

This might actually be true. Disclaimer I'm hauntiung that one for quite some
time and now have a reproducer for it not just a downstream bug report.

I was tracing it down to the exact opposite of this KASAN report [1] which came
in over the weekend, so I'll be looking into this one as well now.

[1] http://www.spinics.net/lists/linux-scsi/msg102232.html

Byte,
Johannes

-- 
Johannes Thumshirn  Storage
jthumsh...@suse.de+49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: SG does not ignore dxferp (direct io + mmap)

2016-11-25 Thread Laurence Oberman


- Original Message -
> From: "Ewan Milne" <emi...@redhat.com>
> To: "Johannes Thumshirn" <jthumsh...@suse.de>
> Cc: "Laurence Oberman" <lober...@redhat.com>, "Eyal Ben David" 
> <bde...@gmail.com>, dgilb...@interlog.com,
> linux-scsi@vger.kernel.org
> Sent: Friday, November 25, 2016 12:56:16 PM
> Subject: Re: SG does not ignore dxferp (direct io + mmap)
> 
> >> ---
> >> 
> >> In other words, this commit made the bad behavior go away in 4.8.
> >> Need to look at this in more detail, it doesn't appear as if this patch
> >> was intended to fix such a problem.
> >> 
> >> -Ewan
> >
> >Are you sure it did? I can repropduce copy_to_user() errors with 4.8 as
> >well.
> >Using the very same reproducer. On 4.8 it's just harder to trigger and
> >doesn't trigger on AHCI as fas as I can telli (maybe I just haven't hit
> >it hard enough). I can trigger it on QEMUs SCSI CDROM emulation and hpsa
> >though. I cannot however trigger this with a minimal config inside an
> >initrd.
> 
> It did for Eyal's supplied test case on my machine, but that was not an
> exhaustive test, and I am a little suspicious that the behavior change was
> due to a side-effect of the patch rather than actually fixing something.
> 
> I think what we need to understand is what caused the regression in the
> first place, I probably should have been bisecting the original failure
> rather than trying to find where it started working.
> 
> I was running against an internal (physical) drive.
> 
> -Ewan
> 

My 10 loop was against an HPSA target and passed all tests.
Again, all I did was patch the 4.7.9 with the 2 line changes, the rest of the 
patch was line breaks.

I guess we need to understand when it first broke and what caused that, versus 
what seems to correct it.
Thanks
Laurence
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: SG does not ignore dxferp (direct io + mmap)

2016-11-25 Thread Ewan Milne
>> ---
>> 
>> In other words, this commit made the bad behavior go away in 4.8.
>> Need to look at this in more detail, it doesn't appear as if this patch
>> was intended to fix such a problem.
>> 
>> -Ewan
>
>Are you sure it did? I can repropduce copy_to_user() errors with 4.8 as well.
>Using the very same reproducer. On 4.8 it's just harder to trigger and
>doesn't trigger on AHCI as fas as I can telli (maybe I just haven't hit
>it hard enough). I can trigger it on QEMUs SCSI CDROM emulation and hpsa
>though. I cannot however trigger this with a minimal config inside an initrd.

It did for Eyal's supplied test case on my machine, but that was not an
exhaustive test, and I am a little suspicious that the behavior change was
due to a side-effect of the patch rather than actually fixing something.

I think what we need to understand is what caused the regression in the
first place, I probably should have been bisecting the original failure
rather than trying to find where it started working.

I was running against an internal (physical) drive.

-Ewan
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: SG does not ignore dxferp (direct io + mmap)

2016-11-25 Thread Laurence Oberman


- Original Message -
> From: "Eyal Ben David" <bde...@gmail.com>
> To: "Johannes Thumshirn" <jthumsh...@suse.de>
> Cc: "Ewan D. Milne" <emi...@redhat.com>, "Laurence Oberman" 
> <lober...@redhat.com>, dgilb...@interlog.com,
> linux-scsi@vger.kernel.org
> Sent: Friday, November 25, 2016 7:36:34 AM
> Subject: Re: SG does not ignore dxferp (direct io + mmap)
> 
> On Fri, Nov 25, 2016 at 1:53 PM, Johannes Thumshirn <jthumsh...@suse.de>
> wrote:
> > On Fri, Nov 25, 2016 at 01:20:34PM +0200, Eyal Ben David wrote:
> >> Note that sg_mmap_read does not parse the SCSI sense, so the script
> >> might fail for other reasons (some SCSI error) and think its a zero
> >> byte corruption.
> >
> > But SCSI generic checks for errors and returns -EINVAL on CHECK_CONDITION
> > or
> > DRIVER_SENSE (and sets SG_INFO_CHECK in hdr.info).
> >
> Ah OK. We use async write/read instead of ioctl and forgot that ioctl
> checks the read.
> 
> > Anyways, can you test the patch Ewan found on one of your kernel's that are
> > known to fail?
> 
> All the examples I gave before were on physical hosts and storage at a
> testing lab.
> That would be difficult to do. Sorry.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

I applied just the patch myself and Ewan isolated to the 4.7.9 kernel and the 
issue is resolved for me.
I have 10 loops of the test with no issue.

I repeated the test flushing cache each time as well to make sure

for i in `seq 1 10`; do ./sg_mmap_read -d /dev/sg1 -l 0 -m -b | hexdump | 
grep 6300; done

for i in `seq 1 10`; do ./sg_mmap_read -d /dev/sg1 -l 0 -m -b | hexdump | 
grep 6300; echo 3 > /proc/sys/vm/drop_caches;done

Johannes, you are reproducing another race in your test I think.

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: SG does not ignore dxferp (direct io + mmap)

2016-11-25 Thread Eyal Ben David
On Fri, Nov 25, 2016 at 1:53 PM, Johannes Thumshirn  wrote:
> On Fri, Nov 25, 2016 at 01:20:34PM +0200, Eyal Ben David wrote:
>> Note that sg_mmap_read does not parse the SCSI sense, so the script
>> might fail for other reasons (some SCSI error) and think its a zero
>> byte corruption.
>
> But SCSI generic checks for errors and returns -EINVAL on CHECK_CONDITION or
> DRIVER_SENSE (and sets SG_INFO_CHECK in hdr.info).
>
Ah OK. We use async write/read instead of ioctl and forgot that ioctl
checks the read.

> Anyways, can you test the patch Ewan found on one of your kernel's that are
> known to fail?

All the examples I gave before were on physical hosts and storage at a
testing lab.
That would be difficult to do. Sorry.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: SG does not ignore dxferp (direct io + mmap)

2016-11-25 Thread Johannes Thumshirn
On Fri, Nov 25, 2016 at 12:53:17PM +0100, Johannes Thumshirn wrote:
> On Fri, Nov 25, 2016 at 01:20:34PM +0200, Eyal Ben David wrote:
> > Note that sg_mmap_read does not parse the SCSI sense, so the script
> > might fail for other reasons (some SCSI error) and think its a zero
> > byte corruption.
> 
> But SCSI generic checks for errors and returns -EINVAL on CHECK_CONDITION or
> DRIVER_SENSE (and sets SG_INFO_CHECK in hdr.info).
> 
> And:
> VM:~ # ./test.sh 
> FAIL on run 2
> Expect:
> 000 8240 3d1f 8800  0002   
> 020     6e9d 57ac  
> 040        
> *
> 100
> Fail:
> 000        
> *
> 100
> VM:~ # uname -r
> 4.8.9-60-default+
> VM:~ # 
> 
> 
> Anyways, can you test the patch Ewan found on one of your kernel's that are
> known to fail?

Hannes found another interesting aspect, if one sets the allow_dio module
parameter to one and sets the SG_FLAG_DIRECT_IO in the header, it works
reliably.

Byte,
Johannes
-- 
Johannes Thumshirn  Storage
jthumsh...@suse.de+49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: SG does not ignore dxferp (direct io + mmap)

2016-11-25 Thread Johannes Thumshirn
On Fri, Nov 25, 2016 at 01:20:34PM +0200, Eyal Ben David wrote:
> Note that sg_mmap_read does not parse the SCSI sense, so the script
> might fail for other reasons (some SCSI error) and think its a zero
> byte corruption.

But SCSI generic checks for errors and returns -EINVAL on CHECK_CONDITION or
DRIVER_SENSE (and sets SG_INFO_CHECK in hdr.info).

And:
VM:~ # ./test.sh 
FAIL on run 2
Expect:
000 8240 3d1f 8800  0002   
020     6e9d 57ac  
040        
*
100
Fail:
000        
*
100
VM:~ # uname -r
4.8.9-60-default+
VM:~ # 


Anyways, can you test the patch Ewan found on one of your kernel's that are
known to fail?

Byte,
Johannes

-- 
Johannes Thumshirn  Storage
jthumsh...@suse.de+49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: SG does not ignore dxferp (direct io + mmap)

2016-11-25 Thread Eyal Ben David
Note that sg_mmap_read does not parse the SCSI sense, so the script
might fail for other reasons (some SCSI error) and think its a zero
byte corruption.


If you think an improved version could help (compare results within
the program + parse senses) I can help.


On Fri, Nov 25, 2016 at 10:07 AM, Johannes Thumshirn  wrote:
> On Wed, Nov 23, 2016 at 03:22:04PM -0500, Ewan Milne wrote:
>
> [...]
>
>> ---
>>
>> In other words, this commit made the bad behavior go away in 4.8.
>> Need to look at this in more detail, it doesn't appear as if this patch
>> was intended to fix such a problem.
>>
>> -Ewan
>
> Are you sure it did? I can repropduce copy_to_user() errors with 4.8 as well.
> Using the very same reproducer. On 4.8 it's just harder to trigger and
> doesn't trigger on AHCI as fas as I can telli (maybe I just haven't hit
> it hard enough). I can trigger it on QEMUs SCSI CDROM emulation and hpsa
> though. I cannot however trigger this with a minimal config inside an initrd.
>
>
> Here's my test:
>
> --- 8< ---
> #!/bin/sh
>
> cleanup()
> {
> rm -f expect.txt
> rm -f test.txt
> }
>
> trap cleanup EXIT INT
>
> ./sg_mmap_read -d /dev/sg0 -l 0 | od -x > expect.txt
>
> for i in $(seq 1 5000); do
> ./sg_mmap_read -d /dev/sg0 -l 0 | od -x > test.txt
> diff expect.txt test.txt > /dev/null
> if [ $? -ne 0 ]; then
> echo "FAIL"
> exit 1
> fi
> done
>
> echo "PASS"
>
> --- >8 ---
>
> I added some trace_printk()s to sg.c and try to hunt it down. It'll be good to
> know if anyone of you can reproduce it with the above script as well.
>
> Byte,
> Johannes
> --
> Johannes Thumshirn  Storage
> jthumsh...@suse.de+49 911 74053 689
> SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
> GF: Felix Imendörffer, Jane Smithard, Graham Norton
> HRB 21284 (AG Nürnberg)
> Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: SG does not ignore dxferp (direct io + mmap)

2016-11-25 Thread Johannes Thumshirn
On Wed, Nov 23, 2016 at 03:22:04PM -0500, Ewan Milne wrote:

[...]

> ---
> 
> In other words, this commit made the bad behavior go away in 4.8.
> Need to look at this in more detail, it doesn't appear as if this patch
> was intended to fix such a problem.
> 
> -Ewan

Are you sure it did? I can repropduce copy_to_user() errors with 4.8 as well.
Using the very same reproducer. On 4.8 it's just harder to trigger and
doesn't trigger on AHCI as fas as I can telli (maybe I just haven't hit
it hard enough). I can trigger it on QEMUs SCSI CDROM emulation and hpsa
though. I cannot however trigger this with a minimal config inside an initrd.


Here's my test:

--- 8< ---
#!/bin/sh 

cleanup()
{
rm -f expect.txt
rm -f test.txt
}

trap cleanup EXIT INT 

./sg_mmap_read -d /dev/sg0 -l 0 | od -x > expect.txt

for i in $(seq 1 5000); do
./sg_mmap_read -d /dev/sg0 -l 0 | od -x > test.txt
diff expect.txt test.txt > /dev/null
if [ $? -ne 0 ]; then
echo "FAIL"
exit 1
fi
done

echo "PASS"

--- >8 ---

I added some trace_printk()s to sg.c and try to hunt it down. It'll be good to
know if anyone of you can reproduce it with the above script as well.

Byte,
Johannes
-- 
Johannes Thumshirn  Storage
jthumsh...@suse.de+49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: SG does not ignore dxferp (direct io + mmap)

2016-11-23 Thread Ewan D. Milne
On Wed, 2016-11-23 at 13:55 -0500, Laurence Oberman wrote:
> 
> - Original Message -
> > From: "Eyal Ben David" <bde...@gmail.com>
> > To: "Ewan D. Milne" <emi...@redhat.com>
> > Cc: "Johannes Thumshirn" <jthumsh...@suse.de>, dgilb...@interlog.com, 
> > "Laurence Oberman" <lober...@redhat.com>,
> > linux-scsi@vger.kernel.org
> > Sent: Tuesday, November 22, 2016 3:55:44 PM
> > Subject: Re: SG does not ignore dxferp (direct io + mmap)
> > 
> > On Tue, Nov 22, 2016 at 8:30 PM, Ewan D. Milne <emi...@redhat.com> wrote:
> > >
> > > I see the behavior (zero byte) on the 4.4.34, 4.5.7, 4.6.7, and 4.7.10
> > > -stable kernels.  But not (of course) on 4.8.10 -stable.
> > >
> > > It doesn't look like the sg driver, might be something in the mmap code?
> > 
> > 
> > A kernel guy colleague suggested to look at copy_from_user / copy_to_user
> > code.
> > It was changed in 4.8
> > 
> > It was OK with 3.13  (Ubuntu 14.04) but from some kernel (prior or equal to
> > 4.4)
> > until 4.7 we see the bug. It was somehow fixed at 4.8.
> > 
> > In order to fully understand what happened, there are two changes to find.
> > They might not even be related.
> > 
> > Thanks!
> > Eyal
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> > the body of a message to majord...@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > 
> 
> So 4.7.9 fails and 4.8.0 works and 4.8.0 is a rebase so we have 
> 
> [loberman@localhost linux-stable-4.8.10]$ git log --oneline v4.7.9..v4.8 | wc 
> -l
> 14552
> 
> No obvious single commits stand out for me for copy_from* or copy_to*
> There is this:
> 
> 3fa6c50 mm: optimize copy_page_to/from_iter_iovec
> 6e05050 sh: fix copy_from_user()
> e697100 x86/uaccess: force copy_*_user() to be inlined
> 
> I will have to do this the hard way with bisects to figure out which commit 
> addresses this.
> 
> Back when I have had enough time to do it.
> 
> Thanks
> Laurence

Yes, in fact...

Bisecting between 4.7 and 4.8 to see where the behavior changes so that
the zero byte no longer appears leads to this commit:

---
Bisecting: 0 revisions left to test after this (roughly 0 steps)
[3fa6c507319c897598512da91c010a4ad2ed682c] mm: optimize 
copy_page_to/from_iter_iovec
---

commit 3fa6c507319c897598512da91c010a4ad2ed682c
Author: Mikulas Patocka <mpato...@redhat.com>
Date:   Thu Jul 28 15:48:50 2016 -0700

mm: optimize copy_page_to/from_iter_iovec

copy_page_to_iter_iovec() and copy_page_from_iter_iovec() copy some data
to userspace or from userspace.  These functions have a fast path where
they map a page using kmap_atomic and a slow path where they use kmap.

kmap is slower than kmap_atomic, so the fast path is preferred.

However, on kernels without highmem support, kmap just calls
page_address, so there is no need to avoid kmap.  On kernels without
highmem support, the fast path just increases code size (and cache
footprint) and it doesn't improve copy performance in any way.

This patch enables the fast path only if CONFIG_HIGHMEM is defined.

Code size reduced by this patch:
  x86 (without highmem)   928
  x86-64  960
  sparc64 848
  alpha  1136
  pa-risc1200

[a...@linux-foundation.org: use IS_ENABLED(), per Andi]
Link: 
http://lkml.kernel.org/r/alpine.lrh.2.02.1607221711410.4...@file01.intranet.prod.int.rdu2.redhat.com
Signed-off-by: Mikulas Patocka <mpato...@redhat.com>
Cc: Hugh Dickins <hu...@google.com>
Cc: Michal Hocko <mho...@kernel.org>
Cc: Alexander Viro <v...@zeniv.linux.org.uk>
Cc: Mel Gorman <mgor...@suse.de>
Cc: Johannes Weiner <han...@cmpxchg.org>
Cc: Andi Kleen <a...@firstfloor.org>
Signed-off-by: Andrew Morton <a...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torva...@linux-foundation.org>

---

In other words, this commit made the bad behavior go away in 4.8.
Need to look at this in more detail, it doesn't appear as if this patch
was intended to fix such a problem.

-Ewan


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: SG does not ignore dxferp (direct io + mmap)

2016-11-23 Thread Laurence Oberman


- Original Message -
> From: "Eyal Ben David" <bde...@gmail.com>
> To: "Ewan D. Milne" <emi...@redhat.com>
> Cc: "Johannes Thumshirn" <jthumsh...@suse.de>, dgilb...@interlog.com, 
> "Laurence Oberman" <lober...@redhat.com>,
> linux-scsi@vger.kernel.org
> Sent: Tuesday, November 22, 2016 3:55:44 PM
> Subject: Re: SG does not ignore dxferp (direct io + mmap)
> 
> On Tue, Nov 22, 2016 at 8:30 PM, Ewan D. Milne <emi...@redhat.com> wrote:
> >
> > I see the behavior (zero byte) on the 4.4.34, 4.5.7, 4.6.7, and 4.7.10
> > -stable kernels.  But not (of course) on 4.8.10 -stable.
> >
> > It doesn't look like the sg driver, might be something in the mmap code?
> 
> 
> A kernel guy colleague suggested to look at copy_from_user / copy_to_user
> code.
> It was changed in 4.8
> 
> It was OK with 3.13  (Ubuntu 14.04) but from some kernel (prior or equal to
> 4.4)
> until 4.7 we see the bug. It was somehow fixed at 4.8.
> 
> In order to fully understand what happened, there are two changes to find.
> They might not even be related.
> 
> Thanks!
> Eyal
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

So 4.7.9 fails and 4.8.0 works and 4.8.0 is a rebase so we have 

[loberman@localhost linux-stable-4.8.10]$ git log --oneline v4.7.9..v4.8 | wc -l
14552

No obvious single commits stand out for me for copy_from* or copy_to*
There is this:

3fa6c50 mm: optimize copy_page_to/from_iter_iovec
6e05050 sh: fix copy_from_user()
e697100 x86/uaccess: force copy_*_user() to be inlined

I will have to do this the hard way with bisects to figure out which commit 
addresses this.

Back when I have had enough time to do it.

Thanks
Laurence
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: SG does not ignore dxferp (direct io + mmap)

2016-11-22 Thread Eyal Ben David
On Tue, Nov 22, 2016 at 8:30 PM, Ewan D. Milne  wrote:
>
> I see the behavior (zero byte) on the 4.4.34, 4.5.7, 4.6.7, and 4.7.10
> -stable kernels.  But not (of course) on 4.8.10 -stable.
>
> It doesn't look like the sg driver, might be something in the mmap code?


A kernel guy colleague suggested to look at copy_from_user / copy_to_user code.
It was changed in 4.8

It was OK with 3.13  (Ubuntu 14.04) but from some kernel (prior or equal to 4.4)
until 4.7 we see the bug. It was somehow fixed at 4.8.

In order to fully understand what happened, there are two changes to find.
They might not even be related.

Thanks!
Eyal
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: SG does not ignore dxferp (direct io + mmap)

2016-11-22 Thread Laurence Oberman


- Original Message -
> From: "Ewan D. Milne" <emi...@redhat.com>
> To: "Johannes Thumshirn" <jthumsh...@suse.de>
> Cc: dgilb...@interlog.com, "Eyal Ben David" <bde...@gmail.com>, "Laurence 
> Oberman" <lober...@redhat.com>,
> linux-scsi@vger.kernel.org
> Sent: Tuesday, November 22, 2016 1:30:07 PM
> Subject: Re: SG does not ignore dxferp (direct io + mmap)
> 
> On Tue, 2016-11-22 at 09:37 +0100, Johannes Thumshirn wrote:
> > On Mon, Nov 21, 2016 at 01:24:02PM -0500, Ewan Milne wrote:
> > > On Mon, 2016-11-21 at 12:34 -0500, Douglas Gilbert wrote:
> > > > There was also this change which seems closer to the problem area:
> > > > 
> > > > commit 461c7fa126794157484dca48e88effa4963e3af3
> > > > Author: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
> > > > Date:   Tue Feb 2 16:57:35 2016 -0800
> > > > 
> > > >  drivers/scsi/sg.c: mark VMA as VM_IO to prevent migration
> > > > ...
> > > > 
> > > > diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
> > > > index 503ab8b..5e82067 100644
> > > > --- a/drivers/scsi/sg.c
> > > > +++ b/drivers/scsi/sg.c
> > > > @@ -1261,7 +1261,7 @@ sg_mmap(struct file *filp, struct vm_area_struct
> > > > *vma)
> > > >  }
> > > > 
> > > >  sfp->mmap_called = 1;
> > > > -   vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP;
> > > > +   vma->vm_flags |= VM_IO | VM_DONTEXPAND | VM_DONTDUMP;
> > > >  vma->vm_private_data = sfp;
> > > >  vma->vm_ops = _mmap_vm_ops;
> > > >  return 0;
> > > > 
> > > > Doug Gilbert
> > > > 
> > > 
> > > Neither this change nor "sg: fix dxferp in from_to case" appears to
> > > fix the issue when applied on top of 4.4.  Still looking...
> > 
> > This brings bad memories from commit 2d99b55d3 back to live, but this is
> > applied on all test kernels I have.
> > 
> > I too will run some bisection as well now that we have an easy reproducer
> > and
> > my timezone is somewhat ahead. Let's see if we can stretch the workday a
> > bit.
> 
> I see the behavior (zero byte) on the 4.4.34, 4.5.7, 4.6.7, and 4.7.10
> -stable kernels.  But not (of course) on 4.8.10 -stable.
> 
> It doesn't look like the sg driver, might be something in the mmap code?
> 
> -Ewan
> 
> 
> 
> 
> 
> 
I have just re-installed my system here with a much larger /boot partition in 
preparation for the bisect builds.
My prior /boot was too small to hold multiple upstream kernels during the 
bisect.
I expect to have time Friday to Monday to do the bisects and hopefully narrow 
it down to what mmap code changes affect us here.

I will narrow the bisect starting with and 4.7.10 bad and 4.8.10 good.
Thanks
Laurence
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: SG does not ignore dxferp (direct io + mmap)

2016-11-22 Thread Ewan D. Milne
On Tue, 2016-11-22 at 09:37 +0100, Johannes Thumshirn wrote:
> On Mon, Nov 21, 2016 at 01:24:02PM -0500, Ewan Milne wrote:
> > On Mon, 2016-11-21 at 12:34 -0500, Douglas Gilbert wrote:
> > > There was also this change which seems closer to the problem area:
> > > 
> > > commit 461c7fa126794157484dca48e88effa4963e3af3
> > > Author: Kirill A. Shutemov 
> > > Date:   Tue Feb 2 16:57:35 2016 -0800
> > > 
> > >  drivers/scsi/sg.c: mark VMA as VM_IO to prevent migration
> > > ...
> > > 
> > > diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
> > > index 503ab8b..5e82067 100644
> > > --- a/drivers/scsi/sg.c
> > > +++ b/drivers/scsi/sg.c
> > > @@ -1261,7 +1261,7 @@ sg_mmap(struct file *filp, struct vm_area_struct 
> > > *vma)
> > >  }
> > > 
> > >  sfp->mmap_called = 1;
> > > -   vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP;
> > > +   vma->vm_flags |= VM_IO | VM_DONTEXPAND | VM_DONTDUMP;
> > >  vma->vm_private_data = sfp;
> > >  vma->vm_ops = _mmap_vm_ops;
> > >  return 0;
> > > 
> > > Doug Gilbert
> > > 
> > 
> > Neither this change nor "sg: fix dxferp in from_to case" appears to
> > fix the issue when applied on top of 4.4.  Still looking...
> 
> This brings bad memories from commit 2d99b55d3 back to live, but this is
> applied on all test kernels I have.
> 
> I too will run some bisection as well now that we have an easy reproducer and
> my timezone is somewhat ahead. Let's see if we can stretch the workday a bit.

I see the behavior (zero byte) on the 4.4.34, 4.5.7, 4.6.7, and 4.7.10
-stable kernels.  But not (of course) on 4.8.10 -stable.

It doesn't look like the sg driver, might be something in the mmap code?

-Ewan





--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: SG does not ignore dxferp (direct io + mmap)

2016-11-22 Thread Eyal Ben David
> @Eyal is this with a physical or virtual host? And what kind of HBA do you
> have? Just in case it makes a difference.


All physical hosts,

Original problem was detected on qlogic FC HBA (Ubuntu 16.04)

To check if this is transport related, we reproduced the 0 byte
corruption on iSCSI too.

To check if the problem is our user space tool we ran with with iSCSI + tcpdump
and learnt that the problem is probably in the kernel and is not
related to the transport selected.

Later to learn more I ran "sg_mmap_read" on several devices: local
hard disks, FC cards and iSCSI.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: SG does not ignore dxferp (direct io + mmap)

2016-11-22 Thread Johannes Thumshirn
On Tue, Nov 22, 2016 at 10:31:23AM -0500, Laurence Oberman wrote:
[...]

> Its not failing on 4.8-rc2 so working on tracing rgw ioctl SG_DXFER_FROM_DEV 
> as well and will bisect this weekend and/or add some kernel probes.
> Just need to get some time.
> Johannes or Ewan may beat me to it.

Sounds like a good plan. I did work on an automated test setup all day but
couldn't recreate the bug with qemu (and it's default AHCI emulation, will
try virtio-scsi tomorrow).

@Eyal is this with a physical or virtual host? And what kind of HBA do you
have? Just in case it makes a difference.

Byte,
Johannes

-- 
Johannes Thumshirn  Storage
jthumsh...@suse.de+49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: SG does not ignore dxferp (direct io + mmap)

2016-11-22 Thread Laurence Oberman


- Original Message -
> From: "Eyal Ben David" <bde...@gmail.com>
> To: "Johannes Thumshirn" <jthumsh...@suse.de>
> Cc: "Ewan D. Milne" <emi...@redhat.com>, dgilb...@interlog.com, "Laurence 
> Oberman" <lober...@redhat.com>,
> linux-scsi@vger.kernel.org
> Sent: Tuesday, November 22, 2016 8:48:06 AM
> Subject: Re: SG does not ignore dxferp (direct io + mmap)
> 
> Same problem on Fedora 23
> 
> $ uname -r
> 4.7.10-100.fc23.x86_64
> 
> $ sudo ./sg_mmap_read -d /dev/sg0 -l 0 | od -t x1
> 000 eb 63 90 10 8e d0 bc 00 b0 b8 00 00 8e d8 8e c0
> ...
> 
> $ sudo ./sg_mmap_read -d /dev/sg0 -l 0 -m | od -t x1
> 000 eb 63 90 10 8e d0 bc 00 b0 b8 00 00 8e d8 8e c0
> ...
> 
> $ sudo ./sg_mmap_read -d /dev/sg0 -l 0 -m -b | od -t x1
> 000 00 63 90 10 8e d0 bc 00 b0 b8 00 00 8e d8 8e c0
> ...
> 
> 2016-11-22 10:37 GMT+02:00 Johannes Thumshirn <jthumsh...@suse.de>:
> > On Mon, Nov 21, 2016 at 01:24:02PM -0500, Ewan Milne wrote:
> >> On Mon, 2016-11-21 at 12:34 -0500, Douglas Gilbert wrote:
> >> > There was also this change which seems closer to the problem area:
> >> >
> >> > commit 461c7fa126794157484dca48e88effa4963e3af3
> >> > Author: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
> >> > Date:   Tue Feb 2 16:57:35 2016 -0800
> >> >
> >> >  drivers/scsi/sg.c: mark VMA as VM_IO to prevent migration
> >> > ...
> >> >
> >> > diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
> >> > index 503ab8b..5e82067 100644
> >> > --- a/drivers/scsi/sg.c
> >> > +++ b/drivers/scsi/sg.c
> >> > @@ -1261,7 +1261,7 @@ sg_mmap(struct file *filp, struct vm_area_struct
> >> > *vma)
> >> >  }
> >> >
> >> >  sfp->mmap_called = 1;
> >> > -   vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP;
> >> > +   vma->vm_flags |= VM_IO | VM_DONTEXPAND | VM_DONTDUMP;
> >> >  vma->vm_private_data = sfp;
> >> >  vma->vm_ops = _mmap_vm_ops;
> >> >  return 0;
> >> >
> >> > Doug Gilbert
> >> >
> >>
> >> Neither this change nor "sg: fix dxferp in from_to case" appears to
> >> fix the issue when applied on top of 4.4.  Still looking...
> >
> > This brings bad memories from commit 2d99b55d3 back to live, but this is
> > applied on all test kernels I have.
> >
> > I too will run some bisection as well now that we have an easy reproducer
> > and
> > my timezone is somewhat ahead. Let's see if we can stretch the workday a
> > bit.
> >
> > Byte,
> > Johannes
> >
> > --
> > Johannes Thumshirn  Storage
> > jthumsh...@suse.de+49 911 74053 689
> > SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
> > GF: Felix Imendörffer, Jane Smithard, Graham Norton
> > HRB 21284 (AG Nürnberg)
> > Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

Its not failing on 4.8-rc2 so working on tracing rgw ioctl SG_DXFER_FROM_DEV as 
well and will bisect this weekend and/or add some kernel probes.
Just need to get some time.
Johannes or Ewan may beat me to it.

Working RHEL7 

open("/dev/sg1", O_RDWR)= 3
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_SHARED, 3, 0) = 0x7f5645d4a000
ioctl(3, SG_IO, {'S', SG_DXFER_FROM_DEV, cmd[16]=[88, 00, 00, 00, 00, 00, 00, 
00, 00, 00, 00, 00, 00, 01, 00, 00], mx_sb_len=96, iovec_count=0, 
dxfer_len=512, timeout=3, flags=0x4, data[512]=[0x7f5645d4a000], status=00, 
masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=0, 
duration=1, info=0}) = 0
write(1, 0x7f5645d4a000, 64000 63eb 1090 d08e 00bc b8b0  d88e c08e
)= 64
010 befb 7c00 00bf b906 0200 a4f3 21ea 0006
020 be00 07be 0438 0b75 c683 8110 fefe 7507
030 ebf3 b416 b002 bb01 7c00 80b2 748a 8b01
munmap(0x7f5645d4a000, 8192)= 0
close(3)= 0
exit_group(0)   = ?
+++ exited with 0 +++
040

Failing 4.4

open("/dev/sg1", O_RDWR)= 3
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_SHARED, 3, 0) = 0x7f2951de3000
ioctl(3, SG_IO, {'S', SG_DXFER_FROM_DEV, cmd[16]=[88, 00, 00, 00, 00, 00, 00, 
00, 00, 00, 00, 00, 00, 01, 00, 00]

Re: SG does not ignore dxferp (direct io + mmap)

2016-11-22 Thread Eyal Ben David
Same problem on Fedora 23

$ uname -r
4.7.10-100.fc23.x86_64

$ sudo ./sg_mmap_read -d /dev/sg0 -l 0 | od -t x1
000 eb 63 90 10 8e d0 bc 00 b0 b8 00 00 8e d8 8e c0
...

$ sudo ./sg_mmap_read -d /dev/sg0 -l 0 -m | od -t x1
000 eb 63 90 10 8e d0 bc 00 b0 b8 00 00 8e d8 8e c0
...

$ sudo ./sg_mmap_read -d /dev/sg0 -l 0 -m -b | od -t x1
000 00 63 90 10 8e d0 bc 00 b0 b8 00 00 8e d8 8e c0
...

2016-11-22 10:37 GMT+02:00 Johannes Thumshirn :
> On Mon, Nov 21, 2016 at 01:24:02PM -0500, Ewan Milne wrote:
>> On Mon, 2016-11-21 at 12:34 -0500, Douglas Gilbert wrote:
>> > There was also this change which seems closer to the problem area:
>> >
>> > commit 461c7fa126794157484dca48e88effa4963e3af3
>> > Author: Kirill A. Shutemov 
>> > Date:   Tue Feb 2 16:57:35 2016 -0800
>> >
>> >  drivers/scsi/sg.c: mark VMA as VM_IO to prevent migration
>> > ...
>> >
>> > diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
>> > index 503ab8b..5e82067 100644
>> > --- a/drivers/scsi/sg.c
>> > +++ b/drivers/scsi/sg.c
>> > @@ -1261,7 +1261,7 @@ sg_mmap(struct file *filp, struct vm_area_struct 
>> > *vma)
>> >  }
>> >
>> >  sfp->mmap_called = 1;
>> > -   vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP;
>> > +   vma->vm_flags |= VM_IO | VM_DONTEXPAND | VM_DONTDUMP;
>> >  vma->vm_private_data = sfp;
>> >  vma->vm_ops = _mmap_vm_ops;
>> >  return 0;
>> >
>> > Doug Gilbert
>> >
>>
>> Neither this change nor "sg: fix dxferp in from_to case" appears to
>> fix the issue when applied on top of 4.4.  Still looking...
>
> This brings bad memories from commit 2d99b55d3 back to live, but this is
> applied on all test kernels I have.
>
> I too will run some bisection as well now that we have an easy reproducer and
> my timezone is somewhat ahead. Let's see if we can stretch the workday a bit.
>
> Byte,
> Johannes
>
> --
> Johannes Thumshirn  Storage
> jthumsh...@suse.de+49 911 74053 689
> SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
> GF: Felix Imendörffer, Jane Smithard, Graham Norton
> HRB 21284 (AG Nürnberg)
> Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: SG does not ignore dxferp (direct io + mmap)

2016-11-22 Thread Johannes Thumshirn
On Mon, Nov 21, 2016 at 01:24:02PM -0500, Ewan Milne wrote:
> On Mon, 2016-11-21 at 12:34 -0500, Douglas Gilbert wrote:
> > There was also this change which seems closer to the problem area:
> > 
> > commit 461c7fa126794157484dca48e88effa4963e3af3
> > Author: Kirill A. Shutemov 
> > Date:   Tue Feb 2 16:57:35 2016 -0800
> > 
> >  drivers/scsi/sg.c: mark VMA as VM_IO to prevent migration
> > ...
> > 
> > diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
> > index 503ab8b..5e82067 100644
> > --- a/drivers/scsi/sg.c
> > +++ b/drivers/scsi/sg.c
> > @@ -1261,7 +1261,7 @@ sg_mmap(struct file *filp, struct vm_area_struct *vma)
> >  }
> > 
> >  sfp->mmap_called = 1;
> > -   vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP;
> > +   vma->vm_flags |= VM_IO | VM_DONTEXPAND | VM_DONTDUMP;
> >  vma->vm_private_data = sfp;
> >  vma->vm_ops = _mmap_vm_ops;
> >  return 0;
> > 
> > Doug Gilbert
> > 
> 
> Neither this change nor "sg: fix dxferp in from_to case" appears to
> fix the issue when applied on top of 4.4.  Still looking...

This brings bad memories from commit 2d99b55d3 back to live, but this is
applied on all test kernels I have.

I too will run some bisection as well now that we have an easy reproducer and
my timezone is somewhat ahead. Let's see if we can stretch the workday a bit.

Byte,
Johannes

-- 
Johannes Thumshirn  Storage
jthumsh...@suse.de+49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: SG does not ignore dxferp (direct io + mmap)

2016-11-21 Thread Ewan D. Milne
On Mon, 2016-11-21 at 12:34 -0500, Douglas Gilbert wrote:
> There was also this change which seems closer to the problem area:
> 
> commit 461c7fa126794157484dca48e88effa4963e3af3
> Author: Kirill A. Shutemov 
> Date:   Tue Feb 2 16:57:35 2016 -0800
> 
>  drivers/scsi/sg.c: mark VMA as VM_IO to prevent migration
> ...
> 
> diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
> index 503ab8b..5e82067 100644
> --- a/drivers/scsi/sg.c
> +++ b/drivers/scsi/sg.c
> @@ -1261,7 +1261,7 @@ sg_mmap(struct file *filp, struct vm_area_struct *vma)
>  }
> 
>  sfp->mmap_called = 1;
> -   vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP;
> +   vma->vm_flags |= VM_IO | VM_DONTEXPAND | VM_DONTDUMP;
>  vma->vm_private_data = sfp;
>  vma->vm_ops = _mmap_vm_ops;
>  return 0;
> 
> Doug Gilbert
> 

Neither this change nor "sg: fix dxferp in from_to case" appears to
fix the issue when applied on top of 4.4.  Still looking...

-Ewan

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: SG does not ignore dxferp (direct io + mmap)

2016-11-21 Thread Douglas Gilbert

On 2016-11-21 09:24 AM, Ewan D. Milne wrote:

On Mon, 2016-11-21 at 11:23 +0200, Eyal Ben David wrote:

Hi,

The utility I mentioned is just a small program that I wrote to learn
more about the problem.

It is a very simple read16 with options for mmap and dxferp as null or other.

Here is the source code:

== cut here ==






2016-11-21 2:04 GMT+02:00 Laurence Oberman :



- Original Message -

From: "Eyal Ben David" 
To: linux-scsi@vger.kernel.org
Sent: Sunday, November 20, 2016 11:02:49 AM
Subject: SG does not ignore dxferp (direct io + mmap)

Hi all,

We have some IO utility that perform the IOs using sg and direct io with
mmap.
Our current systems are Ubuntu 14.04, RHEL 6,7
The IO utility always set dxferp to either the address or mmap of
other allocation (valloc)
Setting dxferp was harmless since SG is supposed to ignore the address
if mmap IO is selected.
When porting to Ubuntu 16.04, we had a corruption problem - first byte
of a read task is always 0.
When setting dxferp as NULL the corruption does not occur any more.
This is a regression and not according to SCSI generic documentation.

I wrote a small program that shows the change:

Read indirect (no mmap), lba=0:
===
$ ./sg_mmap_read -d /dev/sg0 -l 0
000 eb 63 90 10 8e d0 bc 00 b0 b8 00 00 8e d8 8e c0

Read with mmap, lba=0, dxferp=NULL:

$ ./sg_mmap_read -d /dev/sg0 -l 0 -m
000 eb 63 90 10 8e d0 bc 00 b0 b8 00 00 8e d8 8e c0

Read with mmap, lba=0, dxferp=address from mmap
==
$ ./sg_mmap_read -d /dev/sg0 -l 0 -m -b
000 00 63 90 10 8e d0 bc 00 b0 b8 00 00 8e d8 8e c0

On the older systems all results are the same.

Thanks for any answer!
--


For what it's worth, I ran this on a 4.8 kernel and did not see your
problem.  I couldn't reproduce it on a RHEL kernel either.

# ./sg_mmap_read -d /dev/sg0 -l 0 | od -x
000 63eb 1090 d08e 00bc b8b0  d88e c08e
020 befb 7c00 00bf b906 0200 a4f3 21ea 0006
040 be00 07be 0438 0b75 c683 8110 fefe 7507
060 ebf3 b416 b002 bb01 7c00 80b2 748a 8b01
100
# ./sg_mmap_read -d /dev/sg0 -l 0 -m | od -x
000 63eb 1090 d08e 00bc b8b0  d88e c08e
020 befb 7c00 00bf b906 0200 a4f3 21ea 0006
040 be00 07be 0438 0b75 c683 8110 fefe 7507
060 ebf3 b416 b002 bb01 7c00 80b2 748a 8b01
100
# ./sg_mmap_read -d /dev/sg0 -l 0 -m -b | od -x
000 63eb 1090 d08e 00bc b8b0  d88e c08e
020 befb 7c00 00bf b906 0200 a4f3 21ea 0006
040 be00 07be 0438 0b75 c683 8110 fefe 7507
060 ebf3 b416 b002 bb01 7c00 80b2 748a 8b01
100

The only recent relevant change I see is:

commit 5ecee0a3ee8d74b6950cb41e8989b0c2174568d4
Author: Douglas Gilbert 
Date:   Thu Mar 3 00:31:29 2016 -0500

sg: fix dxferp in from_to case

However the kernel I used has that change, and the
change purposely does not clear hp->dxferp if
SG_DXFER_TO_FROM_DEV is specified, which your program
does not, and the behavior is correct regardless.

Can you modify your program to output the userspace
address of your buffer?


There was also this change which seems closer to the problem area:

commit 461c7fa126794157484dca48e88effa4963e3af3
Author: Kirill A. Shutemov 
Date:   Tue Feb 2 16:57:35 2016 -0800

drivers/scsi/sg.c: mark VMA as VM_IO to prevent migration
...

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 503ab8b..5e82067 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -1261,7 +1261,7 @@ sg_mmap(struct file *filp, struct vm_area_struct *vma)
}

sfp->mmap_called = 1;
-   vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP;
+   vma->vm_flags |= VM_IO | VM_DONTEXPAND | VM_DONTDUMP;
vma->vm_private_data = sfp;
vma->vm_ops = _mmap_vm_ops;
return 0;

Doug Gilbert

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: SG does not ignore dxferp (direct io + mmap)

2016-11-21 Thread Ewan D. Milne
On Mon, 2016-11-21 at 16:15 +0100, Johannes Thumshirn wrote:
> 
> FWIW:
> jthumshirn@linux-x5ow:~$ sudo ./sg_mmap_read -d /dev/sg0 -l 0 | hexdump 
> 000 c033 d08e 00bc 8e7c 8ec0 bed8 7c00 00bf
> 010 b906 0200 f3fc 50a4 1c68 cb06 b9fb 0004
> 020 bebd 8007 007e 7c00 0f0b 0e85 8301 10c5
> 030 f1e2 18cd 5688 5500 46c6 0511 46c6 0010
> 040
> jthumshirn@linux-x5ow:~$ sudo ./sg_mmap_read -d /dev/sg0 -l 0 -m | hexdump 
> 000 c033 d08e 00bc 8e7c 8ec0 bed8 7c00 00bf
> 010 b906 0200 f3fc 50a4 1c68 cb06 b9fb 0004
> 020 bebd 8007 007e 7c00 0f0b 0e85 8301 10c5
> 030 f1e2 18cd 5688 5500 46c6 0511 46c6 0010
> 040
> jthumshirn@linux-x5ow:~$ sudo ./sg_mmap_read -d /dev/sg0 -l 0 -m -b | hexdump 
> 000 c033 d08e 00bc 8e7c 8ec0 bed8 7c00 00bf
> 010 b906 0200 f3fc 50a4 1c68 cb06 b9fb 0004
> 020 bebd 8007 007e 7c00 0f0b 0e85 8301 10c5
> 030 f1e2 18cd 5688 5500 46c6 0511 46c6 0010
> 040
> jthumshirn@linux-x5ow:~$ uname -r
> 4.8.4-1-default
> 
> root@glass:~ # ~jthumshirn/sg_mmap_read -d /dev/sg0 -l 0 | hexdump
> 000 caec 1dfe 3fdc 5b26 5775 eff6 a760 d612
> 010 c43c 8551 cfb8 4630 5e61 0750 44f3 c99e
> 020 5718 84fd 073f e2f5 0f30 f1fa 1e1a 51e6
> 030 3379 217a 7d0f 8040 11cd 261f 3f8b 6de2
> 040
> root@glass:~ # ~jthumshirn/sg_mmap_read -d /dev/sg0 -l 0 -m | hexdump
> 000 cd00 5f33 8809  0002   
> 010 0007    f000  7fff 
> 020     6e9d 57ac  
> 030        
> 040
> root@glass:~ # ~jthumshirn/sg_mmap_read -d /dev/sg0 -l 0 -m -b | hexdump
> 000 4c00 bd7e 8805  0002   
> 010 002f    f000  7fff 
> 020     6e9d 57ac  
> 030        
> 040
> root@glass:~ # uname -r
> 4.4.21-69-default
> root@glass:~ # 
> 
> 
> Bt:
> root@glass:~ # ~jthumshirn/sg_mmap_read -d /dev/sg0 -l 0 -| hexdump
> 000 46c0 b3b0 8804  0002   
> 010 0021    f000  7fff 
> 020     6e9d 57ac  
> 030 626c 3170 6f5f 2070 2020 2020 3d20 3020
> 040
> root@glass:~ # ~jthumshirn/sg_mmap_read -d /dev/sg0 -l 0 -| hexdump
> 000        
> *
> 040
> root@glass:~ # ~jthumshirn/sg_mmap_read -d /dev/sg0 -l 0 -| hexdump
> 000        
> *
> 040
> root@glass:~ # ~jthumshirn/sg_mmap_read -d /dev/sg0 -l 0 -| hexdump
> 000 73db ffe0 7a7d 1bf6 85cf 39ab a094 7802
> 010 959c 5b58 dc6f 1a36 5e7d 0afe d5be 3d41
> 020 691e 36ee 458c 14d2 06ed 371b 24dd 1f45
> 030 428a 24a0 866a 4a19 4954 e46a afe9 f3df
> 040
> glass:~ # 
> 
> Maybe we have a race here.
> 
> for i in $(seq 1 100); do 
>   sudo ./sg_mmap_read -d /dev/sg0 -l 0 -m -b | hexdump
> done
> on 4.8 works fine.
> 
> On the 4.4 host it has some hickups.
> 
> I'll have a look at it.
> 
> Byte,
>   Johannes
> 

Sure enough.

>From a freshly checked-out and built 4.4, i.e.:

commit afd2ff9b7e1b367172f18ba7f693dfb62bdcb2dc
Author: Linus Torvalds 
Date:   Sun Jan 10 15:01:32 2016 -0800

Linux 4.4

---

# ./sg_mmap_read -d /dev/sg0 -l 0 | od -x
000 63eb 1090 d08e 00bc b8b0  d88e c08e
020 befb 7c00 00bf b906 0200 a4f3 21ea 0006
040 be00 07be 0438 0b75 c683 8110 fefe 7507
060 ebf3 b416 b002 bb01 7c00 80b2 748a 8b01
100
# ./sg_mmap_read -d /dev/sg0 -l 0 -m | od -x
000 63eb 1090 d08e 00bc b8b0  d88e c08e
020 befb 7c00 00bf b906 0200 a4f3 21ea 0006
040 be00 07be 0438 0b75 c683 8110 fefe 7507
060 ebf3 b416 b002 bb01 7c00 80b2 748a 8b01
100
# ./sg_mmap_read -d /dev/sg0 -l 0 -m -b | od -x
000 6300 1090 d08e 00bc b8b0  d88e c08e<= WRONG
020 befb 7c00 00bf b906 0200 a4f3 21ea 0006
040 be00 07be 0438 0b75 c683 8110 fefe 7507
060 ebf3 b416 b002 bb01 7c00 80b2 748a 8b01
100

-Ewan



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: SG does not ignore dxferp (direct io + mmap)

2016-11-21 Thread Eyal Ben David
I am using the disro kernels, don't know if they have the patch.

Our IO testing utility use the same pattern (mmap + non-null dxferp) for
a long time, on RHEL 6.x, 7.x and Ubuntu 12.04, 14.04 without a problem,
long before the patch was applied.

Thanks,
Eyal


2016-11-21 17:44 GMT+02:00 Johannes Thumshirn :
> On Mon, Nov 21, 2016 at 04:15:52PM +0100, Johannes Thumshirn wrote:
>> On Mon, Nov 21, 2016 at 04:55:29PM +0200, Eyal Ben David wrote:
>> > Thanks for your reply,
>> >
>> > On RHEL system it does not occur.
>> >
>> > So far I have seen the problem on Ubuntu 16.04 and Fedora 22 (both
>> > with kernel 4.4.x)
>
> As Ewan already pointed out, do you have commit
> 5ecee0a3ee8d74b6950cb41e8989b0c2174568d4 applied in your kernels? It went in
> 4.6 and got stable backported from there till 2.6.something. I'm trying this
> patch myself to see whether it is the required change or not.
>
> Byte,
> Johannes
>
> --
> Johannes Thumshirn  Storage
> jthumsh...@suse.de+49 911 74053 689
> SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
> GF: Felix Imendörffer, Jane Smithard, Graham Norton
> HRB 21284 (AG Nürnberg)
> Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: SG does not ignore dxferp (direct io + mmap)

2016-11-21 Thread Johannes Thumshirn
On Mon, Nov 21, 2016 at 04:15:52PM +0100, Johannes Thumshirn wrote:
> On Mon, Nov 21, 2016 at 04:55:29PM +0200, Eyal Ben David wrote:
> > Thanks for your reply,
> > 
> > On RHEL system it does not occur.
> > 
> > So far I have seen the problem on Ubuntu 16.04 and Fedora 22 (both
> > with kernel 4.4.x)

As Ewan already pointed out, do you have commit
5ecee0a3ee8d74b6950cb41e8989b0c2174568d4 applied in your kernels? It went in
4.6 and got stable backported from there till 2.6.something. I'm trying this
patch myself to see whether it is the required change or not.

Byte,
Johannes

-- 
Johannes Thumshirn  Storage
jthumsh...@suse.de+49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: SG does not ignore dxferp (direct io + mmap)

2016-11-21 Thread Johannes Thumshirn
On Mon, Nov 21, 2016 at 04:55:29PM +0200, Eyal Ben David wrote:
> Thanks for your reply,
> 
> On RHEL system it does not occur.
> 
> So far I have seen the problem on Ubuntu 16.04 and Fedora 22 (both
> with kernel 4.4.x)

FWIW:
jthumshirn@linux-x5ow:~$ sudo ./sg_mmap_read -d /dev/sg0 -l 0 | hexdump 
000 c033 d08e 00bc 8e7c 8ec0 bed8 7c00 00bf
010 b906 0200 f3fc 50a4 1c68 cb06 b9fb 0004
020 bebd 8007 007e 7c00 0f0b 0e85 8301 10c5
030 f1e2 18cd 5688 5500 46c6 0511 46c6 0010
040
jthumshirn@linux-x5ow:~$ sudo ./sg_mmap_read -d /dev/sg0 -l 0 -m | hexdump 
000 c033 d08e 00bc 8e7c 8ec0 bed8 7c00 00bf
010 b906 0200 f3fc 50a4 1c68 cb06 b9fb 0004
020 bebd 8007 007e 7c00 0f0b 0e85 8301 10c5
030 f1e2 18cd 5688 5500 46c6 0511 46c6 0010
040
jthumshirn@linux-x5ow:~$ sudo ./sg_mmap_read -d /dev/sg0 -l 0 -m -b | hexdump 
000 c033 d08e 00bc 8e7c 8ec0 bed8 7c00 00bf
010 b906 0200 f3fc 50a4 1c68 cb06 b9fb 0004
020 bebd 8007 007e 7c00 0f0b 0e85 8301 10c5
030 f1e2 18cd 5688 5500 46c6 0511 46c6 0010
040
jthumshirn@linux-x5ow:~$ uname -r
4.8.4-1-default

root@glass:~ # ~jthumshirn/sg_mmap_read -d /dev/sg0 -l 0 | hexdump
000 caec 1dfe 3fdc 5b26 5775 eff6 a760 d612
010 c43c 8551 cfb8 4630 5e61 0750 44f3 c99e
020 5718 84fd 073f e2f5 0f30 f1fa 1e1a 51e6
030 3379 217a 7d0f 8040 11cd 261f 3f8b 6de2
040
root@glass:~ # ~jthumshirn/sg_mmap_read -d /dev/sg0 -l 0 -m | hexdump
000 cd00 5f33 8809  0002   
010 0007    f000  7fff 
020     6e9d 57ac  
030        
040
root@glass:~ # ~jthumshirn/sg_mmap_read -d /dev/sg0 -l 0 -m -b | hexdump
000 4c00 bd7e 8805  0002   
010 002f    f000  7fff 
020     6e9d 57ac  
030        
040
root@glass:~ # uname -r
4.4.21-69-default
root@glass:~ # 


Bt:
root@glass:~ # ~jthumshirn/sg_mmap_read -d /dev/sg0 -l 0 -| hexdump
000 46c0 b3b0 8804  0002   
010 0021    f000  7fff 
020     6e9d 57ac  
030 626c 3170 6f5f 2070 2020 2020 3d20 3020
040
root@glass:~ # ~jthumshirn/sg_mmap_read -d /dev/sg0 -l 0 -| hexdump
000        
*
040
root@glass:~ # ~jthumshirn/sg_mmap_read -d /dev/sg0 -l 0 -| hexdump
000        
*
040
root@glass:~ # ~jthumshirn/sg_mmap_read -d /dev/sg0 -l 0 -| hexdump
000 73db ffe0 7a7d 1bf6 85cf 39ab a094 7802
010 959c 5b58 dc6f 1a36 5e7d 0afe d5be 3d41
020 691e 36ee 458c 14d2 06ed 371b 24dd 1f45
030 428a 24a0 866a 4a19 4954 e46a afe9 f3df
040
glass:~ # 

Maybe we have a race here.

for i in $(seq 1 100); do 
sudo ./sg_mmap_read -d /dev/sg0 -l 0 -m -b | hexdump
done
on 4.8 works fine.

On the 4.4 host it has some hickups.

I'll have a look at it.

Byte,
Johannes

-- 
Johannes Thumshirn  Storage
jthumsh...@suse.de+49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: SG does not ignore dxferp (direct io + mmap)

2016-11-21 Thread Laurence Oberman


- Original Message -
> From: "Eyal Ben David" <bde...@gmail.com>
> To: emi...@redhat.com
> Cc: "Laurence Oberman" <lober...@redhat.com>, dgilb...@interlog.com, 
> linux-scsi@vger.kernel.org
> Sent: Monday, November 21, 2016 9:55:29 AM
> Subject: Re: SG does not ignore dxferp (direct io + mmap)
> 
> Thanks for your reply,
> 
> On RHEL system it does not occur.
> 
> So far I have seen the problem on Ubuntu 16.04 and Fedora 22 (both
> with kernel 4.4.x)
> 
> 2016-11-21 16:24 GMT+02:00 Ewan D. Milne <emi...@redhat.com>:
> > On Mon, 2016-11-21 at 11:23 +0200, Eyal Ben David wrote:
> >> Hi,
> >>
> >> The utility I mentioned is just a small program that I wrote to learn
> >> more about the problem.
> >>
> >> It is a very simple read16 with options for mmap and dxferp as null or
> >> other.
> >>
> >> Here is the source code:
> >>
> >> == cut here ==
> >>
> >
> > 
> >
> >> 2016-11-21 2:04 GMT+02:00 Laurence Oberman <lober...@redhat.com>:
> >> >
> >> >
> >> > - Original Message -
> >> >> From: "Eyal Ben David" <bde...@gmail.com>
> >> >> To: linux-scsi@vger.kernel.org
> >> >> Sent: Sunday, November 20, 2016 11:02:49 AM
> >> >> Subject: SG does not ignore dxferp (direct io + mmap)
> >> >>
> >> >> Hi all,
> >> >>
> >> >> We have some IO utility that perform the IOs using sg and direct io
> >> >> with
> >> >> mmap.
> >> >> Our current systems are Ubuntu 14.04, RHEL 6,7
> >> >> The IO utility always set dxferp to either the address or mmap of
> >> >> other allocation (valloc)
> >> >> Setting dxferp was harmless since SG is supposed to ignore the address
> >> >> if mmap IO is selected.
> >> >> When porting to Ubuntu 16.04, we had a corruption problem - first byte
> >> >> of a read task is always 0.
> >> >> When setting dxferp as NULL the corruption does not occur any more.
> >> >> This is a regression and not according to SCSI generic documentation.
> >> >>
> >> >> I wrote a small program that shows the change:
> >> >>
> >> >> Read indirect (no mmap), lba=0:
> >> >> ===
> >> >> $ ./sg_mmap_read -d /dev/sg0 -l 0
> >> >> 000 eb 63 90 10 8e d0 bc 00 b0 b8 00 00 8e d8 8e c0
> >> >>
> >> >> Read with mmap, lba=0, dxferp=NULL:
> >> >> 
> >> >> $ ./sg_mmap_read -d /dev/sg0 -l 0 -m
> >> >> 000 eb 63 90 10 8e d0 bc 00 b0 b8 00 00 8e d8 8e c0
> >> >>
> >> >> Read with mmap, lba=0, dxferp=address from mmap
> >> >> ==
> >> >> $ ./sg_mmap_read -d /dev/sg0 -l 0 -m -b
> >> >> 000 00 63 90 10 8e d0 bc 00 b0 b8 00 00 8e d8 8e c0
> >> >>
> >> >> On the older systems all results are the same.
> >> >>
> >> >> Thanks for any answer!
> >> >> --
> >
> > For what it's worth, I ran this on a 4.8 kernel and did not see your
> > problem.  I couldn't reproduce it on a RHEL kernel either.
> >
> > # ./sg_mmap_read -d /dev/sg0 -l 0 | od -x
> > 000 63eb 1090 d08e 00bc b8b0  d88e c08e
> > 020 befb 7c00 00bf b906 0200 a4f3 21ea 0006
> > 040 be00 07be 0438 0b75 c683 8110 fefe 7507
> > 060 ebf3 b416 b002 bb01 7c00 80b2 748a 8b01
> > 100
> > # ./sg_mmap_read -d /dev/sg0 -l 0 -m | od -x
> > 000 63eb 1090 d08e 00bc b8b0  d88e c08e
> > 020 befb 7c00 00bf b906 0200 a4f3 21ea 0006
> > 040 be00 07be 0438 0b75 c683 8110 fefe 7507
> > 060 ebf3 b416 b002 bb01 7c00 80b2 748a 8b01
> > 100
> > # ./sg_mmap_read -d /dev/sg0 -l 0 -m -b | od -x
> > 000 63eb 1090 d08e 00bc b8b0  d88e c08e
> > 020 befb 7c00 00bf b906 0200 a4f3 21ea 0006
> > 040 be00 07be 0438 0b75 c683 8110 fefe 7507
> > 060 ebf3 b416 b002 bb01 7c00 80b2 748a 8b01
> > 100
> >
> > The only recent relevant change I see is:
> >
> > commit 5ecee0a3ee8d74b6950cb41e8989b0c2174568d4
> > Author: Douglas Gilbert <dgilb...@interlog.com>
> > Date:   Thu Mar 3 00:31:29 2016 -0500
> >
> > sg: fix dxferp in from_to case
> >
> > However the kernel I used has tha

Re: SG does not ignore dxferp (direct io + mmap)

2016-11-21 Thread Eyal Ben David
Thanks for your reply,

On RHEL system it does not occur.

So far I have seen the problem on Ubuntu 16.04 and Fedora 22 (both
with kernel 4.4.x)

2016-11-21 16:24 GMT+02:00 Ewan D. Milne :
> On Mon, 2016-11-21 at 11:23 +0200, Eyal Ben David wrote:
>> Hi,
>>
>> The utility I mentioned is just a small program that I wrote to learn
>> more about the problem.
>>
>> It is a very simple read16 with options for mmap and dxferp as null or other.
>>
>> Here is the source code:
>>
>> == cut here ==
>>
>
> 
>
>> 2016-11-21 2:04 GMT+02:00 Laurence Oberman :
>> >
>> >
>> > - Original Message -
>> >> From: "Eyal Ben David" 
>> >> To: linux-scsi@vger.kernel.org
>> >> Sent: Sunday, November 20, 2016 11:02:49 AM
>> >> Subject: SG does not ignore dxferp (direct io + mmap)
>> >>
>> >> Hi all,
>> >>
>> >> We have some IO utility that perform the IOs using sg and direct io with
>> >> mmap.
>> >> Our current systems are Ubuntu 14.04, RHEL 6,7
>> >> The IO utility always set dxferp to either the address or mmap of
>> >> other allocation (valloc)
>> >> Setting dxferp was harmless since SG is supposed to ignore the address
>> >> if mmap IO is selected.
>> >> When porting to Ubuntu 16.04, we had a corruption problem - first byte
>> >> of a read task is always 0.
>> >> When setting dxferp as NULL the corruption does not occur any more.
>> >> This is a regression and not according to SCSI generic documentation.
>> >>
>> >> I wrote a small program that shows the change:
>> >>
>> >> Read indirect (no mmap), lba=0:
>> >> ===
>> >> $ ./sg_mmap_read -d /dev/sg0 -l 0
>> >> 000 eb 63 90 10 8e d0 bc 00 b0 b8 00 00 8e d8 8e c0
>> >>
>> >> Read with mmap, lba=0, dxferp=NULL:
>> >> 
>> >> $ ./sg_mmap_read -d /dev/sg0 -l 0 -m
>> >> 000 eb 63 90 10 8e d0 bc 00 b0 b8 00 00 8e d8 8e c0
>> >>
>> >> Read with mmap, lba=0, dxferp=address from mmap
>> >> ==
>> >> $ ./sg_mmap_read -d /dev/sg0 -l 0 -m -b
>> >> 000 00 63 90 10 8e d0 bc 00 b0 b8 00 00 8e d8 8e c0
>> >>
>> >> On the older systems all results are the same.
>> >>
>> >> Thanks for any answer!
>> >> --
>
> For what it's worth, I ran this on a 4.8 kernel and did not see your
> problem.  I couldn't reproduce it on a RHEL kernel either.
>
> # ./sg_mmap_read -d /dev/sg0 -l 0 | od -x
> 000 63eb 1090 d08e 00bc b8b0  d88e c08e
> 020 befb 7c00 00bf b906 0200 a4f3 21ea 0006
> 040 be00 07be 0438 0b75 c683 8110 fefe 7507
> 060 ebf3 b416 b002 bb01 7c00 80b2 748a 8b01
> 100
> # ./sg_mmap_read -d /dev/sg0 -l 0 -m | od -x
> 000 63eb 1090 d08e 00bc b8b0  d88e c08e
> 020 befb 7c00 00bf b906 0200 a4f3 21ea 0006
> 040 be00 07be 0438 0b75 c683 8110 fefe 7507
> 060 ebf3 b416 b002 bb01 7c00 80b2 748a 8b01
> 100
> # ./sg_mmap_read -d /dev/sg0 -l 0 -m -b | od -x
> 000 63eb 1090 d08e 00bc b8b0  d88e c08e
> 020 befb 7c00 00bf b906 0200 a4f3 21ea 0006
> 040 be00 07be 0438 0b75 c683 8110 fefe 7507
> 060 ebf3 b416 b002 bb01 7c00 80b2 748a 8b01
> 100
>
> The only recent relevant change I see is:
>
> commit 5ecee0a3ee8d74b6950cb41e8989b0c2174568d4
> Author: Douglas Gilbert 
> Date:   Thu Mar 3 00:31:29 2016 -0500
>
> sg: fix dxferp in from_to case
>
> However the kernel I used has that change, and the
> change purposely does not clear hp->dxferp if
> SG_DXFER_TO_FROM_DEV is specified, which your program
> does not, and the behavior is correct regardless.
>
> Can you modify your program to output the userspace
> address of your buffer?
>
> -Ewan
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: SG does not ignore dxferp (direct io + mmap)

2016-11-21 Thread Laurence Oberman


- Original Message -
> From: "Ewan D. Milne" <emi...@redhat.com>
> To: "Eyal Ben David" <bde...@gmail.com>
> Cc: "Laurence Oberman" <lober...@redhat.com>, dgilb...@interlog.com, 
> linux-scsi@vger.kernel.org
> Sent: Monday, November 21, 2016 9:24:35 AM
> Subject: Re: SG does not ignore dxferp (direct io + mmap)
> 
> On Mon, 2016-11-21 at 11:23 +0200, Eyal Ben David wrote:
> > Hi,
> > 
> > The utility I mentioned is just a small program that I wrote to learn
> > more about the problem.
> > 
> > It is a very simple read16 with options for mmap and dxferp as null or
> > other.
> > 
> > Here is the source code:
> > 
> > == cut here ==
> > 
> 
> 
> 
> > 2016-11-21 2:04 GMT+02:00 Laurence Oberman <lober...@redhat.com>:
> > >
> > >
> > > - Original Message -
> > >> From: "Eyal Ben David" <bde...@gmail.com>
> > >> To: linux-scsi@vger.kernel.org
> > >> Sent: Sunday, November 20, 2016 11:02:49 AM
> > >> Subject: SG does not ignore dxferp (direct io + mmap)
> > >>
> > >> Hi all,
> > >>
> > >> We have some IO utility that perform the IOs using sg and direct io with
> > >> mmap.
> > >> Our current systems are Ubuntu 14.04, RHEL 6,7
> > >> The IO utility always set dxferp to either the address or mmap of
> > >> other allocation (valloc)
> > >> Setting dxferp was harmless since SG is supposed to ignore the address
> > >> if mmap IO is selected.
> > >> When porting to Ubuntu 16.04, we had a corruption problem - first byte
> > >> of a read task is always 0.
> > >> When setting dxferp as NULL the corruption does not occur any more.
> > >> This is a regression and not according to SCSI generic documentation.
> > >>
> > >> I wrote a small program that shows the change:
> > >>
> > >> Read indirect (no mmap), lba=0:
> > >> ===
> > >> $ ./sg_mmap_read -d /dev/sg0 -l 0
> > >> 000 eb 63 90 10 8e d0 bc 00 b0 b8 00 00 8e d8 8e c0
> > >>
> > >> Read with mmap, lba=0, dxferp=NULL:
> > >> 
> > >> $ ./sg_mmap_read -d /dev/sg0 -l 0 -m
> > >> 000 eb 63 90 10 8e d0 bc 00 b0 b8 00 00 8e d8 8e c0
> > >>
> > >> Read with mmap, lba=0, dxferp=address from mmap
> > >> ==
> > >> $ ./sg_mmap_read -d /dev/sg0 -l 0 -m -b
> > >> 000 00 63 90 10 8e d0 bc 00 b0 b8 00 00 8e d8 8e c0
> > >>
> > >> On the older systems all results are the same.
> > >>
> > >> Thanks for any answer!
> > >> --
> 
> For what it's worth, I ran this on a 4.8 kernel and did not see your
> problem.  I couldn't reproduce it on a RHEL kernel either.
> 
> # ./sg_mmap_read -d /dev/sg0 -l 0 | od -x
> 000 63eb 1090 d08e 00bc b8b0  d88e c08e
> 020 befb 7c00 00bf b906 0200 a4f3 21ea 0006
> 040 be00 07be 0438 0b75 c683 8110 fefe 7507
> 060 ebf3 b416 b002 bb01 7c00 80b2 748a 8b01
> 100
> # ./sg_mmap_read -d /dev/sg0 -l 0 -m | od -x
> 000 63eb 1090 d08e 00bc b8b0  d88e c08e
> 020 befb 7c00 00bf b906 0200 a4f3 21ea 0006
> 040 be00 07be 0438 0b75 c683 8110 fefe 7507
> 060 ebf3 b416 b002 bb01 7c00 80b2 748a 8b01
> 100
> # ./sg_mmap_read -d /dev/sg0 -l 0 -m -b | od -x
> 000 63eb 1090 d08e 00bc b8b0  d88e c08e
> 020 befb 7c00 00bf b906 0200 a4f3 21ea 0006
> 040 be00 07be 0438 0b75 c683 8110 fefe 7507
> 060 ebf3 b416 b002 bb01 7c00 80b2 748a 8b01
> 100
> 
> The only recent relevant change I see is:
> 
> commit 5ecee0a3ee8d74b6950cb41e8989b0c2174568d4
> Author: Douglas Gilbert <dgilb...@interlog.com>
> Date:   Thu Mar 3 00:31:29 2016 -0500
> 
> sg: fix dxferp in from_to case
> 
> However the kernel I used has that change, and the
> change purposely does not clear hp->dxferp if
> SG_DXFER_TO_FROM_DEV is specified, which your program
> does not, and the behavior is correct regardless.
> 
> Can you modify your program to output the userspace
> address of your buffer?
> 
> -Ewan
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

This is from a current RHEL6.9 kernel and seems to be unable to reproduce as 
well

Re: SG does not ignore dxferp (direct io + mmap)

2016-11-21 Thread Ewan D. Milne
On Mon, 2016-11-21 at 11:23 +0200, Eyal Ben David wrote:
> Hi,
> 
> The utility I mentioned is just a small program that I wrote to learn
> more about the problem.
> 
> It is a very simple read16 with options for mmap and dxferp as null or other.
> 
> Here is the source code:
> 
> == cut here ==
> 



> 2016-11-21 2:04 GMT+02:00 Laurence Oberman :
> >
> >
> > - Original Message -
> >> From: "Eyal Ben David" 
> >> To: linux-scsi@vger.kernel.org
> >> Sent: Sunday, November 20, 2016 11:02:49 AM
> >> Subject: SG does not ignore dxferp (direct io + mmap)
> >>
> >> Hi all,
> >>
> >> We have some IO utility that perform the IOs using sg and direct io with
> >> mmap.
> >> Our current systems are Ubuntu 14.04, RHEL 6,7
> >> The IO utility always set dxferp to either the address or mmap of
> >> other allocation (valloc)
> >> Setting dxferp was harmless since SG is supposed to ignore the address
> >> if mmap IO is selected.
> >> When porting to Ubuntu 16.04, we had a corruption problem - first byte
> >> of a read task is always 0.
> >> When setting dxferp as NULL the corruption does not occur any more.
> >> This is a regression and not according to SCSI generic documentation.
> >>
> >> I wrote a small program that shows the change:
> >>
> >> Read indirect (no mmap), lba=0:
> >> ===
> >> $ ./sg_mmap_read -d /dev/sg0 -l 0
> >> 000 eb 63 90 10 8e d0 bc 00 b0 b8 00 00 8e d8 8e c0
> >>
> >> Read with mmap, lba=0, dxferp=NULL:
> >> 
> >> $ ./sg_mmap_read -d /dev/sg0 -l 0 -m
> >> 000 eb 63 90 10 8e d0 bc 00 b0 b8 00 00 8e d8 8e c0
> >>
> >> Read with mmap, lba=0, dxferp=address from mmap
> >> ==
> >> $ ./sg_mmap_read -d /dev/sg0 -l 0 -m -b
> >> 000 00 63 90 10 8e d0 bc 00 b0 b8 00 00 8e d8 8e c0
> >>
> >> On the older systems all results are the same.
> >>
> >> Thanks for any answer!
> >> --

For what it's worth, I ran this on a 4.8 kernel and did not see your
problem.  I couldn't reproduce it on a RHEL kernel either.

# ./sg_mmap_read -d /dev/sg0 -l 0 | od -x
000 63eb 1090 d08e 00bc b8b0  d88e c08e
020 befb 7c00 00bf b906 0200 a4f3 21ea 0006
040 be00 07be 0438 0b75 c683 8110 fefe 7507
060 ebf3 b416 b002 bb01 7c00 80b2 748a 8b01
100
# ./sg_mmap_read -d /dev/sg0 -l 0 -m | od -x
000 63eb 1090 d08e 00bc b8b0  d88e c08e
020 befb 7c00 00bf b906 0200 a4f3 21ea 0006
040 be00 07be 0438 0b75 c683 8110 fefe 7507
060 ebf3 b416 b002 bb01 7c00 80b2 748a 8b01
100
# ./sg_mmap_read -d /dev/sg0 -l 0 -m -b | od -x
000 63eb 1090 d08e 00bc b8b0  d88e c08e
020 befb 7c00 00bf b906 0200 a4f3 21ea 0006
040 be00 07be 0438 0b75 c683 8110 fefe 7507
060 ebf3 b416 b002 bb01 7c00 80b2 748a 8b01
100

The only recent relevant change I see is:

commit 5ecee0a3ee8d74b6950cb41e8989b0c2174568d4
Author: Douglas Gilbert 
Date:   Thu Mar 3 00:31:29 2016 -0500

sg: fix dxferp in from_to case

However the kernel I used has that change, and the
change purposely does not clear hp->dxferp if
SG_DXFER_TO_FROM_DEV is specified, which your program
does not, and the behavior is correct regardless.

Can you modify your program to output the userspace
address of your buffer?

-Ewan


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: SG does not ignore dxferp (direct io + mmap)

2016-11-21 Thread Eyal Ben David
Hi,

The utility I mentioned is just a small program that I wrote to learn
more about the problem.

It is a very simple read16 with options for mmap and dxferp as null or other.

Here is the source code:

== cut here ==

#include 
#include 
#include 
#include 
#include 

#include 
#include 
#include 
#include 
#include 
#include 


void* prepare_addr_fd(const char* devname, size_t block_size, int* fd,
int fmmap)
{
void* addr = NULL;
int sg_fd;

if ((sg_fd = open(devname, O_RDWR)) < 0) {
fprintf(stderr, "failed to open %s: %s\n", devname, strerror(errno));
exit(1);
}

if (fmmap) {
addr = mmap(NULL, block_size, (PROT_READ | PROT_WRITE),
MAP_SHARED, sg_fd, 0);
if (addr == MAP_FAILED) {
fprintf(stderr, "failed to mmap %s: %s\n", devname,
strerror(errno));
exit(1);
}
}
else {
addr = valloc(block_size);
if (!addr) {
fprintf(stderr, "failed to allocate: %s\n", strerror(errno));
exit(1);
}
}

assert(fd != NULL);
*fd = sg_fd;
return addr;
}


void cleanup_addr_fd(void* addr, size_t size, int fd, int mmap)
{
if (addr) {
if (mmap)
munmap(addr, size);
else
free(addr);
}

close(fd);
}


void prepare_read16_cdb(size_t lba, size_t block_count, unsigned char
scsi_cmd[])
{
// initialize the scsi_cmd buffer
//
scsi_cmd[0] = (unsigned char)(0x88); /* READ 16 opcode */

scsi_cmd[2] = (unsigned char)((lba >> 56) & 0xff);  /* lba */
scsi_cmd[3] = (unsigned char)((lba >> 48) & 0xff);
scsi_cmd[4] = (unsigned char)((lba >> 40) & 0xff);
scsi_cmd[5] = (unsigned char)((lba >> 32) & 0xff);
scsi_cmd[6] = (unsigned char)((lba >> 24) & 0xff);
scsi_cmd[7] = (unsigned char)((lba >> 16) & 0xff);
scsi_cmd[8] = (unsigned char)((lba >> 8) & 0xff);
scsi_cmd[9] = (unsigned char)(lba & 0xff);

scsi_cmd[10] = (unsigned char)((block_count >> 24) & 0xff); /*
block count */
scsi_cmd[11] = (unsigned char)((block_count >> 16) & 0xff);
scsi_cmd[12] = (unsigned char)((block_count >> 8) & 0xff);
scsi_cmd[13] = (unsigned char)(block_count & 0xff);
}

void scsi_read_block(const char* device, size_t lba, int fmmap, int fmmap_bug)
{
struct sg_io_hdr hdr;
unsigned char cdb[16];
unsigned char scsi_sense[96];
int fd = 0;
void* addr = NULL;

memset(cdb, 0, sizeof cdb);
memset(, 0, sizeof hdr);
memset(scsi_sense, 0, sizeof scsi_sense);

addr = prepare_addr_fd(device, 8192, , fmmap);
prepare_read16_cdb(lba, 1, cdb);

hdr.interface_id= 'S';
hdr.dxfer_direction = SG_DXFER_FROM_DEV;
hdr.iovec_count = 0;
hdr.dxfer_len   = 512;
hdr.mx_sb_len   = sizeof scsi_sense;
hdr.sbp = scsi_sense;
hdr.timeout = 3;
hdr.pack_id = 0;
hdr.usr_ptr = 0;
hdr.cmdp= cdb;
hdr.cmd_len = sizeof(cdb);

if (fmmap) {
hdr.flags = 4;

// dxferp is null unless we want to recreate the bug
if (fmmap_bug)
hdr.dxferp = addr;
}
else {
hdr.dxferp = addr;
}

if (ioctl(fd, SG_IO, ) < 0) {
fprintf(stderr, "ioctl failed: %s\n", strerror(errno));
exit(1);
}

// pipe to od or hexdump
//
write(STDOUT_FILENO, addr, 64);

cleanup_addr_fd(addr, 8192, fd, fmmap);
}

int main(int argc, char* argv[])
{
const char* device = NULL;
ssize_t lba = -1;
int fmmap = 0;
int fmmap_bug = 0;
int opt;

const char* help_str =
"Usage: sg_mmap_read -d /dev/sgNNN -l  [-m -b]\n\n"
"Use -m to use mmap IO, use -b to check if dxferp is ignored\n"
"Redirect printing to od or hexdump\n";

if (argc == 1) {
puts(help_str);
return 0;
}

while ((opt = getopt(argc, argv, "d:l:mbh")) != -1) {
switch (opt) {
case 'd':
device = strdup(optarg);
break;
case 'l':
lba = atol(optarg);
break;
case 'm':
fmmap = 1;
break;
case 'b':
fmmap_bug = 1;
break;
case 'h':
puts(help_str);
return 0;
default:
return 1;
}
}

if (!device || lba == -1) {
fprintf(stderr, "command line error: missing device or lba");
return 1;
}

scsi_read_block(device, lba, fmmap, fmmap_bug);

return 0;
}

2016-11-21 2:04 GMT+02:00 Laurence Oberman :
>
>
> - Original Message -
>> From: "Eyal Ben David" 
>> To: linux-scsi@vger.kernel.org
>> Sent: Sunday, November 20, 2016 11:02:49 AM
>> Subject: SG does not ignore dxferp (direct io + mmap)
>>
>> Hi all,
>>
>> We have some IO utility that perform the IOs using sg and direct io with
>> mmap.
>> Our current systems are Ubuntu 14.04, RHEL 6,7
>> The IO utility always set 

Re: SG does not ignore dxferp (direct io + mmap)

2016-11-20 Thread Laurence Oberman


- Original Message -
> From: "Eyal Ben David" 
> To: linux-scsi@vger.kernel.org
> Sent: Sunday, November 20, 2016 11:02:49 AM
> Subject: SG does not ignore dxferp (direct io + mmap)
> 
> Hi all,
> 
> We have some IO utility that perform the IOs using sg and direct io with
> mmap.
> Our current systems are Ubuntu 14.04, RHEL 6,7
> The IO utility always set dxferp to either the address or mmap of
> other allocation (valloc)
> Setting dxferp was harmless since SG is supposed to ignore the address
> if mmap IO is selected.
> When porting to Ubuntu 16.04, we had a corruption problem - first byte
> of a read task is always 0.
> When setting dxferp as NULL the corruption does not occur any more.
> This is a regression and not according to SCSI generic documentation.
> 
> I wrote a small program that shows the change:
> 
> Read indirect (no mmap), lba=0:
> ===
> $ ./sg_mmap_read -d /dev/sg0 -l 0
> 000 eb 63 90 10 8e d0 bc 00 b0 b8 00 00 8e d8 8e c0
> 
> Read with mmap, lba=0, dxferp=NULL:
> 
> $ ./sg_mmap_read -d /dev/sg0 -l 0 -m
> 000 eb 63 90 10 8e d0 bc 00 b0 b8 00 00 8e d8 8e c0
> 
> Read with mmap, lba=0, dxferp=address from mmap
> ==
> $ ./sg_mmap_read -d /dev/sg0 -l 0 -m -b
> 000 00 63 90 10 8e d0 bc 00 b0 b8 00 00 8e d8 8e c0
> 
> On the older systems all results are the same.
> 
> Thanks for any answer!
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
Hello

Given that we cannot see what your utility (sg_mmap_read) is doing, can we get 
the source for that or an strace of your test here.
I am sure Doug will then be able to help you.

Thanks
Laurence
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html