changes and things get cleaned up nicely. We'll probably
have to coordinate the sd changes a bit.
Anyway. Great to finally get this resolved!
Reviewed-by: Martin K. Petersen
--
Martin K. Petersen Oracle Linux Engineering
e
function but that does not satisfy the -Wswitch logic either.
Anyway. Enough energy wasted on this. I'm OK with either the default:
case or Max' if statement approach. My objection is purely
wrt. introducing semantically incorrect and/or unreachable code to
silence compiler warnings. Se
ther up the stack preventing that from ever happening. Adding a Type
0 here gives the reader the false impression that it's valid input to
the function. Which it really isn't.
Arnd: Any ideas how to handle this?
--
Martin K. Petersen Oracle Linux Engineering
. So my
preference is to just leave things as-is for now.
--
Martin K. Petersen Oracle Linux Engineering
Jens,
> While I like the idea of centralizing stuff like this, I'm also not
> happy with adding checks like this to the fast path.
Yeah, but at least it's just checking a request queue flag.
--
Martin K. Petersen Oracle Linux Engineering
s. Introduce
> .prepare_fn and .complete_fn callbacks within the integrity profile
> that each type can implement according to its needs.
LGTM.
Reviewed-by: Martin K. Petersen
--
Martin K. Petersen Oracle Linux Engineering
Max,
Reviewed-by: Martin K. Petersen
--
Martin K. Petersen Oracle Linux Engineering
not. So it is unclear
whether implementors (if any) went with the SCSI compatible route or
with what the NVMe spec actually says.
--
Martin K. Petersen Oracle Linux Engineering
it" and everything else is legacy.
>
> do you see any reason to support the broken type 3 ?
Only to support existing installations. We can't really remove it
without the risk of breaking something for somebody out there.
--
Martin K. Petersen Oracle Linux Engineering
format).
Anyway. So my take on all this is that the T10-DIF-TYPE1-CRC profile is
"it" and everything else is legacy.
> I think I'll prepare dummy/empty callbacks for type3 and for nop
> profiles instead of setting it to NULL.
>
> agreed ?
Sure. Whatever works.
--
Martin K. Petersen Oracle Linux Engineering
egrity profile in that case (see
previous mail about why keying off of the protection_type is a problem).
--
Martin K. Petersen Oracle Linux Engineering
Max,
> Only type 1 and type 2 have a reference tag by definition.
DIX Type 0 needs remapping so this assertion is not correct.
--
Martin K. Petersen Oracle Linux Engineering
t for the Type 1 profile.
> static inline unsigned short
> +blk_integrity_interval_shift(struct request_queue *q)
> +{
> + return q->limits.logical_block_shift;
> +}
> +
Why not use bio_integrity_intervals() or bi->interval_exp?
Note that for T10 PI Type 2, the protection inter
oth land linus tree.
I'll set up an amalgamated for-next branch tomorrow.
--
Martin K. Petersen Oracle Linux Engineering
Junxiao,
> Anybody could help review this bug?
It's on my list. However, your patch is clashing with my general
read-only handling changes so I'll probably need to roll your changes
into mine.
I'll try to look at this today.
--
Martin K. Petersen Oracle Linux Engineering
Jens,
> Martin, I'd like someone to vet/review the SCSI side of it before I
> apply it.
Looks good to me.
--
Martin K. Petersen Oracle Linux Engineering
by: Martin K. Petersen
--
Martin K. Petersen Oracle Linux Engineering
thank you!
--
Martin K. Petersen Oracle Linux Engineering
Damien,
> kbuild test robot gets the following compilation warning using gcc 7.4
> cross compilation for c6x (GCC_VERSION=7.4.0 make.cross ARCH=c6x).
Applied to 5.3/scsi-fixes, thanks!
--
Martin K. Petersen Oracle Linux Engineering
ould you prefer adding a check ?
I checked your call sites and they look fine. Also, I don't think
returning a capacity of 0 on error is going to help us much.
Reviewed-by: Martin K. Petersen
--
Martin K. Petersen Oracle Linux Engineering
Damien,
> Do you want me to send a new version with updated commit message and
> Fixes tag ? Or will you fix that when applying ?
Please send me a tweaked one and I'll apply it right away.
--
Martin K. Petersen Oracle Linux Engineering
Jens,
> Otherwise obviously looks fine to me. Martin, do you want to pick this
> one up?
Yep, I'll merge it.
--
Martin K. Petersen Oracle Linux Engineering
Chaitanya,
> Set the current process's iopriority for discard, write-zeroes and
> write-same operations.
OK.
Reviewed-by: Martin K. Petersen
--
Martin K. Petersen Oracle Linux Engineering
Chaitanya,
> Set the current process's iopriority for flush bio.
Kind of weird for a flush command. But we might as well be consistent.
Reviewed-by: Martin K. Petersen
--
Martin K. Petersen Oracle Linux Engineering
Chaitanya,
> Set the current process's iopriority to the bio for REQ_OP_ZONE_RESET.
Looks fine.
Reviewed-by: Martin K. Petersen
--
Martin K. Petersen Oracle Linux Engineering
Hi Chaitanya,
> +static inline sector_t bdev_nr_sects(struct block_device *bdev)
> +{
> + return part_nr_sects_read(bdev->bd_part);
> +}
Can bdev end up being NULL in any of the call sites?
Otherwise no objections.
--
Martin K. Petersen Oracle Linux Engineering
Wenwen,
> To fix this issue, free the allocated buffer before returning from
> bio_integrity_prep().
Acked-by: Martin K. Petersen
--
Martin K. Petersen Oracle Linux Engineering
Bart,
Looks fine.
Reviewed-by: Martin K. Petersen
--
Martin K. Petersen Oracle Linux Engineering
Bart,
> Fix the spelling of the wbt_lat_usec sysfs attribute.
Reviewed-by: Martin K. Petersen
--
Martin K. Petersen Oracle Linux Engineering
Bart,
> Commit f9824952ee1c ("block: update sysfs documentation") # v5.0 broke
> the alphabetical order of the sysfs attribute names. List queue sysfs
> attribute names alphabetically.
Reviewed-by: Martin K. Petersen
--
Martin K. Petersen Oracle Linux Engineering
nds for.
I don't have a problem clarifying the documentation. But the sysfs knobs
have a "segment" suffix so for better or for worse it is part of the
block layer lingo.
Reviewed-by: Martin K. Petersen
--
Martin K. Petersen Oracle Linux Engineering
net/lists/linux-scsi/msg129146.html). This patch
> is an extension of a subset of Martin's patch.
I'd rather we just get this one merged:
https://patchwork.kernel.org/patch/10967367/
It even comes with a shiny blktest.
--
Martin K. Petersen Oracle Linux Engineering
r the
VPD page to be present. The protocol feature is not tied to the
transport signaling speed in any way. But general support for the BL VPD
page roughly coincided with vendors introducing 12 Gbps SAS and 16 Gbps
FC products to the market.
--
Martin K. Petersen Oracle Linux Engineering
's Block Limits directly:
# sg_vpd -p bl /dev/sdX
If you want to tinker, you can simulate a SCSI disk with your choice of
io_opt:
# modprobe scsi_debug opt_blks=N
where N is the number of logical blocks to report as being the optimal
I/O size.
--
Martin K. Petersen Oracle Linux Engineering
Eric,
> While some drivers set queue_limits.io_opt (e.g., md raid5), there are
> currently no SCSI/RAID controller drivers that do.
That's not true. Lots of SCSI RAID devices report a stripe width.
--
Martin K. Petersen Oracle Linux Engineering
So it would be good to use the same
plumbing for both.
--
Martin K. Petersen Oracle Linux Engineering
int exactly the same string for both
error log messages and tracepoints. Since Chaitanya is doing a lot of
work in this area anyway, that may be worth looking into?
--
Martin K. Petersen Oracle Linux Engineering
Chaitanya,
> This is a pure code cleanup patch and doesn't change any functionality.
> In block layer to identify the request operation req_op() macro is
> used, so change the open coding the req_op() in the blk-mq-debugfs.c.
Looks good.
Reviewed-by: Martin K. Petersen
.
I'll monitor and merge them.
--
Martin K. Petersen Oracle Linux Engineering
in the block
layer.
Signed-off-by: Martin K. Petersen
---
v2: Hadn't verified the non-sdebug-wp parameter case after I changed
output format. Fixed.
---
tests/block/022 | 92 +
tests/block/022.out | 22 +++
2 files changed, 114 inser
set, both the whole disk device and any
partitions will reflect the current write-protect state of the
underlying device.
Cc:
Cc: Jeremy Cline
Cc: Ewan D. Milne
Reported-by: Oleksii Kurochko
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201221
Signed-off-by: Martin K. Petersen
---
in the block
layer.
Signed-off-by: Martin K. Petersen
---
tests/block/022 | 92 +
tests/block/022.out | 22 +++
2 files changed, 114 insertions(+)
create mode 100755 tests/block/022
create mode 100644 tests/block/022.out
diff --git a
ial filesystem blocks are zeroed, and
whole filesystem blocks are removed from the file. After a successful
call, subsequent reads from this range will return zeroes."
That matches the block device behavior as far as I'm concerned.
--
Martin K. Petersen Oracle Linux Engineering
rence between (1) and (3) would
be negligible and make this distinction moot. However, we have to
support devices that have a wide variety of media and hardware
characteristics. So I don't see pure deallocate going away. Doesn't mean
that I am not pushing vendors to handle (3) because I think it is very
important. And why we defined WRITE ZEROES in the first place.
--
Martin K. Petersen Oracle Linux Engineering
.
We have:
Allocate and zero: FALLOC_FL_ZERO_RANGE
Deallocate and zero: FALLOC_FL_PUNCH_HOLE
Deallocate: FALLOC_FL_PUNCH_HOLE | FALLOC_FL_NO_HIDE_STALE
but are missing:
Allocate:FALLOC_FL_ZERO_RANGE | FALLOC_FL_NO_HIDE_STALE
The devices that implement anchor s
pin down important areas to ensure one doesn't
get ENOSPC when writing journal or metadata. However, these are
typically the areas that we deliberately zero to ensure predictable
results. So I think the only case where anchoring makes much sense is on
devices that do zero detection and thus w
MAP command).
WRITE SAME also has an ANCHOR flag which provides a use case we
currently don't have fallocate plumbing for: Allocating blocks without
caring about their contents. I.e. the blocks described by the I/O are
locked down to prevent ENOSPC for future writes.
--
Martin K. Petersen Oracle Linux Engineering
?
The problem is that most vendors implement (3) using (1). But can't make
it work well because (3) was -- and still is for ATA -- outside the
scope of what the protocols can express.
And I agree with you that if (3) was implemented correctly in all
devices, we wouldn't need (1) at all
Matthew,
> Do we have an lsf-discuss mailing list this year? Might be good to
> coordinate arrivals / departures for taxi sharing purposes.
l...@lists.linux-foundation.org
--
Martin K. Petersen Oracle Linux Engineering
/O patterns that produced a
single, non-mergeable 8 byte integrity metadata allocation for every 512
bytes of data in the I/O. It's a pathological corner case. Just make
sure it's something you handle when you muck with this. ext[23] and dd
to the block device used to be able to reproduc
are in sync.
For the retry stuff we should have a similar expectation. It doesn't
have to be fancy. I'm perfectly happy with a check at mkfs/growfs time
that complains if the resulting configuration violates whichever
alignment and other assumptions we end up baking into this.
--
Martin K. Petersen Oracle Linux Engineering
order for us to build highly reliable systems, we have to have a
better building block than "this redundancy retry feature works most of
the time". So to me it is imperative that we provide hard guarantees
once a particular configuration has been set up and stacked. And if the
retry guarantee is somehow invalidated, then we really need to let the
user know about it.
--
Martin K. Petersen Oracle Linux Engineering
ify an
arbitrary multiple of 512 bytes at any 512-byte offset inside a
submitted bio. That would work, but I don't think that's the pony the
filesystem were wishing for?
--
Martin K. Petersen Oracle Linux Engineering
cy inside the array and
therefore not MD.
But I think there are other problems with the callback approach. See my
impending email.
--
Martin K. Petersen Oracle Linux Engineering
Jens,
> The integrity stuff still has that nasty pointer in there. It'd be
> nice to get rid of that as well, and hide it all in a parent container
> of the bio.
That's fine with me. We really didn't have that option a decade ago.
--
Martin K. Petersen Oracle Linux Engineering
that.
We'll just need to handle it exactly like the integrity stuff. We only
allocate the extra bits when the underlying device indicates that it's
required and desired.
--
Martin K. Petersen Oracle Linux Engineering
purpose of the series is to
solicit feedback on the callback approach.
--
Martin K. Petersen Oracle Linux Engineering
olicy" means, rename the field to
> "read_only" for clarity.
>
> Cc: Jeremy Cline
> Cc: Oleksii Kurochko
> Cc: sta...@vger.kernel.org # v4.16+
> Reported-by: Oleksii Kurochko
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201221
> Fixes: 20bd1d02
a: https://bugzilla.kernel.org/show_bug.cgi?id=201221
Fixes: 20bd1d026aac ("scsi: sd: Keep disk read-only when re-reading partition")
Signed-off-by: Martin K. Petersen
---
v3:
- Drop ?: since gcc complains about mixing int and bool (zeroday)
- Drop EXPORT_SYMBOL (hch)
- s
essful in compelling the drive manufacturers to make
DEALLOCATE perform well for typical application workloads. So I'm not
holding my breath...
--
Martin K. Petersen Oracle Linux Engineering
d is dead on anything but the cheapest
devices. And on those it is probably going to be
performance-prohibitive to use it in any other way than a weekly fstrim.
--
Martin K. Petersen Oracle Linux Engineering
devices really need to distinguish between discard-as-a-hint where
it is free to ignore anything that's not a whole multiple of whatever
the internal granularity is, and the WRITE ZEROES use case where the end
result needs to be deterministic.
--
Martin K. Petersen Oracle Linux Engineering
x27;t handle the NVMe WRITE ZEROES
command. That seems like something that will cause us headaches in the
future...
--
Martin K. Petersen Oracle Linux Engineering
Chaitanya,
> - if (!blk_rq_payload_bytes(rq))
> + if (!blk_rq_nr_phys_segments(rq))
Wouldn't it be better to set RQF_SPECIAL_PAYLOAD and friends in
nvme_setup_write_zeroes() like it's done for discard?
--
Martin K. Petersen Oracle Linux Engineering
x27;t see it because the device setting always took precedence and
voided any user setting.
--
Martin K. Petersen Oracle Linux Engineering
we stick with the UINT_MAX check, the comment should at least point
out why it's there.
--
Martin K. Petersen Oracle Linux Engineering
# v4.16+
> Reported-by: Oleksii Kurochko
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201221
> Fixes: 20bd1d026aac ("scsi: sd: Keep disk read-only when re-reading
> partition")
> Signed-off-by: Martin K. Petersen
>
> ---
>
> v2:
> - Trac
e same as discarding.
--
Martin K. Petersen Oracle Linux Engineering
zhangxiaoxu,
> Any progress about the problme?
> Should we disable the write same when stack the different LBA disks?
Yes, please.
--
Martin K. Petersen Oracle Linux Engineering
ry and
> not helpful in debugging.
Reviewed-by: Martin K. Petersen
--
Martin K. Petersen Oracle Linux Engineering
when re-reading partition")
Signed-off-by: Martin K. Petersen
---
v2:
- Track user read-only state in a bitmap
- Work around the regression that caused us to drop user
preferences on revalidate
---
block/genhd.c | 22 +-
block/ioct
Yeah, this looks good to me. I'll queue it up for fixes.
--
Martin K. Petersen Oracle Linux Engineering
rified
that the code does the right thing.
But obviously I have my doubts about $RANDOM_USB_GIZMO.
--
Martin K. Petersen Oracle Linux Engineering
for stable.
The intent with this patch was merely as a workaround for people stuck
with write-protected drives after boot. The tristate wasn't my first
choice, but it turned out to be the path of least resistance for a
stable fix.
--
Martin K. Petersen Oracle Linux Engineering
he disk as writable again since this would be
> true. Perhaps it's a somewhat far-fetched scenario.
OK, I missed that particular entry point. Will fix.
--
Martin K. Petersen Oracle Linux Engineering
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201221
Fixes: 20bd1d026aac ("scsi: sd: Keep disk read-only when re-reading partition")
Signed-off-by: Martin K. Petersen
---
I have verified that get_disk_ro() and bdev_read_only() callers all
handle the additional value correctly. Same
d for years and is getting in the
> way of block / SCSI changes, and does not even work properly currently,
> so I think it's finally time to drop it.
Applied to 5.1/scsi-queue. Thanks!
--
Martin K. Petersen Oracle Linux Engineering
c. But adoption has been pretty slow.
I don't have any problems keeping WRITE_SAME around if people are
actually using it. It just seemed like most active users only cared
about writing zeroes.
--
Martin K. Petersen Oracle Linux Engineering
evant due to the payload being the ZERO_PAGE), it may be worthwhile
to remove REQ_OP_WRITE_SAME. I think drbd is the only user relying on a
non-zero payload. The target code ends up manually iterating, if I
remember correctly...
--
Martin K. Petersen Oracle Linux Engineering
in the various SCSI structs. So I
don't have any problem killing the wrappers except they may actually be
handy for regressions like this one where you could #error if the driver
writer violates the ordering requirement.
--
Martin K. Petersen Oracle Linux Engineering
es / 128 bits that we can grab for things like KV.
Great!
--
Martin K. Petersen Oracle Linux Engineering
nitialization. So that's something I think
we could--and should--improve.
--
Martin K. Petersen Oracle Linux Engineering
key. How do we extend this interface beyond the
flags?
--
Martin K. Petersen Oracle Linux Engineering
Logan,
> To prevent this, the calls to scsi_host_set_prot() are moved into
> isci_host_alloc() before the call to scsi_add_host(). Out of caution,
> also move the similar call to scsi_host_set_guard().
Applied to 5.0/scsi-fixes. Thanks much!
--
Martin K. Petersen Ora
Jens,
> You can add my reviewed-by to 10 and add that one too, that makes more
> sense than me adding it for a post-merge pull. But either way works
> for me.
OK, done.
--
Martin K. Petersen Oracle Linux Engineering
for me to
pull in).
--
Martin K. Petersen Oracle Linux Engineering
John,
> What would we be missing that this is unset?
60a89a3ce0cc scsi: t10-pi: Return correct ref tag when queue has no integrity
profile
--
Martin K. Petersen Oracle Linux Engineering
ingle I/O to complete. I'll try to hack away at it
tomorrow.
> BTW, on a loosely related topic, in drivers/scsi/sd.h, I noticed that
> the arrays in sd_prot_op() and sd_prot_flag_mask() could be made
> static. In doing so I found sd.o object code shrunk by ~100B.
Will fix, thanks!
--
Martin K. Petersen Oracle Linux Engineering
to test against for this transport.
We did a preliminary qual of LSISAS3008 many moons ago. But we had no
immediate use for it in our offerings so it never went beyond that. I
know other companies are using it, though.
--
Martin K. Petersen Oracle Linux Engineering
le
> lot of simplifying.
I think it's fine to export these. The block device topology was
explicitly designed to be stackable like this.
--
Martin K. Petersen Oracle Linux Engineering
Ming,
> For DIX/DIF test, I think you need to pass 'dev_size_mb=XXX' instead of
> 'virtual_gb'.
Correct!
--
Martin K. Petersen Oracle Linux Engineering
lusively affects DIF-only setups (so primarily mpt3sas due to lack of
DIX support). It's a regression caused by a commit that went in through
block a few months ago. I'll send a patch...
--
Martin K. Petersen Oracle Linux Engineering
assing
protection information to an HBA.
> For now we may not support DIX. It seems to have issues. We wanted to
> try 3008 card on our system, but it does not seem to support DIX 0-3.
For some reason Broadcom have not upstreamed their DIX support. It's
supposedly available in their outbox driver.
--
Martin K. Petersen Oracle Linux Engineering
ust the last patch?
>
> Today I run this test again and can't reproduce it any more.
>
> So maybe not a new issue, and it is just triggered in yesterday's
> for-4.21/block, :-)
Can you reproduce with yesterday's tree then?
I'm obviously a bit concerned about merg
Sure, that's fine.
--
Martin K. Petersen Oracle Linux Engineering
Jens,
Looks good to me.
Acked-by: Martin K. Petersen
--
Martin K. Petersen Oracle Linux Engineering
Jens,
> Only the SCSI legacy path provides a way to check if target is
> currently busy, provide the same for the MQ path.
Acked-by: Martin K. Petersen
--
Martin K. Petersen Oracle Linux Engineering
think it should be properly documented if there is a preferred
way to order things...
--
Martin K. Petersen Oracle Linux Engineering
he rest of the kernel appears to be either arbitrary
ordering or favoring author SoB as the first tag.
--
Martin K. Petersen Oracle Linux Engineering
he time :-)
Is this a qualified guess or have you actually tried?
> Personally, I don't think there'll be many users around so I've just
> added if for completions sake.
Yeah, I agree it's less critical than mpt[23]sas. However, I'm sure
we'll break someth
1 - 100 of 354 matches
Mail list logo