Re: usercopy whitelist woe in scsi_sense_cache

2018-04-20 Thread Oleksandr Natalenko
I should note that I've removed the reproducer from my server, but I can re-upload it if needed. -- Oleksandr Natalenko (post-factum)

Re: usercopy whitelist woe in scsi_sense_cache

2018-04-17 Thread Oleksandr Natalenko
Hi. 17.04.2018 23:47, Kees Cook wrote: I sent the patch anyway, since it's kind of a robustness improvement, I'd hope. If you fix BFQ also, please add: Reported-by: Oleksandr Natalenko <oleksa...@natalenko.name> Root-caused-by: Kees Cook <keesc...@chromium.org> :) I gotta task-swi

Re: usercopy whitelist woe in scsi_sense_cache

2018-04-17 Thread Oleksandr Natalenko
Hi. 17.04.2018 05:12, Kees Cook wrote: Turning off HARDENED_USERCOPY and turning on KASAN, I see the same report: [ 38.274106] BUG: KASAN: slab-out-of-bounds in _copy_to_user+0x42/0x60 [ 38.274841] Read of size 22 at addr 8800122b8c4b by task smartctl/1064 [ 38.275630] [
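
For reference, the configuration change described in this message (HARDENED_USERCOPY off, KASAN on) can be sketched with the kernel's scripts/config helper; the exact build steps below are illustrative, not taken from the thread:

===
# disable hardened usercopy, enable KASAN, then rebuild
./scripts/config --disable HARDENED_USERCOPY --enable KASAN
make olddefconfig
make -j"$(nproc)"
===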

Re: usercopy whitelist woe in scsi_sense_cache

2018-04-12 Thread Oleksandr Natalenko
Hi. On Thursday 12 April 2018 20:44:37 CEST Kees Cook wrote: > My first bisect attempt gave me commit 5448aca41cd5 ("null_blk: wire > up timeouts"), which seems insane given that null_blk isn't even built > in the .config. I managed to get the testing automated now for a "git > bisect run ...",
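
The "git bisect run" automation mentioned above could look roughly like this; the endpoint tags, the helper script, and the grep pattern are assumptions based on the warning text quoted elsewhere in the thread:

===
# mark the first known-bad and a known-good kernel, then let git drive the test
git bisect start v4.16 v4.15
git bisect run ./bisect-test.sh

# bisect-test.sh (hypothetical): exit 0 = good, 1 = bad, 125 = skip
#   make -j"$(nproc)" || exit 125
#   ./boot-in-qemu.sh > console.log 2>&1
#   grep -q "usercopy: Kernel memory exposure attempt" console.log && exit 1
#   exit 0
===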

Re: usercopy whitelist woe in scsi_sense_cache

2018-04-10 Thread Oleksandr Natalenko
Hi, Kees, Paolo et al. 10.04.2018 08:53, Kees Cook wrote: Unfortunately I only had a single hang with no dumps. I haven't been able to reproduce it since. :( For your convenience I've prepared a VM that contains a reproducer. It consists of 3 disk images (sda.img is for the system, it is
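
A rough sketch of how such a three-disk reproducer VM might be launched; only sda.img and sdb.img are named in the thread, so the third image name, the machine type, and the resource sizes are assumptions:

===
# boot the reproducer VM; q35 provides an AHCI/SATA controller so the images
# appear as /dev/sd* in the guest (sdc.img and the sizes are assumed)
qemu-system-x86_64 -enable-kvm -machine q35 -m 2048 -smp 2 \
    -drive file=sda.img,format=raw \
    -drive file=sdb.img,format=raw \
    -drive file=sdc.img,format=raw
===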

Re: usercopy whitelist woe in scsi_sense_cache

2018-04-10 Thread Oleksandr Natalenko
Hi. 10.04.2018 08:35, Oleksandr Natalenko wrote: - does it reproduce _without_ hardened usercopy? (I would assume yes, but you'd just not get any warning until the hangs started.) If it does reproduce without hardened usercopy, then a new bisect run could narrow the search even more. Looks

Re: usercopy whitelist woe in scsi_sense_cache

2018-04-10 Thread Oleksandr Natalenko
Hi. 09.04.2018 22:30, Kees Cook wrote: echo 1 | tee /sys/block/sd*/queue/nr_requests I can't get this below "4". Oops, yeah. It cannot be less than BLKDEV_MIN_RQ (which is 4), so it is enforced explicitly in queue_requests_store(). It is the same for me. echo 1 | tee
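
For context, the experiment above is just a write to the nr_requests sysfs attribute; a minimal sketch (device names are examples), keeping in mind that queue_requests_store() clamps anything below BLKDEV_MIN_RQ, i.e. 4:

===
# shrink the per-queue request pool for all SATA-style disks
echo 4 | tee /sys/block/sd*/queue/nr_requests

# "echo 1" cannot take effect: values below BLKDEV_MIN_RQ (4) are clamped
cat /sys/block/sda/queue/nr_requests
===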

Re: usercopy whitelist woe in scsi_sense_cache

2018-04-09 Thread Oleksandr Natalenko
Hi. (fancy details for linux-block and BFQ people go below) 09.04.2018 20:32, Kees Cook wrote: Ah, this detail I didn't have. I've changed my environment to build with: CONFIG_BLK_MQ_PCI=y CONFIG_BLK_MQ_VIRTIO=y CONFIG_IOSCHED_BFQ=y boot with scsi_mod.use_blk_mq=1 and select BFQ in the
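
A minimal sketch of the environment described above: the quoted .config fragment, the boot parameter, and runtime BFQ selection (sda is only an example device):

===
# .config fragment quoted above:
#   CONFIG_BLK_MQ_PCI=y
#   CONFIG_BLK_MQ_VIRTIO=y
#   CONFIG_IOSCHED_BFQ=y
# kernel command line: scsi_mod.use_blk_mq=1

# then pick BFQ for the disk under test
echo bfq > /sys/block/sda/queue/scheduler
cat /sys/block/sda/queue/scheduler
===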

Re: usercopy whitelist woe in scsi_sense_cache

2018-04-09 Thread Oleksandr Natalenko
Hi. 09.04.2018 11:35, Christoph Hellwig wrote: I really can't make sense of that report. Sorry, I have nothing to add there so far, I just see the symptom of something going wrong in the ioctl code path that is invoked by smartctl, but I have no idea what's the minimal environment to
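
One way to narrow down which ioctl path smartctl exercises (as discussed above) is to trace it; a small diagnostic sketch, with the caveat that the decoded ioctl names depend on the strace and smartctl versions:

===
# list only the ioctls smartctl issues against the disk
# (SG_IO is the interesting one for the sense-buffer copy)
strace -f -e trace=ioctl smartctl -a /dev/sda
===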

Re: usercopy whitelist woe in scsi_sense_cache

2018-04-08 Thread Oleksandr Natalenko
Hi. Cc'ing linux-block people (mainly Christoph) too because of 17cb960f29c2. Also duplicating the initial statement for them. With v4.16 (and now with v4.16.1) it is possible to trigger a usercopy whitelist warning and/or bug while running smartctl on a SATA disk with blk-mq and BFQ enabled.

Re: usercopy whitelist woe in scsi_sense_cache

2018-04-06 Thread Oleksandr Natalenko
Hi. 05.04.2018 20:52, Kees Cook wrote: Okay. My qemu gets mad about that and wants the format=raw argument, so I'm using: -drive file=sda.img,format=raw \ -drive file=sdb.img,format=raw \ How are you running your smartctl? I'm doing this now: [1] Running while :; do (

Re: usercopy whitelist woe in scsi_sense_cache

2018-04-05 Thread Oleksandr Natalenko
05.04.2018 16:32, Oleksandr Natalenko wrote: "-hda sda.img -hdb sda.img" That should read "-hda sda.img -hdb sdb.img", of course; I don't pass the same disk twice ☺

Re: usercopy whitelist woe in scsi_sense_cache

2018-04-05 Thread Oleksandr Natalenko
Hi. 05.04.2018 16:21, Kees Cook wrote: I had a VM running overnight with: [1] Running while :; do smartctl -a /dev/sda > /dev/null; done & [2]- Running while :; do ls --color=auto -lR / > /dev/null 2> /dev/null; done & [3]+ Running
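
For completeness, the two background jobs quoted above as runnable shell loops (the third job is truncated in the archive and is not reconstructed here):

===
# job [1]: hammer the SG_IO/sense-buffer path
while :; do smartctl -a /dev/sda > /dev/null; done &

# job [2]: generate parallel filesystem I/O
while :; do ls --color=auto -lR / > /dev/null 2> /dev/null; done &
===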

Re: usercopy whitelist woe in scsi_sense_cache

2018-04-05 Thread Oleksandr Natalenko
Hi. 04.04.2018 23:25, Kees Cook wrote: Thanks for the report! I hope someone more familiar with sg_io() can help explain the changing buffer offset... :P Also, FYI, I kept the server running with smartctl periodically invoked, and it was still triggering BUGs; however, I consider them to be

Re: usercopy whitelist woe in scsi_sense_cache

2018-04-04 Thread Oleksandr Natalenko
Hi. 04.04.2018 23:25, Kees Cook wrote: Actually, I can trigger a BUG too: [ 129.259213] usercopy: Kernel memory exposure attempt detected from SLUB object 'scsi_sense_cache' (offset 119, size 22)! Wow, yeah, that's totally outside the slub object_size. How did you trigger this? Just luck

Re: usercopy whitelist woe in scsi_sense_cache

2018-04-04 Thread Oleksandr Natalenko
Hi. On Wednesday 4 April 2018 22:21:53 CEST Kees Cook wrote: ... That means scsi_sense_cache should be 96 bytes in size? But a 22 byte read starting at offset 94 happened? That seems like a 20 byte read beyond the end of the SLUB object? Though if it were reading past the actual end of the
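
The size reasoning above (a 96-byte scsi_sense_cache object, a 22-byte copy starting at offset 94) can be cross-checked on a running SLUB kernel; a small sketch, noting that the cache may be merged with a same-sized slab on some configurations, in which case the sysfs directory name differs:

===
# SLUB's view of the sense-buffer cache
cat /sys/kernel/slab/scsi_sense_cache/object_size   # expected: 96
grep scsi_sense_cache /proc/slabinfo                # objsize column shows the same
===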

usercopy whitelist woe in scsi_sense_cache

2018-04-04 Thread Oleksandr Natalenko
Hi, Kees, David et al. With v4.16 I get the following dump while using smartctl: === [ 261.260617] [ cut here ] [ 261.262135] Bad or missing usercopy whitelist? Kernel memory exposure attempt detected from SLUB object 'scsi_sense_cache' (offset 94, size 22)! [

Re: [PATCH v11 0/7] block, scsi, md: Improve suspend and resume

2017-11-09 Thread Oleksandr Natalenko
Then, Reported-by: Oleksandr Natalenko <oleksa...@natalenko.name> Tested-by: Oleksandr Natalenko <oleksa...@natalenko.name> On Thursday 9 November 2017 17:55:58 CET Jens Axboe wrote: > On 11/09/2017 09:54 AM, Bart Van Assche wrote: > > On Thu, 2017-11-09 at 07:16 +0100

Re: [PATCH v11 0/7] block, scsi, md: Improve suspend and resume

2017-11-08 Thread Oleksandr Natalenko
Bart, is this something known to you, or is it just my fault applying this series to v4.13? Apart from this warning, suspend/resume works for me: === [ 27.383846] sd 0:0:0:0: [sda] Starting disk [ 27.383976] sd 1:0:0:0: [sdb] Starting disk [ 27.451218] sdb: Attempt to allocate

Re: [PATCH V8 0/8] block/scsi: safe SCSI quiescing

2017-11-07 Thread Oleksandr Natalenko
Hi Ming, Jens. What is the fate of this patchset please? 03.10.2017 16:03, Ming Lei wrote: Hi Jens, Please consider this patchset for V4.15, and it fixes one kind of long-term I/O hang issue in either block legacy path or blk-mq. The current SCSI quiesce isn't safe and easy to trigger I/O

Re: [PATCH v10 00/10] block, scsi, md: Improve suspend and resume

2017-10-21 Thread Oleksandr Natalenko
Well, I've cherry-picked this series onto the current upstream/master branch and got this while performing another suspend attempt: === [ 62.415890] Freezing of tasks failed after 20.007 seconds (1 tasks refusing to freeze, wq_busy=0): [ 62.421150] xfsaild/dm-7 D 0 289 2 0x8000 [

Re: [PATCH v10 00/10] block, scsi, md: Improve suspend and resume

2017-10-17 Thread Oleksandr Natalenko
I've indeed tested some previous versions of this patchset, but still use Ming's solution. Can it be clarified which one (Bart's or Ming's) is the correct approach? On Wednesday 18 October 2017 1:28:14 CEST Jens Axboe wrote: > On 10/17/2017 05:26 PM, Bart Van Assche wrote: > > Hello Jens, > > > > It

Re: [PATCH V10 0/8] blk-mq-sched: improve sequential I/O performance

2017-10-14 Thread Oleksandr Natalenko
Hi. By any chance, could this be backported to 4.14? I'm confused by "SCSI: allow to pass null rq to scsi_prep_state_check()" since it uses refactored flags. === if (req && !(req->rq_flags & RQF_PREEMPT)) === Is it safe to revert to REQ_PREEMPT here, or should rq_flags also be replaced

Re: [PATCH V8 0/8] block/scsi: safe SCSI quiescing

2017-10-03 Thread Oleksandr Natalenko
Also Tested-by: Oleksandr Natalenko <oleksa...@natalenko.name> for the whole of v8. On Tuesday 3 October 2017 16:03:58 CEST Ming Lei wrote: > Hi Jens, > > Please consider this patchset for V4.15, and it fixes one > kind of long-term I/O hang issue in either block legac

Re: [PATCH V6 0/6] block/scsi: safe SCSI quiescing

2017-09-28 Thread Oleksandr Natalenko
Hey. I can confirm that v6 of your patchset still works well for me. Tested on a v4.13 kernel. Thanks. On Wednesday 27 September 2017 10:52:41 CEST Ming Lei wrote: > On Wed, Sep 27, 2017 at 04:27:51PM +0800, Ming Lei wrote: > > On Wed, Sep 27, 2017 at 09:57:37AM +0200, Martin Steigerwald wrote: > > >

Re: [PATCH V4 0/10] block/scsi: safe SCSI quiescing

2017-09-11 Thread Oleksandr Natalenko
For v4 with regard to suspend/resume: Tested-by: Oleksandr Natalenko <oleksa...@natalenko.name> On Monday 11 September 2017 13:10:11 CEST Ming Lei wrote: > Hi, > > The current SCSI quiesce isn't safe and easy to trigger I/O deadlock. > > Once SCSI device is put into QUIESCE,

Re: [PATCH V3 0/8] block/scsi: safe SCSI quiescing

2017-09-02 Thread Oleksandr Natalenko
Again, Tested-by: Oleksandr Natalenko <oleksa...@natalenko.name> On Saturday 2 September 2017 15:08:32 CEST Ming Lei wrote: > Hi, > > The current SCSI quiesce isn't safe and easy to trigger I/O deadlock. > > Once SCSI device is put into QUIESCE, no new request except

Re: [PATCH V2 0/8] block/scsi: safe SCSI quiescing

2017-09-02 Thread Oleksandr Natalenko
With regard to the suspend/resume cycle: Tested-by: Oleksandr Natalenko <oleksa...@natalenko.name> On Friday 1 September 2017 20:49:49 CEST Ming Lei wrote: > Hi, > > The current SCSI quiesce isn't safe and easy to trigger I/O deadlock. > > Once SCSI device is put into QUIESCE,

Re: [PATCH 0/9] block/scsi: safe SCSI quiescing

2017-08-31 Thread Oleksandr Natalenko
at 07:34:06PM +0200, Oleksandr Natalenko wrote: > > Since I'm in CC, does this series aim to replace 2 patches I've tested > > before: > > > > blk-mq: add requests in the tail of hctx->dispatch > > blk-mq: align to legacy's implementation of blk_execute_rq > > &

Re: [PATCH 0/9] block/scsi: safe SCSI quiescing

2017-08-31 Thread Oleksandr Natalenko
Since I'm in CC, does this series aim to replace 2 patches I've tested before: blk-mq: add requests in the tail of hctx->dispatch blk-mq: align to legacy's implementation of blk_execute_rq ? On Thursday 31 August 2017 19:27:19 CEST Ming Lei wrote: > The current SCSI quiesce isn't safe and easy