Re: [PATCH] IB/srp: Fix a memory leak
On 19/11/2015 01:00, Bart Van Assche wrote: If srp_connect_ch() returns a positive value then that is considered by its caller as a connection failure but this does not result in a scsi_host_put() call and additionally causes the srp_create_target() function to return a positive value while it should return a negative value. Avoid all this confusion and additionally fix a memory leak by ensuring that srp_connect_ch() always returns a value that is <= 0. This patch avoids that a rejected login triggers the following memory leak: unreferenced object 0x88021b24a220 (size 8): comm "srp_daemon", pid 56421, jiffies 4295006762 (age 4240.750s) hex dump (first 8 bytes): 68 6f 73 74 35 38 00 a5 host58.. backtrace: [] kmemleak_alloc+0x7a/0xc0 [] __kmalloc_track_caller+0xfe/0x160 [] kvasprintf+0x5b/0x90 [] kvasprintf_const+0x8d/0xb0 [] kobject_set_name_vargs+0x3c/0xa0 [] dev_set_name+0x3c/0x40 [] scsi_host_alloc+0x327/0x4b0 [] srp_create_target+0x4e/0x8a0 [ib_srp] [] dev_attr_store+0x1b/0x20 [] sysfs_kf_write+0x4a/0x60 [] kernfs_fop_write+0x14e/0x180 [] __vfs_write+0x2f/0xf0 [] vfs_write+0xa4/0x100 [] SyS_write+0x54/0xc0 [] entry_SYSCALL_64_fastpath+0x12/0x6f Signed-off-by: Bart Van Assche Cc: Sagi Grimberg Cc: Sebastian Parschauer Cc: Christoph Hellwig Cc: stable Looks good, Reviewed-by: Sagi Grimberg -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 00/71] More fixes, cleanup and modernization for NCR5380 drivers
Hi Ondrej, On Fri, 20 Nov 2015, Ondrej Zary wrote: > On Friday 20 November 2015 02:41:19 Finn Thain wrote: > > > > > > My tests involved 3 different scsi targets (two disks and a CD-ROM) > > but none of these send a SDTR. Your log says the driver correctly > > rejected the SDTR message but that doesn't mean the target actually > > went to MSG IN phase and got the message. Do you have any older > > targets you can test? > > Another disk, without patches: > > [ 84.481582] pnp 01:01.00: activated > [ 84.489650] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x240, > n_io_port 16, base 0x0, irq 0, can_queue 16, cmd_per_lun 2, sg_tablesize 128, > this_id 7, flags { DTC3181E NO_PSEUDO_DMA }, USLEEP_POLL 3, USLEEP_WAITLONG > 1250, options { AUTOPROBE_IRQ PSEUDO_DMA NCR53C400 } > [ 84.953332] scsi 2:0:1:0: Direct-Access QUANTUM LP240S GM240S01X 4.6 > PQ: 0 ANSI: 2 CCS > [ 86.786475] sd 2:0:1:0: Attached scsi generic sg1 type 0 > [ 86.793753] sd 2:0:1:0: [sdb] 479350 512-byte logical blocks: (245 MB/234 > MiB) > [ 86.998555] sd 2:0:1:0: [sdb] Write Protect is off > [ 87.406068] sd 2:0:1:0: [sdb] Write cache: enabled, read cache: enabled, > doesn't support DPO or FUA > [ 118.888271] sd 2:0:1:0: [sdb] aborting command > [ 118.888738] sd 2:0:1:0: [sdb] aborting command > > With patches: > > [ 258.473748] pnp 01:01.00: activated > [ 258.483592] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x240, > n_io_port 16, base 0x0, irq 0, can_queue 16, cmd_per_lun 2, sg_tablesize 128, > this_id 7, flags { DTC3181E NO_PSEUDO_DMA }, options { AUTOPROBE_IRQ > PSEUDO_DMA } > [ 261.347632] scsi 2:0:1:0: Direct-Access QUANTUM LP240S GM240S01X 4.6 > PQ: 0 ANSI: 2 CCS > [ 275.560451] sd 2:0:1:0: Attached scsi generic sg1 type 0 > [ 275.632519] sd 2:0:1:0: [sdb] 479350 512-byte logical blocks: (245 MB/234 > MiB) > [ 275.635533] sd 2:0:1:0: [sdb] Write Protect is off > [ 275.642315] sd 2:0:1:0: [sdb] Write cache: enabled, read cache: enabled, > doesn't support DPO or FUA > [ 469.076347] sd 2:0:1:0: [sdb] FAILED Result: hostbyte=DID_TIME_OUT > driverbyte=DRIVER_SENSE > [ 469.076613] sd 2:0:1:0: [sdb] Sense Key : Aborted Command [current] > [ 469.076851] sd 2:0:1:0: [sdb] Add. Sense: No additional sense information > [ 469.077086] sd 2:0:1:0: [sdb] CDB: Read(10) 28 00 00 00 00 02 00 00 02 00 > [ 469.077306] blk_update_request: I/O error, dev sdb, sector 2 > [ 469.077522] Buffer I/O error on dev sdb, logical block 1, async page read > [ 480.108255] INFO: task kworker/u2:2:60 blocked for more than 120 seconds. > [ 480.109773] Not tainted 4.3.0-rc1+ #74 > [ 480.109973] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables > this message. > [ 480.110179] kworker/u2:2D 0040 060 2 0x > [ 480.110671] Workqueue: events_unbound async_run_entry_fn > [ 480.110999] cf9e8780 0046 2eff25f7 0040 c117f111 2ee82733 > 0040 0016fec4 > [ 480.112390] cfaa6000 7fff c139c7d2 c139c504 > 7fff c139d9d3 > [ 480.113661] 0040 cfaa5cfc c106f460 00161108 c648 > 2106dcce 0040 > [ 480.114893] Call Trace: > [ 480.115124] [] ? blk_queue_bio+0x1e8/0x1fb > [ 480.115344] [] ? bit_wait_io_timeout+0x3d/0x3d > [ 480.115564] [] ? schedule+0x5b/0x67 > [ 480.115794] [] ? schedule_timeout+0x13/0xc5 > [ 480.116007] [] ? timekeeping_get_ns+0x10/0x69 > [ 480.116406] [] ? bit_wait_io_timeout+0x3d/0x3d > [ 480.116636] [] ? ktime_get+0x38/0x48 > [ 480.116843] [] ? io_schedule_timeout+0x83/0xd7 > [ 480.117062] [] ? bit_wait_io+0x21/0x26 > [ 480.117256] [] ? __wait_on_bit+0x2f/0x5a > [ 480.117486] [] ? blkdev_readpages+0x15/0x15 > [ 480.117704] [] ? wait_on_page_bit+0x57/0x5e > [ 480.117942] [] ? wake_atomic_t_function+0x2a/0x2a > [ 480.118151] [] ? wait_on_page_read+0xf/0x2a > [ 480.118373] [] ? do_read_cache_page+0x8e/0x116 > [ 480.118587] [] ? blkdev_readpages+0x15/0x15 > [ 480.118809] [] ? read_cache_page+0x14/0x18 > [ 480.119008] [] ? read_dev_sector+0x25/0x57 > [ 480.119222] [] ? adfspart_check_ICS+0x30/0x1ac > [ 480.119438] [] ? vsnprintf+0x78/0x25d > [ 480.119671] [] ? snprintf+0x16/0x18 > [ 480.119874] [] ? check_partition+0xd7/0x165 > [ 480.120253] [] ? rescan_partitions+0x95/0x283 > [ 480.120443] [] ? scsi_block_when_processing_errors+0x13/0xae > [ 480.120693] [] ? mutex_lock+0x9/0x21 > [ 480.120915] [] ? __blkdev_get+0x155/0x2f6 > [ 480.121133] [] ? blkdev_get+0x148/0x258 > [ 480.121350] [] ? unlock_new_inode+0x36/0x3c > [ 480.121570] [] ? bdget+0xdc/0xe6 > [ 480.121761] [] ? add_disk+0x221/0x368 > [ 480.121996] [] ? sd_probe_async+0xed/0x157 > [ 480.122214] [] ? async_run_entry_fn+0x2c/0xad > [ 480.122437] [] ? process_one_work+0x130/0x21f > [ 480.122639] [] ? worker_thread+0x18a/0x247 > [ 480.122854] [] ? process_scheduled_works+0x1d/0x1d > [ 480.123069] [] ? kthread+0x7c/0x81 > [ 480.123288] [] ? ret_from_kernel_th
[PATCH 0/3] Badblock tracking for gendisks
Patch 1 copies badblock management code into a header of its own, making it generally available. It follows common libraries of code such as linked lists, where anyone may embed a core data structure in another place, and use the provided accessor functions to manipulate the data. Patch 2 adds badblock tracking to gendisks (in preparation for use by NVDIMM devices). Right now, it is turned on unconditionally - I'd appreciate comments on if that is the right path. Patch 3 converts md over to use the new badblocks 'library'. I have done some pretty simple testing on this - created a raid 1 device, made sure the sysfs entries show up, and can be used to add and view badblocks. A closer look by the md folks would be nice here. Vishal Verma (3): badblocks: Add core badblock management code block: Add badblock management for gendisks md: convert to use the generic badblocks code block/genhd.c | 64 ++ drivers/md/md.c | 495 ++-- drivers/md/md.h | 31 +-- include/linux/badblocks.h | 512 ++ include/linux/genhd.h | 6 + 5 files changed, 603 insertions(+), 505 deletions(-) create mode 100644 include/linux/badblocks.h -- 2.5.0 -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 3/3] md: convert to use the generic badblocks code
Retain badblocks as part of rdev, but use the accessor functions from include/linux/badblocks for all manipulation. Signed-off-by: Vishal Verma --- drivers/md/md.c | 495 +++- drivers/md/md.h | 31 +--- 2 files changed, 21 insertions(+), 505 deletions(-) diff --git a/drivers/md/md.c b/drivers/md/md.c index c702de1..82994d7 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -34,6 +34,7 @@ #include #include +#include #include #include #include @@ -1358,8 +1359,6 @@ static __le32 calc_sb_1_csum(struct mdp_superblock_1 *sb) return cpu_to_le32(csum); } -static int md_set_badblocks(struct badblocks *bb, sector_t s, int sectors, - int acknowledged); static int super_1_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor_version) { struct mdp_superblock_1 *sb; @@ -1484,7 +1483,7 @@ static int super_1_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor_ count <<= sb->bblog_shift; if (bb + 1 == 0) break; - if (md_set_badblocks(&rdev->badblocks, + if (badblocks_set(&rdev->badblocks, sector, count, 1) == 0) return -EINVAL; } @@ -2226,7 +2225,7 @@ repeat: rdev_for_each(rdev, mddev) { if (rdev->badblocks.changed) { rdev->badblocks.changed = 0; - md_ack_all_badblocks(&rdev->badblocks); + ack_all_badblocks(&rdev->badblocks); md_error(mddev, rdev); } clear_bit(Blocked, &rdev->flags); @@ -2352,7 +2351,7 @@ repeat: clear_bit(Blocked, &rdev->flags); if (any_badblocks_changed) - md_ack_all_badblocks(&rdev->badblocks); + ack_all_badblocks(&rdev->badblocks); clear_bit(BlockedBadBlocks, &rdev->flags); wake_up(&rdev->blocked_wait); } @@ -2944,11 +2943,17 @@ static ssize_t recovery_start_store(struct md_rdev *rdev, const char *buf, size_ static struct rdev_sysfs_entry rdev_recovery_start = __ATTR(recovery_start, S_IRUGO|S_IWUSR, recovery_start_show, recovery_start_store); -static ssize_t -badblocks_show(struct badblocks *bb, char *page, int unack); -static ssize_t -badblocks_store(struct badblocks *bb, const char *page, size_t len, int unack); - +/* sysfs access to bad-blocks list. + * We present two files. + * 'bad-blocks' lists sector numbers and lengths of ranges that + *are recorded as bad. The list is truncated to fit within + *the one-page limit of sysfs. + *Writing "sector length" to this file adds an acknowledged + *bad block list. + * 'unacknowledged-bad-blocks' lists bad blocks that have not yet + *been acknowledged. Writing to this file adds bad blocks + *without acknowledging them. This is largely for testing. + */ static ssize_t bb_show(struct md_rdev *rdev, char *page) { return badblocks_show(&rdev->badblocks, page, 0); @@ -8348,253 +8353,7 @@ void md_finish_reshape(struct mddev *mddev) } EXPORT_SYMBOL(md_finish_reshape); -/* Bad block management. - * We can record which blocks on each device are 'bad' and so just - * fail those blocks, or that stripe, rather than the whole device. - * Entries in the bad-block table are 64bits wide. This comprises: - * Length of bad-range, in sectors: 0-511 for lengths 1-512 - * Start of bad-range, sector offset, 54 bits (allows 8 exbibytes) - * A 'shift' can be set so that larger blocks are tracked and - * consequently larger devices can be covered. - * 'Acknowledged' flag - 1 bit. - the most significant bit. - * - * Locking of the bad-block table uses a seqlock so md_is_badblock - * might need to retry if it is very unlucky. - * We will sometimes want to check for bad blocks in a bi_end_io function, - * so we use the write_seqlock_irq variant. - * - * When looking for a bad block we specify a range and want to - * know if any block in the range is bad. So we binary-search - * to the last range that starts at-or-before the given endpoint, - * (or "before the sector after the target range") - * then see if it ends after the given start. - * We return - * 0 if there are no known bad blocks in the range - * 1 if there are known bad block which are all acknowledged - * -1 if there are bad blocks which have not yet been acknowledged in metadata. - * plus the start/length of the first bad section we overlap. - */ -int md_is_badblock(struct badblocks *bb, sector_t s, int sectors, - sector_t *first_bad, int *bad_sectors) -{ - int hi; - int lo; - u64 *p = bb->pag
[PATCH 1/3] badblocks: Add core badblock management code
Take the core badblocks implementation from md, and make it generally available. This follows the same style as kernel implementations of linked lists, rb-trees etc, where you can have a structure that can be embedded anywhere, and accessor functions to manipulate the data. The only changes in this copy of the code are ones to generalize function/variable names from md-specific ones. Also add init and free functions. Signed-off-by: Vishal Verma --- include/linux/badblocks.h | 512 ++ 1 file changed, 512 insertions(+) create mode 100644 include/linux/badblocks.h diff --git a/include/linux/badblocks.h b/include/linux/badblocks.h new file mode 100644 index 000..94fa348 --- /dev/null +++ b/include/linux/badblocks.h @@ -0,0 +1,512 @@ +#ifndef _LINUX_BADBLOCKS_H +#define _LINUX_BADBLOCKS_H + +#include +#include +#include + +#define BB_LEN_MASK(0x01FFULL) +#define BB_OFFSET_MASK (0x7E00ULL) +#define BB_ACK_MASK(0x8000ULL) +#define BB_MAX_LEN 512 +#define BB_OFFSET(x) (((x) & BB_OFFSET_MASK) >> 9) +#define BB_LEN(x) (((x) & BB_LEN_MASK) + 1) +#define BB_ACK(x) (!!((x) & BB_ACK_MASK)) +#define BB_MAKE(a, l, ack) (((a)<<9) | ((l)-1) | ((u64)(!!(ack)) << 63)) + +/* Bad block numbers are stored sorted in a single page. + * 64bits is used for each block or extent. + * 54 bits are sector number, 9 bits are extent size, + * 1 bit is an 'acknowledged' flag. + */ +#define MAX_BADBLOCKS (PAGE_SIZE/8) + +struct badblocks { + int count; /* count of bad blocks */ + int unacked_exist; /* there probably are unacknowledged +* bad blocks. This is only cleared +* when a read discovers none +*/ + int shift; /* shift from sectors to block size +* a -ve shift means badblocks are +* disabled.*/ + u64 *page; /* badblock list */ + int changed; + seqlock_t lock; + sector_t sector; + sector_t size; /* in sectors */ +}; + +/* Bad block management. + * We can record which blocks on each device are 'bad' and so just + * fail those blocks, or that stripe, rather than the whole device. + * Entries in the bad-block table are 64bits wide. This comprises: + * Length of bad-range, in sectors: 0-511 for lengths 1-512 + * Start of bad-range, sector offset, 54 bits (allows 8 exbibytes) + * A 'shift' can be set so that larger blocks are tracked and + * consequently larger devices can be covered. + * 'Acknowledged' flag - 1 bit. - the most significant bit. + * + * Locking of the bad-block table uses a seqlock so badblocks_check + * might need to retry if it is very unlucky. + * We will sometimes want to check for bad blocks in a bi_end_io function, + * so we use the write_seqlock_irq variant. + * + * When looking for a bad block we specify a range and want to + * know if any block in the range is bad. So we binary-search + * to the last range that starts at-or-before the given endpoint, + * (or "before the sector after the target range") + * then see if it ends after the given start. + * We return + * 0 if there are no known bad blocks in the range + * 1 if there are known bad block which are all acknowledged + * -1 if there are bad blocks which have not yet been acknowledged in metadata. + * plus the start/length of the first bad section we overlap. + */ +static inline int badblocks_check(struct badblocks *bb, sector_t s, int sectors, + sector_t *first_bad, int *bad_sectors) +{ + int hi; + int lo; + u64 *p = bb->page; + int rv; + sector_t target = s + sectors; + unsigned seq; + + if (bb->shift > 0) { + /* round the start down, and the end up */ + s >>= bb->shift; + target += (1>= bb->shift; + sectors = target - s; + } + /* 'target' is now the first block after the bad range */ + +retry: + seq = read_seqbegin(&bb->lock); + lo = 0; + rv = 0; + hi = bb->count; + + /* Binary search between lo and hi for 'target' +* i.e. for the last range that starts before 'target' +*/ + /* INVARIANT: ranges before 'lo' and at-or-after 'hi' +* are known not to be the last range before target. +* VARIANT: hi-lo is the number of possible +* ranges, and decreases until it reaches 1 +*/ + while (hi - lo > 1) { + int mid = (lo + hi) / 2; + sector_t a = BB_OFFSET(p[mid]); + if (a < target) + /* This could still be the one, earlier ranges +* could not. */ + lo = mid; + else + /* This and later ranges are de
[PATCH 2/3] block: Add badblock management for gendisks
NVDIMM devices, which can behave more like DRAM rather than block devices, may develop bad cache lines, or 'poison'. A block device exposed by the pmem driver can then consume poison via a read (or write), and cause a machine check. On platforms without machine check recovery features, this would mean a crash. The block device maintaining a runtime list of all known sectors that have poison can directly avoid this, and also provide a path forward to enable proper handling/recovery for DAX faults on such a device. Use the new badblock management interfaces to add a badblocks list to gendisks. Signed-off-by: Vishal Verma --- block/genhd.c | 64 +++ include/linux/genhd.h | 6 + 2 files changed, 70 insertions(+) diff --git a/block/genhd.c b/block/genhd.c index 0c706f3..4209c32 100644 --- a/block/genhd.c +++ b/block/genhd.c @@ -20,6 +20,7 @@ #include #include #include +#include #include "blk.h" @@ -505,6 +506,20 @@ static int exact_lock(dev_t devt, void *data) return 0; } +static void disk_alloc_badblocks(struct gendisk *disk) +{ + disk->bb = kzalloc(sizeof(disk->bb), GFP_KERNEL); + if (!disk->bb) { + pr_warn("%s: failed to allocate space for badblocks\n", + disk->disk_name); + return; + } + + if (badblocks_init(disk->bb, 1)) + pr_warn("%s: failed to initialize badblocks\n", + disk->disk_name); +} + static void register_disk(struct gendisk *disk) { struct device *ddev = disk_to_dev(disk); @@ -609,6 +624,7 @@ void add_disk(struct gendisk *disk) disk->first_minor = MINOR(devt); disk_alloc_events(disk); + disk_alloc_badblocks(disk); /* Register BDI before referencing it from bdev */ bdi = &disk->queue->backing_dev_info; @@ -657,6 +673,9 @@ void del_gendisk(struct gendisk *disk) blk_unregister_queue(disk); blk_unregister_region(disk_devt(disk), disk->minors); + badblocks_free(disk->bb); + kfree(disk->bb); + part_stat_set_all(&disk->part0, 0); disk->part0.stamp = 0; @@ -670,6 +689,48 @@ void del_gendisk(struct gendisk *disk) } EXPORT_SYMBOL(del_gendisk); +/* + * The gendisk usage of badblocks does not track acknowledgements for + * badblocks. We always assume they are acknowledged. + */ +int disk_check_badblocks(struct gendisk *disk, sector_t s, int sectors, + sector_t *first_bad, int *bad_sectors) +{ + return badblocks_check(disk->bb, s, sectors, first_bad, bad_sectors); +} +EXPORT_SYMBOL(disk_check_badblocks); + +int disk_set_badblocks(struct gendisk *disk, sector_t s, int sectors) +{ + return badblocks_set(disk->bb, s, sectors, 1); +} +EXPORT_SYMBOL(disk_set_badblocks); + +int disk_clear_badblocks(struct gendisk *disk, sector_t s, int sectors) +{ + return badblocks_clear(disk->bb, s, sectors); +} +EXPORT_SYMBOL(disk_clear_badblocks); + +/* sysfs access to bad-blocks list. */ +static ssize_t disk_badblocks_show(struct device *dev, + struct device_attribute *attr, + char *page) +{ + struct gendisk *disk = dev_to_disk(dev); + + return badblocks_show(disk->bb, page, 0); +} + +static ssize_t disk_badblocks_store(struct device *dev, + struct device_attribute *attr, + const char *page, size_t len) +{ + struct gendisk *disk = dev_to_disk(dev); + + return badblocks_store(disk->bb, page, len, 0); +} + /** * get_gendisk - get partitioning information for a given device * @devt: device to get partitioning information for @@ -988,6 +1049,8 @@ static DEVICE_ATTR(discard_alignment, S_IRUGO, disk_discard_alignment_show, static DEVICE_ATTR(capability, S_IRUGO, disk_capability_show, NULL); static DEVICE_ATTR(stat, S_IRUGO, part_stat_show, NULL); static DEVICE_ATTR(inflight, S_IRUGO, part_inflight_show, NULL); +static DEVICE_ATTR(badblocks, S_IRUGO | S_IWUSR, disk_badblocks_show, + disk_badblocks_store); #ifdef CONFIG_FAIL_MAKE_REQUEST static struct device_attribute dev_attr_fail = __ATTR(make-it-fail, S_IRUGO|S_IWUSR, part_fail_show, part_fail_store); @@ -1009,6 +1072,7 @@ static struct attribute *disk_attrs[] = { &dev_attr_capability.attr, &dev_attr_stat.attr, &dev_attr_inflight.attr, + &dev_attr_badblocks.attr, #ifdef CONFIG_FAIL_MAKE_REQUEST &dev_attr_fail.attr, #endif diff --git a/include/linux/genhd.h b/include/linux/genhd.h index 2adbfa6..5563bde 100644 --- a/include/linux/genhd.h +++ b/include/linux/genhd.h @@ -162,6 +162,7 @@ struct disk_part_tbl { }; struct disk_events; +struct badblocks; struct gendisk { /* major, first_minor and minors are input parameters only, @@ -201,6 +202,7 @@ struct gendisk { struct blk_integrity *
Re: [PATCH 00/18] ALUA device handler update, part 1
On 11/20/2015 02:52 AM, Hannes Reinecke wrote: One thing, though: I don't really agree with Barts objection that moving to a workqueue would tie in too many resources. Thing is, I'm not convinces that using a work queue is allocating too many resources (we're speaking of 460 vs 240 bytes here). Also we have to retry commands for quite some time (cite the infamous NetApp takeover/giveback, which can take minutes). If we were to handle that without workqueue we'd have to initiate the retry from the end_io callback, causing a quite deep stack recursion. Which I'm not really fond of. Hello Hannes, Sorry if I wasn't clear enough in my previous e-mail about this topic but I'm more concerned about the additional memory needed for thread stacks and thread control data structures than about the additional memory needed for the workqueue. I'd like to see the ALUA device handler implementation scale to thousands of LUNs and target port groups. In case all connections between an initiator and a target port group fail, with a synchronous implementation of STPG we will either need a large number of threads (in case of one thread per STPG command) or the STPG commands will be serialized (if there are fewer threads than portal groups). Neither alternative looks attractive to me. BTW, not all storage arrays need STPG retries. Some arrays are able to process an STPG command quickly (this means within a few seconds). A previous discussion about this topic is available e.g. at http://thread.gmane.org/gmane.linux.scsi/105340/focus=105601. Bart. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Fix a bdi reregistration race, v2
On 11/20/2015 02:44 PM, Aaro Koskinen wrote: I think you should squash the revert of v1 into this patch, and then document the crash the original patch caused and how this new patch is fixing that. Hello Aaro, I'd like to know the opinion of the SCSI maintainers about this. It's not impossible that they would prefer to submit the revert to Linus quickly and only send the reworked fix to Linus during the v4.5 merge window. Bart. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Fix a bdi reregistration race, v2
Hi, I think you should squash the revert of v1 into this patch, and then document the crash the original patch caused and how this new patch is fixing that. A. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] Fix a bdi reregistration race, v2
Unregister and reregister BDI devices in the proper order. This patch avoids that the following kernel warning can be triggered during SCSI device reregistration: WARNING: CPU: 7 PID: 203 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x68/0x80() sysfs: cannot create duplicate filename '/devices/virtual/bdi/8:32' Workqueue: events_unbound async_run_entry_fn Call Trace: [] dump_stack+0x4c/0x65 [] warn_slowpath_common+0x8a/0xc0 [] warn_slowpath_fmt+0x46/0x50 [] sysfs_warn_dup+0x68/0x80 [] sysfs_create_dir_ns+0x7e/0x90 [] kobject_add_internal+0xa8/0x320 [] kobject_add+0x60/0xb0 [] device_add+0x107/0x5e0 [] device_create_groups_vargs+0xd8/0x100 [] device_create_vargs+0x1c/0x20 [] bdi_register+0x63/0x2a0 [] bdi_register_dev+0x27/0x30 [] add_disk+0x1a9/0x4e0 [] sd_probe_async+0x119/0x1d0 [sd_mod] [] async_run_entry_fn+0x4a/0x140 [] process_one_work+0x1d8/0x7c0 [] worker_thread+0x114/0x460 [] kthread+0xf8/0x110 [] ret_from_fork+0x3f/0x70 Signed-off-by: Bart Van Assche Cc: Jens Axboe Cc: Tejun Heo Cc: Jan Kara Cc: Hannes Reinecke Cc: Aaro Koskinen Cc: --- drivers/scsi/scsi_sysfs.c| 2 ++ include/linux/backing-dev-defs.h | 1 + include/linux/backing-dev.h | 1 + mm/backing-dev.c | 28 ++-- 4 files changed, 30 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c index f5ace2b..8d64518 100644 --- a/drivers/scsi/scsi_sysfs.c +++ b/drivers/scsi/scsi_sysfs.c @@ -12,6 +12,7 @@ #include #include #include +#include #include #include @@ -1110,6 +,7 @@ void __scsi_remove_device(struct scsi_device *sdev) device_unregister(&sdev->sdev_dev); transport_remove_device(dev); scsi_dh_remove_device(sdev); + bdi_sysfs_del(&sdev->request_queue->backing_dev_info); device_del(dev); } else put_device(&sdev->sdev_dev); diff --git a/include/linux/backing-dev-defs.h b/include/linux/backing-dev-defs.h index 1b4d69f..1a42ecb 100644 --- a/include/linux/backing-dev-defs.h +++ b/include/linux/backing-dev-defs.h @@ -135,6 +135,7 @@ struct bdi_writeback { struct backing_dev_info { struct list_head bdi_list; + bool is_visible; unsigned long ra_pages; /* max readahead in PAGE_CACHE_SIZE units */ unsigned int capabilities; /* Device capabilities */ congested_fn *congested_fn; /* Function pointer if device is md/dm */ diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h index c82794f..9004d90 100644 --- a/include/linux/backing-dev.h +++ b/include/linux/backing-dev.h @@ -24,6 +24,7 @@ __printf(3, 4) int bdi_register(struct backing_dev_info *bdi, struct device *parent, const char *fmt, ...); int bdi_register_dev(struct backing_dev_info *bdi, dev_t dev); +void bdi_sysfs_del(struct backing_dev_info *bdi); void bdi_unregister(struct backing_dev_info *bdi); int __must_check bdi_setup_and_register(struct backing_dev_info *, char *); diff --git a/mm/backing-dev.c b/mm/backing-dev.c index 8ed2ffd..b56971f 100644 --- a/mm/backing-dev.c +++ b/mm/backing-dev.c @@ -774,6 +774,7 @@ int bdi_init(struct backing_dev_info *bdi) int ret; bdi->dev = NULL; + bdi->is_visible = false; bdi->min_ratio = 0; bdi->max_ratio = 100; @@ -806,6 +807,7 @@ int bdi_register(struct backing_dev_info *bdi, struct device *parent, return PTR_ERR(dev); bdi->dev = dev; + bdi->is_visible = true; bdi_debug_register(bdi, dev_name(dev)); set_bit(WB_registered, &bdi->wb.state); @@ -837,6 +839,28 @@ static void bdi_remove_from_list(struct backing_dev_info *bdi) synchronize_rcu_expedited(); } +/** + * bdi_sysfs_del - remove a BDI device from sysfs + * @bdi: BDI device pointer. + * + * It is safe to call this function more than once. + */ +void bdi_sysfs_del(struct backing_dev_info *bdi) +{ + bool is_visible = false; + + spin_lock_bh(&bdi_lock); + swap(bdi->is_visible, is_visible); + spin_unlock_bh(&bdi_lock); + + if (!is_visible) + return; + + bdi_debug_unregister(bdi); + device_del(bdi->dev); +} +EXPORT_SYMBOL(bdi_sysfs_del); + void bdi_unregister(struct backing_dev_info *bdi) { /* make sure nobody finds us on the bdi_list anymore */ @@ -845,8 +869,8 @@ void bdi_unregister(struct backing_dev_info *bdi) cgwb_bdi_destroy(bdi); if (bdi->dev) { - bdi_debug_unregister(bdi); - device_unregister(bdi->dev); + bdi_sysfs_del(bdi); + put_device(bdi->dev); bdi->dev = NULL; } } -- 2.1.4 -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] Revert "scsi: Fix a bdi reregistration race"
The SCSI sd driver probes SCSI devices asynchronously. The sd_remove() function, called indirectly by device_del(), waits until asynchronous probing has finished. Since the block layer queue must only be cleaned up after probing has finished, device_del() has to be called before blk_cleanup_queue(). Hence revert commit bf2cf3baa20b. Signed-off-by: Bart Van Assche Cc: Hannes Reinecke Cc: Aaro Koskinen Cc: --- drivers/scsi/scsi_sysfs.c | 14 +++--- 1 file changed, 3 insertions(+), 11 deletions(-) diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c index 8d23122..f5ace2b 100644 --- a/drivers/scsi/scsi_sysfs.c +++ b/drivers/scsi/scsi_sysfs.c @@ -1110,7 +1110,9 @@ void __scsi_remove_device(struct scsi_device *sdev) device_unregister(&sdev->sdev_dev); transport_remove_device(dev); scsi_dh_remove_device(sdev); - } + device_del(dev); + } else + put_device(&sdev->sdev_dev); /* * Stop accepting new requests and wait until all queuecommand() and @@ -1121,16 +1123,6 @@ void __scsi_remove_device(struct scsi_device *sdev) blk_cleanup_queue(sdev->request_queue); cancel_work_sync(&sdev->requeue_work); - /* -* Remove the device after blk_cleanup_queue() has been called such -* a possible bdi_register() call with the same name occurs after -* blk_cleanup_queue() has called bdi_destroy(). -*/ - if (sdev->is_visible) - device_del(dev); - else - put_device(&sdev->sdev_dev); - if (sdev->host->hostt->slave_destroy) sdev->host->hostt->slave_destroy(sdev); transport_destroy_device(dev); -- 2.1.4 -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/2] scsi_transport_fc: Introduce scsi_host_{get,put}()
Use scsi_host_{get,put}() instead of open-coding these functions. Compile-tested only. Signed-off-by: Bart Van Assche Cc: Christoph Hellwig Cc: Hannes Reinecke Cc: James Smart Cc: stable --- drivers/scsi/scsi_transport_fc.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c index 24eaaf6..8a88226 100644 --- a/drivers/scsi/scsi_transport_fc.c +++ b/drivers/scsi/scsi_transport_fc.c @@ -2586,7 +2586,7 @@ fc_rport_final_delete(struct work_struct *work) transport_remove_device(dev); device_del(dev); transport_destroy_device(dev); - put_device(&shost->shost_gendev); /* for fc_host->rport list */ + scsi_host_put(shost); /* for fc_host->rport list */ put_device(dev);/* for self-reference */ } @@ -2650,7 +2650,7 @@ fc_rport_create(struct Scsi_Host *shost, int channel, else rport->scsi_target_id = -1; list_add_tail(&rport->peers, &fc_host->rports); - get_device(&shost->shost_gendev); /* for fc_host->rport list */ + scsi_host_get(shost); /* for fc_host->rport list */ spin_unlock_irqrestore(shost->host_lock, flags); @@ -2685,7 +2685,7 @@ delete_rport: transport_destroy_device(dev); spin_lock_irqsave(shost->host_lock, flags); list_del(&rport->peers); - put_device(&shost->shost_gendev); /* for fc_host->rport list */ + scsi_host_put(shost); /* for fc_host->rport list */ spin_unlock_irqrestore(shost->host_lock, flags); put_device(dev->parent); kfree(rport); @@ -3383,7 +3383,7 @@ fc_vport_setup(struct Scsi_Host *shost, int channel, struct device *pdev, fc_host->npiv_vports_inuse++; vport->number = fc_host->next_vport_number++; list_add_tail(&vport->peers, &fc_host->vports); - get_device(&shost->shost_gendev); /* for fc_host->vport list */ + scsi_host_get(shost); /* for fc_host->vport list */ spin_unlock_irqrestore(shost->host_lock, flags); @@ -3441,7 +3441,7 @@ delete_vport: transport_destroy_device(dev); spin_lock_irqsave(shost->host_lock, flags); list_del(&vport->peers); - put_device(&shost->shost_gendev); /* for fc_host->vport list */ + scsi_host_put(shost); /* for fc_host->vport list */ fc_host->npiv_vports_inuse--; spin_unlock_irqrestore(shost->host_lock, flags); put_device(dev->parent); @@ -3504,7 +3504,7 @@ fc_vport_terminate(struct fc_vport *vport) vport->flags |= FC_VPORT_DELETED; list_del(&vport->peers); fc_host->npiv_vports_inuse--; - put_device(&shost->shost_gendev); /* for fc_host->vport list */ + scsi_host_put(shost); /* for fc_host->vport list */ } spin_unlock_irqrestore(shost->host_lock, flags); -- 2.1.4 -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/2] Fix a memory leak in scsi_host_dev_release()
Fix a memory leak that occurs if a SCSI LLD calls scsi_host_alloc() and scsi_host_put() but neither scsi_host_add() nor scsi_host_remove(). This leak is fixed by ensuring that put_device(&shost->shost_dev) is always called. This patch also removes the get_device() call from scsi_add_host*() and the put_device() call from scsi_remove_host() since these calls are no longer needed. The following shell command triggers the scenario described above: for ((i=0; i<2; i++)); do srp_daemon -oac | while read line; do echo $line >/sys/class/infiniband_srp/srp-mlx4_0-1/add_target done done The kmemleak report is as follows: unreferenced object 0x88021b24a220 (size 8): comm "srp_daemon", pid 56421, jiffies 4295006762 (age 4240.750s) hex dump (first 8 bytes): 68 6f 73 74 35 38 00 a5 host58.. backtrace: [] kmemleak_alloc+0x7a/0xc0 [] __kmalloc_track_caller+0xfe/0x160 [] kvasprintf+0x5b/0x90 [] kvasprintf_const+0x8d/0xb0 [] kobject_set_name_vargs+0x3c/0xa0 [] dev_set_name+0x3c/0x40 [] scsi_host_alloc+0x327/0x4b0 [] srp_create_target+0x4e/0x8a0 [ib_srp] [] dev_attr_store+0x1b/0x20 [] sysfs_kf_write+0x4a/0x60 [] kernfs_fop_write+0x14e/0x180 [] __vfs_write+0x2f/0xf0 [] vfs_write+0xa4/0x100 [] SyS_write+0x54/0xc0 [] entry_SYSCALL_64_fastpath+0x12/0x6f Signed-off-by: Bart Van Assche Cc: Christoph Hellwig Cc: Hannes Reinecke Cc: stable --- drivers/scsi/hosts.c | 57 +--- include/scsi/scsi_host.h | 5 + 2 files changed, 44 insertions(+), 18 deletions(-) diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c index 323982f..b9fa1f9 100644 --- a/drivers/scsi/hosts.c +++ b/drivers/scsi/hosts.c @@ -45,16 +45,6 @@ static atomic_t scsi_host_next_hn = ATOMIC_INIT(0);/* host_no for next new host */ -static void scsi_host_cls_release(struct device *dev) -{ - put_device(&class_to_shost(dev)->shost_gendev); -} - -static struct class shost_class = { - .name = "scsi_host", - .dev_release= scsi_host_cls_release, -}; - /** * scsi_host_set_state - Take the given host through the host state model. * @shost: scsi host to change the state of. @@ -180,7 +170,7 @@ void scsi_remove_host(struct Scsi_Host *shost) spin_unlock_irqrestore(shost->host_lock, flags); transport_unregister_device(&shost->shost_gendev); - device_unregister(&shost->shost_dev); + device_del(&shost->shost_dev); device_del(&shost->shost_gendev); } EXPORT_SYMBOL(scsi_remove_host); @@ -263,8 +253,6 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost, struct device *dev, if (error) goto out_del_gendev; - get_device(&shost->shost_gendev); - if (shost->transportt->host_size) { shost->shost_data = kzalloc(shost->transportt->host_size, GFP_KERNEL); @@ -311,10 +299,10 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost, struct device *dev, } EXPORT_SYMBOL(scsi_add_host_with_dma); -static void scsi_host_dev_release(struct device *dev) +static void scsi_host_free(struct kref *kref) { - struct Scsi_Host *shost = dev_to_shost(dev); - struct device *parent = dev->parent; + struct Scsi_Host *shost = container_of(kref, typeof(*shost), kref2); + struct device *parent = shost->shost_gendev.parent; struct request_queue *q; void *queuedata; @@ -349,6 +337,35 @@ static void scsi_host_dev_release(struct device *dev) kfree(shost); } +/* Called if shost_gendev refcnt drops to zero. */ +static void scsi_host_dev_release(struct device *dev) +{ + struct Scsi_Host *shost = dev_to_shost(dev); + + kref_put(&shost->kref2, scsi_host_free); +} + +/* Called if shost_dev refcnt drops to zero. */ +static void scsi_host_cls_release(struct device *dev) +{ + struct Scsi_Host *shost = class_to_shost(dev); + + kref_put(&shost->kref2, scsi_host_free); +} + +static struct class shost_class = { + .name = "scsi_host", + .dev_release= scsi_host_cls_release, +}; + +static void scsi_host_release(struct kref *kref) +{ + struct Scsi_Host *shost = container_of(kref, typeof(*shost), kref1); + + put_device(&shost->shost_gendev); + put_device(&shost->shost_dev); +} + static int shost_eh_deadline = -1; module_param_named(eh_deadline, shost_eh_deadline, int, S_IRUGO|S_IWUSR); @@ -467,6 +484,10 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize) shost->use_blk_mq = scsi_use_blk_mq && !shost->hostt->disable_blk_mq; + kref_init(&shost->kref1); + kref_init(&shost->kref2); + kref_get(&shost->kref2); + device_initialize(&shost->shost_gendev); dev_set_name(&shost->shost_gendev, "host%d", shost->host_no); shost->shost_gendev.bus = &scsi_bus_type; @@ -571,7 +592,7 @@ EXPORT_SYMBOL(scsi_host_lookup); struct Scsi_Host *scsi_hos
[PATCH 0/2] Fix a scsi_host_dev_release() memory leak
This patch series consists of two patches: - A first patch that converts the code that accesses shost->shost_gendev to manipulate the shost reference count into scsi_host_{get,put}() calls. - A second patch that fixes the actual memory leak. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: linux kernel security issuse scsi_report_lun_scan report
On Fri, Nov 20, 2015 at 12:57 PM, James Bottomley wrote: > > It's done under the scan mutex, so there can only be one thread in that > code at once. Hmm. Looking at the call chain seems to confirm that. But looking at the call chain I _also_ see that we have scsi_free_host_dev() there, which seems to be some stale frame data from a previous scan. I'm wondering if that is a clue. I find exactly two callers of that functions, both in the gdth driver. Maybe this is some odd refcount bug, brought on by reuse of a sdev. Would that make more sense? Why is that scsi_free_host_dev() used only by that one driver, and nobody else wants it or needs it? Linus -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: linux kernel security issuse scsi_report_lun_scan report
On Fri, 2015-11-20 at 12:33 -0800, Linus Torvalds wrote: > On Fri, Nov 20, 2015 at 10:26 AM, James Bottomley > wrote: > > > > We can look at it, but the analysis shouldn't be correct. > > Just take the five seconds to check the freeing path, please. The last > free is this: I did ... it can only be something freed the device while we were scanning it because it can't be a device we allocated (unless there's an unbalanced put somewhere). I can't see how it can happen, but moving the put to after the if should fix it. I'd just like confirmation it does in case this is actually caused by something else entirely. James > > INFO: Freed in scsi_device_dev_release_usercontext+0x23d/0x2d7 age=4 cpu=0 > > pid=1 > > free_debug_processing+0x188/0x207 mm/slub.c:1108 > > scsi_device_dev_release_usercontext+0x23d/0x2d7 > > drivers/scsi/scsi_sysfs.c:429 > > __slab_free+0x4a/0x27a mm/slub.c:2587 > > mempool_free_slab+0x0/0x13 mm/mempool.c:453 > > ida_remove+0xc6/0x159 lib/idr.c:1042 > > __raw_callee_save___pv_queued_spin_unlock+0x11/0x1e ??:? > > __read_once_size include/linux/compiler.h:218 > > atomic_read ./arch/x86/include/asm/atomic.h:27 > > __rcu_is_watching+0x18/0x1f kernel/rcu/tree.c:987 > > scsi_device_dev_release_usercontext+0x23d/0x2d7 > > drivers/scsi/scsi_sysfs.c:429 > > scsi_device_dev_release_usercontext+0x23d/0x2d7 > > drivers/scsi/scsi_sysfs.c:429 > > scsi_device_dev_release_usercontext+0x0/0x2d7 > > drivers/scsi/scsi_sysfs.c:438 > > execute_in_process_context+0x24/0x82 kernel/workqueue.c:2969 > > device_release+0x44/0xde drivers/base/core.c:247 > > kobject_cleanup lib/kobject.c:631 > > kobject_release lib/kobject.c:660 > > kref_sub include/linux/kref.h:74 > > kref_put include/linux/kref.h:99 > > kobject_put+0xbc/0xcf lib/kobject.c:677 > > scsi_report_lun_scan+0x27f/0x434 drivers/scsi/scsi_scan.c:1458 > > scsi_report_lun_scan+0x0/0x434 drivers/scsi/scsi_scan.c:1053 > > so clearly the device *was* freed by scsi_report_lun_scan() doing the > scsi_device_put(). > > > This device > > is the one we first used to issue the report lun scan. Either it's an > > existing device, or we created it specially for the purpose. If it's an > > existing one, that put just releases our reference, but the core still > > has one (there'd have to be a very unusual scan destroy race for the > > core to be releasing a reference to an object it was in process of > > scanning). > > A race it may be. Or maybe it's even a kasan bug. But so far, those > kasan reports have been pretty much on the money. > > It may be that a possible race is widened by kasan itself slowing things down. > > Linus > -- > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: linux kernel security issuse scsi_report_lun_scan report
On Fri, 2015-11-20 at 12:54 -0800, Linus Torvalds wrote: > On Fri, Nov 20, 2015 at 10:26 AM, James Bottomley > wrote: > > > > We can look at it, but the analysis shouldn't be correct. This device > > is the one we first used to issue the report lun scan. Either it's an > > existing device, or we created it specially for the purpose. If it's an > > existing one, that put just releases our reference, but the core still > > has one (there'd have to be a very unusual scan destroy race for the > > core to be releasing a reference to an object it was in process of > > scanning). > > Side note: that whole "if it's an existing one" looks fundamentally racy. > > What if two threads have that existing one, and both threads decide > "there's no device there", so they'll both decide to do that > __scsi_remove_device()? It's done under the scan mutex, so there can only be one thread in that code at once. James > In fact, one of the threads might have created the device, so it looks > like it's sufficient that just one thread has that > "scsi_device_lookup_by_target()" case.. > > I don't see any serialization around this. > > Now, I do agree that it's odd that this happens during early kernel > initcalls, but the scsi layer is one of the things that uses async > stuff, so if that do_scan_async() ever ends up using the same target > twice, that would explain it. Can that happen some way? > > Linus > -- > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: linux kernel security issuse scsi_report_lun_scan report
On Fri, Nov 20, 2015 at 10:26 AM, James Bottomley wrote: > > We can look at it, but the analysis shouldn't be correct. This device > is the one we first used to issue the report lun scan. Either it's an > existing device, or we created it specially for the purpose. If it's an > existing one, that put just releases our reference, but the core still > has one (there'd have to be a very unusual scan destroy race for the > core to be releasing a reference to an object it was in process of > scanning). Side note: that whole "if it's an existing one" looks fundamentally racy. What if two threads have that existing one, and both threads decide "there's no device there", so they'll both decide to do that __scsi_remove_device()? In fact, one of the threads might have created the device, so it looks like it's sufficient that just one thread has that "scsi_device_lookup_by_target()" case.. I don't see any serialization around this. Now, I do agree that it's odd that this happens during early kernel initcalls, but the scsi layer is one of the things that uses async stuff, so if that do_scan_async() ever ends up using the same target twice, that would explain it. Can that happen some way? Linus -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: linux kernel security issuse scsi_report_lun_scan report
On Fri, Nov 20, 2015 at 10:26 AM, James Bottomley wrote: > > We can look at it, but the analysis shouldn't be correct. Just take the five seconds to check the freeing path, please. The last free is this: > INFO: Freed in scsi_device_dev_release_usercontext+0x23d/0x2d7 age=4 cpu=0 > pid=1 > free_debug_processing+0x188/0x207 mm/slub.c:1108 > scsi_device_dev_release_usercontext+0x23d/0x2d7 > drivers/scsi/scsi_sysfs.c:429 > __slab_free+0x4a/0x27a mm/slub.c:2587 > mempool_free_slab+0x0/0x13 mm/mempool.c:453 > ida_remove+0xc6/0x159 lib/idr.c:1042 > __raw_callee_save___pv_queued_spin_unlock+0x11/0x1e ??:? > __read_once_size include/linux/compiler.h:218 > atomic_read ./arch/x86/include/asm/atomic.h:27 > __rcu_is_watching+0x18/0x1f kernel/rcu/tree.c:987 > scsi_device_dev_release_usercontext+0x23d/0x2d7 > drivers/scsi/scsi_sysfs.c:429 > scsi_device_dev_release_usercontext+0x23d/0x2d7 > drivers/scsi/scsi_sysfs.c:429 > scsi_device_dev_release_usercontext+0x0/0x2d7 drivers/scsi/scsi_sysfs.c:438 > execute_in_process_context+0x24/0x82 kernel/workqueue.c:2969 > device_release+0x44/0xde drivers/base/core.c:247 > kobject_cleanup lib/kobject.c:631 > kobject_release lib/kobject.c:660 > kref_sub include/linux/kref.h:74 > kref_put include/linux/kref.h:99 > kobject_put+0xbc/0xcf lib/kobject.c:677 > scsi_report_lun_scan+0x27f/0x434 drivers/scsi/scsi_scan.c:1458 > scsi_report_lun_scan+0x0/0x434 drivers/scsi/scsi_scan.c:1053 so clearly the device *was* freed by scsi_report_lun_scan() doing the scsi_device_put(). > This device > is the one we first used to issue the report lun scan. Either it's an > existing device, or we created it specially for the purpose. If it's an > existing one, that put just releases our reference, but the core still > has one (there'd have to be a very unusual scan destroy race for the > core to be releasing a reference to an object it was in process of > scanning). A race it may be. Or maybe it's even a kasan bug. But so far, those kasan reports have been pretty much on the money. It may be that a possible race is widened by kasan itself slowing things down. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 00/71] More fixes, cleanup and modernization for NCR5380 drivers
On Friday 20 November 2015 02:41:19 Finn Thain wrote: > > On Thu, 19 Nov 2015, Ondrej Zary wrote: > > > On Thursday 19 November 2015 03:24:56 Finn Thain wrote: > > > > > On Wed, 18 Nov 2015, Ondrej Zary wrote: > > > > > > > > > > > I have some NCR5380 ISA cards and can test them. > > > > > > Thanks Ondrej. I've no idea which ISA drivers are presently working in > > > mainline. Finding regressions may be more difficult than usual ;-) > > > > You're right... looks very broken: > > > > [ 62.577194] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x240, > > n_io_port 16, base 0x0, irq 0, can_queue 16, cmd_per_lun 2, > > sg_tablesize 128, this_id 7, flags { DTC3181E NO_PSEUDO_DMA }, USLEEP_POLL > > 3, USLEEP_WAITLONG 1250, options { AUTOPROBE_IRQ PSEUDO_DMA > > NCR53C400 } > > [ 62.796635] scsi 2:0:0:0: Direct-Access IBM 0663 e > > PQ: 0 ANSI: 2 > > [ 63.878494] sd 2:0:0:0: Attached scsi generic sg1 type 0 > > [ 95.848260] sd 2:0:0:0: aborting command > > > > And the system hangs completely. > > > > Yes. That was the usual failure mode. The old EH abort routine is fatal. > Up until I disabled PDMA by default for mac_scsi (in v3.19), that driver > would do the same thing. > > > It's much better with your patches, but still not great :) > > > > Pleased to hear it :) > > > [ 93.963264] pnp 01:01.00: [io 0x0240-0x025f] > > [ 93.963493] pnp 01:01.00: [irq 5] > > [ 93.965768] pnp 01:01.00: activated > > [ 93.977147] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x240, > > n_io_port 16, base 0x0, irq 0, can_queue 16, cmd_per_lun 2, > > sg_tablesize 128, this_id 7, flags { DTC3181E NO_PSEUDO_DMA }, options { > > AUTOPROBE_IRQ PSEUDO_DMA } > > [ 93.987527] scsi host2: rejecting message > > [ 93.987647] Synchronous Data Transfer Request period = 100 ns offset = 12 > > [ 94.001219] scsi 2:0:0:0: Direct-Access IBM 0663 e > > PQ: 0 ANSI: 2 > > [ 113.000794] sd 2:0:0:0: Attached scsi generic sg1 type 0 > > I'd be interested to know what commands were in play in that 19 second > interval. Might need to use scsi_logging_level to figure that out. > > My tests involved 3 different scsi targets (two disks and a CD-ROM) but > none of these send a SDTR. Your log says the driver correctly rejected the > SDTR message but that doesn't mean the target actually went to MSG IN > phase and got the message. Do you have any older targets you can test? Another disk, without patches: [ 84.481582] pnp 01:01.00: activated [ 84.489650] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x240, n_io_port 16, base 0x0, irq 0, can_queue 16, cmd_per_lun 2, sg_tablesize 128, this_id 7, flags { DTC3181E NO_PSEUDO_DMA }, USLEEP_POLL 3, USLEEP_WAITLONG 1250, options { AUTOPROBE_IRQ PSEUDO_DMA NCR53C400 } [ 84.953332] scsi 2:0:1:0: Direct-Access QUANTUM LP240S GM240S01X 4.6 PQ: 0 ANSI: 2 CCS [ 86.786475] sd 2:0:1:0: Attached scsi generic sg1 type 0 [ 86.793753] sd 2:0:1:0: [sdb] 479350 512-byte logical blocks: (245 MB/234 MiB) [ 86.998555] sd 2:0:1:0: [sdb] Write Protect is off [ 87.406068] sd 2:0:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [ 118.888271] sd 2:0:1:0: [sdb] aborting command [ 118.888738] sd 2:0:1:0: [sdb] aborting command With patches: [ 258.473748] pnp 01:01.00: activated [ 258.483592] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x240, n_io_port 16, base 0x0, irq 0, can_queue 16, cmd_per_lun 2, sg_tablesize 128, this_id 7, flags { DTC3181E NO_PSEUDO_DMA }, options { AUTOPROBE_IRQ PSEUDO_DMA } [ 261.347632] scsi 2:0:1:0: Direct-Access QUANTUM LP240S GM240S01X 4.6 PQ: 0 ANSI: 2 CCS [ 275.560451] sd 2:0:1:0: Attached scsi generic sg1 type 0 [ 275.632519] sd 2:0:1:0: [sdb] 479350 512-byte logical blocks: (245 MB/234 MiB) [ 275.635533] sd 2:0:1:0: [sdb] Write Protect is off [ 275.642315] sd 2:0:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [ 469.076347] sd 2:0:1:0: [sdb] FAILED Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_SENSE [ 469.076613] sd 2:0:1:0: [sdb] Sense Key : Aborted Command [current] [ 469.076851] sd 2:0:1:0: [sdb] Add. Sense: No additional sense information [ 469.077086] sd 2:0:1:0: [sdb] CDB: Read(10) 28 00 00 00 00 02 00 00 02 00 [ 469.077306] blk_update_request: I/O error, dev sdb, sector 2 [ 469.077522] Buffer I/O error on dev sdb, logical block 1, async page read [ 480.108255] INFO: task kworker/u2:2:60 blocked for more than 120 seconds. [ 480.109773] Not tainted 4.3.0-rc1+ #74 [ 480.109973] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 480.110179] kworker/u2:2D 0040 060 2 0x [ 480.110671] Workqueue: events_unbound async_run_entry_fn [ 480.110999] cf9e8780 0046 2eff25f7 0040 c117f111 2ee82733 0040 0016fec4 [ 480.112390] cfaa6000 7fff c139c7d2 c139c504 7fff c139d9d3
Re: linux kernel security issuse scsi_report_lun_scan report
On Fri, 2015-11-20 at 10:14 -0800, Linus Torvalds wrote: > [ I don't know if the original reporter ended up actually sending this > to the scsi list like Greg asked, so I'll forward it myself just in > case ] No, this is the first time we've seen this. > There seems to be a very old use-after-free in the scsi code (git > blame says the lines around this area are from 2005 and 2008) that > kasan reports. > > I've tried to clean up the formatting in the email a bit, but the > executive summary seems to be that this: > > drivers/scsi/scsi_scan.c, around line 1459: > > scsi_device_put(sdev); > if (scsi_device_created(sdev)) > > is just wrong, because the "scsi_device_put()" may be freeing the > sdev, so then doing "scsi_device_created(sdev)" after it is bogus. We can look at it, but the analysis shouldn't be correct. This device is the one we first used to issue the report lun scan. Either it's an existing device, or we created it specially for the purpose. If it's an existing one, that put just releases our reference, but the core still has one (there'd have to be a very unusual scan destroy race for the core to be releasing a reference to an object it was in process of scanning). If we create it, then it releases our reference taken in line 1331 but it's also not a final put because alloc has to be paired with remove. However, it should also be harmless to move the put to after the if() clause ... can we ask the reporter to check if that's the fix? James -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: linux kernel security issuse scsi_report_lun_scan report
[ I don't know if the original reporter ended up actually sending this to the scsi list like Greg asked, so I'll forward it myself just in case ] There seems to be a very old use-after-free in the scsi code (git blame says the lines around this area are from 2005 and 2008) that kasan reports. I've tried to clean up the formatting in the email a bit, but the executive summary seems to be that this: drivers/scsi/scsi_scan.c, around line 1459: scsi_device_put(sdev); if (scsi_device_created(sdev)) is just wrong, because the "scsi_device_put()" may be freeing the sdev, so then doing "scsi_device_created(sdev)" after it is bogus. Linus On Wed, Nov 18, 2015 at 5:15 AM, 程君(成淼) wrote: > > Dear all: > we find one security issuse in kernel 4.3.0,aslo check the > lastest code,please check , thanks. > > 1.user-after-free in scsi_report_lun_scan > > CPU: 0 PID: 1 Comm: swapper/0 Tainted: G B 4.3.0+ #2 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > rel-1.8.1-0-g4adadbd-20150316_085822-nilsson.home.kraxel.org 04/01/2014 > Call Trace: > scsi_device_created include/scsi/scsi_device.h:460 > scsi_report_lun_scan+0x28b/0x434 drivers/scsi/scsi_scan.c:1459 > device_release+0x44/0xde drivers/base/core.c:247 > scsi_device_created include/scsi/scsi_device.h:460 > scsi_report_lun_scan+0x28b/0x434 drivers/scsi/scsi_scan.c:1459 > scsi_probe_and_add_lun+0xe4f/0xe4f drivers/scsi/scsi_scan.c:1053 > scsi_free_host_dev+0x4d/0x4d drivers/scsi/scsi_scan.c:1921 > __raw_callee_save___pv_queued_spin_unlock+0x11/0x1e ??:? > __scsi_scan_target+0x16f/0x293 drivers/scsi/scsi_scan.c:1563 > scsi_add_device+0x20/0x20 drivers/scsi/scsi_scan.c:1513 > __pm_runtime_idle+0x5c/0x5c drivers/base/power/runtime.c:904 > __raw_callee_save___pv_queued_spin_unlock+0x11/0x1e ??:? > scsi_scan_channel+0x81/0x8f drivers/scsi/scsi_scan.c:1641 > scsi_scan_host_selected+0x144/0x161 drivers/scsi/scsi_scan.c:1669 > scsi_scan_host+0xa5/0x21d drivers/scsi/scsi_scan.c:1837 > virtscsi_probe+0x4c8/0x50b drivers/scsi/virtio_scsi.c:1032 > virtscsi_init+0x392/0x392 drivers/scsi/virtio_scsi.c:908 > kernfs_type include/linux/kernfs.h:239 > kernfs_leftmost_descendant+0x12/0x36 fs/kernfs/dir.c:970 > kernfs_activate+0x30/0xea fs/kernfs/dir.c:1036 > mutex_clear_owner kernel/locking/mutex.h:27 > mutex_unlock+0x12/0x2a kernel/locking/mutex.c:435 > kernfs_add_one+0x20d/0x21f fs/kernfs/dir.c:647 > __virtio_clear_bit include/linux/virtio_config.h:135 > vring_transport_features+0x3f/0x55 drivers/virtio/virtio_ring.c:786 > vp_finalize_features+0x49/0x4e drivers/virtio/virtio_pci_legacy.c:44 > virtio_dev_probe+0x163/0x2d0 drivers/virtio/virtio.c:235 > really_probe drivers/base/dd.c:316 > driver_probe_device+0x25d/0x378 drivers/base/dd.c:429 > really_probe drivers/base/dd.c:368 > driver_probe_device+0x378/0x378 drivers/base/dd.c:429 > __driver_attach+0x6d/0xae drivers/base/dd.c:642 > bus_for_each_dev+0x106/0x10a drivers/base/bus.c:314 > next_device+0x24/0x24 drivers/base/bus.c:280 > __raw_callee_save___pv_queued_spin_unlock+0x11/0x1e ??:? > bus_add_driver+0x269/0x2bc drivers/base/bus.c:708 > initcall_blacklist+0xbe/0xbe init/main.c:726 > driver_register+0x103/0x144 drivers/base/driver.c:168 > scsi_init_sysctl+0x1d/0x1d drivers/scsi/scsi_sysctl.c:46 > init+0xaf/0xb9 net/ipv4/netfilter/nf_nat_h323.c:590 > scsi_init_sysctl+0x1d/0x1d drivers/scsi/scsi_sysctl.c:46 > do_one_initcall+0x9d/0x1f4 init/main.c:794 > try_to_run_init_process+0x2f/0x2f init/main.c:928 > parse_args+0x4b/0x3ab kernel/params.c:234 > Memory state around the buggy address: > 88006984d000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > 88006984d080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >>88006984d100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > ^ > 88006984d180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > 88006984d200: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc > > INFO: Allocated in scsi_alloc_sdev+0x7c/0x56e age=4 cpu=0 pid=1 > set_track+0x6d/0x108 mm/slub.c:528 > alloc_debug_processing+0xaf/0x142 mm/slub.c:1049 > __slab_alloc+0x3e0/0x4b8 mm/slub.c:2402 > kmalloc include/linux/slab.h:445 > kzalloc include/linux/slab.h:593 > scsi_alloc_sdev+0x7c/0x56e drivers/scsi/scsi_scan.c:218 > init_object+0x2d/0x5e mm/slub.c:681 > __raw_callee_save___pv_queued_spin_unlock+0x11/0x1e ??:? > scsi_probe_and_add_lun+0xe3d/0xe4f drivers/scsi/scsi_scan.c:1178 > slab_alloc_node mm/slub.c:2470 > slab_alloc mm/slub.c:2512 > __kmalloc+0x84/0x169 mm/slub.c:3417 > slab_alloc_node mm/slub.c:2470 > slab_alloc mm/slub.c:2512 > __kmalloc+0x84/0x169 mm/slub.c:3417 > kmalloc include/linux/slab.h:445 > kzalloc include/linux/slab.h:593 > scsi_alloc_sdev+0x7c/0x56e drivers/scsi/scsi_scan.c:218 > scsi_report_lun_scan+0x17f/0x434 drivers/scsi/scsi_scan.c:1328 > scsi_report_lun_scan+0x0/0x434 drivers/scsi/scsi_scan.c:1
Re: [PATCH] Fix a memory leak in scsi_host_dev_release()
On 11/20/2015 03:52 AM, Christoph Hellwig wrote: the memory leak looks real, and your fix looks corret, but I still don't like it. I think it's reasonable for SCSI to assume that the final put_device fully frees the struct device including the name pointer that is assigned entirely behind the back of the caller. So I think the fix for this probably should be in the driver core. Hello Christoph, Thanks for the feedback. However, I'm not sure this can be fixed by modifying the driver core. If scsi_host_remove() is not called the SCSI core doesn't call put_device(&shost->shost_dev). I will post a second version of this patch that ensures that the SCSI core always calls put_device(&shost->shost_dev). Bart. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/9] IB: add a proper completion queue abstraction
On 11/20/2015 02:16 AM, Christoph Hellwig wrote: > On Wed, Nov 18, 2015 at 10:20:14AM -0800, Bart Van Assche wrote: >> Are you perhaps referring to the sysfs CPU mask that allows to control >> workqueue affinity ? > > I think he is referring to the defintion of WQ_UNBOUND: > >WQ_UNBOUND > > Work items queued to an unbound wq are served by the special > woker-pools which host workers which are not bound to any > specific CPU. This makes the wq behave as a simple execution > context provider without concurrency management. The unbound > worker-pools try to start execution of work items as soon as > possible. Unbound wq sacrifices locality but is useful for > the following cases. > > * Wide fluctuation in the concurrency level requirement is > expected and using bound wq may end up creating large number > of mostly unused workers across different CPUs as the issuer > hops through different CPUs. > > * Long running CPU intensive workloads which can be better > managed by the system scheduler. Hello Christoph, The comment about locality in the above quote is interesting. How about modifying patch 2/9 as indicated below ? The modification below does not change the behavior of this patch if ib_cq.w.cpu is not modified. And it allows users who care about locality and who want to skip the scheduler overhead by setting ib_cq.w.cpu to the index of the CPU they want the work to be processed on. Thanks, Bart. --- drivers/infiniband/core/cq.c | 11 ++- include/rdma/ib_verbs.h | 5 - 2 files changed, 10 insertions(+), 6 deletions(-) diff --git a/drivers/infiniband/core/cq.c b/drivers/infiniband/core/cq.c index bf2a079..4d80d8c 100644 --- a/drivers/infiniband/core/cq.c +++ b/drivers/infiniband/core/cq.c @@ -94,18 +94,18 @@ static void ib_cq_completion_softirq(struct ib_cq *cq, void *private) static void ib_cq_poll_work(struct work_struct *work) { - struct ib_cq *cq = container_of(work, struct ib_cq, work); + struct ib_cq *cq = container_of(work, struct ib_cq, w.work); int completed; completed = __ib_process_cq(cq, IB_POLL_BUDGET_WORKQUEUE); if (completed >= IB_POLL_BUDGET_WORKQUEUE || ib_req_notify_cq(cq, IB_POLL_FLAGS) > 0) - queue_work(ib_comp_wq, &cq->work); + queue_work_on(cq->w.cpu, ib_comp_wq, &cq->w.work); } static void ib_cq_completion_workqueue(struct ib_cq *cq, void *private) { - queue_work(ib_comp_wq, &cq->work); + queue_work_on(cq->w.cpu, ib_comp_wq, &cq->w.work); } /** @@ -159,7 +159,8 @@ struct ib_cq *ib_alloc_cq(struct ib_device *dev, void *private, break; case IB_POLL_WORKQUEUE: cq->comp_handler = ib_cq_completion_workqueue; - INIT_WORK(&cq->work, ib_cq_poll_work); + INIT_WORK(&cq->w.work, ib_cq_poll_work); + cq->w.cpu = WORK_CPU_UNBOUND; ib_req_notify_cq(cq, IB_CQ_NEXT_COMP); break; default: @@ -195,7 +196,7 @@ void ib_free_cq(struct ib_cq *cq) irq_poll_disable(&cq->iop); break; case IB_POLL_WORKQUEUE: - flush_work(&cq->work); + flush_work(&cq->w.work); break; default: WARN_ON_ONCE(1); diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h index f59a8d3..b1344f8 100644 --- a/include/rdma/ib_verbs.h +++ b/include/rdma/ib_verbs.h @@ -1291,7 +1291,10 @@ struct ib_cq { struct ib_wc*wc; union { struct irq_poll iop; - struct work_struct work; + struct { + struct work_struct work; + int cpu; + } w; }; }; -- 2.1.4 -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] scsi: hpsa: select CONFIG_SCSI_SAS_ATTR
> "Arnd" == Arnd Bergmann writes: Arnd> The hpsa driver recently started using the sas transport class, Arnd> but it does not ensure that the corresponding code is actually Arnd> built, which may lead to a link error: Applied. -- Martin K. Petersen Oracle Linux Engineering -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] [RESEND 2] SCSI: initio: remove duplicate module device table
> "Arnd" == Arnd Bergmann writes: Arnd> The initio driver has for many years had two copies of the same Arnd> module device table. One of them is also used for registering the Arnd> other driver, the other one is entirely useless after the large Arnd> scale cleanup that Alan Cox did back in 2007. Applied. -- Martin K. Petersen Oracle Linux Engineering -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3] scsi: advansys needs ISA dma api for ISA support
> "Arnd" == Arnd Bergmann writes: Arnd> The advansys drvier uses the request_dma function that is used on Arnd> ISA machines for the internal DMA controller, which causes build Arnd> errors on platforms that have ISA slots but do not provide the ISA Arnd> DMA API: Applied. -- Martin K. Petersen Oracle Linux Engineering -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] scsi: use sector_div instead of do_div
do_div is the wrong way to divide a sector_t, as it is less efficient when sector_t is 32-bit wide. With the upcoming do_div optimizations, the kernel starts warning about this: drivers/scsi/scsi_debug.c: In function 'dif_store': include/asm-generic/div64.h:207:28: warning: comparison of distinct pointer types lacks a cast This changes the code to use sector_div instead, which always produces optimal code. Signed-off-by: Arnd Bergmann --- Found on the ARM randconfig build today, after I merged Nico's patches for linux-next diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c index dfcc45bb03b1..ec622ba9864a 100644 --- a/drivers/scsi/scsi_debug.c +++ b/drivers/scsi/scsi_debug.c @@ -678,7 +678,7 @@ static void *fake_store(unsigned long long lba) static struct sd_dif_tuple *dif_store(sector_t sector) { - sector = do_div(sector, sdebug_store_sectors); + sector = sector_div(sector, sdebug_store_sectors); return dif_storep + sector; } @@ -2780,7 +2780,7 @@ static unsigned long lba_to_map_index(sector_t lba) lba += scsi_debug_unmap_granularity - scsi_debug_unmap_alignment; } - do_div(lba, scsi_debug_unmap_granularity); + sector_div(lba, scsi_debug_unmap_granularity); return lba; } -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] scsi: hpsa: select CONFIG_SCSI_SAS_ATTR
On Fri, 2015-11-20 at 10:09 -0600, Don Brace wrote: > Signed-off-by: Don Brace You mean acked-by ... it can only be signed-off-by if *you* resend the patch because signoffs track the patch transmission path. James > And...Thanks! > > > On 11/19/2015 04:04 PM, Arnd Bergmann wrote: > > The hpsa driver recently started using the sas transport class, but it > > does not ensure that the corresponding code is actually built, which > > may lead to a link error: > > > > drivers/built-in.o: In function `hpsa_free_sas_phy': > > (.text+0x1ce874): undefined reference to `sas_port_delete_phy' > > (.text+0x1ce87c): undefined reference to `sas_phy_free' > > drivers/built-in.o: In function `hpsa_alloc_sas_port': > > (.text+0x1ceb9c): undefined reference to `sas_port_alloc_num' > > (.text+0x1ceba8): undefined reference to `sas_port_add' > > drivers/built-in.o: In function `hpsa_init': > > (.init.text+0x8838): undefined reference to `sas_attach_transport' > > (.init.text+0x8868): undefined reference to `sas_release_transport > > > > This adds 'select SCSI_SAS_ATTR' in the Kconfig entry, just like we do > > for all other drivers using those functions. > > > > Signed-off-by: Arnd Bergmann > > Fixes: d04e62b9d63a ("hpsa: add in sas transport class") > > --- > > This showed up on the ARM randconfig builds today for the first time. > > > > diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig > > index 04a720686516..64eed87d34a8 100644 > > --- a/drivers/scsi/Kconfig > > +++ b/drivers/scsi/Kconfig > > @@ -364,6 +364,7 @@ config SCSI_HPSA > > tristate "HP Smart Array SCSI driver" > > depends on PCI && SCSI > > select CHECK_SIGNATURE > > + select SCSI_SAS_ATTRS > > help > > This driver supports HP Smart Array Controllers (circa 2009). > > It is a SCSI alternative to the cciss driver, which is a block > > > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in > > the body of a message to majord...@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > -- > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] scsi: hpsa: select CONFIG_SCSI_SAS_ATTR
Signed-off-by: Don Brace And...Thanks! On 11/19/2015 04:04 PM, Arnd Bergmann wrote: The hpsa driver recently started using the sas transport class, but it does not ensure that the corresponding code is actually built, which may lead to a link error: drivers/built-in.o: In function `hpsa_free_sas_phy': (.text+0x1ce874): undefined reference to `sas_port_delete_phy' (.text+0x1ce87c): undefined reference to `sas_phy_free' drivers/built-in.o: In function `hpsa_alloc_sas_port': (.text+0x1ceb9c): undefined reference to `sas_port_alloc_num' (.text+0x1ceba8): undefined reference to `sas_port_add' drivers/built-in.o: In function `hpsa_init': (.init.text+0x8838): undefined reference to `sas_attach_transport' (.init.text+0x8868): undefined reference to `sas_release_transport This adds 'select SCSI_SAS_ATTR' in the Kconfig entry, just like we do for all other drivers using those functions. Signed-off-by: Arnd Bergmann Fixes: d04e62b9d63a ("hpsa: add in sas transport class") --- This showed up on the ARM randconfig builds today for the first time. diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig index 04a720686516..64eed87d34a8 100644 --- a/drivers/scsi/Kconfig +++ b/drivers/scsi/Kconfig @@ -364,6 +364,7 @@ config SCSI_HPSA tristate "HP Smart Array SCSI driver" depends on PCI && SCSI select CHECK_SIGNATURE + select SCSI_SAS_ATTRS help This driver supports HP Smart Array Controllers (circa 2009). It is a SCSI alternative to the cciss driver, which is a block -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
On Fri, 2015-11-20 at 15:55 +0100, Hannes Reinecke wrote: > Can't we have a joint effort here? > I've been spending a _LOT_ of time trying to debug things here, but > none of the ideas I've come up with have been able to fix anything. Yes. I'm not the one primarily looking at it, and we don't have a reproducer in-house. We just have the one dump right now. > > I'm almost tempted to increase the count from scsi_alloc_sgtable() > by one and be done with ... > That might not fix it if it is a problem with the merge code, though. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
On 11/20/2015 03:38 PM, Ewan Milne wrote: > On Thu, 2015-11-19 at 16:35 +0100, Hannes Reinecke wrote: >> On 11/19/2015 09:23 AM, Christoph Hellwig wrote: >>> It's pretty much guaranteed a block layer bug, most likely in the >>> merge bios to request infrastucture where we don't obey the merging >>> limits properly. >>> >>> Does either of you have a known good and first known bad kernel? >> >> Well, I have been fighting a similar issue for several months now, >> albeit with multipath enabled. Haven't had much progress with this, >> sadly. >> Seeing that this is our distro kernel it might or might not be >> related; however, as the symptoms are identical there still is a >> chance that this is actually a generic block-layer problem. >> >> Cheers, >> >> Hannes > > We have seen this also. (e.g. req->nr_phys_segments was 3, but > blk_rq_map_sg() returned 4.) I was suspicious of the patch: > > bio: modify __bio_add_page() to accept pages that don't start a new segment > > But we put some debugging code in and didn't hit it. We haven't > found the problem yet, either, though. We're still looking. > Can't we have a joint effort here? I've been spending a _LOT_ of time trying to debug things here, but none of the ideas I've come up with have been able to fix anything. I'm almost tempted to increase the count from scsi_alloc_sgtable() by one and be done with ... > As Christoph said, it would seem to be a problem with the block layer > merging. > > The API for this seems defective, in that blk_rq_map_sg() should > never be returning a value indicating that it overwrote past the > end of the supplied SG array and depend on the caller to check it. > (We could get data corruption on another I/O if it used adjacent > memory for a different SG list, for example.) > Yeah, the API is bloody useless. By the time you hit the BUG_ON you've already caused a memory corruption, so no way you can recover there. At the very least we should be passing in the sg list count into blk_map_rq_sg(), but that's a core blocklayer API and changes here would require changes by quite a set of drivers. Plus it wouldn't help me for a distribution kernel ... Cheers, Hannes -- Dr. Hannes ReineckezSeries & Storage h...@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg) -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
On Thu, 2015-11-19 at 16:35 +0100, Hannes Reinecke wrote: > On 11/19/2015 09:23 AM, Christoph Hellwig wrote: > > It's pretty much guaranteed a block layer bug, most likely in the > > merge bios to request infrastucture where we don't obey the merging > > limits properly. > > > > Does either of you have a known good and first known bad kernel? > > Well, I have been fighting a similar issue for several months now, > albeit with multipath enabled. Haven't had much progress with this, > sadly. > Seeing that this is our distro kernel it might or might not be > related; however, as the symptoms are identical there still is a > chance that this is actually a generic block-layer problem. > > Cheers, > > Hannes We have seen this also. (e.g. req->nr_phys_segments was 3, but blk_rq_map_sg() returned 4.) I was suspicious of the patch: bio: modify __bio_add_page() to accept pages that don't start a new segment But we put some debugging code in and didn't hit it. We haven't found the problem yet, either, though. We're still looking. As Christoph said, it would seem to be a problem with the block layer merging. The API for this seems defective, in that blk_rq_map_sg() should never be returning a value indicating that it overwrote past the end of the supplied SG array and depend on the caller to check it. (We could get data corruption on another I/O if it used adjacent memory for a different SG list, for example.) -Ewan -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
On Fri, 2015-11-20 at 13:56 +0100, Laurent Dufour wrote: > On 20/11/2015 13:10, Michael Ellerman wrote: > > On Thu, 2015-11-19 at 00:23 -0800, Christoph Hellwig wrote: > > > > > It's pretty much guaranteed a block layer bug, most likely in the > > > merge bios to request infrastucture where we don't obey the merging > > > limits properly. > > > > > > Does either of you have a known good and first known bad kernel? > > > > Not me, I've only hit it one or two times. All I can say is I have hit it in > > 4.4-rc1. > > > > Laurent, can you narrow it down at all? > > I'm trying bisect it but it's very time consuming, and I think I missed > some bad kernel because I didn't let enough time to the system to panic. > > But I'm still on... > I usually hit it at boot time. Maybe one out of five times. I'll take a stab at bisecting today... -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
On 20/11/2015 13:10, Michael Ellerman wrote: > On Thu, 2015-11-19 at 00:23 -0800, Christoph Hellwig wrote: > >> It's pretty much guaranteed a block layer bug, most likely in the >> merge bios to request infrastucture where we don't obey the merging >> limits properly. >> >> Does either of you have a known good and first known bad kernel? > > Not me, I've only hit it one or two times. All I can say is I have hit it in > 4.4-rc1. > > Laurent, can you narrow it down at all? I'm trying bisect it but it's very time consuming, and I think I missed some bad kernel because I didn't let enough time to the system to panic. But I'm still on... Cheers -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 00/71] More fixes, cleanup and modernization for NCR5380 drivers
On Friday 20 November 2015, Geert Uytterhoeven wrote: > On Fri, Nov 20, 2015 at 12:40 PM, Ondrej Zary > wrote: > > Working ISA means more testing possibilities. It's much easier to get an > > ISA card than a Sun or Atari. Also faster CPU (such as 1 GHz P3) means > > quicker testing. > > Faster PCs without ISA slots? ;-) Faster but not too fast, you have to be careful :) There are many boards for Pentium 3 or Ahlon XP CPUs with ISA slots. -- Ondrej Zary -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[RFC PATCH] lpfc: Add lockdep assertions
Several functions in lpfc have comments stating that the function must be called with the hbalock (or hostlock, or ringlock) held. Add lockdep_assert_held() annotations to these functions, so one can actually verify the locks are held. Signed-off-by: Johannes Thumshirn --- I'm not sure if this is actually helpfull upstream but it helped me to debug a downstream bug. drivers/scsi/lpfc/lpfc_hbadisc.c | 5 + drivers/scsi/lpfc/lpfc_sli.c | 43 2 files changed, 48 insertions(+) diff --git a/drivers/scsi/lpfc/lpfc_hbadisc.c b/drivers/scsi/lpfc/lpfc_hbadisc.c index bfc2442..96de2ab 100644 --- a/drivers/scsi/lpfc/lpfc_hbadisc.c +++ b/drivers/scsi/lpfc/lpfc_hbadisc.c @@ -25,6 +25,7 @@ #include #include #include +#include #include #include @@ -1314,6 +1315,8 @@ __lpfc_update_fcf_record_pri(struct lpfc_hba *phba, uint16_t fcf_index, { struct lpfc_fcf_pri *fcf_pri; + lockdep_assert_held(&phba->hbalock); + fcf_pri = &phba->fcf.fcf_pri[fcf_index]; fcf_pri->fcf_rec.fcf_index = fcf_index; /* FCF record priority */ @@ -1398,6 +1401,8 @@ __lpfc_update_fcf_record(struct lpfc_hba *phba, struct lpfc_fcf_rec *fcf_rec, struct fcf_record *new_fcf_record, uint32_t addr_mode, uint16_t vlan_id, uint32_t flag) { + lockdep_assert_held(&phba->hbalock); + /* Copy the fields from the HBA's FCF record */ lpfc_copy_fcf_record(fcf_rec, new_fcf_record); /* Update other fields of driver FCF record */ diff --git a/drivers/scsi/lpfc/lpfc_sli.c b/drivers/scsi/lpfc/lpfc_sli.c index f9585cd..6d84c77 100644 --- a/drivers/scsi/lpfc/lpfc_sli.c +++ b/drivers/scsi/lpfc/lpfc_sli.c @@ -24,6 +24,7 @@ #include #include #include +#include #include #include @@ -576,6 +577,8 @@ __lpfc_sli_get_iocbq(struct lpfc_hba *phba) struct list_head *lpfc_iocb_list = &phba->lpfc_iocb_list; struct lpfc_iocbq * iocbq = NULL; + lockdep_assert_held(&phba->hbalock); + list_remove_head(lpfc_iocb_list, iocbq, struct lpfc_iocbq, list); if (iocbq) phba->iocb_cnt++; @@ -797,6 +800,7 @@ int lpfc_test_rrq_active(struct lpfc_hba *phba, struct lpfc_nodelist *ndlp, uint16_t xritag) { + lockdep_assert_held(&phba->hbalock); if (!ndlp) return 0; if (!ndlp->active_rrqs_xri_bitmap) @@ -914,6 +918,8 @@ __lpfc_sli_get_sglq(struct lpfc_hba *phba, struct lpfc_iocbq *piocbq) struct lpfc_nodelist *ndlp; int found = 0; + lockdep_assert_held(&phba->hbalock); + if (piocbq->iocb_flag & LPFC_IO_FCP) { lpfc_cmd = (struct lpfc_scsi_buf *) piocbq->context1; ndlp = lpfc_cmd->rdata->pnode; @@ -1003,6 +1009,8 @@ __lpfc_sli_release_iocbq_s4(struct lpfc_hba *phba, struct lpfc_iocbq *iocbq) unsigned long iflag = 0; struct lpfc_sli_ring *pring = &phba->sli.ring[LPFC_ELS_RING]; + lockdep_assert_held(&phba->hbalock); + if (iocbq->sli4_xritag == NO_XRI) sglq = NULL; else @@ -1058,6 +1066,7 @@ __lpfc_sli_release_iocbq_s3(struct lpfc_hba *phba, struct lpfc_iocbq *iocbq) { size_t start_clean = offsetof(struct lpfc_iocbq, iocb); + lockdep_assert_held(&phba->hbalock); /* * Clean all volatile data fields, preserve iotag and node struct. @@ -1080,6 +1089,8 @@ __lpfc_sli_release_iocbq_s3(struct lpfc_hba *phba, struct lpfc_iocbq *iocbq) static void __lpfc_sli_release_iocbq(struct lpfc_hba *phba, struct lpfc_iocbq *iocbq) { + lockdep_assert_held(&phba->hbalock); + phba->__lpfc_sli_release_iocbq(phba, iocbq); phba->iocb_cnt--; } @@ -1310,6 +1321,8 @@ static int lpfc_sli_ringtxcmpl_put(struct lpfc_hba *phba, struct lpfc_sli_ring *pring, struct lpfc_iocbq *piocb) { + lockdep_assert_held(&phba->hbalock); + list_add_tail(&piocb->list, &pring->txcmplq); piocb->iocb_flag |= LPFC_IO_ON_TXCMPLQ; @@ -1344,6 +1357,8 @@ lpfc_sli_ringtx_get(struct lpfc_hba *phba, struct lpfc_sli_ring *pring) { struct lpfc_iocbq *cmd_iocb; + lockdep_assert_held(&phba->hbalock); + list_remove_head((&pring->txq), cmd_iocb, struct lpfc_iocbq, list); return cmd_iocb; } @@ -1367,6 +1382,9 @@ lpfc_sli_next_iocb_slot (struct lpfc_hba *phba, struct lpfc_sli_ring *pring) { struct lpfc_pgp *pgp = &phba->port_gp[pring->ringno]; uint32_t max_cmd_idx = pring->sli.sli3.numCiocb; + + lockdep_assert_held(&phba->hbalock); + if ((pring->sli.sli3.next_cmdidx == pring->sli.sli3.cmdidx) && (++pring->sli.sli3.next_cmdidx >= max_cmd_idx)) pring->sli.sli3.next_cmdidx = 0; @@ -1497,6 +1515,7 @@ static void lpfc_sli_submit_iocb(struct lpfc_hba *phba, struct lpfc_sli_ring *pring, IOCB_t *iocb, struct lpfc_iocbq *nextiocb
Re: [PATCH 00/71] More fixes, cleanup and modernization for NCR5380 drivers
On Fri, Nov 20, 2015 at 12:40 PM, Ondrej Zary wrote: > Working ISA means more testing possibilities. It's much easier to get an ISA > card than a Sun or Atari. Also faster CPU (such as 1 GHz P3) means quicker > testing. Faster PCs without ISA slots? ;-) Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
On Thu, 2015-11-19 at 00:23 -0800, Christoph Hellwig wrote: > It's pretty much guaranteed a block layer bug, most likely in the > merge bios to request infrastucture where we don't obey the merging > limits properly. > > Does either of you have a known good and first known bad kernel? Not me, I've only hit it one or two times. All I can say is I have hit it in 4.4-rc1. Laurent, can you narrow it down at all? cheers -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Fix a memory leak in scsi_host_dev_release()
Hi Bart, the memory leak looks real, and your fix looks corret, but I still don't like it. I think it's reasonable for SCSI to assume that the final put_device fully frees the struct device including the name pointer that is assigned entirely behind the back of the caller. So I think the fix for this probably should be in the driver core. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 00/71] More fixes, cleanup and modernization for NCR5380 drivers
On Fri, Nov 20, 2015 at 12:40:03PM +0100, Ondrej Zary wrote: > > > I'd love to be able to get rid of the ISA drivers to be honest. > > > > Is that because of their use of scsi_module.c or their general decrepitude > > or something else? > > scsi_module.c usage shouldn't be hard to fix. I can do that after finding a > working setup. It's the general state of them. > > > > Given that they appear to be gravely broken before your cleanups this > > > might be an opportunity to get rid of them. > > > > At this stage, that's unclear (to me). It could be that g_NCR5380.c is not > > broken. It could be that the core driver can't handle certain targets. I > > think we need to do more testing. > > Maybe I was just unlucky and tested a drive that never worked with this > driver. > > Working ISA means more testing possibilities. It's much easier to get an ISA > card than a Sun or Atari. Also faster CPU (such as 1 GHz P3) means quicker > testing. Well, if you volunteer to bring the NCR5380 ISA drivers up to date and maintain them it's obvuously fine to keep them around. I'm more worried about all the unmaintained ISA drivers in horrible shape. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 00/71] More fixes, cleanup and modernization for NCR5380 drivers
On Friday 20 November 2015, Finn Thain wrote: > > On Fri, 20 Nov 2015, Christoph Hellwig wrote: > > > On Fri, Nov 20, 2015 at 07:19:21PM +1100, Finn Thain wrote: > > > > > Yes. I didn't do that conversion because I don't have ISA hardware and > > > I don't understand ISA probing. > > > > > > The present patch set doesn't seek to resurrect the ISA drivers. But I > > > am trying to avoid regressions. > > > > > > I have mixed feelings about the ISA drivers. ISA DMA support > > > complicates things (it was never completed) and DMA seems to be the > > > main obstacle to merging the two core driver forks. > > > > I'd love to be able to get rid of the ISA drivers to be honest. > > Is that because of their use of scsi_module.c or their general decrepitude > or something else? scsi_module.c usage shouldn't be hard to fix. I can do that after finding a working setup. > > Given that they appear to be gravely broken before your cleanups this > > might be an opportunity to get rid of them. > > At this stage, that's unclear (to me). It could be that g_NCR5380.c is not > broken. It could be that the core driver can't handle certain targets. I > think we need to do more testing. Maybe I was just unlucky and tested a drive that never worked with this driver. Working ISA means more testing possibilities. It's much easier to get an ISA card than a Sun or Atari. Also faster CPU (such as 1 GHz P3) means quicker testing. -- Ondrej Zary -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 00/18] ALUA device handler update, part 1
On 11/20/2015 11:54 AM, Christoph Hellwig wrote: > On Fri, Nov 20, 2015 at 11:52:21AM +0100, Hannes Reinecke wrote: >> One thing, though: I don't really agree with Barts objection that >> moving to a workqueue would tie in too many resources. >> Thing is, I'm not convinces that using a work queue is allocating >> too many resources (we're speaking of 460 vs 240 bytes here). >> Also we have to retry commands for quite some time (cite the >> infamous NetApp takeover/giveback, which can take minutes). >> If we were to handle that without workqueue we'd have to initiate >> the retry from the end_io callback, causing a quite deep stack >> recursion. Which I'm not really fond of. >> >> But if anyone has a better idea on how to handle retries without the >> need for workqueues I'm all ears :-) > > I tend to agree with you, but you better warm up that discussion again > on the old thread that actually has Bart on Cc. Or just resend once > Martin has merged this patch, and have discussion around that version. > Fully agree. So I'm waiting for Martin to pick up the patches first. Cheers, Hannes -- Dr. Hannes ReineckezSeries & Storage h...@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg) -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] scsi_debug: fix prevent_allow+verify regressions
Hi, what happened to this patch? It's still not merged into the kernel. That's a pity. cu, Rudi On Monday 06 April 2015, Douglas Gilbert wrote: > Ruediger Meier observed a regression with the PREVENT ALLOW > MEDIUM REMOVAL command in lk 3.19: >http://www.spinics.net/lists/util-linux-ng/msg11448.html > > Inspection indicated the same regression with VERIFY(10). > > The patch is against lk 3.19.3 and also works with lk 4.0.0-rc6. > With this patch both commands are accepted and do nothing. > VERIFY(10) could be implemented since it shares logic with > COMPARE AND WRITE which is implemented; however that would > be an enhancement. > > ChangeLog: >- fix the lk 3.19 regression so that the PREVENT ALLOW > MEDIUM REMOVAL command is supported once again >- same fix for VERIFY(10) > > Signed-off-by: Douglas Gilbert -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 00/18] ALUA device handler update, part 1
On Fri, Nov 20, 2015 at 11:52:21AM +0100, Hannes Reinecke wrote: > One thing, though: I don't really agree with Barts objection that > moving to a workqueue would tie in too many resources. > Thing is, I'm not convinces that using a work queue is allocating > too many resources (we're speaking of 460 vs 240 bytes here). > Also we have to retry commands for quite some time (cite the > infamous NetApp takeover/giveback, which can take minutes). > If we were to handle that without workqueue we'd have to initiate > the retry from the end_io callback, causing a quite deep stack > recursion. Which I'm not really fond of. > > But if anyone has a better idea on how to handle retries without the > need for workqueues I'm all ears :-) I tend to agree with you, but you better warm up that discussion again on the old thread that actually has Bart on Cc. Or just resend once Martin has merged this patch, and have discussion around that version. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 00/71] More fixes, cleanup and modernization for NCR5380 drivers
On Fri, 20 Nov 2015, Christoph Hellwig wrote: > On Fri, Nov 20, 2015 at 07:19:21PM +1100, Finn Thain wrote: > > > Yes. I didn't do that conversion because I don't have ISA hardware and > > I don't understand ISA probing. > > > > The present patch set doesn't seek to resurrect the ISA drivers. But I > > am trying to avoid regressions. > > > > I have mixed feelings about the ISA drivers. ISA DMA support > > complicates things (it was never completed) and DMA seems to be the > > main obstacle to merging the two core driver forks. > > I'd love to be able to get rid of the ISA drivers to be honest. Is that because of their use of scsi_module.c or their general decrepitude or something else? > Given that they appear to be gravely broken before your cleanups this > might be an opportunity to get rid of them. At this stage, that's unclear (to me). It could be that g_NCR5380.c is not broken. It could be that the core driver can't handle certain targets. I think we need to do more testing. -- -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 00/18] ALUA device handler update, part 1
On 11/20/2015 11:47 AM, Christoph Hellwig wrote: > On Mon, Nov 09, 2015 at 04:08:05PM +0100, Hannes Reinecke wrote: >> Hi all, >> >> here's the first part of my ALUA device handler update. >> It's mainly bugfixes and minor improvements; the two important >> things are the addition of VPD parsing functions scsi_vpd_lun_id() >> and scsi_vpd_tpg_id(). >> This series has been split off from the original 'Asynchronous ALUA' >> patchset, as these bits are pretty uncontroversial and have a good >> chance of being merged reasonably soon. > > I've already reviewed most patches, but the other ones looks fine as > well: > > Reviewed-by: Christoph Hellwig > > Let's get this into the 4.5 queue ASAP an then tackle the actually > interesting bits next! > I'm all for it. One thing, though: I don't really agree with Barts objection that moving to a workqueue would tie in too many resources. Thing is, I'm not convinces that using a work queue is allocating too many resources (we're speaking of 460 vs 240 bytes here). Also we have to retry commands for quite some time (cite the infamous NetApp takeover/giveback, which can take minutes). If we were to handle that without workqueue we'd have to initiate the retry from the end_io callback, causing a quite deep stack recursion. Which I'm not really fond of. But if anyone has a better idea on how to handle retries without the need for workqueues I'm all ears :-) Cheers, Hannes -- Dr. Hannes ReineckezSeries & Storage h...@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg) -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 00/18] ALUA device handler update, part 1
On Mon, Nov 09, 2015 at 04:08:05PM +0100, Hannes Reinecke wrote: > Hi all, > > here's the first part of my ALUA device handler update. > It's mainly bugfixes and minor improvements; the two important > things are the addition of VPD parsing functions scsi_vpd_lun_id() > and scsi_vpd_tpg_id(). > This series has been split off from the original 'Asynchronous ALUA' > patchset, as these bits are pretty uncontroversial and have a good > chance of being merged reasonably soon. I've already reviewed most patches, but the other ones looks fine as well: Reviewed-by: Christoph Hellwig Let's get this into the 4.5 queue ASAP an then tackle the actually interesting bits next! -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] scsi: hpsa: select CONFIG_SCSI_SAS_ATTR
Looks good, Reviewed-by: Christoph Hellwig -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3] scsi: advansys needs ISA dma api for ISA support
Loosk good, Reviewed-by: Christoph Hellwig -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 4/9] srpt: chain RDMA READ/WRITE requests
On Wed, Nov 18, 2015 at 08:32:59AM -0800, Bart Van Assche wrote: > As you know events like a cable pull can cause some of the RDMA work > requests to succeed and others to fail. It is essential that all RDMA work > requests related to the same SCSI command have finished before the buffers > these requests operate upon are reused. The purpose of the SRPT_RDMA_ABORT > request is to wait for the RDMA requests that were posted without > IB_SEND_SIGNALED and for which no error completion will be received. BTW, I > think this consideration applies to all SCSI target drivers and not only to > SRP target drivers. I think everyone understand the theroetical issue, but we'd like to see a practical case that the implementation in isert and my proposed srpt one don't handle. Given that chained WRs must not be reordered the HCA must also give us the completions in the order we submitted them. Because of that the previous WRs must have been completed by the time we get the notification for the last one which usually does the cleanup. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/9] IB: add a proper completion queue abstraction
On Wed, Nov 18, 2015 at 10:20:14AM -0800, Bart Van Assche wrote: > Are you perhaps referring to the sysfs CPU mask that allows to control > workqueue affinity ? I think he is referring to the defintion of WQ_UNBOUND: WQ_UNBOUND Work items queued to an unbound wq are served by the special woker-pools which host workers which are not bound to any specific CPU. This makes the wq behave as a simple execution context provider without concurrency management. The unbound worker-pools try to start execution of work items as soon as possible. Unbound wq sacrifices locality but is useful for the following cases. * Wide fluctuation in the concurrency level requirement is expected and using bound wq may end up creating large number of mostly unused workers across different CPUs as the issuer hops through different CPUs. * Long running CPU intensive workloads which can be better managed by the system scheduler. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/srp: Fix a memory leak
Looks good, Reviewed-by: Christoph Hellwig -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] [RESEND 2] SCSI: initio: remove duplicate module device table
Looks fine, Reviewed-by: Christoph Hellwig -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 00/71] More fixes, cleanup and modernization for NCR5380 drivers
On Fri, Nov 20, 2015 at 07:19:21PM +1100, Finn Thain wrote: > Yes. I didn't do that conversion because I don't have ISA hardware and I > don't understand ISA probing. > > The present patch set doesn't seek to resurrect the ISA drivers. But I am > trying to avoid regressions. > > I have mixed feelings about the ISA drivers. ISA DMA support complicates > things (it was never completed) and DMA seems to be the main obstacle to > merging the two core driver forks. I'd love to be able to get rid of the ISA drivers to be honest. Given that they appear to be gravely broken before your cleanups this might be an opportunity to get rid of them. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 00/71] More fixes, cleanup and modernization for NCR5380 drivers
On Friday 20 November 2015, Finn Thain wrote: > > On Thu, 19 Nov 2015, Christoph Hellwig wrote: > > > On Fri, Nov 20, 2015 at 06:21:06PM +1100, Finn Thain wrote: > > > > > > Not sure what module was being probed here. I presume it was > > > > g_NCR5380 or g_NCR5380_mmio. Neither of these calls > > > > 'scsi_scan_host'. I'm not sure what the implications are (?) > > > > > > Nevermind. The call is in scsi_module.c. > > > > Which, btw really need to go away. If you want to resurrect the > > ISA drivers they need to be converted to proper probing. > > Yes. I didn't do that conversion because I don't have ISA hardware and I > don't understand ISA probing. > > The present patch set doesn't seek to resurrect the ISA drivers. But I am > trying to avoid regressions. > > I have mixed feelings about the ISA drivers. ISA DMA support complicates > things (it was never completed) and DMA seems to be the main obstacle to > merging the two core driver forks. IIRC, my ISA cards can't do DMA either. -- Ondrej Zary -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 00/71] More fixes, cleanup and modernization for NCR5380 drivers
On Thu, 19 Nov 2015, Christoph Hellwig wrote: > On Fri, Nov 20, 2015 at 06:21:06PM +1100, Finn Thain wrote: > > > > Not sure what module was being probed here. I presume it was > > > g_NCR5380 or g_NCR5380_mmio. Neither of these calls > > > 'scsi_scan_host'. I'm not sure what the implications are (?) > > > > Nevermind. The call is in scsi_module.c. > > Which, btw really need to go away. If you want to resurrect the > ISA drivers they need to be converted to proper probing. Yes. I didn't do that conversion because I don't have ISA hardware and I don't understand ISA probing. The present patch set doesn't seek to resurrect the ISA drivers. But I am trying to avoid regressions. I have mixed feelings about the ISA drivers. ISA DMA support complicates things (it was never completed) and DMA seems to be the main obstacle to merging the two core driver forks. -- -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html