Since the fw_event deletes itself from the list, cleanup_queue() can
walk onto garbage pointers or walk off into freed memory.
This refactors the code in _scsih_fw_event_cleanup_queue() to not
iterate over the fw_event_list without a lock.
Signed-off-by: Calvin Owens calvinow...@fb.com
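A rough userspace sketch of the idea described above, not code from the patch: instead of walking the list while other threads may delete from it, detach one entry at a time under the lock and handle it after the lock is dropped. All names here (struct event, struct event_queue, dequeue_next_event(), cleanup_queue()) are hypothetical, and a pthread mutex stands in for whatever lock the driver actually uses.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a queued firmware event. */
struct event {
    struct event *next;
    int id;
};

/* Hypothetical queue: a singly linked list guarded by one lock. */
struct event_queue {
    pthread_mutex_t lock;
    struct event *head;
};

/* Add an event at the head of the queue. Returns 0 on success, -1 on OOM. */
static int enqueue_event(struct event_queue *q, int id)
{
    struct event *ev = malloc(sizeof(*ev));

    if (!ev)
        return -1;
    ev->id = id;

    pthread_mutex_lock(&q->lock);
    ev->next = q->head;
    q->head = ev;
    pthread_mutex_unlock(&q->lock);
    return 0;
}

/* Detach and return the first event, or NULL if the queue is empty.
 * The list is only ever touched while the lock is held. */
static struct event *dequeue_next_event(struct event_queue *q)
{
    struct event *ev;

    assert(q != NULL);

    pthread_mutex_lock(&q->lock);
    ev = q->head;
    if (ev)
        q->head = ev->next;
    pthread_mutex_unlock(&q->lock);

    if (ev)
        ev->next = NULL;
    return ev;
}

/* Drain the queue without iterating over it unlocked.
 * Returns the number of events released. */
static int cleanup_queue(struct event_queue *q)
{
    struct event *ev;
    int released = 0;

    while ((ev = dequeue_next_event(q)) != NULL) {
        free(ev);
        released++;
    }
    return released;
}
```

Because each entry is off the list before it is processed, a concurrent deleter can no longer pull an entry out from under the loop.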
This patch refactors the code in the driver to use the new reference
count on the sas_device struct.
Signed-off-by: Calvin Owens calvinow...@fb.com
---
drivers/scsi/mpt2sas/mpt2sas_base.h | 4 +-
drivers/scsi/mpt2sas/mpt2sas_scsih.c | 329 ---
drivers/scsi
These objects can be referenced concurrently throughout the driver, so we
need a way to make sure threads can't delete them out from under each
other.
Signed-off-by: Calvin Owens calvinow...@fb.com
---
drivers/scsi/mpt2sas/mpt2sas_base.h | 16
1 file changed, 16 insertions(+)
diff
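For illustration only, a minimal userspace sketch of the reference-counting pattern this commit message describes. The names (struct device_obj, device_obj_get(), device_obj_put()) are made up for the example and C11 atomics are an assumption; in the kernel this role is usually filled by struct kref with kref_get() and kref_put().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical object shared between several threads. */
struct device_obj {
    atomic_int refcount;
    int handle;
};

/* Allocate an object; the caller owns the initial reference. */
static struct device_obj *device_obj_alloc(int handle)
{
    struct device_obj *d = malloc(sizeof(*d));

    if (!d)
        return NULL;
    atomic_init(&d->refcount, 1);
    d->handle = handle;
    return d;
}

/* Take an additional reference on an object the caller already holds. */
static void device_obj_get(struct device_obj *d)
{
    atomic_fetch_add(&d->refcount, 1);
}

/* Drop a reference; the last put frees the object.
 * Returns 1 if this call freed it, 0 otherwise. */
static int device_obj_put(struct device_obj *d)
{
    int old = atomic_fetch_sub(&d->refcount, 1);

    assert(old > 0);
    if (old == 1) {
        free(d);
        return 1;
    }
    return 0;
}
```

Each thread that keeps a pointer takes its own reference, so the memory stays valid until the last holder calls put, regardless of who removes the object from a list first.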
This refactors the fw_event code to use the new refcount.
Signed-off-by: Calvin Owens calvinow...@fb.com
---
drivers/scsi/mpt2sas/mpt2sas_scsih.c | 20 +---
1 file changed, 13 insertions(+), 7 deletions(-)
diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
b/drivers/scsi/mpt2sas
Hello all,
This patchset attempts to address problems we've been having with
panics due to memory corruption from the mpt2sas driver.
I will provide a similar set of fixes for mpt3sas, since we see
similar issues there as well. Porting this to mpt3sas will be
trivial since the part of the driver
The fw_event_work struct is concurrently referenced at shutdown, so
add a refcount to protect it.
Signed-off-by: Calvin Owens calvinow...@fb.com
---
drivers/scsi/mpt2sas/mpt2sas_scsih.c | 28
1 file changed, 28 insertions(+)
diff --git a/drivers/scsi/mpt2sas
We cannot iterate over the list without holding a lock for the entire
duration, or we risk corrupting random memory if items are added or
deleted as we iterate.
This refactors code such that it always holds the lock when iterating
on or accessing the sas_device_list.
Signed-off-by: Calvin Owens
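A hedged sketch of what "always holds the lock when iterating" can look like, combined with taking a reference before the lock is released. This is a userspace approximation with invented names (struct dev, struct dev_list, dev_get_by_addr()), not the driver's code; the pthread mutex and C11 atomics are assumptions made to keep the example self-contained.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical list entry, identified by an address. */
struct dev {
    struct dev *next;
    uint64_t address;
    atomic_int refcount;
};

/* Hypothetical device list guarded by a single lock. */
struct dev_list {
    pthread_mutex_t lock;
    struct dev *head;
};

/* Caller must hold list->lock. On a match, the entry is returned
 * with an extra reference already taken. */
static struct dev *dev_get_by_addr_nolock(struct dev_list *list,
                                          uint64_t address)
{
    struct dev *d;

    assert(list != NULL);
    for (d = list->head; d; d = d->next) {
        if (d->address == address) {
            atomic_fetch_add(&d->refcount, 1);
            return d;
        }
    }
    return NULL;
}

/* Locking wrapper: the whole walk happens under the lock, and the
 * reference is taken before the lock is dropped. */
static struct dev *dev_get_by_addr(struct dev_list *list, uint64_t address)
{
    struct dev *d;

    pthread_mutex_lock(&list->lock);
    d = dev_get_by_addr_nolock(list, address);
    pthread_mutex_unlock(&list->lock);
    return d;
}
```

Taking the reference inside the locked region is the important part: a pointer returned after the lock is dropped would otherwise be free to disappear before the caller uses it.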
() concurrently deletes items from the list.
Cc: Christoph Hellwig h...@lst.de
Signed-off-by: Calvin Owens calvinow...@fb.com
---
Changes in v3:
* Add a break condition to the REMOVE_UNRESPONDING_DEVICES fw_event,
which can loop over a sleep forever (5m+ at least) at unloading. I
Hello all,
This patchset attempts to address problems we've been having with
panics due to memory corruption from the mpt2sas driver.
Changes are noted in the individual patches; I realized putting them in the
cover was probably a bit confusing.
Thanks,
Calvin
Patches in this series:
[PATCH
-off-by: Calvin Owens calvinow...@fb.com
---
Changes in v3:
* Drop the sas_device_lock while enabling devices, and leave the
sas_device object on the list, since it may need to be looked up
there while it is being enabled.
* Drop put() in _scsih_add_device
-off-by: Calvin Owens calvinow...@fb.com
---
Changes in v4:
* Fix lack of put() in non-SATA case in _scsih_change_queue_depth()
* Fix lack of put() in the non-error case in _scsih_check_device()
* Add missing put() at bottom of _scsih_add_device()
* Add put
Hello all,
This patchset attempts to address problems we've been having with
panics due to memory corruption from the mpt2sas driver.
Thanks,
Calvin
[PATCH v4 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list
[PATCH v4 2/2] mpt2sas: Refcount fw_events and fix unsafe list usage
On Monday 08/10 at 18:45 +0530, Sreekanth Reddy wrote:
On Sat, Aug 1, 2015 at 10:32 AM, Calvin Owens calvinow...@fb.com wrote:
Sreekanth,
Thanks for the review, responses below. I'll have a v4 out shortly.
Calvin
These objects can be referenced concurrently throughout the driver, so we
need
() concurrently deletes items from the list.
Cc: Christoph Hellwig h...@lst.de
Signed-off-by: Calvin Owens calvinow...@fb.com
---
Changes in v4: None
Changes in v3:
* Add a break condition to the REMOVE_UNRESPONDING_DEVICES fw_event,
which can loop over a sleep forever (5m+ at least
On Thursday 07/16 at 20:27 +0530, Sreekanth Reddy wrote:
On Sun, Jul 12, 2015 at 9:54 AM, Calvin Owens calvinow...@fb.com wrote:
These objects can be referenced concurrently throughout the driver, so we
need a way to make sure threads can't delete them out from under each
other. This patch
On Monday 07/13 at 11:05 -0400, Joe Lawrence wrote:
On 07/12/2015 12:24 AM, Calvin Owens wrote:
These objects can be referenced concurrently throughout the driver, so we
need a way to make sure threads can't delete them out from under each
other. This patch adds the refcount, and refactors
On Sunday 07/12 at 23:52 -0700, Christoph Hellwig wrote:
On Sat, Jul 11, 2015 at 09:24:55PM -0700, Calvin Owens wrote:
These objects can be referenced concurrently throughout the driver, so we
need a way to make sure threads can't delete them out from under each
other. This patch adds
if it isn't
embedded in the object itself.
KASAN was extremely helpful in finding the root cause of this bug.
Signed-off-by: Calvin Owens <calvinow...@fb.com>
---
drivers/scsi/sg.c | 8 +++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/scsi/sg.c b/drivers/scsi/
Thanks for this, I'm sending a v2 shortly.
On Friday 07/03 at 09:00 -0700, Christoph Hellwig wrote:
On Mon, Jun 08, 2015 at 08:50:55PM -0700, Calvin Owens wrote:
This refactors the fw_event code to use the new refcount.
I spent some time looking over this code because it's so convoluted
On Friday 07/03 at 08:38 -0700, Christoph Hellwig wrote:
+struct _sas_device *
+mpt2sas_scsih_sas_device_get_by_sas_address_nolock(struct MPT2SAS_ADAPTER
*ioc,
+u64 sas_address)
Any chance to use a shorter name for this function? E.g.
__mpt2sas_get_sdev_by_addr ?
Will do.
On Friday 07/03 at 09:02 -0700, Christoph Hellwig wrote:
On Mon, Jun 08, 2015 at 08:50:56PM -0700, Calvin Owens wrote:
Since the fw_event deletes itself from the list, cleanup_queue() can
walk onto garbage pointers or walk off into freed memory.
This refactors the code
() concurrently deletes items from the list.
Cc: Christoph Hellwig h...@infradead.org
Cc: Bart Van Assche bart.vanass...@sandisk.com
Signed-off-by: Calvin Owens calvinow...@fb.com
---
drivers/scsi/mpt2sas/mpt2sas_scsih.c | 101 ---
1 file changed, 81 insertions(+), 20 deletions
Hello all,
This patchset attempts to address problems we've been having with
panics due to memory corruption from the mpt2sas driver.
Thanks,
Calvin
Patches in this series:
[PATCH 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage
[PATCH 2/2] mpt2sas: Refcount fw_events and fix
, or we risk corrupting random memory if items are
added or deleted as we iterate. This patch refactors _scsih_probe_sas()
to use the sas_device_list in a safe way.
Cc: Christoph Hellwig h...@infradead.org
Cc: Bart Van Assche bart.vanass...@sandisk.com
Signed-off-by: Calvin Owens calvinow...@fb.com
On Wednesday 08/26 at 04:09 +, Nicholas A. Bellinger wrote:
From: Nicholas Bellinger n...@linux-iscsi.org
Hi James & Co,
This series is a mpt3sas forward port of Calvin Owens' in-flight
reference counting bugfixes for mpt2sas LLD code here:
[PATCH v4 0/2] Fixes for memory corruption
_scsih_fw_event_cleanup_queue() such that it
no longer iterates over the list without holding the lock, since
_firmware_event_work() concurrently deletes items from the list.
This patch is a port of Calvin's PATCH-v4 for mpt2sas code.
Cc: Calvin Owens calvinow...@fb.com
Cc: Christoph Hellwig h...@infradead.org
Cc
On 05/13/2016 01:28 PM, Calvin Owens wrote:
Currently we free the resources backing the enclosure device before we
call device_unregister(). This is racy: during rmmod of low-level SCSI
drivers that hook into enclosure, we end up with a small window of time
during which writing to /sys can OOPS
On Thursday 06/02 at 15:50 -0700, Calvin Owens wrote:
> On 05/13/2016 01:28 PM, Calvin Owens wrote:
> > Currently we free the resources backing the enclosure device before we
> > call device_unregister(). This is racy: during rmmod of low-level SCSI
> > drivers that hook into
pt3sas]
[] do_one_initcall+0x113/0x2b0
[] do_init_module+0x1d0/0x4d8
[] load_module+0x6729/0x8dc0
[] SYSC_init_module+0x183/0x1a0
[] SyS_init_module+0xe/0x10
[] entry_SYSCALL_64_fastpath+0x12/0x6a
Fix this by pulling the value at the beginning of the loop.
Signed-off-by:
On Friday 05/13 at 21:17 +, Elliott, Robert (Persistent Memory) wrote:
>
>
> > -----Original Message-----
> > From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel-
> > ow...@vger.kernel.org] On Behalf Of Calvin Owens
> > Sent: Friday, May 13, 2016 3:2
e_port() explicitly is
HPSA, and it does it in the opposite order mpt3sas does: scsi_remove_host()
first.
Thanks,
Calvin
> -----Original Message-----
> From: Calvin Owens [mailto:calvinow...@fb.com]
> Sent: Monday, May 16, 2016 2:25 PM
> To: Elliott, Robert (Persistent Memory)
> Cc: Sath
driver
core holds a reference over ->remove_dev(), so AFAICT this is safe.
Signed-off-by: Calvin Owens <calvinow...@fb.com>
---
drivers/scsi/ses.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/scsi/ses.c b/drivers/scsi/ses.c
index 53ef1cb..0e8601a 100644
---
On Wednesday 05/18 at 18:44 +0530, Sreekanth Reddy wrote:
> On Tue, May 17, 2016 at 8:43 AM, Calvin Owens <calvinow...@fb.com> wrote:
> > On Monday 05/16 at 15:51 -0600, Sathya Prakash Veerichetty wrote:
> >> Our understanding is the relationship between the SCSI host
This flag that conditionally acquires the mutex is confusing and prone
to bugginess: refactor it into two separate function calls, and make
the unlocked one complain if it's called outside the mutex.
Signed-off-by: Calvin Owens <calvinow...@fb.com>
---
drivers/scsi/mpt3sas/mpt3sas_base.h
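As a sketch of the refactoring described above (not the actual mpt3sas change): one function takes the mutex itself, and a separate "nolock" variant expects the caller to hold it and complains otherwise. The names struct adapter, send_request() and send_request_nolock() are invented for the example. The check uses pthread_mutex_trylock() on a default mutex, which only tells us that somebody holds the lock, roughly like the kernel's mutex_is_locked().

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

/* Hypothetical adapter state guarded by cmd_mutex. */
struct adapter {
    pthread_mutex_t cmd_mutex;
    int requests_sent;
    int warnings;
};

/* Returns 1 if the mutex is currently locked by some thread, else 0. */
static int adapter_mutex_is_locked(struct adapter *a)
{
    assert(a != NULL);
    if (pthread_mutex_trylock(&a->cmd_mutex) == 0) {
        /* We got it, so nobody held it: release and report that. */
        pthread_mutex_unlock(&a->cmd_mutex);
        return 0;
    }
    return 1;
}

/* Variant for callers that already hold cmd_mutex.
 * Warns (but carries on) if it is called without the mutex. */
static int send_request_nolock(struct adapter *a)
{
    if (!adapter_mutex_is_locked(a)) {
        fprintf(stderr, "send_request_nolock: cmd_mutex not held\n");
        a->warnings++;
    }
    a->requests_sent++;
    return 0;
}

/* Variant that takes and releases cmd_mutex itself. */
static int send_request(struct adapter *a)
{
    int rc;

    pthread_mutex_lock(&a->cmd_mutex);
    rc = send_request_nolock(a);
    pthread_mutex_unlock(&a->cmd_mutex);
    return rc;
}
```

Splitting the call this way makes the locking expectation visible at every call site instead of hiding it behind a boolean argument.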
with the potential error is non-trivial, so for now just WARN().
Signed-off-by: Calvin Owens <calvinow...@fb.com>
---
drivers/scsi/mpt3sas/mpt3sas_base.c | 18 +++-
drivers/scsi/mpt3sas/mpt3sas_config.c| 4 +-
drivers/scsi/mpt3sas/mpt3sas_ctl.c | 29 ++---
drivers/scsi/m
With the exception of a single call to wait_for_doorbell_int(), all
this conditional sleeping code is dead. So delete it.
Signed-off-by: Calvin Owens <calvinow...@fb.com>
---
drivers/scsi/mpt3sas/mpt3sas_base.c | 241 +--
drivers/scsi/mpt3sas/mpt3sas_
On 06/15/2016 01:24 PM, Calvin Owens wrote:
On Thursday 06/02 at 15:50 -0700, Calvin Owens wrote:
On 05/13/2016 01:28 PM, Calvin Owens wrote:
Currently we free the resources backing the enclosure device before we
call device_unregister(). This is racy: during rmmod of low-level SCSI
drivers
byte beyond our character array happens to be a NUL. Fix this
by explicitly writing '\0' to the end of the string to ensure we don't
run off the edge of the world in printk().
Signed-off-by: Calvin Owens <calvinow...@fb.com>
---
drivers/scsi/mpt3sas/mpt3sas_base.h | 2 +-
drivers/scsi/m
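The fix described above, writing the terminator explicitly rather than relying on whatever byte follows the array, can be illustrated with a small userspace sketch. FIELD_LEN and copy_field() are hypothetical and do not come from the patch; the assumption is a fixed-width source field that carries no terminator of its own.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical width of a fixed-size field with no terminator. */
#define FIELD_LEN 8

/* Copy an unterminated fixed-width field into dst, which must have
 * room for FIELD_LEN + 1 bytes, and terminate it explicitly so that
 * printing it as a string cannot run past the end of the buffer. */
static void copy_field(char dst[FIELD_LEN + 1], const char src[FIELD_LEN])
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, FIELD_LEN);
    dst[FIELD_LEN] = '\0';
}
```

The destination is sized one byte larger than the field so the terminator never overwrites data.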
On 07/17/2016 11:02 PM, Dave Chinner wrote:
On Sun, Jul 17, 2016 at 10:00:03AM +1000, Dave Chinner wrote:
On Fri, Jul 15, 2016 at 05:18:02PM -0700, Calvin Owens wrote:
Hello all,
I've found a nasty source of slab corruption. Based on seeing similar symptoms
on boxes at Facebook, I suspect
On 07/18/2016 07:05 PM, Calvin Owens wrote:
On 07/17/2016 11:02 PM, Dave Chinner wrote:
On Sun, Jul 17, 2016 at 10:00:03AM +1000, Dave Chinner wrote:
On Fri, Jul 15, 2016 at 05:18:02PM -0700, Calvin Owens wrote:
Hello all,
I've found a nasty source of slab corruption. Based on seeing similar
Hello all,
I've found a nasty source of slab corruption. Based on seeing similar symptoms
on boxes at Facebook, I suspect it's been around since at least 3.10.
It only reproduces under memory pressure so far as I can tell: the issue seems
to be that XFS reclaims pages from buffers that are