Re: [PATCHv2] ib_srp: Remove WARN_ON in srp_terminate_io()

2018-10-17 Thread Doug Ledford
On Wed, 2018-10-17 at 11:30 -0400, Doug Ledford wrote:
> On Wed, 2018-10-17 at 07:43 -0700, Bart Van Assche wrote:
> > On 10/17/18 12:20 AM, Hannes Reinecke wrote:
> > > The WARN_ON() is pointless as the rport is placed in 
> > > SDEV_TRANSPORT_OFFLINE
> > > at that time, so no new commands can be submitted via srp_queuecommand()
> > > 
> > > Signed-off-by: Hannes Reinecke 
> > > Reviewed-by: Jens Axboe 
> > > Reviewed-by: Johannes Thumshirn 
> > > ---
> > >   drivers/infiniband/ulp/srp/ib_srp.c | 7 ---
> > >   1 file changed, 7 deletions(-)
> > > 
> > > diff --git a/drivers/infiniband/ulp/srp/ib_srp.c 
> > > b/drivers/infiniband/ulp/srp/ib_srp.c
> > > index 0b34e909505f..5a79444c2f3c 100644
> > > --- a/drivers/infiniband/ulp/srp/ib_srp.c
> > > +++ b/drivers/infiniband/ulp/srp/ib_srp.c
> > > @@ -1334,13 +1334,6 @@ static void srp_terminate_io(struct srp_rport 
> > > *rport)
> > >   struct scsi_device *sdev;
> > >   int i, j;
> > >   
> > > - /*
> > > -  * Invoking srp_terminate_io() while srp_queuecommand() is running
> > > -  * is not safe. Hence the warning statement below.
> > > -  */
> > > - shost_for_each_device(sdev, shost)
> > > - WARN_ON_ONCE(sdev->request_queue->request_fn_active);
> > > -
> > >   for (i = 0; i < target->ch_count; i++) {
> > >   ch = >ch[i];
> > 
> > Although I had explained before why I think that warning is not 
> > pointless, I agree with this change because the legacy block layer is 
> > going away. Anyway:
> > 
> > Acked-by: Bart Van Assche 
> 
> Thanks, applied to for-next.
> 

FWIW, this introduced a build warning (shost and sdev are now unused
variables).  I edited the patch to remove the newly unneeded variables.

-- 
Doug Ledford 
GPG KeyID: B826A3330E572FDD
Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD


signature.asc
Description: This is a digitally signed message part


Re: [PATCHv2] ib_srp: Remove WARN_ON in srp_terminate_io()

2018-10-17 Thread Doug Ledford
On Wed, 2018-10-17 at 07:43 -0700, Bart Van Assche wrote:
> On 10/17/18 12:20 AM, Hannes Reinecke wrote:
> > The WARN_ON() is pointless as the rport is placed in SDEV_TRANSPORT_OFFLINE
> > at that time, so no new commands can be submitted via srp_queuecommand()
> > 
> > Signed-off-by: Hannes Reinecke 
> > Reviewed-by: Jens Axboe 
> > Reviewed-by: Johannes Thumshirn 
> > ---
> >   drivers/infiniband/ulp/srp/ib_srp.c | 7 ---
> >   1 file changed, 7 deletions(-)
> > 
> > diff --git a/drivers/infiniband/ulp/srp/ib_srp.c 
> > b/drivers/infiniband/ulp/srp/ib_srp.c
> > index 0b34e909505f..5a79444c2f3c 100644
> > --- a/drivers/infiniband/ulp/srp/ib_srp.c
> > +++ b/drivers/infiniband/ulp/srp/ib_srp.c
> > @@ -1334,13 +1334,6 @@ static void srp_terminate_io(struct srp_rport *rport)
> > struct scsi_device *sdev;
> > int i, j;
> >   
> > -   /*
> > -* Invoking srp_terminate_io() while srp_queuecommand() is running
> > -* is not safe. Hence the warning statement below.
> > -*/
> > -   shost_for_each_device(sdev, shost)
> > -   WARN_ON_ONCE(sdev->request_queue->request_fn_active);
> > -
> > for (i = 0; i < target->ch_count; i++) {
> > ch = >ch[i];
> 
> Although I had explained before why I think that warning is not 
> pointless, I agree with this change because the legacy block layer is 
> going away. Anyway:
> 
> Acked-by: Bart Van Assche 

Thanks, applied to for-next.

-- 
Doug Ledford 
GPG KeyID: B826A3330E572FDD
Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD


signature.asc
Description: This is a digitally signed message part


Re: [PATCH v2 12/15] RDMA/cma: make config_item_type const

2017-10-18 Thread Doug Ledford
On Mon, 2017-10-16 at 17:18 +0200, Bhumika Goyal wrote:
> Make these structures const as they are either passed to the
> functions
> having the argument as const or stored as a reference in the
> "ci_type"
> const field of a config_item structure.
> 
> Signed-off-by: Bhumika Goyal <bhumi...@gmail.com>

Acked-by: Doug Ledford <dledf...@redhat.com>

-- 
Doug Ledford <dledf...@redhat.com>
GPG KeyID: B826A3330E572FDD
Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD



Re: [PATCH v10 00/15] Replace PCI pool by DMA pool API

2017-07-09 Thread Doug Ledford
On Thu, 2017-07-06 at 10:12 +0200, Romain Perier wrote:
> The current PCI pool API are simple macro functions direct expanded
> to
> the appropriate dma pool functions. The prototypes are almost the
> same
> and semantically, they are very similar. I propose to use the DMA
> pool
> API directly and get rid of the old API.
> 
> This set of patches, replaces the old API by the dma pool API
> and remove the defines.

Is someone planning on merging this series?  If not, I'll send through
the patches I've personally tested (3, 5, and 6).

> Changes in v10:
> - Rebased series onto next-20170706
> - I have fixed and improved patch "scsi: megaraid: Replace PCI pool
> old API"
> 
> Changes in v9:
> - Rebased series onto next-20170522
> - I have fixed and improved the patch for lpfc driver
> 
> Changes in v8:
> - Rebased series onto next-20170428
> 
> Changes in v7:
> - Rebased series onto next-20170416
> - Added Acked-by, Tested-by and Reviwed-by tags
> 
> Changes in v6:
> - Fixed an issue reported by kbuild test robot about changes in
> DAC960
> - Removed patches 15/19,16/19,17/19,18/19. They have been merged by
> Greg
> - Added Acked-by Tags
> 
> Changes in v5:
> - Re-worded the cover letter (remove sentence about checkpatch.pl)
> - Rebased series onto next-20170308
> - Fix typos in commit message
> - Added Acked-by Tags
> 
> Changes in v4:
> - Rebased series onto next-20170301
> - Removed patch 20/20: checks done by checkpath.pl, no longer
> required.
>   Thanks to Peter and Joe for their feedbacks.
> - Added Reviewed-by tags
> 
> Changes in v3:
> - Rebased series onto next-20170224
> - Fix checkpath.pl reports for patch 11/20 and patch 12/20
> - Remove prefix RFC
> Changes in v2:
> - Introduced patch 18/20
> - Fixed cosmetic changes: spaces before brace, live over 80
> characters
> - Removed some of the check for NULL pointers before calling
> dma_pool_destroy
> - Improved the regexp in checkpatch for pci_pool, thanks to Joe
> Perches
> - Added Tested-by and Acked-by tags
> 
> Romain Perier (15):
>   block: DAC960: Replace PCI pool old API
>   dmaengine: pch_dma: Replace PCI pool old API
>   IB/mthca: Replace PCI pool old API
>   net: e100: Replace PCI pool old API
>   mlx4: Replace PCI pool old API
>   mlx5: Replace PCI pool old API
>   wireless: ipw2200: Replace PCI pool old API
>   scsi: be2iscsi: Replace PCI pool old API
>   scsi: csiostor: Replace PCI pool old API
>   scsi: lpfc: Replace PCI pool old API
>   scsi: megaraid: Replace PCI pool old API
>   scsi: mpt3sas: Replace PCI pool old API
>   scsi: mvsas: Replace PCI pool old API
>   scsi: pmcraid: Replace PCI pool old API
>   PCI: Remove PCI pool macro functions
> 
>  drivers/block/DAC960.c| 38 ++-
>  drivers/block/DAC960.h|  4 +-
>  drivers/dma/pch_dma.c | 12 ++--
>  drivers/infiniband/hw/mthca/mthca_av.c| 10 +--
>  drivers/infiniband/hw/mthca/mthca_cmd.c   |  8 +--
>  drivers/infiniband/hw/mthca/mthca_dev.h   |  4 +-
>  drivers/net/ethernet/intel/e100.c | 12 ++--
>  drivers/net/ethernet/mellanox/mlx4/cmd.c  | 10 +--
>  drivers/net/ethernet/mellanox/mlx4/mlx4.h |  2 +-
>  drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 11 ++--
>  drivers/net/wireless/intel/ipw2x00/ipw2200.c  | 13 ++--
>  drivers/scsi/be2iscsi/be_iscsi.c  |  6 +-
>  drivers/scsi/be2iscsi/be_main.c   |  6 +-
>  drivers/scsi/be2iscsi/be_main.h   |  2 +-
>  drivers/scsi/csiostor/csio_hw.h   |  2 +-
>  drivers/scsi/csiostor/csio_init.c | 11 ++--
>  drivers/scsi/csiostor/csio_scsi.c |  6 +-
>  drivers/scsi/lpfc/lpfc.h  | 16 ++---
>  drivers/scsi/lpfc/lpfc_init.c | 16 ++---
>  drivers/scsi/lpfc/lpfc_mem.c  | 90 +--
> 
>  drivers/scsi/lpfc/lpfc_nvme.c |  6 +-
>  drivers/scsi/lpfc/lpfc_nvmet.c|  4 +-
>  drivers/scsi/lpfc/lpfc_scsi.c | 12 ++--
>  drivers/scsi/lpfc/lpfc_sli.c  |  6 +-
>  drivers/scsi/megaraid/megaraid_mbox.c | 30 -
>  drivers/scsi/megaraid/megaraid_mm.c   | 29 -
>  drivers/scsi/megaraid/megaraid_sas_base.c | 27 
>  drivers/scsi/megaraid/megaraid_sas_fusion.c   | 46 +++---
>  drivers/scsi/mpt3sas/mpt3sas_base.c   | 73 ++---
> -
>  drivers/scsi/mvsas/mv_init.c  |  6 +-
>  drivers/scsi/mvsas/mv_sas.c   |  6 +-
>  drivers/scsi/pmcraid.c| 10 +--
>  drivers/scsi/pmcraid.h|  2 +-
>  include/linux/mlx5/driver.h   |  2 +-
>  include/linux/pci.h   |  9 ---
>  35 files changed, 269 insertions(+), 278 deletions(-)
> 
-- 
Doug Ledford <dledf...@redhat.com>
GPG KeyID: B826A3330E572FDD
Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD



Re: [PATCH v6 05/15] mlx4: Replace PCI pool old API

2017-03-25 Thread Doug Ledford
On Mon, 2017-03-20 at 08:32 +0200, Leon Romanovsky wrote:
> On Sun, Mar 19, 2017 at 06:03:54PM +0100, Romain Perier wrote:
> > 
> > The PCI pool API is deprecated. This commit replaces the PCI pool
> > old
> > API by the appropriate function with the DMA pool API.
> > 
> > Signed-off-by: Romain Perier <romain.per...@collabora.com>
> > Acked-by: Peter Senna Tschudin <peter.se...@collabora.com>
> > Tested-by: Peter Senna Tschudin <peter.se...@collabora.com>
> > ---
> >  drivers/net/ethernet/mellanox/mlx4/cmd.c  | 10 +-
> >  drivers/net/ethernet/mellanox/mlx4/mlx4.h |  2 +-
> >  2 files changed, 6 insertions(+), 6 deletions(-)
> > 
> 
> Thanks,
> Reviewed-by: Leon Romanovsky <leo...@mellanox.com>

Changes look fine to me, and it has been tested on this specific
hardware and confirmed RDMA communications still work.

Acked-by: Doug Ledford <dledf...@redhat.com>
Tested-by: Doug Ledford <dledf...@redhat.com>

-- 
Doug Ledford <dledf...@redhat.com>
    GPG KeyID: B826A3330E572FDD
   
Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD



Re: [PATCH v6 03/15] IB/mthca: Replace PCI pool old API

2017-03-24 Thread Doug Ledford
On Sun, 2017-03-19 at 18:03 +0100, Romain Perier wrote:
> The PCI pool API is deprecated. This commit replaces the PCI pool old
> API by the appropriate function with the DMA pool API.
> 
> Signed-off-by: Romain Perier <romain.per...@collabora.com>
> Acked-by: Peter Senna Tschudin <peter.se...@collabora.com>
> Tested-by: Peter Senna Tschudin <peter.se...@collabora.com>
> 

Changes look ok, and I've tested them here as well.

Acked-by: Doug Ledford <dledf...@redhat.com>
Tested-by: Doug Ledford <dledf...@redhat.com>

-- 
Doug Ledford <dledf...@redhat.com>
    GPG KeyID: B826A3330E572FDD
   
Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD



Re: [PATCH v6 06/15] mlx5: Replace PCI pool old API

2017-03-24 Thread Doug Ledford
On Mon, 2017-03-20 at 08:31 +0200, Leon Romanovsky wrote:
> On Sun, Mar 19, 2017 at 06:03:55PM +0100, Romain Perier wrote:
> > 
> > The PCI pool API is deprecated. This commit replaces the PCI pool
> > old
> > API by the appropriate function with the DMA pool API.
> > 
> > Signed-off-by: Romain Perier <romain.per...@collabora.com>
> > Reviewed-by: Peter Senna Tschudin <peter.se...@collabora.com>
> > ---
> >  drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 11 ++-
> >  include/linux/mlx5/driver.h   |  2 +-
> >  2 files changed, 7 insertions(+), 6 deletions(-)
> > 
> 
> Thanks,
> Acked-by: Leon Romanovsky <leo...@mellanox.com>

Changes look fine to me, and in addition I've compiled and tested them
on mlx5 hardware and verified that RDMA communications still work.

Acked-by: Doug Ledford <dledf...@redhat.com>
Tested-by: Doug Ledford <dledf...@redhat.com>

-- 
Doug Ledford <dledf...@redhat.com>
    GPG KeyID: B826A3330E572FDD
   
Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD



Re: iscsi_trx going into D state

2016-12-22 Thread Doug Ledford
00
>>>> [] iscsi_target_tx_thread+0x1aa/0x1d0
>>>> [] kthread+0xd8/0xf0
>>>> [] ret_from_fork+0x3f/0x70
>>>> [] 0x
>>>> # cat /proc/23018/stack
>>>> [] target_wait_for_sess_cmds+0x49/0x1a0
>>>> [] isert_wait_conn+0x1ab/0x2f0 [ib_isert]
>>>> [] iscsit_close_connection+0x162/0x870
>>>> [] iscsit_take_action_for_connection_exit+0x7f/0x100
>>>> [] iscsi_target_tx_thread+0x1aa/0x1d0
>>>> [] kthread+0xd8/0xf0
>>>> [] ret_from_fork+0x3f/0x70
>>>> [] 0x
>>>>
>>>> From dmesg:
>>>> [  394.476332] INFO: rcu_sched self-detected stall on CPU
>>>> [  394.476334]  20-...: (23976 ticks this GP)
>>>> idle=edd/141/0 softirq=292/292 fqs=18788
>>>> [  394.476336]   (t=24003 jiffies g=3146 c=3145 q=0)
>>>> [  394.476337] Task dump for CPU 20:
>>>> [  394.476338] kworker/u68:2   R  running task0 12906  2 
>>>> 0x0008
>>>> [  394.476345] Workqueue: isert_comp_wq isert_cq_work [ib_isert]
>>>> [  394.476346]  883f2fe38000 f805705e 883f7fd03da8
>>>> 810ac8ff
>>>> [  394.476347]  0014 81adb680 883f7fd03dc0
>>>> 810af239
>>>> [  394.476348]  0015 883f7fd03df0 810e1cd0
>>>> 883f7fd17b80
>>>> [  394.476348] Call Trace:
>>>> [  394.476354][] sched_show_task+0xaf/0x110
>>>> [  394.476355]  [] dump_cpu_task+0x39/0x40
>>>> [  394.476357]  [] rcu_dump_cpu_stacks+0x80/0xb0
>>>> [  394.476359]  [] rcu_check_callbacks+0x540/0x820
>>>> [  394.476360]  [] ? account_system_time+0x81/0x110
>>>> [  394.476363]  [] ? tick_sched_do_timer+0x50/0x50
>>>> [  394.476364]  [] update_process_times+0x39/0x60
>>>> [  394.476365]  [] tick_sched_handle.isra.17+0x25/0x60
>>>> [  394.476366]  [] tick_sched_timer+0x3d/0x70
>>>> [  394.476368]  [] __hrtimer_run_queues+0x102/0x290
>>>> [  394.476369]  [] hrtimer_interrupt+0xa8/0x1a0
>>>> [  394.476372]  [] local_apic_timer_interrupt+0x35/0x60
>>>> [  394.476374]  [] smp_apic_timer_interrupt+0x3d/0x50
>>>> [  394.476376]  [] apic_timer_interrupt+0x87/0x90
>>>> [  394.476379][] ? console_unlock+0x41e/0x4e0
>>>> [  394.476380]  [] vprintk_emit+0x2fc/0x500
>>>> [  394.476382]  [] vprintk_default+0x1f/0x30
>>>> [  394.476384]  [] printk+0x5d/0x74
>>>> [  394.476388]  [] transport_lookup_cmd_lun+0x1d1/0x200
>>>> [  394.476390]  [] iscsit_setup_scsi_cmd+0x230/0x540
>>>> [  394.476392]  [] isert_rx_do_work+0x3f3/0x7f0 
>>>> [ib_isert]
>>>> [  394.476394]  [] isert_cq_work+0x184/0x770 [ib_isert]
>>>> [  394.476396]  [] process_one_work+0x14f/0x400
>>>> [  394.476397]  [] worker_thread+0x114/0x470
>>>> [  394.476398]  [] ? __schedule+0x34a/0x7f0
>>>> [  394.476399]  [] ? rescuer_thread+0x310/0x310
>>>> [  394.476400]  [] kthread+0xd8/0xf0
>>>> [  394.476402]  [] ? kthread_park+0x60/0x60
>>>> [  394.476403]  [] ret_from_fork+0x3f/0x70
>>>> [  394.476404]  [] ? kthread_park+0x60/0x60
>>>> [  405.716632] Unexpected ret: -104 send data 360
>>>> [  405.721711] tx_data returned -32, expecting 360.
>>>> 
>>>> Robert LeBlanc
>>>> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1

When you combine this trace with the newest one, it really makes me
thing there is something of a bad interaction between the new drain cq
API and the iser/isert implementation to use said API.  Sagi, Christoph?

-- 
Doug Ledford <dledf...@redhat.com>
GPG Key ID: B826A3330E572FDD
Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD



signature.asc
Description: OpenPGP digital signature


Re: [PATCH net-next 4/5] treewide: replace dev->trans_start update with helper

2016-05-12 Thread Doug Ledford
On 05/03/2016 10:33 AM, Florian Westphal wrote:
> Replace all trans_start updates with netif_trans_update helper.
> change was done via spatch:
> 
> struct net_device *d;
> @@
> - d->trans_start = jiffies
> + netif_trans_update(d)
> 
> Compile tested only.
> 
> Cc: user-mode-linux-de...@lists.sourceforge.net
> Cc: linux-xte...@linux-xtensa.org
> Cc: linux1394-de...@lists.sourceforge.net
> Cc: linux-r...@vger.kernel.org
> Cc: net...@vger.kernel.org
> Cc: mpt-fusionlinux@broadcom.com
> Cc: linux-scsi@vger.kernel.org
> Cc: linux-...@vger.kernel.org
> Cc: linux-par...@vger.kernel.org
> Cc: linux-o...@vger.kernel.org
> Cc: linux-h...@vger.kernel.org
> Cc: linux-...@vger.kernel.org
> Cc: linux-wirel...@vger.kernel.org
> Cc: linux-s...@vger.kernel.org
> Cc: de...@driverdev.osuosl.org
> Cc: b.a.t.m@lists.open-mesh.org
> Cc: linux-blueto...@vger.kernel.org
> Signed-off-by: Florian Westphal <f...@strlen.de>
> ---
>  Checkpatch complains about whitespace damage, but
>  this extra whitespace already exists before this patch.
> 
>  drivers/infiniband/hw/nes/nes_nic.c| 2 +-
>  drivers/infiniband/ulp/ipoib/ipoib_cm.c    | 2 +-
>  drivers/infiniband/ulp/ipoib/ipoib_ib.c| 2 +-

For InfiniBand bits,

Acked-by: Doug Ledford <dledf...@redhat.com>

-- 
Doug Ledford <dledf...@redhat.com>
  GPG KeyID: 0E572FDD




signature.asc
Description: OpenPGP digital signature


Re: [PATCH 10/13] IB/srp: use the new CQ API

2015-12-14 Thread Doug Ledford
On 12/13/2015 05:26 AM, Sagi Grimberg wrote:
> 
>> Allright.  How do you want to proceed?  The current rdma-cq branch
>> has all kinds of dependencies, but I've also prepared a new rdma-cq.2
>> branch that could go straight on top of your current queue:
>>
>> http://git.infradead.org/users/hch/rdma.git/shortlog/refs/heads/rdma-cq.2
>>
>> If you're ready to start the 4.5 tree I can send those out as a patch
>> series.
> 
> Will this get on top of iser-remote-inv? Or should I resend atop of this?

I'm going through my inbox right now.  I expect somewhere in there Or
will make his case for why he doesn't like Christoph's patch to get rid
of the attr struct.  I'll listen, and if I'm not convinced, I'll take
that patchset first and this one second. (I reviewed the patchset
alreadyaside from the fact that I *like* having the attr struct
elements in an organized sub-struct, it's fine and it definitely
improves on all of those query calls).

-- 
Doug Ledford <dledf...@redhat.com>
  GPG KeyID: 0E572FDD




signature.asc
Description: OpenPGP digital signature


Re: [PATCH 10/13] IB/srp: use the new CQ API

2015-12-11 Thread Doug Ledford
On 12/11/2015 09:22 AM, Christoph Hellwig wrote:
> Hi Bart,
> 
> thanks for all the reviews.  I've updated the git branch with your
> suggestions and reviewed-by tags.  I'm going to wait a little bit
> longer for other reviews to come in before reposting the series.

Indeed, thanks for all the catches Bart.  This patchset, with Bart's
fixups, looks good to me.


-- 
Doug Ledford <dledf...@redhat.com>
  GPG KeyID: 0E572FDD




signature.asc
Description: OpenPGP digital signature


Re: [PATCH 06/17] Update the infiniband uverbs driver to use idr helper functions.

2015-09-16 Thread Doug Ledford
On 09/16/2015 01:50 PM, Lee Duncan wrote:
> Signed-off-by: Lee Duncan <ldun...@suse.com>

Looks OK to me.  The setting of uobj->id is no longer under the lock,
but we won't succeed at an idr lookup until it is set, which means it
won't be found and can't be used in idr_remove_uobj() until after the
uobj->id is set regardless of the lock.

Acked-by: Doug Ledford <dledf...@redhat.com>

> ---
>  drivers/infiniband/core/uverbs_cmd.c | 12 ++--
>  1 file changed, 2 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/infiniband/core/uverbs_cmd.c 
> b/drivers/infiniband/core/uverbs_cmd.c
> index bbb02ffe87df..1e5b2a66a501 100644
> --- a/drivers/infiniband/core/uverbs_cmd.c
> +++ b/drivers/infiniband/core/uverbs_cmd.c
> @@ -120,24 +120,16 @@ static int idr_add_uobj(struct idr *idr, struct 
> ib_uobject *uobj)
>  {
>   int ret;
>  
> - idr_preload(GFP_KERNEL);
> - spin_lock(_uverbs_idr_lock);
> -
> - ret = idr_alloc(idr, uobj, 0, 0, GFP_NOWAIT);
> + ret = idr_get_index(idr, _uverbs_idr_lock, uobj);
>   if (ret >= 0)
>   uobj->id = ret;
>  
> - spin_unlock(_uverbs_idr_lock);
> - idr_preload_end();
> -
>   return ret < 0 ? ret : 0;
>  }
>  
>  void idr_remove_uobj(struct idr *idr, struct ib_uobject *uobj)
>  {
> - spin_lock(_uverbs_idr_lock);
> - idr_remove(idr, uobj->id);
> - spin_unlock(_uverbs_idr_lock);
> + idr_put_index(idr, _uverbs_idr_lock, uobj->id);
>  }
>  
>  static struct ib_uobject *__idr_get_uobj(struct idr *idr, int id,
> 


-- 
Doug Ledford <dledf...@redhat.com>
  GPG KeyID: 0E572FDD




signature.asc
Description: OpenPGP digital signature


Re: [PATCH 0/3] IB/srp patches for Linux kernel v4.3

2015-08-01 Thread Doug Ledford
On 07/31/2015 05:12 PM, Bart Van Assche wrote:
 Hello Doug,
 
 Please apply the following three patches at your earliest convenience:
 
 0001-IB-srp-Constify-a-function-argument.patch
 0002-IB-srp-Handle-partial-connection-success-correctly.patch
 0003-IB-srp-Bump-driver-version-and-release-date.patch

Got 'em, thanks!


-- 
Doug Ledford dledf...@redhat.com
  GPG KeyID: 0E572FDD




signature.asc
Description: OpenPGP digital signature


Re: [PATCH 04/12] IB/srp: Fix connection state tracking

2015-05-05 Thread Doug Ledford
On Tue, 2015-05-05 at 17:27 +0200, Bart Van Assche wrote:
 On 05/05/15 17:10, Doug Ledford wrote:
  On Tue, 2015-05-05 at 16:26 +0200, Bart Van Assche wrote:
  On 05/05/15 16:10, Doug Ledford wrote:
  However, while looking through the driver to research this, I noticed
  something else that seems more important if you ask me.  With this patch
  we now implement individual channel connection tracking.  However, in
  srp_queuecommand() you pick the channel based on the tag, and the blk
  layer has no idea of these disconnects, so the blk layer is free to
  assign a tag/channel to a channel that's disconnected, and then as best
  I can tell, you will simply try to post a work request to a channel
  that's already disconnected, which I would expect to fail if we have
  already disconnected this particular qp and not brought up a new one
  yet.  So it seems to me there is a race condition between new incoming
  SCSI commands and this disconnect/reconnect window, and that maybe we
  should be sending these commands back to the mid layer for requeueing
  when the channel the blk_mq tag points to is disconnected.  Or am I
  missing something in there?
 
  Hello Doug,
 
  Around the time a cable disconnect or other link layer failure is
  detected by the SRP initiator or any other SCSI LLD it is unavoidable
  that one or more SCSI requests fail. It is up to a higher layer (e.g.
  dm-multipath + multipathd) to decide what to do with such requests, e.g.
  queue these requests and resend these over another path.
 
  Sure, but that wasn't my point.  My point was that if you know the
  channel is disconnected, then why don't you go immediately to the
  correct action in queuecommand (where correct action could be requeue
  waiting on reconnect or return with error, whatever is appropriate)?
  Instead you attempt to post a command to a known disconnected queue
  pair.
 
  The SRP initiator driver has been tested thoroughly with the multipath
  queue_if_no_path policy, with a fio job with I/O verification enabled
  running on top of a dm device while concurrently repeatedly simulating
  link layer failures (via ibportstate).
 
  Part of my questions here are because I don't know how the blk_mq
  handles certain conditions.  However, your testing above only handles
  one case: all channels get dropped.  As unlikely it may be, what if
  resource constraints caused just one channel to be dropped out of the
  bunch and the others stayed alive?  Then the blk_mq would see requests
  on just one queue come back errored, while the others finished
  successfully.  Does it drop that one queue out of rotation, or does it
  fail over the entire connection?
 
 Hello Doug,
 
 Sorry but I don't think that a SCSI LLD is the appropriate layer to 
 choose between requeuing or failing a request.

Be that as it may, that doesn't change what I said about posting a
command to a known disconnected QP.  You could just fail immediately.
Something like:

if (!ch-connected) {
scmnd-result = DID_NO_CONNECT;
goto err;
}

right after getting the channel in queuecommand would work.  That would
save a couple spinlocks, several DMA mappings, a call into the low level
driver, and a few other things.  (And I only left requeue on the table
because I wasn't sure how the blk_mq dealt with just a single channel
being down versus all of them being down)

  If multiple paths are 
 available between an initiator system and a SAN and if one path fails

Who says the path failed?  The path may be just fine.

 only the multipath layer knows whether there are other working paths 
 available. If a working path is still available then the request should 
 be resent as soon as possible over another path. The multipath layer can 
 only take such a decision after a SCSI LLD has failed a request.

Sure.  I totally get failing fast and unilaterally for multipath managed
devices.  That's all assuming multipath though.  There are uses without
that.

But my point in all of this is that if you have a single qp between
yourself and the target, then any error including a qp resource error ==
path error since you only have one path.  When you have a multi queue
device, that's no longer true.  A transient resource problem on one qp
does not mean a path event (at least not necessarily, although your
statement below converts a QP event into a path event by virtue
disconnecting and reconnecting all of the QPs).  My curiosity is now
moot given what you wrote about tearing everything down and reconnecting
(unless the error handling is modified to be more subtle in its
workings), but the original question in my mind was what happens at the
blk_mq level if you did have a single queue drop but not all of them and
you weren't using multipath.

 If only one channel fails all other channels are disconnected and the 
 transport layer error handling mechanism is started.

I missed that.  I assume it's done in srp_start_tl_fail_timers()?

-- 
Doug Ledford dledf...@redhat.com

Re: [GIT PULL] delete decade+ obsolete aic7xxx_old driver

2013-11-08 Thread Doug Ledford

On 11/08/2013 11:04 AM, Paul Gortmaker wrote:


Paul Gortmaker (1):
   scsi: delete decade+ obsolete aic7xxx_old driver

  Documentation/scsi/00-INDEX | 2 -
  Documentation/scsi/aic7xxx_old.txt  |   511 --
  MAINTAINERS | 1 -
  drivers/scsi/Kconfig|41 -
  drivers/scsi/Makefile   | 1 -
  drivers/scsi/aic7xxx_old.c  | 11149 --
  drivers/scsi/aic7xxx_old/aic7xxx.h  |28 -
  drivers/scsi/aic7xxx_old/aic7xxx.reg|  1401 
  drivers/scsi/aic7xxx_old/aic7xxx.seq|  1539 -
  drivers/scsi/aic7xxx_old/aic7xxx_proc.c |   270 -
  drivers/scsi/aic7xxx_old/aic7xxx_reg.h  |   629 --
  drivers/scsi/aic7xxx_old/aic7xxx_seq.c  |   817 ---
  drivers/scsi/aic7xxx_old/scsi_message.h |49 -
  drivers/scsi/aic7xxx_old/sequencer.h|   135 -
  14 files changed, 16573 deletions(-)
  delete mode 100644 Documentation/scsi/aic7xxx_old.txt
  delete mode 100644 drivers/scsi/aic7xxx_old.c
  delete mode 100644 drivers/scsi/aic7xxx_old/aic7xxx.h
  delete mode 100644 drivers/scsi/aic7xxx_old/aic7xxx.reg
  delete mode 100644 drivers/scsi/aic7xxx_old/aic7xxx.seq
  delete mode 100644 drivers/scsi/aic7xxx_old/aic7xxx_proc.c
  delete mode 100644 drivers/scsi/aic7xxx_old/aic7xxx_reg.h
  delete mode 100644 drivers/scsi/aic7xxx_old/aic7xxx_seq.c
  delete mode 100644 drivers/scsi/aic7xxx_old/scsi_message.h
  delete mode 100644 drivers/scsi/aic7xxx_old/sequencer.h



Farewell old friend...don't let the door hit you in the ass on the way 
out ;-)

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi: delete decade+ obsolete aic7xxx_old driver

2013-09-17 Thread Doug Ledford
On 09/17/13 16:10, James Bottomley wrote:

 OK, so do we have any real evidence that no-one uses this driver?  Does
 any distro actually compile it, for instance?

Red Hat doesn't use it in any of their products (and it hasn't been the
preferred driver since about the old Red Hat Linux 7.0 days).


-- 
Doug Ledford dledf...@redhat.com
  GPG KeyID: 0E572FDD
  http://people.redhat.com/dledford

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi: delete decade+ obsolete aic7xxx_old driver

2013-09-16 Thread Doug Ledford
Yes, this driver is well past ready to be removed.

Acked-by: Doug Ledford dledf...@redhat.com

Sent from my ASUS Pad

Paul Gortmaker paul.gortma...@windriver.com wrote:

After getting warnings in an allyesconfig build[1] from this
driver, I decided to remind myself just how old it was, and
whether it warranted fixing.  In the Kconfig help text, I found:

  This driver will eventually be phased out entirely

Going back to the history archive, I see the line was added[2]
in Feb 2002, when we moved from v2.4.2.1 --- v2.4.2.2

So, with over a decade of notification, and multiple major releases
since then, I think we can justify removing this.  Currently we have
people wasting time building it during routine testing, and then
wasting more time re-researching the known reported warnings, only to
find that nobody really is willing to integrate the fixes[3] for it.

A quick search didn't seem to indicate any active user base for it.
If someone happens to have a quirky _old_ card that the eleven year
old new driver doesn't work with, then it is entirely reasonable
that they stick with a kernel version that predates this removal.

[1] drivers/scsi/aic7xxx_old.c: In function ‘aic7xxx_register’:
drivers/scsi/aic7xxx_old.c:7901:5: warning: case value ‘257’ not in 
 enumerated type ‘ahc_chip’ [-Wswitch]
drivers/scsi/aic7xxx_old.c:7898:5: warning: case value ‘513’ not in 
 enumerated type ‘ahc_chip’ [-Wswitch]
drivers/scsi/aic7xxx_old.c: In function ‘aic7xxx_load_seeprom’:
drivers/scsi/aic7xxx_old.c:8517:5: warning: case value ‘257’ not in 
 enumerated type ‘ahc_chip’ [-Wswitch]
drivers/scsi/aic7xxx_old.c:8510:5: warning: case value ‘513’ not in 
 enumerated type ‘ahc_chip’ [-Wswitch]

[2] http://git.kernel.org/cgit/linux/kernel/git/tglx/history.git commit 
44e8778c

[3] https://lkml.org/lkml/2012/10/29/215

Cc: Hannes Reinecke h...@suse.de
Cc: Doug Ledford dledf...@redhat.com
Cc: James E.J. Bottomley jbottom...@parallels.com
Signed-off-by: Paul Gortmaker paul.gortma...@windriver.com
---

[This is an --irreversible-delete pseudo-patch which doesn't show all
the file content that was deleted wholesale.  The full commit is at:
git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux.git aic7xxx-delete ]

 Documentation/scsi/00-INDEX | 2 -
 Documentation/scsi/aic7xxx_old.txt  |   511 --
 MAINTAINERS | 1 -
 drivers/scsi/Kconfig|41 -
 drivers/scsi/Makefile   | 1 -
 drivers/scsi/aic7xxx_old.c  | 11149 --
 drivers/scsi/aic7xxx_old/aic7xxx.h  |28 -
 drivers/scsi/aic7xxx_old/aic7xxx.reg|  1401 
 drivers/scsi/aic7xxx_old/aic7xxx.seq|  1539 -
 drivers/scsi/aic7xxx_old/aic7xxx_proc.c |   270 -
 drivers/scsi/aic7xxx_old/aic7xxx_reg.h  |   629 --
 drivers/scsi/aic7xxx_old/aic7xxx_seq.c  |   817 ---
 drivers/scsi/aic7xxx_old/scsi_message.h |49 -
 drivers/scsi/aic7xxx_old/sequencer.h|   135 -
 14 files changed, 16573 deletions(-)
 delete mode 100644 Documentation/scsi/aic7xxx_old.txt
 delete mode 100644 drivers/scsi/aic7xxx_old.c
 delete mode 100644 drivers/scsi/aic7xxx_old/aic7xxx.h
 delete mode 100644 drivers/scsi/aic7xxx_old/aic7xxx.reg
 delete mode 100644 drivers/scsi/aic7xxx_old/aic7xxx.seq
 delete mode 100644 drivers/scsi/aic7xxx_old/aic7xxx_proc.c
 delete mode 100644 drivers/scsi/aic7xxx_old/aic7xxx_reg.h
 delete mode 100644 drivers/scsi/aic7xxx_old/aic7xxx_seq.c
 delete mode 100644 drivers/scsi/aic7xxx_old/scsi_message.h
 delete mode 100644 drivers/scsi/aic7xxx_old/sequencer.h

diff --git a/Documentation/scsi/00-INDEX b/Documentation/scsi/00-INDEX
index 9b0787f..2044be5 100644
--- a/Documentation/scsi/00-INDEX
+++ b/Documentation/scsi/00-INDEX
@@ -42,8 +42,6 @@ aic79xx.txt
   - Adaptec Ultra320 SCSI host adapters
 aic7xxx.txt
   - info on driver for Adaptec controllers
-aic7xxx_old.txt
-  - info on driver for Adaptec controllers, old generation
 arcmsr_spec.txt
   - ARECA FIRMWARE SPEC (for IOP331 adapter)
 dc395x.txt
diff --git a/Documentation/scsi/aic7xxx_old.txt 
b/Documentation/scsi/aic7xxx_old.txt
deleted file mode 100644
index ecfc474..000
diff --git a/MAINTAINERS b/MAINTAINERS
index e61c2e8..c79be42 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -470,7 +470,6 @@ M: Hannes Reinecke h...@suse.de
 L:linux-scsi@vger.kernel.org
 S:Maintained
 F:drivers/scsi/aic7xxx/
-F:drivers/scsi/aic7xxx_old/
 
 AIMSLAB FM RADIO RECEIVER DRIVER
 M:Hans Verkuil hverk...@xs4all.nl
diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig
index fe25677..1f02003 100644
--- a/drivers/scsi/Kconfig
+++ b/drivers/scsi/Kconfig
@@ -499,47 +499,6 @@ config SCSI_AACRAID
 
 
 source drivers/scsi/aic7xxx/Kconfig.aic7xxx
-
-config SCSI_AIC7XXX_OLD
-  tristate Adaptec AIC7xxx support (old driver)
-  depends on (ISA || EISA || PCI )  SCSI
-  help
-WARNING This driver is an older aic7xxx driver and is no longer

Re: [PATCH] drivers/scsi/aic7xxx_old: Convert to generic boolean-values

2007-02-16 Thread Doug Ledford
On Fri, 2007-02-16 at 10:50 -0800, Andrew Morton wrote:

 Me no understand.
 
 If you take the specific example of
 
 void
 ahd_set_syncrate(struct ahd_softc *ahd, struct ahd_devinfo *devinfo,
u_int period, u_int offset, u_int ppr_options,
u_int type, int paused)
 
 then if is crufty, inappropriate and wrong that `paused' is a scalar type.

Although you picked a code segment out of the modern aic7xxx, there is a
matching similar one in aic7xxx_old.  Now, in all fairness, I was at one
point playing with a much more preemptable model for that driver that
allowed nested pauses, at which point the value of pause would have made
sense to be scalar, but that was a *long* time ago.

  
 It's just not true or sensible that the code is written so that `paused'
 can take a value of seventy eight.  It _is_ a boolean.  It is a truth
 value.  Declaring it as such in the source is all goodness.  Passing the
 value `true' into calls to this function improve readability over passing
 1.

Hence the reason for the original upper case TRUE/FALSE.  I have to
admit, I don't really like the lower case true/false, it looks like a
variable that can be assigned, thereby changing the implementation of
the function call when in fact each calling location is hard coding a
constant.  But, that's just me and my crufty old C that differentiates
between hard coded things and variables via case.

 So I don't agree with (or understand) your objections.  But I can certainly
 understand reluctance to merge a large-but-minor, do-nothing-much patch into
 a large and not-very-maintained driver.

Hehehe...and here I was thinking of factoring that thing into files and
actually bringing it into the current century.

-- 
Doug Ledford [EMAIL PROTECTED]
  GPG KeyID: CFBFF194
  http://people.redhat.com/dledford

Infiniband specific RPMs available at
  http://people.redhat.com/dledford/Infiniband


signature.asc
Description: This is a digitally signed message part


Re: [Patch] QLogic qla2x00 driver fixes

2005-02-28 Thread Doug Ledford
On Fri, 2005-02-25 at 09:02 -0800, Patrick Mansfield wrote:
 On Fri, Feb 25, 2005 at 06:57:39AM -0500, Doug Ledford wrote:
  On Fri, 2005-02-25 at 03:38 -0500, Jeff Garzik wrote:
   Arjan van de Ven wrote:
On Thu, 2005-02-24 at 23:21 -0500, Doug Ledford wrote:

   Don't use cmd-request-nr_hw_segments as it may not be initialized
   (SG_IO in particular bypasses anything that initializes this and just
   uses scsi_do_req to insert a scsi_request directly on the head of the
   queue) 


should we fix that in the SG_IO layer ?
   
   Possibly/probably.
 
 Doug,
 
 What kernel did you hit this with? 

Our RHEL4 kernel (2.6.9 plus bunches-o'-patches)

 And same question as Doug G: is it via sg (not the block SG_IO)? sg uses
 scsi_do_req(), block SG_IO doesn't.

At the moment, I'm sorting that out.  It's a bit of a black box.  We
have two different loops of commands running on my test machine.  There
are 10 instances of:

while true; do scsiinfo -i /dev/sdf; done 

which sends an INQUIRY request to the target device.  I'm running into a
problem getting an strace from this command, so I'm tell you more when I
can.

The other 10 loops are:

while true; do scsi_id -g -s /block/sdf -p 0x80; done

 Jens sent changes last August or so that fixed SG_IO (not sg) to always
 set nr_hw_segments, change should be in 2.6.10. It is not obvious that his
 change fixed this, I can't find the changeset or log.

Finding out if this is fixed is only greatly complicated by the fact
that there exists a drivers/block/scsi_ioctl.c and
drivers/scsi/scsi_ioctl.c and they both are compiled into the kernel, so
knowing which one is actually being used, codepath wise, can be
confusing.  Then on top of that there's drivers/scsi/sg.c with it's
SG_IO path.  It's not necessarily an easy issue to sort out without that
strace and for some reason strace is locking on the test box I'm
running.

(insert delay as we get the strace issue resolved)

OK, the program that's causing the problem, from what I can tell, is
opening /dev/sdf and then issuing a SCSI_IOCTL_SEND_COMMAND request.
That means we go to scsi/sd.c:sd_ioctl to
block/scsi_ioctl.c:scsi_cmd_ioctl to block/scsi_ioctl.c:sg_scsi_ioctl
and it's on that path that we aren't always getting an initialized
nr_hw_segments.

/me votes for consolidating some of this ioctl mess if at all possible.

-- 
Doug Ledford [EMAIL PROTECTED]
http://people.redhat.com/dledford

-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Patch] QLogic qla2x00 driver fixes

2005-02-25 Thread Doug Ledford
On Fri, 2005-02-25 at 03:38 -0500, Jeff Garzik wrote:
 Arjan van de Ven wrote:
  On Thu, 2005-02-24 at 23:21 -0500, Doug Ledford wrote:
  
 Don't use cmd-request-nr_hw_segments as it may not be initialized
 (SG_IO in particular bypasses anything that initializes this and just
 uses scsi_do_req to insert a scsi_request directly on the head of the
 queue) 
  
  
  should we fix that in the SG_IO layer ?
 
 Possibly/probably.

I'm not concerned with it personally.  The only reason that the scsi
layer copies the block layer request struct into the scsi
command/request is so that upon completion it has enough information to
mark blocks as either up to date or not while at the same time allowing
the scsi layer to free the original block request at queue time, not at
completion time.  It was never intended to be used by low level drivers.
And since the scsi layer has several ways of bypassing the block layer
to inject commands directly to devices, trying to catch all those
injection points and make them emulate block layer activity when we
already provide everything a low level driver needs in the scsi command
struct is silly.  Just use what's available, reliable, always
initialized, and intended for you to use (cmd-use_sg) instead of trying
to use a roughly duplicate item from a different abstraction layer that
isn't always involved in commands sent to a low level driver (request-
nr_hw_segments).

 For SCSI drivers specifically, using nr_hw_segments is pointless unless 
 cmd-use_sg goes away.  At which point tons of SCSI drivers want 
 changing anyway.
 
   n_sg = dma_map_sg(..., cmd-use_sg, ...)
 
 will always do the right thing, when the cmd contains a scatterlist.
 
 Deviate from the norm and pay the price, really...

Aye, yup.

-- 
Doug Ledford [EMAIL PROTECTED]
http://people.redhat.com/dledford


-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Patch] QLogic qla2x00 driver fixes

2005-02-24 Thread Doug Ledford
Don't use cmd-request-nr_hw_segments as it may not be initialized
(SG_IO in particular bypasses anything that initializes this and just
uses scsi_do_req to insert a scsi_request directly on the head of the
queue) and a bogus value here can trip up the checks to make sure that
the number of segments will fit in the queue ring buffer, resulting in
commands that are never completed.

Fix up several issues with PCI DMA mapping and failure to check return
values on the mappings.

Make the check for space in the ring buffer happen after the DMA mapping
is done since any checks done before the mapping has taken place are
bogus.

-- 
Doug Ledford [EMAIL PROTECTED]
http://people.redhat.com/dledford
= qla_iocb.c 1.15 vs edited =
--- 1.15/drivers/scsi/qla2xxx/qla_iocb.c	2004-12-09 01:12:03 -05:00
+++ edited/qla_iocb.c	2005-02-24 16:25:55 -05:00
@@ -216,18 +216,7 @@ void qla2x00_build_scsi_iocbs_32(srb_t *
 			cur_seg++;
 		}
 	} else {
-		dma_addr_t	req_dma;
-		struct page	*page;
-		unsigned long	offset;
-
-		page = virt_to_page(cmd-request_buffer);
-		offset = ((unsigned long)cmd-request_buffer  ~PAGE_MASK);
-		req_dma = pci_map_page(ha-pdev, page, offset,
-		cmd-request_bufflen, cmd-sc_data_direction);
-
-		sp-dma_handle = req_dma;
-
-		*cur_dsd++ = cpu_to_le32(req_dma);
+		*cur_dsd++ = cpu_to_le32(sp-dma_handle);
 		*cur_dsd++ = cpu_to_le32(cmd-request_bufflen);
 	}
 }
@@ -299,19 +288,8 @@ void qla2x00_build_scsi_iocbs_64(srb_t *
 			cur_seg++;
 		}
 	} else {
-		dma_addr_t	req_dma;
-		struct page	*page;
-		unsigned long	offset;
-
-		page = virt_to_page(cmd-request_buffer);
-		offset = ((unsigned long)cmd-request_buffer  ~PAGE_MASK);
-		req_dma = pci_map_page(ha-pdev, page, offset,
-		cmd-request_bufflen, cmd-sc_data_direction);
-
-		sp-dma_handle = req_dma;
-
-		*cur_dsd++ = cpu_to_le32(LSD(req_dma));
-		*cur_dsd++ = cpu_to_le32(MSD(req_dma));
+		*cur_dsd++ = cpu_to_le32(LSD(sp-dma_handle));
+		*cur_dsd++ = cpu_to_le32(MSD(sp-dma_handle));
 		*cur_dsd++ = cpu_to_le32(cmd-request_bufflen);
 	}
 }
@@ -344,6 +322,8 @@ qla2x00_start_scsi(srb_t *sp)
 
 	/* Setup device pointers. */
 	ret = 0;
+	/* So we know we haven't pci_map'ed anything yet */
+	tot_dsds = 0;
 	fclun = sp-lun_queue-fclun;
 	ha = fclun-fcport-ha;
 	reg = ha-iobase;
@@ -372,8 +352,32 @@ qla2x00_start_scsi(srb_t *sp)
 	if (index == MAX_OUTSTANDING_COMMANDS)
 		goto queuing_error;
 
+	/* Map the sg table so we have an accurate count of sg entries needed */
+	if (cmd-use_sg) {
+		sg = (struct scatterlist *) cmd-request_buffer;
+		tot_dsds = pci_map_sg(ha-pdev, sg, cmd-use_sg,
+		cmd-sc_data_direction);
+		if (tot_dsds == 0)
+			goto queuing_error;
+	} else if (cmd-request_bufflen) {
+		dma_addr_t	req_dma;
+		struct page	*page;
+		unsigned long	offset;
+
+		page = virt_to_page(cmd-request_buffer);
+		offset = ((unsigned long)cmd-request_buffer  ~PAGE_MASK);
+		req_dma = pci_map_page(ha-pdev, page, offset,
+		cmd-request_bufflen, cmd-sc_data_direction);
+
+		if (dma_mapping_error(req_dma))
+			goto queuing_error;
+
+		sp-dma_handle = req_dma;
+		tot_dsds = 1;
+	}
+
 	/* Calculate the number of request entries needed. */
-	req_cnt = (ha-calc_request_entries)(cmd-request-nr_hw_segments);
+	req_cnt = (ha-calc_request_entries)(tot_dsds);
 	if (ha-req_q_cnt  (req_cnt + 2)) {
 		cnt = RD_REG_WORD_RELAXED(ISP_REQ_Q_OUT(ha, reg));
 		if (ha-req_ring_index  cnt)
@@ -385,19 +389,6 @@ qla2x00_start_scsi(srb_t *sp)
 	if (ha-req_q_cnt  (req_cnt + 2))
 		goto queuing_error;
 
-	/* Finally, we have enough space, now perform mappings. */
-	tot_dsds = 0;
-	if (cmd-use_sg) {
-		sg = (struct scatterlist *) cmd-request_buffer;
-		tot_dsds = pci_map_sg(ha-pdev, sg, cmd-use_sg,
-		cmd-sc_data_direction);
-		if (tot_dsds == 0)
-			goto queuing_error;
-	} else if (cmd-request_bufflen) {
-	tot_dsds++;
-	}
-	req_cnt = (ha-calc_request_entries)(tot_dsds);
-
 	/* Build command packet */
 	ha-current_outstanding_cmd = handle;
 	ha-outstanding_cmds[handle] = sp;
@@ -479,6 +470,14 @@ qla2x00_start_scsi(srb_t *sp)
 	return (QLA_SUCCESS);
 
 queuing_error:
+	if (cmd-use_sg  tot_dsds) {
+		sg = (struct scatterlist *) cmd-request_buffer;
+		pci_unmap_sg(ha-pdev, sg, cmd-use_sg,
+		cmd-sc_data_direction);
+	} else if (tot_dsds) {
+		pci_unmap_page(ha-pdev, sp-dma_handle, cmd-request_bufflen,
+			   cmd-sc_data_direction);
+	}
 	spin_unlock_irqrestore(ha-hardware_lock, flags);
 
 	return (QLA_FUNCTION_FAILED);


Re: question on io_request_lock

2001-07-19 Thread Doug Ledford

MEHTA,HIREN (A-SanJose,ex1) wrote:
 
 Well, I guess the problem is that : scsi layer does not know whether the
 hba driver will do its own locking and release io_request_lock or not.
 Before calling any entry point, scsi layer is anyway acquires
 io_request_lock.
 Also, if you are depending on the io_request_lock, then in the interrupt
 handler you are supposed to acquired the io_request_lock and
 release it before returning. So, when the hba driver calls the done routine,
 the io_request_lock is already locked and doing it again in the done() will
 freeze the system.

Exactly.  However, if you use the new error handling code, then you *can* get
away with not grabbing the io_request_lock for the done function (I believe,
Eric will correct me if I'm wrong) because it uses a different spin lock to
control the queue it puts the commands into for the bottom half handler.  As
such, it doesn't really do anything that needs the io_request_lock.  The old
error handling code is a whole different ball of wax.  Call that done function
without holding the io_request_lock and you are bound to blow the SCSI layer
to hell.

 -hiren
 
 -Original Message-
 From: Martin Peschke3 [mailto:[EMAIL PROTECTED]]
 Sent: Thursday, July 19, 2001 12:23 PM
 To: Doug Ledford
 Cc: MEHTA,HIREN (A-SanJose,ex1); '[EMAIL PROTECTED]'
 Subject: Re: question on io_request_lock
 
  Yes, the done() function needs to be wrapped (this isn't so much because
 it
  actually needs it as it is that you are calling mid layer code and you
 need to
  adhere to what it tells you locking semantics are, which in this case is
  always hold the io_request_lock because that's how I keep myself sane).
 
 Are you sure about that?
 I think, there are low-level drivers which don't follow this rule.
 Why does not the scsi_done function(s) itself get the io_request_lock
 first before doing any work if it really needs this?
 
 Mit freundlichen Grüßen / with kind regards
 
 Martin Peschke
 
 IBM Deutschland Entwicklung GmbH
 GNU/Linux for S/390 and zSeries Development
 Phone: +49-(0)7031-16-2349
 
 Doug Ledford [EMAIL PROTECTED] on 07/19/2001 06:48:45 PM
 
 Please respond to Doug Ledford [EMAIL PROTECTED]
 
 To:   MEHTA,HIREN (A-SanJose,ex1) [EMAIL PROTECTED]
 cc:   '[EMAIL PROTECTED]' [EMAIL PROTECTED]
 Subject:  Re: question on io_request_lock
 
 MEHTA,HIREN (A-SanJose,ex1) wrote:
 
  Hi List,
 
  If I decide that I do not want to depend on io_request_lock
  to maintain the consistency of the data structures in my driver,
  then do I ever need to acquire this lock and release it in the driver ?
  e.g. Do I need to acquire io_request_lock before I call the done()
 routine
  and release it after done() returns ?
 
 Yes, the done() function needs to be wrapped (this isn't so much because it
 actually needs it as it is that you are calling mid layer code and you need
 to
 adhere to what it tells you locking semantics are, which in this case is
 always hold the io_request_lock because that's how I keep myself sane).
 You
 are free to use your own internal spin locks in your code if you wish.  For
 certain of your code entry points, you may wish to release the
 io_request_lock
 and then re-grab it before returning.  For example, at the beginning of
 your
 queue routine you should release the io_request_lock and then regrab it
 before
 returning if you want to use your own internal locking and not use the
 io_request_lock.  The queue, abort, and reset routines are called with the
 io_request_lock held when using the old error handling methods.  I don't
 know
 which entry points are called with the lock held under the new error
 handling
 code, but I would suspect the answer is all of them.
 
 --
 
  Doug Ledford [EMAIL PROTECTED]  http://people.redhat.com/dledford
   Please check my web site for aic7xxx updates/answers before
   e-mailing me about problems
 -
 To unsubscribe from this list: send the line unsubscribe linux-scsi in
 the body of a message to [EMAIL PROTECTED]

-- 

 Doug Ledford [EMAIL PROTECTED]  http://people.redhat.com/dledford
  Please check my web site for aic7xxx updates/answers before
  e-mailing me about problems
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]



Re: Stop rewind in MT command?

2001-05-24 Thread Doug Ledford

Alan Dayley wrote:
 
 I am using mt-st v. 0.5b.  At the end of EVERY MT command, a rewind is
 issued.  What good it a seek to end of data (seod) if it is immeadiately
 rewound before another command can be issued?
 
 Is there an option or setting somewhere to stop the rewind command at the
 end of MT execution?

man st

FILES
   /dev/st*  : the auto-rewind SCSI tape devices
   /dev/nst* : the non-rewind SCSI tape devices

This is true of all tape devices.  The ones that are preceeded with an 'n' are
the non-rewind version, and the ones that aren't automatically rewind when the
tape device is closed.  So, the problem isn't with mt, it's doing exactly what
it is suppossed to do.  The problem is with the device you are accessing the
tape through.

-- 

 Doug Ledford [EMAIL PROTECTED]  http://people.redhat.com/dledford
  Please check my web site for aic7xxx updates/answers before
  e-mailing me about problems
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]



Re: Linux Cluster using shared scsi

2001-05-02 Thread Doug Ledford

Eric Z. Ayers wrote:
 
 Doug Ledford writes:
 (James Bottomley commented about the need for SCSI reservation kernel patches)
  
   I agree.  It's something that needs fixed in general, your software needs it
   as well, and I've written (about 80% done at this point) some open source
   software geared towards getting/holding reservations that also requires the
   same kernel patches (plus one more to be fully functional, an ioctl to allow a
   SCSI reservation to do a forced reboot of a machine).  I'll be releasing that
   package in the short term (once I get back from my vacation anyway).
  
 
 Hello Doug,
 
 Does this package also tell the kernel to re-establish a
 reservation for all devices after a bus reset, or at least inform a
 user level program?  Finding out when there has been a bus reset has
 been a stumbling block for me.

It doesn't have to.  The kernel changes are minimal (basically James' SCSI
reset patch that he's been carrying around, the scsi reservation conflict
patch, and I need to write a third patch that makes the system optionally
reboot immediately on a reservation conflict and which is controlled by an
ioctl, but I haven't done that patch yet).  All of the rest is implemented in
user space via the /dev/sg entries.  As such, it doesn't have any more
information about bus resets than you do.  However, because of the policy
enacted in the code, it doesn't need to.  Furthermore, because there are so
many ways to loose a reservation silently, it's foolhardy to try and keep
reservation consistency any way other than something similar to what I outline
below.

The package is meant to be a sort of scsi reservation library.  The
application that uses the library is responsible for setting policy.  I wrote
a small, simple application that actually does a decent job of implementing
policy on the system.  The policy it does implement is simple:

If told to get a reservation, then attempt to get it.  If the attempt is
blocked by an existing reservation and we aren't suppossed to reset the drive,
then exit.  If it's blocked and we are suppossed to reset the drive, then send
a device reset, then wait 5 seconds, then try to get the reservation.  If we
again fail, then the other machine is still alive (as proven by the fact that
it re-established its reservation after the reset) and we exit, else we now
have the reservation.

If told to forcefully get a reservation, then attempt to get it.  If the
attempt fails, then reset the device and try again immediately (no 5 second
wait), if it fails again, then exit.

If told to hold a reservation, then resend your reservation request once every
2 seconds (this actually has very minimal CPU/BUS usage and isn't as big a
deal as requesting a reservation every 2 seconds might sound).  The first time
the reservation is refused, consider the reservation stolen by another machine
and exit (or optionally, reboot).

The package is meant to lock against itself (in other words, a malicious user
with write access to the /dev/sg entries could confuse this locking mechanism,
but it will work cooperatively with other copies of itself running on other
machines), the requirements for the locking to be safe are as follows:

1)  A machine is not allowed to mount or otherwise use a drive in any way
shape or form until it has successfully acquired a reservation.

2)  Once a machine has a reservation, it is not allowed to ever take any
action to break another machines reservation, so that if the reservation is
stolen, this machine is required to gracefully step away from the drive
(rebooting is the best way to accomplish this since even the act of unmounting
the drive will attempt to write to it).

3)  The timeouts in the program must be honored (resend your reservation, when
you hold it, every 2 seconds so that a passive attempt to steal the
reservation will see you are still alive within the 5 second timeout and leave
you be, which is a sort of heartbeat in and of itself).

Anyway, as I said in my previous email, it's about 80% complete.  It currently
is up and running on SCSI-2 LUN based reservations.  There is code to do
SCSI-2 and SCSI-3 extent based reservations but it hasn't been tested due to
lack of devices that support extent based reservations (my test bed is a
multipath FC setup, so I'm doing all my testing on FC drives over two FC
controllers in the same machine).  I've still got to add the SCSI-3 Persistent
Reservation code to the library (again, I'm lacking test drives for this
scenario).  The library itself requires that the program treat all
reservations as extent/persistent reservations and it silently falls back to
LUN reservations when neither of those two are available.  My simple program
that goes with the application just makes extent reservations of the whole
disc, so it acts like a LUN reservation regardless, but there is considerably
more flexibility in the library if a person wishes to program to it.


-- 

 Doug Ledford [EMAIL PROTECTED]  http

Re: QUEUE_FULL + use_new_eh_code = bad_karma?

2001-04-11 Thread Doug Ledford

Eric Youngdale wrote:
 
 Well hosts that don't use the new EH code don't handle QUEUE_FULL.  Not
 quite sure what would happen with it, actually.
 
 If everything goes according to plan, when the device returns
 QUEUE_FULL, the io should get stuck back in the queue to be handled later.
 In addition a flag should be set to prevent further I/O from being sent to
 that device - that flag should be cleared when one of the existing I/O
 requests actually completes.
 
 I am not completely sure how best to debug this.  I guess the first
 thing to do is to see if in fact the QUEUE_FULL is being handled as I
 described and the flag is getting set.   Then the next thing to look for is
 to see what happens when some other I/O finishes.

Please note that while another command completing is enough to trigger the
next command to be sent, there are cases when we can get a QUEUE_FULL with 0
commands active on the device, resulting in a hung drive if you don't
implement a timeout for the paused queue.

-- 

 Doug Ledford [EMAIL PROTECTED]  http://people.redhat.com/dledford
  Please check my web site for aic7xxx updates/answers before
  e-mailing me about problems
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]



Re: 160MB/s drive detected as 80MB/s

2001-03-30 Thread Doug Ledford

Simon Garner wrote:
 
 Hi,
 
 I have a RH7 system which I have just compiled kernel-2.4.2 for. The box has
 an Asus CUV4X-D motherboard, and an Adaptec 29160 PCI SCSI controller. I'm
 using the AIC7xxx driver.
 
 Everything works, except the hard drive is a Quantum Atlas 10K2, 160MB/sec,
 but is only detected as 80MB/sec, as follows:
 
 # dmesg
 ...
 SCSI subsystem driver Revision: 1.00
 (scsi0) Adaptec AIC-7892 Ultra 160/m SCSI host adapter found at PCI 0/11/0
 (scsi0) Wide Channel, SCSI ID=7, 32/255 SCBs
 (scsi0) Downloading sequencer code... 392 instructions downloaded
 scsi0 : Adaptec AHA274x/284x/294x (EISA/VLB/PCI-Fast SCSI) 5.2.1/5.2.0
Adaptec AIC-7892 Ultra 160/m SCSI host adapter
 (scsi0:0:6:0) ***Synchronous at 80.0 Mbyte/sec***, offset 127.
   Vendor: QUANTUM   Model: ATLAS10K2-TY184L  Rev: DA40
   Type:   Direct-Access  ANSI SCSI revision: 03
 Detected scsi disk sda at scsi0, channel 0, id 6, lun 0
 SCSI device sda: 35860910 512-byte hdwr sectors (18361 MB)
 ...
 
 (My asterisks.)
 
 Is this wrong, and can it be fixed, or am I just misunderstanding?

The 5.2.4 aic7xxx driver fixes this (patches at my web site, also it is in the
ac series kernels) or you can use Justin Gibbs' new aic7xxx driver (which
never had this issue in the first place) which happens to be the new default
aic7xxx driver in the 2.4.3 kernel.

-- 

 Doug Ledford [EMAIL PROTECTED]  http://people.redhat.com/dledford
  Please check my web site for aic7xxx updates/answers before
  e-mailing me about problems
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]



Re: scsi_scan problem.

2001-03-16 Thread Doug Ledford

Ishikawa wrote:
 
 Hi,
 
 I have an "old" Nakamichi CD changer.
 ("old" might be important consideration here. )
 
 Should I test the patch submitted and report what I found ?
 (Or maybe I don't have to bother at this stage at all
 and  simply wait for the 2.5 development and debugging cycle?)

It would still be helpful because this problem has to be fixed before 2.5. 
The only question is whether to fix it with a simple patch such as I just
submitted, or a more complex patch that uses REPORT LUNs.  Part of that answer
is how my simple patch works on your device.

-- 

 Doug Ledford [EMAIL PROTECTED]  http://people.redhat.com/dledford
  Please check my web site for aic7xxx updates/answers before
  e-mailing me about problems
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]



Re: scsi_scan problem.

2001-03-14 Thread Doug Ledford

Pete Zaitcev wrote:
 
  Date: Wed, 14 Mar 2001 21:28:14 -0500
  From: Doug Ledford [EMAIL PROTECTED]
 
  A bug report I was charged with fixing (qla2x00 driver doesn't see all luns or
  sees multiple identical luns in different scenarios) was not a bug in the
  qla2x00 driver.  [...]
   The bug is that we were detecting offline devices and linking
  them into the device list.
 
 Why is this a bug? What would happen when I telnet into the
 the RAID box and enable my volumes on those LUNs?

Then they would be legitimate devices and you would do:

echo "scsi-add-single-device a b c d"  /proc/scsi/scsi

to add them to the device chain.  Before then, they aren't anything (at least
not in the case of the Clariion array).

   But, some devices (at least the Clariion raid
  chassis) report luns that don't currently have any device bound to them as
  present but offline.  This meant if we truly scanned all luns then we got
  something like 100+ devices on one ID from this chassis when only 1 might be
  valid:-(
 
 16384 LUNs for Fibre Channel. As you see, scanning is out of the
 question. You must issue REPORT LUNs and fall back on scanning
 if the device reports a check condition. I did that when I worked
 in Sun Storage with A5000/A3500/T3 arrays couple of years ago.

Patches welcomed.  The one I sent already works on a fiber channel setup (the
qla2x00 in question is fc and so is the Clariion array it's connected to, no
detrimental side effects from scanning the box) and so I'm not inclined to add
a REPORT LUNs section to the code unless absolutely necessary.

-- 

 Doug Ledford [EMAIL PROTECTED]  http://people.redhat.com/dledford
  Please check my web site for aic7xxx updates/answers before
  e-mailing me about problems
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]



Re: scsi_scan problem.

2001-03-14 Thread Doug Ledford

Doug Ledford wrote:

 Patches welcomed.  The one I sent already works on a fiber channel setup (the
 qla2x00 in question is fc and so is the Clariion array it's connected to, no
 detrimental side effects from scanning the box) and so I'm not inclined to add
 a REPORT LUNs section to the code unless absolutely necessary.

Clarification, I think REPORT LUNS support is a 2.5 thing, not a stick it in
now thing.

-- 

 Doug Ledford [EMAIL PROTECTED]  http://people.redhat.com/dledford
  Please check my web site for aic7xxx updates/answers before
  e-mailing me about problems
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]