On Fri, May 31, 2019 at 10:42:12AM +0100, John Garry wrote:
> > > > > 
> > > > > -static int hisi_sas_slot_index_alloc(struct hisi_hba *hisi_hba,
> > > > > -                                  struct scsi_cmnd *scsi_cmnd)
> > > > > +static int hisi_sas_slot_index_alloc(struct hisi_hba *hisi_hba)
> > > > >  {
> > > > >       int index;
> > > > >       void *bitmap = hisi_hba->slot_index_tags;
> > > > >       unsigned long flags;
> > > > > 
> > > > > -     if (scsi_cmnd)
> > > > > -             return scsi_cmnd->request->tag;
> > > > > -
> > > > >       spin_lock_irqsave(&hisi_hba->lock, flags);
> > > > >       index = find_next_zero_bit(bitmap, hisi_hba->slot_index_count,
> > > > >                                  hisi_hba->last_slot_index + 1);
> > > > 
> > > > Then you switch to hisi_sas_slot_index_alloc() for allocating the unique
> > > > tag via spin_lock & find_next_zero_bit(). Do you think this way is more
> > > > efficient than blk-mq's sbitmap?
> 
> These are not fast path, as used only for TMF, internal IO, etc.

OK, looks I miss that.

> 
> Having said that, hopefully we can move to scsi_{get,put}_reserved_cmd()
> when available, so that the LLDD has to stop managing them.

Agree.

> 
> > > > 
> > > slot_index_alloc() is only used for commands which do _not_ have a tag
> > > (eg internal commands), or for v2 hardware which has weird allocation 
> > > rules.
> > 
> > But this patch has switched to this allocation unconditionally for all 
> > commands:
> > 
> 
> As Hannes said, v2 had a few bugs which meant that we had to make a specific
> version of this function for that hw revision, cf.
> slot_index_alloc_quirk_v2_hw(), and it cannot use request queue tag.
> 
> But, indeed, we could consider sbitmap for v2 hw. I'm not sure if it would
> help, considering the weird rules.
> 
> > > -       if (scsi_cmnd)
> > > -               return scsi_cmnd->request->tag;
> > > -
> > 
> > Otherwise duplicated slot can be used from different blk-mq hw queue.
> > 
> > > 
> > > > The worsen thing is that V3's actual max queue depth is (4096 - 96), but
> > > > this patch claims that the device can support (4096 - 96) * 32 command
> > > > slots
> 
> To be clear about the hw, the hw supports max 4096 command tags and has 16

Is 4096 the max allowed host-wide command tags? Or per-queue's max commands
tags?

If it is per-queue's max command tags, V3 should be real MQ hardware,
otherwise it is still host wide. Otherwise, Hannes's patch can't work
because tag from different hw queue can be same.

> hw queues. The hw queue depth is configurable by software, and we configure
> it at 4096 per queue - same as max command tags - this is to support
> possibility of all commands using the same queue simultaneously.

Suppose 4096 is the host-wide command tags:

As Hannes's patch changed to allow each blk-mq hw queue to accept 4096 commands,
there will be big contention in driver, given now the actual .can_queue becomes
4096 * 32, and how to avoid the same tag from different hw queue? 

Thanks,
Ming

Reply via email to