On Mon, 2009-05-18 at 16:11 -0700, Joe Eykholt wrote:
> Robert Love wrote:
> > This patch makes the EMs SMP aware. They will allocate
> > XIDs by pools determined by the number of possible CPUs
> > in the system.
> > 
> > The goal is to have I/O echanges processed on one CPU to
> 
> Typo: exchanges
> 
> > avoid cache misses. This means that whatever CPU is processing
> > the READ request should also process the READ response. To
> > do so, this patch carves up an XID space in an EM by the number
> > of CPUs in the system. It then checks the processor ID and
> > gives out XIDs to the active CPU from its pool. On Rx, there
> > is a queue/thread per CPU already, so this patch just uses
> > (XID % num_possible_cpus) to determine which CPU/queue/thread
> > the response should be processed by.
> 
> The hash should be determined by the largest CPU number plus 1.
> That is the same as num_possible_cpus() for x86, and probably all
> other architectures.
> 
Are you suggesting that I should do something other than:

cpu = oxid % num_possible_cpus();

because it isn't portable? I don't think so, since you would have
said more in your next comment. Just checking...

> > If a CPU is offlined there is no reallocation of resources
> > (i.e. XIDs), they just go unused. Also, if a CPU is offlined
> > our Rx handler will move the response frame to another CPU.
> > We will take a cache miss, but that isn't a concern as online/
> > offline CPU should be infrequent and the number of frames
> > that might go to the wrong CPU would likely show no impact.
> > 
> > Signed-off-by: Robert Love <[email protected]>
> > ---
> > 
> >  drivers/scsi/fcoe/fcoe.c     |    2 +
> >  drivers/scsi/libfc/fc_exch.c |   77 +++++++++++++++++++++++++++++++++++++-----
> >  2 files changed, 69 insertions(+), 10 deletions(-)
> > 
> > diff --git a/drivers/scsi/fcoe/fcoe.c b/drivers/scsi/fcoe/fcoe.c
> > index f589a47..0001de8 100644
> > --- a/drivers/scsi/fcoe/fcoe.c
> > +++ b/drivers/scsi/fcoe/fcoe.c
> > @@ -991,7 +991,7 @@ int fcoe_rcv(struct sk_buff *skb, struct net_device *dev,
> >      * cpu bits to get cpu and then this cpu is used for selecting
> >      * a per cpu kernel thread from fcoe_percpu.
> >      */
> > -   cpu = oxid & (num_online_cpus() - 1);
> > +   cpu = oxid % num_online_cpus();
> 
> This should be num_possible_cpus() which is what the allocation used.
> 
Yeah, you're right. I would have hit this when testing online/offline
of CPUs. Well, maybe not. :) If I had 4 CPUs and offlined #2, then I
would never use #3 at all and would still be targeting #2, which would
of course move the work to a different CPU.

> >  #endif
> >  
> >     fps = &per_cpu(fcoe_percpu, cpu);
> > diff --git a/drivers/scsi/libfc/fc_exch.c b/drivers/scsi/libfc/fc_exch.c
> > index 0664366..1670bad 100644
> > --- a/drivers/scsi/libfc/fc_exch.c
> > +++ b/drivers/scsi/libfc/fc_exch.c
> > @@ -69,7 +69,7 @@ struct fc_exch_mgr {
> >     spinlock_t      em_lock;        /* exchange manager lock,
> >                                        must be taken before ex_lock */
> >     u16             em_user;        /* exchange manager user count */
> > -   u16             last_xid;       /* last allocated exchange ID */
> > +   u16             *last_xid;      /* last allocated exchange ID */
> >     u16             min_xid;        /* min exchange ID */
> >     u16             max_xid;        /* max exchange ID */
> >     u32             total_exches;   /* total allocated exchanges */
> > @@ -471,6 +471,49 @@ static struct fc_seq *fc_seq_alloc(struct fc_exch *ep, u8 seq_id)
> >     return sp;
> >  }
> >  
> > +static u16 fc_exch_xid_find(struct fc_exch_mgr *mp, unsigned int cpu_idx)
> > +{
> > +   int found = 0;
> > +   unsigned int nr_cpus = num_possible_cpus();
> > +   u16 xid, prev_xid;
> > +   prev_xid = xid = *per_cpu_ptr(mp->last_xid, cpu_idx);
> > +
> > +   while (!found) {
> 
> Found is unnecessary since we just break when we find the xid.
> We could just return.
> 
>       for (;;) {
> 
> > +           /* Wrap around or increment */
> > +           if (xid == FC_XID_UNKNOWN)
> > +                   xid = mp->min_xid + cpu_idx;
> 
> This assumes that min_xid is the right starting ID for cpu 0.
> Is that valid?   Need to check that when we allocate.
> Being off would hurt performance.
> 
Fixed this up, it's much nicer. Thanks.

> > +           else
> > +                   xid = xid + nr_cpus;
> > +
> > +           /*
> > +            * We've come back to the original last_xid,
> > +            * so we've exhausted the list.
> > +            */
> > +           if (xid == prev_xid) {
> > +                   xid = FC_XID_UNKNOWN;
> > +                   found = 1;
> 
> No need to set found.  Just do break or return FC_XID_UNKNOWN;
> 
> > +                   break;
> > +           }
> > +
> > +           /* Have we exceeded the max XIDs? */
> > +           if (xid > mp->max_xid) {
> > +                   xid = FC_XID_UNKNOWN;
> > +                   continue;
> > +           }
> > +
> > +           /*
> > +            * Always deal with real XIDs until now, where we
> > +            * need to look at the array index by subtracting one.
> > +            */
> > +           if (!mp->exches[xid - mp->min_xid]) {
> > +                   *per_cpu_ptr(mp->last_xid, cpu_idx) = xid;
> 
>                       return xid;
> 
> > +                   found = 1;
> > +                   break;
> > +           }
> > +   }
> > +   return xid;
> > +}
> > +
> >  /*
> >   * fc_exch_alloc - allocate an exchange and its exchange id.
> >   * @mp : ptr to the exchange manager
> > @@ -481,6 +524,7 @@ struct fc_exch *fc_exch_alloc(struct fc_exch_mgr *mp)
> >  {
> >     struct fc_exch *ep;
> >     u16 min, max, xid;
> > +   unsigned int cpu;
> >  
> >     /* allocate memory for exchange */
> >     ep = mempool_alloc(mp->ep_pool, GFP_ATOMIC);
> > @@ -494,14 +538,25 @@ struct fc_exch *fc_exch_alloc(struct fc_exch_mgr *mp)
> >     max = mp->max_xid;
> >  
> >     spin_lock_bh(&mp->em_lock);
> > -   xid = mp->last_xid;
> > -   /* alloc a new xid */
> > -   while (mp->exches[xid - min]) {
> > -           xid = (xid == max) ? min : xid + 1;
> > -           if (xid == mp->last_xid)
> > -                   goto err;
> > +
> > +   xid = fc_exch_xid_find(mp, smp_processor_id());
> > +   if (xid == FC_XID_UNKNOWN) {
> > +           /*
> > +            * It would be nice to have a more intelligent
> > +            * algorithm here instead of starting with CPU 0
> > +            * everytime.
> 
> typo:  every time.
> 
> > +            */
> 
> Use cpumask_next(cpu, cpu_online_mask).
> Stop when getting back to the original cpu.  Note that the current CPU
> could change, so doing smp_processor_id() in the loop isn't quite correct,
> although it'll work most of the time.
> 
> We could do a get_cpu()/put_cpu() to inhibit rescheduling.
> 
I need to think about this more.

> > +           for_each_online_cpu(cpu) {
> > +                   if (cpu == smp_processor_id())
> > +                           continue;
> > +                   xid = fc_exch_xid_find(mp, cpu);
> > +                   if (xid != FC_XID_UNKNOWN)
> > +                           break;
> > +           }
> >     }
> 
> Here's an alternative version, replacing above from first call to
> fc_exch_xid_find():
> 
>       cpu = smp_processor_id();
>       first_cpu = cpu;
>       do {
>               xid = fc_exch_xid_find(mp, cpu);
>               if (xid != FC_XID_UNKNOWN)
>                       goto found;
>       } while ((cpu = cpumask_next(cpu, cpu_online_mask)) != first_cpu);
>       return NULL;
> found:
> 
Done, thanks.


_______________________________________________
devel mailing list
[email protected]
http://www.open-fcoe.org/mailman/listinfo/devel
