On Tue, 2012-02-07 at 06:59 -0500, Neil Horman wrote:
> On Mon, Feb 06, 2012 at 03:34:00PM -0800, Vasu Dev wrote:
> > On Thu, 2012-02-02 at 16:03 -0500, Neil Horman wrote:
> > > fc_port->ema_list is being accessed in multiple parallel contexts with no
> > > locking around it leading to potential corruption.
> >
> > I don't see any possibility of ema_list corruption here and if there is
> I do, and I've tried to explain it here. It seems I've not been clear enough.
>
> > any such possible case/s then we should fix such case/s at source
> > instead ensuring list sanity by any un-necessary additional locking as
> What do think I'm trying to do here?
The patch adds additional ema_list locking that is not required: the scenario
described in the patch description is not possible, as I explained, and none
of the new flows described in your latest response look like real issues
either; more details on them below. The point is to fix any unintended access
to the ema_list instead of adding more locks around the list, since list
addition/deletion is already protected and fcoe create/destroy processing
should already be ensuring that there is no access to ema_list:

1) prior to its setup being fully done; all frames are ignored until the
setup is done, and that prevents any attempt to access an empty or
half-initialized ema_list.

2) while the list is being removed or after it has been removed; all pending
exchanges, IOs and receive flows are either already flushed or their frames
are ignored at this point, and that prevents any further access to ema_list
while it is being emptied or is already empty.

Given the above two points we should never reach the point of accessing
ema_list while it is empty or being modified. If we do, then we have to fix
those cases instead of adding more locking around ema_list. I don't preclude
such possibilities either, especially in destroy processing, which involves
various worker contexts and flushing of various things for all rports and the
lport/host/fcoe etc. If they are not fixed there, then we will have other
bugs accessing stale, already-freed objects in other blocks of the fcoe
stack; it is not just about accessing an empty or stale ema_list in that
case.

> >
> > that may hide real issue/s at source. This patch is based on code
> > observations and I've explanation why those cases are not issues here,
> > see more details below.
> >
> You're analysis below is incomplete.

Hopefully the additional details above complete that analysis, though I did
provide an explanation for the scenario described in the patch description
before.

> >
> > > Whats more the fc_find_ema
> > > function was manually accessing ema_list.prev without checking for a
> > > list_empty condition.
> > > The ema_list could easily be emtpy if an transation came in to the
> > > foce stack with an UNKNOWN xid prior to our adding a ema anchor to the
> > > list (dev_add_pack gets called before said add occurs).
> >
> > Yes dev_add_pack is called before done with ema_list setup with lport
> The fact that dev_add_pack is called early isn't overly relevent, it was just
> meant to illustrate one way in which ema_list.perv could be accessed without
> any elements on the list, which is a corrupting action. There are others.
>

As I explained above, we should not even get to the point of accessing
ema_list before its setup is done, or after it has been removed and has no
elements. My last response, with the paragraph broken up above and below,
explains why we won't get to accessing ema_list until its setup is fully
done, which addresses your comment above "(dev_add_pack gets called before
said add occurs)". Similarly, all the other cases should be covered by the
two points described above.

> > but added fcoe interface is not hooked up with lport until we are done
> > with complete lport config and that prevents getting to even lport which
> > holds ema_list head. So added packet handler fcoe_rcv() will drops all
> > frames untill lport/ema_list set at:-
> >
> > 	fcoe = container_of(ptype, struct fcoe_interface, fcoe_packet_type);
> > 	lport = fcoe->ctlr.lp;
> > 	if (unlikely(!lport)) {
> > 		FCOE_NETDEV_DBG(netdev, "Cannot find hba structure");
> > 		goto err2;
> > 	}
> >
> Ok, I'll try make this as clear as I can. Here are the concurrent paths as I
> see them:
>
> CPU0
> ====
> fcoe_rcv
>  fcoe_recv_frame
>   fc_exch_recv
>    fc_find_ema
>     list_for_each(lport->ema_list)
>
> CPU1
> ====
> fcoe_destroy_work
>  fcoe_if_destroy
>   fc_exch_mgr_free
>    fc_exch_mgr_del

By this time all pending IOs, exchanges and any in-flight flow, including the
CPU0 flow above, should already be flushed, or their frames will be ignored.
Also, just as we drop frames in fcoe_rcv while ema_list is being set up, we
start dropping all frames in fc_exch_recv once the lport is disabled:

	if (!lport || lport->state == LPORT_ST_DISABLED) {
		FC_LPORT_DBG(lport, "Receiving frames for an lport that "
			     "has not been initialized correctly\n");
		fc_frame_free(fp);
		return;
	}

The lport gets disabled by fc_lport_destroy() well before fc_exch_mgr_free()
is called to delete all the elements of ema_list.

>    list_del(lport->ema_list)
>
> CPU2
> ====
> fcoe_if_create
>  fcoe_em_config
>   fc_exch_mgr_add
>    list_add(lport->ema_list)

fc_exch_mgr_add and fc_exch_mgr_del are serialized by their callers, create
and destroy, through fcoe_config_mutex as I explained before, so it is not
possible for the CPU1 and CPU2 flows to run concurrently. If that happens we
have a bigger mess than just access to ema_list: can we allow create/destroy
to get this far on CPU1 and CPU2 concurrently? If that is possible then that
should be fixed.

> Notes:
> CPU0 derives the lport from the structure surrounding the packet_type struct,
> which isn't removed until after fcoe_if_destroy is called (via
> fcoe_interface_cleanup).
>

That is okay as long as the frames flushed by fcoe_interface_cleanup get
removed, and that should be the case since the lport is disabled. But just to
be safe we should detach lp from fcoe here, just as we don't attach it until
the lport is fully configured. How about doing this? I'll look more into this
patch's effects and do some testing.

diff --git a/drivers/scsi/fcoe/fcoe.c b/drivers/scsi/fcoe/fcoe.c
index 90ebc15..b005fe0 100644
--- a/drivers/scsi/fcoe/fcoe.c
+++ b/drivers/scsi/fcoe/fcoe.c
@@ -951,6 +951,9 @@ static void fcoe_if_destroy(struct fc_lport *lport)
 	/* Free memory used by statistical counters */
 	fc_lport_free_stats(lport);
 
+	/* Detach fcoe interface from lport to prevent any more ingress */
+	fcoe->ctlr.lp = NULL;
+
 	/* Release the Scsi_Host */
 	scsi_host_put(lport->host);

> CPU1 and CPU2 are serialized by the config_mutex.
> However, neither of them are serialized against CPU0, which runs in softirq
> context. And since fcoe_rcv determines the fcoe_ctlr and lport based on the
> struct surrounding the passed in packet_type, which isn't removed until the
> end of fcoe_destroy_work (via fcoe_interface_cleanup), its entirely possible
> for the path in CPU0 to run concurrently with a remove operation in CPU1,

Yes, that is possible, but the frames will be ignored because the lport is
already disabled, so we won't get to the point of accessing ema_list on CPU0;
and with the patch suggested above to detach lp, fcoe_rcv will drop the
frames right at the beginning.

> causing a list traversal at the
> same time as a list_del (read, corruption).

We should not allow any flow to reach that point, and should fix any flow
that does: if one got that far, a new request would allocate a new exch after
we are done with the exch reset and flushing in destroy, and that would cause
a crash later.

Thanks,
Vasu

> > BTW FCoE switch is not expected to send FCoE frame until lport logged
> > in but just in case safety check is already in place to drop fcoe frames
> > in that case.
> >
> The above scenario is on delete, which presumes that the lport is already
> logged in.
>
> > Also no possibility of list corruption during sysfs create/destroy as
> > all fcoe creation/destroys are serialized by global fcoe_config_mutex.
> >
> Correct, but I'm not talking about serialization of two list modifications,
> I'm talking about the serialization of a list traversal with a modification.
>
> > May be I'm missing something among lengthy side references after some
> > good review comments discussion points from Bhanu on initial patch, in
> > that case help me understand in what specific scenario this patch will
> > prevent a possible bug.
> >
> Please see above
>
> Thanks
> Neil
>
> > Thanks
> > Vasu
> >
> > >
> > > Fix it by adding an ema_lock to list accesses.
> > > Make it an rcu lock so we don't
> > > get Warnings when we try to tx a frame with a spinlock held
> > >
> > > Change notes:
> > > v2) Removed pool->lock and ex_lock changes that had incorectly crept into
> > > this commit
> > >
> > > Signed-off-by: Neil Horman <nhor...@tuxdriver.com>
> > > CC: kiran.pa...@intel.com
> > > CC: bprak...@broadcom.com
> > > CC: robert.w.l...@intel.com
> > > ---
> > >  drivers/scsi/libfc/fc_exch.c |   82 ++++++++++++++++++++++++++++++++---------
> > >  include/scsi/libfc.h         |    2 +
> > >  2 files changed, 66 insertions(+), 18 deletions(-)
> > >
> > > diff --git a/drivers/scsi/libfc/fc_exch.c b/drivers/scsi/libfc/fc_exch.c
> > > index 630291f..3c03c87 100644
> > > --- a/drivers/scsi/libfc/fc_exch.c
> > > +++ b/drivers/scsi/libfc/fc_exch.c
> > > @@ -130,6 +130,7 @@ struct fc_exch_mgr_anchor {
> > >  	struct list_head ema_list;
> > >  	struct fc_exch_mgr *mp;
> > >  	bool (*match)(struct fc_frame *);
> > > +	struct rcu_head rcu;
> > >  };
> > >
> > >  static void fc_exch_rrq(struct fc_exch *);
> > > @@ -801,9 +802,13 @@ static inline struct fc_exch *fc_exch_alloc(struct fc_lport *lport,
> > >  {
> > >  	struct fc_exch_mgr_anchor *ema;
> > >
> > > -	list_for_each_entry(ema, &lport->ema_list, ema_list)
> > > -		if (!ema->match || ema->match(fp))
> > > +	rcu_read_lock();
> > > +	list_for_each_entry_rcu(ema, &lport->ema_list, ema_list)
> > > +		if (!ema->match || ema->match(fp)) {
> > > +			rcu_read_unlock();
> > >  			return fc_exch_em_alloc(lport, ema->mp);
> > > +		}
> > > +	rcu_read_unlock();
> > >  	return NULL;
> > >  }
> > >
> > > @@ -1329,10 +1334,13 @@ static struct fc_seq *fc_seq_assign(struct fc_lport *lport, struct fc_frame *fp)
> > >  	WARN_ON(fr_seq(fp));
> > >  	fr_seq(fp) = NULL;
> > >
> > > -	list_for_each_entry(ema, &lport->ema_list, ema_list)
> > > +	rcu_read_lock();
> > > +	list_for_each_entry_rcu(ema, &lport->ema_list, ema_list)
> > >  		if ((!ema->match || ema->match(fp)) &&
> > >  		    fc_seq_lookup_recip(lport, ema->mp, fp) == FC_RJT_NONE)
> > >  			break;
> > > +
> > > +	rcu_read_unlock();
> > >  	return fr_seq(fp);
> > >  }
> > >
> > > @@ -1817,12 +1825,14 @@ void fc_exch_mgr_reset(struct fc_lport *lport, u32 sid, u32 did)
> > >  	struct fc_exch_mgr_anchor *ema;
> > >  	unsigned int cpu;
> > >
> > > -	list_for_each_entry(ema, &lport->ema_list, ema_list) {
> > > +	rcu_read_lock();
> > > +	list_for_each_entry_rcu(ema, &lport->ema_list, ema_list) {
> > >  		for_each_possible_cpu(cpu)
> > >  			fc_exch_pool_reset(lport,
> > >  					   per_cpu_ptr(ema->mp->pool, cpu),
> > >  					   sid, did);
> > >  	}
> > > +	rcu_read_unlock();
> > >  }
> > >  EXPORT_SYMBOL(fc_exch_mgr_reset);
> > >
> > > @@ -1836,11 +1846,17 @@ EXPORT_SYMBOL(fc_exch_mgr_reset);
> > >  static struct fc_exch *fc_exch_lookup(struct fc_lport *lport, u32 xid)
> > >  {
> > >  	struct fc_exch_mgr_anchor *ema;
> > > +	struct fc_exch *exch = NULL;
> > >
> > > -	list_for_each_entry(ema, &lport->ema_list, ema_list)
> > > -		if (ema->mp->min_xid <= xid && xid <= ema->mp->max_xid)
> > > -			return fc_exch_find(ema->mp, xid);
> > > -	return NULL;
> > > +	rcu_read_lock();
> > > +	list_for_each_entry_rcu(ema, &lport->ema_list, ema_list) {
> > > +		if (ema->mp->min_xid <= xid && xid <= ema->mp->max_xid) {
> > > +			exch = fc_exch_find(ema->mp, xid);
> > > +			break;
> > > +		}
> > > +	}
> > > +	rcu_read_unlock();
> > > +	return exch;
> > >  }
> > >
> > >  /**
> > > @@ -2166,6 +2182,7 @@ struct fc_exch_mgr_anchor *fc_exch_mgr_add(struct fc_lport *lport,
> > >  					   bool (*match)(struct fc_frame *))
> > >  {
> > >  	struct fc_exch_mgr_anchor *ema;
> > > +	unsigned long flags;
> > >
> > >  	ema = kmalloc(sizeof(*ema), GFP_ATOMIC);
> > >  	if (!ema)
> > > @@ -2174,8 +2191,10 @@ struct fc_exch_mgr_anchor *fc_exch_mgr_add(struct fc_lport *lport,
> > >  	ema->mp = mp;
> > >  	ema->match = match;
> > >  	/* add EM anchor to EM anchors list */
> > > -	list_add_tail(&ema->ema_list, &lport->ema_list);
> > > +	spin_lock_irqsave(&lport->ema_lock, flags);
> > > +	list_add_tail_rcu(&ema->ema_list, &lport->ema_list);
> > >  	kref_get(&mp->kref);
> > > +	spin_unlock_irqrestore(&lport->ema_lock, flags);
> > >  	return ema;
> > >  }
> > >  EXPORT_SYMBOL(fc_exch_mgr_add);
> > > @@ -2193,16 +2212,26 @@ static void fc_exch_mgr_destroy(struct kref *kref)
> > >  	kfree(mp);
> > >  }
> > >
> > > +static void del_ema_rcu_cb(struct rcu_head *head)
> > > +{
> > > +	struct fc_exch_mgr_anchor *ema;
> > > +
> > > +	ema = container_of(head, struct fc_exch_mgr_anchor, rcu);
> > > +
> > > +	kref_put(&ema->mp->kref, fc_exch_mgr_destroy);
> > > +	kfree(ema);
> > > +}
> > > +
> > >  /**
> > >   * fc_exch_mgr_del() - Delete an EM from a local port's list
> > >   * @ema: The exchange manager anchor identifying the EM to be deleted
> > > + * NOTE: Assumes that the ema_lock for the list this anchor is on is held
> > >   */
> > >  void fc_exch_mgr_del(struct fc_exch_mgr_anchor *ema)
> > >  {
> > >  	/* remove EM anchor from EM anchors list */
> > > -	list_del(&ema->ema_list);
> > > -	kref_put(&ema->mp->kref, fc_exch_mgr_destroy);
> > > -	kfree(ema);
> > > +	list_del_rcu(&ema->ema_list);
> > > +	call_rcu(&ema->rcu, del_ema_rcu_cb);
> > >  }
> > >  EXPORT_SYMBOL(fc_exch_mgr_del);
> > >
> > > @@ -2214,15 +2243,21 @@ EXPORT_SYMBOL(fc_exch_mgr_del);
> > >  int fc_exch_mgr_list_clone(struct fc_lport *src, struct fc_lport *dst)
> > >  {
> > >  	struct fc_exch_mgr_anchor *ema, *tmp;
> > > +	unsigned long flags;
> > >
> > > +	spin_lock_irqsave(&src->ema_lock, flags);
> > >  	list_for_each_entry(ema, &src->ema_list, ema_list) {
> > >  		if (!fc_exch_mgr_add(dst, ema->mp, ema->match))
> > >  			goto err;
> > >  	}
> > > +	spin_unlock_irqrestore(&src->ema_lock, flags);
> > >  	return 0;
> > >  err:
> > > +	spin_unlock_irqrestore(&src->ema_lock, flags);
> > > +	spin_lock_irqsave(&dst->ema_lock, flags);
> > >  	list_for_each_entry_safe(ema, tmp, &dst->ema_list, ema_list)
> > >  		fc_exch_mgr_del(ema);
> > > +	spin_unlock_irqrestore(&dst->ema_lock, flags);
> > >  	return -ENOMEM;
> > >  }
> > >  EXPORT_SYMBOL(fc_exch_mgr_list_clone);
> > > @@ -2322,10 +2357,13 @@ EXPORT_SYMBOL(fc_exch_mgr_alloc);
> > >  void fc_exch_mgr_free(struct fc_lport *lport)
> > >  {
> > >  	struct fc_exch_mgr_anchor *ema, *next;
> > > +	unsigned long flags;
> > >
> > >  	flush_workqueue(fc_exch_workqueue);
> > > +	spin_lock_irqsave(&lport->ema_lock, flags);
> > >  	list_for_each_entry_safe(ema, next, &lport->ema_list, ema_list)
> > >  		fc_exch_mgr_del(ema);
> > > +	spin_unlock_irqrestore(&lport->ema_lock, flags);
> > >  }
> > >  EXPORT_SYMBOL(fc_exch_mgr_free);
> > >
> > > @@ -2340,24 +2378,32 @@ static struct fc_exch_mgr_anchor *fc_find_ema(u32 f_ctl,
> > >  					      struct fc_lport *lport,
> > >  					      struct fc_frame_header *fh)
> > >  {
> > > -	struct fc_exch_mgr_anchor *ema;
> > >  	u16 xid;
> > > +	struct fc_exch_mgr_anchor *ema = NULL;
> > >
> > > +	rcu_read_lock();
> > >  	if (f_ctl & FC_FC_EX_CTX)
> > >  		xid = ntohs(fh->fh_ox_id);
> > >  	else {
> > >  		xid = ntohs(fh->fh_rx_id);
> > > -		if (xid == FC_XID_UNKNOWN)
> > > -			return list_entry(lport->ema_list.prev,
> > > +		if (xid == FC_XID_UNKNOWN) {
> > > +
> > > +			ema = list_empty(&lport->ema_list) ?
> > > +			      NULL :
> > > +			      list_entry(lport->ema_list.prev,
> > >  					  typeof(*ema), ema_list);
> > > +			goto out;
> > > +		}
> > >  	}
> > >
> > > -	list_for_each_entry(ema, &lport->ema_list, ema_list) {
> > > +	list_for_each_entry_rcu(ema, &lport->ema_list, ema_list) {
> > >  		if ((xid >= ema->mp->min_xid) &&
> > >  		    (xid <= ema->mp->max_xid))
> > > -			return ema;
> > > +			goto out;
> > >  	}
> > > -	return NULL;
> > > +out:
> > > +	rcu_read_unlock();
> > > +	return ema;
> > >  }
> > >
> > >  /**
> > >   * fc_exch_recv() - Handler for received frames
> > > diff --git a/include/scsi/libfc.h b/include/scsi/libfc.h
> > > index 8f9dfba..5627d99 100644
> > > --- a/include/scsi/libfc.h
> > > +++ b/include/scsi/libfc.h
> > > @@ -850,6 +850,7 @@ struct fc_lport {
> > >  	/* Associations */
> > >  	struct Scsi_Host *host;
> > >  	struct list_head ema_list;
> > > +	spinlock_t ema_lock;
> > >  	struct fc_rport_priv *dns_rdata;
> > >  	struct fc_rport_priv *ms_rdata;
> > >  	struct fc_rport_priv *ptp_rdata;
> > > @@ -1024,6 +1025,7 @@ libfc_host_alloc(struct scsi_host_template *sht, int priv_size)
> > >  	lport = shost_priv(shost);
> > >  	lport->host = shost;
> > >  	INIT_LIST_HEAD(&lport->ema_list);
> > > +	spin_lock_init(&lport->ema_lock);
> > >  	INIT_LIST_HEAD(&lport->vports);
> > >  	return lport;
> > >  }
> > >

_______________________________________________
devel mailing list
devel@open-fcoe.org
https://lists.open-fcoe.org/mailman/listinfo/devel