On Thu, 2010-05-20 at 17:36 -0700, Joe Eykholt wrote:
> Vasu Dev wrote:
> > On Thu, 2010-05-20 at 12:22 -0700, Joe Eykholt wrote:
> >> Vasu Dev wrote:
> >>> On Wed, 2010-05-19 at 11:16 -0700, Joe Eykholt wrote:
> >>>>>> */
> >>>>>> -static void fcoe_ctlr_age_fcfs(struct fcoe_ctlr *fip)
> >>>>>> +static unsigned long fcoe_ctlr_age_fcfs(struct fcoe_ctlr *fip)
> >>>>>> {
> >>>>>> struct fcoe_fcf *fcf;
> >>>>>> struct fcoe_fcf *next;
> >>>>>> + unsigned long next_timer = jiffies +
> >>>> msecs_to_jiffies(FIP_DEF_FKA);
> >>>>
> >>>> Also, this should be further out than 8 seconds, in case there's
> >>>> nothing sooner needed. That's why I used the 90 second time before.
> >>>> The rest of the code in age_fcfs or the timeout should come up with
> >>>> a closer value unless the 90 second keep-alive is the next thing
> >>>> needed
> >>>> or there's nothing at all to do.
> >>>>
> >>> This default value mostly gets written when fd_flag is zero due to
> >>> sooner E node KA with typical 8 sec KA period and large value was always
> >>> effective in case fd_flag set that was an issue since in that case
> >>> periodic ka timeouts didn't kick after 90 second of fcf selection while
> >>> fd_flag set.
> >> I don't really understand. If there's any FCF at all, it should set
> >> next_timer to the 1.5 * FKA since the oldest advertisment received.
> >> With a selected FCF that has the flag on it should fire
> >> every 1.5 * FKA to check for a missing advertisement.
> >> With the change above, it'll fire after 1 FKA, find nothing todo
> >> and then reschedule for .5 FKA later, right?
> >>
> >
> > You are right, I must have tried something else in which I noticed
> > default next_timer used as 90 sec. I don't see with this patch.
> >
> >> I may have missed something.
> >>
> >>> I can roll back this change but then have to set KA period timeout for
> >>> this case when fcf is selected in fcoe_ctlr_timeout, so it would add few
> >>> more lines there after fcoe_ctlr_select calling as:-
> >>>
> >>> if (sel && sel->fd_flags &&
> >>> time_after(next_timer, sel->time + sel->fka_period))
> >>> next_timer = sel->time + fcf->fka_period;
> >> But the code you added in age_fcfs also does something like that:
> >>
> >> if (fcf->fd_flags &&
> >> time_after(next_timer, jiffies +
> >> fcf->fka_period))
> >> next_timer = jiffies + fcf->fka_period;
> >>
> >> This is after checks that the FCF is selected. It's setting the time
> >> to now + FKA, not the last advertisement + FKA, though.
> >> I'm not sure why that's needed, either. Since other code already accounts
> >> for the 1.5 * FKA check and the 2.5 * FKA timeout.
> >>
> >
> > That is to catch missed adv while d flag set as we talked in other
> > response.
> >
> >>> I'm fine with either and let me know if you want me to change this.
> >> I would put back the change to initializing next_timer, and take out
> >> the code quoted above. Is that OK?
> >
> > OK, but then it won't catch missed adv if none received just after fcf
> > was selected since TO would be set to 2.5 * FKA in that case. It would
> > be very unlikely case and would just miss missed adv count and prints.
> > However if we want that also not to occur then next timer should be
> > updated just after fcf selected for the D bit case.
>
> I added code similar to what you suggested after the first successful
> select, which is independent of the flag and will run the timeout
> when the first ctlr_ka is due. That's early to look for missing
> advertisements, but then age_fcfs will reschedule it.
>
> This stuff is surprisingly complicated.
Yes indeed. I'm glad you fixed this very unlikely case though I was
okay w/o that since age out was okay in either case but anyways you did
it and that too without additional flag checking. Thanks
> Maybe there's a better way.
> Anyway, below are the differences from your patch to the result of the
> two patches. It's cut & paste so may not apply.
> This is just for review. All these changes are incorporated in the
> patch that I'll send out in a little bit.
>
> --- libfcoe.c.vasu 2010-05-20 10:58:07.000000000 -0700
> +++ libfcoe.c 2010-05-20 17:18:18.000000000 -0700
> @@ -565,32 +565,28 @@ EXPORT_SYMBOL(fcoe_ctlr_els_send);
> *
> * Also, increment the MissDiscAdvCount when no advertisement is received
> * for the corresponding FCF for 1.5 * FKA_ADV_PERIOD (FC-BB-5 LESB).
> *
> * Returns the time in jiffies for the next call.
> */
> static unsigned long fcoe_ctlr_age_fcfs(struct fcoe_ctlr *fip)
> {
> struct fcoe_fcf *fcf;
> struct fcoe_fcf *next;
> - unsigned long next_timer = jiffies + msecs_to_jiffies(FIP_DEF_FKA);
> + unsigned long next_timer = jiffies + msecs_to_jiffies(FIP_VN_KA_PERIOD);
> unsigned long deadline;
> unsigned long sel_time = 0;
> struct fcoe_dev_stats *stats;
>
> list_for_each_entry_safe(fcf, next, &fip->fcfs, list) {
> deadline = fcf->time + fcf->fka_period + fcf->fka_period / 2;
> if (fip->sel_fcf == fcf) {
> - if (fcf->fd_flags &&
> - time_after(next_timer, jiffies + fcf->fka_period))
> - next_timer = jiffies + fcf->fka_period;
> -
> if (time_after(jiffies, deadline)) {
> stats = per_cpu_ptr(fip->lp->dev_stats,
> smp_processor_id());
> stats->MissDiscAdvCount++;
> printk(KERN_INFO "libfcoe: host%d: "
> "Missing Discovery Advertisement "
> "for fab %16.16llx count %lld\n",
> fip->lp->host->host_no, fcf->fabric_name,
> stats->MissDiscAdvCount);
> } else if (time_after(next_timer, deadline))
> @@ -609,21 +605,21 @@ static unsigned long fcoe_ctlr_age_fcfs(
> smp_processor_id());
> stats->VLinkFailureCount++;
> } else {
> if (time_after(next_timer, deadline))
> next_timer = deadline;
> if (fcoe_ctlr_mtu_valid(fcf) &&
> (!sel_time || time_before(sel_time, fcf->time)))
> sel_time = fcf->time;
> }
> }
> - if (!(fip->sel_fcf || fip->sel_time)) {
> + if (sel_time && !fip->sel_fcf && !fip->sel_time) {
> sel_time += msecs_to_jiffies(FCOE_CTLR_START_DELAY);
> fip->sel_time = sel_time;
> }
>
> return next_timer;
> }
>
> /**
> * fcoe_ctlr_parse_adv() - Decode a FIP advertisement into a new FCF entry
> * @fip: The FCoE controller receiving the advertisement
> @@ -1183,20 +1179,22 @@ static void fcoe_ctlr_timeout(unsigned l
> if (sel != fcf) {
> fcf = sel; /* the old FCF may have been freed */
> if (sel) {
> printk(KERN_INFO "libfcoe: host%d: FIP selected "
> "Fibre-Channel Forwarder MAC %pM\n",
> fip->lp->host->host_no, sel->fcf_mac);
> memcpy(fip->dest_addr, sel->fcf_mac, ETH_ALEN);
> fip->port_ka_time = jiffies +
> msecs_to_jiffies(FIP_VN_KA_PERIOD);
> fip->ctlr_ka_time = jiffies + sel->fka_period;
> + if (time_after(next_timer, fip->ctlr_ka_time))
> + next_timer = fip->ctlr_ka_time;
> } else {
> printk(KERN_NOTICE "libfcoe: host%d: "
> "FIP Fibre-Channel Forwarder timed out. "
> "Starting FCF discovery.\n",
> fip->lp->host->host_no);
> fip->reset_req = 1;
> schedule_work(&fip->timer_work);
> }
> }
>
looks good.
> I put the changes to the D-flag in a separate patch, for review at least.
> Maybe you can try that sometime? I don't have a switch that lets me
> set the D flag (as far as I know). I guess I could modify fcgw for that
> someday in my spare time ...
I'll test your patches w/ D-flag switch.
Vasu
_______________________________________________
devel mailing list
[email protected]
http://www.open-fcoe.org/mailman/listinfo/devel