Hello,

</snip>
> >     it seems to me we may leak `pd` we took at line 1944. The leak
> >     happens in case timeout_del() return zero.
> > 
> >     diff below ignores the return value from timeout_del() and makes
> >     sure we always call pfsync_undefefer()
> > 
> >     I have not seen such panic in my test environment. Would you be able
> >     to give my diff a try?
> 
> i hit a different fault in this code recently, so i've been looking
> at the code again. my fix for the second fault will conflict with this
> one. i'll send it out to tech@ separately though.
> 

    I've just sent my OK to your diff.

> your diff looks like it moves the handling of undefer until after pf
> drops its locks. i was just going to make pfsync schedule a task or
> softint immediately if the queue was getting long, and then stop doing
> deferrals if it was way too long.

    that's also alternative I was considering. I think I still have it
    in somewhere in history. Both approaches are fine I think. Scheduling
    a task looks less ugly. However as I was typing the code I've started
    to realize it might be better to let packet to dispatch the deferral.
    because this will share resources in more natural way.

    from my point of view the story goes like that:

        timeout does not dispatch deferrals fast enough, queue grows

        packet needs to insert new deferral to queue

        new deferral does not fit to queue, options are:

            remove one deferral from queue to make space for
            new one. packet then must dispatch the old deferral.
            this conserves resources because packet (network subsystem)
            pays cost (memory & cpu) for deferral processing right away.

            schedule yet another task to dispatch deferral. My concern
            here is that it allows network subsystem to allocate
            more and more resources. If deferral does not fit
            into regular timeout queue, we just let pfsync to create
            an 'express task' to dispatch old deferral.


    I see both those changes as 'interim fix'. We may be able to
    restore original code we currently have in pfsync_defer() as soon
    as we will shrink scope of PF_LOCK. Once PF_LOCK will be
    in better shape, we will be able to make pfsync_defer() nice
    and simple again.


thanks and
regards
sashan

Reply via email to