eugenelisstreams,

On Thu, 15 Dec 2005, [EMAIL PROTECTED] wrote:

> 
>    Hello,
> 
>    I have encountered the following message loss problem in LiS-2.18.0.
> 
>    My application uses putpmsg() function to send messages.
> 
>    LiS executes the following call sequence:
> 
>    lis_strputpmsg()     ->     lis_putnext    ==>    lis_safe_putnext()->
>    lis_safe_do_putmsg()
>    In the lis_safe_do_putmsg() there is code fragment:
> 
>        if ((err=lis_lockqf(q, f,l)) < 0)  <--- can fail due to EINTR
>        {
>            freemsg(mp) ;  /* busted */   <-- message dropped
>            return(1) ;
>        }

I'd suggest the following instead:

        while ((err = lis_lockqf(q, f, l)) < 0) ;

>    If application has periodical timer clicking, the
>    aforementioned lis_lockq() call can fail with the EINTR
>    if it was "lucky" to get hit by the timer while waiting on that lock.
> 
>    The message is dropped silently in that case.
>    Without   any  notification  sent  to  the  originating  putpmsg()  in
>    application.
>    [Since the lis_putenext() returns void to callers]
> 
>    This is not good. I.e. it is bad.  Message loss. :(
>    I'm   curious   if  other  people  experienced  similar  message  loss
>    scenarios.
>    Note, that lis_safe_do_putmsg(), via lis_putnext(), is called from
>    quite a number of other places.
> 
>    BTW, that was not a problem in LiS-2.16.18 due to a different
>    lis_lockq() implementation.

lis_lockqf() will only return an error if `q' has qlock syncrhonization
specified (the function merely returns zero if q->q_qsp is NULL).  This
feature was not even present in 2.16.18, so I'm a little confused by the
comparison.  Your driver or module needed to be MP-safe under 2.16.18,
so I don't understand why you are using the dubious q synchronization
under LiS 2.18.  (Dave warned us that q syncrhonization as not
thouroughly tested.)  Perhaps you are just allowing your driver or
module to default (which would be LIS_QLOCK_QUEUE), instead of
specifying LIS_QLOCK_NONE.

>    I'm looking for an advice on how we can fix this problem.
>    (Perhaps Brian or Dan Gora can advice on this).

I can see several places in the code that are as bad, or worse, if you
are using the experimental q syncrhonization.  For example, there are
two instances in lis_qdetach where the return value is ignored.

> 
>    What I'm thinking about is if in the lis_safe_do_putmsg() function,
>    instead of dropping message when lis_lockq() fails, we could
>    place it into deferred queue?
> 
>    In fact, in lis_safe_do_putmsg() function there are 2 calls
>    to the lis_defer_msg() function already.

The problem is that lis_lockqf acquires a semaphore and can sleep.
lis_defer_msg requires that a queue ISR lock be held across the
call (and condition tests) to avoid races.  The one cannot be acquired
within the other.

Another problem is that, unlike procsoff and freezestr, the function
does not check for or process a backlog when the syncq is unlocked.
That is lis_defer_msg never schedules anything if there is no service
procedure for the queue.  lis_qproson() and lis_unfreezestr() on the
other hand call lis_do_defferred_puts() that process the backlog.

Did you notice the race between deferring a message and locking the
syncq?  An MP-safe driver or module feeding putnext for a q sync'ed
module or driver directly from its put procedure can misorder messages.

It would require a lot of rework of the q syncrhonization to fix things.
Just set your driver or module LIS_QLOCK_NONE to get 2.16.18 behaviour,
or better still, use Linux Fast-STREAMS.

> 
>    Can we call lis_defer_msg() function when lis_lockq() fails there?

No, you'd be shoving the message into a black hole only to be freed on
close, that is no different than freeing it.

> 
>    I'll appreciate all advices on the matter.
> 
>    thanks,
>    --
>    Eugene

--brian

-- 
Brian F. G. Bidulock    ¦ The reasonable man adapts himself to the ¦
[EMAIL PROTECTED]    ¦ world; the unreasonable one persists in  ¦
http://www.openss7.org/ ¦ trying  to adapt the  world  to himself. ¦
                        ¦ Therefore  all  progress  depends on the ¦
                        ¦ unreasonable man. -- George Bernard Shaw ¦
_______________________________________________
Linux-streams mailing list
[email protected]
http://gsyc.escet.urjc.es/mailman/listinfo/linux-streams

Reply via email to