On Thu, Jun 30, 2022 at 11:51:52AM +0200, Claudio Jeker wrote:
> After discussing this with mpi@ and jmatthew@ we came to the conclusion
> that we need to smr_barrier() before refcnt_finalize() to ensure that no
> other CPU is between the SMR_TAILQ_FOREACH, refcnt_take() and
> smr_read_leave().

[...]

> @@ -509,7 +487,8 @@ route_input(struct mbuf *m0, struct sock
>               return;
>       }
>  
> -     SRPL_FOREACH(rop, &sr, &rtptable.rtp_list, rop_list) {
> +     smr_read_enter();
> +     SMR_TAILQ_FOREACH(rop, &rtptable.rtp_list, rop_list) {
>               /*
>                * If route socket is bound to an address family only send
>                * messages that match the address family. Address family
> @@ -519,7 +498,8 @@ route_input(struct mbuf *m0, struct sock
>                   rop->rop_proto != sa_family)
>                       continue;
>  
> -
> +             refcnt_take(&rop->rop_refcnt);
> +             smr_read_leave();
>               so = rop->rop_socket;
>               solock(so);
>  
> @@ -579,8 +559,10 @@ route_input(struct mbuf *m0, struct sock
>               rtm_sendup(so, m);
>  next:
>               sounlock(so);
> +             smr_read_enter();
> +             refcnt_rele_wake(&rop->rop_refcnt);

This does not look correct.

smr_barrier() can proceed as soon as this thread has called
smr_read_leave(). refcnt_rele_wake() might then drop the final
reference, after which this thread can no longer access rop safely,
yet SMR_TAILQ_FOREACH() still evaluates SMR_TAILQ_NEXT(rop). The
smr_read_enter() right before the release does not help: rop may
already have been unlinked, and the barrier completed, while this
thread was outside the read-side critical section.
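
To spell out the window, here is the reader side from the diff with
the hazard annotated (annotations mine, unrelated parts elided):

```c
	smr_read_enter();
	SMR_TAILQ_FOREACH(rop, &rtptable.rtp_list, rop_list) {
		...
		refcnt_take(&rop->rop_refcnt);
		smr_read_leave();
		/* Window: no read-side protection; rop is pinned only by
		 * the reference.  A writer can unlink rop here and its
		 * smr_barrier() completes. */
		so = rop->rop_socket;
		solock(so);
		...
		sounlock(so);
		smr_read_enter();	/* too late for an unlinked rop */
		refcnt_rele_wake(&rop->rop_refcnt);
		/* If that was the final reference, refcnt_finalize() in
		 * the writer returns and rop is freed, yet the loop still
		 * evaluates SMR_TAILQ_NEXT(rop). */
	}
	smr_read_leave();
```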

Also, SMR_TAILQ_NEXT() of rop becomes potentially dangling after
smr_read_leave(). While this thread is outside the read-side critical
section, another thread might unlink and free rop's successor; the
reference held on rop keeps rop itself alive but does not protect the
next pointer stored in it.
