Re: [External] : ipsec acquire mutex refcount
Hello, > > > +ipsp_delete_acquire_locked(struct ipsec_acquire *ipa) > > > +{ > > > + if (timeout_del(>ipa_timeout) == 1) > > > + refcnt_rele(>ipa_refcnt); > >^^ > > can we also put ASSERT/check into this branch > > to verify we are no releasing the last > > reference to ipa. I suspect we might be doing > > an extra reference drop here. > > Later we call ipsp_unref_acquire_locked() and refcnt_rele() again. > This will KASSERT(refcnt != ~0) in the case you describe. > > > I believe this ASSERT > > would be hit if we will compile kernel without IPSEC. > > we grab the extra reference iff are adding a timer. > > We release the extra reference only iff we delete the timer > that was added before. I see your point. it makes sense. thanks for clarification. thanks and regards sashan
Re: [External] : ipsec acquire mutex refcount
On Mon, Mar 14, 2022 at 11:09:48AM +0100, Alexandr Nedvedicky wrote: > 667 ipsec_delete_policy(struct ipsec_policy *ipo) > > 668 { > 669 struct ipsec_acquire *ipa; > 670 struct radix_node_head *rnh; > 671 struct radix_node *rn = (struct radix_node *)ipo; > > 672 > 673 NET_ASSERT_LOCKED(); > > 674 > 675 if (refcnt_rele(>ipo_refcnt) == 0) > > 676 return 0; > > 677 > 678 /* Delete from SPD. */ > 679 if ((rnh = spd_table_get(ipo->ipo_rdomain)) == NULL || > 680 rn_delete(>ipo_addr, >ipo_mask, rnh, rn) == NULL) > > 681 return (ESRCH); > > the suggestion is not related to your changes, but I think it's worth > to consider as a different diff. I think we should either panic() > at line 681 or we should at least log error. If I understand code > right we delete policy (ipo), which has been fully created so this > error should not happen here. Maybe. If you send a diff, I will test it. > > +ipsp_delete_acquire_locked(struct ipsec_acquire *ipa) > > +{ > > + if (timeout_del(>ipa_timeout) == 1) > > + refcnt_rele(>ipa_refcnt); >^^ > can we also put ASSERT/check into this branch > to verify we are no releasing the last > reference to ipa. I suspect we might be doing > an extra reference drop here. Later we call ipsp_unref_acquire_locked() and refcnt_rele() again. This will KASSERT(refcnt != ~0) in the case you describe. > I believe this ASSERT > would be hit if we will compile kernel without IPSEC. > we grab the extra reference iff are adding a timer. We release the extra reference only iff we delete the timer that was added before. > I think this whole ipsp_delete_acquire() logic should > be slightly different. We should call ipsp_unref_acquire_locked(), > when timeout_del() returns 1, because function managed to unschedule > the pending timer. If it returns zero, we should leave things as > they are because timer is running already, waiting for > ipsec_acquire_mtx. As soon as timer will grab mutex, it will > drop the reference for ipa. That's my understanding. > if we go this way then we should not need to grab extra > reference for timer. I copied this logic from tdb. We refcount the pending timers. By checking the return value of timeout_add() and timeout_del() the refcount is correct. And the running timer is counted as reference. At least this works fine with tdb. > > struct ipsec_acquire *ipa; > ^^^ > this needs to be initialized to NULL As mentioned earlier, TAILQ_FOREACH(ipa,...) always initializes ipa. > > - timeout_add_sec(>ipa_timeout, ipsec_expire_acquire); > > + if (timeout_add_sec(>ipa_timeout, ipsec_expire_acquire) == 1) > > refcnt_take(>ipa_refcnt); > ^^^ > line is too long Fixed before commit.
Re: [External] : ipsec acquire mutex refcount
Hello, On Tue, Mar 15, 2022 at 12:58:48AM +0300, Vitaliy Makkoveev wrote: > > > >I think local var `ipa` needs to be initialized to NULL > >to avoid random value/pointer when no `ipa` for given `seq` > >is found. > > > >ipsp_pending_acquire() plays the same gamble. > > #define TAILQ_FOREACH(var, head, field) \ > for((var) = TAILQ_FIRST(head);\ > (var) != TAILQ_END(head); \ > (var) = TAILQ_NEXT(var, field)) > > TAILQ_END() defined as NULL. So it will be NULL when the > whole `ipsec_acquire_head’ was processed but `ipa’ was > not found. > > Also the initial `ipa’ value will be overwritten within > the TAILQ_FOREACH() loop processing. In all cases. I see now. you are absolutely right. I don't insist on those tweaks any more. thanks and regards sashan
Re: [External] : ipsec acquire mutex refcount
> On 15 Mar 2022, at 00:45, Alexandr Nedvedicky > wrote: > > Hello, > > On Tue, Mar 15, 2022 at 12:37:00AM +0300, Vitaliy Makkoveev wrote: >> Hi, >> >> Why do you want to initialize `ipa’ variable in >> ipsp_pending_acquire() and ipsec_get_acquire()? This doesn’t >> require. > >after looking at code with bluhm's diff applied I see this: > > 936 struct ipsec_acquire * > 937 ipsec_get_acquire(u_int32_t seq) > > 938 { > 939 struct ipsec_acquire *ipa; > > 940 > 941 NET_ASSERT_LOCKED(); > > 942 > 943 mtx_enter(_acquire_mtx); > 944 TAILQ_FOREACH(ipa, _acquire_head, ipa_next) { > 945 if (ipa->ipa_seq == seq) { > 946 refcnt_take(>ipa_refcnt); > 947 break; > 948 } > 949 } > 950 mtx_leave(_acquire_mtx); > > 951 > 952 return ipa; > > 953 } > >I think local var `ipa` needs to be initialized to NULL >to avoid random value/pointer when no `ipa` for given `seq` >is found. > >ipsp_pending_acquire() plays the same gamble. #define TAILQ_FOREACH(var, head, field) \ for((var) = TAILQ_FIRST(head);\ (var) != TAILQ_END(head); \ (var) = TAILQ_NEXT(var, field)) TAILQ_END() defined as NULL. So it will be NULL when the whole `ipsec_acquire_head’ was processed but `ipa’ was not found. Also the initial `ipa’ value will be overwritten within the TAILQ_FOREACH() loop processing. In all cases.
Re: [External] : ipsec acquire mutex refcount
Hello, On Tue, Mar 15, 2022 at 12:37:00AM +0300, Vitaliy Makkoveev wrote: > Hi, > > Why do you want to initialize `ipa’ variable in > ipsp_pending_acquire() and ipsec_get_acquire()? This doesn’t > require. after looking at code with bluhm's diff applied I see this: 936 struct ipsec_acquire * 937 ipsec_get_acquire(u_int32_t seq) 938 { 939 struct ipsec_acquire *ipa; 940 941 NET_ASSERT_LOCKED(); 942 943 mtx_enter(_acquire_mtx); 944 TAILQ_FOREACH(ipa, _acquire_head, ipa_next) { 945 if (ipa->ipa_seq == seq) { 946 refcnt_take(>ipa_refcnt); 947 break; 948 } 949 } 950 mtx_leave(_acquire_mtx); 951 952 return ipa; 953 } I think local var `ipa` needs to be initialized to NULL to avoid random value/pointer when no `ipa` for given `seq` is found. ipsp_pending_acquire() plays the same gamble. thanks and regards sashan
Re: [External] : ipsec acquire mutex refcount
Hi, Why do you want to initialize `ipa’ variable in ipsp_pending_acquire() and ipsec_get_acquire()? This doesn’t require. > On 14 Mar 2022, at 13:09, Alexandr Nedvedicky > wrote: > > Hello, > > > > changes looks good. just few nits. > >I took a closer look at ipsec_delete_policy(): > > 667 ipsec_delete_policy(struct ipsec_policy *ipo) > > 668 { > 669 struct ipsec_acquire *ipa; > 670 struct radix_node_head *rnh; > 671 struct radix_node *rn = (struct radix_node *)ipo; > > 672 > 673 NET_ASSERT_LOCKED(); > > 674 > 675 if (refcnt_rele(>ipo_refcnt) == 0) > > 676 return 0; > > 677 > 678 /* Delete from SPD. */ > 679 if ((rnh = spd_table_get(ipo->ipo_rdomain)) == NULL || > 680 rn_delete(>ipo_addr, >ipo_mask, rnh, rn) == NULL) > > 681 return (ESRCH); > >the suggestion is not related to your changes, but I think it's worth >to consider as a different diff. I think we should either panic() >at line 681 or we should at least log error. If I understand code >right we delete policy (ipo), which has been fully created so this >error should not happen here. > >more comments in-line > > thanks and > regards > sashan > >> Index: netinet/ip_spd.c >> === >> RCS file: /data/mirror/openbsd/cvs/src/sys/netinet/ip_spd.c,v >> retrieving revision 1.114 >> diff -u -p -r1.114 ip_spd.c > >> @@ -717,13 +720,38 @@ ipsp_delete_acquire_timo(void *v) >> void >> ipsp_delete_acquire(struct ipsec_acquire *ipa) >> { >> -NET_ASSERT_LOCKED(); >> +mtx_enter(_acquire_mtx); >> +ipsp_delete_acquire_locked(ipa); >> +mtx_leave(_acquire_mtx); >> +} >> + >> +void >> +ipsp_delete_acquire_locked(struct ipsec_acquire *ipa) >> +{ >> +if (timeout_del(>ipa_timeout) == 1) >> +refcnt_rele(>ipa_refcnt); > ^^ > can we also put ASSERT/check into this branch > to verify we are no releasing the last > reference to ipa. I suspect we might be doing > an extra reference drop here. I believe this ASSERT > would be hit if we will compile kernel without IPSEC. > we grab the extra reference iff are adding a timer. > > > I think this whole ipsp_delete_acquire() logic should > be slightly different. We should call ipsp_unref_acquire_locked(), > when timeout_del() returns 1, because function managed to unschedule > the pending timer. If it returns zero, we should leave things as > they are because timer is running already, waiting for > ipsec_acquire_mtx. As soon as timer will grab mutex, it will > drop the reference for ipa. That's my understanding. > if we go this way then we should not need to grab extra > reference for timer. > >> +ipsp_unref_acquire_locked(ipa); >> +} >> + > > >> @@ -731,19 +759,21 @@ ipsp_delete_acquire(struct ipsec_acquire >> * Find out if there's an ACQUIRE pending. >> * XXX Need a better structure. >> */ >> -struct ipsec_acquire * >> +int >> ipsp_pending_acquire(struct ipsec_policy *ipo, union sockaddr_union *gw) >> { >> struct ipsec_acquire *ipa; > ^^^ >this needs to be initialized to NULL >> >> NET_ASSERT_LOCKED(); >> >> -TAILQ_FOREACH (ipa, >ipo_acquires, ipa_ipo_next) { >> +mtx_enter(_acquire_mtx); >> +TAILQ_FOREACH(ipa, >ipo_acquires, ipa_ipo_next) { >> if (!memcmp(gw, >ipa_addr, gw->sa.sa_len)) >> -return ipa; >> +break; >> } >> +mtx_leave(_acquire_mtx); >> >> -return NULL; >> +return (ipa != NULL); >> } >> > >> @@ -850,13 +881,14 @@ ipsp_acquire_sa(struct ipsec_policy *ipo >> return 0; >> } >> >> +mtx_enter(_acquire_mtx); >> #ifdef IPSEC >> -timeout_add_sec(>ipa_timeout, ipsec_expire_acquire); >> +if (timeout_add_sec(>ipa_timeout, ipsec_expire_acquire) == 1) >> refcnt_take(>ipa_refcnt); > ^^^ > line is too long >> #endif >> - >> TAILQ_INSERT_TAIL(_acquire_head, ipa, ipa_next); >> TAILQ_INSERT_TAIL(>ipo_acquires, ipa, ipa_ipo_next); >> ipa->ipa_policy = ipo; >> +mtx_leave(_acquire_mtx); >> >> /* PF_KEYv2 notification message. */ >> return pfkeyv2_acquire(ipo, gw, laddr, >ipa_seq, ddst); >> @@ -908,9 +940,14 @@ ipsec_get_acquire(u_int32_t seq) >> > local variable ipa, needs to be initialized to NULL here as well. >> NET_ASSERT_LOCKED(); >> >> -TAILQ_FOREACH (ipa,
Re: [External] : ipsec acquire mutex refcount
Hello, changes looks good. just few nits. I took a closer look at ipsec_delete_policy(): 667 ipsec_delete_policy(struct ipsec_policy *ipo) 668 { 669 struct ipsec_acquire *ipa; 670 struct radix_node_head *rnh; 671 struct radix_node *rn = (struct radix_node *)ipo; 672 673 NET_ASSERT_LOCKED(); 674 675 if (refcnt_rele(>ipo_refcnt) == 0) 676 return 0; 677 678 /* Delete from SPD. */ 679 if ((rnh = spd_table_get(ipo->ipo_rdomain)) == NULL || 680 rn_delete(>ipo_addr, >ipo_mask, rnh, rn) == NULL) 681 return (ESRCH); the suggestion is not related to your changes, but I think it's worth to consider as a different diff. I think we should either panic() at line 681 or we should at least log error. If I understand code right we delete policy (ipo), which has been fully created so this error should not happen here. more comments in-line thanks and regards sashan > Index: netinet/ip_spd.c > === > RCS file: /data/mirror/openbsd/cvs/src/sys/netinet/ip_spd.c,v > retrieving revision 1.114 > diff -u -p -r1.114 ip_spd.c > @@ -717,13 +720,38 @@ ipsp_delete_acquire_timo(void *v) > void > ipsp_delete_acquire(struct ipsec_acquire *ipa) > { > - NET_ASSERT_LOCKED(); > + mtx_enter(_acquire_mtx); > + ipsp_delete_acquire_locked(ipa); > + mtx_leave(_acquire_mtx); > +} > + > +void > +ipsp_delete_acquire_locked(struct ipsec_acquire *ipa) > +{ > + if (timeout_del(>ipa_timeout) == 1) > + refcnt_rele(>ipa_refcnt); ^^ can we also put ASSERT/check into this branch to verify we are no releasing the last reference to ipa. I suspect we might be doing an extra reference drop here. I believe this ASSERT would be hit if we will compile kernel without IPSEC. we grab the extra reference iff are adding a timer. I think this whole ipsp_delete_acquire() logic should be slightly different. We should call ipsp_unref_acquire_locked(), when timeout_del() returns 1, because function managed to unschedule the pending timer. If it returns zero, we should leave things as they are because timer is running already, waiting for ipsec_acquire_mtx. As soon as timer will grab mutex, it will drop the reference for ipa. That's my understanding. if we go this way then we should not need to grab extra reference for timer. > + ipsp_unref_acquire_locked(ipa); > +} > + > @@ -731,19 +759,21 @@ ipsp_delete_acquire(struct ipsec_acquire > * Find out if there's an ACQUIRE pending. > * XXX Need a better structure. > */ > -struct ipsec_acquire * > +int > ipsp_pending_acquire(struct ipsec_policy *ipo, union sockaddr_union *gw) > { > struct ipsec_acquire *ipa; ^^^ this needs to be initialized to NULL > > NET_ASSERT_LOCKED(); > > - TAILQ_FOREACH (ipa, >ipo_acquires, ipa_ipo_next) { > + mtx_enter(_acquire_mtx); > + TAILQ_FOREACH(ipa, >ipo_acquires, ipa_ipo_next) { > if (!memcmp(gw, >ipa_addr, gw->sa.sa_len)) > - return ipa; > + break; > } > + mtx_leave(_acquire_mtx); > > - return NULL; > + return (ipa != NULL); > } > > @@ -850,13 +881,14 @@ ipsp_acquire_sa(struct ipsec_policy *ipo > return 0; > } > > + mtx_enter(_acquire_mtx); > #ifdef IPSEC > - timeout_add_sec(>ipa_timeout, ipsec_expire_acquire); > + if (timeout_add_sec(>ipa_timeout, ipsec_expire_acquire) == 1) > refcnt_take(>ipa_refcnt); ^^^ line is too long > #endif > - > TAILQ_INSERT_TAIL(_acquire_head, ipa, ipa_next); > TAILQ_INSERT_TAIL(>ipo_acquires, ipa, ipa_ipo_next); > ipa->ipa_policy = ipo; > + mtx_leave(_acquire_mtx); > > /* PF_KEYv2 notification message. */ > return pfkeyv2_acquire(ipo, gw, laddr, >ipa_seq, ddst); > @@ -908,9 +940,14 @@ ipsec_get_acquire(u_int32_t seq) > local variable ipa, needs to be initialized to NULL here as well. > NET_ASSERT_LOCKED(); > > - TAILQ_FOREACH (ipa, _acquire_head, ipa_next) > - if (ipa->ipa_seq == seq) > - return ipa; > + mtx_enter(_acquire_mtx); > + TAILQ_FOREACH(ipa, _acquire_head, ipa_next) { > + if (ipa->ipa_seq == seq) { > + refcnt_take(>ipa_refcnt); > + break; >
Re: ipsec acquire mutex refcount
On 9.3.2022. 19:14, Alexander Bluhm wrote: > Hi, > > Hrvoje has hit a crash with IPsec acquire while testing the parallel > ip forwarding diff. Analysis with sashan@ concludes that this part > is not MP safe. Mutex and refcount should fix the memory management. > > Hrvoje: Could you test this diff? Hi, now that this diff is in, is it time to test parallel forwarding on wider audience?
Re: ipsec acquire mutex refcount
On 9.3.2022. 21:02, Hrvoje Popovski wrote: > On 9.3.2022. 19:14, Alexander Bluhm wrote: >> Hi, >> >> Hrvoje has hit a crash with IPsec acquire while testing the parallel >> ip forwarding diff. Analysis with sashan@ concludes that this part >> is not MP safe. Mutex and refcount should fix the memory management. >> >> Hrvoje: Could you test this diff? > > Hi, > > no problem. Give me 2 or 3 days to hit it properly... Hi, after 2 days of hitting sasyncd setup I can't trigger panic as before. Thank you ..
Re: ipsec acquire mutex refcount
On 9.3.2022. 19:14, Alexander Bluhm wrote: > Hi, > > Hrvoje has hit a crash with IPsec acquire while testing the parallel > ip forwarding diff. Analysis with sashan@ concludes that this part > is not MP safe. Mutex and refcount should fix the memory management. > > Hrvoje: Could you test this diff? Hi, no problem. Give me 2 or 3 days to hit it properly...
ipsec acquire mutex refcount
Hi, Hrvoje has hit a crash with IPsec acquire while testing the parallel ip forwarding diff. Analysis with sashan@ concludes that this part is not MP safe. Mutex and refcount should fix the memory management. Hrvoje: Could you test this diff? ok? bluhm Index: net/pfkeyv2.c === RCS file: /data/mirror/openbsd/cvs/src/sys/net/pfkeyv2.c,v retrieving revision 1.232 diff -u -p -r1.232 pfkeyv2.c --- net/pfkeyv2.c 8 Mar 2022 22:30:38 - 1.232 +++ net/pfkeyv2.c 9 Mar 2022 18:11:10 - @@ -1591,7 +1591,8 @@ pfkeyv2_send(struct socket *so, void *me case SADB_X_ASKPOLICY: /* Get the relevant policy */ NET_LOCK(); - ipa = ipsec_get_acquire(((struct sadb_x_policy *) headers[SADB_X_EXT_POLICY])->sadb_x_policy_seq); + ipa = ipsec_get_acquire(((struct sadb_x_policy *) + headers[SADB_X_EXT_POLICY])->sadb_x_policy_seq); if (ipa == NULL) { rval = ESRCH; NET_UNLOCK(); @@ -1600,6 +1601,7 @@ pfkeyv2_send(struct socket *so, void *me rval = pfkeyv2_policy(ipa, headers, , _sz); NET_UNLOCK(); + ipsec_unref_acquire(ipa); if (rval) mode = PFKEYV2_SENDMESSAGE_UNICAST; Index: netinet/ip_ipsp.h === RCS file: /data/mirror/openbsd/cvs/src/sys/netinet/ip_ipsp.h,v retrieving revision 1.236 diff -u -p -r1.236 ip_ipsp.h --- netinet/ip_ipsp.h 8 Mar 2022 22:30:38 - 1.236 +++ netinet/ip_ipsp.h 9 Mar 2022 17:46:22 - @@ -45,6 +45,7 @@ * I immutable after creation * a atomic operations * N net lock + * A ipsec_acquire_mtx * F ipsec_flows_mtx * m tdb_mtx fields of struct tdb * p ipo_tdb_mtx link policy to TDB global mutex @@ -251,12 +252,15 @@ struct ipsec_acquire { u_int32_t ipa_seq; struct sockaddr_encap ipa_info; struct sockaddr_encap ipa_mask; + struct refcnt ipa_refcnt; struct timeout ipa_timeout; - struct ipsec_policy *ipa_policy; - TAILQ_ENTRY(ipsec_acquire) ipa_ipo_next; - TAILQ_ENTRY(ipsec_acquire) ipa_next; + struct ipsec_policy *ipa_policy;/* [A] back pointer */ + TAILQ_ENTRY(ipsec_acquire) ipa_ipo_next; /* [A] per policy */ + TAILQ_ENTRY(ipsec_acquire) ipa_next; /* [A] global list */ }; +TAILQ_HEAD(ipsec_acquire_head, ipsec_acquire); + struct ipsec_policy { struct radix_node ipo_nodes[2]; /* radix tree glue */ struct sockaddr_encap ipo_addr; @@ -287,9 +291,9 @@ struct ipsec_policy { struct ipsec_ids*ipo_ids; - TAILQ_HEAD(ipo_acquires_head, ipsec_acquire) ipo_acquires; /* List of acquires */ - TAILQ_ENTRY(ipsec_policy) ipo_tdb_next; /* List TDB policies */ - TAILQ_ENTRY(ipsec_policy) ipo_list; /* List of all policies */ + struct ipsec_acquire_head ipo_acquires; /* [A] List of acquires */ + TAILQ_ENTRY(ipsec_policy) ipo_tdb_next; /* [p] List TDB policies */ + TAILQ_ENTRY(ipsec_policy) ipo_list; /* List of all policies */ }; #defineIPSP_POLICY_NONE0x /* No flags set */ @@ -682,6 +686,7 @@ ssize_t ipsec_hdrsz(struct tdb *); void ipsec_adjust_mtu(struct mbuf *, u_int32_t); void ipsec_set_mtu(struct tdb *, u_int32_t); struct ipsec_acquire *ipsec_get_acquire(u_int32_t); +void ipsec_unref_acquire(struct ipsec_acquire *); intipsec_forward_check(struct mbuf *, int, int); intipsec_local_check(struct mbuf *, int, int, int); Index: netinet/ip_spd.c === RCS file: /data/mirror/openbsd/cvs/src/sys/netinet/ip_spd.c,v retrieving revision 1.114 diff -u -p -r1.114 ip_spd.c --- netinet/ip_spd.c8 Mar 2022 22:30:38 - 1.114 +++ netinet/ip_spd.c9 Mar 2022 17:57:29 - @@ -43,10 +43,11 @@ int ipsp_spd_inp(struct mbuf *, struct i struct tdb **); intipsp_acquire_sa(struct ipsec_policy *, union sockaddr_union *, union sockaddr_union *, struct sockaddr_encap *, struct mbuf *); -struct ipsec_acquire *ipsp_pending_acquire(struct ipsec_policy *, - union sockaddr_union *); -void ipsp_delete_acquire_timo(void *); +intipsp_pending_acquire(struct ipsec_policy *, union sockaddr_union *); +void ipsp_delete_acquire_timer(void *); +void ipsp_delete_acquire_locked(struct ipsec_acquire *); void ipsp_delete_acquire(struct ipsec_acquire *); +void ipsp_unref_acquire_locked(struct ipsec_acquire *); struct pool ipsec_policy_pool; struct pool ipsec_acquire_pool; @@