Hi,

On Mon, 20 Dec 2021 13:20:46 +0100
Alexander Bluhm <alexander.bl...@gmx.net> wrote:
> On Tue, Dec 14, 2021 at 06:25:20PM +0900, YASUOKA Masahiko wrote:
>> Yes, if there is another better idea, it will be welcome.
>> For this moment, the diff is the best idea for me.
>
> Sorry, no better idea.  I have no experiance with l2pt.  Codewise
> the diff looks fine, but I don't understand the consequences.
Thank you for your review and comments.

>> +        if (tdbflow != NULL)
>> +                rn = rn_lookup((caddr_t)&tdbflow->tdb_filter,
>> +                    (caddr_t)&tdbflow->tdb_filtermask, rnh);
>
> Does rn_lookup() modify the radix tree?  I looks like rn_lookup ->
> rn_addmask -> rn_insert() does that.  This will make it impossible
> to make IPsec MP capable.  The radix tree is not MP safe, art has
> been implemented as an alternative.  An ipsp_spd_lookup() should
> not modify the flows.  It is stange that a function named rn_lookup()
> does modifications.  Did I miss something?

rn_lookup() doesn't make any modification.

rn_lookup() calls rn_addmask() with its second argument search=1:

  183 /* return a perfect match if m_arg is set, else do a regular rn_match */
  184 struct radix_node *
  185 rn_lookup(void *v_arg, void *m_arg, struct radix_node_head *head)
  186 {
  187         struct radix_node *x, *tm;
  188         caddr_t netmask = 0;
  189
  190         if (m_arg) {
  191                 tm = rn_addmask(m_arg, 1, head->rnh_treetop->rn_off);

and then rn_addmask():

  416 struct radix_node *
  417 rn_addmask(void *n_arg, int search, int skip)
  418 {
      (snip)
  449         if (tm || search)
  450                 return (tm);
  451         tm = malloc(max_keylen + 2 * sizeof(*tm), M_RTABLE, M_NOWAIT | M_ZERO);
  452         if (tm == NULL)
  453                 return (0);
  454         saved_tm = tm;
  455         netmask = cp = (caddr_t)(tm + 2);
  456         memcpy(cp, addmask_key, mlen);
  457         tm = rn_insert(cp, mask_rnhead, &maskduplicated, tm);

returns at #449-450, before calling rn_insert().  It seems that
rn_addmask() does read-only operations when "search" is set.

> Why do you call rn_lookup() here?

Because rn_match() doesn't take a mask and returns the best match.

For example, if there are multiple peers behind a NAT, flows like the
following can be configured at the same time:

  (a) Windows: REMOTE_IP:1701/udp <=> LOCAL_IP:1701/udp
  (b) Linux:   REMOTE_IP:ANY/udp  <=> LOCAL_IP:1701/udp

If the source port of a packet from the Linux peer happens to be 1701,
rn_match() will return (a) for it, and ipsp_spd_lookup() will then fail
to verify that the given tdb matches the policy.

Policies can be created with wildcards (any port, any protocol), and
are then compared against packets whose port and protocol are concrete.
Since rn_match() finds the best match, it can't find a wildcard policy
properly if there is a non-wildcard policy overlapped by the wildcard.
So the diff uses rn_lookup() to find the correct policy.

> Could we add the masks earlier when the flows are added?
>
>> +        else if (tdbp != NULL)
>> +                rn = rn_lookup((caddr_t)&tdbp->tdb_filter,
>> +                    (caddr_t)&tdbp->tdb_filtermask, rnh);
>
> What are the consequences of this chunk for regular IPsec?

I have thought about that again.  Now I realize the problem exists only
for transport mode.  For tunnel mode, since the best match is always
preferred, rn_match() should be used.

I'll update the diff so that it uses rn_lookup() for transport mode
only.

>>                  /* Match source/dest IDs. */
>> -                if (ipo->ipo_ids)
>> -                        if (tdbp->tdb_ids == NULL ||
>> -                            !ipsp_ids_match(ipo->ipo_ids, tdbp->tdb_ids))
>> +                if (ipo->ipo_ids != NULL) {
>> +                        if ((tdbp->tdb_flags & TDBF_TUNNELING) == 0 &&
>> +                            (tdbp->tdb_flags & TDBF_UDPENCAP) != 0) {
>> +                                /*
>> +                                 * Skip IDs check for transport mode
>> +                                 * with NAT-T.  Multiple clients (IDs)
>> +                                 * can use a same policy.
>> +                                 */
>> +                        } else if (tdbp->tdb_ids == NULL &&
>> +                            !ipsp_ids_match(ipo->ipo_ids,
>> +                            tdbp->tdb_ids))
>>                                  goto nomatchin;
>> +                }
>
> This was added to make IPsec/l2tp work in rev 1.85.  And now you
> change it to make it work.  I wish markus@ or mikeb@ could give a
> clue.

At the change of rev 1.85, "ipsec-id bundles" was introduced.  It
provides a way to select the same flow by an ipsecflowinfo after
rekeying.

The diff above skips the check (TDB.ids == FLOW.ids) for the transport
mode NAT-T case.  In NAT-T cases a flow is shared by multiple peers
(IDs), so the skip is needed.  It is also OK because rn_lookup() has
already verified that the tdb satisfies the policy (flow).
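To make the overlap problem concrete, here is a small userland sketch
that can be compiled and run outside the kernel.  It is only an
illustration: the struct layout, the addresses, and the helper names
are made up and are not the real sockaddr_encap/radix code.  A
best-match selection over the two flows (a) and (b) above picks the
more specific policy (a) even for a packet from the Linux peer, while
an exact lookup keyed by the SA's own filter/filtermask still finds
(b), which is what the diff below relies on by calling rn_lookup().

/*
 * Userland sketch of the flow-overlap problem (NOT kernel code).
 * "struct flt" is a made-up stand-in for sockaddr_encap; only a
 * remote address and a remote port are modeled.
 */
#include <stdio.h>
#include <stddef.h>

struct flt {
        unsigned int   src_ip;          /* remote address */
        unsigned short src_port;        /* remote port; wildcard via mask */
};

struct policy {
        const char *name;
        struct flt  filter;             /* key */
        struct flt  mask;               /* significant bits of the key */
};

/* (a) Windows peer: REMOTE_IP:1701/udp, (b) Linux peer: REMOTE_IP:ANY/udp */
static struct policy spd[] = {
        { "(a) Windows", { 0xc0a80001, 1701 }, { 0xffffffff, 0xffff } },
        { "(b) Linux",   { 0xc0a80001, 0    }, { 0xffffffff, 0x0000 } },
};

static int
masked_eq(const struct flt *key, const struct flt *mask, const struct flt *pkt)
{
        return (key->src_ip & mask->src_ip) == (pkt->src_ip & mask->src_ip) &&
            (key->src_port & mask->src_port) ==
            (pkt->src_port & mask->src_port);
}

/* best match, roughly what rn_match() does: prefer the more specific entry */
static struct policy *
best_match(const struct flt *pkt)
{
        struct policy *best = NULL;
        size_t i;

        for (i = 0; i < sizeof(spd) / sizeof(spd[0]); i++)
                if (masked_eq(&spd[i].filter, &spd[i].mask, pkt) &&
                    (best == NULL ||
                    spd[i].mask.src_port > best->mask.src_port))
                        best = &spd[i];
        return best;
}

/* exact match, roughly what rn_lookup() does: key and mask must both match */
static struct policy *
exact_lookup(const struct flt *filter, const struct flt *mask)
{
        size_t i;

        for (i = 0; i < sizeof(spd) / sizeof(spd[0]); i++)
                if (spd[i].filter.src_ip == filter->src_ip &&
                    spd[i].filter.src_port == filter->src_port &&
                    spd[i].mask.src_ip == mask->src_ip &&
                    spd[i].mask.src_port == mask->src_port)
                        return &spd[i];
        return NULL;
}

int
main(void)
{
        /* a packet from the Linux peer whose source port happens to be 1701 */
        struct flt pkt = { 0xc0a80001, 1701 };
        /* the Linux peer's SA carries the wildcard-port filter/filtermask */
        struct flt sa_filter = { 0xc0a80001, 0 };
        struct flt sa_mask = { 0xffffffff, 0x0000 };

        printf("best match for the packet:         %s\n",
            best_match(&pkt)->name);
        printf("exact lookup by filter/filtermask: %s\n",
            exact_lookup(&sa_filter, &sa_mask)->name);
        return 0;
}

With the first lookup the Linux peer's tdb is checked against policy
(a) and the verification fails; with the second, the wildcard policy
(b) that the tdb was actually installed for is found.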
Index: sys/netinet/ip_ipsp.c
===================================================================
RCS file: /disk/cvs/openbsd/src/sys/netinet/ip_ipsp.c,v
retrieving revision 1.267
diff -u -p -r1.267 ip_ipsp.c
--- sys/netinet/ip_ipsp.c	20 Dec 2021 15:59:09 -0000	1.267
+++ sys/netinet/ip_ipsp.c	23 Dec 2021 11:42:30 -0000
@@ -90,6 +90,8 @@ void	tdb_firstuse(void *);
 void	tdb_soft_timeout(void *);
 void	tdb_soft_firstuse(void *);
 int	tdb_hash(u_int32_t, union sockaddr_union *, u_int8_t);
+int	sockaddr_encap_match(struct sockaddr_encap *,
+	    struct sockaddr_encap *, struct sockaddr_encap *);
 
 int ipsec_in_use = 0;
 u_int64_t ipsec_last_added = 0;
@@ -509,6 +511,78 @@ gettdbbysrc(u_int rdomain, union sockadd
 	tdb_ref(tdbp);
 	mtx_leave(&tdb_sadb_mtx);
 	return tdbp;
+}
+
+/*
+ * Get an SA given the flow, the direction, the security protocol type, and
+ * the desired IDs.
+ */
+struct tdb *
+gettdbbyflow(u_int rdomain, int direction, struct sockaddr_encap *senflow,
+    u_int8_t sproto, struct ipsec_ids *ids)
+{
+	u_int32_t hashval;
+	struct tdb *tdbp;
+	union sockaddr_union srcdst;
+
+	if (ids == NULL)	/* ids is mandatory */
+		return NULL;
+
+	memset(&srcdst, 0, sizeof(srcdst));
+	switch (senflow->sen_type) {
+	case SENT_IP4:
+		srcdst.sin.sin_len = sizeof(srcdst.sin);
+		srcdst.sin.sin_family = AF_INET;
+		if (direction == IPSP_DIRECTION_OUT)
+			srcdst.sin.sin_addr = senflow->Sen.Sip4.Dst;
+		else
+			srcdst.sin.sin_addr = senflow->Sen.Sip4.Src;
+		break;
+	case SENT_IP6:
+		srcdst.sin6.sin6_len = sizeof(srcdst.sin6);
+		srcdst.sin6.sin6_family = AF_INET6;
+		if (direction == IPSP_DIRECTION_OUT)
+			srcdst.sin6.sin6_addr = senflow->Sen.Sip6.Dst;
+		else
+			srcdst.sin6.sin6_addr = senflow->Sen.Sip6.Src;
+		break;
+	}
+
+	mtx_enter(&tdb_sadb_mtx);
+	hashval = tdb_hash(0, &srcdst, sproto);
+
+	for (tdbp = tdbdst[hashval]; tdbp != NULL; tdbp = tdbp->tdb_dnext)
+		if (tdbp->tdb_sproto == sproto &&
+		    tdbp->tdb_rdomain == rdomain &&
+		    (tdbp->tdb_flags & TDBF_INVALID) == 0 &&
+		    ipsp_ids_match(ids, tdbp->tdb_ids) &&
+		    ((direction == IPSP_DIRECTION_OUT &&
+		    !memcmp(&tdbp->tdb_dst, &srcdst, srcdst.sa.sa_len)) ||
+		    (direction == IPSP_DIRECTION_IN &&
+		    !memcmp(&tdbp->tdb_src, &srcdst, srcdst.sa.sa_len)))) {
+			if (sockaddr_encap_match(&tdbp->tdb_filter,
+			    &tdbp->tdb_filtermask, senflow))
+				break;
+		}
+
+	tdb_ref(tdbp);
+	mtx_leave(&tdb_sadb_mtx);
+	return tdbp;
+}
+
+int
+sockaddr_encap_match(struct sockaddr_encap *addr, struct sockaddr_encap *mask,
+    struct sockaddr_encap *dest)
+{
+	size_t off;
+
+	for (off = offsetof(struct sockaddr_encap, sen_type);
+	    off < dest->sen_len; off++) {
+		if ((*((u_char *)addr + off) & *((u_char *)mask + off)) !=
+		    (*((u_char *)dest + off) & *((u_char *)mask + off)))
+			break;
+	}
+	return (off < dest->sen_len)? 0 : 1;
 }
 
 #ifdef DDB
Index: sys/netinet/ip_ipsp.h
===================================================================
RCS file: /disk/cvs/openbsd/src/sys/netinet/ip_ipsp.h,v
retrieving revision 1.233
diff -u -p -r1.233 ip_ipsp.h
--- sys/netinet/ip_ipsp.h	20 Dec 2021 15:59:10 -0000	1.233
+++ sys/netinet/ip_ipsp.h	23 Dec 2021 11:42:30 -0000
@@ -593,6 +593,8 @@ struct tdb *gettdbbysrcdst_dir(u_int, u_
 	    union sockaddr_union *, u_int8_t, int);
 #define gettdbbysrcdst(a,b,c,d,e) gettdbbysrcdst_dir((a),(b),(c),(d),(e),0)
 #define gettdbbysrcdst_rev(a,b,c,d,e) gettdbbysrcdst_dir((a),(b),(c),(d),(e),1)
+struct tdb *gettdbbyflow(u_int, int, struct sockaddr_encap *, u_int8_t,
+	    struct ipsec_ids *);
 void	puttdb(struct tdb *);
 void	puttdb_locked(struct tdb *);
 void	tdb_delete(struct tdb *);
Index: sys/netinet/ip_spd.c
===================================================================
RCS file: /disk/cvs/openbsd/src/sys/netinet/ip_spd.c,v
retrieving revision 1.110
diff -u -p -r1.110 ip_spd.c
--- sys/netinet/ip_spd.c	16 Dec 2021 15:38:03 -0000	1.110
+++ sys/netinet/ip_spd.c	23 Dec 2021 11:42:34 -0000
@@ -156,13 +156,14 @@ ipsp_spd_lookup(struct mbuf *m, int af,
     u_int32_t ipsecflowinfo)
 {
 	struct radix_node_head *rnh;
-	struct radix_node *rn;
+	struct radix_node *rn = NULL;
 	union sockaddr_union sdst, ssrc;
 	struct sockaddr_encap *ddst, dst;
 	struct ipsec_policy *ipo;
 	struct ipsec_ids *ids = NULL;
 	int error, signore = 0, dignore = 0;
 	u_int rdomain = rtable_l2(m->m_pkthdr.ph_rtableid);
+	struct tdb *tdbflow = NULL;
 
 	NET_ASSERT_LOCKED();
 
@@ -303,9 +304,39 @@ ipsp_spd_lookup(struct mbuf *m, int af,
 		return EAFNOSUPPORT;
 	}
 
+	/*
+	 * Prepare tdb for searching the correct SPD by rn_lookup().
+	 * "tdb_filtermask" of the tdb is used to find the correct SPD when
+	 * multiple policies are overlapped.
+	 */
+	if (ipsecflowinfo != 0) {
+		ids = ipsp_ids_lookup(ipsecflowinfo);
+		if (ids != NULL)
+			tdbflow = gettdbbyflow(rdomain, direction, &dst,
+			    IPPROTO_ESP, ids);
+	}
+
 	/* Actual SPD lookup. */
-	if ((rnh = spd_table_get(rdomain)) == NULL ||
-	    (rn = rn_match((caddr_t)&dst, rnh)) == NULL) {
+	rnh = spd_table_get(rdomain);
+	if (rnh != NULL) {
+		if (tdbflow != NULL)
+			rn = rn_lookup((caddr_t)&tdbflow->tdb_filter,
+			    (caddr_t)&tdbflow->tdb_filtermask, rnh);
+		else if (tdbp != NULL &&
+		    (tdbp->tdb_flags & TDBF_TUNNELING) == 0)
+			/*
+			 * Use rn_lookup() for transport mode, since protocol
+			 * or port of the policy might be a wildcard, and may
+			 * be overlapped with other peer's policy.
+			 */
+			rn = rn_lookup((caddr_t)&tdbp->tdb_filter,
+			    (caddr_t)&tdbp->tdb_filtermask, rnh);
+		else
+			rn = rn_match((caddr_t)&dst, rnh);
+	}
+	tdb_unref(tdbflow);
+
+	if (rn == NULL) {
 		/*
 		 * Return whatever the socket requirements are, there are no
 		 * system-wide policies.
@@ -397,9 +428,6 @@ ipsp_spd_lookup(struct mbuf *m, int af,
 		}
 	}
 
-	if (ipsecflowinfo)
-		ids = ipsp_ids_lookup(ipsecflowinfo);
-
 	/* Check that the cached TDB (if present), is appropriate. */
 	mtx_enter(&ipo_tdb_mtx);
 	if (ipo->ipo_tdb != NULL) {
@@ -540,10 +568,19 @@ ipsp_spd_lookup(struct mbuf *m, int af,
 			goto nomatchin;
 
 		/* Match source/dest IDs. */
-		if (ipo->ipo_ids)
-			if (tdbp->tdb_ids == NULL ||
-			    !ipsp_ids_match(ipo->ipo_ids, tdbp->tdb_ids))
+		if (ipo->ipo_ids != NULL) {
+			if ((tdbp->tdb_flags & TDBF_TUNNELING) == 0) {
+				/*
+				 * we checked this already by finding
+				 * the ipo with rn_lookup() and the
+				 * ipo might be shared by multiple
+				 * clients
+				 */
+			} else if (tdbp->tdb_ids == NULL &&
+			    !ipsp_ids_match(ipo->ipo_ids,
+			    tdbp->tdb_ids))
 				goto nomatchin;
+		}
 
 		/* Add it to the cache. */
 		mtx_enter(&ipo_tdb_mtx);