Dear Andrew I like to write this test as a testcase, I will work on that in my spare time. I like your solution about separate code path, but I think defragmentation could solve the problem and reassembly may have overhead. In defragmentation, information of fragmented packets is kept and all of packets are buffered until we decide to accept or deny those packets, but in this solution all of fragmented packets are not reassembled. Look at nf_defrag_ipv4.c in net/ipv4/netfilter from kernel source tree.
Another issue I found today and I forgot to mention in last mail: in multi_acl_match_get_applied_ace_index function (acl/public_inlines.h) we check portrange, if need_portrange_check is set; but it's not needed for fragmented packets. so I suggest following line if (PREDICT_FALSE(result_val->need_portrange_check)) { to changes to if (PREDICT_FALSE(result_val->need_portrange_check&& !match->pkt.is_nonfirst_fragment)) { Khers On Tue, May 8, 2018 at 2:50 PM, Andrew Yourtchenko <ayour...@gmail.com> wrote: > Yeah back in the day the fragment reassembly code was not there yet, so > there is a choice either to drop all the fragments on the floor, or rely on > the receiving TCP stack to drop the non-initial fragments, like IOS did. > There is a knob that allows you to choose the behavior between the two by > flipping the value of l4_match_nonfirst_fragment. > > Though we should not create a session for the non-initial fragments, since > there is no full 5-tuple, do you think you might put together a make test > testcase so we can ensure it is behaving properly ? > > I am now working on porting the TupleMerge approach that Valerio Bruschi > did. There the fragments will be going into a separate code path, and it > might be possible to add an option for reassembly then. > > --a > > On 8 May 2018, at 11:02, emma sdi <s3m2e1.6s...@gmail.com> wrote: > > Dear vpp folks > > I have a simple topology and a permit+reflect rule for udp on > destination port 1000 as pasted in this link. > <https://paste.ubuntu.com/p/6mRcTD8zsV/> > I send a big file from 172.20.1.2 to 172.20.1.1 port 1001 with > nc and I receive some packets (non first fragment) in second > client (172.20.1.1). > > Following are commands I used in this sce nario. > Clinet 172.20.1.2 > cat /dev/sda | nc -u 172.20.1.1 1001 > > Client 172.20.1.1> tcpdump -nn -i eth1 > 01:13:38.164466 IP 172.20.1.2 > 172.20.1.1: ip-proto-17 > 01:13:38.164467 IP 172.20.1.2 > 172.20.1.1: ip-proto-17 > 01:13:38.164468 IP 172.20.1.2 > 172.20.1.1: ip-proto-17 > 01:13:38.164469 IP 172.20.1.2 > 172.20.1.1: ip-proto-17 > > Output of 'show trace' is stored in this link > <https://paste.ubuntu.com/p/MVjrMxtnVz/> , First packet matched > with acl 1 and dropped but second fragment of that packet is matched > with acl 0 and a session created for that. So I dig more in > source code, and I found this block in hash_acl_add function: > > if (am->l4_match_nonfirst_fragment) { > /* add the second rule which matches the noninitial fragments with > the respective mask */ > make_mask_and_match_from_rule(&mask, &a->rules[i], &ace_info, 1); > ace_info.mask_type_index = assign_mask_type_index(am, &mask); > ace_info.match.pkt.mask_type_index_lsb = ace_info.mask_type_index; > DBG("ACE: %d (non-initial frags) mask_type_index: %d", i, > ace_info.mask_type_index); > /* Ensure a given index is set in the mask type index bitmap for > this ACL */ > ha->mask_type_index_bitmap = clib_bitmap_set(ha->mask_type_index_bitmap, > ace_info.mask_type_index, 1); > vec_add1(ha->rules, ace_info); > } > > We make 3-tuple rule for non first fragment packets, this code solved > the IP fragment problem in a simple and inaccurate way. I think we > need a buffer for fragments like netfilter-conntract. > > Regards, > Khers > > > > > > >