Thanks very much for the thorough review; sorry for the delay — I made the changes last year and then ran into vacation and internal priorities.
On Tue, Dec 11, 2018 at 8:15 AM Ben Pfaff <b...@ovn.org> wrote:
> On Mon, Nov 19, 2018 at 11:09:25AM -0800, Darrell Ball wrote:
> > Fragmentation handling is added for supporting conntrack.
> > Both v4 and v6 are supported.
> >
> > After discussion with several people, I decided to not store
> > configuration state in the database to be more consistent with
> > the kernel in future, similarity with other conntrack configuration
> > which will not be in the database as well and overall simplicity.
> > Accordingly, fragmentation handling is enabled by default.
> >
> > This patch enables fragmentation tests for the userspace datapath.
> >
> > Signed-off-by: Darrell Ball <dlu...@gmail.com>
>
> Thanks for implementing this.
>
> This could use more comments, especially on the data structures and
> file-level variables and at the level of an individual function.

Added more comments.

> Please don't invent yet another mutex.

Yep.

> ipf_print_reass_packet() seems like it could use ds_put_hex_dump() and
> thereby be more readable (I don't really want to read 182 hex digits in
> a row without any white space).

ds_put_hex_dump() is better and I will use it -- thanks.

> Why 91 bytes in ipf_print_reass_packet()?

Just a number with more than enough context.

> All the counters seem to be write-only.

Subsequent patches provide that support, which is folded in now.

> struct ipf_addr seems odd to me. It has both aligned and unaligned
> versions of addresses, which means that the overall struct needs to be
> aligned, and it's a union nested in a struct instead of just a union.

The struct was never extended; it is trivially converted to a plain union.

> ipf_addr_hash_add() implies that all the bytes in the struct need to be
> initialized even if only some of them are used.

The whole key that gets hashed is initialized with a memset. The
performance aspect is a rounding error.
> ipf_list_key_hash() seems to be at risk of hashing trailing padding at
> the end of struct ipf_list_key.

ipf_list_key_hash() always operates on a memset-zeroed key whose fields
are set afterwards, so the padding hashes as zeros. The performance
aspect is a rounding error.

> Does ipf_list_complete() run one past the end of the array? Naively, it
> seems like it might.

No, it uses array indices and adds 1 to the penultimate one for the last
iteration. I made it more obvious though.

> I found ipf_sort() a little hard to read mostly due to the long variable
> names. Here's an alternate form that you can accept or reject as you
> like (I have not tested it):
>
> static void
> ipf_sort(struct ipf_frag *frags, size_t last_idx)
>     OVS_REQUIRES(ipf_lock)
> {
>     for (int i = 1; i <= last_idx; i++) {
>         struct ipf_frag ipf_frag = frags[i];
>         int j = i - 1;
>         while (j >= 0
>                && frags[j].start_data_byte > ipf_frag.start_data_byte) {
>             frags[j + 1] = frags[j];
>             j--;
>         }
>         frags[j + 1] = ipf_frag;
>     }
> }

One day I decided to write only while loops to see if you would notice,
and you did. It is trivially equivalent, and a for loop, where
applicable, is always more compact and easier to understand. Also, the
longer names were not for my benefit, so I shortened them.

> ipf_reassemble_v4_frags() and ipf_reassemble_v6_frags() know the final
> length of the packet they're assembling, but it doesn't look to me like
> they preallocate the space for it.

Sure; even though the performance aspect is a rounding error, it is just
a couple of extra lines of code and looks nicer.

> ipf_reassemble_v4_frags() and ipf_reassemble_v6_frags() calculate the
> length of the L3 header manually and skip past it. Couldn't they just
> use the L4 header pointer?

I am using the L4 pointer elsewhere; it is weird that I didn't just use
it here as well. Converted now.
> The ipf_list_key_cmp() interface is a little confusing because the
> return value convention and the name is a little like a strcmp()ish
> function, but it doesn't use a -1/+1 return value. I'd rename it to
> ipf_list_key_equal(), change the return type to "bool", and make true
> mean equal, false mean unequal.

Yep; 'eq' is simpler semantics for the API, so I used it.

> Processing a fragment is O(n) in the number of existing fragments due to
> the dup check. I don't know whether this is important.

This processing is fast and a fraction of the total per-fragment cost.
It is also practically a constant because of bounding.

> It looks like packet duplication can cause a fraglist to be marked
> invalid. I wonder whether this is good behavior.

Only the packet is marked invalid, and it is just an optimization:
conntrack would otherwise do the same with unnecessary work.

> It seems like we could estimate the number of fragments needed by
> dividing the total size by the size of the first fragment received.

We don't know the total size; furthermore, DoS is more likely than real
usage for fragmentation. I could estimate, but there is not much
advantage and more complexity.

> Do we need xzalloc() for frag_list? It seems like we're going to
> initialize each element as needed anyway.

Maintainability was more important than the insignificant performance
gain here. However, I spliced out a separate 'init' function to handle
the initializations and used xmalloc() instead.

> I think there's a race if either v4 or v6 is ever disabled, because the
> code checks whether fragment reassembly is enabled twice, once at a high
> level and once again in ipf_handle_frag(). The latter function
> assert-fails if reassembly is disabled, which seems like it could be a
> problem.
Yep, it is a bug -- thanks. When I converted to NOT_REACHED, I tried to
remember the reason I did not do it originally, and now I remember. :-)

> At the same time, I don't see any way to ever disable fragment
> reassembly. The code appears to enable it by default and not provide
> any way to disable it.

A subsequent patch in the series allows disabling; I folded that patch
into this one now.

> Is it common practice to enforce a minimum size for fragments?

A subsequent patch provided comments about this. Folded that patch into
this one.

> Maybe it would be a little easier to make the counters a two-dimensional
> array: atomic_uint64_t ipf_counters[2][N_COUNTERS], where one dimension
> is IPv4/IPv6 and the other is the particular counter. Then ipf_count()
> would become trivial. Just a thought though.

These should have been converted to arrays before. The intention was to
use two arrays, since the v6 counters can/will diverge from the v4 ones,
which I did now; ipf_count() still becomes a one-line function.

> Thanks, and I'll look forward to the next version.

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev