On Mon, Nov 19, 2018 at 11:09:25AM -0800, Darrell Ball wrote:
> Fragmentation handling is added to support conntrack.
> Both v4 and v6 are supported.
> 
> After discussion with several people, I decided not to store
> configuration state in the database: it is more consistent with the
> kernel going forward, it matches other conntrack configuration that
> will not be in the database either, and it is simpler overall.
> Accordingly, fragmentation handling is enabled by default.
> 
> This patch enables fragmentation tests for the userspace datapath.
> 
> Signed-off-by: Darrell Ball <dlu...@gmail.com>

Thanks for implementing this.

This could use more comments, especially on the data structures and
file-level variables, and at the level of individual functions.

Please don't invent yet another mutex.

ipf_print_reass_packet() seems like it could use ds_put_hex_dump() and
thereby be more readable (I don't really want to read 182 hex digits in
a row without any white space).
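
Something along these lines, maybe (untested; "pkt_data", "dump_len",
and "err_str" are stand-ins for whatever the function already has in
hand):

    struct ds ds = DS_EMPTY_INITIALIZER;
    /* 16 bytes per line, with an offset column. */
    ds_put_hex_dump(&ds, pkt_data, dump_len, 0, false);
    VLOG_WARN("%s\n%s", err_str, ds_cstr(&ds));
    ds_destroy(&ds);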

Why 91 bytes in ipf_print_reass_packet()?

All the counters seem to be write-only.

struct ipf_addr seems odd to me.  It has both aligned and unaligned
versions of addresses, which means that the overall struct needs to be
aligned, and it's a union nested in a struct instead of just a union.
ipf_addr_hash_add() implies that all the bytes in the struct need to be
initialized even if only some of them are used.
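
For contrast, a plain union seems like it would do the job (a sketch;
whether the unaligned variants are really needed depends on how the
addresses get copied out of the packet):

union ipf_addr {
    ovs_be32 ipv4;
    struct in6_addr ipv6;
};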

ipf_list_key_hash() seems to be at risk of hashing trailing padding at
the end of struct ipf_list_key.
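
One way to make that deterministic is to zero the whole key before
filling in its fields, so any padding bytes always hash the same way:

    struct ipf_list_key key;
    memset(&key, 0, sizeof key);   /* Zeroes padding bytes, too. */

Alternatively, hash the members individually instead of the raw bytes
of the struct.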

Does ipf_list_complete() run one past the end of the array?  Naively, it
seems like it might.

I found ipf_sort() a little hard to read mostly due to the long variable
names.  Here's an alternate form that you can accept or reject as you
like (I have not tested it):

static void
ipf_sort(struct ipf_frag *frags, size_t last_idx)
    OVS_REQUIRES(ipf_lock)
{
    /* Straight insertion sort on start_data_byte. */
    for (int i = 1; i <= last_idx; i++) {
        struct ipf_frag ipf_frag = frags[i];
        int j = i - 1;
        while (j >= 0
               && frags[j].start_data_byte > ipf_frag.start_data_byte) {
            frags[j + 1] = frags[j];
            j--;
        }
        frags[j + 1] = ipf_frag;
    }
}

ipf_reassemble_v4_frags() and ipf_reassemble_v6_frags() know the final
length of the packet they're assembling, but it doesn't look to me like
they preallocate the space for it.
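
For instance, once the final length is known (a sketch; "pkt" and
"total_len" are assumed names):

    /* Reserve tailroom for the whole datagram up front so that
     * appending each fragment's data never reallocates. */
    dp_packet_prealloc_tailroom(pkt, total_len - dp_packet_size(pkt));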

ipf_reassemble_v4_frags() and ipf_reassemble_v6_frags() calculate the
length of the L3 header manually and skip past it.  Couldn't they just
use the L4 header pointer?
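
That is, something like (untested):

    /* The L3 header length falls out of the existing pointers. */
    size_t l3_hdr_len = (char *) dp_packet_l4(pkt)
                        - (char *) dp_packet_l3(pkt);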

The ipf_list_key_cmp() interface is a little confusing because the
return value convention and the name are a little like a strcmp()ish
function, but it doesn't use a -1/+1 return value.  I'd rename it to
ipf_list_key_equals(), change the return type to "bool", and make true
mean equal and false mean unequal.
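
Something like this (which also assumes the keys are fully
initialized, padding included, as noted above):

static bool
ipf_list_key_equals(const struct ipf_list_key *a,
                    const struct ipf_list_key *b)
{
    return !memcmp(a, b, sizeof *a);
}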

Processing a fragment is O(n) in the number of existing fragments due to
the dup check.  I don't know whether this is important.

It looks like packet duplication can cause a fraglist to be marked
invalid.  I wonder whether this is good behavior.

It seems like we could estimate the number of fragments needed by
dividing the total size by the size of the first fragment received.
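
Something like this, once the total size is known (a sketch;
"total_len" and "first_frag_len" are assumed names):

    size_t est_nfrags = DIV_ROUND_UP(total_len, first_frag_len);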

Do we need xzalloc() for frag_list?  It seems like we're going to
initialize each element as needed anyway.
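
That is, plain xmalloc() would do (a sketch; "nfrags" is an assumed
name):

    frag_list = xmalloc(nfrags * sizeof *frag_list);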

I think there's a race if either v4 or v6 is ever disabled, because the
code checks whether fragment reassembly is enabled twice, once at a high
level and once again in ipf_handle_frag().  The latter function
assert-fails if reassembly is disabled, which seems like it could be a
problem.

At the same time, I don't see any way to ever disable fragment
reassembly.  The code appears to enable it by default and not provide
any way to disable it.

Is it common practice to enforce a minimum size for fragments?

Maybe it would be a little easier to make the counters a two-dimensional
array: atomic_uint64_t ipf_counters[2][N_COUNTERS], where one dimension
is IPv4/IPv6 and the other is the particular counter.  Then ipf_count()
would become trivial.  Just a thought, though.
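
Roughly (a sketch; the counter enum and its name are assumptions):

static atomic_uint64_t ipf_counters[2][N_COUNTERS];

static void
ipf_count(bool v6, enum ipf_counter_type cntr)
{
    uint64_t orig;
    atomic_add(&ipf_counters[v6][cntr], 1, &orig);
}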

Thanks, and I'll look forward to the next version.