[Differential] [Commented On] D1761: Extend LRO support to accumulate more than 65535 bytes
hselasky added a comment. lstewart: We can generate a paper documenting the benefits of enlarging the IP-packet input payload, so that we can fully understand what is going on. Going the multipacket approach seems a bit more tricky, hence it involves changing the TCP and posibly also if_output() calls needs to handle this aswell, though not impossible. lstewart: What should an mbuf accessor that computes the total pktheader payload for multiple mbufs be called? m_pkthdr_chain_length(m) ?? REPOSITORY rS FreeBSD src repository REVISION DETAIL https://reviews.freebsd.org/D1761 EMAIL PREFERENCES https://reviews.freebsd.org/settings/panel/emailpreferences/ To: hselasky, rrs, glebius, gnn, emaste, rwatson, bz, imp, np, jfv, adrian, lstewart Cc: imp, freebsd-net-list ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
[Differential] [Commented On] D1761: Extend LRO support to accumulate more than 65535 bytes
hselasky added a comment. lawrence: It is someone well known to be using FreeBSD. This patch makes such a big difference when applied to +10Gbit/s connections that we can run 2 TCP streams totalling 37.5 GBit/s on a single 2.x GHz CPU core instead of only one. REPOSITORY rS FreeBSD src repository REVISION DETAIL https://reviews.freebsd.org/D1761 EMAIL PREFERENCES https://reviews.freebsd.org/settings/panel/emailpreferences/ To: hselasky, rrs, glebius, gnn, emaste, rwatson, bz, imp, np, jfv, adrian, lstewart Cc: imp, freebsd-net-list ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
[Differential] [Commented On] D1761: Extend LRO support to accumulate more than 65535 bytes
lstewart added a comment. Ok, but that's anecdotal and gives us reviewers nothing to go on - without any methodology or raw data who knows whether the LRO change is solely responsible for the improvement and if it introduced any undesired side effects. It's also possible that with tuning, the same results could have been obtained without the jumbo LRO change. As there seems to be some sensitivity around sharing specific details from field deployments which is fine, the path forward is therefore for you and/or Mellanox test engineers to run experiments, capture + analyse data and present it for discussion. You should provide your methodology so anyone wanting to replicate your experiments and results can do so. That being said, I personally feel the energy would be better spent on batching, which would allow a tunable number of 64k correctly formed packets to be passed up the stack which should give 99% of the benefits of this work without the hackiness, plus gives us a win in many other workloads when LRO is unavailable or not used. REPOSITORY rS FreeBSD src repository REVISION DETAIL https://reviews.freebsd.org/D1761 EMAIL PREFERENCES https://reviews.freebsd.org/settings/panel/emailpreferences/ To: hselasky, rrs, glebius, gnn, emaste, rwatson, bz, imp, np, jfv, adrian, lstewart Cc: imp, freebsd-net-list ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
[Differential] [Commented On] D1761: Extend LRO support to accumulate more than 65535 bytes
lstewart added a comment. I hope I didn't delete it... from what I could see online, the Abandon Phabricator action is the means by which a reviewer indicates they have permanently rejected the patch (as opposed to suggesting changes). As to people using the patch, can you say who and why? REPOSITORY rS FreeBSD src repository REVISION DETAIL https://reviews.freebsd.org/D1761 EMAIL PREFERENCES https://reviews.freebsd.org/settings/panel/emailpreferences/ To: hselasky, rrs, glebius, gnn, emaste, rwatson, bz, imp, np, jfv, adrian, lstewart Cc: imp, freebsd-net-list ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
[Differential] [Commented On] D1761: Extend LRO support to accumulate more than 65535 bytes
hselasky added a comment. lstewart: OK, just don't delete this patch, because some people are using it. REPOSITORY rS FreeBSD src repository REVISION DETAIL https://reviews.freebsd.org/D1761 EMAIL PREFERENCES https://reviews.freebsd.org/settings/panel/emailpreferences/ To: hselasky, rrs, glebius, gnn, emaste, rwatson, bz, imp, np, jfv, adrian, lstewart Cc: imp, freebsd-net-list ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
[Differential] [Commented On] D1761: Extend LRO support to accumulate more than 65535 bytes
adrian added a comment. I don't think we should be using the hash bits like this. There's even more hash types to add (all the variations on symmetric and non-RSS hashing; hashing XOR versus RSS, different fields, etc). I'd rather we don't make things more complicated by having the flowid/hashtype fields get more overloaded than they already are. Let's see if we can do something more sensible by adding more flags; we can always look at growing the mbuf to include these new features. As I said before, there will be more features than this one coming in the pipeline so we will be better off if we list stuff that's in the pipeline so we can all figure out what new bits and what shuffling we need to do. There's also robert's mbuf shuffling that needs to finish; once that's finished we may be in a much better situation to add fields. :-) REVISION DETAIL https://reviews.freebsd.org/D1761 To: hselasky, rrs, glebius, gnn, emaste, lstewart, rwatson, bz, imp, np, adrian, jfvogel Cc: freebsd-net ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
[Differential] [Commented On] D1761: Extend LRO support to accumulate more than 65535 bytes
hselasky added a comment. Hi, Adrian: Would a set of macros hiding how we set and clear the LRO_TCP flag be acceptable, then we can later resolve that minor detail, exactly how the bit is encoded? #define M_SET_LRO_TCP(m) ... #define M_CLR_LRO_TCP(m) ... #define M_GET_LRO_TCP(m) ... --HPS REVISION DETAIL https://reviews.freebsd.org/D1761 To: hselasky, rrs, glebius, gnn, emaste, lstewart, rwatson, bz, imp, np, adrian Cc: freebsd-net ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
[Differential] [Commented On] D1761: Extend LRO support to accumulate more than 65535 bytes
adrian added a comment. I think we should talk to rwatson about his mbuf hijinx and see about extending flags. This isn't the only option to add. REVISION DETAIL https://reviews.freebsd.org/D1761 To: hselasky, rrs, glebius, gnn, emaste, lstewart, rwatson, bz, imp, np, adrian Cc: freebsd-net ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
[Differential] [Commented On] D1761: Extend LRO support to accumulate more than 65535 bytes
hselasky added a comment. Adrian: We do have one bit left after M_FLOWID was removed. Should I use it for this? And what would be an appropriate name. M_LENGTH_OK ? REVISION DETAIL https://reviews.freebsd.org/D1761 To: hselasky, rrs, glebius, gnn, emaste, lstewart, rwatson, bz, imp, np, adrian Cc: freebsd-net ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
[Differential] [Commented On] D1761: Extend LRO support to accumulate more than 65535 bytes
rwatson added a comment. np@'s comments are good ones, both with respect to ACK clocking and BPF -- but this will also affect local firewalls that do state tracking and/or transformation, and likewise will run into problems. Given this discussion, I'm increasingly convinced that this is Not A Good Idea In Its Current Form -- far more thought is required about how we might want to handle this. I would recommend we set aside time at a BSDCan devsummit session to talk about the implications of VLRO for the stack as a whole. Given our default direct-dispatch input configuration, presumably much of the benefit from VLRO comes from reducing the number of tcpcb lookups, and a bit more comes from avoiding reassembly costs. I do wonder if we might choose to present VLRO in a different way: rather than as a single, very large datagram, but instead as a series of datagrams that the lower layer promises are for the same TCP connection, and sequential, allowing a single lookup and reassembly avoidance, but avoiding diverging from the current assumption that mbuf chains correspond to a single IP datagram that meets its invariant properties. We could easily assert that the IP addresses match and that the segments are sequential, allowing us to catch bugs while making strong assumptions. REVISION DETAIL https://reviews.freebsd.org/D1761 To: hselasky, rmacklem, rrs, glebius, gnn, emaste, imp, adrian, bz, rwatson, lstewart Cc: np, freebsd-net ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
[Differential] [Commented On] D1761: Extend LRO support to accumulate more than 65535 bytes
hselasky added a comment. See comments added. INLINE COMMENTS sys/netinet/ip_output.c:129 Because the variable is later on compared to the mtu which is also an int. Else you will get a compile warning about signed comparison mismatch. Same goes for other location where int is used. sys/netinet/tcp_lro.c:233 This is a dummy value which should not be used in the new LRO implementation. 65535 is chosen to make the packet drop as invalid in case some code that is not updated, if any, decides to check the p_len with m_pkthdr.len. REVISION DETAIL https://reviews.freebsd.org/D1761 To: hselasky, rmacklem, rrs, glebius, gnn, emaste, imp, adrian, bz, rwatson Cc: freebsd-net ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
[Differential] [Commented On] D1761: Extend LRO support to accumulate more than 65535 bytes
hselasky added a comment. rwatson: The LRO code ensures that all trailing and padding is stripped. This happens both before and after my change. REVISION DETAIL https://reviews.freebsd.org/D1761 To: hselasky, rmacklem, rrs, glebius, gnn, emaste, imp, adrian, bz, rwatson Cc: freebsd-net ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
[Differential] [Commented On] D1761: Extend LRO support to accumulate more than 65535 bytes
hselasky added a comment. adrian: I'm not in a position to change the IP header structure to support 32-bit payload length. That's why m_pkthdr.len was chosen. REVISION DETAIL https://reviews.freebsd.org/D1761 To: hselasky, rmacklem, rrs, glebius, gnn, emaste, imp, adrian, bz, rwatson Cc: freebsd-net ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
[Differential] [Commented On] D1761: Extend LRO support to accumulate more than 65535 bytes
adrian added a comment. *laugh* I know you're not; I'm just worried that we'll have lots of subtle bugs here and there with the semantics of sometimes the length is in ip_hdr; sometimes the length is in m_pkthdr; and there's no clear definition of when we should check both. If we had a clearer definition of that and we had accessor macros that abstracted that away, it'll make things clearer and it'll give us a place to put assertions about our assumptions. REVISION DETAIL https://reviews.freebsd.org/D1761 To: hselasky, rmacklem, rrs, glebius, gnn, emaste, imp, adrian, bz, rwatson Cc: freebsd-net ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
[Differential] [Commented On] D1761: Extend LRO support to accumulate more than 65535 bytes
adrian added a comment. .. and yes, I don't think you should be using the hashtype field to flag LRO information. They're totally separate things. (It's totally feasible that a NIC doing 64KiB LRO will reassemble frames with correct RSS information and feed it to an RSS kernel, which can then ensure it's running on the correct cpuset with the TCP state and locks.) REVISION DETAIL https://reviews.freebsd.org/D1761 To: hselasky, rmacklem, rrs, glebius, gnn, emaste, imp, adrian, bz, rwatson, lstewart Cc: np, freebsd-net ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org