On 05/03/2024 17:31, Ilya Maximets wrote: > On 3/5/24 18:13, Kevin Traynor wrote: >> On 05/03/2024 15:22, Ilya Maximets wrote: >>> It's hard to debug situations where driver rejects packets for some >>> reason. Dumping out the mbuf should help with debugging. >>> >>> Sample output looks like this: >>> >>> |netdev_dpdk(pmd-c03/id:8)|DBG|ovs-p1: Invalid packet: >>> dump mbuf at 0x1180bce140, iova=0x2cb7ce400, buf_len=2176 >>> pkt_len=64, ol_flags=0x2, nb_segs=1, port=65535, ptype=0 >>> segment at 0x1180bce140, data=0x1180bce580, len=90, off=384, refcnt=1 >>> Dump data at [0x1180bce580], len=64 >>> 00000000: 33 33 00 00 00 16 AA 27 91 F9 4D 96 86 DD 60 00 | >>> 33.....'..M...`. >>> 00000010: 00 00 00 24 00 01 00 00 00 00 00 00 00 00 00 00 | >>> ...$............ >>> 00000020: 00 00 00 00 00 00 FF 02 00 00 00 00 00 00 00 00 | >>> ................ >>> 00000030: 00 00 00 00 00 16 3A 00 05 02 00 00 01 00 8F 00 | >>> ......:......... >>> >> >> Hi Ilya, >> >> Thanks for the patch, that could be useful. One comment below. >> >>> Signed-off-by: Ilya Maximets <[email protected]> >>> --- >>> lib/netdev-dpdk.c | 33 +++++++++++++++++++++++++++++++++ >>> 1 file changed, 33 insertions(+) >>> >>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c >>> index 45f61930d..662b41d5c 100644 >>> --- a/lib/netdev-dpdk.c >>> +++ b/lib/netdev-dpdk.c >>> @@ -2664,6 +2664,35 @@ netdev_dpdk_prep_hwol_batch(struct netdev_dpdk *dev, >>> struct rte_mbuf **pkts, >>> return cnt; >>> } >>> >>> +static void >>> +netdev_dpdk_mbuf_dump(const char *prefix, const char *message, >>> + const struct rte_mbuf *mbuf) >>> +{ >>> + static struct vlog_rate_limit dump_rl = VLOG_RATE_LIMIT_INIT(5, 5); >>> + char *response = NULL; >>> + FILE *stream; >>> + size_t size; >>> + >>> + if (VLOG_DROP_DBG(&dump_rl)) { >>> + return; >>> + } >>> + >>> + stream = open_memstream(&response, &size); >>> + if (!stream) { >>> + VLOG_ERR("Unable to open memstream for mbuf dump: %s.", >>> + ovs_strerror(errno)); >>> + return; >>> + } >>> + >>> + rte_pktmbuf_dump(stream, mbuf, rte_pktmbuf_pkt_len(mbuf)); >>> + >>> + fclose(stream); >>> + >>> + VLOG_DBG(prefix ? "%s: %s:\n%s" : "%s%s:\n%s", >>> + prefix ? prefix : "", message, response); >>> + free(response); >>> +} >>> + >>> /* Tries to transmit 'pkts' to txq 'qid' of device 'dev'. Takes ownership >>> of >>> * 'pkts', even in case of failure. >>> * >>> @@ -2680,6 +2709,10 @@ netdev_dpdk_eth_tx_burst(struct netdev_dpdk *dev, >>> int qid, >>> VLOG_WARN_RL(&rl, "%s: Output batch contains invalid packets. " >>> "Only %u/%u are valid: %s", netdev_get_name(&dev->up), >>> nb_tx_prep, cnt, rte_strerror(rte_errno)); >>> + for (size_t i = nb_tx_prep; i < cnt; i++) { >>> + netdev_dpdk_mbuf_dump(netdev_get_name(&dev->up), >>> + "Invalid packet", pkts[i]); >> >> rte_eth_tx_prepare() stops when it finds an invalid packet, so we don't >> know the status of subsequent packets except that they will not be sent. >> It might be confusing if we dump them and label as invalid. > > Ah, interesting. I just assumed that since netdev_dpdk_eth_tx_burst() > is calling it once, it will re-order the packets moving the bad ones > to the end like some other functions do. But you're right, that's not > the case. And the existing log message is actually not right either. >
Yeah, maybe 's/are valid/are sent' would be more appropriate for that log message >> >> How about only dump the known invalid packet or else replace "Invalid >> packet" with, i == nb_tx_prep ? "Invalid packet" : "Not sent" >> >> (maybe a better term than "Not sent") >> >> What do you think ? > > I don't think we need to dump the 'not sent' packets at all. We don't > know if they are bad or not, so dumping them doesn't give any meaningful > information. We can just dump the "First invalid packet", I suppose. > It is the one that failed the check. E.g.: > > netdev_dpdk_mbuf_dump(netdev_get_name(&dev->up), > "First invalid packet", pkts[nb_tx_prep]); > LGTM > In general, we should call the rte_eth_tx_prepare() multiple times > re-ordering the packets between calls, so we don't drop packets that are > actually good. But, the probability of only a few packets in a batch to > be bad is very low, so it's probably fine to keep the code as it is. > > What do you think? > Not sure really. My initial thought was that repeated calls to rte_eth_tx_prepare() and shuffling of packets around (or feeding sub-parts of the array to multiple calls of rte_eth_tx_burst) might be too much work and better to just drop and move on to processing the next batch of packets. But OTOH, it does seem more correct to at least attempt to send any subsequent packets that are valid. I'd suggest to take this patch with the changes discussed above as an independent patch anyway. I can submit a follow-on RFC to process the subsequent packets (or you can if you prefer), if people don't think that's a terrible idea. thanks, Kevin. > Best regards, Ilya Maximets. > _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
