On 3/7/24 14:22, Kevin Traynor wrote: > On 05/03/2024 17:31, Ilya Maximets wrote: >> On 3/5/24 18:13, Kevin Traynor wrote: >>> On 05/03/2024 15:22, Ilya Maximets wrote: >>>> It's hard to debug situations where driver rejects packets for some >>>> reason. Dumping out the mbuf should help with debugging. >>>> >>>> Sample output looks like this: >>>> >>>> |netdev_dpdk(pmd-c03/id:8)|DBG|ovs-p1: Invalid packet: >>>> dump mbuf at 0x1180bce140, iova=0x2cb7ce400, buf_len=2176 >>>> pkt_len=64, ol_flags=0x2, nb_segs=1, port=65535, ptype=0 >>>> segment at 0x1180bce140, data=0x1180bce580, len=90, off=384, refcnt=1 >>>> Dump data at [0x1180bce580], len=64 >>>> 00000000: 33 33 00 00 00 16 AA 27 91 F9 4D 96 86 DD 60 00 | >>>> 33.....'..M...`. >>>> 00000010: 00 00 00 24 00 01 00 00 00 00 00 00 00 00 00 00 | >>>> ...$............ >>>> 00000020: 00 00 00 00 00 00 FF 02 00 00 00 00 00 00 00 00 | >>>> ................ >>>> 00000030: 00 00 00 00 00 16 3A 00 05 02 00 00 01 00 8F 00 | >>>> ......:......... >>>> >>> >>> Hi Ilya, >>> >>> Thanks for the patch, that could be useful. One comment below. >>> >>>> Signed-off-by: Ilya Maximets <[email protected]> >>>> --- >>>> lib/netdev-dpdk.c | 33 +++++++++++++++++++++++++++++++++ >>>> 1 file changed, 33 insertions(+) >>>> >>>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c >>>> index 45f61930d..662b41d5c 100644 >>>> --- a/lib/netdev-dpdk.c >>>> +++ b/lib/netdev-dpdk.c >>>> @@ -2664,6 +2664,35 @@ netdev_dpdk_prep_hwol_batch(struct netdev_dpdk >>>> *dev, struct rte_mbuf **pkts, >>>> return cnt; >>>> } >>>> >>>> +static void >>>> +netdev_dpdk_mbuf_dump(const char *prefix, const char *message, >>>> + const struct rte_mbuf *mbuf) >>>> +{ >>>> + static struct vlog_rate_limit dump_rl = VLOG_RATE_LIMIT_INIT(5, 5); >>>> + char *response = NULL; >>>> + FILE *stream; >>>> + size_t size; >>>> + >>>> + if (VLOG_DROP_DBG(&dump_rl)) { >>>> + return; >>>> + } >>>> + >>>> + stream = open_memstream(&response, &size); >>>> + if (!stream) { >>>> + VLOG_ERR("Unable to open memstream for mbuf dump: %s.", >>>> + ovs_strerror(errno)); >>>> + return; >>>> + } >>>> + >>>> + rte_pktmbuf_dump(stream, mbuf, rte_pktmbuf_pkt_len(mbuf)); >>>> + >>>> + fclose(stream); >>>> + >>>> + VLOG_DBG(prefix ? "%s: %s:\n%s" : "%s%s:\n%s", >>>> + prefix ? prefix : "", message, response); >>>> + free(response); >>>> +} >>>> + >>>> /* Tries to transmit 'pkts' to txq 'qid' of device 'dev'. Takes >>>> ownership of >>>> * 'pkts', even in case of failure. >>>> * >>>> @@ -2680,6 +2709,10 @@ netdev_dpdk_eth_tx_burst(struct netdev_dpdk *dev, >>>> int qid, >>>> VLOG_WARN_RL(&rl, "%s: Output batch contains invalid packets. " >>>> "Only %u/%u are valid: %s", >>>> netdev_get_name(&dev->up), >>>> nb_tx_prep, cnt, rte_strerror(rte_errno)); >>>> + for (size_t i = nb_tx_prep; i < cnt; i++) { >>>> + netdev_dpdk_mbuf_dump(netdev_get_name(&dev->up), >>>> + "Invalid packet", pkts[i]); >>> >>> rte_eth_tx_prepare() stops when it finds an invalid packet, so we don't >>> know the status of subsequent packets except that they will not be sent. >>> It might be confusing if we dump them and label as invalid. >> >> Ah, interesting. I just assumed that since netdev_dpdk_eth_tx_burst() >> is calling it once, it will re-order the packets moving the bad ones >> to the end like some other functions do. But you're right, that's not >> the case. And the existing log message is actually not right either. >> > > > Yeah, maybe 's/are valid/are sent' would be more appropriate for that > log message
They are not actually sent yet and may be dropped later. So, maybe 'will be sent' ? Or even "N out of M packets will not be sent." > >>> >>> How about only dump the known invalid packet or else replace "Invalid >>> packet" with, i == nb_tx_prep ? "Invalid packet" : "Not sent" >>> >>> (maybe a better term than "Not sent") >>> >>> What do you think ? >> >> I don't think we need to dump the 'not sent' packets at all. We don't >> know if they are bad or not, so dumping them doesn't give any meaningful >> information. We can just dump the "First invalid packet", I suppose. >> It is the one that failed the check. E.g.: >> >> netdev_dpdk_mbuf_dump(netdev_get_name(&dev->up), >> "First invalid packet", pkts[nb_tx_prep]); >> > > LGTM > >> In general, we should call the rte_eth_tx_prepare() multiple times >> re-ordering the packets between calls, so we don't drop packets that are >> actually good. But, the probability of only a few packets in a batch to >> be bad is very low, so it's probably fine to keep the code as it is. >> >> What do you think? >> > > Not sure really. My initial thought was that repeated calls to > rte_eth_tx_prepare() and shuffling of packets around (or feeding > sub-parts of the array to multiple calls of rte_eth_tx_burst) might be > too much work and better to just drop and move on to processing the next > batch of packets. > > But OTOH, it does seem more correct to at least attempt to send any > subsequent packets that are valid. > > I'd suggest to take this patch with the changes discussed above as an > independent patch anyway. OK. I'll submit a new version. I will not touch any existing code, only add dumping of the first actually invalid packet. > > I can submit a follow-on RFC to process the subsequent packets (or you > can if you prefer), if people don't think that's a terrible idea. The only concern I have is it probably not worth it if i will affect performance in a general case. If it doesn't, then it should be fine. Best regards, Ilya Maximets. _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
