Re: [dpdk-dev] [PATCH v4 2/2] doc: add guide for debug and troubleshoot
Thanks Marko, I will spin v5 with the changes asap.

Note: just wondering why 'devtools/checkpatches.sh' did not report any error.

Thanks
Vipin Varghese

> -----Original Message-----
> From: Kovacevic, Marko
> Sent: Friday, January 18, 2019 8:59 PM
> To: Varghese, Vipin; dev@dpdk.org; shreyansh.j...@nxp.com; tho...@monjalon.net
> Cc: Mcnamara, John; Patel, Amol; Padubidri, Sanjay A
> Subject: RE: [PATCH v4 2/2] doc: add guide for debug and troubleshoot
>
> After checking the patch again I found a few spelling mistakes.
>
> > Add user guide on debug and troubleshoot for common issues and
> > bottleneck found in sample application model.
> >
> > Signed-off-by: Vipin Varghese
> > Acked-by: Marko Kovacevic
> > ---
> >  doc/guides/howto/debug_troubleshoot_guide.rst | 375 ++
> >  doc/guides/howto/index.rst                    |   1 +
> >  2 files changed, 376 insertions(+)
> >  create mode 100644 doc/guides/howto/debug_troubleshoot_guide.rst
> >
> <...>
>
> receieve / receive
>
> > +- If stats for RX and drops updated on same queue? check receieve
> > +  thread
> > +- If packet does not reach PMD? check if offload for port and queue
> > +  matches to traffic pattern send.
> > +
> <...>
>
> Offlaod / offload
>
> > +- Is the packet multi segmented? Check if port and queue offlaod is set.
> > +
> > +Are there object drops in producer point for ring?
> > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> <...>
>
> sufficent / sufficient
>
> > +- Are drops on specific socket? If yes check if there are sufficent
> > +  objects by rte_mempool_get_count() or rte_mempool_avail_count()
> > +- Is 'rte_mempool_get_count() or rte_mempool_avail_count()' zero?
> > +  application requires more objects hence reconfigure number of
> > +  elements in rte_mempool_create().
> > +- Is there single RX thread for multiple NIC? try having multiple
> > +  lcore to read from fixed interface or we might be hitting cache
> > +  limit, so increase cache_size for pool_create().
> > +
>
> Sceanrios / scenarios
>
> > +#. Is performance low for some sceanrios?
> > +- Check if sufficient objects in mempool by rte_mempool_avail_count()
> > +- Is failure seen in some packets? we might be getting packets with
> > +  'size > mbuf data size'.
> > +- Is NIC offload or application handling multi segment mbuf? check the
> > +  special packets are continuous with rte_pktmbuf_is_contiguous().
> > +- If there separate user threads used to access mempool objects, use
> > +  rte_mempool_cache_create() for non DPDK threads.
>
> debuging / debugging
>
> > +- Is the error reproducible with 1GB hugepage? If no, then try debuging
> > +  the issue with lookup table or objects with rte_mem_lock_page().
> > +
> > +.. note::
> > +   Stall in release of MBUF can be because
> <...>
>
> softwre / software
>
> > +- If softwre crypto is in use, check if the CRYPTO Library is build with
> > +  right (SIMD) flags or check if the queue pair using CPU ISA for
> > +  feature_flags AVX|SSE|NEON using rte_cryptodev_info_get()
>
> Assited / assisted
>
> > +- If its hardware assited crypto showing performance variance? Check if
> > +  hardware is on same NUMA socket as queue pair and session pool.
> > +
> <...>
>
> exceeeding / exceeding
>
> > +  core? registered functions may be exceeeding the desired time slots
> > +  while running on same service core.
> > +- Is function is running on RTE core? check if there are conflicting
> > +  functions running on same CPU core by rte_thread_get_affinity().
> > +
> <...>
>
> > +#. Where to capture packets?
> > +- Enable pdump in primary to allow secondary to access queue-pair for
> > +  ports. Thus packets are copied over in RX|TX callback by secondary
> > +  process using ring buffers.
> > +- To capture packet in middle of pipeline stage, user specific hooks
> > +  or callback are to be used to copy the packets. These packets can
>
> secodnary / secondary
>
> > +  be shared to secodnary process via user defined custom rings.
> > +
> > +Issue still persists?
> > +~~~~~~~~~~~~~~~~~~~~~
> > +
> > +#. Are there custom or vendor specific offload meta data?
> > +- From PMD, then check for META data error and drops.
> > +- From application, then check for META data error and drops.
> > +#. Is multiprocess is used configuration and data processing?
> > +- Check enabling or disabling features from secondary is supported or
> > +  not?
>
> Obejcts / objects
>
> > +#. Is there drops for certain scenario for packets or obejcts?
> > +- Check user private data in objects by dumping the details for debug.
> > +
> <...>
>
> Thanks,
> Marko K
Re: [dpdk-dev] [PATCH v4 2/2] doc: add guide for debug and troubleshoot
After checking the patch again I found a few spelling mistakes.

> Add user guide on debug and troubleshoot for common issues and
> bottleneck found in sample application model.
>
> Signed-off-by: Vipin Varghese
> Acked-by: Marko Kovacevic
> ---
>  doc/guides/howto/debug_troubleshoot_guide.rst | 375 ++
>  doc/guides/howto/index.rst                    |   1 +
>  2 files changed, 376 insertions(+)
>  create mode 100644 doc/guides/howto/debug_troubleshoot_guide.rst
<...>

receieve / receive

> +- If stats for RX and drops updated on same queue? check receieve
> +  thread
> +- If packet does not reach PMD? check if offload for port and queue
> +  matches to traffic pattern send.
> +
<...>

Offlaod / offload

> +- Is the packet multi segmented? Check if port and queue offlaod is set.
> +
> +Are there object drops in producer point for ring?
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<...>

sufficent / sufficient

> +- Are drops on specific socket? If yes check if there are sufficent
> +  objects by rte_mempool_get_count() or rte_mempool_avail_count()
> +- Is 'rte_mempool_get_count() or rte_mempool_avail_count()' zero?
> +  application requires more objects hence reconfigure number of
> +  elements in rte_mempool_create().
> +- Is there single RX thread for multiple NIC? try having multiple
> +  lcore to read from fixed interface or we might be hitting cache
> +  limit, so increase cache_size for pool_create().
> +

Sceanrios / scenarios

> +#. Is performance low for some sceanrios?
> +- Check if sufficient objects in mempool by rte_mempool_avail_count()
> +- Is failure seen in some packets? we might be getting packets with
> +  'size > mbuf data size'.
> +- Is NIC offload or application handling multi segment mbuf? check the
> +  special packets are continuous with rte_pktmbuf_is_contiguous().
> +- If there separate user threads used to access mempool objects, use
> +  rte_mempool_cache_create() for non DPDK threads.

debuging / debugging

> +- Is the error reproducible with 1GB hugepage? If no, then try debuging
> +  the issue with lookup table or objects with rte_mem_lock_page().
> +
> +.. note::
> +   Stall in release of MBUF can be because
<...>

softwre / software

> +- If softwre crypto is in use, check if the CRYPTO Library is build with
> +  right (SIMD) flags or check if the queue pair using CPU ISA for
> +  feature_flags AVX|SSE|NEON using rte_cryptodev_info_get()

Assited / assisted

> +- If its hardware assited crypto showing performance variance? Check if
> +  hardware is on same NUMA socket as queue pair and session pool.
> +
<...>

exceeeding / exceeding

> +  core? registered functions may be exceeeding the desired time slots
> +  while running on same service core.
> +- Is function is running on RTE core? check if there are conflicting
> +  functions running on same CPU core by rte_thread_get_affinity().
> +
<...>

> +#. Where to capture packets?
> +- Enable pdump in primary to allow secondary to access queue-pair for
> +  ports. Thus packets are copied over in RX|TX callback by secondary
> +  process using ring buffers.
> +- To capture packet in middle of pipeline stage, user specific hooks
> +  or callback are to be used to copy the packets. These packets can

secodnary / secondary

> +  be shared to secodnary process via user defined custom rings.
> +
> +Issue still persists?
> +~~~~~~~~~~~~~~~~~~~~~
> +
> +#. Are there custom or vendor specific offload meta data?
> +- From PMD, then check for META data error and drops.
> +- From application, then check for META data error and drops.
> +#. Is multiprocess is used configuration and data processing?
> +- Check enabling or disabling features from secondary is supported or
> +  not?

Obejcts / objects

> +#. Is there drops for certain scenario for packets or obejcts?
> +- Check user private data in objects by dumping the details for debug.
> +
<...>

Thanks,
Marko K
[dpdk-dev] [PATCH v4 2/2] doc: add guide for debug and troubleshoot
Add user guide on debug and troubleshoot for common issues and
bottleneck found in sample application model.

Signed-off-by: Vipin Varghese
Acked-by: Marko Kovacevic
---
 doc/guides/howto/debug_troubleshoot_guide.rst | 375 ++
 doc/guides/howto/index.rst                    |   1 +
 2 files changed, 376 insertions(+)
 create mode 100644 doc/guides/howto/debug_troubleshoot_guide.rst

diff --git a/doc/guides/howto/debug_troubleshoot_guide.rst b/doc/guides/howto/debug_troubleshoot_guide.rst
new file mode 100644
index 0..f2e337bb1
--- /dev/null
+++ b/doc/guides/howto/debug_troubleshoot_guide.rst
@@ -0,0 +1,375 @@
+.. SPDX-License-Identifier: BSD-3-Clause
+   Copyright(c) 2018 Intel Corporation.
+
+.. _debug_troubleshoot_via_pmd:
+
+Debug & Troubleshoot guide via PMD
+==================================
+
+DPDK applications can be designed to run as single thread simple stage to
+multiple threads with complex pipeline stages. These application can use poll
+mode devices which helps in offloading CPU cycles. A few models are
+
+ * single primary
+ * multiple primary
+ * single primary single secondary
+ * single primary multiple secondary
+
+In all the above cases, it is a tedious task to isolate, debug and understand
+odd behaviour which occurs randomly or periodically. The goal of guide is to
+share and explore a few commonly seen patterns and behaviour. Then, isolate
+and identify the root cause via step by step debug at various processing
+stages.
+
+Application Overview
+--------------------
+
+Let us take up an example application as reference for explaining issues and
+patterns commonly seen. The sample application in discussion makes use of
+single primary model with various pipeline stages. The application uses PMD
+and libraries such as service cores, mempool, pkt mbuf, event, crypto, QoS
+and eth.
+
+The overview of an application modeled using PMD is shown in
+:numref:`dtg_sample_app_model`.
+
+.. _dtg_sample_app_model:
+
+.. figure:: img/dtg_sample_app_model.*
+
+   Overview of pipeline stage of an application
+
+Bottleneck Analysis
+-------------------
+
+To debug the bottleneck and performance issues the desired application
+is made to run in an environment matching as below
+
+#. Linux 64-bit|32-bit
+#. DPDK PMD and libraries are used
+#. Libraries and PMD are either static or shared. But not both
+#. Machine flag optimizations of gcc or compiler are made constant
+
+Is there mismatch in packet rate (received < send)?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+RX Port and associated core :numref:`dtg_rx_rate`.
+
+.. _dtg_rx_rate:
+
+.. figure:: img/dtg_rx_rate.*
+
+   RX send rate compared against Received rate
+
+#. Are generic configuration correct?
+- What is port Speed, Duplex? rte_eth_link_get()
+- Are packets of higher sizes are dropped? rte_eth_get_mtu()
+- Are only specific MAC received? rte_eth_promiscuous_get()
+
+#. Are there NIC specific drops?
+- Check rte_eth_rx_queue_info_get() for nb_desc and scattered_rx
+- Is RSS enabled? rte_eth_dev_rss_hash_conf_get()
+- Are packets spread on all queues? rte_eth_dev_stats()
+- If stats for RX and drops updated on same queue? check receieve thread
+- If packet does not reach PMD? check if offload for port and queue
+  matches to traffic pattern send.
+
+#. If problem still persists, this might be at RX lcore thread
+- Check if RX thread, distributor or event rx adapter? these may be
+  processing less than required
+- Is the application is build using processing pipeline with RX stage? If
+  there are multiple port-pair tied to a single RX core, try to debug by
+  using rte_prefetch_non_temporal(). This will intimate the mbuf in cache
+  is temporary.
+
+Are there packet drops (receive|transmit)?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+RX-TX Port and associated cores :numref:`dtg_rx_tx_drop`.
+
+.. _dtg_rx_tx_drop:
+
+.. figure:: img/dtg_rx_tx_drop.*
+
+   RX-TX drops
+
+#. At RX
+- Get RX queue count? nb_rx_queues using rte_eth_dev_info_get()
+- Are there miss, errors, qerros? rte_eth_dev_stats() for imissed,
+  ierrors, q_erros, rx_nombuf, rte_mbuf_ref_count
+
+#. At TX
+- Are you doing in bulk TX? check application for TX descriptor overhead.
+- Are there TX errors? rte_eth_dev_stats() for oerrors and qerros
+- Is specific scenarios not releasing mbuf? check rte_mbuf_ref_count of
+  those packets.
+- Is the packet multi segmented? Check if port and queue offlaod is set.
+
+Are there object drops in producer point for ring?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Producer point for ring :numref:`dtg_producer_ring`.
+
+.. _dtg_producer_ring:
+
+.. figure:: img/dtg_producer_ring.*
+
+   Producer point for Rings
+
+#. Performance for Producer
+- Fetch the type of RING 'rte_ring_dump()' for flags (RING_F_SP_ENQ)
+- If '(burst enqueue - actual enqueue) > 0'