Re: [vpp-dev] Static ARP Flag Question
On Thu, May 10, 2018 at 7:28 PM, John Lo (loj) wrote:
> Hi Jon,

Hi John,

> This is not the right behavior.

I had that suspicion... :-)

> I think it is caused by reuse of a static ARP entry in the IP4 neighbor
> pool with the static bit still set. The code merely sets the dynamic bit
> in the flags but leaves the static bit untouched (similarly for the
> static path) in arp.c, function vnet_arp_set_ip4_over_ethernet_internal():
>
>   e->time_last_updated = vlib_time_now (vm);
>   if (is_static)
>     e->flags |= ETHERNET_ARP_IP4_ENTRY_FLAG_STATIC;
>   else
>     e->flags |= ETHERNET_ARP_IP4_ENTRY_FLAG_DYNAMIC;

Ah, right. So it should always be one or the other, and never both. Right?

> I spotted another error in the function
> vnet_arp_flush_ip4_over_ethernet_internal():
>
>   if (e->flags & ETHERNET_ARP_IP4_ENTRY_FLAG_STATIC)
>     {
>       e->flags &= ETHERNET_ARP_IP4_ENTRY_FLAG_DYNAMIC;
>     }
>   else if (e->flags & ETHERNET_ARP_IP4_ENTRY_FLAG_DYNAMIC)
>     {
>       arp_entry_free (eai, e);
>     }
>
> I believe the "if static" path should be:
>
>   e->flags &= ~ETHERNET_ARP_IP4_ENTRY_FLAG_DYNAMIC;
>
> Would you like to submit a patch to fix them?

Sure! I will make a first effort and submit a patch!

jdl
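The stale-flag bug and the proposed fix can be sketched outside of VPP. The following is a toy model in Python (the real code is C in arp.c); the flag names mirror the C macros, but the bit positions and the Entry class are assumptions made purely for illustration:

```python
# Conceptual model of the flag handling discussed above. Bit values
# are illustrative, not the actual VPP definitions.
ETHERNET_ARP_IP4_ENTRY_FLAG_STATIC = 1 << 0
ETHERNET_ARP_IP4_ENTRY_FLAG_DYNAMIC = 1 << 1

class Entry:
    """Stand-in for a pooled ARP entry whose flags survive reuse."""
    def __init__(self):
        self.flags = 0

def update_entry_buggy(e, is_static):
    # Current behavior: only ORs in the new bit, so a reused pool
    # entry can end up with both STATIC and DYNAMIC set.
    if is_static:
        e.flags |= ETHERNET_ARP_IP4_ENTRY_FLAG_STATIC
    else:
        e.flags |= ETHERNET_ARP_IP4_ENTRY_FLAG_DYNAMIC

def update_entry_fixed(e, is_static):
    # Fixed behavior: clear the opposite bit so the entry is always
    # exactly one of static/dynamic, never both.
    if is_static:
        e.flags |= ETHERNET_ARP_IP4_ENTRY_FLAG_STATIC
        e.flags &= ~ETHERNET_ARP_IP4_ENTRY_FLAG_DYNAMIC
    else:
        e.flags |= ETHERNET_ARP_IP4_ENTRY_FLAG_DYNAMIC
        e.flags &= ~ETHERNET_ARP_IP4_ENTRY_FLAG_STATIC

# Reuse of a static entry's pool slot as a dynamic entry:
e = Entry()
update_entry_buggy(e, is_static=True)   # created static
update_entry_buggy(e, is_static=False)  # slot reused as dynamic
assert e.flags == (ETHERNET_ARP_IP4_ENTRY_FLAG_STATIC
                   | ETHERNET_ARP_IP4_ENTRY_FLAG_DYNAMIC)  # both set: bug

e2 = Entry()
update_entry_fixed(e2, is_static=True)
update_entry_fixed(e2, is_static=False)
assert e2.flags == ETHERNET_ARP_IP4_ENTRY_FLAG_DYNAMIC     # exactly one set
```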
Re: [vpp-dev] Packet tx functions via DPDK
Hi Prashant,

Hope you are doing fine. Regarding your question, I am not able to see the macswap plugin in the current master branch, but I will try to explain with respect to the dpdk plugin.

With respect to the low-level device, each VPP device driver registers:

1) An input node (for Rx) via VLIB_REGISTER_NODE (this you already figured out).
2) A Tx function via VNET_DEVICE_CLASS (), for a device class like "dpdk".

There are a couple more function pointers registered, but let's stick to the Rx/Tx part.

At startup, the low-level plugin/driver calls ethernet_register_interface(), which in turn calls vnet_register_interface(). In vnet_register_interface(), for a particular interface like an Intel 40G, an interface node is created at init time and the tx function of that node is copied from the device class's .tx_function. The node->tx and node->output functions are properly initialized and the node is registered.

The VPP stack sends packets to this low-level Tx node via sw_if_index. I am guessing the sw_if_index is determined by IPv4 routing or L2 switching. I think vnet_set_interface_output_node() is called for those interfaces (Tx path) whose device class does not provide a tx_function, but I am not sure.

"show vlib graph" will tell you how the nodes are arranged in the vpp graph.

To be specific for your question:

  next0 = hi0->output_node_next_index;

output_node_next_index is the index of the next node to which the current vector is handed (the transition from one node to another along the graph).

Note: All this I got through browsing the code; if this information is not correct, I request the VPP experts to correct it.

Thanks,
Nitin

On Thursday 10 May 2018 02:19 PM, Prashant Upadhyaya wrote:

Hi,

I am trying to walk through the code to see how a packet arrives into the system at the dpdk rx side and finally leaves it at the dpdk tx side. I am using the context of the macswap sample plugin for this. It is clear to me that dpdk-input is a graph node, and it is an 'input' type graph node, so it polls for the packets using dpdk functions.
The frame is then eventually passed to the sample plugin because the sample plugin inserts itself at the right place. The sample plugin queues the packets to the interface-output graph node.

So now I check the interface-output graph node function. I locate that in vpp/src/vnet/interface_output.c. The dispatch function for the graph node is vnet_per_buffer_interface_output. Here the interface-output node is queueing the packets to a next node based on the following code:

  hi0 = vnet_get_sup_hw_interface (vnm,
                                   vnet_buffer (b0)->sw_if_index[VLIB_TX]);
  next0 = hi0->output_node_next_index;

Now I am a little lost. What is this output_node_next_index? Which graph node function is really called for actually emitting the packet? Where exactly is this set up?

I do see that the actual dpdk tx burst function is called from tx_burst_vector_internal, which itself is called from dpdk_interface_tx (vpp/src/plugins/dpdk/device/device.c). But how the code reaches dpdk_interface_tx after the packets are queued from the interface-output graph node is not clear to me.

If somebody could help me connect the dots, that would be great.

Regards
-Prashant
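The registration-then-dispatch chain described in this thread can be modeled with a toy graph. This is a Python sketch, not VPP code: the Node and HwInterface classes are invented for illustration, and dpdk_interface_tx here merely stands in for the real tx burst path.

```python
# Toy model of how interface-output reaches the device tx node
# via output_node_next_index. All classes here are illustrative.
class Node:
    def __init__(self, name, fn):
        self.name, self.fn, self.next_nodes = name, fn, []

    def add_next(self, node):
        # Registering a next node returns its slot: the "next index".
        self.next_nodes.append(node)
        return len(self.next_nodes) - 1

sent = []

def dpdk_interface_tx(frame):
    # Stand-in for tx_burst_vector_internal -> the DPDK tx burst call.
    sent.extend(frame)

interface_output = Node("interface-output", None)
tx_node = Node("TenGigabitEthernet7/0/0-tx", dpdk_interface_tx)

class HwInterface:
    pass

hi0 = HwInterface()
# At vnet_register_interface() time, the per-interface tx node is
# created from the device class's .tx_function and wired in as a next
# node of interface-output; the slot is remembered on the hw interface:
hi0.output_node_next_index = interface_output.add_next(tx_node)

def interface_output_dispatch(hi0, frame):
    next0 = hi0.output_node_next_index      # the line from the thread
    interface_output.next_nodes[next0].fn(frame)

interface_output_dispatch(hi0, ["pkt0", "pkt1"])
assert sent == ["pkt0", "pkt1"]
```

So "which node is really called" is answered at interface-registration time: the next index stored on the hw interface selects the per-interface tx node, whose function came from the device class.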
Re: [vpp-dev] vpp_api_test via socket file
The underlying (C-code) VPP client API library supports one client connection. It's not conceptually difficult to support multiple connections, but it would take a lot of typing and testing. You can raise it as a feature request, but I wouldn't plan on seeing it any time soon.

D.

From: Peter Mikus -X (pmikus - PANTHEON TECHNOLOGIES at Cisco)
Sent: Friday, May 11, 2018 6:44 AM
To: Justin Iurman
Cc: Dave Barach (dbarach); Damjan Marion; vpp-dev
Subject: RE: [vpp-dev] vpp_api_test via socket file

Hello,

Thank you for the pointers. It seems to be working, although with a few notes:

1. It is not possible to keep both connections open:

  vpp1 = VPP(jsonfiles)
  r1 = vpp1.connect('vpp1', chroot_prefix='vpp1')
  print('VPP1 version', vpp1.api.show_version().version.decode().rstrip('\0x00'))
  vpp2 = VPP(jsonfiles)
  r2 = vpp2.connect('vpp2', chroot_prefix='vpp2')
  print('VPP2 version', vpp2.api.show_version().version.decode().rstrip('\0x00'))
  vpp1.disconnect()
  vpp2.disconnect()

  >> DEBUG:vpp_papi:No such message type or failed CRC checksum: cdp_enable_disable_4eec6097
  >> DEBUG:vpp_papi:No such message type or failed CRC checksum: cdp_enable_disable_reply_e8d4e804
  >> ('VPP1 version', u'18.04-release')
  >> WARNING:vpp_papi:VPP API client:: vac_connect:285: vl_client_api map rv -1
  >> WARNING:vpp_papi:VPP API client:: vac_connect:285: vl_client_api map rv -1
  >> Traceback (most recent call last):
  >>   File "pythonAPI.py", line 18, in <module>
  >>     r2 = vpp2.connect('vpp2', chroot_prefix='vpp')
  >>   File "/usr/lib/python2.7/dist-packages/vpp_papi.py", line 690, in connect
  >>     async)
  >>   File "/usr/lib/python2.7/dist-packages/vpp_papi.py", line 661, in connect_internal
  >>     raise IOError(2, 'Connect failed')
  >> IOError: [Errno 2] Connect failed
  >> DEBUG:vpp_papi:Cleaning up VPP on exit

On the other hand, this is possible:

  vpp1 = VPP(jsonfiles)
  r1 = vpp1.connect('vpp1', chroot_prefix='vpp1')
  print('VPP1 version', vpp1.api.show_version().version.decode().rstrip('\0x00'))
  vpp1.disconnect()
  vpp2 = VPP(jsonfiles)
  r2 = vpp2.connect('vpp2', chroot_prefix='vpp2')
  print('VPP2 version', vpp2.api.show_version().version.decode().rstrip('\0x00'))
  vpp2.disconnect()

  >> DEBUG:vpp_papi:No such message type or failed CRC checksum: cdp_enable_disable_4eec6097
  >> DEBUG:vpp_papi:No such message type or failed CRC checksum: cdp_enable_disable_reply_e8d4e804
  >> ('VPP1 version', u'18.04-release')
  >> DEBUG:vpp_papi:No such message type or failed CRC checksum: cdp_enable_disable_4eec6097
  >> DEBUG:vpp_papi:No such message type or failed CRC checksum: cdp_enable_disable_reply_e8d4e804
  >> ('VPP2 version', u'18.04-release')

2. print('VPP1', intf.interface_name.decode().rstrip('\0x00')) broke the output of interface 7/0/0:

  >> ('VPP1', u'TenGigabitEthernet7/0/')

3. DEBUG:vpp_papi:No such message type or failed CRC checksum: cdp_enable_disable_4eec6097
   DEBUG:vpp_papi:No such message type or failed CRC checksum: cdp_enable_disable_reply_e8d4e804

   Looks like some APIs need an update.

4. Connect() requires sudo, which means the script needs to run as root.

Peter Mikus
Engineer - Software
Cisco Systems Limited

From: Justin Iurman [mailto:justin.iur...@uliege.be]
Sent: Friday, May 11, 2018 9:59 AM
To: Peter Mikus -X (pmikus - PANTHEON TECHNOLOGIES at Cisco)
Cc: Dave Barach (dbarach); Damjan Marion; vpp-dev
Subject: Re: [vpp-dev] vpp_api_test via socket file

Peter,

> …however, are there any other options to fully control 2+ instances of VPP via API (not vppctl)?

PythonAPI for example [1]. Ole's answer to the same question:

  r = vpp.connect('vpp1', chroot_prefix='name of shared address segment')

Cheers,
Justin
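The truncated interface name in note 2 is explained by how str.rstrip works: it takes a *set of characters*, not a suffix. rstrip('\0x00') therefore strips any trailing '\x00', 'x' and '0' characters, which eats the last digit of "TenGigabitEthernet7/0/0". A minimal sketch (the NUL-padded name below imitates what a fixed-width binary API field would return):

```python
# str.rstrip(chars) removes trailing characters from the given *set*.
# '\0x00' is the four characters '\x00', 'x', '0', '0'.
name = 'TenGigabitEthernet7/0/0\x00\x00'

broken = name.rstrip('\0x00')   # strips the char set {'\x00', 'x', '0'}
fixed = name.rstrip('\x00')     # strips only the trailing NUL padding

assert broken == 'TenGigabitEthernet7/0/'    # the truncated output seen
assert fixed == 'TenGigabitEthernet7/0/0'
```

So the fix on the caller's side is rstrip('\x00') (or rstrip('\0')), which leaves interface 7/0/0 intact.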
Re: [vpp-dev] TCP performance - TSO - HW offloading in general.
Florin,

A few more comments about latency. Some numbers (in ms) in the table below. This is ping and iperf3 run concurrently; in the VPP case it is vppctl ping.

           Kernel w/ load  Kernel w/o load  VPP w/ load  VPP w/o load
  Min.       0.1920          0.0610          0.0573       0.03480
  1st Qu.    0.2330          0.1050          0.2058       0.04640
  Median     0.2450          0.1090          0.2289       0.04880
  Mean       0.2458          0.1153          0.2568       0.05096
  3rd Qu.    0.2720          0.1290          0.2601       0.05270
  Max.       0.2800          0.1740          0.6926       0.09420

In short: ICMP packets have a lower latency under load. I could interpret this as due to vectorization, maybe. Also, the Linux kernel is slower to reply to ping by a 2x factor (system call latency?): 115us vs 50us in VPP. With load there is no difference. In this test Linux TCP is using TSO.

While trying to use hping to get a latency sample with TCP instead of ICMP, we noticed that the VPP TCP stack does not reply with a RST, so we don't get any samples. Is that expected behavior?

Thanks
Luca

From: Luca Muscariello
Date: Thursday 10 May 2018 at 13:52
To: Florin Coras
Cc: Luca Muscariello, "vpp-dev@lists.fd.io"
Subject: Re: [vpp-dev] TCP performance - TSO - HW offloading in general.

MTU had no effect, just statistical fluctuations in the test reports. Sorry for misreporting the info.

We are exploiting vectorization, as we have a single memif channel per transport socket, so we can control the size of the batches dynamically. In theory, the size of outstanding data from the transport should be controlled in bytes for batching to be useful and not harmful, as frame sizes can vary a lot. But I'm not aware of a queue abstraction in DPDK to control that from VPP.

From: Florin Coras
Date: Wednesday 9 May 2018 at 18:23
To: Luca Muscariello
Cc: Luca Muscariello, "vpp-dev@lists.fd.io"
Subject: Re: [vpp-dev] TCP performance - TSO - HW offloading in general.
Hi Luca,

We don't yet support pmtu in the stack, so tcp uses a fixed 1460 mtu; unless you changed that, we shouldn't generate jumbo packets. If we do, I'll have to take a look at it :)

If you already had your transport protocol, using memif is the natural way to go. Using the session layer makes sense only if you can implement your transport within vpp in a way that leverages vectorization, or if it can leverage the existing transports (see for instance the TLS implementation).

Until today [1] the stack did allow for excessive batching (generation of multiple frames in one dispatch loop), but we're now restricting that to one. This is still far from proper pacing, which is on our todo list.

Florin

[1] https://gerrit.fd.io/r/#/c/12439/

On May 9, 2018, at 4:21 AM, Luca Muscariello (lumuscar) wrote:

Florin,

Thanks for the slide deck, I'll check it soon.

BTW, the VPP/DPDK test was using jumbo frames by default, so the TCP stack had a little advantage over the Linux TCP stack, which was using 1500B by default. By manually setting the DPDK MTU to 1500B the goodput goes down to 8.5Gbps, which compares to 4.5Gbps for Linux w/o TSO. Also, congestion window adaptation is not the same.

BTW, for what we're doing it is difficult to reuse the VPP session layer as it is. Our transport stack uses a different kind of namespace, and mux/demux is also different. We are using memif as the underlying driver, which does not seem to be a bottleneck, as we can also control batching there. Also, we have our own shared memory downstream of memif inside VPP through a plugin.

What we observed is that delay-based congestion control does not like VPP batching much (batching in general), and we are using DBCG. Linux TSO has the same problem, but it has TCP pacing to limit the bad effects of bursts on RTT/losses and flow control laws. I guess you're aware of these issues already.
Luca

From: Florin Coras
Date: Monday 7 May 2018 at 22:23
To: Luca Muscariello
Cc: Luca Muscariello, "vpp-dev@lists.fd.io"
Subject: Re: [vpp-dev] TCP performance - TSO - HW offloading in general.

Yes, the whole host stack uses shared memory segments and fifos that the session layer manages. For a brief description of the session layer see [1, 2]. Apart from that, unfortunately, we don't have any other dev documentation. src/vnet/session/segment_manager.[ch] has some good examples of how to allocate segments and fifos. Under application_interface.h, check app_[send|recv]_[stream|dgram]_raw for examples of how to read/write to the fifos.

Now, regarding the writing to the fifos:
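The fifo read/write pattern Florin points at can be illustrated with a toy ring buffer. This is a Python sketch of a single-producer/single-consumer byte fifo, invented here as a stand-in for the shared-memory fifos the session layer allocates; the real svm fifos are C structures in shared memory, so only the ring mechanics carry over.

```python
# Minimal SPSC byte fifo: fixed-size ring with head (consumer) and
# tail (producer) cursors. Short writes are returned, not blocked on.
class Fifo:
    def __init__(self, size):
        self.buf = bytearray(size)
        self.size = size
        self.head = 0   # consumer cursor
        self.tail = 0   # producer cursor
        self.used = 0   # bytes currently enqueued

    def enqueue(self, data):
        n = min(len(data), self.size - self.used)
        for i in range(n):
            self.buf[(self.tail + i) % self.size] = data[i]
        self.tail = (self.tail + n) % self.size
        self.used += n
        return n  # bytes actually written (may be fewer than requested)

    def dequeue(self, max_bytes):
        n = min(max_bytes, self.used)
        out = bytes(self.buf[(self.head + i) % self.size] for i in range(n))
        self.head = (self.head + n) % self.size
        self.used -= n
        return out

f = Fifo(8)
assert f.enqueue(b"hello world") == 8   # only 8 bytes fit: short write
assert f.dequeue(5) == b"hello"
assert f.enqueue(b"!!") == 2            # wraps around the ring
assert f.dequeue(10) == b" wo!!"
```

An application-side writer must handle the short-write case (retry when the consumer has drained the fifo), which is the behavior the app_send_*_raw helpers wrap.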
Re: [vpp-dev] How to add plugin's statistics into stats thread
Hi folks,

Are there any suggestions for this? Basically, we want an effective method to calculate statistics like pps or cps inside a plugin.

Thanks,
Mori

From: "Ni, Hongjun"
Date: Monday, May 7, 2018 16:21
To: "Jerome Tollet (jtollet)", vpp-dev
Cc: "Mori, Naoyuki", Yusuke Tatsumi
Subject: RE: [vpp-dev] How to add plugin's statistics into stats thread

Hi Jerome,

We would like to add LB plugin statistics, including per-VIP connections and per-AS connections for each VIP. The frequency is configurable; 1 second is better. The volume of data depends on the number of VIPs and ASs. Please refer to the patch below for details:

https://gerrit.fd.io/r/#/c/12146/2/src/plugins/lb/lb.api

Thank you,
Hongjun

From: Jerome Tollet (jtollet) [mailto:jtol...@cisco.com]
Sent: Monday, May 7, 2018 3:15 PM
To: Ni, Hongjun; vpp-dev
Cc: Mori, Naoyuki; Yusuke Tatsumi
Subject: Re: [vpp-dev] How to add plugin's statistics into stats thread

Hi Hongjun,

Could you elaborate a bit on the kind of statistics you'd like to create? Frequency and volume of data may be interesting inputs.

Jerome

From: "Ni, Hongjun"
Date: Monday, May 7, 2018 at 07:43
To: vpp-dev
Cc: "Mori, Naoyuki", Yusuke Tatsumi
Subject: [vpp-dev] How to add plugin's statistics into stats thread

Hi all,

We want to add a plugin's statistics into the VPP stats thread. But it seems that the current stats thread only supports codebase (i.e. vnet) statistics. Is there some mechanism to support adding a plugin's statistics into the stats thread?

Thanks,
Hongjun
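One common pattern for counters like the ones discussed in this thread is per-thread accumulation with collector-side aggregation: each worker increments its own slot with no locking, and a stats collector sums across threads at its own interval, deriving rates such as pps/cps from successive snapshots. The sketch below models that pattern in Python; it is not the VPP stats-thread API, and all names in it are illustrative.

```python
# Per-thread counters: lock-free increments on the hot path,
# aggregation only when the collector reads them.
N_THREADS = 4

class PerThreadCounter:
    def __init__(self, n_threads):
        self.per_thread = [0] * n_threads  # one slot per worker

    def inc(self, thread_index, n=1):
        self.per_thread[thread_index] += n  # hot path: plain add, no lock

    def total(self):
        return sum(self.per_thread)         # collector-side aggregation

conns = PerThreadCounter(N_THREADS)

# Workers increment independently (simulated here in a loop):
for t in range(N_THREADS):
    conns.inc(t, n=100 * (t + 1))

assert conns.total() == 1000

# The collector turns two snapshots into a rate (e.g. cps):
def rate(prev_total, cur_total, interval_s):
    return (cur_total - prev_total) / interval_s

assert rate(0, conns.total(), 1.0) == 1000.0
```

Keeping the per-thread slots separate is what makes the hot-path increment cheap; only the periodic collector pays the cost of summing.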