Re: [vpp-dev] Static ARP Flag Question

2018-05-11 Thread Jon Loeliger
On Thu, May 10, 2018 at 7:28 PM, John Lo (loj)  wrote:

> Hi Jon,
>

Hi John,


> This is not the right behavior.
>

I had that suspicion... :-)

> I think it is caused by reuse of a static ARP entry in the IP4 neighbor
> pool with the static bit still set.  The code merely sets the dynamic bit in
> the flags but leaves the static bit untouched (and similarly for the static
> path) in the arp.c function vnet_arp_set_ip4_over_ethernet_internal ():
>
>   e->time_last_updated = vlib_time_now (vm);
>   if (is_static)
>     e->flags |= ETHERNET_ARP_IP4_ENTRY_FLAG_STATIC;
>   else
>     e->flags |= ETHERNET_ARP_IP4_ENTRY_FLAG_DYNAMIC;

Ah, right.  So it should always be one or the other, and never both.  Right?

> I spotted another error in the function
> vnet_arp_flush_ip4_over_ethernet_internal ():
>
>   if (e->flags & ETHERNET_ARP_IP4_ENTRY_FLAG_STATIC)
>     {
>       e->flags &= ETHERNET_ARP_IP4_ENTRY_FLAG_DYNAMIC;
>     }
>   else if (e->flags & ETHERNET_ARP_IP4_ENTRY_FLAG_DYNAMIC)
>     {
>       arp_entry_free (eai, e);
>     }
>
> I believe the "if static" path should be:
>
>   e->flags &= ~ETHERNET_ARP_IP4_ENTRY_FLAG_DYNAMIC;
>
> Would you like to submit a patch to fix them?
>

Sure!  I will make a first-effort and submit a patch!
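
Roughly, I am picturing something like this (an untested sketch based only on
your snippets above; clearing the opposite flag in the set path is my own
guess at the fix, not existing code):

  /* vnet_arp_set_ip4_over_ethernet_internal(): keep the static and dynamic
   * flags mutually exclusive when a pool entry is (re)used */
  e->time_last_updated = vlib_time_now (vm);
  if (is_static)
    {
      e->flags |= ETHERNET_ARP_IP4_ENTRY_FLAG_STATIC;
      e->flags &= ~ETHERNET_ARP_IP4_ENTRY_FLAG_DYNAMIC;
    }
  else
    {
      e->flags |= ETHERNET_ARP_IP4_ENTRY_FLAG_DYNAMIC;
      e->flags &= ~ETHERNET_ARP_IP4_ENTRY_FLAG_STATIC;
    }

  /* vnet_arp_flush_ip4_over_ethernet_internal(): clear, rather than mask
   * with, the dynamic bit for static entries */
  if (e->flags & ETHERNET_ARP_IP4_ENTRY_FLAG_STATIC)
    {
      e->flags &= ~ETHERNET_ARP_IP4_ENTRY_FLAG_DYNAMIC;
    }
  else if (e->flags & ETHERNET_ARP_IP4_ENTRY_FLAG_DYNAMIC)
    {
      arp_entry_free (eai, e);
    }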

jdl


Re: [vpp-dev] Packet tx functions via DPDK

2018-05-11 Thread Nitin Saxena

Hi Prashant,

Hope you are doing fine.

Regarding your question, I am not able to see the macswap plugin in the current
master branch, but I will try to explain with respect to the dpdk plugin:


With respect to the low-level device, each VPP device driver registers:

1) An INPUT_NODE (for Rx) via VLIB_REGISTER_NODE (this you already figured out)
2) A Tx function via VNET_DEVICE_CLASS (), for a device class like "dpdk"

There are a couple more function pointers registered as well, but let's stick to
the Rx/Tx part; a rough sketch of the two registrations follows below.
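
To make that concrete, a minimal sketch of the two registrations (everything
prefixed with my_ or named "mydev" is a placeholder, not taken from the dpdk
plugin itself):

  #include <vlib/vlib.h>
  #include <vnet/vnet.h>

  /* Rx: an input node polled by the main dispatch loop, like dpdk-input. */
  static uword
  my_input_node_fn (vlib_main_t * vm, vlib_node_runtime_t * node,
                    vlib_frame_t * frame)
  {
    /* poll the device, enqueue received buffers to next nodes */
    return 0;
  }

  VLIB_REGISTER_NODE (my_input_node) = {
    .function = my_input_node_fn,
    .name = "mydev-input",
    .type = VLIB_NODE_TYPE_INPUT,
    .state = VLIB_NODE_STATE_POLLING,
  };

  /* Tx: the device class carries the tx function for the interface node. */
  static uword
  my_interface_tx (vlib_main_t * vm, vlib_node_runtime_t * node,
                   vlib_frame_t * frame)
  {
    /* hand the frame's buffers to the device's transmit ring */
    return frame->n_vectors;
  }

  VNET_DEVICE_CLASS (my_device_class) = {
    .name = "mydev",
    .tx_function = my_interface_tx,
  };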


As part of startup, the low-level plugin/driver calls
ethernet_register_interface(), which in turn calls
vnet_register_interface().


vnet_register_interface:
For a particular interface, like an Intel 40G port, an interface node is
created at init time, and the tx function of that node is copied from
VNET_DEVICE_CLASS {.tx_function = ...}. The node->tx and node->output
functions are properly initialized and the node is registered.


The VPP stack sends packets to this low-level Tx node via the sw_if_index. I am
guessing the sw_if_index is determined by IPv4 routing or L2 switching.


I think vnet_set_interface_output_node() is called for those interfaces (on the
Tx path) whose DEVICE_CLASS does not provide a tx_function, but I am not sure.


"show vlib graph" will tell you how nodes are arranged in vpp graph.

To be specific about your question:

  next0 = hi0->output_node_next_index;

output_node_next_index is the index of the next node to which the current
vector is handed off (the transition from one node to another along the
graph).
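
For context, this is roughly the standard pattern a node uses to hand a buffer
to the node selected by such a next index (a simplified single-buffer sketch;
bi0, next0 and the surrounding loop are assumed to exist in the dispatch
function):

  u32 next_index = node->cached_next_index;
  u32 *to_next, n_left_to_next;

  vlib_get_next_frame (vm, node, next_index, to_next, n_left_to_next);

  to_next[0] = bi0;          /* buffer index goes into the next node's frame */
  to_next += 1;
  n_left_to_next -= 1;

  /* if next0 != next_index, this moves the buffer to the frame of the
   * node that next0 actually points to */
  vlib_validate_buffer_enqueue_x1 (vm, node, next_index,
                                   to_next, n_left_to_next, bi0, next0);

  vlib_put_next_frame (vm, node, next_index, n_left_to_next);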


Note: I gathered all of this by browsing the code; if any of it is incorrect,
I ask the VPP experts to correct it.


Thanks,
Nitin

On Thursday 10 May 2018 02:19 PM, Prashant Upadhyaya wrote:

Hi,

I am trying to walk through the code to see how a packet arrives
into the system at the dpdk rx side and finally leaves it at the dpdk tx
side. I am using the context of the macswap sample plugin for this.

It is clear to me that dpdk-input is a graph node and it is an 'input'
type graph node so it polls for the packets using dpdk functions. The
frame is then eventually passed to the sample plugin because the
sample plugin inserts itself at the right place. The sample plugin
queues the packets to the interface-output graph node.

So now I check the interface-output graph node function.
I locate that in vpp/src/vnet/interface_output.c.
So the dispatch function for the graph node is vnet_per_buffer_interface_output.
Here the interface-output node queues the packets to a next node
based on the following code:

  hi0 = vnet_get_sup_hw_interface (vnm,
                                   vnet_buffer (b0)->sw_if_index[VLIB_TX]);

  next0 = hi0->output_node_next_index;

Now I am a little lost: what is this output_node_next_index? Which
graph node function is actually called to emit the packet?
Where exactly is this set up?

I do see that the actual dpdk tx burst function is called from
tx_burst_vector_internal, which itself is called from
dpdk_interface_tx (vpp/src/plugins/dpdk/device/device.c). But how the
code reaches dpdk_interface_tx after the packets are queued from the
interface-output graph node is not clear to me. If somebody could help
me connect the dots, that would be great.

Regards
-Prashant








Re: [vpp-dev] vpp_api_test via socket file

2018-05-11 Thread Dave Barach
The underlying [c-code] vpp client API library supports one client connection. 
It’s not conceptually difficult to support multiple connections, but it would 
take a lot of typing and testing.

You can raise it as a feature request, but I wouldn’t plan on seeing it any 
time soon.

D.

From: Peter Mikus -X (pmikus - PANTHEON TECHNOLOGIES at Cisco)
Sent: Friday, May 11, 2018 6:44 AM
To: Justin Iurman
Cc: Dave Barach (dbarach); Damjan Marion; vpp-dev
Subject: RE: [vpp-dev] vpp_api_test via socket file

Hello,

Thank you for the pointers. It seems to be working, although with a few notes:


  1.  It is not possible to keep both connections open:

vpp1 = VPP(jsonfiles)
r1 = vpp1.connect('vpp1', chroot_prefix='vpp1')
print('VPP1 version', vpp1.api.show_version().version.decode().rstrip('\0x00'))
vpp2 = VPP(jsonfiles)
r2 = vpp2.connect('vpp2', chroot_prefix='vpp2')
print('VPP2 version', vpp2.api.show_version().version.decode().rstrip('\0x00'))
vpp1.disconnect()
vpp2.disconnect()

>> DEBUG:vpp_papi:No such message type or failed CRC checksum: 
>> cdp_enable_disable_4eec6097
>> DEBUG:vpp_papi:No such message type or failed CRC checksum: 
>> cdp_enable_disable_reply_e8d4e804
>> ('VPP1 version', u'18.04-release')
>> WARNING:vpp_papi:VPP API client:: vac_connect:285: vl_client_api map rv -1
>>
>> WARNING:vpp_papi:VPP API client:: vac_connect:285: vl_client_api map rv -1
>>
>> Traceback (most recent call last):
>>   File "pythonAPI.py", line 18, in 
>>r2 = vpp2.connect('vpp2', chroot_prefix='vpp')
>>   File "/usr/lib/python2.7/dist-packages/vpp_papi.py", line 690, in connect
>> async)
>>   File "/usr/lib/python2.7/dist-packages/vpp_papi.py", line 661, in 
>> connect_internal
>> raise IOError(2, 'Connect failed')
>> IOError: [Errno 2] Connect failed
>> DEBUG:vpp_papi:Cleaning up VPP on exit


On the other side this is possible:

vpp1 = VPP(jsonfiles)
r1 = vpp1.connect('vpp1', chroot_prefix='vpp1')
print('VPP1 version', vpp1.api.show_version().version.decode().rstrip('\0x00'))
vpp1.disconnect()
vpp2 = VPP(jsonfiles)
r2 = vpp2.connect('vpp2', chroot_prefix='vpp2')
print('VPP2 version', vpp2.api.show_version().version.decode().rstrip('\0x00'))
vpp2.disconnect()


>> DEBUG:vpp_papi:No such message type or failed CRC checksum: 
>> cdp_enable_disable_4eec6097
>> DEBUG:vpp_papi:No such message type or failed CRC checksum: 
>> cdp_enable_disable_reply_e8d4e804
>> ('VPP1 version', u'18.04-release')
>> DEBUG:vpp_papi:No such message type or failed CRC checksum: 
>> cdp_enable_disable_4eec6097
>> DEBUG:vpp_papi:No such message type or failed CRC checksum: 
>> cdp_enable_disable_reply_e8d4e804
>> ('VPP2 version', u'18.04-release')



  2.  print('VPP1', intf.interface_name.decode().rstrip('\0x00')) broke the
output of interface 7/0/0 (likely because rstrip('\0x00') strips any trailing
'0', 'x' and NUL characters rather than only NUL; rstrip('\x00') avoids this):

>> ('VPP1', u'TenGigabitEthernet7/0/')


  3.  DEBUG:vpp_papi:No such message type or failed CRC checksum:
      cdp_enable_disable_4eec6097
      DEBUG:vpp_papi:No such message type or failed CRC checksum:
      cdp_enable_disable_reply_e8d4e804

Looks like some APIs need updating.



  4.  Connect() requires sudo, which means the script needs to run as root.



Peter Mikus
Engineer – Software
Cisco Systems Limited

From: Justin Iurman [mailto:justin.iur...@uliege.be]
Sent: Friday, May 11, 2018 9:59 AM
To: Peter Mikus -X (pmikus - PANTHEON TECHNOLOGIES at Cisco)
Cc: Dave Barach (dbarach); Damjan Marion; vpp-dev
Subject: Re: [vpp-dev] vpp_api_test via socket file

Peter,

…however, are there any other options to fully control 2+ instances of VPP via
the API (not vppctl)? The Python API, for example [1].

Ole’s answer to the same question:

r = vpp.connect('vpp1', chroot_prefix='name of shared address segment')

Cheers,

Justin


Re: [vpp-dev] TCP performance - TSO - HW offloading in general.

2018-05-11 Thread Luca Muscariello
Florin,

 

A few more comments about latency.

Some numbers (in ms) in the table below:

This is ping and iperf3 running concurrently. In the case of VPP it is vppctl ping.

          Kernel w/ load   Kernel w/o load   VPP w/ load   VPP w/o load
Min.      0.1920           0.0610            0.0573        0.03480
1st Qu.   0.2330           0.1050            0.2058        0.04640
Median    0.2450           0.1090            0.2289        0.04880
Mean      0.2458           0.1153            0.2568        0.05096
3rd Qu.   0.2720           0.1290            0.2601        0.05270
Max.      0.2800           0.1740            0.6926        0.09420

 

In short: ICMP packets have a lower latency under load.

I could interpret this as due to vectorization, maybe. Also, the Linux kernel
is slower to reply to ping by a 2x factor (system-call latency?): 115us vs
50us in VPP. With load there is no difference. In this test Linux TCP is using TSO.

 

While trying to use hping to get latency samples with TCP instead of ICMP,
we noticed that the VPP TCP stack does not reply with a RST, so we don't get
any samples. Is that expected behavior?

 

Thanks

Luca

From: Luca Muscariello 
Date: Thursday 10 May 2018 at 13:52
To: Florin Coras 
Cc: Luca Muscariello , "vpp-dev@lists.fd.io" 

Subject: Re: [vpp-dev] TCP performance - TSO - HW offloading in general.

 

MTU had no effect, just statistical fluctuations in the test reports. Sorry for
misreporting the info.

We are exploiting vectorization, as we have a single memif channel per
transport socket, so we can control the size of the batches dynamically.

In theory, the size of outstanding data from the transport should be controlled
in bytes for batching to be useful and not harmful, as frame sizes can vary a
lot. But I'm not aware of a queue abstraction from DPDK to control that from
VPP.

 

From: Florin Coras 
Date: Wednesday 9 May 2018 at 18:23
To: Luca Muscariello 
Cc: Luca Muscariello , "vpp-dev@lists.fd.io" 

Subject: Re: [vpp-dev] TCP performance - TSO - HW offloading in general.

 

Hi Luca,

 

We don't yet support PMTU discovery in the stack, so TCP uses a fixed 1460-byte
MSS; unless you changed that, we shouldn't generate jumbo packets. If we do,
I'll have to take a look at it :)

 

If you already had your transport protocol, using memif is the natural way to 
go. Using the session layer makes sense only if you can implement your 
transport within vpp in a way that leverages vectorization or if it can 
leverage the existing transports (see for instance the TLS implementation).

 

Until today [1], the stack did allow excessive batching (generation of
multiple frames in one dispatch loop), but we're now restricting that to one
frame. This is still far from proper pacing, which is on our todo list.

 

Florin

 

[1] https://gerrit.fd.io/r/#/c/12439/

 




On May 9, 2018, at 4:21 AM, Luca Muscariello (lumuscar) wrote:

 

Florin,

 

Thanks for the slide deck, I’ll check it soon.

 

BTW, the VPP/DPDK test was using jumbo frames by default, so the TCP stack had
a little advantage over the Linux TCP stack, which was using 1500B by default.

By manually setting the DPDK MTU to 1500B, the goodput goes down to 8.5Gbps,
which compares to 4.5Gbps for Linux w/o TSO. Also, the congestion window
adaptation is not the same.

 

BTW, for what we're doing it is difficult to reuse the VPP session layer as it
is. Our transport stack uses a different kind of namespace, and mux/demux is
also different.

We are using memif as the underlying driver, which does not seem to be a
bottleneck, as we can also control batching there. Also, we have our own
shared memory downstream of memif inside VPP, through a plugin.

What we observed is that delay-based congestion control does not like VPP
batching (or batching in general) much, and we are using DBCG.

Linux TSO has the same problem, but it has TCP pacing to limit the bad effects
of bursts on RTT/losses and on the flow-control laws.

 

I guess you’re aware of these issues already.

 

Luca

 

 

From: Florin Coras 
Date: Monday 7 May 2018 at 22:23
To: Luca Muscariello 
Cc: Luca Muscariello , "vpp-dev@lists.fd.io" 

Subject: Re: [vpp-dev] TCP performance - TSO - HW offloading in general.

 

Yes, the whole host stack uses shared memory segments and fifos that the 
session layer manages. For a brief description of the session layer see [1, 2]. 
Apart from that, unfortunately, we don’t have any other dev documentation. 
src/vnet/session/segment_manager.[ch] has some good examples of how to allocate 
segments and fifos. Under application_interface.h check 
app_[send|recv]_[stream|dgram]_raw for examples on how to read/write to the 
fifos. 
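
For flavor, a rough sketch of the read/write pattern on the fifos (assuming
the 18.04-era svm_fifo_enqueue_nowait/svm_fifo_dequeue_nowait helpers; the
fifo pointers come from the established session, the my_* wrappers are
placeholders, and the app_* helpers mentioned above remain the authoritative
examples):

  #include <svm/svm_fifo.h>

  /* write application data into the session's tx fifo (sketch only) */
  static int
  my_app_write (svm_fifo_t * tx_fifo, u8 * data, u32 len)
  {
    /* returns bytes enqueued, or a negative value if the fifo is full */
    return svm_fifo_enqueue_nowait (tx_fifo, len, data);
  }

  /* read data the stack placed into the session's rx fifo (sketch only) */
  static int
  my_app_read (svm_fifo_t * rx_fifo, u8 * buf, u32 len)
  {
    /* returns bytes dequeued, or a negative value if the fifo is empty */
    return svm_fifo_dequeue_nowait (rx_fifo, len, buf);
  }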

 

Now, regarding the writing to the fifos: 

Re: [vpp-dev] vpp_api_test via socket file

2018-05-11 Thread Justin Iurman
Peter,

> …however, are there any other options to fully control 2+ instances of VPP via
> the API (not vppctl)? The Python API, for example [1].

Ole’s answer to the same question:

> r = vpp.connect('vpp1', chroot_prefix='name of shared address segment')

Cheers,

Justin

Re: [vpp-dev] How to add plugin's statistics into stats thread

2018-05-11 Thread Naoyuki Mori
Hi folks,

Are there any suggestions for this?
Basically, we want an effective method to calculate statistics like pps or cps
inside a plugin.
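
For reference, a rough sketch of the per-thread counter pattern we could use
inside a plugin (using the standard vlib counter helpers; all my_* names are
placeholders, and this is not the stats-thread integration itself):

  #include <vlib/vlib.h>
  #include <vlib/counter.h>

  /* one per-thread simple counter; the index could be e.g. a per-VIP index */
  static vlib_simple_counter_main_t my_pkt_counter = {
    .name = "my-plugin-packets",
  };

  static void
  my_counter_init (u32 n_indices)
  {
    u32 i;
    /* make sure storage exists for indices 0..n_indices-1, then clear them */
    vlib_validate_simple_counter (&my_pkt_counter, n_indices - 1);
    for (i = 0; i < n_indices; i++)
      vlib_zero_simple_counter (&my_pkt_counter, i);
  }

  /* called on the datapath, inside the plugin's node dispatch function */
  static inline void
  my_count_packet (vlib_main_t * vm, u32 index)
  {
    vlib_increment_simple_counter (&my_pkt_counter, vm->thread_index, index, 1);
  }

A periodic process (or whatever scrapes the stats) could then sample
vlib_get_simple_counter() at a fixed interval and divide the delta by the
elapsed time to get pps/cps.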

Thanks,
Mori
From: "Ni, Hongjun" 
Date: Monday, May 7, 2018 16:21
To: "Jerome Tollet (jtollet)" , vpp-dev 
Cc: "Mori, Naoyuki" , Yusuke Tatsumi 

Subject: RE: [vpp-dev] How to add plugin's statistics into stats thread

Hi Jerome,

We would like to add LB plugin statistics, including per-VIP connections and
per-AS connections for each VIP.
The frequency is configurable; 1 second is better.
The volume of data depends on the number of VIPs and ASs.
Please refer to the patch below for details:
https://gerrit.fd.io/r/#/c/12146/2/src/plugins/lb/lb.api

Thank you,
Hongjun

From: Jerome Tollet (jtollet) [mailto:jtol...@cisco.com]
Sent: Monday, May 7, 2018 3:15 PM
To: Ni, Hongjun ; vpp-dev 
Cc: Mori, Naoyuki ; Yusuke Tatsumi 

Subject: Re: [vpp-dev] How to add plugin's statistics into stats thread

Hi Hongjun,
Could you elaborate a bit on the kind of statistics you’d like to create?
Frequency and volume of data may be interesting inputs.
Jerome

From: "Ni, Hongjun"
Date: Monday, May 7, 2018 at 07:43
To: vpp-dev
Cc: "Mori, Naoyuki", Yusuke Tatsumi
Subject: [vpp-dev] How to add plugin's statistics into stats thread

Hi all,

We want to add a plugin's statistics into the VPP stats thread.
But it seems that the current stats thread only supports core (i.e. vnet)
statistics.
Is there some mechanism for adding a plugin's statistics into the stats thread?

Thanks,
Hongjun