Re: [vpp-dev] anomaly in deleting tcp idle session in vpp

2018-06-02 Thread Andrew Yourtchenko
Thanks for the extra detail! Could you also include the VPP config? That way I can 
give it a shot in the lab when I am back in the office on 11 June.

By the way, https://gerrit.fd.io/r/#/c/12770/ could have changed the behavior, 
since it adds a transient state before the connection is deleted; you might 
want to give it a quick check before 11 June, if you like.

--a

> On 2 Jun 2018, at 12:37, emma sdi  wrote:
> 
> 
> -- Forwarded message --
> From: khers 
> Date: Sat, Jun 2, 2018 at 1:09 PM
> Subject: Re: [vpp-dev] anomaly in deleting tcp idle session in vpp
> To: Andrew Yourtchenko 
> 
> 
> Dear Andrew,
> 
> I have observed a contradiction. In my test case, after the session table 
> becomes full, vpp starts to delete idle sessions.
> trex command: ./t-rex-64 -c 3 --active-flows 1 -f 
> /cap2/concurrent_connection_test.yaml --nc -m 1000 --no-key
> The yaml, pcap, and patch files are attached to this email.
> 
> I modified VPP 18.04 to log the session timeout type before removing sessions. 
> Accordingly, I observed sessions with timeout type 1 (1 is used for the idle 
> session timeout) in my log.
> 
> set acl-plugin session timeout udp idle 60
> set acl-plugin session timeout tcp idle 3600
> set acl-plugin session timeout tcp transient 30
> 
> 
> 
> 
>> On Wed, May 30, 2018 at 7:59 PM, Andrew Yourtchenko  
>> wrote:
>> If the table is full it should fifo-reuse the tcp transient sessions, not 
>> the established ones.
>> 
>> --a
>> 
>>> On 30 May 2018, at 14:00, emma sdi  wrote:
>>> 
>>> Dear Folks,
>>> I have a problem with vpp stateful mode. I observed that vpp starts to 
>>> delete tcp idle sessions when the session table is full. My question: is 
>>> this behavior implemented deliberately and the normal routine, or is it an 
>>> anomaly? This behavior is not normal in general (for example in conntrack, 
>>> an established session has to exist until its timeout reaches zero). I 
>>> expect vpp to hold all old tcp idle sessions instead of creating new 
>>> sessions when the session table has no empty entry.
>>> Best Regards,


[vpp-dev] anomaly in deleting tcp idle session in vpp

2018-06-02 Thread emma sdi
-- Forwarded message --
From: khers 
Date: Sat, Jun 2, 2018 at 1:09 PM
Subject: Re: [vpp-dev] anomaly in deleting tcp idle session in vpp
To: Andrew Yourtchenko 


Dear Andrew,

I have observed a contradiction. In my test case, after the session table
becomes full, vpp starts to delete idle sessions.
trex command: ./t-rex-64 -c 3 --active-flows 1 -f
/cap2/concurrent_connection_test.yaml --nc -m 1000 --no-key
The yaml, pcap, and patch files are attached to this email.

I modified VPP 18.04 to log the session timeout type before removing
sessions. Accordingly, I observed sessions with timeout type 1 (1 is used for
the idle session timeout) in my log.

set acl-plugin session timeout udp idle 60
set acl-plugin session timeout tcp idle 3600
set acl-plugin session timeout tcp transient 30




On Wed, May 30, 2018 at 7:59 PM, Andrew Yourtchenko 
wrote:

> If the table is full it should fifo-reuse the tcp transient sessions, not
> the established ones.
>
> --a
>
> On 30 May 2018, at 14:00, emma sdi  wrote:
>
> Dear Folks,
> I have a problem with vpp stateful mode. I observed that vpp starts to
> delete tcp idle sessions when the session table is full. My question: is
> this behavior implemented deliberately and the normal routine, or is it an
> anomaly? This behavior is not normal in general (for example in conntrack,
> an established session has to exist until its timeout reaches zero). I
> expect vpp to hold all old tcp idle sessions instead of creating new
> sessions when the session table has no empty entry.
> Best Regards,
> 
>
>


concurrent_connection_test.yaml
Description: application/yaml


concurrent_connection_test_pcap.pcap
Description: application/vnd.tcpdump.pcap
diff --git a/src/plugins/acl/fa_node.c b/src/plugins/acl/fa_node.c
index a36a5815..bb965bb2 100644
--- a/src/plugins/acl/fa_node.c
+++ b/src/plugins/acl/fa_node.c
@@ -388,6 +388,9 @@ acl_fa_conn_list_delete_session (acl_main_t *am, fa_full_session_id_t sess_id)
     return 0;
   }
   fa_session_t *sess = get_session_ptr(am, sess_id.thread_index, sess_id.session_index);
+  u32 res = -1;
+  if (sess)
+    res = fa_session_get_timeout_type (am, sess);
   /* we should never try to delete the session with another thread index */
   ASSERT(sess->thread_index == thread_index);
   if (~0 != sess->link_prev_idx) {
@@ -408,6 +411,8 @@ acl_fa_conn_list_delete_session (acl_main_t *am, fa_full_session_id_t sess_id)
   if (pw->fa_conn_list_tail[sess->link_list_id] == sess_id.session_index) {
     pw->fa_conn_list_tail[sess->link_list_id] = sess->link_prev_idx;
   }
+
+  printf("acl_fa_conn_list_delete_session: the session timeout type is %d\n", res);
   return 1;
 }
 
@@ -449,6 +454,9 @@ acl_fa_delete_session (acl_main_t * am, u32 sw_if_index, fa_full_session_id_t se
   void *oldheap = clib_mem_set_heap(am->acl_mheap);
   fa_session_t *sess = get_session_ptr(am, sess_id.thread_index, sess_id.session_index);
   ASSERT(sess->thread_index == os_get_thread_index ());
+  u32 res = -1;
+  if (sess)
+    res = fa_session_get_timeout_type (am, sess);
   BV (clib_bihash_add_del) (&am->fa_sessions_hash,
                             &sess->info.kv, 0);
   acl_fa_per_worker_data_t *pw = &am->per_worker_data[sess_id.thread_index];
@@ -459,6 +467,8 @@ acl_fa_delete_session (acl_main_t * am, u32 sw_if_index, fa_full_session_id_t se
   clib_mem_set_heap (oldheap);
   pw->fa_session_dels_by_sw_if_index[sw_if_index]++;
   clib_smp_atomic_add(&am->fa_session_total_dels, 1);
+  printf("acl_delete_session_function: the session timeout type is %d\n", res);
+
 }
 
 static int
 
 static int


Re: [vpp-dev] Support for TCP flag

2018-06-02 Thread Rubina Bianchi
Regarding your question: the case I'm testing is connection tracking in a 
stateful firewall, whose functionality is the same as Linux conntrack. Do you 
have any plans to provide VPP as an appropriate infrastructure for firewall 
applications?

From: Andrew  Yourtchenko 
Sent: Tuesday, May 29, 2018 1:10 PM
To: Rubina Bianchi
Cc: vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] Support for TCP flag

Hi Rubina,

I designed the stateful mode to be just a bit more than the ACL, with
a "diode" state, rather than going for the fully fledged firewall
model - as a balance between the simplicity and the functionality.

The full tracking of the TCP state machine was not in scope - getting
into that territory properly requires also TCP sequence number
tracking, etc. - and there the complexity would far outweigh the
usefulness for most practical cases.

So I needed to primarily differentiate the session state from the
timeout perspective - when to remove it.

For that purpose, there are two types of TCP sessions, decided by the
combination of the SYN, FIN, RST, and ACK TCP flag bits seen from
each side:

1) Those that have seen SYN+ACK on both sides are fully open (this is
where the "tcp idle" timeout applies, which is usually rather long).

2) Those that have seen any other combination of the flags (this is
where the "tcp transient" timeout applies, which defaults to 2
minutes).

As we receive the packets, we update the seen flags, and we may change
the current idle timeout based on the accumulated seen flags.
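The two timeout classes above can be sketched in C. The names, the flag
masks, and the exact FIN/RST handling here are illustrative assumptions,
not the actual acl-plugin code:

```c
#include <assert.h>

/* Standard TCP flag bit values. */
#define TCP_FLAG_FIN 0x01
#define TCP_FLAG_SYN 0x02
#define TCP_FLAG_RST 0x04
#define TCP_FLAG_ACK 0x10

enum { TCP_TIMEOUT_TRANSIENT = 0, TCP_TIMEOUT_IDLE = 1 };

/* Pick the timeout type from the flags accumulated from each direction:
 * only when both sides have shown SYN and ACK (and neither FIN nor RST
 * has been seen) is the session fully open and given the long "tcp idle"
 * timeout; everything else stays on the short "tcp transient" timeout
 * and is eligible for FIFO reuse when the table fills up. */
static int tcp_timeout_type(unsigned flags_side_a, unsigned flags_side_b)
{
    unsigned open = TCP_FLAG_SYN | TCP_FLAG_ACK;
    unsigned closing = TCP_FLAG_FIN | TCP_FLAG_RST;

    if ((flags_side_a & open) == open &&
        (flags_side_b & open) == open &&
        !((flags_side_a | flags_side_b) & closing))
        return TCP_TIMEOUT_IDLE;
    return TCP_TIMEOUT_TRANSIENT;
}
```

The transient timeout itself is configurable, e.g. via
`set acl-plugin session timeout tcp transient 30` as shown elsewhere in
this thread.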

Additionally, if we run out of sessions when creating new ones,
the sessions in the transient state will be cleaned up and reused
in a FIFO manner, as a simple defense against resource starvation
at higher session rates.

This is a deliberate design choice, and unless there are operational
issues with it (i.e. where the resource clean-up does not happen
where it should, etc.), I did not have any plans to change it.

So, could you expand a bit more on what kind of use case you are
looking for, so we can discuss further?

--a

On 5/29/18, Rubina Bianchi  wrote:
> Hi
> I have a question about vpp stateful mode. It seems that vpp stateful mode
> hasn't been implemented completely; I mean there isn't any feature like
> conntrack in the linux kernel, so vpp doesn't have any mechanism to handle TCP
> sessions based on the different flags. For example, I sent the TCP three-way
> handshake packets in a different order (ack -> syn -> syn-ack), and in this
> case an idle session was added to the session table. Do you have any plan to
> develop it?
>


Re: [vpp-dev] Multiarch/target select for dpdk_device_input

2018-06-02 Thread Damjan Marion

Dear Nitin,

Anybody is free to submit a patch to gerrit.fd.io, or to 
provide comments on any patch there.

Regards,

Damjan

> On 2 Jun 2018, at 03:11, Nitin Saxena  wrote:
> 
> Hi Damjan,
> 
> If VPP is an open-source project that supports multiple architectures, then 
> there should be a review of every commit, which gives others using the open 
> source project an opportunity to raise their concerns. So my request is to 
> post changes for review before they are committed, to ensure VPP stays true to 
> the open-source philosophy. Please let me know if this is possible. If not, I'd 
> like to understand the reasons for it.
> 
> Regards,
> Nitin
> 
> On 02-Jun-2018, at 00:17, Damjan Marion wrote:
> 
>> 
>> Dear Nitin,
>> 
>> That doesn't work that way. 
>> 
>> Regards,
>> 
>> Damjan
>> 
>>> On 1 Jun 2018, at 19:41, Saxena, Nitin wrote:
>>> 
>>> Hi Damjan,
>>> 
>>> Now that you are aware that Cavium is working on optimisations for ARM, 
>>> can I request that you check with us on the implications for ARM (at least 
>>> Cavium) before bringing changes into dpdk-input? 
>>> 
>>> Regards,
>>> Nitin
>>> 
>>> On 01-Jun-2018, at 21:39, Damjan Marion wrote:
>>> 
 
 Dear Nitin,
 
 I really don't have anything else to add. It's your call how you want to 
 proceed.
 
 Regards,
 
 Damjan
 
> On 1 Jun 2018, at 18:02, Nitin Saxena wrote:
> 
> Hi Damjan,
> 
> Answers Inline.
> 
> Thanks,
> Nitin
> 
> On Friday 01 June 2018 08:49 PM, Damjan Marion wrote:
>> Hi Nitin,
>> inline...
>>> On 1 Jun 2018, at 15:23, Nitin Saxena wrote:
>>> 
>>> Hi Damjan,
>>> 
 It was hard to know that you have a subset of patches hidden somewhere.
>>> I wouldn't say the patches are hidden. We are trying to fine-tune 
>>> dpdk-input from our end first, and later we will seek your 
>>> expertise while upstreaming.
>> for me they were hidden.
 Typically it makes sense to discuss such kinds of changes with the person 
 who "maintains" the code before starting to write the code.
>>> Agreed. However, we prefer to do internal analysis/POC first before 
>>> reaching out to MAINTAINERS. That way we can better understand code 
>>> review comments.
>> Perfectly fine, but then don't blame us for not knowing that you 
>> are doing something internally...
> The intention was not to blame anybody, but to understand the modular approach 
> in vpp to accommodate multiple architectures.
>>> 
 Maybe, but it sounds to me like we are still in the guessing phase.
>>> I wouldn't do any guesswork with MAINTAINERS.
>>> 
 Maybe we even need a different function for each ARM CPU core, as they
 may have different memory subsystems and pipelines.
>>> This is what I am looking for. Is it ok to detect our hardware natively 
>>> from autoconf and append a target-specific macro to CFLAGS? And then a 
>>> separate function for our target in dpdk/device/node.c? Sorry, my 
>>> multi-arch select example was incorrect and that's not what I am 
>>> looking at.
>> Here I will be able to help when I get a reasonable understanding of what 
>> the "big" plan is.
> The "big" plan is to optimize each vpp node for AArch64. For now the focus is 
> dpdk-input.
>> I don't want us to end up in 6 months with cavium patches, nxp 
>> patches, marvell patches, and so on.
> Is it a problem? If yes, then I am not able to visualize it, as the same 
> problem would exist for any architecture, not just AArch64.
>>> 
 Is there an agreement between ARM vendors on the targeted core
 you want the code tuned for, or are you simply tuning to whatever
 core Cavium uses?
>>> I am trying to optimize Cavium's SoC; this question is in that regard 
>>> only. However, efforts are also going on in the ARM community to 
>>> optimize Cortex cores.
>> What about agreeing on a plan for optimising all ARM cores, and then 
>> starting the optimisation?
> This is a cross-company question, so hard to answer, but Cavium has the "big" 
> plan described above.
>>> 
>>> Thanks,
>>> Nitin
>>> 
>>> On Friday 01 June 2018 01:55 AM, Damjan Marion wrote:
 inline...
 -- 
 Damjan
> On 31 May 2018, at 21:10, Saxena, Nitin wrote:
> 
> Hi Damjan,
> 
> Answers inline.
> 
> Thanks,
> Nitin
> 
>> On 01-Jun-2018, at 12:15 AM, Damjan Marion >  

[vpp-dev] something wrong about memory

2018-06-02 Thread xulang
Hi all,
I have stopped my vpp process and its threads in gdb, and I did not even bind 
any phy interfaces, but it still consumes memory continuously.
What should I do?
I hope to hear from you, thanks.
Below is some information.


Thread 1 "vpp_main" hit Breakpoint 2, dispatch_node (vm=0x779aa2a0 
, node=0x7fff6ec89440, type=VLIB_NODE_TYPE_INPUT, 
dispatch_state=VLIB_NODE_STATE_POLLING, frame=0x0, 
last_time_stamp=393305677230170) at 
/home/wangzy/oldcode/VBRASV100R001_new_trunk/vpp1704/build-data/../src/vlib/main.c:926
926{
(gdb) info threads
  Id   Target Id Frame 
* 1Thread 0x77fd6740 (LWP 9928) "vpp_main" dispatch_node 
(vm=0x779aa2a0 , node=0x7fff6ec89440, 
type=VLIB_NODE_TYPE_INPUT, dispatch_state=VLIB_NODE_STATE_POLLING, frame=0x0, 
last_time_stamp=393305677230170) at 
/home/wangzy/oldcode/VBRASV100R001_new_trunk/vpp1704/build-data/../src/vlib/main.c:926
  2Thread 0x7fff6e9bd700 (LWP 9929) "vpp" 0x7fffefd8794d in recvmsg () 
at ../sysdeps/unix/syscall-template.S:84
  3Thread 0x7affda084700 (LWP 9930) "eal-intr-thread" 0x7fffef8b0e23 in 
epoll_wait () at ../sysdeps/unix/syscall-template.S:84
  4Thread 0x7affd9883700 (LWP 9931) "vpp_stats" 0x7fffefd87c1d in 
nanosleep () at ../sysdeps/unix/syscall-template.S:84
(gdb) b dispatch_node
Breakpoint 2 at 0x777570e0: file 
/home/wangzy/oldcode/VBRASV100R001_new_trunk/vpp1704/build-data/../src/vlib/main.c,
 line 926.
(gdb) c
Continuing.
(gdb) b recvmsg
Breakpoint 3 at 0x7fffef8b1770: recvmsg. (2 locations)
(gdb) b epoll_wait
Breakpoint 4 at 0x7fffef8b0df0: file ../sysdeps/unix/syscall-template.S, line 
84.
(gdb) thread 4
[Switching to thread 4 (Thread 0x7affd9883700 (LWP 9931))]
#0  0x7fffefd87c1d in nanosleep () at ../sysdeps/unix/syscall-template.S:84
84../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) info threads
  Id   Target Id Frame 
  1Thread 0x77fd6740 (LWP 9928) "vpp_main" dispatch_node 
(vm=0x779aa2a0 , node=0x7fff6ec89440, 
type=VLIB_NODE_TYPE_INPUT, dispatch_state=VLIB_NODE_STATE_POLLING, frame=0x0, 
last_time_stamp=393305677230170) at 
/home/wangzy/oldcode/VBRASV100R001_new_trunk/vpp1704/build-data/../src/vlib/main.c:926
  2Thread 0x7fff6e9bd700 (LWP 9929) "vpp" 0x7fffefd8794d in recvmsg () 
at ../sysdeps/unix/syscall-template.S:84
  3Thread 0x7affda084700 (LWP 9930) "eal-intr-thread" 0x7fffef8b0e23 in 
epoll_wait () at ../sysdeps/unix/syscall-template.S:84
* 4Thread 0x7affd9883700 (LWP 9931) "vpp_stats" 0x7fffefd87c1d in 
nanosleep () at ../sysdeps/unix/syscall-template.S:84
(gdb) 




root@ubuntu:/home/wangzy# ps aux|grep vpp
root   5405  0.0  3.5 197540 143976 pts/22  S+   Jun01   0:06 gdb vpp
root   9928  0.4 26.7 5369610712 1076944 pts/22 tl 00:28   0:04 
/usr/bin/vpp -c /etc/vpp/startup.conf
root  10166  0.0  0.0  14228  1092 pts/18   S+   00:43   0:00 grep 
--color=auto vpp
root@ubuntu:/home/wangzy# ps aux|grep vpp
root   5405  0.0  3.5 197540 143976 pts/22  S+   Jun01   0:06 gdb vpp
root   9928  0.4 26.7 5369610712 1078976 pts/22 tl 00:28   0:04 
/usr/bin/vpp -c /etc/vpp/startup.conf
root  10168  0.0  0.0  14228   944 pts/18   S+   00:43   0:00 grep 
--color=auto vpp
root@ubuntu:/home/wangzy# ps aux|grep vpp
root   5405  0.0  3.5 197540 143976 pts/22  S+   Jun01   0:06 gdb vpp
root   9928  0.4 26.8 5369610712 1083048 pts/22 tl 00:28   0:04 
/usr/bin/vpp -c /etc/vpp/startup.conf
root  10177  0.0  0.0  14228   904 pts/18   S+   00:46   0:00 grep 
--color=auto vpp
root@ubuntu:/home/wangzy# 
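To double-check that the growth shown in the ps output is real, one can
sample VmRSS from /proc directly. A minimal C sketch (it reads its own
status file; for vpp, open "/proc/<pid>/status" with the vpp PID instead
and sample it periodically):

```c
#include <assert.h>
#include <stdio.h>

/* Read the resident set size (VmRSS, in kB) of the current process from
 * /proc/self/status; returns -1 if the field cannot be read. Sampling
 * this over time for the vpp PID shows whether RSS really keeps growing
 * while the process sits in gdb. */
long read_rss_kb(void)
{
    FILE *f = fopen("/proc/self/status", "r");
    char line[256];
    long kb = -1;

    if (!f)
        return -1;
    while (fgets(line, sizeof line, f))
        if (sscanf(line, "VmRSS: %ld", &kb) == 1)
            break;
    fclose(f);
    return kb;
}
```

This is Linux-specific (the /proc filesystem), which matches the Ubuntu
setup shown in the transcript.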


Regards
xlangyun

Re: [vpp-dev] Rx stuck to 0 after a while

2018-06-02 Thread Andrew Yourtchenko
Dear Rubina,

Excellent, thank you very much! The change is in the master now.

Note that to keep the default memory footprint the same I have temporarily 
halved the default upper limit on sessions (since we create two bihash entries 
now instead of one).
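As a back-of-the-envelope illustration of that trade-off (the budget figure
below is made up, not a real VPP default): with a fixed hash-entry budget,
doubling the entries per session halves the number of sessions that fit.

```c
#include <assert.h>

/* Illustrative only: a made-up fixed bihash entry budget. Going from one
 * hash entry per session to two halves the session cap for the same
 * memory footprint. */
enum { HASH_BUDGET_ENTRIES = 1000000 };

static unsigned max_sessions(unsigned entries_per_session)
{
    return HASH_BUDGET_ENTRIES / entries_per_session;
}
```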

FYI, I plan to do some more work on session management/reuse before 1807 
release.

--a

> On 2 Jun 2018, at 07:48, Rubina Bianchi  wrote:
> 
> Dear Andrew
> 
> Sorry for the delayed response. I checked your second patch and here is my 
> test result:
> 
> The best case is still the best: vpp throughput is at the maximum (18.5 Gbps) 
> in my scenario.
> The worst case is better than before. I never see the deadlock again, and 
> throughput increased from 50 Mbps to 5.5 Gbps. I also attached my T-Rex result.
> 
> -Per port stats table 
>   ports |   0 |   1 
>  
> -
>opackets |  1119818503 |  1065627562 
>  obytes |490687253990 |471065675962 
>ipackets |   274437415 |   391504529 
>  ibytes |120020261974 |170214837563 
> ierrors |   0 |   0 
> oerrors |   0 |   0 
>   Tx Bw |   9.48 Gbps |   9.08 Gbps 
> 
> -Global stats enabled 
>  Cpu Utilization : 88.4  %  7.0 Gb/core 
>  Platform_factor : 1.0  
>  Total-Tx:  18.56 Gbps  
>  Total-Rx:   5.78 Gbps  
>  Total-PPS   :   5.27 Mpps  
>  Total-CPS   :  79.51 Kcps  
> 
>  Expected-PPS:   9.02 Mpps  
>  Expected-CPS: 135.31 Kcps  
>  Expected-BPS:  31.77 Gbps  
> 
>  Active-flows:88840  Clients :  252   Socket-util : 0.5598 %
>  Open-flows  : 33973880  Servers :65532   Socket :88840 
> Socket/Clients :  352.5 
>  drop-rate   :  12.79 Gbps   
>  current time: 423.4 sec  
>  test duration   : 99576.6 sec
> 
> One point that I missed and that may be helpful: I run T-Rex with the '-p' 
> parameter:
> ./t-rex-64 -c 6 -d 10 -f cap2/sfr.yaml --cfg cfg/trex_cfg.yaml -m 30 -p
> 
> Thanks,
> Sincerely
> 
> From: Andrew  Yourtchenko 
> Sent: Wednesday, May 30, 2018 12:08 PM
> To: Rubina Bianchi
> Cc: vpp-dev@lists.fd.io
> Subject: Re: [vpp-dev] Rx stuck to 0 after a while
>  
> Dear Rubina,
> 
> Thanks for checking it!
> 
> yeah actually that patch was leaking the sessions in the session reuse
> path. I have got the setup in the lab locally yesterday and am working
> on a better way to do it...
> 
> Will get back to you when I am happy with the way the code works..
> 
> --a
> 
> 
> 
> On 5/29/18, Rubina Bianchi  wrote:
> > Dear Andrew
> >
> > I cleaned everything and created new deb packages with your patch once
> > again. With your patch I never see the deadlock again, but I still have a
> > throughput problem in my scenario.
> >
> > -Per port stats table
> >   ports |   0 |   1
> > -
> >opackets |   474826597 |   452028770
> >  obytes |207843848531 |199591809555
> >ipackets |71010677 |72028456
> >  ibytes | 31441646551 | 31687562468
> > ierrors |   0 |   0
> > oerrors |   0 |   0
> >   Tx Bw |   9.56 Gbps |   9.16 Gbps
> >
> > -Global stats enabled
> >  Cpu Utilization : 88.4  %  7.1 Gb/core
> >  Platform_factor : 1.0
> >  Total-Tx:  18.72 Gbps
> >  Total-Rx:  59.30 Mbps
> >  Total-PPS   :   5.31 Mpps
> >  Total-CPS   :  79.79 Kcps
> >
> >  Expected-PPS:   9.02 Mpps
> >  Expected-CPS: 135.31 Kcps
> >  Expected-BPS:  31.77 Gbps
> >
> >  Active-flows:88837  Clients :  252   Socket-util : 0.5598 %
> >  Open-flows  : 14708455  Servers :65532   Socket :88837
> > Socket/Clients :  352.5
> >  Total_queue_full : 328355248
> >  drop-rate   :  18.66 Gbps
> >  current time: 180.9 sec
> >  test duration   : 99819.1 sec
> >
> > In the best case (4 interfaces on one numa node, only 2 of which have acl), my
> > device (HP DL380 G9) reaches maximum throughput (18.72 Gbps), but in the worst
> > case (4 interfaces on one numa node, all of which have acl) throughput
> > decreases from the maximum to around 60 Mbps. The patch just prevents the
> > deadlock in my case; throughput is the same as before.
> >
> > 
> > From: Andrew  Yourtchenko 
> > Sent: Tuesday, May 29, 2018 10:11 AM
> > To: Rubina Bianchi
> > Cc: vpp-dev@lists.fd.io
> > Subject: Re: [vpp-dev] Rx stuck to 0 after a while
> >
> > Dear Rubina,
> >
> > thank you for quickly checking it!
> >
> > Judging by the logs, VPP quits, so I would say there should be a
> > core file; could you check?
> >
> > If you find it (double-check by the timestamps that it is indeed the
> > fresh one), you can load it in gdb (using