[vpp-dev] Want to switch to dpdk17.11.4 ,using vpp 18.0.1

2018-09-25 Thread chetan bhasin
Hi everyone,

We are using VPP 18.0.1, which internally uses DPDK 17.11. We want to
switch to DPDK 17.11.4 as it has Mellanox fixes.

Can anybody suggest the steps to do so? Does it have any impact?

Thanks,
Chetan Bhasin
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#10664): https://lists.fd.io/g/vpp-dev/message/10664
Mute This Topic: https://lists.fd.io/mt/26227852/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


[vpp-dev] IPsec+ikev2 #vpp

2018-09-25 Thread yang . kai
Hi all,

  I'm trying to set up an IPsec tunnel between VPP and my gateway, but it always fails. 
After checking, I found that my gateway uses IKEv1 while VPP uses IKEv2. Can IKEv1 
and IKEv2 interoperate, and if so, how is that configured? If they cannot, I would 
like to port the openswan IKEv1 implementation to VPP - is that feasible?

Thanks,
Best regards.

Ken
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#10663): https://lists.fd.io/g/vpp-dev/message/10663
Mute This Topic: https://lists.fd.io/mt/26226994/21656
Mute #vpp: https://lists.fd.io/mk?hashtag=vpp=1480452
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


[vpp-dev] L2TP #vpp

2018-09-25 Thread yang . kai
Hi all, 
  I'm Ken and I want to set up an L2TP tunnel between VPP and my gateway, but I found 
that my gateway is using L2TP over IPv4 while VPP only supports L2TPv3 over IPv6. I 
would like to know whether L2TPv3 is backward compatible with L2TP, and how to 
configure that. Also, how can I configure L2TPv3 over IPv4? If I want IPv4 support, 
what should I do?

Thank you !
Best regards!

Ken
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#10662): https://lists.fd.io/g/vpp-dev/message/10662
Mute This Topic: https://lists.fd.io/mt/26226885/21656
Mute #vpp: https://lists.fd.io/mk?hashtag=vpp=1480452
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] Odd problem in adding new session on vpp when its session table is full

2018-09-25 Thread Andrew Yourtchenko
Dear Rubina,

On 9/25/18, Rubina Bianchi  wrote:
> Dear Andrew,
>
> Actually it's not an RFC standard or a known attack; we derived the
> restriction by experimenting with Linux netfilter and tried to implement its
> behavior.

hmm okay. so my position about implementing it still stands then :-)

Also, another point - if you send two echo requests at once with the
same code (which is fine, because they are differentiated by ICMP ID
and sequence number), then the second request will overwrite the bihash
values on activation (which is fine), and then the first reply that
comes will erase those upon deactivation, so the reply to the second
request will be dropped. This can make troubleshooting harder in some
cases since it is confusing.  One way to deal with that is to keep some
sort of reference counter, or to extend the bihash key, but again, in the
absence of a real-world threat model this looks like shooting pigeons
from ICBMs... (the only attack I know for this scenario is smurf, and
the existing implementation covers it fine)
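
For illustration, a purely hypothetical sketch of what an extended key could
look like - this is not the acl-plugin's actual session key and the names are
made up; it only shows the "extend the bihash key" idea (a per-entry reference
counter being the alternative):

#include <stdint.h>

/* Hypothetical sketch, not VPP code: adding the ICMP sequence number to
 * the session key keeps two concurrent echo requests (same ID, different
 * sequence numbers) as distinct bihash entries, so the reply to one
 * cannot erase the entry created by the other. */
typedef struct
{
  uint32_t src_addr;
  uint32_t dst_addr;
  uint8_t  proto;      /* ICMP */
  uint16_t icmp_id;    /* already differentiates independent "pings" */
  uint16_t icmp_seq;   /* hypothetical extra key component */
} icmp_session_key_t;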

>
> I checked your patch, throughput is fine. After about 2000 seconds my
> throughput was still maximum and there was no drop-rate on trex. Also,
> rx-miss was very little in this patch.

Excellent, thanks a lot!

> However, one thing that is interesting to
> me is that, in the same scenario as in the previous discussion, vpp served this
> traffic with 3.6 million sessions, but with the current patch it serves 3.9
> million sessions. Is this related to your last changes?

Yeah, it could be...  the new change makes the expiry somewhat longer:
a session in any given timeout list is checked twice per timeout,
rather than every X seconds (X being the shortest list timeout). So if
e.g. a UDP session timeout is 100 seconds, and there is a packet on it
10 seconds before the session is checked, then at the next check 50
seconds later the idle time on the session (60 seconds) is still smaller
than 100, so the session will be expunged only 50 seconds after that,
with an idle time of 110 seconds.
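
A minimal sketch of that arithmetic, assuming a plain "expire when idle time
exceeds the timeout" check run every timeout/2 seconds (toy code, not the
acl-plugin implementation):

#include <stdio.h>

/* Toy model of the lazy expiry check: the list is scanned every
 * timeout/2 seconds, so a session can survive up to ~1.5x its timeout
 * after the last packet before it is expunged. */
typedef struct
{
  double last_active;   /* time of the last packet on the session */
  double timeout;       /* e.g. 100 s for a UDP session */
} toy_session_t;

static int
toy_session_expired (const toy_session_t * s, double now)
{
  return (now - s->last_active) > s->timeout;
}

int
main (void)
{
  toy_session_t s = { .last_active = 40.0, .timeout = 100.0 };
  /* checks every 50 s: t=50 idle=10, t=100 idle=60 (both kept),
   * t=150 idle=110 -> expunged, matching the 110-second idle time above */
  for (double t = 50.0; t <= 150.0; t += 50.0)
    printf ("t=%.0f idle=%.0f expired=%d\n", t, t - s.last_active,
            toy_session_expired (&s, t));
  return 0;
}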

Contrast this with the previous behavior where all the lists might be
checked as often as 2 seconds - this would have resulted in a more
precise timing out, but at an expense of a lot of CPU usage.

But then all of the TCP sessions will transition into the tcp
transient state upon closure, so the only impact in the real world will
be on the UDP connections, which should be relatively small.

All that said: I am considering moving the timeout infra to the
tw_timers (which weren't available back when I was writing the code) -
then in theory we can get both the precise expiry and the efficiency.
But that is a larger change, and I am not sure of it yet, so I wanted
to get this simpler solution in first.

--a

>
> Thanks,
> Sincerely
> 
> From: Andrew  Yourtchenko 
> Sent: Monday, September 24, 2018 6:49 PM
> To: Rubina Bianchi
> Cc: vpp-dev@lists.fd.io
> Subject: Re: [vpp-dev] Odd problem in adding new session on vpp when its
> session table is full
>
> Dear Rubina,
>
> On 9/24/18, Rubina Bianchi  wrote:
>> Dear Andrew,
>>
>> It's hardcoded as it was simple and fast solution for our default
>> scenario
>> implementation.
>> As you correctly mentioned in previous email one of the bug fixes was the
>> restriction. Also another one is preventing reply packets pass through
>> even
>
> ok. that's more a "featurette" - I deliberately did not attempt to
> implement the "strict checking" because I had difficult time finding
> the attack vector. (rather than maybe some kind of "compliance" checks
> for the reasons of compliance ?)
>
>> if those packets are matched with an acl rule. In another word these
>> reply
>> packets are not belong to any echo request in reverse direction.
>
> hmm so you are sort-of making a "protocol inspection engine" there ? :-)
>
> Anyway, so far I haven't managed to recreate this condition - though
> if you were running the 18.07 rather than 18.07.1 code, then the bug
> related to hash acl manipulation on ACL changes might have caused that
> effect... I will experiment a bit more, though.
>
> Also, remember the other thread we discussed a while ago about the
> throughput getting lower over time.. I have made
> https://gerrit.fd.io/r/#/c/14821/ which should significantly reduce
> the amount of session list shuffling work in normal case scenarios.
> Before I commit it, could you give it a shot to see if it indeed
> behaves as I would expect it to behave ?  Thanks a lot!
>
> --a
>
>> Thanks,
>> Sincerely
>> 
>> From: Andrew  Yourtchenko 
>> Sent: Tuesday, September 18, 2018 4:06 PM
>> To: Rubina Bianchi
>> Cc: vpp-dev@lists.fd.io
>> Subject: Re: [vpp-dev] Odd problem in adding new session on vpp when its
>> session table is full
>>
>> Dear Rubina,
>>
>> On 9/18/18, Rubina Bianchi  wrote:
>>> Dear Andrew,
>>>
>>> Our changes is provided to you by creating a patch which is attached to
>>> this
>>> email.
>>> I didn't commit it to gerrit 

Re: [vpp-dev] how to pull a 18.07 code??

2018-09-25 Thread Jit Mehta
Thank you Ed. :-)

- J

On Tue, Sep 25, 2018 at 11:43 AM Ed Warnicke  wrote:

> git checkout v18.07.1
>
> will get you to the latest release code
>
> If you want the stable/18.07 branch:
>
> git checkout stable/1807
>
> Ed
>
>
> On September 24, 2018 at 1:43:06 PM, Jit Mehta (
> jitendra.harshad...@gmail.com) wrote:
>
> Could someone tell me how do I pull the 18.07 code?
>
> I have never pulled tree with a label before so
>
> not sure how to do it.
>
> BTW, I pulled latest VPP (*git* clone http://gerrit.fd.io/r/vpp)and have
> run into the following cmake error:
>
> CMake Error at /usr/share/cmake-3.5/Modules/CMakeTestCCompiler.cmake:61
> (message):
>
>   The C compiler "/usr/lib/ccache/cc" is not able to compile a simple test
>
>   program.
>
>
>   It fails with the following output:
>
>
>Change Dir:
> /home/lab/VPP/vpp/build-root/build-vpp-native/vpp/CMakeFiles/CMakeTmp
>
>
>   Run Build Command:"/usr/bin/ninja" "cmTC_4d84d"
>
> The 18.07 code I had built fine earlier.
>
>
> Thanks,
>
> J
>
> -=-=-=-=-=-=-=-=-=-=-=-
> Links: You receive all messages sent to this group.
>
> View/Reply Online (#10630): https://lists.fd.io/g/vpp-dev/message/10630
> Mute This Topic: https://lists.fd.io/mt/26202696/464962
> Group Owner: vpp-dev+ow...@lists.fd.io
> Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [hagb...@gmail.com]
> -=-=-=-=-=-=-=-=-=-=-=-
>
>
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#10660): https://lists.fd.io/g/vpp-dev/message/10660
Mute This Topic: https://lists.fd.io/mt/26202696/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] how to pull a 18.07 code??

2018-09-25 Thread Edward Warnicke
git checkout v18.07.1

will get you to the latest release code

If you want the stable/18.07 branch:

git checkout stable/1807

Ed


On September 24, 2018 at 1:43:06 PM, Jit Mehta (
jitendra.harshad...@gmail.com) wrote:

Could someone tell me how to pull the 18.07 code?

I have never pulled a tree with a label before, so

I'm not sure how to do it.

BTW, I pulled the latest VPP (*git* clone http://gerrit.fd.io/r/vpp) and have
run into the following cmake error:

CMake Error at /usr/share/cmake-3.5/Modules/CMakeTestCCompiler.cmake:61
(message):

  The C compiler "/usr/lib/ccache/cc" is not able to compile a simple test

  program.


  It fails with the following output:


   Change Dir:
/home/lab/VPP/vpp/build-root/build-vpp-native/vpp/CMakeFiles/CMakeTmp


  Run Build Command:"/usr/bin/ninja" "cmTC_4d84d"

The 18.07 code I had built fine earlier.


Thanks,

J

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#10630): https://lists.fd.io/g/vpp-dev/message/10630
Mute This Topic: https://lists.fd.io/mt/26202696/464962
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [hagb...@gmail.com]
-=-=-=-=-=-=-=-=-=-=-=-
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#10659): https://lists.fd.io/g/vpp-dev/message/10659
Mute This Topic: https://lists.fd.io/mt/26202696/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [**EXTERNAL**] Fwd: [vpp-dev] Failing to create untagged sub-interface

2018-09-25 Thread Bly, Mike
Neale,

We are in need of providing a full blend of L2 and L3 services on each NIC. The 
permutations dictate a need for untagged to be separated from what VPP calls 
“default” (all non-explicit tagged traffic) on disparate sub-interfaces. I have 
the changes done locally that appear to make this work and all of the 
permutations appear to function correctly from an L2 perspective (refer to my 
original simple config).

I wanted to run “make test” before submitting my patch, but even with the stock 
stable/1801 branch I am running into test failures on a CentOS container setup. 
All preceding sub-tests (except MAC learning) are passing. Digging through 
online search results to see what I can figure out.

==
ARP Test Case
==
ARP  OK
Fatal Python error: Bus error

Thread 0x7ff607bff700  (most recent call first):
  File "/usr/lib64/python2.7/threading.py", line 339 in wait
  File "/usr/lib64/python2.7/Queue.py", line 168 in get
  File "build/bdist.linux-x86_64/egg/vpp_papi.py", line 850 in 
thread_msg_handler
  File "/usr/lib64/python2.7/threading.py", line 765 in run
  File "/usr/lib64/python2.7/threading.py", line 812 in __bootstrap_inner
  File "/usr/lib64/python2.7/threading.py", line 785 in __bootstrap
...
Thread 0x7ff799844700  (most recent call first):
  File "/usr/lib64/python2.7/threading.py", line 339 in wait
  File "/usr/lib64/python2.7/Queue.py", line 168 in get
  File "build/bdist.linux-x86_64/egg/vpp_papi.py", line 850 in 
thread_msg_handler
  File "/usr/lib64/python2.7/threading.py", line 765 in run
  File "/usr/lib64/python2.7/threading.py", line 812 in __bootstrap_inner
  File "/usr/lib64/python2.7/threading.py", line 785 in __bootstrap

Current thread 0x7ff7a4950740  (most recent call first):
  File "build/bdist.linux-x86_64/egg/vpp_papi.py", line 616 in _write_new_cffi
  File "build/bdist.linux-x86_64/egg/vpp_papi.py", line 773 in _call_vpp
  File "build/bdist.linux-x86_64/egg/vpp_papi.py", line 571 in 
  File "build/bdist.linux-x86_64/egg/vpp_papi.py", line 99 in __call__
  File "/workspace/vpp/test/vpp_papi_provider.py", line 184 in cli
  File "/workspace/vpp/test/framework.py", line 496 in setUp
  File "/workspace/vpp/test/test_neighbor.py", line 24 in setUp
  File "/usr/lib64/python2.7/unittest/case.py", line 360 in run
  File "/usr/lib64/python2.7/unittest/case.py", line 433 in __call__
  File "/usr/lib64/python2.7/unittest/suite.py", line 108 in run
  File "/usr/lib64/python2.7/unittest/suite.py", line 70 in __call__
  File "/usr/lib64/python2.7/unittest/runner.py", line 151 in run
  File "/workspace/vpp/test/framework.py", line 1108 in run
  File "run_tests.py", line 21 in test_runner_wrapper
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 114 in run
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 258 in _bootstrap
  File "/usr/lib64/python2.7/multiprocessing/forking.py", line 126 in __init__
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 130 in start
  File "run_tests.py", line 57 in run_forked
  File "run_tests.py", line 159 in 
22:36:41,079 Timeout while waiting for child test runner process (last test 
running was `ARP Duplicates' in `/tmp/vpp-unittest-ARPTestCase-HV5bsk')!
22:36:41,080 Creating a link to the failed test: 
/tmp/vpp-failed-unittests/vpp-unittest-ARPTestCase-HV5bsk-FAILED -> 
vpp-unittest-ARPTestCase-HV5bsk
1 test(s) failed, 0 attempt(s) left
Killing possible remaining process IDs:  1108 1115 1117
Not compressing files in temporary directories from failed test runs.
make[1]: *** [test] Error 1
make[1]: Leaving directory `/workspace/vpp/test'
make: *** [test] Error 2


-Mike

From: Neale Ranns (nranns) 
Sent: Tuesday, September 25, 2018 12:39 AM
To: Bly, Mike ; John Lo (loj) ; Edward Warnicke 
; vpp-dev@lists.fd.io
Subject: Re: [**EXTERNAL**] Fwd: [vpp-dev] Failing to create untagged 
sub-interface

Hi Mike,

Perhaps you could tell us why you want to create an untagged sub-interface.

Regards,
Neale


From: vpp-dev@lists.fd.io on behalf of "Bly, Mike" (m...@ciena.com)
Date: Friday, 21 September 2018 at 17:06
To: "John Lo (loj)" (l...@cisco.com), Edward Warnicke (hagb...@gmail.com), 
"vpp-dev@lists.fd.io" (vpp-dev@lists.fd.io)
Subject: Re: [**EXTERNAL**] Fwd: [vpp-dev] Failing to create untagged 
sub-interface

John,

Any advice on this is appreciated. We can certainly dig into this, but we first 
wanted to sanity-check with the community in case there was something obvious 
as to why it is working the way it currently does. I am hopeful that between your 
efforts and ours we can run this to ground in short order.

-Mike

From: vpp-dev@lists.fd.io 
mailto:vpp-dev@lists.fd.io>> On Behalf Of John Lo (loj) 

Re: [vpp-dev] PCI domain should be 32 bit

2018-09-25 Thread Stephen Hemminger
On Tue, 25 Sep 2018 15:16:04 +0200
Damjan Marion  wrote:

> > On 25 Sep 2018, at 15:03, Stephen Hemminger  
> > wrote:
> > 
> > I noticed that the PCI domain in VPP is limited to 16 bits.
> > This is a problem on Azure/Hyper-V and other virtual environments.
> > In these environments, the host will generate a 32 bit synthetic value
> > when passing a PCI device through. The Linux kernel has 32 bit PCI 
> > domains, and libpciaccess used by X now has 32 bit PCI domains.
> > The PCI specification says domain is supposed to be 32 bit so this is
> > not just Linux/Hyper-V issue.
> > 
> > Unfortunately, VPP has 16 bit PCI domain hard coded. This is not a show
> > stopper now because the main use of PCI is for SR-IOV passthrough. In the
> > passthrough case, the VF is taken over by other devices and the actual PCI
> > value is not something VPP has to worry about directly. With DPDK today,
> > this is done via TAP/FAILSAFE/VDEV_NETVSC devices. The issue would only be
> > visible with direct device assignment.
> > 
> > Fixing it in not too hard (only in pci.h) but it would cause the device
> > union to get larger and all the places as_u32 is used would have to become
> > as_u64. Mostly mechanical, but always a possibility of errors.  
> 
> Feel free to submit patch which moves to u64, i don't expect major issues.
> This is purely historic
> 

Ok, the biggest issue is the places that take the 32-bit value into and out of a
long, as in ntohl (pci_addr.as_u32).
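
For reference, a sketch of what the widened layout could look like, modeled on
the shape of the existing vlib_pci_addr_t union but not a verified diff (field
packing and names may differ in the actual patch):

#include <stdint.h>

/* Sketch only: domain widened from 16 to 32 bits, so the flat accessor
 * has to grow from 32 to 64 bits as well. */
typedef union
{
  struct
  {
    uint32_t domain;      /* was 16 bits; PCI spec and Linux allow 32 */
    uint8_t bus;
    uint8_t slot:5;
    uint8_t function:3;
  };
  uint64_t as_u64;        /* replaces the old 32-bit as_u32 accessor */
} pci_addr_sketch_t;

Call sites like the ntohl (pci_addr.as_u32) above would then need a 64-bit byte
swap or per-field handling, which is exactly the mechanical-but-easy-to-get-wrong
part.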
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#10657): https://lists.fd.io/g/vpp-dev/message/10657
Mute This Topic: https://lists.fd.io/mt/26219976/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] Odd problem in adding new session on vpp when its session table is full

2018-09-25 Thread Rubina Bianchi
Dear Andrew,

Actually it's not an RFC standard or a known attack; we derived the restriction 
by experimenting with Linux netfilter and tried to implement its behavior.

I checked your patch, throughput is fine. After about 2000 seconds my 
throughput was still maximum and there was no drop-rate on trex. Also, rx-miss 
was very little in this patch. However, one thing that is interesting to me is that, 
in the same scenario as in the previous discussion, vpp served this traffic with 3.6 
million sessions, but with the current patch it serves 3.9 million sessions. Is 
this related to your last changes?

Thanks,
Sincerely

From: Andrew  Yourtchenko 
Sent: Monday, September 24, 2018 6:49 PM
To: Rubina Bianchi
Cc: vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] Odd problem in adding new session on vpp when its 
session table is full

Dear Rubina,

On 9/24/18, Rubina Bianchi  wrote:
> Dear Andrew,
>
> It's hardcoded as it was a simple and fast solution for our default scenario
> implementation.
> As you correctly mentioned in the previous email, one of the bug fixes was the
> restriction. Also, another one is preventing reply packets from passing through even

ok, that's more of a "featurette" - I deliberately did not attempt to
implement the "strict checking" because I had a difficult time finding
the attack vector (other than maybe some kind of "compliance" checks
done for the sake of compliance?)

> if those packets are matched with an acl rule. In other words, these reply
> packets do not belong to any echo request in the reverse direction.

hmm so you are sort-of making a "protocol inspection engine" there ? :-)

Anyway, so far I haven't managed to recreate this condition - though
if you were running the 18.07 rather than 18.07.1 code, then the bug
related to hash acl manipulation on ACL changes might have caused that
effect... I will experiment a bit more, though.

Also, remember the other thread we discussed a while ago about the
throughput getting lower over time.. I have made
https://gerrit.fd.io/r/#/c/14821/ which should significantly reduce
the amount of session list shuffling work in normal case scenarios.
Before I commit it, could you give it a shot to see if it indeed
behaves as I would expect it to behave ?  Thanks a lot!

--a

> Thanks,
> Sincerely
> 
> From: Andrew  Yourtchenko 
> Sent: Tuesday, September 18, 2018 4:06 PM
> To: Rubina Bianchi
> Cc: vpp-dev@lists.fd.io
> Subject: Re: [vpp-dev] Odd problem in adding new session on vpp when its
> session table is full
>
> Dear Rubina,
>
> On 9/18/18, Rubina Bianchi  wrote:
>> Dear Andrew,
>>
>> Our changes is provided to you by creating a patch which is attached to
>> this
>> email.
>> I didn't commit it to gerrit due to our specific scenario (permit+reflect
>> on
>> all inputs, permit+reflect or deny on all outputs).
>
> Why do you hardcode it as opposed to making it part of configuration ?
> permit+reflect in one direction and deny except established sessions
> is a fairly standard config.
>
>> In addition to ICMP timeout handling, our code fixes some ICMP bugs.
>
> Do you mean the "strict"  enforcement of the one-request-one-response
> policy for ICMP that this code does ?
>
> --a
>
>> Although, I think code is clear for you, I can explain it in details if
>> you
>> ask.
>>
>> Thanks,
>> Sincerely
>> 
>> From: Andrew  Yourtchenko 
>> Sent: Tuesday, September 18, 2018 11:27 AM
>> To: Rubina Bianchi
>> Cc: vpp-dev@lists.fd.io
>> Subject: Re: [vpp-dev] Odd problem in adding new session on vpp when its
>> session table is full
>>
>>
>>
>>
>> Hi Rubina,
>>
>> On 18 Sep 2018, at 11:14, Rubina Bianchi
>> mailto:r_bian...@outlook.com>> wrote:
>>
>> Hi Dear Andrew
>>
>> 1) I just attached my init.conf to this email. As you guessed session
>> table
>> size is 100. This problem is occurred on vpp stable/1807.
>>
>> Ah, cool, that helps, thanks!
>>
>>
>> 2) Yes, there is 6 timeout list. We added a list for handling icmp
>> timeouts.
>>
>> That is not the stable/1807, then ☺️ would you mind submitting the change
>> to
>> gerrit so we could take a look at it and ideally incorporate into the
>> master
>> ?
>>
>> —a
>>
>>
>> 
>> From: Andrew  Yourtchenko
>> mailto:ayour...@gmail.com>>
>> Sent: Monday, September 17, 2018 8:03 PM
>> To: Rubina Bianchi
>> Cc: vpp-dev@lists.fd.io
>> Subject: Re: [vpp-dev] Odd problem in adding new session on vpp when its
>> session table is full
>>
>> Dear Rubina,
>>
>> looking at the outputs, there are a few anomalies that hopefully you
>> can clarify:
>>
>> 1) the max session count is 100. The latest master has the default
>> limit of 50, and I do not see any startup config parameters
>> changing that. Which version are you testing with/building off ?
>>
>> 2) there are 6 fa_conn_list_head elements in each worker for your
>> outputs. That number was initially 3, and in the early spring when I
>> 

Re: [vpp-dev] PCI domain should be 32 bit

2018-09-25 Thread Damjan Marion via Lists.Fd.Io

> On 25 Sep 2018, at 15:03, Stephen Hemminger  
> wrote:
> 
> I noticed that the PCI domain in VPP is limited to 16 bits.
> This is a problem on Azure/Hyper-V and other virtual environments.
> In these environments, the host will generate a 32 bit synthetic value
> when passing a PCI device through. The Linux kernel has 32 bit PCI 
> domains, and libpciaccess used by X now has 32 bit PCI domains.
> The PCI specification says domain is supposed to be 32 bit so this is
> not just Linux/Hyper-V issue.
> 
> Unfortunately, VPP has 16 bit PCI domain hard coded. This is not a show
> stopper now because the main use of PCI is for SR-IOV passthrough. In the
> passthrough case, the VF is taken over by other devices and the actual PCI
> value is not something VPP has to worry about directly. With DPDK today,
> this is done via TAP/FAILSAFE/VDEV_NETVSC devices. The issue would only be
> visible with direct device assignment.
> 
> Fixing it in not too hard (only in pci.h) but it would cause the device
> union to get larger and all the places as_u32 is used would have to become
> as_u64. Mostly mechanical, but always a possibility of errors.

Feel free to submit a patch which moves to u64, I don't expect major issues.
This is purely historic.

-- 
Damjan
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#10655): https://lists.fd.io/g/vpp-dev/message/10655
Mute This Topic: https://lists.fd.io/mt/26219976/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


[vpp-dev] PCI domain should be 32 bit

2018-09-25 Thread Stephen Hemminger
I noticed that the PCI domain in VPP is limited to 16 bits.
This is a problem on Azure/Hyper-V and other virtual environments.
In these environments, the host will generate a 32 bit synthetic value
when passing a PCI device through. The Linux kernel has 32 bit PCI 
domains, and libpciaccess used by X now has 32 bit PCI domains.
The PCI specification says domain is supposed to be 32 bit so this is
not just Linux/Hyper-V issue.

Unfortunately, VPP has 16 bit PCI domain hard coded. This is not a show
stopper now because the main use of PCI is for SR-IOV passthrough. In the
passthrough case, the VF is taken over by other devices and the actual PCI
value is not something VPP has to worry about directly. With DPDK today,
this is done via TAP/FAILSAFE/VDEV_NETVSC devices. The issue would only be
visible with direct device assignment.

Fixing it is not too hard (only in pci.h), but it would cause the device
union to get larger and all the places where as_u32 is used would have to become
as_u64. Mostly mechanical, but always a possibility of errors.
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#10654): https://lists.fd.io/g/vpp-dev/message/10654
Mute This Topic: https://lists.fd.io/mt/26219976/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] continuous decline in throughput with acl

2018-09-25 Thread Andrew Yourtchenko
Hmm this output does look exactly like what I had before commit ed435504 - the 
IPv6 table bucket area gets corrupted by the overruns from IPv4 table arena.

And then I spotted the leak which was happening during the rehash which I fixed 
in 11521387.

Your (very few) freelists look like you don’t have that fix either - normally 
you would see something 
like in http://paste.ubuntu.com/p/y43MsrdHSr/ - notice how many short elements 
are in the freelists.

And given that your tests confirm the session cleaner optimization works as I 
intended, I have it now in master.

So you can just pull a fresh tree from master and recheck :)

--a

> On 25 Sep 2018, at 13:10, khers  wrote:
> 
> I checked out from gerrrit, I think  it's using latest master. ( but i make 
> another version to make sure )
> let me explain how I produce this situation, 
> 
> while true  :)
> 1- run trex 
> ./t-rex-64 --cfg cfg/trex_config.yaml  -f cap2/sfr.yaml -m 7 -c 2 
> 2-stop trex after ~120 seconds
> 3- wait until all session deleted from 'sh acl-plugin session'
> 
> As you see I waited until all session is deleted so all bucket must be 
> completely free, 
> I send you part of  'show acl-plugin sessions verbose 1'  in this link. 
> 
> 
> 
>> On Tue, Sep 25, 2018 at 1:52 PM Andrew  Yourtchenko  
>> wrote:
>> Are you using latest master ? I fixed a couple of issues in bihash last week 
>> related to memory usage... if it’s the latest master, the output of used vs 
>> available looks weird... - so please let me know...
>> 
>> As for the “general” growth - basically what happens is bihash doubles each 
>> bucket size whenever there is a collision on insert, and then converts the 
>> bucket into linear lookup whenever there is still a collision after that 
>> growth.
>> 
>> Then the only time the shrinkage/reset is happening is when the bucket is 
>> completely free - which with long living sessions with overlapping lifetimes 
>> might mean never.
>> 
>> So one approach to this is to increase the number of buckets. Then they will 
>> be smaller and have higher probability of being freed.
>> 
>> This is assuming there is nothing else “funny” going on. You can do “show 
>> acl-plugin sessions verbose 1” via vppctl (It will take forever to complete 
>> and needs pager disabled since it dumps the entire bihash) to inspect the 
>> way the buckets are filled...
>> 
>> --a
>> 
>>> On 25 Sep 2018, at 12:12, khers  wrote:
>>> 
>>> It's amazing!!!
>>> 
>>> IPv4 Session lookup hash table:
>>> Hash table ACL plugin FA IPv4 session bihash
>>> 968086 active elements 65536 active buckets
>>> 13 free lists
>>>[len 16] 1 free elts
>>>[len 32] 1 free elts
>>>[len 256] 10669 free elts
>>>[len 512] 36768 free elts
>>>[len 1024] 4110 free elts
>>>[len 2048] 156 free elts
>>>[len 4096] 4 free elts
>>> 844 linear search buckets
>>> arena: base 7fe91232, next 2680ca780
>>>used 10335594368 b (9856 Mbytes) of 100 b (9536 Mbytes)
>>> 
>>> 
 On Tue, Sep 25, 2018 at 1:39 PM khers  wrote:
 Yes, that's right. I think is completely another issue from the patch you 
 sent
 
> On Tue, Sep 25, 2018 at 1:35 PM Andrew  Yourtchenko  
> wrote:
> Excellent, thanks!
> 
> Memory usage - you mean in bihash arena ?
> 
> --a
> 
>> On 25 Sep 2018, at 11:38, khers  wrote:
>> 
>> Throughput and session add/del is stable as rock. The only danger i see 
>> is growing memory usage.
>> look at this 
>> 
>> 
>>> On Tue, Sep 25, 2018 at 11:31 AM khers  wrote:
>>> Of course, I test your patch, there is no slowdown with my scenario. I 
>>> need more time to test other
>>> scenarios and make sure.
>>> 
>>> 
 On Mon, Sep 24, 2018 at 3:11 PM Andrew  Yourtchenko 
  wrote:
 Cool. Then it is probably indeed the session requeues that are not yet 
 efficient... I have been looking at optimizing that.
 
 I have a draft in the works which should have less session requeues - 
 I have just added you to it, could you give it a shot and see if it 
 makes things better ? 
 
 --a
 
> On 24 Sep 2018, at 12:55, khers  wrote:
> 
> yes, I confirm
> 
>> On Mon, Sep 24, 2018 at 2:08 PM Andrew  Yourtchenko 
>>  wrote:
>> Okay, so what I think I am hearing - the gradual slowdown is/was 
>> always there, and is somewhat more pronounced in master, right ?
>> 
>> --a
>> 
>>> On 24 Sep 2018, at 11:49, khers  wrote:
>>> 
>>> I allways get SIGSEGV or 'worker thread dead lock' In 1804 with 1 
>>> or more worker thread and 1 main,
>>> but when vpp using one cpu I hadn't any problem. In the 1807 multi 
>>> core is stable i didn't see any of those
>>> problem but throughput is declining slowly.
>>> 

Re: [vpp-dev] continuous decline in throughput with acl

2018-09-25 Thread emma sdi
I checked it out from gerrit; I think it's using the latest master (but I'll make
another version to make sure).
Let me explain how I produce this situation:

while true  :)
1- run trex
./t-rex-64 --cfg cfg/trex_config.yaml  -f cap2/sfr.yaml -m 7 -c 2
2- stop trex after ~120 seconds
3- wait until all sessions are deleted from 'sh acl-plugin session'

As you can see, I waited until all sessions were deleted, so all buckets must be
completely free.
I sent you part of 'show acl-plugin sessions verbose 1' in this link.



On Tue, Sep 25, 2018 at 1:52 PM Andrew  Yourtchenko 
wrote:

> Are you using latest master ? I fixed a couple of issues in bihash last
> week related to memory usage... if it’s the latest master, the output of
> used vs available looks weird... - so please let me know...
>
> As for the “general” growth - basically what happens is bihash doubles
> each bucket size whenever there is a collision on insert, and then converts
> the bucket into linear lookup whenever there is still a collision after
> that growth.
>
> Then the only time the shrinkage/reset is happening is when the bucket is
> completely free - which with long living sessions with overlapping
> lifetimes might mean never.
>
> So one approach to this is to increase the number of buckets. Then they
> will be smaller and have higher probability of being freed.
>
> This is assuming there is nothing else “funny” going on. You can do “show
> acl-plugin sessions verbose 1” via vppctl (It will take forever to complete
> and needs pager disabled since it dumps the entire bihash) to inspect the
> way the buckets are filled...
>
> --a
>
> On 25 Sep 2018, at 12:12, khers  wrote:
>
> It's amazing!!!
>
> IPv4 Session lookup hash table:
> Hash table ACL plugin FA IPv4 session bihash
> 968086 active elements 65536 active buckets
> 13 free lists
>[len 16] 1 free elts
>[len 32] 1 free elts
>[len 256] 10669 free elts
>[len 512] 36768 free elts
>[len 1024] 4110 free elts
>[len 2048] 156 free elts
>[len 4096] 4 free elts
> 844 linear search buckets
> arena: base 7fe91232, next 2680ca780
> *   used 10335594368 b (9856 Mbytes) of 100 b (9536
> Mbytes)*
>
>
> On Tue, Sep 25, 2018 at 1:39 PM khers  wrote:
>
>> Yes, that's right. I think is completely another issue from the patch you
>> sent
>>
>> On Tue, Sep 25, 2018 at 1:35 PM Andrew  Yourtchenko 
>> wrote:
>>
>>> Excellent, thanks!
>>>
>>> Memory usage - you mean in bihash arena ?
>>>
>>> --a
>>>
>>> On 25 Sep 2018, at 11:38, khers  wrote:
>>>
>>> Throughput and session add/del is stable as rock. The only danger i see
>>> is growing memory usage.
>>> look at this 
>>>
>>>
>>> On Tue, Sep 25, 2018 at 11:31 AM khers  wrote:
>>>
 Of course, I test your patch, there is no slowdown with my scenario. I
 need more time to test other
 scenarios and make sure.


 On Mon, Sep 24, 2018 at 3:11 PM Andrew  Yourtchenko <
 ayour...@gmail.com> wrote:

> Cool. Then it is probably indeed the session requeues that are not yet
> efficient... I have been looking at optimizing that.
>
> I have a draft in the works which should have less session requeues -
> I have just added you to it, could you give it a shot and see if it makes
> things better ?
>
> --a
>
> On 24 Sep 2018, at 12:55, khers  wrote:
>
> yes, I confirm
>
> On Mon, Sep 24, 2018 at 2:08 PM Andrew  Yourtchenko <
> ayour...@gmail.com> wrote:
>
>> Okay, so what I think I am hearing - the gradual slowdown is/was
>> always there, and is somewhat more pronounced in master, right ?
>>
>> --a
>>
>> On 24 Sep 2018, at 11:49, khers  wrote:
>>
>> I allways get SIGSEGV or 'worker thread dead lock' In 1804 with 1 or
>> more worker thread and 1 main,
>> but when vpp using one cpu I hadn't any problem. In the 1807 multi
>> core is stable i didn't see any of those
>> problem but throughput is declining slowly.
>> I ran another test with same version of last email, which vpp is
>> configured with one core and throughput is declining slower than
>> master
>> second 200 
>> second 5900 
>>
>>
>> On Sun, Sep 23, 2018 at 6:57 PM Andrew  Yourtchenko <
>> ayour...@gmail.com> wrote:
>>
>>> Interesting - but you are saying in 1804 this effect is not observed
>>> ? There was no other notable changes with regards to session management 
>>> -
>>> but maybe worth it to just do hit bisect and see. Should be 4-5 
>>> iterations.
>>> Could you verify that - if indeed this is not seen in 1804.
>>>
>>> --a
>>>
>>> On 23 Sep 2018, at 16:42, khers  wrote:
>>>
>>> I checked out the version before the gerrit 12770 is merged to
>>> 

Re: [vpp-dev] question about bfd protocol message

2018-09-25 Thread Klement Sekera via Lists.Fd.Io
Hi Xue,

RFC 5884 is not implemented.

Implemented RFCs are: 5880, 5881.

Thanks,
Klement

Quoting 薛欣颖 (2018-09-25 12:56:56)
>Hi Klement,
>I'm interested in BFD for LSP (RFC 5884 ).  Thank you very much for your
>reply.
>Thanks,
>Xue
> 
>--
> 
>   
>  From: [1]Klement Sekera via Lists.Fd.Io
>  Date: 2018-09-25 16:57
>  To: [2]薛欣颖; [3]vpp-dev
>  CC: [4]vpp-dev
>  Subject: Re: [vpp-dev] question about bfd protocol message
>  Hi Xue,
>   
>  I'm not sure what protocol message you mean. Can you please elaborate or
>  point to RFC number & section which describes the message you're
>  interested in?
>   
>  Thanks,
>  Klement
>   
>  Quoting xyxue (2018-09-25 09:48:58)
>  >    Hi guys,
>  >    I'm testing the bfd. Does the bfd support protocol messages?
>  >    Thanks,
>  >    Xue
>  >
>  >   
>  
> --
>   
>   
>  -=-=-=-=-=-=-=-=-=-=-=-
>  Links: You receive all messages sent to this group.
>   
>  View/Reply Online (#10637): https://lists.fd.io/g/vpp-dev/message/10637
>  Mute This Topic: https://lists.fd.io/mt/26218372/675372
>  Group Owner: vpp-dev+ow...@lists.fd.io
>  Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [xy...@fiberhome.com]
>  -=-=-=-=-=-=-=-=-=-=-=-
> 
> References
> 
>Visible links
>1. mailto:ksekera=cisco@lists.fd.io
>2. mailto:xy...@fiberhome.com
>3. mailto:vpp-dev@lists.fd.io
>4. mailto:vpp-dev@lists.fd.io
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#10650): https://lists.fd.io/g/vpp-dev/message/10650
Mute This Topic: https://lists.fd.io/mt/26218372/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] question about bfd protocol message

2018-09-25 Thread xyxue
Hi Klement,

I'm interested in BFD for LSP (RFC 5884 ).  Thank you very much for your reply.

Thanks,
Xue


 
From: Klement Sekera via Lists.Fd.Io
Date: 2018-09-25 16:57
To: 薛欣颖; vpp-dev
CC: vpp-dev
Subject: Re: [vpp-dev] question about bfd protocol message
Hi Xue,
 
I'm not sure what protocol message you mean. Can you please elaborate or
point to RFC number & section which describes the message you're
interested in?
 
Thanks,
Klement
 
Quoting xyxue (2018-09-25 09:48:58)
>Hi guys,
>I'm testing the bfd. Does the bfd support protocol messages?
>Thanks,
>Xue
> 
>--
 
 
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
 
View/Reply Online (#10637): https://lists.fd.io/g/vpp-dev/message/10637
Mute This Topic: https://lists.fd.io/mt/26218372/675372
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [xy...@fiberhome.com]
-=-=-=-=-=-=-=-=-=-=-=-
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#10649): https://lists.fd.io/g/vpp-dev/message/10649
Mute This Topic: https://lists.fd.io/mt/26218372/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] A bug in IP reassembly?

2018-09-25 Thread Kingwel Xie
Ok. I'll find some time tomorrow to push a patch fixing both v4 and v6.

-Original Message-
From: Klement Sekera  
Sent: Tuesday, September 25, 2018 6:02 PM
To: Kingwel Xie ; vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] A bug in IP reassembly?

Hi Kingwel,

thanks for finding this bug. Your patch looks fine - would you mind making a 
similar fix in ip4_reassembly.c? The logic suffers from the same flaw there.

Thanks,
Klement
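
For reference, a hedged sketch of what the analogous ip4 change might look
like, assuming ip4_reassembly.c follows the same enqueue pattern as the ip6
code quoted below (a sketch, not a verified diff):

  if (~0 != bi0)
    {
    skip_reass:
      to_next[0] = bi0;
      to_next += 1;
      n_left_to_next -= 1;
      if (is_feature && IP4_ERROR_NONE == error0)
        {
          /* reload b0: bi0 now refers to the first buffer of the
           * reassembled chain, not to the last fragment processed */
          b0 = vlib_get_buffer (vm, bi0);
          vnet_feature_next (&next0, b0);
        }
      vlib_validate_buffer_enqueue_x1 (vm, node, next_index, to_next,
                                       n_left_to_next, bi0, next0);
    }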

Quoting Kingwel Xie (2018-09-25 11:06:49)
>Hi,
> 
> 
> 
>    I worked on testing IP reassembly recently, and hit a crash when testing
>    IP reassembly with IPsec. It took me some time to figure out why.
> 
> 
> 
>The crash only happens when there are >1 feature node enabled under
>ip-unicast and ip reassembly is working, like below.
> 
> 
> 
>ip4-unicast:
> 
>  ip4-reassembly-feature
> 
>  ipsec-input-ip4
> 
> 
> 
>    It looks like there is a bug in the reassembly code, as below:
>    vnet_feature_next updates next0 and the current_config_index of buffer b0,
>    but b0 is pointing to some fragment buffer which in most cases is not the
>    first buffer in the chain indicated by bi0. Actually, bi0 pointing to the
>    first buffer is what ip6_reass_update returns when reassembly is finalized.
>    As far as I can see this is a mismatch: bi0 and b0 are not the same buffer.
>    In the end, the quick fix is what I added - b0 = vlib_get_buffer (vm, bi0);
>    - to make it right.
> 
> 
> 
>      if (~0 != bi0)
>        {
>        skip_reass:
>          to_next[0] = bi0;
>          to_next += 1;
>          n_left_to_next -= 1;
>          if (is_feature && IP6_ERROR_NONE == error0)
>            {
>              b0 = vlib_get_buffer (vm, bi0);  <-- added by Kingwel
>              vnet_feature_next (&next0, b0);
>            }
>          vlib_validate_buffer_enqueue_x1 (vm, node, next_index, to_next,
>                                           n_left_to_next, bi0, next0);
>        }
> 
> 
> 
>    Probably this is not the perfect fix, but it works, at least. I wonder whether
>    the committers have a better idea about it? I can of course push a patch if
>    you think it is ok.
> 
> 
> 
>Regards,
> 
>Kingwel
> 
> 
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#10648): https://lists.fd.io/g/vpp-dev/message/10648
Mute This Topic: https://lists.fd.io/mt/26218556/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] continuous decline in throughput with acl

2018-09-25 Thread Andrew Yourtchenko
Are you using latest master ? I fixed a couple of issues in bihash last week 
related to memory usage... if it’s the latest master, the output of used vs 
available looks weird... - so please let me know...

As for the “general” growth - basically what happens is bihash doubles each 
bucket size whenever there is a collision on insert, and then converts the 
bucket into linear lookup whenever there is still a collision after that growth.

Then the only time the shrinkage/reset is happening is when the bucket is 
completely free - which with long living sessions with overlapping lifetimes 
might mean never.

So one approach to this is to increase the number of buckets. Then they will be 
smaller and have higher probability of being freed.

This is assuming there is nothing else “funny” going on. You can do “show 
acl-plugin sessions verbose 1” via vppctl (It will take forever to complete and 
needs pager disabled since it dumps the entire bihash) to inspect the way the 
buckets are filled...

--a
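
A minimal conceptual sketch of the growth policy described above, with made-up
names - this is not the VPP bihash code, it only illustrates "double on insert
collision, go linear if it still collides, reset only when empty":

/* Toy model of the bucket behavior, not vppinfra's bihash. */
typedef struct
{
  unsigned log2_pages;    /* bucket capacity = 1 << log2_pages entries */
  unsigned n_entries;
  int linear_search;      /* set once doubling stops resolving collisions */
} toy_bucket_t;

static void
toy_bucket_insert (toy_bucket_t * b, int collided_after_growth)
{
  if (b->n_entries == (1u << b->log2_pages))
    {
      b->log2_pages++;              /* collision on insert: double the bucket */
      if (collided_after_growth)
        b->linear_search = 1;       /* still colliding: fall back to linear lookup */
    }
  b->n_entries++;
}

static void
toy_bucket_delete (toy_bucket_t * b)
{
  if (b->n_entries && --b->n_entries == 0)
    {
      b->log2_pages = 0;            /* shrink/reset only when completely free */
      b->linear_search = 0;
    }
}

With long-lived, overlapping sessions the delete path above almost never sees
an empty bucket, which is why more (and therefore smaller) buckets make the
reset more likely to happen.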

> On 25 Sep 2018, at 12:12, khers  wrote:
> 
> It's amazing!!!
> 
> IPv4 Session lookup hash table:
> Hash table ACL plugin FA IPv4 session bihash
> 968086 active elements 65536 active buckets
> 13 free lists
>[len 16] 1 free elts
>[len 32] 1 free elts
>[len 256] 10669 free elts
>[len 512] 36768 free elts
>[len 1024] 4110 free elts
>[len 2048] 156 free elts
>[len 4096] 4 free elts
> 844 linear search buckets
> arena: base 7fe91232, next 2680ca780
>used 10335594368 b (9856 Mbytes) of 100 b (9536 Mbytes)
> 
> 
>> On Tue, Sep 25, 2018 at 1:39 PM khers  wrote:
>> Yes, that's right. I think is completely another issue from the patch you 
>> sent
>> 
>>> On Tue, Sep 25, 2018 at 1:35 PM Andrew  Yourtchenko  
>>> wrote:
>>> Excellent, thanks!
>>> 
>>> Memory usage - you mean in bihash arena ?
>>> 
>>> --a
>>> 
 On 25 Sep 2018, at 11:38, khers  wrote:
 
 Throughput and session add/del is stable as rock. The only danger i see is 
 growing memory usage.
 look at this 
 
 
> On Tue, Sep 25, 2018 at 11:31 AM khers  wrote:
> Of course, I test your patch, there is no slowdown with my scenario. I 
> need more time to test other
> scenarios and make sure.
> 
> 
>> On Mon, Sep 24, 2018 at 3:11 PM Andrew  Yourtchenko 
>>  wrote:
>> Cool. Then it is probably indeed the session requeues that are not yet 
>> efficient... I have been looking at optimizing that.
>> 
>> I have a draft in the works which should have less session requeues - I 
>> have just added you to it, could you give it a shot and see if it makes 
>> things better ? 
>> 
>> --a
>> 
>>> On 24 Sep 2018, at 12:55, khers  wrote:
>>> 
>>> yes, I confirm
>>> 
 On Mon, Sep 24, 2018 at 2:08 PM Andrew  Yourtchenko 
  wrote:
 Okay, so what I think I am hearing - the gradual slowdown is/was 
 always there, and is somewhat more pronounced in master, right ?
 
 --a
 
> On 24 Sep 2018, at 11:49, khers  wrote:
> 
> I allways get SIGSEGV or 'worker thread dead lock' In 1804 with 1 or 
> more worker thread and 1 main,
> but when vpp using one cpu I hadn't any problem. In the 1807 multi 
> core is stable i didn't see any of those
> problem but throughput is declining slowly.
> I ran another test with same version of last email, which vpp is 
> configured with one core and throughput is declining slower than 
> master
> second 200
> second 5900
> 
> 
>> On Sun, Sep 23, 2018 at 6:57 PM Andrew  Yourtchenko 
>>  wrote:
>> Interesting - but you are saying in 1804 this effect is not observed 
>> ? There was no other notable changes with regards to session 
>> management - but maybe worth it to just do hit bisect and see. 
>> Should be 4-5 iterations. Could you verify that - if indeed this is 
>> not seen in 1804.
>> 
>> --a
>> 
>>> On 23 Sep 2018, at 16:42, khers  wrote:
>>> 
>>> I checked out the version before the gerrit 12770 is merged to 
>>> master.
>>> 2371c25fed6b2e751163df590bb9d9a93a75a0f
>>> 
>>> I got SIGSEGV with 2 workers, so I repeat the test with one worker.
>>> Throughput is going down like the latest version.
>>> 
 On Sun, Sep 23, 2018 at 4:55 PM Andrew  Yourtchenko 
  wrote:
 Would you be able to confirm that it changes at a point of 
 https://gerrit.fd.io/r/#/c/12770/ ?
 
 --a
 
> On 23 Sep 2018, at 13:31, emma sdi  wrote:
> 
> Dear Community
> 
> I have simple configuration as following:
> 

Re: [vpp-dev] continuous decline in throughput with acl

2018-09-25 Thread emma sdi
It's amazing!!!

IPv4 Session lookup hash table:
Hash table ACL plugin FA IPv4 session bihash
968086 active elements 65536 active buckets
13 free lists
   [len 16] 1 free elts
   [len 32] 1 free elts
   [len 256] 10669 free elts
   [len 512] 36768 free elts
   [len 1024] 4110 free elts
   [len 2048] 156 free elts
   [len 4096] 4 free elts
844 linear search buckets
arena: base 7fe91232, next 2680ca780
*   used 10335594368 b (9856 Mbytes) of 100 b (9536 Mbytes)*


On Tue, Sep 25, 2018 at 1:39 PM khers  wrote:

> Yes, that's right. I think is completely another issue from the patch you
> sent
>
> On Tue, Sep 25, 2018 at 1:35 PM Andrew  Yourtchenko 
> wrote:
>
>> Excellent, thanks!
>>
>> Memory usage - you mean in bihash arena ?
>>
>> --a
>>
>> On 25 Sep 2018, at 11:38, khers  wrote:
>>
>> Throughput and session add/del is stable as rock. The only danger i see
>> is growing memory usage.
>> look at this 
>>
>>
>> On Tue, Sep 25, 2018 at 11:31 AM khers  wrote:
>>
>>> Of course, I test your patch, there is no slowdown with my scenario. I
>>> need more time to test other
>>> scenarios and make sure.
>>>
>>>
>>> On Mon, Sep 24, 2018 at 3:11 PM Andrew  Yourtchenko <
>>> ayour...@gmail.com> wrote:
>>>
 Cool. Then it is probably indeed the session requeues that are not yet
 efficient... I have been looking at optimizing that.

 I have a draft in the works which should have less session requeues - I
 have just added you to it, could you give it a shot and see if it makes
 things better ?

 --a

 On 24 Sep 2018, at 12:55, khers  wrote:

 yes, I confirm

 On Mon, Sep 24, 2018 at 2:08 PM Andrew  Yourtchenko <
 ayour...@gmail.com> wrote:

> Okay, so what I think I am hearing - the gradual slowdown is/was
> always there, and is somewhat more pronounced in master, right ?
>
> --a
>
> On 24 Sep 2018, at 11:49, khers  wrote:
>
> I allways get SIGSEGV or 'worker thread dead lock' In 1804 with 1 or
> more worker thread and 1 main,
> but when vpp using one cpu I hadn't any problem. In the 1807 multi
> core is stable i didn't see any of those
> problem but throughput is declining slowly.
> I ran another test with same version of last email, which vpp is
> configured with one core and throughput is declining slower than
> master
> second 200 
> second 5900 
>
>
> On Sun, Sep 23, 2018 at 6:57 PM Andrew  Yourtchenko <
> ayour...@gmail.com> wrote:
>
>> Interesting - but you are saying in 1804 this effect is not observed
>> ? There was no other notable changes with regards to session management -
>> but maybe worth it to just do hit bisect and see. Should be 4-5 
>> iterations.
>> Could you verify that - if indeed this is not seen in 1804.
>>
>> --a
>>
>> On 23 Sep 2018, at 16:42, khers  wrote:
>>
>> I checked out the version before the gerrit 12770 is merged to master.
>> 2371c25fed6b2e751163df590bb9d9a93a75a0f
>>
>> I got SIGSEGV with 2 workers, so I repeat the test with one worker.
>> Throughput is going down like the latest version.
>>
>> On Sun, Sep 23, 2018 at 4:55 PM Andrew  Yourtchenko <
>> ayour...@gmail.com> wrote:
>>
>>> Would you be able to confirm that it changes at a point of
>>> https://gerrit.fd.io/r/#/c/12770/ ?
>>>
>>> --a
>>>
>>> On 23 Sep 2018, at 13:31, emma sdi  wrote:
>>>
>>> Dear Community
>>>
>>> I have simple configuration as following:
>>>
>>> startup.conf 
>>> simple_acl 
>>>
>>> I used Trex packet generator with following command:
>>> ./t-rex-64 --cfg cfg/trex_config.yaml  -f cap2/sfr.yaml -m 5 -c 2 -d
>>> 6000
>>> The Total-RX gradually decrease, here is output of Trex in second
>>> 200 , and 5900.
>>> 
>>>
>>> I did not saw this problem in 18.04. I think session_cleaner thread
>>> make so many
>>> interrupt, do you have any idea?
>>>
>>> Regards
>>>
>>> -=-=-=-=-=-=-=-=-=-=-=-
>>> Links: You receive all messages sent to this group.
>>>
>>> View/Reply Online (#10615):
>>> https://lists.fd.io/g/vpp-dev/message/10615
>>> Mute This Topic: https://lists.fd.io/mt/26145401/675608
>>> Group Owner: vpp-dev+ow...@lists.fd.io
>>> Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [
>>> ayour...@gmail.com]
>>> -=-=-=-=-=-=-=-=-=-=-=-
>>>
>>>
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#10646): https://lists.fd.io/g/vpp-dev/message/10646
Mute This 

Re: [vpp-dev] continuous decline in throughput with acl

2018-09-25 Thread emma sdi
Yes, that's right. I think it is a completely separate issue from the patch you
sent.

On Tue, Sep 25, 2018 at 1:35 PM Andrew  Yourtchenko 
wrote:

> Excellent, thanks!
>
> Memory usage - you mean in bihash arena ?
>
> --a
>
> On 25 Sep 2018, at 11:38, khers  wrote:
>
> Throughput and session add/del is stable as rock. The only danger i see is
> growing memory usage.
> look at this 
>
>
> On Tue, Sep 25, 2018 at 11:31 AM khers  wrote:
>
>> Of course, I test your patch, there is no slowdown with my scenario. I
>> need more time to test other
>> scenarios and make sure.
>>
>>
>> On Mon, Sep 24, 2018 at 3:11 PM Andrew  Yourtchenko 
>> wrote:
>>
>>> Cool. Then it is probably indeed the session requeues that are not yet
>>> efficient... I have been looking at optimizing that.
>>>
>>> I have a draft in the works which should have less session requeues - I
>>> have just added you to it, could you give it a shot and see if it makes
>>> things better ?
>>>
>>> --a
>>>
>>> On 24 Sep 2018, at 12:55, khers  wrote:
>>>
>>> yes, I confirm
>>>
>>> On Mon, Sep 24, 2018 at 2:08 PM Andrew  Yourtchenko <
>>> ayour...@gmail.com> wrote:
>>>
 Okay, so what I think I am hearing - the gradual slowdown is/was always
 there, and is somewhat more pronounced in master, right ?

 --a

 On 24 Sep 2018, at 11:49, khers  wrote:

 I allways get SIGSEGV or 'worker thread dead lock' In 1804 with 1 or
 more worker thread and 1 main,
 but when vpp using one cpu I hadn't any problem. In the 1807 multi core
 is stable i didn't see any of those
 problem but throughput is declining slowly.
 I ran another test with same version of last email, which vpp is
 configured with one core and throughput is declining slower than
 master
 second 200 
 second 5900 


 On Sun, Sep 23, 2018 at 6:57 PM Andrew  Yourtchenko <
 ayour...@gmail.com> wrote:

> Interesting - but you are saying in 1804 this effect is not observed ?
> There was no other notable changes with regards to session management - 
> but
> maybe worth it to just do hit bisect and see. Should be 4-5 iterations.
> Could you verify that - if indeed this is not seen in 1804.
>
> --a
>
> On 23 Sep 2018, at 16:42, khers  wrote:
>
> I checked out the version before the gerrit 12770 is merged to master.
> 2371c25fed6b2e751163df590bb9d9a93a75a0f
>
> I got SIGSEGV with 2 workers, so I repeat the test with one worker.
> Throughput is going down like the latest version.
>
> On Sun, Sep 23, 2018 at 4:55 PM Andrew  Yourtchenko <
> ayour...@gmail.com> wrote:
>
>> Would you be able to confirm that it changes at a point of
>> https://gerrit.fd.io/r/#/c/12770/ ?
>>
>> --a
>>
>> On 23 Sep 2018, at 13:31, emma sdi  wrote:
>>
>> Dear Community
>>
>> I have simple configuration as following:
>>
>> startup.conf 
>> simple_acl 
>>
>> I used Trex packet generator with following command:
>> ./t-rex-64 --cfg cfg/trex_config.yaml  -f cap2/sfr.yaml -m 5 -c 2 -d
>> 6000
>> The Total-RX gradually decrease, here is output of Trex in second 200
>> , and 5900.
>> 
>>
>> I did not saw this problem in 18.04. I think session_cleaner thread
>> make so many
>> interrupt, do you have any idea?
>>
>> Regards
>>
>> -=-=-=-=-=-=-=-=-=-=-=-
>> Links: You receive all messages sent to this group.
>>
>> View/Reply Online (#10615):
>> https://lists.fd.io/g/vpp-dev/message/10615
>> Mute This Topic: https://lists.fd.io/mt/26145401/675608
>> Group Owner: vpp-dev+ow...@lists.fd.io
>> Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [ayour...@gmail.com
>> ]
>> -=-=-=-=-=-=-=-=-=-=-=-
>>
>>
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#10645): https://lists.fd.io/g/vpp-dev/message/10645
Mute This Topic: https://lists.fd.io/mt/26145401/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] continuous decline in throughput with acl

2018-09-25 Thread Andrew Yourtchenko
Excellent, thanks!

Memory usage - you mean in bihash arena ?

--a

> On 25 Sep 2018, at 11:38, khers  wrote:
> 
> Throughput and session add/del is stable as rock. The only danger i see is 
> growing memory usage.
> look at this 
> 
> 
>> On Tue, Sep 25, 2018 at 11:31 AM khers  wrote:
>> Of course, I test your patch, there is no slowdown with my scenario. I need 
>> more time to test other
>> scenarios and make sure.
>> 
>> 
>>> On Mon, Sep 24, 2018 at 3:11 PM Andrew  Yourtchenko  
>>> wrote:
>>> Cool. Then it is probably indeed the session requeues that are not yet 
>>> efficient... I have been looking at optimizing that.
>>> 
>>> I have a draft in the works which should have less session requeues - I 
>>> have just added you to it, could you give it a shot and see if it makes 
>>> things better ? 
>>> 
>>> --a
>>> 
 On 24 Sep 2018, at 12:55, khers  wrote:
 
 yes, I confirm
 
> On Mon, Sep 24, 2018 at 2:08 PM Andrew  Yourtchenko  
> wrote:
> Okay, so what I think I am hearing - the gradual slowdown is/was always 
> there, and is somewhat more pronounced in master, right ?
> 
> --a
> 
>> On 24 Sep 2018, at 11:49, khers  wrote:
>> 
>> I allways get SIGSEGV or 'worker thread dead lock' In 1804 with 1 or 
>> more worker thread and 1 main,
>> but when vpp using one cpu I hadn't any problem. In the 1807 multi core 
>> is stable i didn't see any of those
>> problem but throughput is declining slowly.
>> I ran another test with same version of last email, which vpp is 
>> configured with one core and throughput is declining slower than 
>> master
>> second 200
>> second 5900
>> 
>> 
>>> On Sun, Sep 23, 2018 at 6:57 PM Andrew  Yourtchenko 
>>>  wrote:
>>> Interesting - but you are saying in 1804 this effect is not observed ? 
>>> There was no other notable changes with regards to session management - 
>>> but maybe worth it to just do hit bisect and see. Should be 4-5 
>>> iterations. Could you verify that - if indeed this is not seen in 1804.
>>> 
>>> --a
>>> 
 On 23 Sep 2018, at 16:42, khers  wrote:
 
 I checked out the version before the gerrit 12770 is merged to master.
 2371c25fed6b2e751163df590bb9d9a93a75a0f
 
 I got SIGSEGV with 2 workers, so I repeat the test with one worker.
 Throughput is going down like the latest version.
 
> On Sun, Sep 23, 2018 at 4:55 PM Andrew  Yourtchenko 
>  wrote:
> Would you be able to confirm that it changes at a point of 
> https://gerrit.fd.io/r/#/c/12770/ ?
> 
> --a
> 
>> On 23 Sep 2018, at 13:31, emma sdi  wrote:
>> 
>> Dear Community
>> 
>> I have simple configuration as following:
>> 
>> startup.conf
>> simple_acl
>> 
>> I used Trex packet generator with following command:
>> ./t-rex-64 --cfg cfg/trex_config.yaml  -f cap2/sfr.yaml -m 5 -c 2 -d 
>> 6000
>> The Total-RX gradually decrease, here is output of Trex in second 
>> 200, and 5900.
>> 
>> I did not saw this problem in 18.04. I think session_cleaner thread 
>> make so many 
>> interrupt, do you have any idea?
>> 
>> Regards
>> 
>> -=-=-=-=-=-=-=-=-=-=-=-
>> Links: You receive all messages sent to this group.
>> 
>> View/Reply Online (#10615): 
>> https://lists.fd.io/g/vpp-dev/message/10615
>> Mute This Topic: https://lists.fd.io/mt/26145401/675608
>> Group Owner: vpp-dev+ow...@lists.fd.io
>> Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  
>> [ayour...@gmail.com]
>> -=-=-=-=-=-=-=-=-=-=-=-
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#10644): https://lists.fd.io/g/vpp-dev/message/10644
Mute This Topic: https://lists.fd.io/mt/26145401/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


[vpp-dev] CSIT IPsec AES-GCM 128 tests failing

2018-09-25 Thread Peter Mikus via Lists.Fd.Io
Hello devs,

We are observing failing IPsec tests in CSIT-perf, specifically and only in 
combination with aes-gcm (both interface AND tunnel mode, Integ Alg AES GCM 
128).
Note: Integ Alg SHA1 96 is working.

Can you please help decrypt the error message below?

Thank you.


VAT error message (on both DUTs)
-
sw_interface_set_flags error: Unspecified Error


Note: this error happens when trying to bring the interface up.

DUT1 config
---

sw_interface_set_flags sw_if_index 2 admin-up link-up
sw_interface_set_flags sw_if_index 1 admin-up link-up
sw_interface_dump
hw_interface_set_mtu sw_if_index 2 mtu 9200
hw_interface_set_mtu sw_if_index 1 mtu 9200
sw_interface_dump
sw_interface_add_del_address sw_if_index 2 192.168.10.1/24
sw_interface_add_del_address sw_if_index 1 172.168.1.1/24
ip_neighbor_add_del sw_if_index 2 dst 192.168.10.2 mac 68:05:ca:35:79:1c
ip_neighbor_add_del sw_if_index 1 dst 172.168.1.2 mac 68:05:ca:35:76:b1
ip_add_del_route 10.0.0.0/8 via 192.168.10.2  sw_if_index 2 resolve-attempts 10 
count 1
ipsec_tunnel_if_add_del local_spi 1 remote_spi 2 crypto_alg aes-gcm-128 
local_crypto_key 4e4d444178336b4958744a374365356235643545 remote_crypto_key 
4e4d444178336b4958744a374365356235643545  local_ip 172.168.1.1 remote_ip 
172.168.1.2
ip_add_del_route 20.0.0.0/32 via 172.168.1.2 ipsec0
exec set interface unnumbered ipsec0 use FortyGigabitEthernet88/0/0
sw_interface_set_flags ipsec0 admin-up


DUT2 config
---
sw_interface_set_flags sw_if_index 2 admin-up link-up
sw_interface_set_flags sw_if_index 1 admin-up link-up
sw_interface_dump
hw_interface_set_mtu sw_if_index 2 mtu 9200
hw_interface_set_mtu sw_if_index 1 mtu 9200
sw_interface_dump
sw_interface_add_del_address sw_if_index 2 172.168.1.2/24
sw_interface_add_del_address sw_if_index 1 192.168.20.1/24
ip_neighbor_add_del sw_if_index 1 dst 192.168.20.2 mac 68:05:ca:35:79:19
ip_neighbor_add_del sw_if_index 2 dst 172.168.1.1 mac 68:05:ca:37:25:18
ip_add_del_route 20.0.0.0/8 via 192.168.20.2  sw_if_index 1 resolve-attempts 10 
count 1
ipsec_tunnel_if_add_del local_spi 2 remote_spi 1 crypto_alg aes-gcm-128 
local_crypto_key 4e4d444178336b4958744a374365356235643545 remote_crypto_key 
4e4d444178336b4958744a374365356235643545  local_ip 172.168.1.2 remote_ip 
172.168.1.1
ip_add_del_route 10.0.0.0/32 via 172.168.1.1 ipsec0
exec set interface unnumbered ipsec0 use FortyGigabitEthernet88/0/1
sw_interface_set_flags ipsec0 admin-up


1969/12/31 16:00:00:086 warn   dpdk   EAL init args: -c 18 -n 4 
--huge-dir /run/vpp/hugepages --file-prefix vpp -w :88:00.1 -w :88:00.0 
-w :86:01.0 --master-lcore 19 --socket-mem 1024,1024
1969/12/31 16:00:00:967 notice dpdk   EAL: Detected 36 lcore(s)
1969/12/31 16:00:00:967 notice dpdk   EAL: Detected 2 NUMA nodes
1969/12/31 16:00:00:967 notice dpdk   EAL: Multi-process socket 
/var/run/dpdk/vpp/mp_socket
1969/12/31 16:00:00:967 notice dpdk   EAL: No free hugepages reported 
in hugepages-1048576kB
1969/12/31 16:00:00:967 notice dpdk   EAL: Probing VFIO support...
1969/12/31 16:00:00:967 notice dpdk   EAL: VFIO support initialized
1969/12/31 16:00:00:967 notice dpdk   EAL: PCI device :86:01.0 on 
NUMA socket 1
1969/12/31 16:00:00:967 notice dpdk   EAL:   probe driver: 8086:443 qat
1969/12/31 16:00:00:967 notice dpdk   qat_comp_dev_create(): 
Compression PMD not supported on QAT dh895xcc
1969/12/31 16:00:00:967 notice dpdk   EAL: PCI device :88:00.0 on 
NUMA socket 1
1969/12/31 16:00:00:967 notice dpdk   EAL:   probe driver: 8086:1583 
net_i40e
1969/12/31 16:00:00:967 notice dpdk   EAL: PCI device :88:00.1 on 
NUMA socket 1
1969/12/31 16:00:00:967 notice dpdk   EAL:   probe driver: 8086:1583 
net_i40e
1969/12/31 16:00:00:967 notice dpdk   Invalid port_id=2

Peter Mikus
Engineer - Software
Cisco Systems Limited

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#10643): https://lists.fd.io/g/vpp-dev/message/10643
Mute This Topic: https://lists.fd.io/mt/26218774/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] A bug in IP reassembly?

2018-09-25 Thread Klement Sekera via Lists.Fd.Io
Hi Kingwel,

thanks for finding this bug. Your patch looks fine - would you mind
making a similar fix in ip4_reassembly.c? The logic suffers from the
same flaw there.

Thanks,
Klement
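
For reference, a minimal sketch of what the analogous change could look like
in ip4_reassembly.c, assuming its feature path mirrors the ip6 code quoted
below (identifiers are taken from the ip6 snippet and may differ slightly in
the ip4 node):

      if (~0 != bi0)
        {
        skip_reass:
          to_next[0] = bi0;
          to_next += 1;
          n_left_to_next -= 1;
          if (is_feature && IP4_ERROR_NONE == error0)
            {
              /* reload b0: bi0 may now point to the head of the reassembled
                 chain, not the fragment buffer b0 currently references */
              b0 = vlib_get_buffer (vm, bi0);
              vnet_feature_next (&next0, b0);
            }
          vlib_validate_buffer_enqueue_x1 (vm, node, next_index, to_next,
                                           n_left_to_next, bi0, next0);
        }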

Quoting Kingwel Xie (2018-09-25 11:06:49)
>Hi,
> 
> 
> 
>I worked on testing IP reassembly recently, the hit a crash when testing
>IP reassembly with IPSec. It took me some time to figure out why.
> 
> 
> 
>The crash only happens when there are >1 feature node enabled under
>ip-unicast and ip reassembly is working, like below.
> 
> 
> 
>ip4-unicast:
> 
>  ip4-reassembly-feature
> 
>  ipsec-input-ip4
> 
> 
> 
>It looks like there is a bug in the reassembly code as below:
>vnet_feature_next will do to buffer b0 to update the next0 and the
>current_config_index of b0, but b0 is pointing to some fragment buffer
>which in most cases is not the first buffer in chain indicated by bi0.
>Actually bi0 pointing to the first buffer is returned by ip6_reass_update
>when reassembly is finalized. As I can see this is a mismatch that bi0 and
>b0 are not the same buffer. In the end the quick fix is like what I added
>: b0 = vlib_get_buffer (vm, bi0); to make it right.
> 
> 
> 
>      if (~0 != bi0)
>        {
>        skip_reass:
>          to_next[0] = bi0;
>          to_next += 1;
>          n_left_to_next -= 1;
>          if (is_feature && IP6_ERROR_NONE == error0)
>            {
>              b0 = vlib_get_buffer (vm, bi0);  <-- added by Kingwel
>              vnet_feature_next (&next0, b0);
>            }
>          vlib_validate_buffer_enqueue_x1 (vm, node, next_index, to_next,
>                                           n_left_to_next, bi0, next0);
>        }
> 
> 
> 
>Probably this is not the perfect fix, but it works at least. Wonder if
>committers have better thinking about it? I can of course push a patch if
>you think it is ok.
> 
> 
> 
>Regards,
> 
>Kingwel
> 
> 
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#10642): https://lists.fd.io/g/vpp-dev/message/10642
Mute This Topic: https://lists.fd.io/mt/26218556/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] VMBus netvsc integration

2018-09-25 Thread Damjan Marion via Lists.Fd.Io


> On 25 Sep 2018, at 11:08, Stephen Hemminger  
> wrote:
> 
> I am looking into the best way to integrate setup of the Hyper-V/Azure 
> Network Poll Mode Driver
> (netvsc PMD) from DPDK. The device shows up on vmbus (not PCI) therefore 
> setting it up transparently
> in VPP requires some additional setup logic; it can be setup now via other 
> methods such as driverctl.
> 
> Would it make more sense to extend DPDK plugin logic to handle VMBUS as well 
> as PCI, or do a different
> plugin? I am leaning towards modifying DPDK plugin since it is a DPDK device 
> and it makes sense to configure
> it there.

Yes, extend the DPDK plugin. It shouldn't be too hard. Unfortunately I don't 
have an Azure account to give it a try...

-- 
Damjan
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#10641): https://lists.fd.io/g/vpp-dev/message/10641
Mute This Topic: https://lists.fd.io/mt/26218561/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] continuous decline in throughput with acl

2018-09-25 Thread emma sdi
Throughput and session add/del are stable as a rock. The only danger I see is
growing memory usage.
Look at this 
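
For anyone reproducing this, a couple of CLI commands that may help track the
growth over time - a sketch, assuming the acl-plugin and memory CLIs in this
build expose them:

vpp# show acl-plugin sessions
vpp# show memory verbose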


On Tue, Sep 25, 2018 at 11:31 AM khers  wrote:

> Of course, I test your patch, there is no slowdown with my scenario. I
> need more time to test other
> scenarios and make sure.
>
>
> On Mon, Sep 24, 2018 at 3:11 PM Andrew  Yourtchenko 
> wrote:
>
>> Cool. Then it is probably indeed the session requeues that are not yet
>> efficient... I have been looking at optimizing that.
>>
>> I have a draft in the works which should have less session requeues - I
>> have just added you to it, could you give it a shot and see if it makes
>> things better ?
>>
>> --a
>>
>> On 24 Sep 2018, at 12:55, khers  wrote:
>>
>> yes, I confirm
>>
>> On Mon, Sep 24, 2018 at 2:08 PM Andrew  Yourtchenko 
>> wrote:
>>
>>> Okay, so what I think I am hearing - the gradual slowdown is/was always
>>> there, and is somewhat more pronounced in master, right ?
>>>
>>> --a
>>>
>>> On 24 Sep 2018, at 11:49, khers  wrote:
>>>
>>> I allways get SIGSEGV or 'worker thread dead lock' In 1804 with 1 or
>>> more worker thread and 1 main,
>>> but when vpp using one cpu I hadn't any problem. In the 1807 multi core
>>> is stable i didn't see any of those
>>> problem but throughput is declining slowly.
>>> I ran another test with same version of last email, which vpp is
>>> configured with one core and throughput is declining slower than
>>> master
>>> second 200 
>>> second 5900 
>>>
>>>
>>> On Sun, Sep 23, 2018 at 6:57 PM Andrew  Yourtchenko <
>>> ayour...@gmail.com> wrote:
>>>
 Interesting - but you are saying in 1804 this effect is not observed ?
 There was no other notable changes with regards to session management - but
 maybe worth it to just do hit bisect and see. Should be 4-5 iterations.
 Could you verify that - if indeed this is not seen in 1804.

 --a

 On 23 Sep 2018, at 16:42, khers  wrote:

 I checked out the version before the gerrit 12770 is merged to master.
 2371c25fed6b2e751163df590bb9d9a93a75a0f

 I got SIGSEGV with 2 workers, so I repeat the test with one worker.
 Throughput is going down like the latest version.

 On Sun, Sep 23, 2018 at 4:55 PM Andrew  Yourtchenko <
 ayour...@gmail.com> wrote:

> Would you be able to confirm that it changes at a point of
> https://gerrit.fd.io/r/#/c/12770/ ?
>
> --a
>
> On 23 Sep 2018, at 13:31, emma sdi  wrote:
>
> Dear Community
>
> I have simple configuration as following:
>
> startup.conf 
> simple_acl 
>
> I used Trex packet generator with following command:
> ./t-rex-64 --cfg cfg/trex_config.yaml  -f cap2/sfr.yaml -m 5 -c 2 -d
> 6000
> The Total-RX gradually decrease, here is output of Trex in second 200
> , and 5900.
> 
>
> I did not saw this problem in 18.04. I think session_cleaner thread
> make so many
> interrupt, do you have any idea?
>
> Regards
>
> -=-=-=-=-=-=-=-=-=-=-=-
> Links: You receive all messages sent to this group.
>
> View/Reply Online (#10615):
> https://lists.fd.io/g/vpp-dev/message/10615
> Mute This Topic: https://lists.fd.io/mt/26145401/675608
> Group Owner: vpp-dev+ow...@lists.fd.io
> Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [ayour...@gmail.com]
> -=-=-=-=-=-=-=-=-=-=-=-
>
>
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#10640): https://lists.fd.io/g/vpp-dev/message/10640
Mute This Topic: https://lists.fd.io/mt/26145401/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


[vpp-dev] VMBus netvsc integration

2018-09-25 Thread Stephen Hemminger
I am looking into the best way to integrate setup of the Hyper-V/Azure Network 
Poll Mode Driver
(netvsc PMD) from DPDK. The device shows up on vmbus (not PCI) therefore 
setting it up transparently
in VPP requires some additional setup logic; it can be setup now via other 
methods such as driverctl.

Would it make more sense to extend DPDK plugin logic to handle VMBUS as well as 
PCI, or do a different
plugin? I am leaning towards modifying DPDK plugin since it is a DPDK device 
and it makes sense to configure
it there.
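
In the meantime, a minimal startup.conf sketch of how a synthetic NIC can be
handed to the DPDK plugin through the existing vdev mechanism - this assumes a
DPDK build that includes the vdev_netvsc/failsafe/tap helper PMDs (the default
VPP DPDK build may not), and the interface name is hypothetical:

dpdk {
  # pass the Hyper-V synthetic interface via the vdev_netvsc helper,
  # which instantiates a failsafe port on top of the netvsc device
  vdev net_vdev_netvsc0,iface=eth1
  no-pci
}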
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#10639): https://lists.fd.io/g/vpp-dev/message/10639
Mute This Topic: https://lists.fd.io/mt/26218561/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


[vpp-dev] A bug in IP reassembly?

2018-09-25 Thread Kingwel Xie
Hi,

I worked on testing IP reassembly recently and hit a crash when testing IP 
reassembly with IPsec. It took me some time to figure out why.

The crash only happens when there is more than one feature node enabled under 
ip4-unicast and IP reassembly is working, like below.

ip4-unicast:
  ip4-reassembly-feature
  ipsec-input-ip4

It looks like there is a bug in the reassembly code below: vnet_feature_next 
writes to buffer b0 to update next0 and the current_config_index of b0, but b0 
points to some fragment buffer which in most cases is not the first buffer in 
the chain indicated by bi0. Actually, bi0 pointing to the first buffer is 
returned by ip6_reass_update when reassembly is finalized. As far as I can see, 
this is a mismatch: bi0 and b0 are not the same buffer. The quick fix is the 
line I added, b0 = vlib_get_buffer (vm, bi0);, to make it right.

      if (~0 != bi0)
        {
        skip_reass:
          to_next[0] = bi0;
          to_next += 1;
          n_left_to_next -= 1;
          if (is_feature && IP6_ERROR_NONE == error0)
            {
              b0 = vlib_get_buffer (vm, bi0);  --> added by Kingwel
              vnet_feature_next (&next0, b0);
            }
          vlib_validate_buffer_enqueue_x1 (vm, node, next_index, to_next,
                                           n_left_to_next, bi0, next0);
        }

Probably this is not the perfect fix, but it works at least. Wonder if 
committers have better thinking about it? I can of course push a patch if you 
think it is ok.

Regards,
Kingwel

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#10638): https://lists.fd.io/g/vpp-dev/message/10638
Mute This Topic: https://lists.fd.io/mt/26218556/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] question about bfd protocol message

2018-09-25 Thread Klement Sekera via Lists.Fd.Io
Hi Xue,

I'm not sure what protocol message you mean. Can you please elaborate or
point to RFC number & section which describes the message you're
interested in?

Thanks,
Klement

Quoting xyxue (2018-09-25 09:48:58)
>Hi guys,
>I’m testing the bfd. Does bfd support protocol messages? 
>Thanks,
>Xue
> 
>--
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#10637): https://lists.fd.io/g/vpp-dev/message/10637
Mute This Topic: https://lists.fd.io/mt/26218372/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] Make test failures on ARM - IP4, L2, ECMP, Multicast, GRE, SCTP, SPAN, ACL

2018-09-25 Thread Juraj Linkeš
I created the new tickets under CSIT, which was an oversight, but I fixed it and 
now the tickets are under VPP:

* GRE crash

* SCTP failure/crash

  o Marco and I resolved a similar issue in the past, but this could be 
something different

* SPAN crash

* IP4 failures

  o These are multiple failures and I'm not sure that grouping them together is 
correct

* L2 failures/crash

  o As with IP4, these are multiple failures and I'm not sure that grouping them 
together is correct

* ECMP failure

* Multicast failure

* ACL failure

  o I'm already working with Andrew on fixing this

There seem to be a lot of people who have touched the code. I would like to ask 
the authors to tell me who to turn to (at least for IP and L2).

Regards,
Juraj
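
For reproducing these individually on the ThunderX, single suites can be run in
isolation - a sketch, assuming the usual VPP test harness targets and variables:

make test TEST=test_gre V=1
make test-debug TEST=test_span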

From: Juraj Linkeš [mailto:juraj.lin...@pantheon.tech]
Sent: Monday, September 24, 2018 6:26 PM
To: vpp-dev 
Cc: csit-dev 
Subject: [vpp-dev] Make test failures on ARM

Hi vpp-devs,

Especially ARM vpp devs :)

We're experiencing a number of failures on Cavium ThunderX and we'd like to fix 
the issues. I've created a number of Jira tickets:

* GRE crash

* SCTP failure/crash

  o Me and Marco resolved a similar issue in the past, but this could be 
something different

* SPAN crash

* IP4 failures

  o These are multiple failures and I'm not sure that grouping them together is 
correct

* L2 failures/crash

  o As in IP4, these are multiple failures and I'm not sure that grouping them 
together is correct

* ECMP failure

* Multicast failure

* ACL failure

  o I'm already working with Andrew on fixing this

The reason I didn't reach out to all authors individually is that I wanted 
someone to look at the issues and assess whether there's an overlap (or I 
grouped the failures improperly), since some of the failures look similar.

Then there's the issue of hardware availability - if anyone willing to help has 
access to fd.io lab, I can setup access to a Cavium ThunderX, otherwise we 
could set up a call if further debugging is needed.

Thanks,
Juraj
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#10636): https://lists.fd.io/g/vpp-dev/message/10636
Mute This Topic: https://lists.fd.io/mt/26218436/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] continuous decline in throughput with acl

2018-09-25 Thread emma sdi
Of course. I tested your patch; there is no slowdown with my scenario. I need
more time to test other scenarios and make sure.


On Mon, Sep 24, 2018 at 3:11 PM Andrew  Yourtchenko 
wrote:

> Cool. Then it is probably indeed the session requeues that are not yet
> efficient... I have been looking at optimizing that.
>
> I have a draft in the works which should have less session requeues - I
> have just added you to it, could you give it a shot and see if it makes
> things better ?
>
> --a
>
> On 24 Sep 2018, at 12:55, khers  wrote:
>
> yes, I confirm
>
> On Mon, Sep 24, 2018 at 2:08 PM Andrew  Yourtchenko 
> wrote:
>
>> Okay, so what I think I am hearing - the gradual slowdown is/was always
>> there, and is somewhat more pronounced in master, right ?
>>
>> --a
>>
>> On 24 Sep 2018, at 11:49, khers  wrote:
>>
>> I allways get SIGSEGV or 'worker thread dead lock' In 1804 with 1 or more
>> worker thread and 1 main,
>> but when vpp using one cpu I hadn't any problem. In the 1807 multi core
>> is stable i didn't see any of those
>> problem but throughput is declining slowly.
>> I ran another test with same version of last email, which vpp is
>> configured with one core and throughput is declining slower than
>> master
>> second 200 
>> second 5900 
>>
>>
>> On Sun, Sep 23, 2018 at 6:57 PM Andrew  Yourtchenko 
>> wrote:
>>
>>> Interesting - but you are saying in 1804 this effect is not observed ?
>>> There was no other notable changes with regards to session management - but
>>> maybe worth it to just do hit bisect and see. Should be 4-5 iterations.
>>> Could you verify that - if indeed this is not seen in 1804.
>>>
>>> --a
>>>
>>> On 23 Sep 2018, at 16:42, khers  wrote:
>>>
>>> I checked out the version before the gerrit 12770 is merged to master.
>>> 2371c25fed6b2e751163df590bb9d9a93a75a0f
>>>
>>> I got SIGSEGV with 2 workers, so I repeat the test with one worker.
>>> Throughput is going down like the latest version.
>>>
>>> On Sun, Sep 23, 2018 at 4:55 PM Andrew  Yourtchenko <
>>> ayour...@gmail.com> wrote:
>>>
 Would you be able to confirm that it changes at a point of
 https://gerrit.fd.io/r/#/c/12770/ ?

 --a

 On 23 Sep 2018, at 13:31, emma sdi  wrote:

 Dear Community

 I have simple configuration as following:

 startup.conf 
 simple_acl 

 I used Trex packet generator with following command:
 ./t-rex-64 --cfg cfg/trex_config.yaml  -f cap2/sfr.yaml -m 5 -c 2 -d
 6000
 The Total-RX gradually decrease, here is output of Trex in second 200
 , and 5900.
 

 I did not saw this problem in 18.04. I think session_cleaner thread
 make so many
 interrupt, do you have any idea?

 Regards

 -=-=-=-=-=-=-=-=-=-=-=-
 Links: You receive all messages sent to this group.

 View/Reply Online (#10615): https://lists.fd.io/g/vpp-dev/message/10615
 Mute This Topic: https://lists.fd.io/mt/26145401/675608
 Group Owner: vpp-dev+ow...@lists.fd.io
 Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [ayour...@gmail.com]
 -=-=-=-=-=-=-=-=-=-=-=-


-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#10635): https://lists.fd.io/g/vpp-dev/message/10635
Mute This Topic: https://lists.fd.io/mt/26145401/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


[vpp-dev] QoS architecture issues #VPP_QoS #VPP_Hard_Code #VPP_Stability

2018-09-25 Thread Edward Russell
I have a question about the architecture of the QoS section in VPP:

- Why is the implementation code of the features in this section generally 
static?
- Has a limitation of the resource management system led to this decision 
(static code)?
- If features were not hard-coded, would that cause instability in VPP at 
run-time?

Thanks Community.
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#10634): https://lists.fd.io/g/vpp-dev/message/10634
Mute This Topic: https://lists.fd.io/mt/26218378/21656
Mute #vpp_stability: https://lists.fd.io/mk?hashtag=vpp_stability=1480452
Mute #vpp_qos: https://lists.fd.io/mk?hashtag=vpp_qos=1480452
Mute #vpp_hard_code: https://lists.fd.io/mk?hashtag=vpp_hard_code=1480452
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


[vpp-dev] question about bfd protocol message

2018-09-25 Thread xyxue

Hi guys,

I’m testing the bfd. Does bfd support protocol messages? 

Thanks,
Xue


-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#10633): https://lists.fd.io/g/vpp-dev/message/10633
Mute This Topic: https://lists.fd.io/mt/26218372/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [**EXTERNAL**] Fwd: [vpp-dev] Failing to create untagged sub-interface

2018-09-25 Thread Neale Ranns via Lists.Fd.Io
Hi Mike,

Perhaps you could tell us why you want to create an untagged sub-interface.

Regards,
Neale


From:  on behalf of "Bly, Mike" 
Date: Friday, 21 September 2018 at 17:06
To: "John Lo (loj)" , Edward Warnicke , 
"vpp-dev@lists.fd.io" 
Subject: Re: [**EXTERNAL**] Fwd: [vpp-dev] Failing to create untagged 
sub-interface

John,

Any advice on this is appreciated. We can certainly dig into this, but we first 
wanted to sanity-check with the community in case there was something obvious 
as to why it works the way it currently does. I am hopeful that between your 
efforts and ours we can run this to ground in short order.

-Mike

From: vpp-dev@lists.fd.io  On Behalf Of John Lo (loj) via 
Lists.Fd.Io
Sent: Thursday, September 20, 2018 4:02 PM
To: Edward Warnicke ; vpp-dev@lists.fd.io; Bly, Mike 

Cc: vpp-dev@lists.fd.io
Subject: Re: [**EXTERNAL**] Fwd: [vpp-dev] Failing to create untagged 
sub-interface

When a sub-interface is created, matching of tags on the packet to the 
sub-interface can be specified as “exact-match”.  With exact-match, packet must 
have the same number of tags with values matching that specified for the 
sub-interface.  Otherwise, packets will belong to the best matched 
sub-interface.  A sub-interface to be used for L3 must be created with 
exact-match.  Otherwise, IP forwarding cannot get a proper L2 header rewrite 
for output on the sub-interface.

As for a main interface, I suppose when it is in L2 mode, packets received 
with no tags, or with tags that do not match any specific sub-interface, are 
considered to be on the main interface.  When the main interface is in L3 mode, 
it will only get untagged packets because of the exact-match requirement.  I 
think this is why the default sub-interface starts to get non-matching tagged 
packets when the main interface is in L3 mode, as observed.  Packets received 
on the main interface in L3 mode can be IP forwarded or dropped.

It is a good question – what is the expected sub-interface classification 
behavior with untagged or default sub-interface?  I think this is the area of 
VPP that has not been used much and thus we have little knowledge of how it 
behaves without studying the code (hence lack of response to this thread of 
questions so far).  When I get a chance, I can take look into this issue – how 
VLAN match should work for default/untagged sub-interface and why untagged 
sub-interface creation fails.  I don’t know how soon I will get to it.  So, if 
anyone is willing to contribute and submit a patch to fix the issue, I will be 
happy to review and/or merge the patch as appropriate.

Regards,
John
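
To make the matching rules concrete, a minimal CLI sketch (sub-interface IDs
and VLAN values here are made up for illustration; the untagged form is the one
this thread reports failing to create):

create sub-interfaces GigabitEthernet5/0/0 100 dot1q 100 exact-match
create sub-interfaces GigabitEthernet5/0/0 200 untagged
create sub-interfaces GigabitEthernet5/0/0 300 default
set interface state GigabitEthernet5/0/0.100 up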

From: vpp-dev@lists.fd.io 
mailto:vpp-dev@lists.fd.io>> On Behalf Of Edward Warnicke
Sent: Thursday, September 20, 2018 1:25 PM
To: vpp-dev@lists.fd.io; Bly, Mike 
mailto:m...@ciena.com>>
Subject: Re: [**EXTERNAL**] Fwd: [vpp-dev] Failing to create untagged 
sub-interface

Guys,
  Anyone have any thoughts on this?

Ed



On September 20, 2018 at 12:01:05 PM, Bly, Mike 
(m...@ciena.com) wrote:
Ed/Keith, et al,

What Vijay is digging into is trying to understand how to provide the following 
sub-interface setup on a common/single physical NIC. I am hoping you can shed 
some light on the feasibility of this, given the current code to date.

Our goal is to provide proper separation of untagged vs. explicit-vlan (EVPL) 
vs. default (all remaining vlans) vs. EPL as needed on a given NIC, independent 
of any choice of forwarding mode (L2 vs L3).

GigabitEthernet5/0/0 --> “not used to forward traffic” (see next three 
sub-if’s), calling it sub_if_0 for reference below (seen as possible EPL path, 
but not covered here, since already “working”)
GigabitEthernet5/0/0.untagged --> all untagged traffic on this port goes to 
sub_if_1
GigabitEthernet5/0/0.vid1 --> all traffic arriving with outer tag == 1 goes to 
sub_if_2
GigabitEthernet5/0/0.default --> all other tagged traffic goes to sub_if_3

The only way we seem to be able to get sub_if_3 to process traffic is to 
disable sub_if_0 (set mode to l3).

Additionally, the current configuration checking in src/vnet/ethernet/node.c 
does not seem amenable to allowing the actual configuration and support of 
untagged vs default as two distinct sub-if’s processing traffic at the same 
time (my sub_if_1 and sub_if_3 above). Are we missing something here in how 
this is supposed to work? We would be fine with letting “sub_if_0” carry the 
untagged traffic (in place of sub_if_1), but we have yet to figure out how to 
do that while still having sub_if_3 processing “all other tagged frames”. We 
can say in all of our testing that we in fact do correctly see sub_if_2 working 
as expected.

Here is a simple configuration showing our current efforts in this area:

create bridge-domain 1
create bridge-domain 2
create bridge-domain 3

set interface l2 bridge GigabitEthernet5/0/0 1
set interface l2 bridge GigabitEthernet5/0/1 1