Re: [vpp-dev] AVF interface creation fails on VFs with configured VLAN with newer i40e drivers

2021-10-07 Thread Peter Mikus via lists.fd.io
This is about vpp_device, a.k.a. functional per-patch testing at volume.

Peter Mikus
Engineer – Software
Cisco Systems Limited


From: Damjan Marion (damarion) 
Sent: Thursday, October 7, 2021 14:02
To: Peter Mikus -X (pmikus - PANTHEON TECH SRO at Cisco)
Cc: Juraj Linkeš; vpp-dev; Lijian Zhang
Subject: Re: [vpp-dev] AVF interface creation fails on VFs with configured VLAN 
with newer i40e drivers


I don't think using VLANs in performance testbeds is a good idea.
I would simply avoid using them…

—
Damjan



Re: [vpp-dev] AVF interface creation fails on VFs with configured VLAN with newer i40e drivers

2021-10-07 Thread Peter Mikus via lists.fd.io
To me this effectively means decommissioning vpp_device testing on Fortville,
due to the absence of support (API, switch, whatever...).
VLAN is a fundamental feature for isolating traffic into VFs in multitenant
environments. This is clear from [0]:

//quote
If you have applications that require Virtual Functions (VFs) to receive
packets with VLAN tags, you can disable VLAN tag stripping for the VF. The
Physical Function (PF) processes requests issued from the VF to enable or
disable VLAN tag stripping. Note that if the PF has assigned a VLAN to a VF,
then requests from that VF to set VLAN tag stripping will be ignored.

So unless this is a bug, it means that the application should not enforce this
behavior.

Juraj, does this detection work with DPDK testpmd?
https://doc.dpdk.org/dts/test_plans/vlan_test_plan.html
If this is indeed unsupported at the level of the new driver and firmware, then
removing the 700 series is the only option as I see it.

To me this is as simple as:
error = disable_vlan_stripping()
if error:
    vlan_cannot_stripped_flag = 1  # ignore the error
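
A minimal Python sketch of the idea above, under stated assumptions: `PfRequestError`, `create_interface` and `refused` are hypothetical names used only for illustration and are not part of VPP or the AVF plugin; only the flag name comes from the pseudocode. It shows nothing more than a rejected request being recorded instead of treated as fatal.

```python
class PfRequestError(Exception):
    """Hypothetical error: the PF rejected a request sent by the VF."""


def create_interface(disable_vlan_stripping):
    """Create an interface, tolerating a refused 'disable VLAN stripping'.

    `disable_vlan_stripping` is any callable that raises PfRequestError
    when the PF refuses the request (an assumption for this sketch).
    """
    state = {"vlan_cannot_stripped_flag": 0}
    try:
        disable_vlan_stripping()
    except PfRequestError:
        # Record that stripping could not be disabled and carry on,
        # instead of failing the whole interface creation.
        state["vlan_cannot_stripped_flag"] = 1
    return state


def refused():
    # Stand-in for a PF that has a port VLAN assigned to the VF.
    raise PfRequestError("opcode 28, retval -5")


print(create_interface(refused))       # flag is set, creation continues
print(create_interface(lambda: None))  # flag stays clear
```

Whether ignoring the error is acceptable depends on whether the caller can cope with tags being stripped; the sketch only keeps that information available.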

Thoughts?

[0] https://downloadmirror.intel.com/24693/eng/readme_4.2.7.txt

Peter Mikus
Engineer – Software
Cisco Systems Limited


From: Damjan Marion (damarion) 
Sent: Thursday, October 7, 2021 13:36
To: Juraj Linkeš
Cc: vpp-dev; Lijian Zhang; Peter Mikus -X (pmikus - PANTHEON TECH SRO at Cisco)
Subject: Re: [vpp-dev] AVF interface creation fails on VFs with configured VLAN 
with newer i40e drivers



> On 07.10.2021., at 13:22, Juraj Linkeš  wrote:
>
>
>
>> -Original Message-
>> From: vpp-dev@lists.fd.io  On Behalf Of Juraj Linkeš
>> Sent: Tuesday, September 28, 2021 11:43 AM
>> To: damar...@cisco.com
>> Cc: vpp-dev ; Lijian Zhang 
>> Subject: Re: [vpp-dev] AVF interface creation fails on VFs with configured 
>> VLAN
>> with newer i40e drivers
>>
>>
>>
>>> -Original Message-
>>> From: vpp-dev@lists.fd.io  On Behalf Of Damjan
>>> Marion via lists.fd.io
>>> Sent: Wednesday, September 15, 2021 5:54 PM
>>> To: Juraj Linkeš 
>>> Cc: vpp-dev ; Lijian Zhang 
>>> Subject: Re: [vpp-dev] AVF interface creation fails on VFs with
>>> configured VLAN with newer i40e drivers
>>>
>>>
>>>
 On 10.09.2021., at 08:53, Juraj Linkeš  wrote:



 From: vpp-dev@lists.fd.io  On Behalf Of Damjan
 Marion via lists.fd.io
 Sent: Thursday, September 9, 2021 12:01 PM
 To: Juraj Linkeš 
 Cc: vpp-dev ; Lijian Zhang
 
 Subject: Re: [vpp-dev] AVF interface creation fails on VFs with
 configured VLAN with newer i40e drivers


 On 09.09.2021., at 09:14, Juraj Linkeš  wrote:

 Hi Damjan, vpp devs,

 Upgrading to 2.15.9 i40e driver in CI (from Ubuntu's 2.8.20-k) makes AVF
 interface creation on VFs with configured VLANs fail:
 2021/08/30 09:15:27:343 debug avf :91:04.1: request_queues: num_queue_pairs 1
 2021/08/30 09:15:27:434 debug avf :91:04.1: version: major 1 minor 1
 2021/08/30 09:15:27:444 debug avf :91:04.1: get_vf_resources: bitmap 0x180b80a1 (l2 wb-on-itr adv-link-speed vlan-v2 vlan rx-polling rss-pf offload-adv-rss-pf offload-fdir-pf)
 2021/08/30 09:15:27:445 debug avf :91:04.1: get_vf_resources: num_vsis 1 num_queue_pairs 1 max_vectors 5 max_mtu 0 vf_cap_flags 0xb0081 (l2 adv-link-speed vlan rx-polling rss-pf) rss_key_size 52 rss_lut_size 64
 2021/08/30 09:15:27:445 debug avf :91:04.1: get_vf_resources_vsi[0]: vsi_id 27 num_queue_pairs 1 vsi_type 6 qset_handle 21 default_mac_addr ba:dc:0f:fe:02:11
 2021/08/30 09:15:27:445 debug avf :91:04.1: disable_vlan_stripping
 2021/08/30 09:15:27:559 error avf :00:00.0: error: avf_send_to_pf: error [v_opcode = 28, v_retval -5] from avf_create_if: pci-addr :91:04.1
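
The two capability bitmaps in the log above can be compared with a short Python sketch. The bit positions in `CAP_BITS` are an assumption: they are inferred by lining up the hex values with the flag names printed next to them in this log, not copied from a driver header. `decode_caps` is a hypothetical helper.

```python
# Bit position -> flag name, inferred from the log lines above (assumption).
CAP_BITS = {
    0: "l2",
    5: "wb-on-itr",
    7: "adv-link-speed",
    15: "vlan-v2",
    16: "vlan",
    17: "rx-polling",
    19: "rss-pf",
    27: "offload-adv-rss-pf",
    28: "offload-fdir-pf",
}


def decode_caps(bitmap):
    """Return the names of all known flags set in `bitmap`, lowest bit first."""
    return [name for bit, name in sorted(CAP_BITS.items()) if bitmap & (1 << bit)]


requested = decode_caps(0x180B80A1)  # "bitmap" sent in get_vf_resources
granted = decode_caps(0xB0081)       # "vf_cap_flags" returned by the PF
not_granted = [name for name in requested if name not in granted]
print(not_granted)
```

Under this reading, vlan-v2 is among the capabilities that were requested but do not appear in the returned vf_cap_flags.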

 Syslog reveals a bit more:
 Aug 30 09:15:27 s55-t13-sut1 kernel: [352169.781206] vfio-pci :91:04.1: enabling device ( -> 0002)
 Aug 30 09:15:27 s55-t13-sut1 kernel: [352170.140729] i40e :91:00.0: Cannot disable vlan stripping when port VLAN is set
 Aug 30 09:15:27 s55-t13-sut1 kernel: [352170.140737] i40e :91:00.0: VF 17 failed opcode 28, retval: -5

 It looks like this feature (vlan stripping on VFs with VLANs) was removed in
 later versions of the driver. I don't know what the proper solution here is,
 but adding a configuration option to not disable vlan stripping when creating
 an AVF interface sounds good to me.

 I've documented this in https://jira.fd.io/browse/VPP-1995.

 Can you try with 2.16.11 and report back same outputs?

 I've updated https://jira.fd.io/browse/VPP-1995 with 2.16.11 outputs and
 they're pretty much the same, except the last syslog line is missing.
>>>
>>> OK, I was hoping the new version of the driver supports the VLAN v2 offload
>>> APIs, which allow us to know whether stripping is supported or not on the

Re: [vpp-dev] RFC: Enabling Gerrit Auto-Abandon job on VPP master

2021-02-01 Thread Peter Mikus via lists.fd.io
@Ole,

> What do you intend to happen with those 600+ abandoned changes in the future?

Click the "restore" button in Gerrit if you find gold in there ;)


Peter Mikus
Engineer – Software
Cisco Systems Limited


From: vpp-dev@lists.fd.io  on behalf of Ole Troan 

Sent: Monday, February 1, 2021 15:38
To: Dave Wallace
Cc: Paul Vinciguerra; Andrew Yourtchenko; vpp-dev
Subject: Re: [vpp-dev] RFC: Enabling Gerrit Auto-Abandon job on VPP master

Dave,

> To be perfectly honest, other than Andrew's proposal to tweak the 
> auto-abandon parameters, I have not heard another solution that solves the 
> problem of cleaning up the current queue and limiting the size of the queue 
> in the future.  Is anyone going to volunteer to manually review/abandon 600+ 
> gerrit changes?  Auto-assigning maintainers to gerrit changes is a separate 
> issue. Please make a proposal to fix that in its own thread and I will help 
> to get that implemented.

What do you intend to happen with those 600+ abandoned changes in the future?
Assuming there is gold in quite a few of them.

Ole

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#18639): https://lists.fd.io/g/vpp-dev/message/18639
Mute This Topic: https://lists.fd.io/mt/80169540/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



Re: [vpp-dev] [csit-dev] csit verify failures

2021-01-21 Thread Peter Mikus via lists.fd.io
Hello,

Status update:

To improve the stability of the verify vpp-csit-device job, I decided to silence
one of the tests that was randomly failing (ipv6-vm) until the full root cause is
identified and resolved.

Peter Mikus
Engineer – Software
Cisco Systems Limited


-=-=-=-=-=-=-=-=-=-=-=-
View/Reply Online (#18578): https://lists.fd.io/g/vpp-dev/message/18578
Mute This Topic: https://lists.fd.io/mt/80008977/21656
-=-=-=-=-=-=-=-=-=-=-=-



Re: [vpp-dev] [csit-dev] csit verify failures

2021-01-21 Thread Peter Mikus via lists.fd.io
Hello,

Status update:

Today I found that one of the resource bottlenecks was the csit-shim container,
which caused the majority of OOM failures on the vpp-device job.
I fixed this by increasing the memory allocated to this container, which handles
the reservation API.
I restarted the machines to ensure everything starts from scratch.

I will keep monitoring job execution, and in case of any major failures or
culprits I will contact Dave Wallace to coordinate disabling of voting until
the issue is fixed permanently.

Thank you for understanding.

Peter Mikus
Engineer – Software
Cisco Systems Limited


-=-=-=-=-=-=-=-=-=-=-=-
View/Reply Online (#18574): https://lists.fd.io/g/vpp-dev/message/18574
Mute This Topic: https://lists.fd.io/mt/80008977/21656
-=-=-=-=-=-=-=-=-=-=-=-



Re: [vpp-dev] csit verify failures

2021-01-20 Thread Peter Mikus via lists.fd.io
Hello,

I am now fully allocated to monitoring the issue and finding the most
appropriate long-term fix.

A few issues were identified.

1) A race condition on Intel x700 series cards, tracked in a separate mail
thread ([csit-dev] [vpp-dev] VPP Device jobs randomly failing) and being solved
in cooperation with the vendor (Intel).

Error symptoms: One specific test in the CSIT Robot execution fails, but the
tests finish properly and the framework exits.

2) OOM kill. Due to increased resource demands (mainly memory) from the device
under test and the accompanying test stack, the Docker daemon itself refused to
start more containers.

This issue has been fixed by adjusting the pre-allocated memory layout on both
devices, and the fix is now in place. I am monitoring the jobs. We also put in
place an optimization of the gerrit-jenkins trigger mechanics for vpp-device to
prevent unwanted verify jobs from starting (to decrease the load and run verify
jobs in a more effective pipeline).

Error symptoms (message): Docker container failed to start

3) The last but most complicated issue involves garbage collection of the
virtual functions (SR-IOV VFs) used (+ containers).
This issue is complex, and while I reset vpp_device yesterday, I am still
looking for a permanent fix to apply. It is related to a state where a
previous simulation did not properly clean up its resources.

Error symptoms (message):
Cannot find device "enpXX"
+ die 'Moving interface to YY namespace failed!'
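
As a rough illustration of how the symptoms above could be used for triage, here is a small Python sketch that maps message fragments to the issue numbers in the list. `SIGNATURES` and `classify` are hypothetical names; the fragments are taken from the messages quoted in this thread, and mapping the TG container failure to issue 2 is an assumption based on the "Docker container failed to start" symptom. Issue 1 has no message signature, so it is not detected.

```python
# Message fragment -> issue number from the list above (assumed mapping).
SIGNATURES = {
    "Failed to start TG docker container!": 2,
    "Cannot find device": 3,
    "namespace failed!": 3,
}


def classify(console_log):
    """Return the sorted issue numbers whose fragments occur in the log text."""
    found = {issue for text, issue in SIGNATURES.items() if text in console_log}
    return sorted(found)


print(classify("+ die 'Failed to start TG docker container!'"))
```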


Peter Mikus
Engineer – Software
Cisco Systems Limited


From: csit-...@lists.fd.io  on behalf of Benoit Ganne 
(bganne) via lists.fd.io 
Sent: Tuesday, January 19, 2021 11:55
To: csit-...@lists.fd.io
Cc: vpp-dev
Subject: [csit-dev] csit verify failures

Hi all,

I noticed 100% failures with the verify job
vpp-csit-verify-device-master-1n-skx recently, e.g.
https://logs.fd.io/production/vex-yul-rot-jenkins-1/vpp-csit-verify-device-master-1n-skx/10902/console.log.gz
It seems to always fail with 'Failed to start TG docker container!' and
'Topology reservation via shim-dcr failed!'.

Here is an excerpt:

+ DCR_UUIDS+=([tg]=$(docker run "${params[@]}"))
++ docker run --detach=true --privileged --publish-all --rm --shm-size 2G 
--mount type=tmpfs,destination=/sys/bus/pci/devices --volume 
/dev/vfio:/dev/vfio --volume /var/run/docker.sock:/var/run/docker.sock --volume 
/opt/boot/:/opt/boot/ --volume /dev/hugepages/:/dev/hugepages/ --sysctl 
net.ipv6.conf.all.disable_ipv6=1 --sysctl net.ipv6.conf.default.disable_ipv6=1 
--name csit-tg-fc4f2532-ea5a-47f2-b0bf-7ccdff00dc32 csit_sut-ubuntu1804:local
+ die 'Failed to start TG docker container!'
+ set -x
+ set +eu
+ warn 'Failed to start TG docker container!'
+ set -exuo pipefail
+ echo 'Failed to start TG docker container!'
Failed to start TG docker container!
+ exit 1
+++ set +eu
+++ warn 'Topology reservation via shim-dcr failed!'
+++ set -exuo pipefail
+++ echo 'Topology reservation via shim-dcr failed!'
Topology reservation via shim-dcr failed!
+++ exit 1
Build step 'Execute shell' marked build as failure

Best
ben

-=-=-=-=-=-=-=-=-=-=-=-
View/Reply Online (#18573): https://lists.fd.io/g/vpp-dev/message/18573
Mute This Topic: https://lists.fd.io/mt/79948604/21656
-=-=-=-=-=-=-=-=-=-=-=-



[vpp-dev] CSIT - AVF interface create crash [VPP-1858]

2020-04-06 Thread Peter Mikus via lists.fd.io
Hello vpp-dev,

We found issues running CSIT AVF tests. I opened VPP-1858 for tracking.

In case of any questions please contact @csit-dev.

Thank you.

Peter Mikus
Engineer – Software
Cisco Systems Limited
-=-=-=-=-=-=-=-=-=-=-=-
View/Reply Online (#15999): https://lists.fd.io/g/vpp-dev/message/15999
Mute This Topic: https://lists.fd.io/mt/72809079/21656
-=-=-=-=-=-=-=-=-=-=-=-


[vpp-dev] CSIT - VM VHOST tests with rxq>1 failing [VPP-1853]

2020-04-01 Thread Peter Mikus via Lists.Fd.Io
Hello vpp-dev,

We found issues running CSIT VM VHOST tests. I opened VPP-1853 for tracking.

In case of any questions please contact @csit-dev.

Thank you.

Peter Mikus
Engineer – Software
Cisco Systems Limited
-=-=-=-=-=-=-=-=-=-=-=-
View/Reply Online (#15964): https://lists.fd.io/g/vpp-dev/message/15964
Mute This Topic: https://lists.fd.io/mt/72695885/21656
-=-=-=-=-=-=-=-=-=-=-=-


[vpp-dev] CSIT - VM VHOST tests with qemu mrg_rxbuf=off [VPP-1854]

2020-04-01 Thread Peter Mikus via Lists.Fd.Io
Hello vpp-dev,

We found issues running CSIT VM VHOST tests. I opened VPP-1854 for tracking.

In case of any questions please contact @csit-dev.

Thank you.

Peter Mikus
Engineer – Software
Cisco Systems Limited
-=-=-=-=-=-=-=-=-=-=-=-
View/Reply Online (#15963): https://lists.fd.io/g/vpp-dev/message/15963
Mute This Topic: https://lists.fd.io/mt/72695880/21656
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] [csit-dev] crypto_ia32 -> crypto_native

2020-01-27 Thread Peter Mikus via Lists.Fd.Io
Let us know the commit id and when it will be merged.

Peter Mikus
Engineer – Software
Cisco Systems Limited

From: csit-...@lists.fd.io  On Behalf Of Damjan Marion 
via Lists.Fd.Io
Sent: Monday, January 27, 2020 9:54 PM
To: vpp-dev ; csit-...@lists.fd.io
Cc: csit-...@lists.fd.io
Subject: [csit-dev] crypto_ia32 -> crypto_native


Folks,

To avoid code duplication I would like to rename the crypto_ia32 plugin to
crypto_native. The reason is adding ARMv8 support, which seems to be very similar
to IA32 in terms of implementing CBC and GCM.

Any objections or caveats?

--
Damjan

-=-=-=-=-=-=-=-=-=-=-=-
View/Reply Online (#15265): https://lists.fd.io/g/vpp-dev/message/15265
Mute This Topic: https://lists.fd.io/mt/70213013/21656
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] CSIT - performance tests failing on Taishan

2019-12-03 Thread Peter Mikus via Lists.Fd.Io
The latest update is that Benoit has no access over VPN, so he tried to
replicate the issue in a local lab (assuming x86).
I will do a quick fix in CSIT: I will disable the MLX driver on Taishan.

Peter Mikus
Engineer - Software
Cisco Systems Limited

> -Original Message-
> From: Juraj Linkeš 
> Sent: Tuesday, December 3, 2019 3:09 PM
> To: Benoit Ganne (bganne) ; Peter Mikus -X (pmikus -
> PANTHEON TECH SRO at Cisco) ; Maciek Konstantynowicz
> (mkonstan) ; vpp-dev ; csit-
> d...@lists.fd.io
> Cc: Vratko Polak -X (vrpolak - PANTHEON TECH SRO at Cisco)
> ; lijian.zh...@arm.com; Honnappa Nagarahalli
> 
> Subject: RE: CSIT - performance tests failing on Taishan
> 
> Hi Benoit,
> 
> Do you have access to FD.io lab? The Taishan servers are in it.
> 
> Juraj
> 
> -Original Message-
> From: Benoit Ganne (bganne) 
> Sent: Friday, November 29, 2019 4:03 PM
> To: Peter Mikus -X (pmikus - PANTHEON TECH SRO at Cisco); Juraj Linkeš;
> Maciek Konstantynowicz (mkonstan); vpp-dev; csit-...@lists.fd.io
> Cc: Vratko Polak -X (vrpolak - PANTHEON TECH SRO at Cisco);
> lijian.zh...@arm.com; Honnappa Nagarahalli
> Subject: RE: CSIT - performance tests failing on Taishan
> 
> Hi Peter, can I get access to the setup to investigate?
> 
> Best
> ben
> 
> > -Original Message-
> > From: Peter Mikus -X (pmikus - PANTHEON TECH SRO at Cisco)
> > 
> > Sent: Friday, November 29, 2019 11:08
> > To: Benoit Ganne (bganne) ; Juraj Linkeš
> > ; Maciek Konstantynowicz (mkonstan)
> > ; vpp-dev ;
> > csit-...@lists.fd.io
> > Cc: Vratko Polak -X (vrpolak - PANTHEON TECH SRO at Cisco)
> > ; Benoit Ganne (bganne) ;
> > lijian.zh...@arm.com; Honnappa Nagarahalli
> > 
> > Subject: RE: CSIT - performance tests failing on Taishan
> >
> > +dev lists
> >
> > Peter Mikus
> > Engineer - Software
> > Cisco Systems Limited
> >
> > > -Original Message-
> > > From: Peter Mikus -X (pmikus - PANTHEON TECH SRO at Cisco)
> > > Sent: Friday, November 29, 2019 11:06 AM
> > > To: Benoit Ganne (bganne) ; Juraj Linkeš
> > > ; Maciek Konstantynowicz (mkonstan)
> > > 
> > > Cc: Vratko Polak -X (vrpolak - PANTHEON TECH SRO at Cisco)
> > > ; Benoit Ganne (bganne) ;
> > > lijian.zh...@arm.com; Honnappa Nagarahalli
> > 
> > > Subject: CSIT - performance tests failing on Taishan
> > >
> > > Hello all,
> > >
> > > In CSIT we are observing an issue with Taishan boxes where
> > > performance tests are failing.
> > > There has been a long, misleading discussion about the potential issue,
> > > root cause and what workaround to apply.
> > >
> > > Issue
> > > =====
> > > VPP is being restarted after an attempt to read "show pci" over the
> > > socket on '/run/vpp/cli.sock'
> > > in a loop. This loop test is executed in CSIT towards VPP with the
> > > default startup configuration via the command below to check if VPP is
> > > really UP and responding.
> > >
> > > How to reproduce
> > > ================
> > > for i in $(seq 1 120); do echo "show pci" | sudo socat - UNIX-CONNECT:/run/vpp/cli.sock; sudo netstat -ap | grep vpp; done
> > >
> > > The same can be reproduced using vppctl:
> > >
> > > for i in $(seq 1 120); do echo "show pci" | sudo vppctl; sudo netstat -ap | grep vpp; done
> > >
> > > To eliminate an issue with the test itself, I used "show version":
> > > for i in $(seq 1 120); do echo "show version" | sudo socat - UNIX-CONNECT:/run/vpp/cli.sock; sudo netstat -ap | grep vpp; done
> > >
> > > This test passes with "show version" and VPP is not restarted.
> > >
> > >
> > > Root cause
> > > ==========
> > > The root cause seems to be:
> > >
> > > Thread 1 "vpp_main" received signal SIGSEGV, Segmentation fault.
> > > 0xbeb4f3d0 in format_vlib_pci_vpd (
> > > s=0x7fabe830 "0002:f9:00.0   0  15b3:1015   8.0 GT/s x8 mlx5_core   CX4121A - ConnectX-4 LX SFP28", args=)
> > > at /w/workspace/vpp-arm-merge-master-ubuntu1804/src/vlib/pci/pci.c:230
> > > 230 /w/workspace/vpp-arm-merge-master-ubuntu1804/src/vlib/pci/pci.c: No such file or directory.
> > > (gdb)
> > > Continuing.
> > >
> > > Thread 1 "vpp_main" received signal SIGABRT, Aborted.
> > > __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
> > > 51  ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
> > > (gdb)
> > >
> > >
> > > The issue started after MLX was installed in Taishan.
> > >
> > >
> > > @Benoit Ganne (bganne) can you please help fix the root cause?
> > >
> > > Thank you.
> > >
> > > Peter Mikus
> > > Engineer - Software
> > > Cisco Systems Limited
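
The reproduction loop quoted above can also be sketched in Python, which makes it easier to count failed attempts. This is only a sketch under assumptions: `send_cli` and `probe` are hypothetical helpers, the socket path is the one from the message, and the reply is returned as raw bytes (it may contain terminal negotiation bytes in addition to the command output). It does not replace the `netstat` check from the original loop.

```python
import socket


def send_cli(command, sock_path="/run/vpp/cli.sock", timeout=5.0):
    """Send one CLI command over a UNIX stream socket and return the raw reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        s.connect(sock_path)
        s.sendall(command.encode() + b"\n")
        # Half-close so the peer sees end of input, as socat does on EOF.
        s.shutdown(socket.SHUT_WR)
        chunks = []
        while True:
            data = s.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def probe(command, attempts=120, sock_path="/run/vpp/cli.sock", timeout=5.0):
    """Run `command` repeatedly and return how many attempts raised OSError."""
    failures = 0
    for _ in range(attempts):
        try:
            send_cli(command, sock_path=sock_path, timeout=timeout)
        except OSError:
            # Covers a missing socket, refused connection, reset and timeout.
            failures += 1
    return failures
```

Comparing `probe("show pci")` with `probe("show version")` would mirror the two shell loops; a non-zero count for the former only would point at the same command-specific crash.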

-=-=-=-=-=-=-=-=-=-=-=-
View/Reply Online (#14765): https://lists.fd.io/g/vpp-dev/message/14765
Mute This Topic: https://lists.fd.io/mt/64332740/21656
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] CSIT - performance tests failing on Taishan

2019-11-29 Thread Peter Mikus via Lists.Fd.Io
+dev lists

Peter Mikus
Engineer - Software
Cisco Systems Limited

> -Original Message-
> From: Peter Mikus -X (pmikus - PANTHEON TECH SRO at Cisco)
> Sent: Friday, November 29, 2019 11:06 AM
> To: Benoit Ganne (bganne) ; Juraj Linkeš
> ; Maciek Konstantynowicz (mkonstan)
> 
> Cc: Vratko Polak -X (vrpolak - PANTHEON TECH SRO at Cisco)
> ; Benoit Ganne (bganne) ;
> lijian.zh...@arm.com; Honnappa Nagarahalli 
> Subject: CSIT - performance tests failing on Taishan
> 
> Hello all,
> 
> In CSIT we are observing an issue with the Taishan boxes where performance
> tests are failing.
> There has been a long and misleading discussion about the potential issue, its
> root cause, and what workaround to apply.
> 
> Issue
> =====
> VPP restarts after an attempt to read "show pci" over the socket
> '/run/vpp/cli.sock' in a loop. CSIT runs this loop test against VPP with the
> default startup configuration, using the command below, to check whether VPP
> is really up and responding.
> 
> How to reproduce
> ================
> for i in $(seq 1 120); do echo "show pci" | sudo socat - UNIX-CONNECT:/run/vpp/cli.sock; sudo netstat -ap | grep vpp; done
> 
> The same can be reproduced using vppctl:
> 
> for i in $(seq 1 120); do echo "show pci" | sudo vppctl; sudo netstat -ap | grep vpp; done
> 
> To rule out an issue with the test itself, I used "show version":
> for i in $(seq 1 120); do echo "show version" | sudo socat - UNIX-CONNECT:/run/vpp/cli.sock; sudo netstat -ap | grep vpp; done
> 
> This test is passing with "show version" and VPP is not restarted.
> 
> 
> Root cause
> ==========
> The root cause seems to be:
> 
> Thread 1 "vpp_main" received signal SIGSEGV, Segmentation fault.
> 0xbeb4f3d0 in format_vlib_pci_vpd (
> s=0x7fabe830 "0002:f9:00.0   0  15b3:1015   8.0 GT/s x8 mlx5_core   CX4121A - ConnectX-4 LX SFP28", args=)
> at /w/workspace/vpp-arm-merge-master-ubuntu1804/src/vlib/pci/pci.c:230
> 230 /w/workspace/vpp-arm-merge-master-ubuntu1804/src/vlib/pci/pci.c: No such file or directory.
> (gdb)
> Continuing.
> 
> Thread 1 "vpp_main" received signal SIGABRT, Aborted.
> __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
> 51  ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
> (gdb)
> 
> 
> The issue started after the Mellanox (MLX) NIC was installed in the Taishan box.
> 
> 
> @Benoit Ganne (bganne) can you please help fix the root cause?
> 
> Thank you.
> 
> Peter Mikus
> Engineer - Software
> Cisco Systems Limited
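
The reproduction loop above relies on reading the netstat output by eye. A minimal sketch of one way to detect a restart automatically is to sample the VPP PID after every CLI call and count how often it changes. The helper name count_restarts and the use of pidof are assumptions made for illustration; they are not part of CSIT.

```shell
# Count how many times the PID changes in a stream of samples (one PID per line).
# Empty samples (VPP not running at that moment) are skipped.
count_restarts() {
    awk 'NF { if (seen && $1 != prev) n++; prev = $1; seen = 1 }
         END { print n + 0 }'
}

# Hypothetical usage against a live VPP (assumes vppctl and pidof are installed):
#   for i in $(seq 1 120); do
#       echo "show pci" | sudo vppctl >/dev/null 2>&1
#       pidof vpp
#   done | count_restarts
```

A result of 0 means the same process answered every request; any other value means VPP was restarted at least that many times during the loop.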

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#14734): https://lists.fd.io/g/vpp-dev/message/14734
Mute This Topic: https://lists.fd.io/mt/64332740/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] Requirements for DPDK pmd/feature testing in CSIT vpp-device-test jobs

2019-11-19 Thread Peter Mikus via Lists.Fd.Io
Hello,

Q:

> Where does the vpp-device-test live?
(For VPP) It lives in [0] and runs per patch, non-voting. The plan is to move 
it to voting, as coverage and stability issues are now close to zero (except 
for major Jenkins outages, which are outside our control) and all the results 
were analyzed for false negatives. Right now [1] there are no false negatives.

> What does it do?
  Currently it tests hardware-level integration, focused on driver testing 
(per the mail below, it is another layer beyond make-test). This means we run 
VPP in a Docker container on top of Intel X710 cards in SR-IOV mode (this can 
be extended to Mellanox if needed, or e.g. Intel QAT?). We test AVF, vfio-pci 
(i40evf) via DPDK, memif, vhost, and tap, and this can be extended to more.

> Can we expand this discussion to discuss the VPP CI workflow?  I would like 
> to see a decoupling of development and integration.
  Sure, any feedback is welcome.

> It would be great if we could rebuild the containers whenever a commit 
> updated the Makefile or the requirements.txt files.
  This was the original idea and goal: to have a container CI/CD pipeline in 
place that eliminates manual intervention when a change is required. So I am 
happy to collaborate on the design of such a pipeline. I would start by 
defining the requirements from the community first and then build the infra.
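
As a starting point for that discussion, here is a minimal sketch of the trigger condition only: decide from the list of paths changed by a commit whether a container rebuild is needed. The function name needs_container_rebuild is hypothetical, and matching any file named Makefile or requirements.txt anywhere in the tree is an assumption; the real pipeline may want a narrower list.

```shell
# Read changed paths on stdin (one per line) and succeed (exit 0) if any of
# them is a file named Makefile or requirements.txt, at any directory depth.
needs_container_rebuild() {
    grep -Eq '(^|/)(Makefile|requirements\.txt)$'
}

# Hypothetical usage in a CI job:
#   git diff --name-only HEAD~1 HEAD | needs_container_rebuild \
#       && echo "rebuild containers"
```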

> What’s next?
  We are seeking input on how to extend vpp-device driver coverage and on 
which other drivers you would like to test, from either the DPDK stack or native.


[0] https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-skx/
[1] 
https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-skx/buildTimeTrend
[2] https://docs.fd.io/csit/master/report/vpp_device_tests/overview.html

Peter Mikus
Engineer – Software
Cisco Systems Limited

From: vpp-dev@lists.fd.io  On Behalf Of Paul Vinciguerra
Sent: Tuesday, November 19, 2019 12:10 PM
To: Paul Vinciguerra 
Cc: Dave Wallace ; vpp-dev 
Subject: Re: [vpp-dev] Requirements for DPDK pmd/feature testing in CSIT 
vpp-device-test jobs

Can we expand this discussion to discuss the VPP CI workflow?  I would like to 
see a decoupling of development and integration.
As I mentioned the other day, it would be great if we could rebuild the 
containers whenever a commit updated the Makefile or the requirements.txt files.

I'd also like to throw out the idea of breaking up the verify job.  We could 
remove VOM and the dist builds from verify and change the workflow so that a 
+2 triggers a pre-commit gate where VOM, the dist builds, and the extended 
tests are run.  If everything passes, the change is merged; if not, the +2 is 
removed.  The existing csit job could be non-voting (so the csit folks get a 
heads-up) in the first phase, and voting in the pre-commit phase.

Paul

On Mon, Nov 18, 2019 at 12:18 PM Paul Vinciguerra via Lists.Fd.Io wrote:
Hi Dave.

Where does the vpp-device-test live?

On Mon, Nov 18, 2019 at 11:13 AM Dave Wallace <dwallac...@gmail.com> wrote:
Folks,

Per the topic in last week's monthly VPP community meeting, the topic of DPDK 
pmd/feature testing in the CSIT devicetest job was discussed in the most recent 
CSIT community meeting (Wed 11/13).

In the beginning of the VPP project, DPDK pmd/feature testing was performed in 
the VIRL based CSIT test suites. DPDK was moved from the VPP core feature set 
into a plugin in VPP 17.04 and in later releases native device drivers were 
implemented.  Subsequently, DPDK testing was removed from the CSIT VIRL tests.  
Also, the CSIT team put a plan in place for all of the VIRL tests to be 
moved and the VIRL servers re-purposed.  In addition, the CSIT vpp-device-test 
job was created to provide test coverage of device level VPP features that 
cannot be tested in VPP's 'make test' framework. The plan for re-purposing the 
VIRL servers is complete and the vpp-device-test job is slated to become voting 
once it is stable enough for continuous-integration testing.

The CSIT team would like input from the VPP community on exactly what DPDK 
PMD's and/or features are required to be added to the CSIT vpp-device-test jobs.

Thanks,
-daw-


Re: [csit-dev] [vpp-dev] current verify issues and jenkins build queue

2019-10-09 Thread Peter Mikus via Lists.Fd.Io
1device restored.

Peter Mikus
Engineer – Software
Cisco Systems Limited

> -Original Message-
> From: csit-...@lists.fd.io  On Behalf Of Ed Kern
> via Lists.Fd.Io
> Sent: Tuesday, October 8, 2019 6:58 PM
> To: vpp-dev ; csit-...@lists.fd.io
> Cc: csit-...@lists.fd.io
> Subject: Re: [csit-dev] [vpp-dev] current verify issues and jenkins build
> queue
> 
> Update:
> 
> Still ongoing:   1device cluster still dead….per patch job has been
> removed so verify’s can happen
> 
> Jenkins is back up with an empty queue as of @30 minutes ago.   I have
> three rechecks running and when those pass
> ill be going through all the open gerrits I see (without any verification
> vote at all) for the past day and recheck them.
> 
> 
> Wont send another update until/if something else goes sideways or 1device
> cluster is back in service and is part of the verify process again.
> 
> Ill be looking into the health checker style so ’the right thing’ will
> happen when the port is open and receiving but without a fully functional
> brain behind it.
> 
> Ed
> 
> 
> 
> > On Oct 8, 2019, at 9:59 AM, Ed Kern via Lists.Fd.Io
>  wrote:
> >
> >
> > Problems currently still ongoing:
> > 1device cluster worker nodes are currently down.. I’ve notified
> csit in slack and am cc’ing them here..  In the meantime I have a gerrit
> to remove 1device per patch
> > so it doesn’t delay voting on verify jobs.
> > Jenkins just crashed so that will take awhile to sort.
> > vanessa and I are trying to just empty the build queue at
> this point to get back to zero so jenkins won’t just crash again when it
> gets opened.
> >
> >
> > History:
> >
> > root cause:
> > a. will have to wait on csit folks for answers on the two 1device
> node failure
> > b. during the night the internal docker registry stopped responding
> > (but still passed socket health check so didnt fail over)
> >
> > Workflow:
> > 1. I saw there was an issue reading email around 6am pacific this
> morning.
> > 2. saw that the registry wasn’t responding and attempted restart.
> > 3. due to the jenkins server queue hammering on the nomad cluster
> it took a long while to get that restart to go through (roughly 40 min)
> > 4. once the bottle was uncorked the sixty jobs pending (including a
> large number of checkstyle jobs) turned into 160.
> > 5. jenkins ‘chokes’ and crashes
> > 6. ‘we’ start scrubbing the queue which will cause a huge number of
> rechecks but at least jenkins wont crash again..
> >
> > **current time
> >
> >  future:
> > 7.  will force the ci-man patch removing per patch verify
> > 8. jenkins queue will re-open and ill send another email.
> > 9. Im adding myself to the queue high threshold alarm LF system so
> I get paged/called when the queue gets above 90 (their current severe
> water mark)
> > 10. Ill see if i can find a way to troll gerrit to manually recheck
> > what i can find
> >
> >
> > more as it rolls along
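
The root cause noted above, a registry that still passed the socket health check while no longer responding, suggests checking at the application level instead. A minimal sketch, assuming the internal registry speaks the Docker Registry HTTP API v2 (where GET /v2/ answers 200, or 401 when authentication is required); the function name and the host registry.example are hypothetical.

```shell
# Decide from the HTTP status code of GET /v2/ whether the registry itself
# (not just its listening socket) is answering.
registry_status_ok() {
    case "$1" in
        200|401) return 0 ;;  # 401 still proves the registry logic responded
        *)       return 1 ;;  # includes 000, which curl reports on no response
    esac
}

# Hypothetical usage in a health check script:
#   code=$(curl -s -o /dev/null -m 5 -w '%{http_code}' "http://registry.example:5000/v2/")
#   registry_status_ok "$code" || echo "registry unhealthy (HTTP $code)"
```

The point of the design is that an open port alone never counts as healthy; only a well-formed answer from the service does.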

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#14151): https://lists.fd.io/g/vpp-dev/message/14151
Mute This Topic: https://lists.fd.io/mt/34460887/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] SIGABRT during dpdk_config

2019-05-03 Thread Peter Mikus via Lists.Fd.Io
Hello,

It would be helpful if you could describe your environment in more detail: K8s 
version, YAML, etc.
This seems to be a problem with mapping hugepages.
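
A quick way to check this from inside the container is to look at the hugepage counters in /proc/meminfo. The sketch below only parses that file; the helper name hugepages_free is an assumption for illustration, and a value of 0 would be consistent with DPDK failing to allocate memory, though it does not prove it.

```shell
# Print the HugePages_Free value from /proc/meminfo-style input on stdin.
# Prints 0 if the field is absent.
hugepages_free() {
    awk '$1 == "HugePages_Free:" { print $2; found = 1 }
         END { if (!found) print 0 }'
}

# Hypothetical usage inside the pod:
#   hugepages_free < /proc/meminfo
```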

Peter Mikus
Engineer – Software
Cisco Systems Limited

From: vpp-dev@lists.fd.io  On Behalf Of Damjan Marion via 
Lists.Fd.Io
Sent: Friday, May 3, 2019 11:48 AM
To: msher...@yahoo.com
Cc: vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] SIGABRT during dpdk_config


On 1 May 2019, at 18:43, Mohamed Mohamed via Lists.Fd.Io wrote:

Hi Damjan

I am running the container in privileged mode. Is there a way to narrow this down?

No idea, I never tried to run VPP in docker container…

--
Damjan


Thanks
Mohamed

Sent from my iPhone

On May 1, 2019, at 12:37 PM, Damjan Marion (damarion) <damar...@cisco.com> wrote:



On 1 May 2019, at 16:11, Mohamed Mohamed via Lists.Fd.Io wrote:

Hi Folks:

I am getting a VPP crash during init with the following backtrace:

Program terminated with signal SIGABRT, Aborted.
#0  0x7f0c7e3ea428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
54 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
[Current thread is 1 (Thread 0x7f0c7ffef780 (LWP 21592))]
(gdb) bt
#0  0x7f0c7e3ea428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#1  0x7f0c7e3ec02a in __GI_abort () at abort.c:89
#2  0x0040730e in os_exit (code=code@entry=1) at /build/vpp/src/vpp/vnet/main.c:357
#3  0x7f0c7ecaec09 in unix_signal_handler (signum=, si=, uc=) at /build/vpp/src/vlib/unix/main.c:156
#4
#5  0x7f0c3bcfa977 in alloc_seg () from /usr/lib/vpp_plugins/dpdk_plugin.so
#6  0x7f0c3bcfb06d in alloc_seg_walk () from /usr/lib/vpp_plugins/dpdk_plugin.so
#7  0x7f0c3bd022fb in rte_memseg_list_walk_thread_unsafe () from /usr/lib/vpp_plugins/dpdk_plugin.so
#8  0x7f0c3bcfbbf1 in eal_memalloc_alloc_seg_bulk () from /usr/lib/vpp_plugins/dpdk_plugin.so
#9  0x7f0c3bd12034 in alloc_pages_on_heap () from /usr/lib/vpp_plugins/dpdk_plugin.so
#10 0x7f0c3bd12362 in try_expand_heap () from /usr/lib/vpp_plugins/dpdk_plugin.so
#11 0x7f0c3bd128b8 in alloc_more_mem_on_socket () from /usr/lib/vpp_plugins/dpdk_plugin.so
#12 0x7f0c3bd12d25 in malloc_heap_alloc () from /usr/lib/vpp_plugins/dpdk_plugin.so
#13 0x7f0c3bd0d90e in rte_malloc_socket () from /usr/lib/vpp_plugins/dpdk_plugin.so
#14 0x7f0c3bd166b1 in rte_service_init () from /usr/lib/vpp_plugins/dpdk_plugin.so
#15 0x7f0c3bcf0487 in rte_eal_init () from /usr/lib/vpp_plugins/dpdk_plugin.so
#16 0x7f0c3c38d0b3 in dpdk_config (vm=, input=) at /build/vpp/src/plugins/dpdk/device/init.c:1446
#17 0x7f0c7ec710d7 in vlib_call_all_config_functions (vm=, input=input@entry=0x7f0c3e3fffa0, is_early=is_early@entry=0) at /build/vpp/src/vlib/init.c:146
#18 0x7f0c7ec83c16 in vlib_main (vm=, vm@entry=0x7f0c7eec6340 , input=input@entry=0x7f0c3e3fffa0) at /build/vpp/src/vlib/main.c:2028
#19 0x7f0c7ecadc26 in thread0 (arg=139691645756224) at /build/vpp/src/vlib/unix/main.c:606
#20 0x7f0c7e7b0594 in clib_calljmp () from /usr/lib/x86_64-linux-gnu/libvppinfra.so.19.01
#21 0x7ffe62c32740 in ?? ()
#22 0x7f0c7ecaf172 in vlib_unix_main (argc=, argv=) at /build/vpp/src/vlib/unix/main.c:675
#23 0x in ?? ()

I am running VPP in a Docker container in a Vagrant VM. It comes up fine with 
docker-compose, but when I do the same from k8s it fails to come up because of 
the above.

Any suggestions?

DPDK is not able to allocate memory. It might be missing permissions.

--
Damjan

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#12919): https://lists.fd.io/g/vpp-dev/message/12919
Mute This Topic: https://lists.fd.io/mt/31432757/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] Jenkins.fd.io network issues

2019-05-02 Thread Peter Mikus via Lists.Fd.Io
Hello,

This issue is becoming very serious and is impacting deliverables for the CSIT 
project.

Can you please investigate with priority?

Subset of failed jobs (just few examples from past days):
https://jenkins.fd.io/job/csit-vpp-perf-verify-1904-2n-skx/38/
https://jenkins.fd.io/job/csit-vpp-perf-verify-1904-2n-skx/37/
https://jenkins.fd.io/job/csit-vpp-perf-verify-1904-2n-skx/36/
https://jenkins.fd.io/job/csit-vpp-perf-verify-1904-2n-skx/35/
https://jenkins.fd.io/job/csit-vpp-perf-verify-1904-2n-skx/34/
https://jenkins.fd.io/job/csit-vpp-perf-verify-1904-3n-skx/20/
https://jenkins.fd.io/job/csit-vpp-perf-verify-1904-3n-skx/19/
https://jenkins.fd.io/job/csit-vpp-perf-verify-1904-3n-skx/18/
https://jenkins.fd.io/job/csit-vpp-perf-verify-1904-3n-skx/17/
https://jenkins.fd.io/job/csit-vpp-perf-verify-1904-3n-skx/15/
https://jenkins.fd.io/job/csit-vpp-perf-verify-1904-3n-skx/13/
https://jenkins.fd.io/job/csit-vpp-perf-verify-1904-3n-hsw/16/
https://jenkins.fd.io/job/csit-vpp-perf-verify-1904-3n-hsw/17/
https://jenkins.fd.io/job/csit-vpp-perf-verify-1904-3n-hsw/18/
https://jenkins.fd.io/job/csit-vpp-perf-verify-1904-3n-hsw/19/
https://jenkins.fd.io/job/csit-vpp-perf-verify-1904-3n-hsw/10/
https://jenkins.fd.io/job/csit-vpp-perf-verify-1904-3n-hsw/13/
https://jenkins.fd.io/job/csit-vpp-perf-verify-1904-3n-hsw/14/
https://jenkins.fd.io/job/csit-vpp-perf-verify-1904-3n-hsw/15/
https://jenkins.fd.io/job/csit-dpdk-perf-verify-1904-3n-skx/8/
https://jenkins.fd.io/job/csit-dpdk-perf-verify-1904-3n-skx/10/
https://jenkins.fd.io/job/csit-vpp-perf-mrr-daily-master/666/
https://jenkins.fd.io/job/csit-vpp-perf-mrr-daily-master/665/

Thank you.

Note: The dates and times of the outages, and their correlation in time, can 
be extracted quite easily from the logs:


jobs=('https://jenkins.fd.io/job/csit-vpp-perf-verify-1904-2n-skx/38/'
  'https://jenkins.fd.io/job/csit-vpp-perf-verify-1904-2n-skx/37/'
  'https://jenkins.fd.io/job/csit-vpp-perf-verify-1904-2n-skx/36/'
  'https://jenkins.fd.io/job/csit-vpp-perf-verify-1904-2n-skx/35/'
  'https://jenkins.fd.io/job/csit-vpp-perf-verify-1904-2n-skx/34/'
  'https://jenkins.fd.io/job/csit-vpp-perf-verify-1904-3n-skx/20/'
  'https://jenkins.fd.io/job/csit-vpp-perf-verify-1904-3n-skx/19/'
  'https://jenkins.fd.io/job/csit-vpp-perf-verify-1904-3n-skx/18/'
  'https://jenkins.fd.io/job/csit-vpp-perf-verify-1904-3n-skx/17/'
  'https://jenkins.fd.io/job/csit-vpp-perf-verify-1904-3n-skx/15/'
  'https://jenkins.fd.io/job/csit-vpp-perf-verify-1904-3n-skx/13/'
  'https://jenkins.fd.io/job/csit-vpp-perf-verify-1904-3n-hsw/16/'
  'https://jenkins.fd.io/job/csit-vpp-perf-verify-1904-3n-hsw/17/'
  'https://jenkins.fd.io/job/csit-vpp-perf-verify-1904-3n-hsw/18/'
  'https://jenkins.fd.io/job/csit-vpp-perf-verify-1904-3n-hsw/19/'
  'https://jenkins.fd.io/job/csit-vpp-perf-verify-1904-3n-hsw/10/'
  'https://jenkins.fd.io/job/csit-vpp-perf-verify-1904-3n-hsw/13/'
  'https://jenkins.fd.io/job/csit-vpp-perf-verify-1904-3n-hsw/14/'
  'https://jenkins.fd.io/job/csit-vpp-perf-verify-1904-3n-hsw/15/'
  'https://jenkins.fd.io/job/csit-dpdk-perf-verify-1904-3n-skx/8/'
  'https://jenkins.fd.io/job/csit-dpdk-perf-verify-1904-3n-skx/10/'
  'https://jenkins.fd.io/job/csit-vpp-perf-mrr-daily-master/666/'
  'https://jenkins.fd.io/job/csit-vpp-perf-mrr-daily-master/665/'
  'https://jenkins.fd.io/job/csit-vpp-perf-mrr-daily-master-3n-skx/416/')
# Print the first timestamped ClosedChannelException line of each job's log.
for job in "${jobs[@]}"; do
    curl -s "${job}timestamps/?time=yyyy.MM.dd-HH:mm:ss&appendLog&locale=en_US" \
        | grep -m 1 "java.nio.channels.ClosedChannelException"
done

which results in:

2019.05.01-13:48:27  java.nio.channels.ClosedChannelException
2019.05.01-13:48:26  java.nio.channels.ClosedChannelException
2019.05.01-13:48:26  java.nio.channels.ClosedChannelException
2019.05.01-13:48:18  java.nio.channels.ClosedChannelException
2019.04.30-16:16:29  java.nio.channels.ClosedChannelException
2019.05.01-13:48:18  java.nio.channels.ClosedChannelException
2019.05.01-13:48:26  java.nio.channels.ClosedChannelException
2019.05.01-13:48:26  java.nio.channels.ClosedChannelException
2019.04.30-16:16:29  java.nio.channels.ClosedChannelException
2019.04.30-16:16:29  java.nio.channels.ClosedChannelException
2019.04.27-10:55:06  java.nio.channels.ClosedChannelException
2019.04.30-16:16:29  java.nio.channels.ClosedChannelException
2019.05.01-13:48:26  java.nio.channels.ClosedChannelException
2019.05.01-17:59:20  java.nio.channels.ClosedChannelException
2019.05.01-13:48:26  java.nio.channels.ClosedChannelException
2019.04.27-10:54:40  java.nio.channels.ClosedChannelException
2019.04.28-09:32:20  java.nio.channels.ClosedChannelException
2019.04.30-16:16:29  java.nio.channels.ClosedChannelException
2019.04.30-16:16:29  java.nio.channels.ClosedChannelException
2019.05.01-13:48:18  java.nio.channels.ClosedChannelException
2019.05.01-13:48:26  java.nio.channels.ClosedChannelException
2019.04.30-16:16:29  
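
To make the correlation in time easier to see, the timestamps above can be grouped by minute. This is only a sketch over the output format shown above; the helper name group_outages is an assumption for illustration.

```shell
# Read lines of the form "yyyy.MM.dd-HH:mm:ss  <message>" on stdin and print
# "<count> <yyyy.MM.dd-HH:mm>" per minute, most frequent minute first.
group_outages() {
    awk 'NF { print substr($1, 1, 16) }' \
        | sort | uniq -c | sort -rn \
        | awk '{ print $1, $2 }'
}

# Hypothetical usage: save the output of the loop above to outages.txt, then
#   group_outages < outages.txt
```

Jobs that failed within the same minute then collapse into a single line, so each distinct outage window stands out.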

Re: [vpp-dev] [tsc] [csit-dev] VPP 18.10 is out!!!

2018-10-25 Thread Peter Mikus via Lists.Fd.Io
Hello,

Please address all questions to vpp-dev. It is no longer tested by CSIT.

Thanks.

Peter Mikus
Engineer – Software
Cisco Systems Limited

From: csit-...@lists.fd.io  On Behalf Of Florin Coras
Sent: Thursday, October 25, 2018 3:43 AM
To: Jan Gelety -X (jgelety - PANTHEON TECHNOLOGIES at Cisco) 
Cc: Damjan Marion (damarion) ; csit-...@lists.fd.io; Ed 
Kern (ejk) 
Subject: Re: [tsc] [csit-dev] VPP 18.10 is out!!!

While we’re at it. Any idea if this dependency can be dropped for centos/suse? 
We keep randomly getting:

package vpp-ext-deps-19.01-4.x86_64 (which is newer than 
vpp-ext-deps-19.01-3.x86_64) is already installed

Cheers,
Florin

On Oct 24, 2018, at 10:27 AM, Jan Gelety via Lists.Fd.Io wrote:

Hello Damjan,

Thanks for the info.

In the past, CSIT tests used the igb_uio driver, and the former vpp-dpdk-dkms 
deb package (now replaced by the vpp-ext-deps package) was the source of this 
driver for tests on Ubuntu.

However, the CSIT tests were modified to use the vfio-pci driver, so we were 
able to remove the CSIT dependency on the vpp-ext-deps package in both the 
master [0] and rls1810 [1] branches.

Thus we don’t need it anymore.

Thanks,
Jan

[0] https://gerrit.fd.io/r/#/c/15505/
[1] https://gerrit.fd.io/r/#/c/15512/

From: Damjan Marion (damarion)
Sent: Wednesday, October 24, 2018 7:00 PM
To: Jan Gelety -X (jgelety - PANTHEON TECHNOLOGIES at Cisco) <jgel...@cisco.com>
Cc: Marco Varlese <mvarl...@suse.de>; vpp-dev@lists.fd.io; 
csit-...@lists.fd.io; t...@lists.fd.io
Subject: Re: [tsc] [csit-dev] VPP 18.10 is out!!!


Jan,

vpp-ext-deps is not an official VPP package; it is just a helper package for 
developers.
It is not needed by, or intended for, end users.

Why do you need it?

--
Damjan



On 24 Oct 2018, at 10:59, Jan Gelety via Lists.Fd.Io wrote:

Hello Marco,

Great and thanks for the info!

Unfortunately, vpp-ext-deps packages are missing there [0] - is it possible to 
fix it, please?

Thanks,
Jan

[0] https://packagecloud.io/app/fdio/release/search?q=vpp-ext-deps

-Original Message-
From: csit-...@lists.fd.io 
mailto:csit-...@lists.fd.io>> On Behalf Of Marco Varlese
Sent: Wednesday, October 24, 2018 9:47 AM
To: vpp-dev@lists.fd.io; 
csit-...@lists.fd.io
Cc: t...@lists.fd.io
Subject: [csit-dev] VPP 18.10 is out!!!

Dear all,

I am very happy to announce that release 18.10 is available.

Release artifacts can be found on both Nexus and Packagecloud.

Thanks to all contributors for yet another great release!


Cheers,
--
Marco V

SUSE LINUX GmbH | GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 
(AG Nürnberg) Maxfeldstr. 5, D-90409, Nürnberg

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#10979): https://lists.fd.io/g/vpp-dev/message/10979
Mute This Topic: https://lists.fd.io/mt/27619893/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


[vpp-dev] CSIT - VirtualEthernet interface UP issue

2018-10-08 Thread Peter Mikus via Lists.Fd.Io
Hello vpp-dev,

Recently we have been facing an issue where VirtualEthernet (vhost-user) 
interfaces do not come up after we set them to the admin-up state.


A brief bisect shows that it started after one of these commits:

git log --pretty=oneline 0a4e006..5507192
5507192339aed14634929b3e8d7c5d3e5ea8f997 Fix JVPP enum _host_to_net_ translation (VPP-1438)
2d3c7b9c4555ea4467253b0590c9aa1a6c644b4d BFD: add get echo source API (VPP-1367)
a9a0b2ce2daabc5479aa7722e3ec7023f8c6c0d5 IPsec: add API for SPDs dump (VPP-1363)
b192feba004e7a52b57ff9f68246b1c94e8b667b vhost-user: Interface state updates
83c46a2c5c97320e029b4dd154a45212530f221d vhost_user: Fix setting MTU using uninitialized variable
8f39d55a298e08ac808da6988032f14d542627c6 Update code to compute checksum for buffer chains
ef91534e665cf343af2389df11d46559a1f83d78 tls: fix disconnects for sessions with pending data
5f5d50ee9b342964ca10612cd002497fb40c tcp: fix close wait timeout with no fin
ca09d0730974effd53b436871f3e69c8bb8b1114 tcp: accept fins if in recovery

Could you please take a look and let us know if there was any change in 
mechanics or API?
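
When a bisect range like the one above is long, it can help to first narrow the candidates to the commits that touch the affected area. The following is only a sketch; the helper name filter_commits is an assumption, and a keyword match on the subject line is a heuristic, not proof that a commit is responsible.

```shell
# Filter `git log --pretty=oneline` output (stdin) by a case-insensitive
# keyword and print "<short hash> <subject>" for each match.
filter_commits() {
    awk -v kw="$1" '
        BEGIN { kw = tolower(kw) }
        index(tolower($0), kw) {
            print substr($1, 1, 7), substr($0, length($1) + 2)
        }'
}

# Hypothetical usage in a VPP checkout:
#   git log --pretty=oneline 0a4e006..5507192 | filter_commits vhost
```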


Thank you.

[0] 
https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master/333/consoleFull
 | grep VirtualEthernet
[1] VAT history:
sw_interface_set_flags sw_if_index 2 admin-up link-up
sw_interface_set_l2_bridge sw_if_index 2 bd_id 1 shg 0 enable
sw_interface_set_flags sw_if_index 1 admin-up link-up
sw_interface_set_l2_bridge sw_if_index 1 bd_id 2 shg 0 enable
create_vhost_user_if socket /tmp/sock-1-1
sw_interface_dump
sw_interface_dump
create_vhost_user_if socket /tmp/sock-1-2
sw_interface_dump
sw_interface_dump
sw_interface_set_flags sw_if_index 3 admin-up link-up
sw_interface_set_flags sw_if_index 4 admin-up link-up
sw_interface_set_flags sw_if_index 3 admin-up link-up
sw_interface_set_l2_bridge sw_if_index 3 bd_id 1 shg 0 enable
sw_interface_set_flags sw_if_index 4 admin-up link-up
sw_interface_set_l2_bridge sw_if_index 4 bd_id 2 shg 0 enable
sw_interface_dump



Peter Mikus
Engineer - Software
Cisco Systems Limited

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#10767): https://lists.fd.io/g/vpp-dev/message/10767
Mute This Topic: https://lists.fd.io/mt/26872454/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] VAT api changes 5507192..8807674

2018-10-04 Thread Peter Mikus via Lists.Fd.Io
With the latest version, 18.10-rc0~572-g5958769~b5355, the issue no longer 
occurs, so I consider this one closed.

Peter Mikus
Engineer – Software
Cisco Systems Limited

From: Maciek Konstantynowicz (mkonstan)
Sent: Thursday, October 4, 2018 10:28 AM
To: Peter Mikus -X (pmikus - PANTHEON TECHNOLOGIES at Cisco) 
; vpp-dev@lists.fd.io
Cc: csit-dev 
Subject: Re: [vpp-dev] VAT api changes 5507192..8807674

Is it still broken?

-Maciek


On 3 Oct 2018, at 09:11, Peter Mikus via Lists.Fd.Io wrote:

Hello devs,

Recently in CSIT we have been facing an issue with the vpp_api_test console 
with JSON enabled when calling dump_interface_table:

dump_interface_table:6017: JSON output supported only for VPE API calls and 
dump_stats_table
main:427: BUG: message reply spin-wait timeout

our API call:

$ sudo -S vpp_api_test json in dump_interfaces.vat script

Where:
$ cat dump_interfaces.vat
sw_interface_dump
dump_interface_table
quit

A brief bisect shows that it started after one of these commits:

$ git log --pretty=oneline 5507192..8807674
88076749e663e35925c2212eb79e2ec4ce023772 (HEAD -> master, origin/master, origin/HEAD) Enabled untagged vs default functionality Removed 0-tags attribute for default-sub-if config Moved default-sub-if check before untagged
819d5fdb39526386ee8fe4a8729f960e84443cbd VPP-1440: clean up coverity warnings
bf49590c07162be44b21d0e0440e7fb96b2746d5 Stats: vpp_prometheus_export fixes.
94495f2a6a68ac2202b7715ce09620f1ba6fe673 PAPI: Use UNIX domain sockets instead of shared memory
84db4087fa38b8d4c62cbb0787d600950638034c vcl: fix coverity warning
c67cfd22bedf5b45e82ff6820d1d4e6f61ba5187 ip4-local: classify protos that skip csum and src check
2f54c27f7fd0d2c24e7d6b1d48809e8b58ec1abf vhost-user: add support for vlib_log API

Could you please take a look and let us know whether this is a bug or the 
desired behavior?

Thank you.

Peter Mikus
Engineer – Software
Cisco Systems Limited

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#10753): https://lists.fd.io/g/vpp-dev/message/10753
Mute This Topic: https://lists.fd.io/mt/26716121/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


[vpp-dev] VAT api changes 5507192..8807674

2018-10-03 Thread Peter Mikus via Lists.Fd.Io
Hello devs,

Recently in CSIT we have been facing an issue with the vpp_api_test console 
with JSON enabled when calling dump_interface_table:

dump_interface_table:6017: JSON output supported only for VPE API calls and 
dump_stats_table
main:427: BUG: message reply spin-wait timeout

our API call:

$ sudo -S vpp_api_test json in dump_interfaces.vat script

Where:
$ cat dump_interfaces.vat
sw_interface_dump
dump_interface_table
quit

A brief bisect shows that it started after one of these commits:

$ git log --pretty=oneline 5507192..8807674
88076749e663e35925c2212eb79e2ec4ce023772 (HEAD -> master, origin/master, origin/HEAD) Enabled untagged vs default functionality Removed 0-tags attribute for default-sub-if config Moved default-sub-if check before untagged
819d5fdb39526386ee8fe4a8729f960e84443cbd VPP-1440: clean up coverity warnings
bf49590c07162be44b21d0e0440e7fb96b2746d5 Stats: vpp_prometheus_export fixes.
94495f2a6a68ac2202b7715ce09620f1ba6fe673 PAPI: Use UNIX domain sockets instead of shared memory
84db4087fa38b8d4c62cbb0787d600950638034c vcl: fix coverity warning
c67cfd22bedf5b45e82ff6820d1d4e6f61ba5187 ip4-local: classify protos that skip csum and src check
2f54c27f7fd0d2c24e7d6b1d48809e8b58ec1abf vhost-user: add support for vlib_log API

Could you please take a look and let us know whether this is a bug or the 
desired behavior?

Thank you.

Peter Mikus
Engineer - Software
Cisco Systems Limited

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#10737): https://lists.fd.io/g/vpp-dev/message/10737
Mute This Topic: https://lists.fd.io/mt/26716121/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] CSIT IPsec AES-GCM 128 tests failing

2018-10-02 Thread Peter Mikus via Lists.Fd.Io
Many thanks!
Seems to be working [0]

[0] 
https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master/332/console
  Ctrl+F ipsec

Peter Mikus
Engineer - Software
Cisco Systems Limited

From: vpp-dev@lists.fd.io  On Behalf Of Radu Nicolau
Sent: Monday, October 1, 2018 5:27 PM
To: Peter Mikus -X (pmikus - PANTHEON TECHNOLOGIES at Cisco) 
Cc: vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] CSIT IPsec AES-GCM 128 tests failing

This should fix it: https://gerrit.fd.io/r/#/c/15078/


From: vpp-dev@lists.fd.io [mailto:vpp-dev@lists.fd.io] On Behalf Of Peter Mikus via Lists.Fd.Io
Sent: Wednesday, September 26, 2018 11:27 AM
To: Nicolau, Radu <radu.nico...@intel.com>
Cc: vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] CSIT IPsec AES-GCM 128 tests failing



Peter Mikus
Engineer - Software
Cisco Systems Limited

From: Nicolau, Radu <radu.nico...@intel.com>
Sent: Tuesday, September 25, 2018 8:38 PM
To: Peter Mikus -X (pmikus - PANTHEON TECHNOLOGIES at Cisco) <pmi...@cisco.com>
Cc: Kinsella, Ray <ray.kinse...@intel.com>
Subject: RE: CSIT IPsec AES-GCM 128 tests failing

Hi Peter,

I'm traveling this week so I won't be able to have a look, but, can you help 
with some more info?


-  Is it happening consistently?

Yes


-  What is the output of show crypto device mapping [verbose]

$ sudo vppctl show int
              Name               Idx    State  MTU (L3/IP4/IP6/MPLS)     Counter          Count
FortyGigabitEthernet88/0/0        1      up          9200/0/0/0
FortyGigabitEthernet88/0/1        2      up          9200/0/0/0     rx packets             173678340
                                                                    rx bytes             10420700400
                                                                    drops                  173678340
                                                                    ip4                    173678340
                                                                    rx-miss                      737
ipsec0                            3      up           0/0/0/0       tx-error               173678340
local0                            0     down          0/0/0/0

$ sudo vppctl show err
   Count          Node             Reason
 167158808    ipsec0-output        interface is down

It is not clear to me why ipsec0 is up while the error says the interface is down.

$ sudo vppctl show crypto
show: unknown input `crypto'

$ sudo vppctl show ipsec
tunnel interfaces
  ipsec0 seq
   seq 0 seq-hi 0 esn 0 anti-replay 0 udp-encap 0
   local-spi 1 local-ip 172.168.1.1
   local-crypto aes-gcm-128 70394d61596b657635793046705443744c45786c
   local-integrity none
   last-seq 0 last-seq-hi 0 esn 0 anti-replay 0 window 

   remote-spi 2 remote-ip 172.168.1.2
   remote-crypto aes-gcm-128 70394d61596b657635793046705443744c45786c
   remote-integrity none



-  What is the vpp version/commit id causing the issue?

Currently using vpp v18.10-rc0~465-gb7020d6~b5194, but I did not bisect. It 
has certainly been broken for several weeks now.


Regards,
Radu

From: Peter Mikus -X (pmikus - PANTHEON TECHNOLOGIES at Cisco) [mailto:pmi...@cisco.com]
Sent: Tuesday, September 25, 2018 3:32 PM
To: vpp-dev@lists.fd.io; Nicolau, Radu <radu.nico...@intel.com>
Cc: csit-...@lists.fd.io
Subject: CSIT IPsec AES-GCM 128 tests failing

Hello devs,

We are observing failing IPsec tests in CSIT-perf, specifically and only in 
combination with aes-gcm (both interface AND tunnel mode, Integ Alg AES GCM 
128).
Note: Integ Alg SHA1 96 is working.

Can you please help decrypt the error message below?

Thank you.


VAT error message (on both DUTs)
-
sw_interface_set_flags error: Unspecified Error


Note: This error happens when trying to bring the interface up.

DUT1 config
---

sw_interface_set_flags sw_if_index 2 admin-up link-up
sw_interface_set_flags sw_if_index 1 admin-up link-up
sw_interface_dump
hw_interface_set_mtu sw_if_index 2 mtu 9200
hw_interface_set_mtu sw_if_index 1 mtu 9200
sw_interface_dump
sw_interface_add_del_address sw_if_index 2 192.168.10.1/24
sw_interface_add_del_address sw_if_index 1 172.168.1.1/24
ip_neighbor_add_del sw_if_index 2 dst 192.168.10.2 mac 68:05:ca:35:79:1c
ip_neighbor_add_del sw_if_index 1 dst 172.168.1.2 mac 68:05:ca:35:76:b1
ip_add_del_route 10.0.0.0/8 via 192.168.10.2  sw_if_index 2 resolve-attempts 10 
count 1
ipsec_tunnel_if_add_del local_spi 1 remote_spi 2 crypto_alg aes-gcm-128 
local_crypto_key 4e4d444178336b4958744a374365356235643545 remote_crypto_key 
4e4d444178336b4958744a374365356235643545  local_ip 172.168.1.1 remote_ip 
172.168.1.2
ip_add_del_route 20.0.0.0/32 via 172.168.1.2 ipsec0
exec set inte

Re: [vpp-dev] CSIT IPsec AES-GCM 128 tests failing

2018-09-26 Thread Peter Mikus via Lists.Fd.Io


Peter Mikus
Engineer - Software
Cisco Systems Limited

From: Nicolau, Radu 
Sent: Tuesday, September 25, 2018 8:38 PM
To: Peter Mikus -X (pmikus - PANTHEON TECHNOLOGIES at Cisco) 
Cc: Kinsella, Ray 
Subject: RE: CSIT IPsec AES-GCM 128 tests failing

Hi Peter,

I'm traveling this week so I won't be able to have a look, but, can you help 
with some more info?


-  Is it happening consistently?

Yes


-  What is the output of show crypto device mapping [verbose]

$ sudo vppctl show int
              Name               Idx    State  MTU (L3/IP4/IP6/MPLS)     Counter          Count
FortyGigabitEthernet88/0/0        1      up          9200/0/0/0
FortyGigabitEthernet88/0/1        2      up          9200/0/0/0     rx packets             173678340
                                                                    rx bytes             10420700400
                                                                    drops                  173678340
                                                                    ip4                    173678340
                                                                    rx-miss                      737
ipsec0                            3      up           0/0/0/0       tx-error               173678340
local0                            0     down          0/0/0/0

$ sudo vppctl show err
   Count          Node             Reason
 167158808    ipsec0-output        interface is down

It is not clear to me why ipsec0 is up while the error says the interface is down.

$ sudo vppctl show crypto
show: unknown input `crypto'

$ sudo vppctl show ipsec
tunnel interfaces
  ipsec0 seq
   seq 0 seq-hi 0 esn 0 anti-replay 0 udp-encap 0
   local-spi 1 local-ip 172.168.1.1
   local-crypto aes-gcm-128 70394d61596b657635793046705443744c45786c
   local-integrity none
   last-seq 0 last-seq-hi 0 esn 0 anti-replay 0 window 

   remote-spi 2 remote-ip 172.168.1.2
   remote-crypto aes-gcm-128 70394d61596b657635793046705443744c45786c
   remote-integrity none



-  What is the vpp version/commit id causing the issue?

Currently using vpp v18.10-rc0~465-gb7020d6~b5194, but I did not bisect. It 
has certainly been broken for several weeks now.
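
For anyone who wants to map that version string back to a commit before bisecting, here is a minimal sketch. It assumes the string follows the layout visible above (tag, `~` commits since the tag, `-g` abbreviated hash, optional `~b` build number); that layout is inferred only from this one string, and the function name and regular expression are illustrative, not part of VPP.

```python
import re

# Assumed layout, inferred only from the string quoted above:
#   v<tag>~<commits since tag>-g<abbreviated hash>[~b<build number>]
VERSION_RE = re.compile(
    r"^v(?P<tag>[^~]+)~(?P<ahead>\d+)-g(?P<commit>[0-9a-f]+)(?:~b(?P<build>\d+))?$"
)

def parse_vpp_version(version):
    """Split a VPP version string into its parts; raise ValueError if it does not match."""
    match = VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    parts = match.groupdict()
    parts["ahead"] = int(parts["ahead"])
    if parts["build"] is not None:
        parts["build"] = int(parts["build"])
    return parts

info = parse_vpp_version("v18.10-rc0~465-gb7020d6~b5194")
print(info["commit"])  # abbreviated commit id to use as the known-bad end of a bisect
```

The abbreviated hash is what `git bisect bad` would take as the known-bad revision; a known-good revision still has to come from an older passing run.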


Regards,
Radu

From: Peter Mikus -X (pmikus - PANTHEON TECHNOLOGIES at Cisco) [mailto:pmi...@cisco.com]
Sent: Tuesday, September 25, 2018 3:32 PM
To: vpp-dev@lists.fd.io; Nicolau, Radu <radu.nico...@intel.com>
Cc: csit-...@lists.fd.io
Subject: CSIT IPsec AES-GCM 128 tests failing

Hello devs,

We are observing failing IPsec tests in CSIT-perf, specifically and only in 
combination with aes-gcm (both interface AND tunnel mode, Integ Alg AES GCM 
128).
Note: Integ Alg SHA1 96 is working.

Can you please help decrypt the error message below?

Thank you.


VAT error message (on both DUTs)
-
sw_interface_set_flags error: Unspecified Error


Note: This error happens when trying to bring the interface up.

DUT1 config
---

sw_interface_set_flags sw_if_index 2 admin-up link-up
sw_interface_set_flags sw_if_index 1 admin-up link-up
sw_interface_dump
hw_interface_set_mtu sw_if_index 2 mtu 9200
hw_interface_set_mtu sw_if_index 1 mtu 9200
sw_interface_dump
sw_interface_add_del_address sw_if_index 2 192.168.10.1/24
sw_interface_add_del_address sw_if_index 1 172.168.1.1/24
ip_neighbor_add_del sw_if_index 2 dst 192.168.10.2 mac 68:05:ca:35:79:1c
ip_neighbor_add_del sw_if_index 1 dst 172.168.1.2 mac 68:05:ca:35:76:b1
ip_add_del_route 10.0.0.0/8 via 192.168.10.2  sw_if_index 2 resolve-attempts 10 
count 1
ipsec_tunnel_if_add_del local_spi 1 remote_spi 2 crypto_alg aes-gcm-128 
local_crypto_key 4e4d444178336b4958744a374365356235643545 remote_crypto_key 
4e4d444178336b4958744a374365356235643545  local_ip 172.168.1.1 remote_ip 
172.168.1.2
ip_add_del_route 20.0.0.0/32 via 172.168.1.2 ipsec0
exec set interface unnumbered ipsec0 use FortyGigabitEthernet88/0/0
sw_interface_set_flags ipsec0 admin-up


DUT2 config
---
sw_interface_set_flags sw_if_index 2 admin-up link-up
sw_interface_set_flags sw_if_index 1 admin-up link-up
sw_interface_dump
hw_interface_set_mtu sw_if_index 2 mtu 9200
hw_interface_set_mtu sw_if_index 1 mtu 9200
sw_interface_dump
sw_interface_add_del_address sw_if_index 2 172.168.1.2/24
sw_interface_add_del_address sw_if_index 1 192.168.20.1/24
ip_neighbor_add_del sw_if_index 1 dst 192.168.20.2 mac 68:05:ca:35:79:19
ip_neighbor_add_del sw_if_index 2 dst 172.168.1.1 mac 68:05:ca:37:25:18
ip_add_del_route 20.0.0.0/8 via 192.168.20.2  sw_if_index 1 resolve-attempts 10 
count 1
ipsec_tunnel_if_add_del local_spi 2 remote_spi 1 crypto_alg aes-gcm-128 
local_crypto_key 4e4d444178336b4958744a374365356235643545 remote_crypto_key 
4e4d444178336b4958744a374365356235643545  local_ip 172.168.1.2 remote_ip 
172.168.1.1
ip_add_del_route 10.0.0.0/32 via 

[vpp-dev] CSIT IPsec AES-GCM 128 tests failing

2018-09-25 Thread Peter Mikus via Lists.Fd.Io
Hello devs,

We are observing failing IPsec tests in CSIT-perf, specifically and only in 
combination with aes-gcm (both interface AND tunnel mode, Integ Alg AES GCM 
128).
Note: Integ Alg SHA1 96 is working.

Can you please help decrypt the error message below?

Thank you.


VAT error message (on both DUTs)
-
sw_interface_set_flags error: Unspecified Error


Note: This error happens when trying to bring the interface up.

DUT1 config
---

sw_interface_set_flags sw_if_index 2 admin-up link-up
sw_interface_set_flags sw_if_index 1 admin-up link-up
sw_interface_dump
hw_interface_set_mtu sw_if_index 2 mtu 9200
hw_interface_set_mtu sw_if_index 1 mtu 9200
sw_interface_dump
sw_interface_add_del_address sw_if_index 2 192.168.10.1/24
sw_interface_add_del_address sw_if_index 1 172.168.1.1/24
ip_neighbor_add_del sw_if_index 2 dst 192.168.10.2 mac 68:05:ca:35:79:1c
ip_neighbor_add_del sw_if_index 1 dst 172.168.1.2 mac 68:05:ca:35:76:b1
ip_add_del_route 10.0.0.0/8 via 192.168.10.2  sw_if_index 2 resolve-attempts 10 
count 1
ipsec_tunnel_if_add_del local_spi 1 remote_spi 2 crypto_alg aes-gcm-128 
local_crypto_key 4e4d444178336b4958744a374365356235643545 remote_crypto_key 
4e4d444178336b4958744a374365356235643545  local_ip 172.168.1.1 remote_ip 
172.168.1.2
ip_add_del_route 20.0.0.0/32 via 172.168.1.2 ipsec0
exec set interface unnumbered ipsec0 use FortyGigabitEthernet88/0/0
sw_interface_set_flags ipsec0 admin-up


DUT2 config
---
sw_interface_set_flags sw_if_index 2 admin-up link-up
sw_interface_set_flags sw_if_index 1 admin-up link-up
sw_interface_dump
hw_interface_set_mtu sw_if_index 2 mtu 9200
hw_interface_set_mtu sw_if_index 1 mtu 9200
sw_interface_dump
sw_interface_add_del_address sw_if_index 2 172.168.1.2/24
sw_interface_add_del_address sw_if_index 1 192.168.20.1/24
ip_neighbor_add_del sw_if_index 1 dst 192.168.20.2 mac 68:05:ca:35:79:19
ip_neighbor_add_del sw_if_index 2 dst 172.168.1.1 mac 68:05:ca:37:25:18
ip_add_del_route 20.0.0.0/8 via 192.168.20.2  sw_if_index 1 resolve-attempts 10 
count 1
ipsec_tunnel_if_add_del local_spi 2 remote_spi 1 crypto_alg aes-gcm-128 
local_crypto_key 4e4d444178336b4958744a374365356235643545 remote_crypto_key 
4e4d444178336b4958744a374365356235643545  local_ip 172.168.1.2 remote_ip 
172.168.1.1
ip_add_del_route 10.0.0.0/32 via 172.168.1.1 ipsec0
exec set interface unnumbered ipsec0 use FortyGigabitEthernet88/0/1
sw_interface_set_flags ipsec0 admin-up
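
One cheap sanity check on the two configs above is the length of the hex key. The sketch below only decodes the key and counts bytes. RFC 4106 specifies 20 octets of keying material for AES-GCM with a 128-bit key (a 16-byte key plus a 4-byte salt); whether the VPP API in this build expects the salt as part of `local_crypto_key` is an assumption here, not something confirmed in this thread. The helper name is made up for illustration.

```python
AES_KEY_BYTES = 16  # AES-128 key
SALT_BYTES = 4      # salt carried in AES-GCM ESP keying material (RFC 4106)

def key_length_bytes(hex_key):
    """Length in bytes of a hex-encoded key; raises ValueError on malformed hex."""
    return len(bytes.fromhex(hex_key))

# Key copied from the DUT1/DUT2 configs above.
config_key = "4e4d444178336b4958744a374365356235643545"
n = key_length_bytes(config_key)
print(n, n == AES_KEY_BYTES + SALT_BYTES)
```

If the lengths did not line up, that would be one candidate explanation for an interface that cannot be brought up, but it would still need confirming against the VPP source.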


1969/12/31 16:00:00:086 warn   dpdk   EAL init args: -c 18 -n 4 --huge-dir /run/vpp/hugepages --file-prefix vpp -w :88:00.1 -w :88:00.0 -w :86:01.0 --master-lcore 19 --socket-mem 1024,1024
1969/12/31 16:00:00:967 notice dpdk   EAL: Detected 36 lcore(s)
1969/12/31 16:00:00:967 notice dpdk   EAL: Detected 2 NUMA nodes
1969/12/31 16:00:00:967 notice dpdk   EAL: Multi-process socket /var/run/dpdk/vpp/mp_socket
1969/12/31 16:00:00:967 notice dpdk   EAL: No free hugepages reported in hugepages-1048576kB
1969/12/31 16:00:00:967 notice dpdk   EAL: Probing VFIO support...
1969/12/31 16:00:00:967 notice dpdk   EAL: VFIO support initialized
1969/12/31 16:00:00:967 notice dpdk   EAL: PCI device :86:01.0 on NUMA socket 1
1969/12/31 16:00:00:967 notice dpdk   EAL:   probe driver: 8086:443 qat
1969/12/31 16:00:00:967 notice dpdk   qat_comp_dev_create(): Compression PMD not supported on QAT dh895xcc
1969/12/31 16:00:00:967 notice dpdk   EAL: PCI device :88:00.0 on NUMA socket 1
1969/12/31 16:00:00:967 notice dpdk   EAL:   probe driver: 8086:1583 net_i40e
1969/12/31 16:00:00:967 notice dpdk   EAL: PCI device :88:00.1 on NUMA socket 1
1969/12/31 16:00:00:967 notice dpdk   EAL:   probe driver: 8086:1583 net_i40e
1969/12/31 16:00:00:967 notice dpdk   Invalid port_id=2

Peter Mikus
Engineer - Software
Cisco Systems Limited

View/Reply Online (#10643): https://lists.fd.io/g/vpp-dev/message/10643


[vpp-dev] CSIT-1282 Migrate from Nexus.fd.io to packagecloud.io

2018-09-14 Thread Peter Mikus via Lists.Fd.Io
Hello community,

As of this week, with CSIT patch [0] merged, we are downloading artifacts 
directly from packagecloud.io [1] instead of nexus.fd.io. This move was 
discussed on the CSIT public call on Wed-13.

As there is some dependency on other commits (especially the new bootstrap 
design), this change will not be backported to previous CSIT branches (unless 
someone wants to spend time on it). I expect the full migration to be smooth 
with new oper* or rls* branches as they are created.

In case of any questions, please discuss on csit-dev.
In case of any issues spotted on our side that are directly related to [1] 
hosting, we will report them.

Thanks.

[0] https://gerrit.fd.io/r/#/c/14786/
[1] https://packagecloud.io/fdio/

Peter Mikus
Engineer - Software
Cisco Systems Limited

View/Reply Online (#10498): https://lists.fd.io/g/vpp-dev/message/10498


[vpp-dev] vpp_api_test missing after vpp installation

2018-09-10 Thread Peter Mikus via Lists.Fd.Io
Hello vpp-dev,

In CSIT we are facing an issue: the "vpp_api_test" binary, normally located in 
"/usr/bin", is missing after the *.deb packages are installed on the system.

testuser@s14-t32-sut1:/ $ sudo find / -name vpp_api_test
testuser@s14-t32-sut1:/ $

testuser@s14-t32-sut1:/ $ ll /usr/bin/vpp*
-rwxr-xr-x 1 root root 881800 Sep  7 15:19 /usr/bin/vpp*
-rwxr-xr-x 1 root root  24246 Sep  7 15:19 /usr/bin/vppapigen*
-rwxr-xr-x 1 root root  14712 Sep  7 15:19 /usr/bin/vppctl*
-rwxr-xr-x 1 root root  10544 Sep  7 15:19 /usr/bin/vpp_get_metrics*
-rwxr-xr-x 1 root root  14768 Sep  7 15:19 /usr/bin/vpp_prometheus_export*
-rwxr-xr-x 1 root root  10528 Sep  7 15:19 /usr/bin/vpp_restart*


From the CSIT perspective this is crucial, as "vpp_api_test" is used as the 
primary tool for configuring VPP.

A short bisect narrows it down to this commit-id range:
Last good: 4e588aa
First bad: 74cac88

git log --pretty="short" 4e588aa..74cac88

commit 74cac8839efae6a69baea031fb01602ef8907e8a
Author: Florin Coras 

session: fix reentrant listens

commit d790c7e1fa5f1accb621aa75089212be586c137f
Author: Matthew Smith 

update regex used by rpm build to find lib files

commit 36feebb42f1fb9734c1b99b4afae87d3c8233548
Author: Dave Barach 

Improve NTP / kernel time change event handling

commit 833de8cab672c806176d580a1ebc001f394b2eaf
Author: Damjan Marion 

cmake: set packaging component for different files

commit 0745036cb9ead1a3aaf9686c8c8046cb7285ea52
Author: Marco Varlese 

Cavium OcteonTX: cache line fix

A quick process of elimination among those changes yields a primary suspect.
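
For longer ranges the same narrowing can be automated. The sketch below is a plain binary search over an ordered commit list, which is the idea behind `git bisect`; it is not what was actually run here. The commit labels and the pass/fail table are placeholders, and `is_good` stands in for "install the packages and check that the binary exists".

```python
def first_bad(commits, is_good):
    """Return the first commit for which is_good() is False.

    `commits` is ordered oldest to newest and lies between a known-good
    and a known-bad revision, so the last entry is assumed to be bad.
    """
    lo, hi = 0, len(commits) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_good(commits[mid]):
            lo = mid + 1  # breakage was introduced later
        else:
            hi = mid      # mid is bad; the culprit is mid or earlier
    return commits[lo]

# Placeholder labels, oldest first; these are not real commit ids.
commits = ["c1", "c2", "c3", "c4", "c5"]
# Hypothetical outcome table standing in for the real package check.
has_binary = {"c1": True, "c2": False, "c3": False, "c4": False, "c5": False}
culprit = first_bad(commits, lambda c: has_binary[c])
print(culprit)
```

With five commits this needs at most three checks instead of five, which matters when every check means a full build and package install.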

Can you please take a look and advise?

Thank you.

Peter Mikus
Engineer - Software
Cisco Systems Limited

View/Reply Online (#10451): https://lists.fd.io/g/vpp-dev/message/10451


[vpp-dev] CSIT rls1807 report - draft out

2018-08-10 Thread Peter Mikus via Lists.Fd.Io
The CSIT rls1807 draft report is out:

  https://docs.fd.io/csit/rls1807/report/index.html

Please give it a scan and send feedback to csit-...@lists.fd.io.

We're still waiting for a few tests to finish: vpp-ligato and more MLR/MRR 
performance tests. We expect these to be done in the next couple of days and 
will send an announcement email once the final version is posted.

Peter Mikus
Engineer - Software
Cisco Systems Limited

View/Reply Online (#10099): https://lists.fd.io/g/vpp-dev/message/10099


Re: [vpp-dev] [FD.io Helpdesk #56625] Nexus fd.io.master.centos7 VPP artifacts

2018-08-08 Thread Peter Mikus via Lists.Fd.Io
Hello,

We are creating branches on a weekly basis, and CSIT is verified in a weekly 
job. So the question is whether there is an option to set a "date" or a 
"number" when limiting the repo.

Given a cadence of up to 10 merges per day (artifacts posted on Nexus), a safe 
value is around 100-120 artifacts.
This applies to the master branch of VPP. For stable branches we should be OK 
with just 10-15 artifacts.
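
To make the arithmetic behind those figures explicit, here is a small sketch. It assumes every merge posts exactly one artifact and that the peak rate of 10 merges per day holds every day, so it gives a worst-case window; the function name is illustrative.

```python
def retention_window_days(artifacts_kept, merges_per_day):
    """Days of history that stay downloadable if every merge posts one artifact."""
    return artifacts_kept / merges_per_day

MERGES_PER_DAY = 10  # upper bound quoted above

for kept in (10, 15, 100, 120):
    days = retention_window_days(kept, MERGES_PER_DAY)
    print(f"{kept:>3} artifacts -> {days:.1f} days")
```

Under these assumptions, 10-15 artifacts cover only 1 to 1.5 days, which is shorter than a weekly job cadence, while 100-120 artifacts cover 10 to 12 days.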

Peter Mikus
Engineer – Software
Cisco Systems Limited

-----Original Message-----
From: Vanessa Valderrama via RT [mailto:fdio-helpd...@rt.linuxfoundation.org] 
Sent: Wednesday, August 08, 2018 5:22 PM
To: Peter Mikus -X (pmikus - PANTHEON TECHNOLOGIES at Cisco) 
Cc: csit-...@lists.fd.io; infra-steer...@lists.fd.io; vpp-dev@lists.fd.io
Subject: [FD.io Helpdesk #56625] Nexus fd.io.master.centos7 VPP artifacts

Peter,

How many artifacts do you need us to retain for your testing?

Thank you,
Vanessa


On Mon Aug 06 04:53:29 2018, pmi...@cisco.com wrote:
> Hello Vanessa,
> 
> For CSIT it is not about release or not. We would need to increase the 
> cadence of our weekly jobs to daily. Currently all CSIT jobs are 
> failing, as VPP produces more than 10-15 artifacts in a week.
> Our defined stable versions of VPP (updated once a week) are no longer 
> in the repo, or become obsolete faster than we update them. This is 
> impacting everything.
> 
> Right now we are *blocked* and we need to work out a new solution to adopt.
> 
> One option is that we start building VPP from scratch for every job, 
> as we can no longer use artifacts. This will cause a huge overhead on 
> infrastructure, and execution times will grow, as Nexus acted as an 
> optimization for us.
> Right now Nexus is not an option for us anymore. This also means that 
> Nexus artifacts will not be tested by CSIT.
> 
> Peter Mikus
> Engineer – Software
> Cisco Systems Limited
> 
> -----Original Message-----
> From: Vanessa Valderrama via RT [mailto:fdio-helpd...@rt.linuxfoundation.org]
> Sent: Tuesday, July 31, 2018 8:16 PM
> To: Peter Mikus -X (pmikus - PANTHEON TECHNOLOGIES at Cisco)
> Cc: csit-...@lists.fd.io; infra-steer...@lists.fd.io; vpp-dev@lists.fd.io
> Subject: [FD.io Helpdesk #56625] Nexus fd.io.master.centos7 VPP artifacts
> 
> Peter,
> 
> We need to make a decision on the number of artifacts to keep. I'd 
> like to propose the following
> 
> previous release repos - 10 packages per subproject
> master - 10 to 15 packages per subproject
> 
> Thank you,
> Vanessa
> 
> On Tue Jun 05 00:51:02 2018, pmi...@cisco.com wrote:
> > Hello Vanessa,
> >
> > Thank you for an explanation. Indeed this will impact certain things 
> > that are planned like "automatic bisecting" (downloading and testing 
> > range of artifacts). Let me analyze current situation with CSIT team 
> > and get back to you.
> >
> > Peter Mikus
> > Engineer – Software
> > Cisco Systems Limited
> >
> >
> > -----Original Message-----
> > From: Vanessa Valderrama via RT [mailto:fdio-helpd...@rt.linuxfoundation.org]
> > Sent: Monday, June 04, 2018 9:47 PM
> > To: Peter Mikus -X (pmikus - PANTHEON TECHNOLOGIES at Cisco)
> > Cc: csit-...@lists.fd.io; infra-steer...@lists.fd.io; vpp-dev@lists.fd.io
> > Subject: [FD.io Helpdesk #56625] Nexus fd.io.master.centos7 VPP artifacts
> >
> > Peter,
> >
> > The fd.io.master.centos7 repo had to be cleaned up significantly to 
> > eliminate Jenkins build timeout errors.  This was discussed in the 
> > TSC. Going forward we'll only be keeping an average of 10 of the 
> > current release candidate artifacts in the repository.  Please let 
> > me know if this retention policy causes an issue for you.
> >
> > We do need to clean up the other repositories as well.  Please let 
> > me know if you'd like to discuss retention policies.  I'll hold off 
> > on cleaning up other repositories for now.
> >
> > Thank you,
> > Vanessa
> >
> > On Wed May 30 10:20:21 2018, pmi...@cisco.com wrote:
> > > Hello,
> > >
> > > I have recently spotted that CentOS repo got reduced and old 
> > > binaries are missing [1].
> > >
> > > Is this expected?
> > > Will the similar be done for Ubuntu repos?
> > >
> > > Was this announced somewhere?
> > >
> > > Thank you.
> > >
> > > [1]
> > > https://nexus.fd.io/content/repositories/fd.io.master.centos7/io/f
> > > d/
> > > vp
> > > p/vpp/
> > >
> > > Peter Mikus
> > > Engineer - Software
> > > Cisco Systems Limited

Re: [vpp-dev] [FD.io Helpdesk #56625] Nexus fd.io.master.centos7 VPP artifacts

2018-08-06 Thread Peter Mikus via Lists.Fd.Io
Hello Vanessa,

For CSIT it is not about release or not. We would need to increase the cadence 
of our weekly jobs to daily. Currently all CSIT jobs are failing, as VPP 
produces more than 10-15 artifacts in a week.
Our defined stable versions of VPP (updated once a week) are no longer in the 
repo, or become obsolete faster than we update them. This is impacting 
everything.

Right now we are *blocked* and we need to work out a new solution to adopt.

One option is that we start building VPP from scratch for every job, as we can 
no longer use artifacts. This will cause a huge overhead on infrastructure, 
and execution times will grow, as Nexus acted as an optimization for us.
Right now Nexus is not an option for us anymore. This also means that Nexus 
artifacts will not be tested by CSIT.

Peter Mikus
Engineer – Software
Cisco Systems Limited

-----Original Message-----
From: Vanessa Valderrama via RT [mailto:fdio-helpd...@rt.linuxfoundation.org] 
Sent: Tuesday, July 31, 2018 8:16 PM
To: Peter Mikus -X (pmikus - PANTHEON TECHNOLOGIES at Cisco) 
Cc: csit-...@lists.fd.io; infra-steer...@lists.fd.io; vpp-dev@lists.fd.io
Subject: [FD.io Helpdesk #56625] Nexus fd.io.master.centos7 VPP artifacts

Peter,

We need to make a decision on the number of artifacts to keep. I'd like to 
propose the following

previous release repos - 10 packages per subproject
master - 10 to 15 packages per subproject

Thank you,
Vanessa

On Tue Jun 05 00:51:02 2018, pmi...@cisco.com wrote:
> Hello Vanessa,
> 
> Thank you for an explanation. Indeed this will impact certain things 
> that are planned like "automatic bisecting" (downloading and testing 
> range of artifacts). Let me analyze current situation with CSIT team 
> and get back to you.
> 
> Peter Mikus
> Engineer – Software
> Cisco Systems Limited
> 
> 
> -----Original Message-----
> From: Vanessa Valderrama via RT [mailto:fdio-helpd...@rt.linuxfoundation.org]
> Sent: Monday, June 04, 2018 9:47 PM
> To: Peter Mikus -X (pmikus - PANTHEON TECHNOLOGIES at Cisco)
> Cc: csit-...@lists.fd.io; infra-steer...@lists.fd.io; vpp-dev@lists.fd.io
> Subject: [FD.io Helpdesk #56625] Nexus fd.io.master.centos7 VPP artifacts
> 
> Peter,
> 
> The fd.io.master.centos7 repo had to be cleaned up significantly to 
> eliminate Jenkins build timeout errors.  This was discussed in the 
> TSC. Going forward we'll only be keeping an average of 10 of the 
> current release candidate artifacts in the repository.  Please let me 
> know if this retention policy causes an issue for you.
> 
> We do need to clean up the other repositories as well.  Please let me 
> know if you'd like to discuss retention policies.  I'll hold off on 
> cleaning up other repositories for now.
> 
> Thank you,
> Vanessa
> 
> On Wed May 30 10:20:21 2018, pmi...@cisco.com wrote:
> > Hello,
> >
> > I have recently spotted that CentOS repo got reduced and old 
> > binaries are missing [1].
> >
> > Is this expected?
> > Will the similar be done for Ubuntu repos?
> >
> > Was this announced somewhere?
> >
> > Thank you.
> >
> > [1] https://nexus.fd.io/content/repositories/fd.io.master.centos7/io/fd/vpp/vpp/
> >
> > Peter Mikus
> > Engineer - Software
> > Cisco Systems Limited
> 
> 
> 



View/Reply Online (#10042): https://lists.fd.io/g/vpp-dev/message/10042


Re: [vpp-dev] CSIT - sw_interface_set_flags admin-up link-up failing

2018-08-03 Thread Peter Mikus via Lists.Fd.Io
Hello vpp-dev,

Can you please take a look and advise? Currently 50-70% of CSIT tests on SKX 
(Ubuntu 18.04) are failing. About 10% are affected on Haswell testbeds (Ubuntu 
16.04).

Thank you in advance.

Peter Mikus
Engineer – Software
Cisco Systems Limited


-----Original Message-----
From: Vratko Polak -X (vrpolak - PANTHEON TECHNOLOGIES at Cisco)
Sent: Thursday, August 02, 2018 1:05 PM
To: Vratko Polak -X (vrpolak - PANTHEON TECHNOLOGIES at Cisco); Peter Mikus -X (pmikus - PANTHEON TECHNOLOGIES at Cisco); Ray Kinsella; vpp-dev@lists.fd.io
Subject: RE: [vpp-dev] CSIT - sw_interface_set_flags admin-up link-up failing

Added a Jira comment [1] with some details and attached the same dump (just 
compressed better) to the Jira bug report.

Vratko.

[1] https://jira.fd.io/browse/VPP-1361?focusedCommentId=13104&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13104

-----Original Message-----
From: vpp-dev@lists.fd.io On Behalf Of Vratko Polak -X (vrpolak - PANTHEON TECHNOLOGIES at Cisco) via Lists.Fd.Io
Sent: Wednesday, 2018-August-01 18:13
To: Peter Mikus -X (pmikus - PANTHEON TECHNOLOGIES at Cisco); Ray Kinsella; vpp-dev@lists.fd.io
Cc: vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] CSIT - sw_interface_set_flags admin-up link-up failing

> VPP is not crashing so no core dump are available.

I tried to use the "gcore" command to create a core dump from the running VPP.
So far I got this [0] archive, compressed to around 25 MB, but the core file 
inside is around 160 GB.

I am not sure how to make it smaller; even with small numbers in startup.conf, 
the core file is around 140 GB.

Vratko.

[0] https://jenkins.fd.io/sandbox/job/csit-vpp-perf-verify-master-2n-skx/13/artifact/archive/DUT1_cores.tar.xz

-----Original Message-----
From: Peter Mikus -X (pmikus - PANTHEON TECHNOLOGIES at Cisco)
Sent: Tuesday, 2018-July-31 13:25
To: Ray Kinsella; vpp-dev@lists.fd.io
Cc: csit-...@lists.fd.io; Vratko Polak -X (vrpolak - PANTHEON TECHNOLOGIES at Cisco); yulong@intel.com
Subject: RE: [vpp-dev] CSIT - sw_interface_set_flags admin-up link-up failing

Hello,

Thanks to Vratko (cc), who tested the latest master with DPDK 18.02.2 [0]. The 
issue is there as well.

I cannot confirm whether "no JSON data.VAT" is related. The bad thing is that 
there is no meaningful return message with more verbose output.

(We see this on pretty much all NICs in LF and on all TBs.)

[0] https://jenkins.fd.io/sandbox/job/vpp-csit-verify-hw-perf-master-2n-skx/6/consoleFull

Peter Mikus
Engineer – Software
Cisco Systems Limited

-----Original Message-----
From: Ray Kinsella [mailto:m...@ashroe.eu]
Sent: Tuesday, July 31, 2018 12:06 PM
To: Peter Mikus -X (pmikus - PANTHEON TECHNOLOGIES at Cisco); vpp-dev@lists.fd.io; yulong@intel.com
Subject: Re: [vpp-dev] CSIT - sw_interface_set_flags admin-up link-up failing

Hi Peter,

It may be unrelated, but I think we see this issue also pretty regularly with 
FD.io VPP 18.04 and the x520, on our local test rig.

The error we typically see is "VAT command sw_interface_set_flags sw_if_index 1 
admin-up: no JSON data.VAT".

Do you think it is the same or a separate issue?

Ray K


On 30/07/2018 08:02, Peter Mikus via Lists.Fd.Io wrote:
> Hello vpp-dev,
> 
> I am looking for consultation. We started to test VPP for the report on 
> all LF CSIT testbeds, Skylakes and Haswells.
> 
> We are observing weird behavior. In each test we use this sequence to 
> first bring both physical interfaces up by VAT:
> 
>    sw_interface_set_flags sw_if_index <idx> admin-up (I 
> also tried sw_interface_set_flags sw_if_index <idx> admin-up link-up)
> 
> After setting all interfaces up, we test whether the interfaces are 
> really up by VAT (loop of 30 iterations, 1 s between API call checks): 
> “sw_interface_dump”.
> 
> It wasn’t an issue in the past, but recently we started seeing that 
> sw_interface_dump reports interfaces as link_down (admin-up).
> 
> Notes/symptoms:
> 
> - Our sw_interface_dump check runs 30x (1 s interval) in a loop.
> 
> - Link-down is random: sometimes both interfaces are link-up, sometimes 
> just one, and sometimes both links are down.
> 
> - _It is not TB related_, nor cabling related: we see it on 
> Haswells-3node in about 1 out of 70 tests, Skylakes-2node 1 out of 70, 
> but on Skylake-3node in more than half of the tests.
> 
> - Checking state during the test reveals that interfaces are link-down 
> (show int), so “sw_interface_dump” is reporting the state correctly.
> 
> - Running the CLI command “set interface state … up” during the test 
> does bring the interfaces up (but it is hard to check the timing here).
> 
> - Affected are mostly x520 and x710, but that is most probably a 
> statistical effect (low coverage of other NICs like xxv710 and xl710).
> 
> - We have seen this in master vpp as well as rc2 vpp.
> 
> -It is not clear 
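
The check described in the quoted message (up to 30 polls, 1 s apart, until every interface reports link-up) can be sketched roughly as follows. This is not the CSIT implementation: `get_link_states` is a hypothetical caller-supplied function that would wrap the real `sw_interface_dump` query, and the interface names are placeholders.

```python
import time

def wait_for_link_up(get_link_states, attempts=30, interval_s=1.0, sleep=time.sleep):
    """Poll until every interface reports link-up or the attempts run out.

    `get_link_states` is a caller-supplied callable returning a dict of
    {interface name: bool}; how it talks to VPP is left to the caller.
    """
    states = {}
    for attempt in range(1, attempts + 1):
        states = get_link_states()
        if states and all(states.values()):
            return True, attempt, states
        if attempt < attempts:
            sleep(interval_s)
    return False, attempts, states

# Stand-in for the real query: the link comes up on the third poll.
replies = iter([
    {"if1": False, "if2": False},
    {"if1": True, "if2": False},
    {"if1": True, "if2": True},
])
ok, polls, last = wait_for_link_up(lambda: next(replies), sleep=lambda _s: None)
print(ok, polls)
```

Returning the number of polls and the last observed states makes it possible to log how long the link took to come up, which is the timing information the thread says is hard to get by hand.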

Re: [vpp-dev] CSIT - sw_interface_set_flags admin-up link-up failing

2018-07-31 Thread Peter Mikus via Lists.Fd.Io
Hello,

Thanks to Vratko (cc), who tested the latest master with DPDK 18.02.2 [0]. The 
issue is there as well.

I cannot confirm whether "no JSON data.VAT" is related. The bad thing is that 
there is no meaningful return message with more verbose output.

(We see this on pretty much all NICs in LF and on all TBs.)

[0] https://jenkins.fd.io/sandbox/job/vpp-csit-verify-hw-perf-master-2n-skx/6/consoleFull

Peter Mikus
Engineer - Software
Cisco Systems Limited

-----Original Message-----
From: Ray Kinsella [mailto:m...@ashroe.eu]
Sent: Tuesday, July 31, 2018 12:06 PM
To: Peter Mikus -X (pmikus - PANTHEON TECHNOLOGIES at Cisco); vpp-dev@lists.fd.io; yulong@intel.com
Subject: Re: [vpp-dev] CSIT - sw_interface_set_flags admin-up link-up failing

Hi Peter,

It may be unrelated, but I think we see this issue also pretty regularly with 
FD.io VPP 18.04 and the x520, on our local test rig.

The error we typically see is "VAT command sw_interface_set_flags sw_if_index 1 
admin-up: no JSON data.VAT".

Do you think it is the same or a separate issue?

Ray K


On 30/07/2018 08:02, Peter Mikus via Lists.Fd.Io wrote:
> Hello vpp-dev,
> 
> I am looking for consultation. We started to test VPP for report on 
> all LF CSIT testbeds Skylakes and Haswells.
> 
> We are observing weird behavior. In each test we are using sequence to 
> first bring the both interfaces (physical up) by VAT:
> 
>    sw_interface_set_flags sw_if_index  admin-up (I 
> also tried sw_interface_set_flags sw_if_index idx admin-up link-up)
> 
> After setting all interfaces UP we are testing if interfaces are 
> really UP by VAT (loop 30times, 1s between API call check): 
> "sw_interface_dump".
> 
> It wasn't an issue in past but recently we start seeing that 
> sw_interface_dump is reporting interfaces as link_down (admin-up).
> 
> Notes/symptoms:
> 
> -Our sw_interface_dump check is running 30x (1s interval) in loop.
> 
> -Link-down is random, sometimes both interfaces are link-up sometimes 
> just one and sometimes both link are down.
> 
> -_It is not TB related_, nor cabling related, we see it on 
> Haswells-3node in like 1 out of 70 tests, Skylakes-2node 1 out of 70, 
> but on Skylake-3node more than half of the tests.
> 
> -Checking state during test reveals that interfaces are link-down 
> (show int) so "sw_interface_dump" is reporting state correctly.
> 
> -Doing CLI during test "set interface state ... up" does bring 
> interfaces UP -> (but it is hard to check the timing here).
> 
> -Affected are mostly x520 and x710, but that is most probably because 
> of statistics (low coverage of other NICs like xxv710 and xl710).
> 
> -We have seen this in master vpp as well as rc2 vpp.
> 
> -It is not clear when this started to happen, so bisecting would take 
> a lot of time.
> 
> -This was also spotted on VIRL and on Memif interfaces, which leads us 
> to suspect that this is related to the API, not the HW.
> 
> Do you have an idea what we could check further? VPP is not crashing 
> so no core dump are available. This issue is not 100% replicable which 
> makes it hard to debug.
> 
> Is there a way to get more verbose error from the api call mentioned 
> to reveal more information?
> 
> **
> 
> Thank you.
> 
> *Peter Mikus*
> Engineer - Software
> 
> *Cisco Systems Limited*
> 
> http://www.cisco.com/web/europe/images/email/signature/logo05.jpg
> 
> Think before you print.
> 
> This email may contain confidential and privileged material for the 
> sole use of the intended recipient. Any review, use, distribution or 
> disclosure by others is strictly prohibited. If you are not the 
> intended recipient (or authorized to receive for the recipient), 
> please contact the sender by reply email and delete all copies of this 
> message.
> 
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/index.html
> 
> 
> 
> -=-=-=-=-=-=-=-=-=-=-=-
> Links: You receive all messages sent to this group.
> 
> View/Reply Online (#9967): https://lists.fd.io/g/vpp-dev/message/9967
> Mute This Topic: https://lists.fd.io/mt/23857615/675355
> Group Owner: vpp-dev+ow...@lists.fd.io
> Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [m...@ashroe.eu]
> -=-=-=-=-=-=-=-=-=-=-=-
> 
View/Reply Online (#9983): https://lists.fd.io/g/vpp-dev/message/9983


[vpp-dev] CSIT - sw_interface_set_flags admin-up link-up failing

2018-07-30 Thread Peter Mikus via Lists.Fd.Io
Hello vpp-dev,

I am looking for advice. We have started testing VPP for the report on all LF 
CSIT testbeds (Skylakes and Haswells).
We are observing weird behavior. In each test we first bring both physical 
interfaces up via VAT:

  sw_interface_set_flags sw_if_index <idx> admin-up (I also tried 
sw_interface_set_flags sw_if_index idx admin-up link-up)

After setting all interfaces up, we check whether they are really up via VAT 
(a loop of 30 iterations, 1 s between API calls): "sw_interface_dump".
This was not an issue in the past, but recently we started seeing 
sw_interface_dump report interfaces as link-down (while admin-up).
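
The check described above can be sketched as a simple polling loop. This is 
only a minimal sketch, not the CSIT implementation: get_link_states is a 
hypothetical callable standing in for whatever wraps the "sw_interface_dump" 
call, and it is assumed to return a mapping of interface name to a boolean 
link state.

```python
import time


def wait_for_link_up(get_link_states, attempts=30, interval=1.0, sleep=time.sleep):
    """Poll until every interface reports link-up or the attempts run out.

    get_link_states is a hypothetical callable (an assumption, not a VPP API)
    returning {interface_name: link_is_up}.
    Returns (ok, down), where down lists the interfaces still link-down.
    """
    down = []
    for attempt in range(attempts):
        states = get_link_states()
        down = sorted(name for name, up in states.items() if not up)
        if not down:
            return True, []
        # Sleep between polls, but not after the final one.
        if attempt < attempts - 1:
            sleep(interval)
    return False, down
```

Returning the names of the interfaces that stayed down, rather than a bare 
boolean, makes it easier to tell the "just one" and "both" cases apart in 
the test logs.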

Notes/symptoms:
-   Our sw_interface_dump check runs 30x (1 s interval) in a loop.
-   Link-down is random: sometimes both interfaces are link-up, sometimes just 
one, and sometimes both links are down.
-   It is not TB related, nor cabling related. We see it on Haswells-3node in 
about 1 out of 70 tests, on Skylakes-2node in 1 out of 70, but on Skylake-3node 
in more than half of the tests.
-   Checking the state during a test reveals that the interfaces are link-down 
(show int), so "sw_interface_dump" is reporting the state correctly.
-   Running the CLI command "set interface state ... up" during a test does 
bring the interfaces UP (but it is hard to check the timing here).
-   Mostly x520 and x710 are affected, but that is most probably a statistical 
effect (low coverage of other NICs like xxv710 and xl710).
-   We have seen this in master vpp as well as rc2 vpp.
-   It is not clear when this started to happen, so bisecting would take a lot 
of time.
-   This was also spotted on VIRL and on Memif interfaces, which leads us to 
suspect that this is related to the API, not the HW.

Do you have an idea what we could check further? VPP is not crashing, so no 
core dumps are available. The issue is not 100% reproducible, which makes it 
hard to debug.

Is there a way to get a more verbose error from the API call mentioned above, 
to reveal more information?

Thank you.

Peter Mikus
Engineer - Software
Cisco Systems Limited

View/Reply Online (#9967): https://lists.fd.io/g/vpp-dev/message/9967


Re: [csit-dev] [vpp-dev] Parallel test execution in VPP Test Framework

2018-07-27 Thread Peter Mikus via Lists.Fd.Io
Hello,


>  What is the “significant problem” you’re running into?

The problem can be better described as follows: when Python spawns N instances 
of the VPP process, all processes are, for an unknown reason, given affinity 
0x2 (binary 10). This can be verified with taskset -p <pid>. CFS then places all 
VPP processes on the same core, which is inefficient on a multicore Jenkins 
slave container.
The default VPP startup.conf is not modified, so there is no input saying 
where to pin the VPP threads. One might simply say or think that this is 
related to the Python multiprocessing/subprocess.Popen code, which is 
hard-setting the affinity mask to 0x2.

Juraj and Maciek have proposed multiple workarounds, but none of them answers 
why this is happening.
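
To make the mask arithmetic concrete: affinity 0x2 is binary 10, which has 
only bit 1 set, so it selects CPU 1 alone (matching the "core 1" observation 
in the quoted message below). The following is a minimal sketch for decoding 
such masks and for reading the mask a process currently has. The helper names 
are made up for illustration, and os.sched_getaffinity is only available on 
Linux, hence the guard.

```python
import os


def mask_to_cpus(mask):
    """Expand an affinity bitmask into the list of CPU ids it selects."""
    cpus = []
    cpu = 0
    while mask:
        if mask & 1:
            cpus.append(cpu)
        mask >>= 1
        cpu += 1
    return cpus


def cpus_to_mask(cpus):
    """Fold an iterable of CPU ids back into a bitmask."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


def process_affinity_mask(pid=0):
    """Return the affinity mask of a process (pid 0 means the caller).

    Returns None on platforms without os.sched_getaffinity.
    """
    if not hasattr(os, "sched_getaffinity"):
        return None
    return cpus_to_mask(os.sched_getaffinity(pid))
```

Calling process_affinity_mask() in the test runner itself, and again with the 
pid of a freshly spawned child, would show whether the narrow mask is already 
present in the parent or only appears in the children.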

Peter Mikus
Engineer – Software
Cisco Systems Limited

From: csit-...@lists.fd.io [mailto:csit-...@lists.fd.io] On Behalf Of Maciek 
Konstantynowicz (mkonstan) via Lists.Fd.Io
Sent: Friday, July 27, 2018 6:53 PM
To: Alec Hothan (ahothan) ; Juraj Linkeš 

Cc: csit-...@lists.fd.io
Subject: Re: [csit-dev] [vpp-dev] Parallel test execution in VPP Test Framework

Alec, This is about make test and not real packet forwarding. Per Juraj’s patch 
[1]

Juraj, My understanding is that if you’re starting VPP without specifying core 
placement in startup.conf [2] cpu {..}, then Linux CFS will place the 
threads onto available cpu core resources. If you’re saying this is not the 
case, and indeed the wiki comment indicates this, then the way to address it is 
to specify a different core for the main.c thread per vpp instance.
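
As an illustration of the per-instance approach, here is a minimal sketch 
that hands out main cores round-robin and renders the matching startup.conf 
cpu section. The helper names are hypothetical and not part of the test 
framework, and the sketch assumes the main-core option of the cpu section. 
It does not address coordination between several jobs on the same host.

```python
def assign_main_cores(instance_count, available_cores):
    """Assign a main core to each VPP instance, cycling over the given cores."""
    if not available_cores:
        raise ValueError("no cores available")
    return [available_cores[i % len(available_cores)]
            for i in range(instance_count)]


def cpu_stanza(main_core):
    """Render a startup.conf cpu section pinning the main thread to one core."""
    return "cpu {\n  main-core %d\n}\n" % main_core


if __name__ == "__main__":
    # Example: four instances spread over cores 1..3 (core 0 left to the OS).
    for core in assign_main_cores(4, [1, 2, 3]):
        print(cpu_stanza(core))
```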

What is the “significant problem” you’re running into? Are tests not executing 
in parallel using python multiprocessing, are the vpp’s having issues, or 
something else? Could you describe it a bit more?

-Maciek

[1] https://gerrit.fd.io/r/#/c/13491/
[2] https://git.fd.io/vpp/tree/src/vpp/conf/startup.conf


On 27 Jul 2018, at 17:23, Alec Hothan (ahothan) 
mailto:ahot...@cisco.com>> wrote:

Hi Juraj,
How many instances and what level of performance are you looking at?
Even if you assign different cores to each VPP instance, results can be skewed 
due to interference at the LLC and PCIe/NIC level (this can be somewhat 
mitigated by running on separate sockets)

   Alec


From: <vpp-dev@lists.fd.io> on behalf of Juraj Linkeš <juraj.lin...@pantheon.tech>
Date: Friday, July 27, 2018 at 7:25 AM
To: "Maciek Konstantynowicz (mkonstan)" <mkons...@cisco.com>
Cc: "vpp-dev@lists.fd.io" <vpp-dev@lists.fd.io>, csit-dev <csit-...@lists.fd.io>
Subject: Re: [vpp-dev] Parallel test execution in VPP Test Framework

Hi Maciek and vpp-devs,

I've run into a significant problem regarding VPP assignment to cores. All VPPs 
that are spawned are assigned to core 1. I looked at 
https://wiki.fd.io/view/VPP/Command-line_Arguments and I guess it's because 
that's the default behavior of VPP (dpdk coremask is not configured and "Note 
that the "main" thread always occupies the lowest core-id specified in the DPDK 
[process-level] coremask.").

Is my reading of the config options accurate?

Obviously, all VPP instances running on the same core goes against running the 
tests on multiple cores. There are a couple of solutions that come to mind:
• Assign VPP instances to cores manually. With possible multiple jobs 
running on a given host, this creates a situation where the different jobs 
don't know cores are already occupied (and by how many VPP instances) and thus 
introduces additional challenges to solve.
• Add an option to override this default behavior and let the Linux CFS 
scheduler assign VPPs to cores or something similar where VPPs would land on 
different cores.

Is there some other solution?

Vpp-devs, what do you think about the second solution? Would it be possible?

Thanks,
Juraj

From: Maciek Konstantynowicz (mkonstan) [mailto:mkons...@cisco.com]
Sent: Wednesday, July 25, 2018 1:10 PM
To: Juraj Linkeš mailto:juraj.lin...@pantheon.tech>>
Cc: vpp-dev@lists.fd.io; csit-dev 
mailto:csit-...@lists.fd.io>>
Subject: Re: [vpp-dev] Parallel test execution in VPP Test Framework






On 19 Jul 2018, at 15:44, Juraj Linkeš 
mailto:juraj.lin...@pantheon.tech>> wrote:

> Hi VPP devs,
> 
> I'm implementing parallel test execution of tests in VPP Test Framework (the 
> patch is here https://gerrit.fd.io/r/#/c/13491/) and the last big outstanding 
> question is how scalable the parallelization actually is.

That’s a good question. What do the tests say? :)

> The tests are spawning one VPP instance per each VPPTestCase class

How many VPP instances are spawned and run in parallel? Cause assuming
there is at least one VPPTestCase class per test_, that’s 70 VPP
instances ..

> and the question is - how do the required compute resources per each VPP 
> instance (cpu, ram, shm) scale and how much resources do we need with 
> increasing number of VPP instances running in parallel (in the context 

Re: [vpp-dev] CSIT-VPP srv6 jumbo frames not working.

2018-07-17 Thread Peter Mikus via Lists.Fd.Io
Hello,

Our interfaces (sh hard) are able to handle MTU 9202B. 

Most probably you are referring to this output:

              Name               Idx    State  MTU (L3/IP4/IP6/MPLS)     Counter          Count
TenGigabitEtherneta/0/0           1      up          9000/0/0/0     rx packets                  1088

It seems that VPP is configured by default with just 9000B?

Command:
set interface mtu [packet|ip4|ip6|mpls] <value> <interface>

Unfortunately, per my testing it cannot be used on software interfaces, so the 
LISP and Crypto tests are failing as well.

Is the MTU of physical interfaces inherited by software interfaces?

Thank you.

Peter Mikus
Engineer – Software
Cisco Systems Limited


-Original Message-
From: Ole Troan [mailto:otr...@employees.org] 
Sent: Tuesday, July 17, 2018 1:39 PM
To: Peter Mikus -X (pmikus - PANTHEON TECHNOLOGIES at Cisco) 
Cc: vpp-dev@lists.fd.io; Pablo Camarillo (pcamaril) 
Subject: Re: [vpp-dev] CSIT-VPP srv6 jumbo frames not working.

Peter,

> During testing we have hit an issue with our SRv6 tests with jumbo 
> packets of size 9000B.
> Currently none of the 9K tests are passing. It is not yet clear when 
> this issue first appeared, as the 9K tests are not run frequently. 78/1518B 
> tests are passing with no issue.
> 
> Error example (see attachment for full output):
> 
> return STDOUT
>    Count                    Node                  Reason
>  1379690            sr-pl-rewrite-encaps          SR steered IPv6 packets
>  1379690                 ip6-input                ip6 MTU exceeded
>  1379690               ip6-icmp-error             packet too big response sent
> 
> Generated packet is 9000B (srv6 overhead should be 40B)
> 
> Full VPP configuration attached as well as interface and error statistics. 
> VPP is not crashing so no core dumps available.
> 
> NIC: Intel x520-da2 with MTU 9202 (per attachment)
> 
> Can you please advise?

Isn't the error quite clear?
  /* Check MTU of outgoing interface. */
  ip6_mtu_check (p0, clib_net_to_host_u16 (ip0->payload_length) +
                 sizeof (ip6_header_t),
                 adj0[0].rewrite_header.max_l3_packet_bytes,
                 is_locally_originated0, &next0, &error0);

You are failing that check in ip6_forward.c. The size of the packet is larger 
than the maximum size set in the adjacency.
Check the MTU in the adjacency table and on the interfaces (with show 
interfaces).
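
The arithmetic behind that check can be sketched as follows. This is a rough 
illustration, not the VPP code: it assumes the check is a plain greater-than 
comparison, that the 9000B figure is the size of the inner L3 packet, and 
that the encapsulation adds the 40B mentioned in the report. Neither of the 
last two is confirmed in this thread.

```python
IP6_HEADER_BYTES = 40  # fixed size of an IPv6 header


def ip6_mtu_exceeded(payload_length, max_l3_packet_bytes):
    """Mirror the comparison quoted above: IPv6 payload length plus the
    IPv6 header, checked against the adjacency's L3 packet limit."""
    return payload_length + IP6_HEADER_BYTES > max_l3_packet_bytes


if __name__ == "__main__":
    # Assumption: after encapsulation the 9000B inner packet becomes the
    # payload of the outer IPv6 header, so the outer packet is 9040B.
    inner_packet = 9000
    for limit in (9000, 9202):
        verdict = "MTU exceeded" if ip6_mtu_exceeded(inner_packet, limit) else "fits"
        print(limit, verdict)
```

Under these assumptions a 9000B packet fails against an L3 limit of 9000 
and would fit within 9202, which is why the value stored in the adjacency 
matters more than what the NIC itself can handle.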

Cheers,
Ole


> 
> Thank you.
> 
> Peter Mikus
> Engineer – Software
> Cisco Systems Limited
> 
> View/Reply Online (#9855): https://lists.fd.io/g/vpp-dev/message/9855

View/Reply Online (#9858): https://lists.fd.io/g/vpp-dev/message/9858


[vpp-dev] CSIT-VPP srv6 jumbo frames not working.

2018-07-17 Thread Peter Mikus via Lists.Fd.Io
Hello

During testing we have hit an issue with our SRv6 tests with jumbo packets 
of size 9000B.
Currently none of the 9K tests are passing. It is not yet clear when this 
issue first appeared, as the 9K tests are not run frequently. 78/1518B 
tests are passing with no issue.

Error example (see attachment for full output):

return STDOUT
   Count                    Node                  Reason
 1379690            sr-pl-rewrite-encaps          SR steered IPv6 packets
 1379690                 ip6-input                ip6 MTU exceeded
 1379690               ip6-icmp-error             packet too big response sent

Generated packet is 9000B (srv6 overhead should be 40B)

Full VPP configuration attached as well as interface and error statistics. VPP 
is not crashing so no core dumps available.

NIC: Intel x520-da2 with MTU 9202 (per attachment)

Can you please advise?

Thank you.

Peter Mikus
Engineer - Software
Cisco Systems Limited



srv6_jumbo_configs.log
Description: srv6_jumbo_configs.log


srv6_jumbo_DUT1.log
Description: srv6_jumbo_DUT1.log


srv6_jumbo_DUT2.log
Description: srv6_jumbo_DUT2.log
View/Reply Online (#9855): https://lists.fd.io/g/vpp-dev/message/9855


Re: [vpp-dev] [FD.io Helpdesk #56625] Nexus fd.io.master.centos7 VPP artifacts

2018-06-04 Thread Peter Mikus via Lists.Fd.Io
Hello Vanessa,

Thank you for the explanation. Indeed, this will impact certain things that are 
planned, like "automatic bisecting" (downloading and testing a range of 
artifacts). Let me analyze the current situation with the CSIT team and get 
back to you.

Peter Mikus
Engineer – Software
Cisco Systems Limited


-Original Message-
From: Vanessa Valderrama via RT [mailto:fdio-helpd...@rt.linuxfoundation.org] 
Sent: Monday, June 04, 2018 9:47 PM
To: Peter Mikus -X (pmikus - PANTHEON TECHNOLOGIES at Cisco) 
Cc: csit-...@lists.fd.io; infra-steer...@lists.fd.io; vpp-dev@lists.fd.io
Subject: [FD.io Helpdesk #56625] Nexus fd.io.master.centos7 VPP artifacts

Peter,

The fd.io.master.centos7 repo had to be cleaned up significantly to eliminate 
Jenkins build timeout errors.  This was discussed in the TSC. Going forward 
we'll only be keeping an average of 10 of the current release candidate 
artifacts in the repository.  Please let me know if this retention policy 
causes an issue for you.

We do need to clean up the other repositories as well.  Please let me know if 
you'd like to discuss retention policies.  I'll hold off on cleaning up other 
repositories for now.

Thank you,
Vanessa

On Wed May 30 10:20:21 2018, pmi...@cisco.com wrote:
> Hello,
> 
> I have recently spotted that CentOS repo got reduced and old binaries 
> are missing [1].
> 
> Is this expected?
> Will the same be done for Ubuntu repos?
> 
> Was this announced somewhere?
> 
> Thank you.
> 
> [1]
> https://nexus.fd.io/content/repositories/fd.io.master.centos7/io/fd/vpp/vpp/
> 
> Peter Mikus
> Engineer - Software
> Cisco Systems Limited




View/Reply Online (#9525): https://lists.fd.io/g/vpp-dev/message/9525
View All Messages In Topic (2): https://lists.fd.io/g/vpp-dev/topic/21275985