Re: [vpp-dev] Issues about RDMA native plugin

2022-11-07 Thread Honnappa Nagarahalli


> 
> >>> Issue #1: VPP crashes when scales to multiple worker cores/threads.
> [...]
> >>> We guess VPP buffer metadata has been corrupted in the case of
> >>> multiple worker threads using VPP release version
> 
> >> BG: do you reproduce the issue on both x86 and ARM? Or only on ARM?
> >> I’m especially thinking about difference in memory coherency that
> >> could bite us here…
> 
> > [Honnappa]  On Arm, the barriers for working with MMIO are different
> > than the ones used for normal shared memory. Currently, in VPP, this
> > distinction does not exist (please correct me if I am wrong).
> 
> Yes, I was thinking about something along that line. Currently we use
> CLIB_MEMORY_STORE_BARRIER() defined as __builtin_ia32_sfence() on x86
> and __sync_synchronize() on ARM.
> __sync_synchronize() should be a full memory barrier but is it enough for
> MMIO on ARM?
Yes, this is the code I am referring to. That is not enough. It generates 'dmb 
ish'.

We need to use 'dmb oshst' for store-store barrier and 'dmb oshld' for 
load-load barrier.

I would suggest defining 3 new APIs:

xxx_io_mb() - Full barrier (on ARM, 'dmb osh')
xxx_io_rmb() - load-load barrier ('dmb oshld')
xxx_io_wmb() - store-store barrier ('dmb oshst')

I do not think there are any compiler intrinsics that generate these 
instructions.
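
As an illustration of what such APIs could look like on AArch64, here is a minimal sketch (the vlib_io_* names, the header placement and the non-ARM fallbacks are assumptions, not existing VPP code; DPDK exposes the analogous rte_io_mb/rte_io_rmb/rte_io_wmb):

/* Hedged sketch of the proposed I/O barrier macros; not existing VPP code.
 * On AArch64 the outer-shareable DMB variants order accesses against MMIO. */
#if defined (__aarch64__)
#define vlib_io_mb()  __asm__ volatile ("dmb osh" ::: "memory")   /* full barrier */
#define vlib_io_rmb() __asm__ volatile ("dmb oshld" ::: "memory") /* load-load */
#define vlib_io_wmb() __asm__ volatile ("dmb oshst" ::: "memory") /* store-store */
#else
/* Placeholder mapping for other architectures (x86 stores are strongly
 * ordered, so a fence/compiler barrier is typically sufficient here). */
#define vlib_io_mb()  __atomic_thread_fence (__ATOMIC_SEQ_CST)
#define vlib_io_rmb() __atomic_thread_fence (__ATOMIC_ACQUIRE)
#define vlib_io_wmb() __atomic_thread_fence (__ATOMIC_RELEASE)
#endif

/* Intended driver usage (pseudo-driver):
 *   write_descriptors_in_host_memory ();
 *   vlib_io_wmb ();               // make descriptor stores visible to the NIC
 *   ring_doorbell_mmio_write ();  // then notify the NIC
 */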

> 
> Best
> ben




Re: [vpp-dev] Issues about RDMA native plugin

2022-11-07 Thread Honnappa Nagarahalli


From: vpp-dev@lists.fd.io  On Behalf Of Benoit Ganne 
(bganne) via lists.fd.io
Sent: Monday, November 7, 2022 3:56 AM
To: Jieqiang Wang 
Cc: Lijian Zhang ; Tianyu Li ; nd 
; vpp-dev@lists.fd.io; moham...@hawari.fr
Subject: Re: [vpp-dev] Issues about RDMA native plugin

Hi Jieqiang,

Thanks a lot for your report! CC’ing vpp-dev because that may be of interest to 
others, and also you can get better support.
My comments/questions inline prefixed with BG: in red

Best
ben

From: Jieqiang Wang <jieqiang.w...@arm.com>
Sent: Monday, November 7, 2022 9:25
To: Benoit Ganne (bganne) <bga...@cisco.com>; moham...@hawari.fr
Cc: Lijian Zhang <lijian.zh...@arm.com>; Tianyu Li <tianyu...@arm.com>; nd <n...@arm.com>
Subject: Issues about RDMA native plugin

Hi Ben/Mohammed,

I am Jieqiang Wang from Arm Open Source Software team focusing on open source 
network software such as VPP.
Recently our team did some investigation of the RDMA native plugin on both Arm and 
X86 platforms and ran into some crash and performance issues.
I'm writing this email to share these issues and hopefully get some input 
from you.

Before going through the issues, I would like to share my local test 
environment for using RDMA native plugin.
VPP version: vpp v23.02-rc0~89-ge7adafeaf
Compiler to build VPP binary: Clang 13.0.0
Mellanox NIC: MCX516A-CDA_Ax ConnectX-5 100GbE dual-port QSFP28; PCIe4.0 x16
Server CPU:
Arm: Neoverse-N1(Ampere Altra 1P)
X86: Intel(R) Xeon(R) Platinum 8268 CPU @ 2.90GHz(Dell PowerEdge R820)

Here are the issues we found using RDMA native plugin.
Issue #1: VPP crashes when scaling to multiple worker cores/threads.
For example, running VPP (two worker threads) with the startup.conf and CLI command 
file L3-rdma.exec in the attachment works fine. However, when injecting packets 
into VPP, VPP just crashes with output messages like the following.
We saw similar segmentation faults for different test cases like L2 
cross-connect/L2 MAC learning, but we didn't see crash issues for the VPP debug 
version. We guess VPP buffer metadata has been corrupted in the case of 
multiple worker threads using the VPP release version, but we are not sure how to 
debug this issue.
Any suggestions on how to find the root cause of this issue?

Thread 2 "vpp_wk_0" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7efb7649b700 (LWP 3355470)]
0x7efd7842e8f6 in ip4_lookup_inline (vm=, frame=, node=) at 
/home/jiewan01/tasks/vpp_dpdk_mlx/latest/src/vnet/ip/ip4_forward.h:338
338   if (PREDICT_FALSE (lb0->lb_n_buckets > 1))
(gdb) bt
#0  0x7efd7842e8f6 in ip4_lookup_inline (vm=, 
frame=, node=) at 
/home/jiewan01/tasks/vpp_dpdk_mlx/latest/src/vnet/ip/ip4_forward.h:338
#1  ip4_lookup_node_fn_skx (vm=, node=0x7efc7c455b00, 
frame=0x7efc7c7e16c0) at 
/home/jiewan01/tasks/vpp_dpdk_mlx/latest/src/vnet/ip/ip4_forward.c:101
#2  0x7efd7711effb in dispatch_node (vm=0x7efc7c43bd40, 
node=0x7efc7c455b00, type=VLIB_NODE_TYPE_INTERNAL, 
dispatch_state=VLIB_NODE_STATE_POLLING, frame=,
last_time_stamp=2494501287259978) at 
/home/jiewan01/tasks/vpp_dpdk_mlx/latest/src/vlib/main.c:960
#3  dispatch_pending_node (vm=0x7efc7c43bd40, pending_frame_index=, last_time_stamp=2494501287259978) at 
/home/jiewan01/tasks/vpp_dpdk_mlx/latest/src/vlib/main.c:1119
#4  vlib_main_or_worker_loop (vm=, is_main=0) at 
/home/jiewan01/tasks/vpp_dpdk_mlx/latest/src/vlib/main.c:1588
#5  vlib_worker_loop (vm=, vm@entry=0x7efc7c43bd40) at 
/home/jiewan01/tasks/vpp_dpdk_mlx/latest/src/vlib/main.c:1722
#6  0x7efd77171dda in vlib_worker_thread_fn (arg=0x7efc78814d00) at 
/home/jiewan01/tasks/vpp_dpdk_mlx/latest/src/vlib/threads.c:1598
#7  0x7efd7716c8f1 in vlib_worker_thread_bootstrap_fn (arg=0x7efc78814d00) 
at /home/jiewan01/tasks/vpp_dpdk_mlx/latest/src/vlib/threads.c:418
#8  0x7efd7709b609 in start_thread (arg=) at 
pthread_create.c:477
#9  0x7efd76dd8163 in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:95

BG: do you reproduce the issue on both x86 and ARM? Or only on ARM? I’m 
especially thinking about difference in memory coherency that could bite us 
here…
[Honnappa]  On Arm, the barriers for working with MMIO are different than the 
ones used for normal shared memory. Currently, in VPP, this distinction does 
not exist (please correct me if I am wrong).

For ex:

update_nic_descriptors_in_memory();
store_io_barrier(); /*** Currently, this API abstraction does not exist in VPP 
***/
ring_door_bell_on_nic();

So, we need new APIs for the IO barriers and those need to be incorporated in 
the native drivers. For DPDK drivers, this is addressed in DPDK.
I cannot think of any other synchronization/memory model issue.

Issue #2: Huge performance gap between Striding RQ mode and Non-Striding RQ 
mode.
We saw a huge performance gap when using RDMA interfaces created with Striding 
RQ enabled VS RDMA interfaces with Striding RQ disabled.
On both Arm and X86 server mentioned above, 

Re: [vpp-dev] Increase memif from 10G to 100G

2021-11-16 Thread Honnappa Nagarahalli


From: vpp-dev@lists.fd.io  On Behalf Of Mrityunjay Kumar 
via lists.fd.io
Sent: Tuesday, November 16, 2021 7:42 AM
To: Felipe P. 
Cc: vpp-dev 
Subject: Re: [vpp-dev] Increase memif from 10G to 100G

Memif is based on shared-memory communication. There is a concept of slave / master, 
where one side involves one copy.


Can you check and share unidirectional max speed without pkt processing?


BTW, I am not impressed with the memif library, hence the idea is to look for a better 
approach to shared memory instead of the high cost of packet buffer translation.
[Honnappa] Which parts of the memif library do you not like?





On Tue, 16 Nov 2021, 4:38 pm Felipe P. <felipeapola...@gmail.com> wrote:
Hi,

We are running 100G tests in our new dev hardware and found out yesterday that 
memif interfaces are coming up at 10G speed.

Is there any way to increase this to 100G speed?

Thanks,





Re: [vpp-dev] Linking DPDK libs to plugins

2021-10-25 Thread Honnappa Nagarahalli
There are a few additional modes added to the ring library (a year back) in DPDK 
that improve the performance when there are threads on the control plane and data 
plane doing enqueue/dequeue on the same ring. Are you talking about these or 
just the ring in general?
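
(For readers of the archive: the modes referred to above are the RTS/HTS ring synchronization modes DPDK added for exactly this producer/consumer mix. A minimal sketch of opting into them at ring creation time follows; the ring name and size are illustrative, and EAL is assumed to be initialised already.)

#include <rte_lcore.h>
#include <rte_ring.h>

/* Hedged sketch: a ring shared between control-plane and data-plane threads,
 * created with the RTS (relaxed tail sync) enqueue/dequeue modes instead of
 * the default multi-producer/multi-consumer mode. */
static struct rte_ring *
create_cp_dp_ring (void)
{
  return rte_ring_create ("cp_dp_ring", 1024, rte_socket_id (),
                          RING_F_MP_RTS_ENQ | RING_F_MC_RTS_DEQ);
}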

Thanks,
Honnappa

> -Original Message-
> From: vpp-dev@lists.fd.io  On Behalf Of bjeremy32 via
> lists.fd.io
> Sent: Monday, October 25, 2021 2:18 PM
> To: 'Damjan Marion' 
> Cc: vpp-dev@lists.fd.io
> Subject: Re: [vpp-dev] Linking DPDK libs to plugins
> 
> I believe it was just ring that they cared about.
> 
> -Original Message-
> From: Damjan Marion 
> Sent: Monday, October 25, 2021 11:08 AM
> To: bjerem...@gmail.com
> Cc: vpp-dev@lists.fd.io
> Subject: Re: [vpp-dev] Linking DPDK libs to plugins
> 
> 
> Ok, I'm afraid that to implement this you will need to introduce a lot of mess.
> At the end probably will be easier to implement that functionality natively.
> 
> Which exact implementation of the dpdk mempool you are looking to use
> (ring, stack, bucket, ...)?
> 
> —
> Damjan
> 
> 
> 
> > On 25.10.2021., at 17:39, 
>  wrote:
> >
> > Hi Damjan,
> >
> > Thanks for the reply
> >
> > Here are the  details:
> >
> > 1. We want to use only the rte_mempool infrastructure for lockless global
> memory pools. We will not be using any mbuf infrastructure from dpdk
> > 2. We want to use this infra across our multiple plugins
> > 3. We want to be able to include rte_mempool data structures from our
> multiple header files (.h files )
> > 4. We want to be able to make calls to rte_mempool apis from our source
> code ( .c files )
> >
> > -Original Message-
> > From: Damjan Marion 
> > Sent: Monday, October 25, 2021 5:22 AM
> > To: bjerem...@gmail.com
> > Cc: vpp-dev@lists.fd.io
> > Subject: Re: [vpp-dev] Linking DPDK libs to plugins
> >
> >
> >
> >
> >> On 25.10.2021., at 01:13, bjerem...@gmail.com wrote:
> >>
> >> Greetings,
> >>
> >> Let me preface this by saying that I really do not know much about the
> CMake utility. But I am trying to see if there is a way to make the DPDK libs
> accessible to other plugins (aside from the dpdk plugin) that are in their own
> project/subdirectory similar. I am working with v20.05 currently (although we
> are upgrading to 21.06 if that makes a difference).
> >>
> >> Initially it was suggested to me that I could just add a couple lines
> >> to my CMakeLists to link the dpdk_plugin.so to my own plugin.. but I
> >> have not been able to get this to work.. It never seems to recognize
> >> the path to the .so, even if I give the absolute path
> >>
> >> set(DPDK_PLUGIN_LINK_FLAGS "${DPDK_PLUGIN_LINK_FLAGS} -L  vpp
> >> plugins> -ldpdk_plugin.so")
> >>
> >> add_vpp_plugin(my_plugin
> >> ….
> >>  LINK_FLAGS
> >>  “${ DPDK_PLUGIN_LINK_FLAGS }”
> >>
> >> Another approach suggested was to maybe use dlsym to dynamically load
> symbols… Anyway, I was thinking that someone has to have had done this
> before, or maybe have more of a clue as to how to do this then I currently do.
> >>
> >
> >
> >
> > Please note that VPP is not DPDK application, DPDK is just optional device
> driver layer for us.
> >
> > Even if you manage to get your plugin linked against DPDK libs, there is no
> guarantee that you will be able to use all dpdk data structures. Most obvious
> example, rte_mbuf structure for a packet buffer may not be populated for
> you.
> >
> > Also use of DPDK comes with a performance cost, we need to copy buffer
> metadata left and right on both RX and TX side.
> >
> > What specific DPDK library you would like to use? We may have alternative
> proposal….
> >
> > —
> > Damjan
> >
> >
> 





Re: [EXT] [vpp-dev] New perfmon plugin

2020-12-14 Thread Honnappa Nagarahalli


> 
> Hi Damjan,
> 
> ARM defines two sets of performance monitoring counters and extension
> 1. Common Event number and micro-architecture events defined by ARM
> which every chip vendor should implement.
> 2. Chip vendor specific PMU counters other than (1)
> 
> I am not in ThunderX2 BU but I think the kernel driver you are referring to
> seems to be a PMU extension which falls under the category of (2) above. See
> below for OCTEONTX2 output
> 
> So for ARM to be enabled in perfmon plugin, I am thinking,
> - we need common bundle to register common ARM PMU events. This should
> be first step and include most of the useful/important events
> - chip vendor specific bundle should also be allowed to "implementation
> defined" PMU events
> 
> One of the key differences on ARM is that a kernel driver needs to be hooked at
> runtime to allow VPP to get hold of PMU counters (which is not the case with
> x86)
> 
> >>> Can you capture contents of /sys/bus/event_source/devices/ from one
> system?
> I do not have ThunderX2 access but here is the output of OCTEONTX2
On thunderx2:
honnag01@2u-thunderx2:~$ ls /sys/bus/event_source/devices/
armv8_pmuv3_0  breakpoint  kprobe  software  tracepoint  uncore_dmc_0  
uncore_dmc_1  uncore_l3c_0  uncore_l3c_1  uprobe

2u-thunderx2:~$ ls /sys/bus/event_source/devices/uncore_dmc_0/
cpumask  events  format  perf_event_mux_interval_ms  power  subsystem  type  
uevent

> 
> $ ls -ltr /sys/bus/event_source/devices/
> total 0
> lrwxrwxrwx 1 root root 0 Dec 14 06:48 software -> ../../../devices/software
> lrwxrwxrwx 1 root root 0 Dec 14 06:48 cs_etm -> ../../../devices/cs_etm
> lrwxrwxrwx 1 root root 0 Dec 14 06:48 breakpoint -> ../../../devices/breakpoint
> lrwxrwxrwx 1 root root 0 Dec 14 06:48 tracepoint -> ../../../devices/tracepoint
> lrwxrwxrwx 1 root root 0 Dec 14 06:48 armv8_cavium_thunder -> ../../../devices/armv8_cavium_thunder
> 
> Thanks,
> Nitin
> 
> 
> 
> 
> > -Original Message-
> > From: Damjan Marion 
> > Sent: Monday, December 14, 2020 4:19 PM
> > To: Nitin Saxena 
> > Cc: vpp-dev 
> > Subject: Re: [EXT] [vpp-dev] New perfmon plugin
> >
> >
> > Isn’t there also uncore PMU? I can see some thunderx2 specific driver
> > in kernel source.
> >
> > Can you capture contents of /sys/bus/event_source/devices/ from one
> > system?
> >
> > Thanks,
> >
> > —
> > Damjan
> >
> >
> > > On 14.12.2020., at 09:09, Nitin Saxena  wrote:
> > >
> > > Yes most of the ARM processors including ThunderX2, OCTEONTX2 has
> > PMU as per AARCH64 specifications. I did some analysis to add ARM
> > support in older perfmon plugin and should be easy to port to this new
> > one. This is something in TODO list which is much needed for us and
> > overall ARM
> > >
> > > Thanks,
> > > Nitin
> > >
> > >> -Original Message-
> > >> From: Damjan Marion 
> > >> Sent: Saturday, December 12, 2020 7:46 PM
> > >> To: Nitin Saxena 
> > >> Cc: vpp-dev 
> > >> Subject: Re: [EXT] [vpp-dev] New perfmon plugin
> > >>
> > >>
> > >> cool, if I got it right ThunderX2 have own PMU so we can add it as
> > >> new source and create specific bundles.
> > >>
> > >> --
> > >> Damjan
> > >>
> > >>> On 12.12.2020., at 11:07, Nitin Saxena  wrote:
> > >>>
> > >>> Hi Damjan,
> > >>>
> > >>> I was already fan of older perfmon plugin and new one seems
> > >>> superset of the older one (at-least from video)
> > >>>
> > >>> Nice addition
> > >>>
> > >>> Thanks,
> > >>> Nitin
> > >>>
> >  -Original Message-
> >  From: vpp-dev@lists.fd.io  On Behalf Of
> >  Damjan Marion via lists.fd.io
> >  Sent: Friday, December 11, 2020 9:44 PM
> >  To: vpp-dev 
> >  Subject: [EXT] [vpp-dev] New perfmon plugin
> > 
> > 
> >  Guys,
> > 
> >  I just submitted patch with the new perfmon plugin: [1]
> > 
> >  It takes significantly different approach compared to current one.
> > 
> >  - it support multiple sources of perf counters (linux, intel
> >  core, intel uncore) and it is extensible to other vendors
> >  - it have concept instances so it can monitor multiple instances
> >  of specific PMU (DRAM channels, UPI/QPU links, ..)
> >  - it supports node, thread and system metrics
> >  - different metrics are organized in bundles, where bundle
> >  consists of multiple counters and format functions which
> >  calculates and
> > presents
> > >> metric.
> >  Yuo can find example of bundle here [2]
> > 
> >  To se how this looks in action, I captured small asciinema video:
> >  [3]
> > 
> >  As this new plugin is significantly different from the old one, I
> >  wonder if anyone thinks we should keep the old one.
> >  Also, any other feedback is welcome.
> > 
> >  Thanks,
> > 
> >  Damjan
> > 
> > 
> >  [1] https://urldefense.proofpoint.com/v2/url?u=https-
> >  

Re: [vpp-dev] Marvell PP2 plugin

2020-11-25 Thread Honnappa Nagarahalli
Hi Hemant,
Yes, VPP is tested on Arm platforms. Arm machines are in the VPP lab and 
are part of CI/CSIT. Every patch submitted undergoes testing on Arm platforms.

The Marvell plugin is similar to the native drivers. Hence the performance is 
better when compared to the DPDK plugin.

Thanks,
Honnappa


> -Original Message-
> From: vpp-dev@lists.fd.io  On Behalf Of hemant via
> lists.fd.io
> Sent: Wednesday, November 25, 2020 1:54 PM
> To: Honnappa Nagarahalli ;
> dmar...@me.com; vpp-dev@lists.fd.io
> Cc: nd 
> Subject: Re: [vpp-dev] Marvell PP2 plugin
> 
> Honnappa,
> 
> Got it, thanks.  I am new to VPP and wondering about the path to the DPDK
> plugin. Has Arm also been tested with VPP, since I hear from you only about Arm
> and DPDK?  I also noticed in the vpp/src/plugins/marvell notes that the plugin
> works 30% faster if DPDK is disabled.
> 
> Thanks,
> 
> Hemant
> 
> -Original Message-
> From: Honnappa Nagarahalli 
> Sent: Wednesday, November 25, 2020 12:04 AM
> To: hem...@mnkcg.com; dmar...@me.com; vpp-dev@lists.fd.io
> Cc: nd ; Honnappa Nagarahalli
> 
> Subject: RE: [vpp-dev] Marvell PP2 plugin
> 
> Hi Hemant,
>   Please note that this is a plugin specifically for Marvell's Armada series of
> SoCs. This does not apply to the rest of the SoCs from Marvell or SoCs from other Arm
> partners or server platforms. They all (including the Marvell's Armada series of
> SoCs) work fine through the DPDK plugin.
> 
> Thanks,
> Honnappa
> 
> 
> > -Original Message-
> > From: vpp-dev@lists.fd.io  On Behalf Of hemant
> > via lists.fd.io
> > Sent: Tuesday, November 24, 2020 10:37 AM
> > To: dmar...@me.com; vpp-dev@lists.fd.io
> > Subject: Re: [vpp-dev] Marvell PP2 plugin
> >
> > Is there another plugin that supports ARM? It would be good for VPP to
> > have a plugin for ARM.  Certainly, if Marvell has killed the product
> > line, there's no point in supporting its code.
> >
> > Thanks,
> >
> > Hemant
> >
> > -Original Message-
> > From: vpp-dev@lists.fd.io  On Behalf Of Damjan
> > Marion via lists.fd.io
> > Sent: Tuesday, November 24, 2020 5:13 AM
> > To: vpp-dev 
> > Subject: [vpp-dev] Marvell PP2 plugin
> >
> >
> > Some time ago I developed a native plugin for the Marvell Armada SoC using
> > their MUSDK.
> > Since then, they bought Cavium and it looks like they decided to kill
> > that product line.
> >
> > So i don’t see a bright future for that plugin, so I wonder if we
> > should deprecate it.
> >
> > Is anybody using it? If no replies I will take it as no and move it to
> > extras/deprecated…
> >
> > Thanks,
> >
> > Damjan
> >





Re: [vpp-dev] Marvell PP2 plugin

2020-11-24 Thread Honnappa Nagarahalli
Hi Hemant,
Please note that this is a plugin specifically for Marvell's Armada 
series of SoCs. This does not apply to the rest of the SoCs from Marvell or SoCs from 
other Arm partners or server platforms. They all (including the Marvell's Armada 
series of SoCs) work fine through the DPDK plugin.

Thanks,
Honnappa


> -Original Message-
> From: vpp-dev@lists.fd.io  On Behalf Of hemant via
> lists.fd.io
> Sent: Tuesday, November 24, 2020 10:37 AM
> To: dmar...@me.com; vpp-dev@lists.fd.io
> Subject: Re: [vpp-dev] Marvell PP2 plugin
> 
> Is there another plugin that supports ARM? It would be good for VPP to have a
> plugin for ARM.  Certainly, if Marvell has killed the product line, there's 
> no point
> in supporting its code.
> 
> Thanks,
> 
> Hemant
> 
> -Original Message-
> From: vpp-dev@lists.fd.io  On Behalf Of Damjan Marion
> via lists.fd.io
> Sent: Tuesday, November 24, 2020 5:13 AM
> To: vpp-dev 
> Subject: [vpp-dev] Marvell PP2 plugin
> 
> 
> Some time ago I developed a native plugin for the Marvell Armada SoC using their
> MUSDK.
> Since then, they bought Cavium and it looks like they decided to kill that
> product line.
> 
> So i don’t see a bright future for that plugin, so I wonder if we should 
> deprecate
> it.
> 
> Is anybody using it? If no replies I will take it as no and move it to
> extras/deprecated…
> 
> Thanks,
> 
> Damjan
> 





Re: Handoff design issues [Re: RES: RES: [vpp-dev] Increasing NAT worker handoff frame queue size NAT_FQ_NELTS to avoid congestion drops?]

2020-11-13 Thread Honnappa Nagarahalli
This is a typical problem one would face with a pipeline mode of processing 
packets, i.e. the overall performance of the pipeline is equal to the 
performance of the lowest-performing stage in the pipeline. Having a bigger 
queue would help handle a burst or might solve the problem for a given platform 
and traffic profile.

One option could be to run the lowest performing stage in multiple 
threads/CPUs. But, then the previous stage needs to distribute the packets 
evenly.

Thanks,
Honnappa

> -Original Message-
> From: vpp-dev@lists.fd.io  On Behalf Of Christian
> Hopps via lists.fd.io
> Sent: Friday, November 13, 2020 3:47 PM
> To: Marcos - Mgiga 
> Cc: Christian Hopps ; Klement Sekera
> ; Elias Rudberg ; vpp-
> d...@lists.fd.io
> Subject: Handoff design issues [Re: RES: RES: [vpp-dev] Increasing NAT
> worker handoff frame queue size NAT_FQ_NELTS to avoid congestion
> drops?]
> 
> FWIW, I too have hit this issue. Basically VPP is designed to process a packet
> from rx to tx in the same thread. When downstream nodes run slower, the
> upstream rx node doesn't run, so the vector size in each frame naturally
> increases, and then the downstream nodes can benefit from "V" (i.e.,
> processing multiple packets in one go).
> 
> This back-pressure from downstream does not occur when you hand-off
> from a fast thread to a slower thread, so you end up with many single packet
> frames and fill your hand-off queue.
> 
> The quick fix one tries then is to increase the queue size; however, this is 
> not
> a great solution b/c you are still not taking advantage of the "V" in VPP. To
> really fit this back into the original design one needs to somehow still be
> creating larger vectors in the hand-off frames.
> 
> TBH I think the right solution here is to not hand-off frames, and instead
> switch to packet queues and then on the handed-off side the frames would
> get constructed from packet queues (basically creating another polling input
> node but on the new thread).
> 
> Thanks,
> Chris.
> 
> > On Nov 13, 2020, at 12:21 PM, Marcos - Mgiga 
> wrote:
> >
> > Understood. And what path did you take in order to analyse and monitor
> vector rates ? Is there some specific command or log ?
> >
> > Thanks
> >
> > Marcos
> >
> > -Original Message-
> > From: vpp-dev@lists.fd.io  On behalf of ksekera via
> > [] Sent: Friday, November 13, 2020 14:02
> > To: Marcos - Mgiga 
> > Cc: Elias Rudberg ; vpp-dev@lists.fd.io
> > Subject: Re: RES: [vpp-dev] Increasing NAT worker handoff frame queue
> size NAT_FQ_NELTS to avoid congestion drops?
> >
> > Not completely idle, more like medium load. Vector rates at which I saw
> congestion drops were roughly 40 for thread doing no work (just handoffs - I
> hardcoded it this way for test purpose), and roughly 100 for thread picking
> the packets doing NAT.
> >
> > What got me into infra investigation was the fact that once I was hitting
> vector rates around 255, I did see packet drops, but no congestion drops.
> >
> > HTH,
> > Klement
> >
> >> On 13 Nov 2020, at 17:51, Marcos - Mgiga  wrote:
> >>
> >> So you mean that this situation ( congestion drops) is most likely to occur
> when the system in general is idle than when it is processing a large amount
> of traffic?
> >>
> >> Best Regards
> >>
> >> Marcos
> >>
> >> -Original Message-
> >> From: vpp-dev@lists.fd.io  On behalf of Klement
> >> Sekera via lists.fd.io Sent: Friday, November 13, 2020
> >> 12:15
> >> To: Elias Rudberg 
> >> Cc: vpp-dev@lists.fd.io
> >> Subject: Re: [vpp-dev] Increasing NAT worker handoff frame queue size
> NAT_FQ_NELTS to avoid congestion drops?
> >>
> >> Hi Elias,
> >>
> >> I’ve already debugged this and came to the conclusion that it’s the infra
> which is the weak link. I was seeing congestion drops at mild load, but not at
> full load. Issue is that with handoff, there is uneven workload. For 
> simplicity’s
> sake, just consider thread 1 handing off all the traffic to thread 2. What
> happens is that for thread 1, the job is much easier, it just does some ip4
> parsing and then hands packet to thread 2, which actually does the heavy
> lifting of hash inserts/lookups/translation etc. 64 element queue can hold 64
> frames, one extreme is 64 1-packet frames, totalling 64 packets, other
> extreme is 64 255-packet frames, totalling ~16k packets. What happens is
> this: thread 1 is mostly idle and just picking a few packets from NIC and 
> every
> one of these small frames creates an entry in the handoff queue. Now
> thread 2 picks one element from the handoff queue and deals with it before
> picking another one. If the queue has only 3-packet or 10-packet elements,
> then thread 2 can never really get into what VPP excels in - bulk processing.
> >>
> >> Q: Why doesn’t it pick as many packets as possible from the handoff
> queue?
> >> A: It’s not implemented.
> >>
> >> I already wrote a patch for it, which made all congestion drops which I saw
> (in above 

Re: [vpp-committers] [vpp-dev] VPP committers: VPP PTL vote

2020-09-28 Thread Honnappa Nagarahalli
Congratulations Damjan, unanimous decision speaks for itself.

Hi Dave,
   Thank you. Wish you well for the next chapter.

Thank you,
Honnappa


From: vpp-dev@lists.fd.io  On Behalf Of Dave Barach via 
lists.fd.io
Sent: Monday, September 28, 2020 2:10 PM
To: 'Florin Coras' ; 'Damjan Marion (damarion)' 

Cc: 'Benoit Ganne (bganne)' ; 'Dave Barach (dbarach)' 
; vpp-committ...@lists.fd.io; vpp-dev@lists.fd.io
Subject: Re: [vpp-committers] [vpp-dev] VPP committers: VPP PTL vote

Thanks for the kind words... It’s been fun, but now it’s time for others to 
take the ball and run with it... Congrats to Damjan, he’ll do a great job!

Dave

From: vpp-committ...@lists.fd.io <vpp-committ...@lists.fd.io> On Behalf Of Florin Coras
Sent: Monday, September 28, 2020 2:28 PM
To: Damjan Marion (damarion) <damar...@cisco.com>
Cc: Benoit Ganne (bganne) <bga...@cisco.com>; Dave Barach (dbarach) <dbar...@cisco.com>; vpp-committ...@lists.fd.io; vpp-dev@lists.fd.io
Subject: Re: [vpp-committers] [vpp-dev] VPP committers: VPP PTL vote

Congrats, Damjan!! Those are some huge “shoes" you’ll have to fill but I’m sure 
they’ll fit ;-)

Dave, a few sentences won’t do all your contributions as PTL any justice! 
Nonetheless, thank you for starting the project, for diligently working towards 
growing it and for entrusting us, as a community, with its future!

Florin

On Sep 28, 2020, at 4:47 AM, Damjan Marion via lists.fd.io <damarion=cisco@lists.fd.io> wrote:



Now when we have votes from all 12 committers (me excluded) I would like to 
thank you all for your +1s.
It is nice to be elected by unanimous decision :)

—
Damjan


On 28.09.2020., at 09:44, Benoit Ganne (bganne) via lists.fd.io <bganne=cisco@lists.fd.io> wrote:

+1

And I take this opportunity to say a big thank you to Dave for your efforts to 
build a healthy community, and all the best to Damjan in his (I hope) future 
new role.

Best
ben

-Original Message-
From: vpp-dev@lists.fd.io <vpp-dev@lists.fd.io> On Behalf Of Dave Barach via lists.fd.io
Sent: Friday, 25 September 2020 21:14
To: vpp-committ...@lists.fd.io
Cc: vpp-dev@lists.fd.io
Subject: [vpp-dev] VPP committers: VPP PTL vote

Folks,



The self-nomination period closed yesterday. We had one self-nomination,
from Damjan Marion. At this point, we can proceed with a vote.



I’m sure that Damjan will do a great job, so let me start:



Damjan Marion as VPP PTL: +1



Please vote +1, 0, -1. For once, the “reply-all” button is everyone’s
friend.



Thanks... Dave



-Original Message-
From: d...@barachs.net <d...@barachs.net>
Sent: Thursday, September 17, 2020 10:32 AM
To: 'vpp-dev@lists.fd.io' <vpp-dev@lists.fd.io>; 't...@lists.fd.io' <t...@lists.fd.io>
Subject: Happy Trails to Me...



Folks,



I’m departing the employment rolls towards the end of next month. Although
I intend to remain active in the fd.io vpp community as a coder,
committer, and resident greybeard, it’s time for the community to pick a
new PTL.



According to the project governance document,
https://fd.io/docs/tsc/FD.IO-Technical-Community-Document-12-12-2017.pdf :



3.2.3.1 Project Technical Leader Candidates Candidates for the project’s
PTL will be derived from the Committers of the Project. Candidates must
self-nominate.



I'd like to invite any interested vpp project committer to self-nominate
for the PTL role. Please email vpp-dev@lists.fd.io.



Let's close the self-nomination period in one week: more specifically, by
5pm EDT on Thursday, September 24, 2020; committer vote to follow
thereafter.



I'll be glad to answer unicast questions about the PTL role from eligible
committers.



Thanks... Dave



















Re: [vpp-dev] Create big tables on huge-page

2020-07-23 Thread Honnappa Nagarahalli
Sure. We will create a couple of patches (in the areas we are analyzing 
currently) and we can decide from there.
Thanks,
Honnappa

From: Damjan Marion 
Sent: Thursday, July 23, 2020 12:17 PM
To: Honnappa Nagarahalli 
Cc: Lijian Zhang ; vpp-dev ; nd 
; Govindarajan Mohandoss ; 
Jieqiang Wang 
Subject: Re: [vpp-dev] Create big tables on huge-page



Hard to say without seeing the patch. Can you summarize what the changes will 
be in each particular .c file?



On 23 Jul 2020, at 18:00, Honnappa Nagarahalli <honnappa.nagaraha...@arm.com> wrote:

Hi Damjan,
Thank you. Till your patch is ready, would you accept patches 
that would enable creating these tables in 1G huge pages as a temporary solution?

Thanks,
Honnappa

From: Damjan Marion <dmar...@me.com>
Sent: Thursday, July 23, 2020 7:15 AM
To: Lijian Zhang <lijian.zh...@arm.com>
Cc: vpp-dev <vpp-dev@lists.fd.io>; nd <n...@arm.com>; Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>; Govindarajan Mohandoss <govindarajan.mohand...@arm.com>; Jieqiang Wang <jieqiang.w...@arm.com>
Subject: Re: [vpp-dev] Create big tables on huge-page


I started working on patch which addresses most of this points, few weeks ago, 
but likely I will not have it completed for 20.09.
Even if it is completed, it is probably bad idea to merge it so late in the 
release process….

—
Damjan




On 23 Jul 2020, at 10:45, Lijian Zhang <lijian.zh...@arm.com> wrote:

Hi Maintainers,
From VPP source code, ip4-mtrie table is created on huge-page only when below 
parameters are set in configuration file.
While adjacency table is created on normal-page always.
ip {
  heap-size 256M
  mtrie-hugetlb
}
In the 10K flow testing, I configured 10K routing entries in ip4-mtrie and 10K 
entries in adjacency table.
By creating ip4-mtrie table on 1G huge-page with above parameters set and 
similarly create adjacency table on 1G huge-page, although I don’t observe 
obvious throughput performance improvement, but TLB misses are dramatically 
reduced.
Do you think configuration of 10K routing entries + 10K adjacency entries is a 
reasonable and possible config, or normally it would be 10K routing entries + 
only several adjacency entries?
Does it make sense to create adjacency table on huge-pages?
Another problem is although above assigned heap-size is 256M, but on 1G 
huge-page system, it seems to occupy a huge-page completely, other memory space 
within that huge-page seems will not be used by other tables.

Same as the bihash based tables, only 2M huge-page system is supported. To 
support creating bihash based tables on 1G huge-page system, each table will 
occupy a 1G huge-page completely, but that will waste a lot of memories.
Is it possible just like pmalloc module, reserving a big memory space on 1G/2M 
huge-pages in initialization stage, and then allocate memory pieces per 
requirement for Bihash, ip4-mtrie and adjacency tables, so that all tables 
could be created on huge-pages and will fully utilize the huge-pages.
I tried to create MAC table on 1G huge-page, and it does improve throughput 
performance.
vpp# show bihash
Name Actual Configured
GBP Endpoints - MAC/BD   1m 1m
b4s 64m 64m
b4s 64m 64m
in2out   10.12m 10.12m
in2out   10.12m 10.12m
ip4-dr   2m 2m
ip4-dr   2m 2m
ip6 FIB fwding table32m 32m
ip6 FIB non-fwding table32m 32m
ip6 mFIB table  32m 32m
l2fib mac table512m 512m
mapping_by_as4  64m 64m
out2in 128m 128m
out2in 128m 128m
out2in   10.12m 10.12m
out2in   10.12m 10.12m
pppoe link table 8m 8m
pppoe session table  8m 8m
static_mapping_by_external  64m 64m
static_mapping_by_local 64m 64m
stn addresses1m 1m
users  648k 648k
users  648k 648k
vip_index_per_port  64m 64m
vxlan4   1m 1m
vxlan4-gbp   1m 1m
Total 1.28g 1.28g

Thanks.
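
As an illustration of the up-front reservation idea raised above, a minimal sketch of grabbing one 1G huge page at initialisation so that several tables could later be carved out of it (the function name is illustrative, 1G pages must already be reserved by the kernel, and this is not the existing pmalloc code):

#include <stdio.h>
#include <sys/mman.h>

#ifndef MAP_HUGE_SHIFT
#define MAP_HUGE_SHIFT 26
#endif
#ifndef MAP_HUGE_1GB
#define MAP_HUGE_1GB (30 << MAP_HUGE_SHIFT)
#endif

/* Hedged sketch: reserve one 1G huge page at init time; bihash, ip4-mtrie
 * and adjacency heaps could then be sub-allocated from 'base' instead of
 * each table occupying a whole huge page. */
static void *
reserve_1g_hugepage (void)
{
  void *base = mmap (NULL, 1ULL << 30, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB | MAP_HUGE_1GB,
                     -1, 0);
  if (base == MAP_FAILED)
    {
      perror ("mmap (1G hugepage)");
      return NULL;
    }
  return base;
}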





Re: [vpp-dev] Create big tables on huge-page

2020-07-23 Thread Honnappa Nagarahalli
Hi Damjan,
Thank you. Till your patch is ready, would you accept patches 
that would enable creating these tables in 1G huge pages as a temporary solution?

Thanks,
Honnappa

From: Damjan Marion 
Sent: Thursday, July 23, 2020 7:15 AM
To: Lijian Zhang 
Cc: vpp-dev ; nd ; Honnappa Nagarahalli 
; Govindarajan Mohandoss 
; Jieqiang Wang 
Subject: Re: [vpp-dev] Create big tables on huge-page


I started working on patch which addresses most of this points, few weeks ago, 
but likely I will not have it completed for 20.09.
Even if it is completed, it is probably bad idea to merge it so late in the 
release process….

—
Damjan



On 23 Jul 2020, at 10:45, Lijian Zhang <lijian.zh...@arm.com> wrote:

Hi Maintainers,
From VPP source code, ip4-mtrie table is created on huge-page only when below 
parameters are set in configuration file.
While adjacency table is created on normal-page always.
ip {
  heap-size 256M
  mtrie-hugetlb
}
In the 10K flow testing, I configured 10K routing entries in ip4-mtrie and 10K 
entries in adjacency table.
By creating ip4-mtrie table on 1G huge-page with above parameters set and 
similarly create adjacency table on 1G huge-page, although I don’t observe 
obvious throughput performance improvement, but TLB misses are dramatically 
reduced.
Do you think configuration of 10K routing entries + 10K adjacency entries is a 
reasonable and possible config, or normally it would be 10K routing entries + 
only several adjacency entries?
Does it make sense to create adjacency table on huge-pages?
Another problem is although above assigned heap-size is 256M, but on 1G 
huge-page system, it seems to occupy a huge-page completely, other memory space 
within that huge-page seems will not be used by other tables.

Same as the bihash based tables, only 2M huge-page system is supported. To 
support creating bihash based tables on 1G huge-page system, each table will 
occupy a 1G huge-page completely, but that will waste a lot of memories.
Is it possible just like pmalloc module, reserving a big memory space on 1G/2M 
huge-pages in initialization stage, and then allocate memory pieces per 
requirement for Bihash, ip4-mtrie and adjacency tables, so that all tables 
could be created on huge-pages and will fully utilize the huge-pages.
I tried to create MAC table on 1G huge-page, and it does improve throughput 
performance.
vpp# show bihash
Name Actual Configured
GBP Endpoints - MAC/BD   1m 1m
b4s 64m 64m
b4s 64m 64m
in2out   10.12m 10.12m
in2out   10.12m 10.12m
ip4-dr   2m 2m
ip4-dr   2m 2m
ip6 FIB fwding table32m 32m
ip6 FIB non-fwding table32m 32m
ip6 mFIB table  32m 32m
l2fib mac table512m 512m
mapping_by_as4  64m 64m
out2in 128m 128m
out2in 128m 128m
out2in   10.12m 10.12m
out2in   10.12m 10.12m
pppoe link table 8m 8m
pppoe session table  8m 8m
static_mapping_by_external  64m 64m
static_mapping_by_local 64m 64m
stn addresses1m 1m
users  648k 648k
users  648k 648k
vip_index_per_port  64m 64m
vxlan4   1m 1m
vxlan4-gbp   1m 1m
Total 1.28g 1.28g

Thanks.




Re: [vpp-dev] Q: how best to avoid locking for cleanup.

2020-02-28 Thread Honnappa Nagarahalli


> Subject: Re: [vpp-dev] Q: how best to avoid locking for cleanup.
> 
> On 2/28/20, Honnappa Nagarahalli  wrote:
> 
> >> On the other hand, if you do modify shared data structures in the
> >> datapath, you are on your own - you need to take care of the data
> >> consistency.
> >> Again, the way we usually deal with that is to do a "rpc" to the main
> >> thread - then the main thread can request the worker barrier, etc.
> >>
> >> Or do you refer to other situations?
> > I was looking at the bi-hash library on a standalone basis. The
> > entries are deleted and buckets are freed without any synchronization
> > between the writer
> 
> FWIW, the above statement is incorrect if all we are talking is pure bihash
> operation with values that are not used as keys for subsequent memory
> accesses in other data structures. This code in
Yes, pure bihash operations, not interconnected to another data structure.

> include/vppinfra/bihash_template.c might be of interest:
> 
> static inline int BV (clib_bihash_add_del_inline)
>   (BVT (clib_bihash) * h, BVT (clib_bihash_kv) * add_v, int is_add,
>int (*is_stale_cb) (BVT (clib_bihash_kv) *, void *), void *arg) {
>   u32 bucket_index;
> 
> ...
> 
> 
>   BV (clib_bihash_lock_bucket) (b);   <- LOCK
> 
>   /* First elt in the bucket? */
>   if (BV (clib_bihash_bucket_is_empty) (b))
> {
>   if (is_add == 0)
> {
>   BV (clib_bihash_unlock_bucket) (b);
>   return (-1);
> }
> 
> 
> 
>   /* Move readers to a (locked) temp copy of the bucket */
>   BV (clib_bihash_alloc_lock) (h);<- LOCK
>   BV (make_working_copy) (h, b);
> 
> -
> 
> and so on.
Yes, those locks provide writer-writer concurrency. There is no issue here.

> 
> when performing the actual bihash operations as a user of the bihash, you
> most definitely do *not* need any extra locking, the bihash is doing it for 
> you
> behind the scenes.
Yes, if these operations are writer operations.

> 
> There is only one transient condition that I had seen - under intense
> add/delete workload, the readers in other threads may see the lookup
> successful but the value returned being ~0.
This is the reader-writer concurrency. This is the issue I am talking about. 
This can happen without an intense add/delete workload. The return value contains 
the key and value. So, the return value may be a mix of ~0 and previous values.

Also, while freeing the bucket on the writer, the readers might still be 
accessing the bucket.
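
(For readers of the archive: the transient condition discussed here, a lookup briefly returning an all-ones value while a concurrent add/delete is in flight, is usually handled on the reader side by treating ~0 as a miss. A minimal sketch with the 8_8 bihash flavour; the wrapper itself is illustrative, while the clib_bihash_* names are the existing vppinfra ones.)

#include <vppinfra/bihash_8_8.h>

/* Hedged sketch: reader-side lookup that treats the transient ~0 value
 * as a miss instead of consuming a half-updated entry. */
static inline int
lookup_checked (clib_bihash_8_8_t * h, u64 key, u64 * value)
{
  clib_bihash_kv_8_8_t kv, result;

  kv.key = key;
  if (clib_bihash_search_8_8 (h, &kv, &result) < 0)
    return -1;                  /* genuine miss */
  if (result.value == ~0ULL)
    return -1;                  /* entry concurrently being added/deleted */
  *value = result.value;
  return 0;
}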

> That is fairly easy to deal with.
Are you thinking of the same solution as Dave suggested?

> 
> But of course there is a race in case you are using bihash to store secondary
> indices into your own data structures - if you are deleting a bihash entry,
> another thread may have *just* made a lookup, obtained the index, and is
> using that index, so for that part you do need to take care of by yourself,
> indeed.
> 
> --a
> 
> > and the data plane threads. If the synchronization is handled outside
> > of this library using 'worker barrier' it should be alright.
> >
> >>
> >> Best
> >> Ben
> >>
> >> > -Original Message-
> >> > From: vpp-dev@lists.fd.io  On Behalf Of
> >> > Honnappa Nagarahalli
> >> > Sent: jeudi 27 février 2020 17:51
> >> > To: cho...@chopps.org; vpp-dev@lists.fd.io; Honnappa Nagarahalli
> >> > 
> >> > Cc: nd 
> >> > Subject: Re: [vpp-dev] Q: how best to avoid locking for cleanup.
> >> >
> >> > I think there are similar issues in bi-hash (i.e. the entry could
> >> > be deleted from control plane while the data plane threads are
> >> > doing the lookup).
> >> >
> >> >
> >> >
> >> > Thanks,
> >> >
> >> > Honnappa
> >> >
> >> >
> >> >
> >> > From: vpp-dev@lists.fd.io  On Behalf Of
> >> > Christian Hopps via Lists.Fd.Io
> >> > Sent: Thursday, February 27, 2020 5:09 AM
> >> > To: vpp-dev 
> >> > Cc: vpp-dev@lists.fd.io
> >> > Subject: Re: [vpp-dev] Q: how best to avoid locking for cleanup.
> >> >
> >> >
> >> >
> >> > I received a private message indicating that one solution was to
> >> > just wait "long enough" for the packets to drain. This is the
> >> > method I'm going to go with as it's simple (albeit not as
> >> > deterministic as some marking/callback sche

Re: [vpp-dev] Q: how best to avoid locking for cleanup.

2020-02-27 Thread Honnappa Nagarahalli


> 
> Unless I misunderstand something, the usual way we deal with that is the
> worker barrier as mentioned by Neale.
> 
> API calls and CLI commands are executed under this barrier unless marked as
> mp_safe (which is off by default).
> When the worker barrier is requested by the main thread, all worker threads
> are drained and stopped. Then the critical section is executed, the barrier is
> released and workers resume.
> So, as long as the bihash delete (or any shared data non-atomic modification)
> happens under this barrier, you do not need to take care of workers being
> active: VPP is taking care of it for you 
Thank you, I was not aware of this. Wouldn't this result in packet drops on 
the data plane if the critical section is big? For example, searching through the 
bi-hash table for a particular entry to delete.

> 
> On the other hand, if you do modify shared data structures in the datapath,
> you are on your own - you need to take care of the data consistency.
> Again, the way we usually deal with that is to do a "rpc" to the main thread -
> then the main thread can request the worker barrier, etc.
> 
> Or do you refer to other situations?
I was looking at the bi-hash library on a standalone basis. The entries are 
deleted and buckets are freed without any synchronization between the writer 
and the data plane threads. If the synchronization is handled outside of this 
library using 'worker barrier' it should be alright.
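
To make that pattern concrete, a minimal sketch of doing the non-MP-safe bihash modification on the main thread under the worker barrier (the barrier and bihash calls are existing vlib/vppinfra APIs; the surrounding function and the batch of keys are illustrative). Keeping the section under the barrier short limits how long workers, and therefore packets, are held up.

#include <vlib/vlib.h>
#include <vppinfra/bihash_8_8.h>

/* Hedged sketch: main-thread code deleting bihash entries while all
 * worker threads are parked at the barrier. */
static void
delete_entries_under_barrier (vlib_main_t * vm, clib_bihash_8_8_t * h,
                              clib_bihash_kv_8_8_t * kvs, u32 n_kvs)
{
  u32 i;

  vlib_worker_thread_barrier_sync (vm);    /* drain and stop the workers */
  for (i = 0; i < n_kvs; i++)
    clib_bihash_add_del_8_8 (h, &kvs[i], 0 /* is_add */);
  vlib_worker_thread_barrier_release (vm); /* let the workers resume */
}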

> 
> Best
> Ben
> 
> > -Original Message-
> > From: vpp-dev@lists.fd.io  On Behalf Of Honnappa
> > Nagarahalli
> > Sent: jeudi 27 février 2020 17:51
> > To: cho...@chopps.org; vpp-dev@lists.fd.io; Honnappa Nagarahalli
> > 
> > Cc: nd 
> > Subject: Re: [vpp-dev] Q: how best to avoid locking for cleanup.
> >
> > I think there are similar issues in bi-hash (i.e. the entry could be
> > deleted from control plane while the data plane threads are doing the
> > lookup).
> >
> >
> >
> > Thanks,
> >
> > Honnappa
> >
> >
> >
> > From: vpp-dev@lists.fd.io  On Behalf Of Christian
> > Hopps via Lists.Fd.Io
> > Sent: Thursday, February 27, 2020 5:09 AM
> > To: vpp-dev 
> > Cc: vpp-dev@lists.fd.io
> > Subject: Re: [vpp-dev] Q: how best to avoid locking for cleanup.
> >
> >
> >
> > I received a private message indicating that one solution was to just
> > wait "long enough" for the packets to drain. This is the method I'm
> > going to go with as it's simple (albeit not as deterministic as some
> > marking/callback scheme :)
> >
> > For my case I think I can wait ridiculously long for "long enough" and
> > just have a process do garbage collection after a full second.
> >
> > I do wonder how many other cases of "state associated with in-flight
> > packets" there might be, and if more sophisticated general solution
> > might be useful.
> >
> > Thanks,
> > Chris.
> >
> > > On Feb 25, 2020, at 6:27 PM, Christian Hopps <cho...@chopps.org> wrote:
> > >
> > > I've got a (hopefully) interesting problem with locking in VPP.
> > >
> > > I need to add some cleanup code to my IPTFS additions to ipsec.
> > Basically I have some per-SA queues that I need to cleanup when an SA
> > is deleted.
> > >
> > > - ipsec only deletes its SAs when its "fib_node" locks reach zero.
> > > - I'm hoping this means that ipsec will only be deleting the SA after
> > > the
> > FIB has stopped injecting packets "along" this SA path (i.e., it's
> > removed prior to the final unlock/deref).
> > > - I'm being called back by ipsec during the SA deletion.
> > > - I have queues (one RX for reordering, one TX for aggregation and
> > subsequent output) associated with the SA, both containing locks, that
> > need to be emptied and freed.
> > > - These queues are being used in multiple worker threads in various
> > graph nodes in parallel.
> > >
> > > What I think this means is that when my "SA deleted" callback is
> > > called,
> > no *new* packets will be delivered on the SA path. Good so far.
> > >
> > > What I'm concerned with is the packets that may currently be "in-flight"
> > in the graph, as these will have the SA associated with them, and thus
> > my code may try and use the per SA queues which I'm now trying to delete.
> > >
> > > There's a somewhat clunky solution involving global locks prior to
> > > and
> > after using an SA in each node, t

Re: [vpp-dev] Q: how best to avoid locking for cleanup.

2020-02-27 Thread Honnappa Nagarahalli
I think there are similar issues in bi-hash (i.e. the entry could be deleted 
from control plane while the data plane threads are doing the lookup).

Thanks,
Honnappa

From: vpp-dev@lists.fd.io  On Behalf Of Christian Hopps 
via Lists.Fd.Io
Sent: Thursday, February 27, 2020 5:09 AM
To: vpp-dev 
Cc: vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] Q: how best to avoid locking for cleanup.

I received a private message indicating that one solution was to just wait 
"long enough" for the packets to drain. This is the method I'm going to go with 
as it's simple (albeit not as deterministic as some marking/callback scheme :)

For my case I think I can wait ridiculously long for "long enough" and just 
have a process do garbage collection after a full second.

I do wonder how many other cases of "state associated with in-flight packets" 
there might be, and if more sophisticated general solution might be useful.

Thanks,
Chris.

> On Feb 25, 2020, at 6:27 PM, Christian Hopps <cho...@chopps.org> wrote:
>
> I've got a (hopefully) interesting problem with locking in VPP.
>
> I need to add some cleanup code to my IPTFS additions to ipsec. Basically I 
> have some per-SA queues that I need to cleanup when an SA is deleted.
>
> - ipsec only deletes its SAs when its "fib_node" locks reach zero.
> - I'm hoping this means that ipsec will only be deleting the SA after the FIB 
> has stopped injecting packets "along" this SA path (i.e., it's removed prior 
> to the final unlock/deref).
> - I'm being called back by ipsec during the SA deletion.
> - I have queues (one RX for reordering, one TX for aggregation and subsequent 
> output) associated with the SA, both containing locks, that need to be 
> emptied and freed.
> - These queues are being used in multiple worker threads in various graph 
> nodes in parallel.
>
> What I think this means is that when my "SA deleted" callback is called, no 
> *new* packets will be delivered on the SA path. Good so far.
>
> What I'm concerned with is the packets that may currently be "in-flight" in 
> the graph, as these will have the SA associated with them, and thus my code 
> may try and use the per SA queues which I'm now trying to delete.
>
> There's a somewhat clunky solution involving global locks prior to and after 
> using an SA in each node, tracking its validity (which has its own issues), 
> freeing when no longer in use etc., but this would introduce global locking 
> in the packet path which I'm loath to do.
>
> What I'd really like is if there was something like this:
>
> - packet ingress to SA fib node, fib node lock count increment.
> - packet completes it's journey through the VPP graph (or at least my part of 
> it) and decrements that fib node lock count.
> - when the SA should be deleted it removes it's fib node from the fib, thus 
> preventing new packets entering the graph then unlocks.
> - the SA is either immediately deleted (no packets in flight), or deleted 
> when the last packet completes it's graph traversal.
>
> I could do something like this inside my own nodes (my first node is point 
> B), but then there's still the race between when the fib node is used to 
> inject the packet to the next node in the graph (point A) and that packet 
> arriving at my first IPTFS node (point B), when the SA deletion could occur. 
> Maybe i could modify the fib code to do this at point A. I haven't looked 
> closely at the fib code yet.
>
> Anyway I suspect this has been thought about before, and maybe there's even a 
> solution already present in VPP, so I wanted to ask. :)
>
> Thanks,
> Chris.
>
>
>



Re: [vpp-dev] RFC: FD.io Summit (Userspace), September, Bordeaux France

2020-02-24 Thread Honnappa Nagarahalli


> 
> Hi folks,
> 
> A 2020 FD.io event is something that has been discussed a number of times
> recently at the FD.io TSC.
> With the possibility of co-locating such an event with DPDK Userspace, in
> Bordeaux, in September.
> 
> Clearly, we are incredibly eager to make sure that such an event would be a
> success.
> That FD.io users and contributors would attend, and get value out of the
> event.
> (it is a ton of work for those involved - we want people to benefit)
> 
> The likelihood is that this would be the only FD.io event of this kind in 
> 2020.
> 
> So instead of speculating, it is better to ask a direct question to the
> community and ask for honest feedback.
> How does the community feel about such an event at DPDK Userspace:-
> 
> * Do they value co-locating with DPDK Userspace?
> * Are they likely to attend?
IMO, this is valuable and would definitely be helpful to solve problems across 
the aisle
I would attend.

> 
> Thanks,
> 
> Ray K
> FD.io TSC



Re: [vpp-dev] arm hardware recommendation

2019-12-22 Thread Honnappa Nagarahalli
There are platforms from raspberry pi to AWS instances. Depends on your needs.

Going with the server platforms will give you ‘works-out-of-the-box’ experience 
(will be able to run upstream distros and packages), but they are servers and 
will cost accordingly. You can compile natively and run natively. These have 
enough cycles to run multiple CI instances as well. Depending on what you want 
to run in CI, you might be able to use Travis CI (it supports LXD containers 
for now). Packet hosts Arm servers too. AWS instance uses the N1 core from Arm 
and has good performance. AWS also has an instance that uses A72 cores.

There are development platforms from Marvell (Octeon TX) and Mellanox 
(Bluefield), but you might have to enquire with them.

Thanks,
Honnappa

From: vpp-dev@lists.fd.io  On Behalf Of Damjan Marion via 
Lists.Fd.Io
Sent: Sunday, December 22, 2019 6:44 PM
To: Paul Vinciguerra 
Cc: vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] arm hardware recommendation

that board seems to be a dead end; after Marvell bought Cavium it looks like they 
stopped open-source activities around Armada, so for example musdk has not been 
updated for more than a year. NXP seems to be a better choice with the LX2160, 16 
cores. I have the ClearFog CX LX2K, which is a more expensive version of the board you 
found as it has 100G and 4x10G ports but the same CPU.

Another option is just to use AWS arm instances, probably cheaper and faster if 
you just want to run tests

--
Damjan


On 23 Dec 2019, at 01:35, Paul Vinciguerra <pvi...@vinciconsulting.com> wrote:

Have you used it?

How long does it take to build vpp and run the tests?

On Sun, Dec 22, 2019 at 7:23 PM Jim Thompson <j...@netgate.com> wrote:
https://shop.solid-run.com/product/SRM8040S00D16GE008S00CH/


On Dec 22, 2019, at 2:50 PM, Paul Vinciguerra <pvi...@vinciconsulting.com> wrote:

Hi Damian,

I looked at the lspci on the arm tests in the CI, and saw that we are running 
against thunderX/thunderX2.
I requested a quote for a thunderX2 workstation. ;)

Right now, I'm running experiments using the CI, but the feedback loop is too 
long.
I was looking at this [0] because it has 16 cores and that is how we are 
running the arm tests in the CI at the moment.

Instead of what's my budget, I'm thinking more along the lines of what will 
give me the greatest return for the investment.

[0] 
https://shop.solid-run.com/product-category/embedded-computers/nxp-family/honeycomb/

On Sun, Dec 22, 2019 at 3:06 PM Damjan Marion <dmar...@me.com> wrote:


> On 22 Dec 2019, at 01:02, Paul Vinciguerra <pvi...@vinciconsulting.com> wrote:
>
> I'm looking for something that I can run tests on an arm platform in a 
> reasonable timeframe.
>
> Any suggestions?

What is your budget?



Re: [vpp-dev] efficient use of DPDK

2019-12-05 Thread Honnappa Nagarahalli


> >
> > Actually native drivers (like Mellanox or AVF) can be faster w/o buffer
> > conversion and tend to be faster than when used by DPDK. I suspect VPP is
> > not the only project to report this extra cost.
> It would be good to know other projects that report this extra cost. It
> will help support changes to DPDK.
> [JT] I may be wrong but I think there was a presentation about that last week
> during DPDK user conf in the US.
That was from me using VPP as an example.  I am trying to explore solutions in 
DPDK to bridge the gap between native drivers and DPDK, assuming such 
situations exist in other applications (which could be private) too.


Re: [vpp-dev] efficient use of DPDK

2019-12-05 Thread Honnappa Nagarahalli


> -Original Message-
> From: vpp-dev@lists.fd.io  On Behalf Of Jerome Tollet via
> Lists.Fd.Io
> Sent: Wednesday, December 4, 2019 9:33 AM
> To: tho...@monjalon.net
> Cc: vpp-dev@lists.fd.io
> Subject: Re: [vpp-dev] efficient use of DPDK
>
> Actually native drivers (like Mellanox or AVF) can be faster w/o buffer
> conversion and tend to be faster than when used by DPDK. I suspect VPP is not
> the only project to report this extra cost.
It would be good to know other projects that report this extra cost. It will 
help support changes to DPDK.

> Jerome
>
> On 04/12/2019 15:43, Thomas Monjalon wrote:
>
> 03/12/2019 22:11, Jerome Tollet (jtollet):
> > Thomas,
> > I am afraid you may be missing the point. VPP is a framework where plugins
> > are first class citizens. If a plugin requires leveraging offload (inline or
> > lookaside), it is more than welcome to do it.
> > There are multiple examples including hw crypto accelerators
> > (https://software.intel.com/en-us/articles/get-started-with-ipsec-acceleration-in-the-fdio-vpp-project).
>
> OK I understand plugins are open.
> My point was about the efficiency of the plugins,
> given the need for buffer conversion.
> If some plugins are already efficient enough, great:
> it means there is no need for bringing effort in native VPP drivers.
>
>
> > On 03/12/2019 17:07, vpp-dev@lists.fd.io on behalf of Thomas Monjalon wrote:
> >
> > 03/12/2019 13:12, Damjan Marion:
> > > > On 3 Dec 2019, at 09:28, Thomas Monjalon wrote:
> > > > 03/12/2019 00:26, Damjan Marion:
> > > >>
> > > >> Hi Thomas!
> > > >>
> > > >> Inline...
> > > >>
> > > >>>> On 2 Dec 2019, at 23:35, Thomas Monjalon wrote:
> > > >>>
> > > >>> Hi all,
> > > >>>
> > > >>> VPP has a buffer called vlib_buffer_t, while DPDK has rte_mbuf.
> > > >>> Are there some benchmarks about the cost of converting, from one
> > > >>> format to the other one, during Rx/Tx operations?
> > > >>
> > > >> We are benchmarking both dpdk i40e PMD performance and native
> > > >> VPP AVF driver performance and we are seeing significantly better
> > > >> performance with native AVF.
> > > >> If you take a look at [1] you will see that the DPDK i40e driver provides
> > > >> 18.62 Mpps and exactly the same test with the native AVF driver gives us
> > > >> around 24.86 Mpps.
> > [...]
> > > >>
> > > >>> So why not improve DPDK integration in VPP to make it faster?
> > > >>
> > > >> Yes, if we can get the freedom to use the parts of DPDK we want instead of
> > > >> being forced to adopt the whole DPDK ecosystem.
> > > >> For example, you cannot use dpdk drivers without using EAL,
> > > >> mempool, rte_mbuf... rte_eal_init is a monster which I have been hoping
> > > >> would disappear for a long time...
> > > >
> > > > You could help to improve these parts of DPDK,
> > > > instead of spending time trying to implement a few drivers.
> > > > Then VPP would benefit from a rich driver ecosystem.
> > >
> > > Thank you for letting me know what could be a better use of my time.
> >
> > "You" was referring to VPP developers.
> > I think some other Cisco developers are also contributing to VPP.
> >
> > > At the moment we have good coverage of native drivers, and there is still
> > > an option for people to use dpdk. It is now mainly up to driver vendors to
> > > decide if they are happy with the performance they will get from the dpdk
> > > pmd or they want better...
> >
> > Yes, it is possible to use DPDK in VPP with degraded performance.
> > If a user wants the best performance with VPP and a real NIC,
> > a new driver must be implemented for VPP only.
> >
> > Anyway real performance benefits are in hardware device offloads
> > which will be hard to implement in VPP native drivers.
> > Support (investment) would be needed from vendors to make it happen.
> > About offloads, VPP is not using crypto or compression drivers
> > that DPDK provides (plus regex coming).
> >
> > VPP is a CPU-based packet processing software.
> > If users want to leverage hardware device offloads,
> > a truly DPDK-based software is required.
> > If I understand your replies correctly, such software cannot be VPP.
>
>
>
>


Re: [vpp-dev] efficient use of DPDK

2019-12-02 Thread Honnappa Nagarahalli
Thanks for bringing up the discussion

> -Original Message-
> From: vpp-dev@lists.fd.io  On Behalf Of Thomas
> Monjalon via Lists.Fd.Io
> Sent: Monday, December 2, 2019 4:35 PM
> To: vpp-dev@lists.fd.io
> Cc: vpp-dev@lists.fd.io
> Subject: [vpp-dev] efficient use of DPDK
> 
> Hi all,
> 
> VPP has a buffer called vlib_buffer_t, while DPDK has rte_mbuf.
> Are there some benchmarks about the cost of converting, from one format to
> the other one, during Rx/Tx operations?
> 
> I'm sure there would be some benefits of switching VPP to natively use the
> DPDK mbuf allocated in mempools.
> What would be the drawbacks?
> 
> Last time I asked this question, the answer was about compatibility with
> other driver backends, especially ODP. What happened?
I think the ODP4VPP project was closed some time back. I do not know of anyone 
working on it anymore.

> DPDK drivers are still the only external drivers used by VPP?
> 
> When using DPDK, more than 40 networking drivers are available:
>   https://core.dpdk.org/supported/
> After 4 years of Open Source VPP, there are less than 10 native drivers:
>   - virtual drivers: virtio, vmxnet3, af_packet, netmap, memif
>   - hardware drivers: ixge, avf, pp2
> And if looking at ixge driver, we can read:
> "
>   This driver is not intended for production use and it is unsupported.
>   It is provided for educational use only.
>   Please use supported DPDK driver instead.
> "
> 
> So why not improving DPDK integration in VPP to make it faster?
> 
> DPDK mbuf has dynamic fields now; it can help to register metadata on
> demand.
> And it is still possible to statically reserve some extra space for
> application-specific metadata in each packet.
> 
> Other improvements, like meson packaging usable with pkg-config, were
> done during last years and may deserve to be considered.
> 
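
To make the dynamic-field point above concrete, registering application metadata 
could look roughly like the sketch below (assuming DPDK 19.11 or newer; the field 
name and the app_meta layout are made up for illustration, they are not anything 
VPP or DPDK defines):

#include <stdint.h>
#include <rte_mbuf.h>
#include <rte_mbuf_dyn.h>

/* Illustrative per-packet application metadata. */
struct app_meta {
        uint32_t flow_id;
        uint16_t next_index;
};

static int app_meta_offset = -1;

static int
app_meta_register(void)
{
        static const struct rte_mbuf_dynfield desc = {
                .name = "example_app_meta",          /* made-up field name */
                .size = sizeof(struct app_meta),
                .align = __alignof__(struct app_meta),
        };

        /* Reserve room for the field inside rte_mbuf (or reuse an existing
         * registration with the same name and layout). */
        app_meta_offset = rte_mbuf_dynfield_register(&desc);
        return (app_meta_offset < 0) ? -1 : 0;
}

/* Accessor: returns a pointer to this packet's metadata. */
static inline struct app_meta *
app_meta(struct rte_mbuf *m)
{
        return RTE_MBUF_DYNFIELD(m, app_meta_offset, struct app_meta *);
}

The registration returns -1 if the remaining dynamic area in rte_mbuf is too 
small, so callers have to check the offset before using the accessor.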



Re: [vpp-dev] DPDK development process and tools survey

2019-03-14 Thread Honnappa Nagarahalli
The survey link is: 
https://forms.office.com/Pages/ResponsePage.aspx?id=eVlO89lXqkqtTbEipmIYTcwgJ8psxytOnArCkHeSZSZUREdIN09QOEVRSUJWN0I2TzNYUTk5STVJRC4u



Thanks,

Honnappa



> -Original Message-
> From: Honnappa Nagarahalli
> Sent: Thursday, March 14, 2019 10:58 PM
> To: vpp-dev 
> Cc: tho...@monjalon.net; nd 
> Subject: FW: DPDK development process and tools survey
>
> DPDK community is trying to improve DPDK's development process. We are
> conducting a survey to understand the pain points. The survey itself takes no
> more than 10mns. If you have worked with the community and have
> feedback, please consider taking the survey.
>
> The survey is open till 23rd March 2019.
>
> Thank you,
> Honnappa
>
> > > -Original Message-
> > > From: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
> > > Sent: Wednesday, February 27, 2019 3:08 PM
> > > To: annou...@dpdk.org
> > > Cc: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>; nd <n...@arm.com>
> > > Subject: DPDK development process and tools survey
> > >
> > > Hello,
> > >  There have been questions/comments in the past DPDK summits on
> > > improving the development process and the tools being used. This
> > > survey is being conducted to better understand the pain points and
> > > arrive at a set of tools to use going forward.
> > >
> > > The survey itself will be done in 2 stages.
> > > 1) Understand the problems faced by the community (this survey)
> > > 2) *If required*, another survey to choose from available solutions
> > >
> > > The survey itself does not take more than 10mns. It is for all of us
> > > in the community to improve the way we contribute, I highly
> > > encourage you to take the survey.
> > >
> > > The survey is open till 13th March 2019, 6:00PM CST.
> > >
> > > Thank you,
> > > Honnappa
> > >
> > > Survey Link:
> > > https://forms.office.com/Pages/ResponsePage.aspx?id=eVlO89lXqkqtTbEipmIYTcwgJ8psxytOnArCkHeSZSZUREdIN09QOEVRSUJWN0I2TzNYUTk5STVJRC4u


[vpp-dev] FW: DPDK development process and tools survey

2019-03-14 Thread Honnappa Nagarahalli
The DPDK community is trying to improve DPDK's development process. We are 
conducting a survey to understand the pain points. The survey itself takes no 
more than 10 minutes. If you have worked with the community and have feedback, 
please consider taking the survey.

The survey is open till 23rd March 2019.

Thank you,
Honnappa
 
> > -Original Message-
> > From: Honnappa Nagarahalli 
> > Sent: Wednesday, February 27, 2019 3:08 PM
> > To: annou...@dpdk.org
> > Cc: Honnappa Nagarahalli ; nd 
> > 
> > Subject: DPDK development process and tools survey
> >
> > Hello,
> > There have been questions/comments in the past DPDK summits on 
> > improving the development process and the tools being used. This 
> > survey is being conducted to better understand the pain points and 
> > arrive at a set of tools to use going forward.
> >
> > The survey itself will be done in 2 stages.
> > 1) Understand the problems faced by the community (this survey)
> > 2) *If required*, another survey to choose from available solutions
> >
> > The survey itself does not take more than 10mns. It is for all of us 
> > in the community to improve the way we contribute, I highly 
> > encourage you to take the survey.
> >
> > The survey is open till 13th March 2019, 6:00PM CST.
> >
> > Thank you,
> > Honnappa
> >
> > Survey Link:
> >
> > https://forms.office.com/Pages/ResponsePage.aspx?id=eVlO89lXqkqtTbEipmIYTcwgJ8psxytOnArCkHeSZSZUREdIN09QOEVRSUJWN0I2TzNYUTk5STVJRC4u


Re: [vpp-dev] 128 byte cache line support

2019-03-14 Thread Honnappa Nagarahalli


Related to change 18278 [1], I was wondering if there is really a benefit in 
dealing with 128-byte cachelines the way we do today.
Compiling VPP with the cacheline size set to 128 will basically just add 64 bytes 
of unused space at the end of each cacheline, so
vlib_buffer_t for example will grow from 128 bytes to 256 bytes, but we will 
still need to prefetch 2 cachelines like we do by default.

What will happen if we just leave that at 64?
[Honnappa] Currently, ThunderX1 and Octeon TX have a 128B cache line. What I have 
heard from Marvell folks is that the 64B cache line setting in DPDK does not work. I have 
not gone into the details of what exactly does not work. Maybe Nitin can elaborate.

1. sometimes (and not very frequently) we will issue 2 prefetch instructions 
for the same cacheline, but I hope the hardware is smart enough to just ignore the 2nd one

2. we may face false sharing issues if the first 64 bytes are touched by one thread 
and the other 64 bytes are touched by another one

The second one sounds to me like a real problem, but it can be solved by aligning 
all per-thread data structures to 2 x cacheline size.
[Honnappa] Sorry, I don’t understand you here. Even if the data structure is 
aligned on 128B (2 x 64B), 2 contiguous blocks of 64B data would be on a single 
cache line.
Actually, if I remember correctly, even on x86 some of the hardware prefetchers 
deal with blocks of 2 cachelines.

So unless I missed something, my proposal here is: instead of maintaining 
special 128-byte images for some ARM64 machines,
let’s just align all per-thread data structures to 128 and have just one ARM 
image.
[Honnappa] When we run VPP compiled with a 128B cache line size on platforms with a 
64B cache line size, there is a performance degradation. Hence the proposal is 
to make sure the distro packages run on all platforms. But one can get the best 
performance when compiled for a particular target.
Thoughts?

--
Damjan


[1] https://gerrit.fd.io/r/#/c/18278/
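
To illustrate the "2 x cacheline size" idea in plain C, a rough sketch follows 
(not actual VPP code; VPP has CLIB_CACHE_LINE_BYTES and the 
CLIB_CACHE_LINE_ALIGN_MARK() macro for this purpose, and the names below are 
made up):

#include <stdint.h>

/* Pad and align per-thread state to 128 bytes so that data owned by two
 * different worker threads never lands in the same cache line, whether the
 * CPU uses 64B or 128B cache lines. On 64B parts this only costs memory. */
#define MAX_CACHE_LINE_BYTES 128

typedef struct
{
  uint64_t rx_packets;
  uint64_t rx_bytes;
  uint64_t drops;
  /* pad the structure out to a full 128B block */
  uint8_t pad[MAX_CACHE_LINE_BYTES - 3 * sizeof (uint64_t)];
} __attribute__ ((aligned (MAX_CACHE_LINE_BYTES))) per_thread_counters_t;

/* One entry per worker; each entry starts on its own 128B boundary. */
static per_thread_counters_t counters[8];

Built this way, a single image works on both 64B and 128B cacheline machines; 
the trade-off is the extra padding, which is what the distro-package versus 
per-target-compile discussion above is about.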


Re: [vpp-dev] VPP 19.04 Program Plan Deliverables

2019-02-27 Thread Honnappa Nagarahalli
Hi Dave,
   Is there any deadline for proposing the deliverables? (F0?)

Thank you,
Honnappa


From: vpp-dev@lists.fd.io  On Behalf Of Dave Wallace
Sent: Monday, February 25, 2019 6:47 AM
To: vpp-dev@lists.fd.io
Subject: [vpp-dev] VPP 19.04 Program Plan Deliverables

Folks,

Please add the deliverables that you are planning to develop for the VPP 19.04 
release to the "Deliverables" section of the VPP 19.04 Program Plan.

https://wiki.fd.io/view/Projects/vpp/Release_Plans/Release_Plan_19.04#Release_Deliverables

Thanks,
-daw-
"Your Friendly VPP 19.04 Program Manager"


Re: [vpp-dev] VPP and RCU?

2018-10-30 Thread Honnappa Nagarahalli


> > >
> > >> Hi Stephen,
> > >>
> > >> No, we don’t support RCU. Wouldn’t rw-locks be enough to support your
> > >> usecases?
> > >>
> > >> Florin
> > >>
> > >>> On Oct 29, 2018, at 12:40 PM, Stephen Hemminger
>  wrote:
> > >>>
> > >>> Is it possible to do Read Copy Update with VPP? Either using
> > >>> Userspace RCU (https://librcu.org) or manually. RCU is a very
> > >>> efficient way to handle read-mostly tables and other dynamic cases
> > >>> such as plugins.
> > >>>
> > >>> The several things that are needed are non-preemption, atomic update
> > >>> of a pointer, and a mechanism to be sure all active threads have
> > >>> gone through a quiescent period. I don't think VPP will preempt
> > >>> one node for another, so that is done. The atomic update is
> > >>> relatively easy with basic barriers, either from FD.io, DPDK, or native
> > >>> compiler operations. But is there an API to have a quiescent period
> > >>> marker in the main VPP vector scheduler?
> > >>>
> > >>> Something like the QSBR model of the Userspace RCU library
> > >>> (each thread calls rcu_quiescent_state() periodically) would be
> > >>> ideal.
> > >>>
> > >>
> > >
> > > No, reader-writer locks are 100's of times slower. In fact reader-writer
> > > locks are slower than a normal spin lock.
> > >
> >
> > I guess you meant that in general, and I can see how for scenarios with
> > multiple writers and readers performance can be bad. But from your original
> > message I assumed you’re mostly interested in concurrent read performance
> > with few writes. For such scenarios I would expect our current, simple, spin
> > and rw lock implementations to not be that bad. If that’s provably not the
> > case, we should definitely consider doing RCU.
> >
> > Also worth noting that a common pattern in vpp is to have per-thread data
> > structures and then entirely avoid locking. For lookups we typically use the
> > bihash and that is thread safe.
When you say 'per thread data structures', does it mean the data structures 
will be duplicated for each data plane thread?
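
To illustrate what that usually means in VPP, here is a rough sketch (the my_* 
names are invented; vec_validate, vlib_get_thread_main and vlib_get_thread_index 
are the standard vppinfra/vlib helpers): the state is a vector with one element 
per main/worker thread, and each worker only ever touches its own element, so no 
locking is needed:

#include <vlib/vlib.h>

/* Hypothetical per-thread state: one element per VPP thread. */
typedef struct
{
  uword *index_by_key; /* hash owned exclusively by this thread */
  u64 lookups;
} my_per_thread_t;

static my_per_thread_t *my_per_thread; /* vector, one entry per thread */

static clib_error_t *
my_init (vlib_main_t * vm)
{
  vlib_thread_main_t *tm = vlib_get_thread_main ();

  /* size the vector to the number of VPP threads (main + workers) */
  vec_validate (my_per_thread, tm->n_vlib_mains - 1);
  return 0;
}

static inline my_per_thread_t *
my_get_per_thread (void)
{
  /* each worker indexes with its own thread index, so no locks are needed */
  return &my_per_thread[vlib_get_thread_index ()];
}

So yes, in this pattern the structure is duplicated, one element per data-plane 
thread; the saving is that no thread ever has to take a lock to reach its own copy.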

> >
> > Florin
> >
> >
> 
> https://www.researchgate.net/publication/247337469_RCU_vs_Locking_Perf
> ormance_on_Different_CPUs
> 
> https://lwn.net/Articles/263130/
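
For reference, the QSBR flavour discussed above would be wired into a worker 
loop roughly as follows (a sketch against the userspace-rcu urcu-qsbr API; the 
table, reader_loop and writer_update names are made up, and this is not 
something VPP currently ships):

#include <urcu-qsbr.h>   /* QSBR flavour of the userspace RCU library */
#include <stdlib.h>

struct table { int data; };
static struct table *lookup_table;   /* pointer published via RCU */
static volatile int stop;

/* Worker thread: reads the table, then declares a quiescent state once per
 * loop iteration (in VPP terms, e.g. once per frame/vector). */
static void *
reader_loop (void *arg)
{
  rcu_register_thread ();
  while (!stop)
    {
      struct table *t;

      rcu_read_lock ();                /* no-op in QSBR, kept for clarity */
      t = rcu_dereference (lookup_table);
      if (t)
        (void) t->data;                /* use the table */
      rcu_read_unlock ();

      rcu_quiescent_state ();          /* "I hold no RCU-protected references" */
    }
  rcu_unregister_thread ();
  return NULL;
}

/* Control-plane thread: publishes a new table and frees the old one only
 * after every registered reader has passed through a quiescent state. */
static void
writer_update (struct table *new_table)
{
  struct table *old = lookup_table;

  rcu_assign_pointer (lookup_table, new_table);
  synchronize_rcu ();                  /* waits for all readers */
  free (old);
}

The reader-side cost is just the per-loop rcu_quiescent_state() call, which is 
why QSBR tends to beat rw-locks for read-mostly tables; the price is that every 
worker has to call it regularly, or writers will block in synchronize_rcu().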