For C++, I believe you can provide your own allocation routine to the standard library's allocator-aware types (the Allocator named requirement, https://en.cppreference.com/w/cpp/named_req/Allocator.html, could be a starting point for researching this). But as you said, the functions that request memory allocation should most probably not run in the fast path.
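To make the allocator idea concrete, here is a minimal sketch of a type that meets the Allocator requirements and serves a std::vector from memory reserved up front. All names here (`Arena`, `ArenaAllocator`, `arena_demo`) are hypothetical and for illustration only; none of this is a VPP API, and the static buffer merely stands in for whatever heap or pool a plugin would reserve at init time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <vector>

// Hypothetical bump arena over a fixed buffer reserved before the fast path runs.
// Assumes `base` is aligned at least as strictly as any requested alignment.
struct Arena {
    std::byte* base;
    std::size_t size;
    std::size_t used = 0;

    void* allocate(std::size_t bytes, std::size_t align) {
        // Round the current offset up to the requested power-of-two alignment.
        std::size_t start = (used + align - 1) & ~(align - 1);
        if (start > size || bytes > size - start) throw std::bad_alloc();
        used = start + bytes;
        return base + start;
    }
};

// Minimal allocator: value_type, allocate, deallocate, rebinding constructor, equality.
template <class T>
struct ArenaAllocator {
    using value_type = T;
    Arena* arena;

    explicit ArenaAllocator(Arena* a) noexcept : arena(a) {}
    template <class U>
    ArenaAllocator(const ArenaAllocator<U>& other) noexcept : arena(other.arena) {}

    T* allocate(std::size_t n) {
        // Overflow of n * sizeof(T) is not checked in this sketch.
        return static_cast<T*>(arena->allocate(n * sizeof(T), alignof(T)));
    }
    // A bump arena never frees individual blocks; the whole arena is dropped at once.
    void deallocate(T*, std::size_t) noexcept {}
};

template <class T, class U>
bool operator==(const ArenaAllocator<T>& a, const ArenaAllocator<U>& b) noexcept {
    return a.arena == b.arena;
}
template <class T, class U>
bool operator!=(const ArenaAllocator<T>& a, const ArenaAllocator<U>& b) noexcept {
    return !(a == b);
}

// Fills a vector whose storage comes from the arena and returns the bytes consumed.
std::size_t arena_demo() {
    alignas(64) static std::byte storage[4096];
    Arena arena{storage, sizeof(storage)};
    ArenaAllocator<std::uint32_t> alloc(&arena);
    std::vector<std::uint32_t, ArenaAllocator<std::uint32_t>> v(alloc);
    v.reserve(256);  // the only allocation; do this outside the fast path
    for (std::uint32_t i = 0; i < 256; ++i) v.push_back(i);  // no reallocation here
    assert(v.size() == 256);
    return arena.used;
}
```

Calling `reserve` at init time keeps the later `push_back` calls allocation-free as long as the reserved capacity is not exceeded. Once the arena is exhausted, `allocate` throws `std::bad_alloc`, which is one more reason to size containers before entering the data path.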
Klement

> On 29 Dec 2025, at 12:12, Maxim Uvarov via lists.fd.io <[email protected]> wrote:
>
> Hello Rob,
>
> 1. In the first example, the direct use of bindgen names looks a little strange. I guess that
> could use a direct define in future.
>     let ip: *const ip4_header_t = b0.current_ptr_mut() as *const ip4_header_t;
>     if (*ip).__bindgen_anon_1.protocol == IP_PROTOCOL_ICMP {
>
> 2. How do Rust libraries use alloc/free for their common algorithms? Do they
> use VPP pools and buffers (huge tables, NUMA, cache alignment)? I am interested in how
> Rust or C++ can be integrated into VPP node implementations. Many standard
> algorithms can request memory allocation. Allocation is not expected in the
> runtime data path. Is that handled by your Rust port, or are there some
> limitations on usage?
>
> Thank you,
> Maxim.
>
> 27.11.2025, 15:32, [email protected]
>
> Results with node functions performing prefetching and processing four
> buffers at a time, with vector instructions emitted by both compilers
> in the update of next node indices:
> C: 1.08e1 clocks per packet
> Rust: 1.09e1 clocks per packet
>
> So again, approximately equal.
>
> The code for this:
> C:
> https://github.com/rshearman/vpp/blob/48ca99dc7079bd46b2bb37605ec3d62a44bbf58f/src/plugins/example-c/example_node.c
> Rust:
> https://github.com/rshearman/vpp-plugin-rs/blob/de41333b46b671b8c61182b6f38d0996fce8cf5e/vpp-example-plugin/src/lib.rs
>
> Median values from:
> $ grep example */show_runtime.txt
> c-1/show_runtime.txt:example-c   active   809723   28203452   0   1.08e1   34.83
> c-2/show_runtime.txt:example-c   active   816954   28795970   0   1.06e1   35.25
> c-3/show_runtime.txt:example-c   active   906334   28421093   0   1.16e1   31.36
> rust-1/show_runtime.txt:example  active   864643   28368780   0   1.09e1   32.81
> rust-2/show_runtime.txt:example  active   924042   29116994   0   1.13e1   31.51
> rust-3/show_runtime.txt:example  active   858871   29217427   0   1.07e1   34.02
>
> All of these runs were done while offering the highest packet rate the
> traffic generator could do (rather than NDR in the previous set of
> results).
>
> Thanks,
> Rob
>
> On Tue, 25 Nov 2025 at 14:52, Robert Shearman via lists.fd.io wrote:
> >
> > There is SIMD usage in vlib_get_buffers (and the Rust implementation
> > of the same functionality), so it does cover that base, but I can
> > enhance the plugins to perform prefetching.
> >
> > Thanks,
> > Rob
> >
> > On Tue, 25 Nov 2025 at 13:55, Damjan Marion via lists.fd.io wrote:
> > >
> > > Well, your C plugin is very basic, no prefetching, no instruction level
> > > parallelism, no SIMD usage so no surprises that numbers are close.
> > > Any chance you can try something involving those techniques?
> > >
> > > Thanks,
> > >
> > > Damjan
> > >
> > > > On 24.11.2025., at 21:22, Robert Shearman via lists.fd.io wrote:
> > > >
> > > > I've been able to capture some numbers for what I believe is a fair
> > > > apples-to-apples comparison of a C plugin versus a Rust plugin, and a
> > > > short summary is that they are approximately equal in performance.
> > > >
> > > > C: 1.22e1 clock cycles per packet
> > > > Rust: 1.19e1 clock cycles per packet
> > > >
> > > > Details:
> > > >
> > > > Rust plugin code (also using vpp-plugin crate from same git hash):
> > > > https://github.com/rshearman/vpp-plugin-rs/tree/15437cd8d848fd877dbb5858dec1e4ab853bbd42/vpp-example-plugin
> > > > C plugin code:
> > > > https://github.com/rshearman/vpp/blob/8ebcf29538b1145e376bdf7f8b405ca4822e9c24/src/plugins/example-c/example_node.c
> > > >
> > > > Obviously, the plugins here are about as basic as it comes but the
> > > > more basic the plugins the easier it is to perform an apples-to-apples
> > > > comparison and visually validate they are doing the same job.
> > > >
> > > > Rust compiler version:
> > > > rustc 1.91.0 (f8297e351 2025-10-28)
> > > >
> > > > C compiler version:
> > > > Ubuntu clang version 18.1.3 (1ubuntu1)
> > > > Target: x86_64-pc-linux-gnu
> > > > Thread model: posix
> > > > InstalledDir: /usr/bin
> > > >
> > > > The test setup is a VM with 8G of memory where TRex is running,
> > > > connected via two virtio interfaces to a second VM with 4G of memory
> > > > where vpp is run. Both VMs have 2 lcores pinned to set lcores on the
> > > > hypervisor. Both VMs are running Ubuntu 24.04.3, along with the
> > > > hypervisor. Given that this was a one-time performance test, I made no
> > > > attempt at doing CPU isolation in either the guest VMs or the
> > > > hypervisor, with the hope that the noise in the results that this
> > > > causes is acceptable.
> > > >
> > > > VPP code was built using "make pkg-deb" and then installed as packages
> > > > in the VM, with the only change to the configuration being to
> > > > configure `workers 1`. The Rust plugin was built using `cargo build
> > > > --release` and then the resulting .so file for the vpp-example-plugin
> > > > was copied into the expected location in the VM.
> > > >
> > > > IPv4 UDP 1500-byte packets are generated from TRex in an NDR test
> > > > (although the overhead from the example plugin turned out to be lost
> > > > in the noise), meaning that these packets don't match in the plugins
> > > > under test so the packets aren't dropped, but follow the next-feature
> > > > path.
> > > >
> > > > The CPU on which the test was run is 11th Gen Intel(R) Core(TM)
> > > > i5-11600K @ 3.90GHz (Rocket Lake) with 12 cores (although only 4 cores
> > > > were being used by the VMs part of the test topology), meaning Icelake
> > > > multiarch functions are in use.
> > > >
> > > > Three runs were performed for each of the C and Rust plugins being
> > > > enabled (with the C and Rust runs interleaved to avoid bias, such as
> > > > the CPU becoming thermal-limited in performance), with the median
> > > > clocks value being picked for each (to avoid bias from outliers):
> > > >
> > > > $ grep example */show_runtime.txt
> > > > c-1/show_runtime.txt:example-c   active   5567253   168564707   0   1.31e1   30.28
> > > > c-2/show_runtime.txt:example-c   active   5724169   169749914   0   1.22e1   29.65
> > > > c-3/show_runtime.txt:example-c   active   5831253   165498112   0   1.15e1   28.38
> > > > rust-1/show_runtime.txt:example  active   5423194   162165846   0   1.19e1   29.90
> > > > rust-2/show_runtime.txt:example  active   5768590   172390325   0   1.15e1   29.88
> > > > rust-3/show_runtime.txt:example  active   4822482   136679532   0   1.22e1   28.34
> > > >
> > > > The full "vppctl show runtime" output for the median runs are attached
> > > > for reference.
> > > >
> > > > Thanks,
> > > > Rob
> > > >
> > > > On Fri, 14 Nov 2025 at 09:34, Robert Shearman wrote:
> > > >>
> > > >> Hi Damjan,
> > > >>
> > > >> I haven't done that yet, but I'll give it a go!
> > > >>
> > > >> Thanks,
> > > >> Rob
> > > >>
> > > >> On Thu, 13 Nov 2025 at 12:49, Damjan Marion via lists.fd.io wrote:
> > > >>>
> > > >>> Hi,
> > > >>>
> > > >>> have you tried to implement something already existing in C and compare
> > > >>> performance?
> > > >>>
> > > >>> I would really like to see an apples-to-apples comparison of the same
> > > >>> functionality in C and Rust when it comes to high-performance datapath code.
> > > >>>
> > > >>> Thanks,
> > > >>>
> > > >>> --
> > > >>> Damjan
> > > >>>
> > > >>>> On 12.11.2025., at 13:21, Robert Shearman via lists.fd.io wrote:
> > > >>>>
> > > >>>> Hi folks,
> > > >>>>
> > > >>>> I believe there could be benefits in having the option of writing VPP
> > > >>>> plugins in Rust, so to that end I've created a set of Rust
> > > >>>> crates/packages to make it easier to write plugins, make use of the
> > > >>>> underlying VPP C APIs, and an example feature plugin all of which can
> > > >>>> be found here:
> > > >>>>
> > > >>>> https://github.com/rshearman/vpp-plugin-rs/
> > > >>>>
> > > >>>> The goal is to have performance parity with VPP plugins written in C
> > > >>>> (compiling with support for different instruction sets similar to C
> > > >>>> code is already supported, for example), but whilst still feeling like
> > > >>>> Rust code.
> > > >>>>
> > > >>>> I'd be interested in feedback from the VPP development community.
> > > >>>>
> > > >>>> Thanks,
> > > >>>> --
> > > >>>> Rob Shearman
> > > >>
> > > >> --
> > > >> Rob Shearman
> > > >
> > > > --
> > > > Rob Shearman
> >
> > --
> > Rob Shearman
>
> --
> Rob Shearman
