Recently I added malloc/free intercept code for C, C++ and Rust [1], so non-VPP code should use the VPP main heap…
[1] https://git.fd.io/vpp/tree/src/vppinfra/mem_intercept.c#n485

On 30.12.2025., at 09:06, Maxim Uvarov via lists.fd.io <[email protected]> wrote:

Yes, that was my question. How is that handled in the current Rust support? Vec, String, Box, HashMap, Rc, Arc, etc. can all allocate memory, as can things like .collect(). So most likely there would have to be wrappers around clib_mem_alloc() / clib_mem_free() so that the internal allocator is used instead of the default Rust one.

BR,
Maxim.

29.12.2025, 15:11, [email protected]

For C++ I believe you can provide your own allocation routine for standard library functions (https://en.cppreference.com/w/cpp/named_req/Allocator.html could be a starting point in researching this). But as you said, the ones which request memory allocation should most probably not run in the fast path.

Klement

On 29 Dec 2025, at 12:12, Maxim Uvarov via lists.fd.io <[email protected]> wrote:

Hello Rob,

1. In the first example, the direct use of bindgen names looks a little strange. I guess a direct definition could be used in future.

    let ip: *const ip4_header_t = b0.current_ptr_mut() as *const ip4_header_t;
    if (*ip).__bindgen_anon_1.protocol == IP_PROTOCOL_ICMP {

2. How do Rust libraries use alloc/free for their common algorithms? Do they use VPP pools and buffers (huge tables, NUMA, cache alignment)? I am interested in how Rust or C++ can be integrated into the implementation of VPP nodes. Many standard algorithms can request memory allocation, and allocation is not expected in the runtime data path. Is that handled by your Rust port, or are there some limitations on usage?

Thank you,
Maxim.

27.11.2025, 15:32, [email protected]

Results with node functions performing prefetching and processing four buffers at a time, with vector instructions emitted by both compilers in the update of next node indices:

C: 1.08e1 clocks per packet
Rust: 1.09e1 clocks per packet

So again, approximately equal.
The code for this:

C: https://github.com/rshearman/vpp/blob/48ca99dc7079bd46b2bb37605ec3d62a44bbf58f/src/plugins/example-c/example_node.c
Rust: https://github.com/rshearman/vpp-plugin-rs/blob/de41333b46b671b8c61182b6f38d0996fce8cf5e/vpp-example-plugin/src/lib.rs

Median values from:

$ grep example */show_runtime.txt
c-1/show_runtime.txt:example-c    active   809723   28203452   0   1.08e1   34.83
c-2/show_runtime.txt:example-c    active   816954   28795970   0   1.06e1   35.25
c-3/show_runtime.txt:example-c    active   906334   28421093   0   1.16e1   31.36
rust-1/show_runtime.txt:example   active   864643   28368780   0   1.09e1   32.81
rust-2/show_runtime.txt:example   active   924042   29116994   0   1.13e1   31.51
rust-3/show_runtime.txt:example   active   858871   29217427   0   1.07e1   34.02

All of these runs were done while offering the highest packet rate the traffic generator could do (rather than NDR in the previous set of results).

Thanks,
Rob

On Tue, 25 Nov 2025 at 14:52, Robert Shearman via lists.fd.io wrote:
>
> There is SIMD usage in vlib_get_buffers (and the Rust implementation
> of the same functionality), so it does cover that base, but I can
> enhance the plugins to perform prefetching.
>
> Thanks,
> Rob
>
> On Tue, 25 Nov 2025 at 13:55, Damjan Marion via lists.fd.io
> wrote:
> >
> >
> > Well, your C plugin is very basic, no prefetching, no instruction level
> > parallelism, no SIMD usage so no surprises that numbers are close.
> > Any chance you can try something involving those techniques?
> >
> > Thanks,
> >
> > Damjan
> >
> >
> > > On 24.11.2025., at 21:22, Robert Shearman via lists.fd.io wrote:
> > >
> > > I've been able to capture some numbers for what I believe is a fair
> > > apples-to-apples comparison of a C plugin versus a Rust plugin, and a
> > > short summary is that they are approximately equal in performance.
> > >
> > > C: 1.22e1 clock cycles per packet
> > > Rust: 1.19e1 clock cycles per packet
> > >
> > > Details:
> > >
> > > Rust plugin code (also using vpp-plugin crate from same git hash):
> > > https://github.com/rshearman/vpp-plugin-rs/tree/15437cd8d848fd877dbb5858dec1e4ab853bbd42/vpp-example-plugin
> > > C plugin code:
> > > https://github.com/rshearman/vpp/blob/8ebcf29538b1145e376bdf7f8b405ca4822e9c24/src/plugins/example-c/example_node.c
> > >
> > > Obviously, the plugins here are about as basic as it comes but the
> > > more basic the plugins the easier it is to perform an apples-to-apples
> > > comparison and visually validate they are doing the same job.
> > >
> > > Rust compiler version:
> > > rustc 1.91.0 (f8297e351 2025-10-28)
> > >
> > > C compiler version:
> > > Ubuntu clang version 18.1.3 (1ubuntu1)
> > > Target: x86_64-pc-linux-gnu
> > > Thread model: posix
> > > InstalledDir: /usr/bin
> > >
> > > The test setup is a VM with 8G of memory where TRex is running,
> > > connected via two virtio interfaces to a second VM with 4G of memory
> > > where vpp is run. Both VMs have 2 lcores pinned to set lcores on the
> > > hypervisor. Both VMs are running Ubuntu 24.04.3, along with the
> > > hypervisor. Given that this was a one-time performance test, I made no
> > > attempt at doing CPU isolation in either the guest VMs or the
> > > hypervisor, with the hope that the noise in the results that this
> > > causes is acceptable.
> > >
> > > VPP code was built using "make pkg-deb" and then installed as packages
> > > in the VM, with the only change to the configuration being to
> > > configure `workers 1`. The Rust plugin was built using `cargo build
> > > --release` and then the resulting .so file for the vpp-example-plugin
> > > was copied into the expected location in the VM.
> > >
> > > IPv4 UDP 1500-byte packets are generated from TRex in an NDR test
> > > (although the overhead from the example plugin turned out to be lost
> > > in the noise), meaning that these packets don't match in the plugins
> > > under test so the packets aren't dropped, but follow the next-feature
> > > path.
> > >
> > > The CPU on which the test was run is 11th Gen Intel(R) Core(TM)
> > > i5-11600K @ 3.90GHz (Rocket Lake) with 12 cores (although only 4 cores
> > > were being used by the VMs part of the test topology), meaning Icelake
> > > multiarch functions are in use.
> > >
> > > Three runs were performed for each of the C and Rust plugins being
> > > enabled (with the C and Rust runs interleaved to avoid bias, such as
> > > the CPU becoming thermal-limited in performance), with the median
> > > clocks value being picked for each (to avoid bias from outliers):
> > >
> > > $ grep example */show_runtime.txt
> > > c-1/show_runtime.txt:example-c    active   5567253   168564707   0   1.31e1   30.28
> > > c-2/show_runtime.txt:example-c    active   5724169   169749914   0   1.22e1   29.65
> > > c-3/show_runtime.txt:example-c    active   5831253   165498112   0   1.15e1   28.38
> > > rust-1/show_runtime.txt:example   active   5423194   162165846   0   1.19e1   29.90
> > > rust-2/show_runtime.txt:example   active   5768590   172390325   0   1.15e1   29.88
> > > rust-3/show_runtime.txt:example   active   4822482   136679532   0   1.22e1   28.34
> > >
> > > The full "vppctl show runtime" output for the median runs are attached
> > > for reference.
> > >
> > > Thanks,
> > > Rob
> > >
> > > On Fri, 14 Nov 2025 at 09:34, Robert Shearman wrote:
> > >>
> > >> Hi Damjan,
> > >>
> > >> I haven't done that yet, but I'll give it a go!
> > >>
> > >> Thanks,
> > >> Rob
> > >>
> > >> On Thu, 13 Nov 2025 at 12:49, Damjan Marion via lists.fd.io
> > >> wrote:
> > >>>
> > >>>
> > >>> Hi,
> > >>>
> > >>> have you tried to implement something already existing in C and compare
> > >>> performance?
> > >>>
> > >>> I would really like to see an apples-to-apples comparison of the same
> > >>> functionality in
> > >>> C and Rust when it comes to high-performance datapath code.
> > >>>
> > >>> Thanks,
> > >>>
> > >>> --
> > >>> Damjan
> > >>>
> > >>>
> > >>>> On 12.11.2025., at 13:21, Robert Shearman via lists.fd.io wrote:
> > >>>>
> > >>>> Hi folks,
> > >>>>
> > >>>> I believe there could be benefits in having the option of writing VPP
> > >>>> plugins in Rust, so to that end I've created a set of Rust
> > >>>> crates/packages to make it easier to write plugins, make use of the
> > >>>> underlying VPP C APIs, and an example feature plugin all of which can
> > >>>> be found here:
> > >>>>
> > >>>> https://github.com/rshearman/vpp-plugin-rs/
> > >>>>
> > >>>> The goal is to have performance parity with VPP plugins written in C
> > >>>> (compiling with support for different instruction sets similar to C
> > >>>> code is already supported, for example), but whilst still feeling like
> > >>>> Rust code.
> > >>>>
> > >>>> I'd be interested in feedback from the VPP development community.
> > >>>>
> > >>>> Thanks,
> > >>>> --
> > >>>> Rob Shearman
> > >>
> > >> --
> > >> Rob Shearman
> > >
> > > --
> > > Rob Shearman
>
> --
> Rob Shearman

--
Rob Shearman
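
The medians quoted for the two result sets in this thread can be cross-checked from the per-run clocks column of the grep output. Below is a minimal sketch in Rust that uses only the values shown in the thread. The helper name median_of_three is made up for illustration and is not part of VPP or the vpp-plugin crate.

```rust
/// Median of three samples: sort a copy and take the middle element.
/// `median_of_three` is a hypothetical helper, not a VPP or vpp-plugin API.
fn median_of_three(mut runs: [f64; 3]) -> f64 {
    // total_cmp provides a total order on f64, so the sort is well defined.
    runs.sort_by(|a, b| a.total_cmp(b));
    runs[1]
}

fn main() {
    // Clocks-per-packet values copied from the show_runtime lines in the thread.
    let basic_c = [13.1, 12.2, 11.5];
    let basic_rust = [11.9, 11.5, 12.2];
    let prefetch_c = [10.8, 10.6, 11.6];
    let prefetch_rust = [10.9, 11.3, 10.7];

    println!(
        "basic:    C {:.1}  Rust {:.1}",
        median_of_three(basic_c),
        median_of_three(basic_rust)
    );
    println!(
        "prefetch: C {:.1}  Rust {:.1}",
        median_of_three(prefetch_c),
        median_of_three(prefetch_rust)
    );
}
```

The medians come out as 12.2 and 11.9 for the basic plugins and 10.8 and 10.9 for the prefetching variants, matching the 1.22e1 / 1.19e1 and 1.08e1 / 1.09e1 figures reported in the messages. With only three runs per variant, the spread within each set is larger than the C/Rust difference, which is consistent with the "approximately equal" conclusion.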
View/Reply Online (#26692): https://lists.fd.io/g/vpp-dev/message/26692
