Re: [vpp-dev] VPP buffer pool allocation optimization guidance

2022-07-27 Thread PRANAB DAS
Hi all,

I am referring to vlib_buffer_pool_put (
https://vpp.flirble.org/stable-2110/d8/d9c/buffer__funcs_8h.html#ad7c38a82bb64d64e1f1713a42f344bab
), which grabs a spinlock when the tx-threads return buffers so that they
can be reused by the rx-thread.

Thank you




[vpp-dev] VPP buffer pool allocation optimization guidance

2022-07-27 Thread PRANAB DAS
Hello all (Dave, Benoit, Damjan, and all others),

We have a VPP application that has a single RX worker/thread that receives
all packets from a NIC and N-1 packet-processing threads that are transmit
only. Basically, on the NIC we have 1 rx queue and N-1 transmit queues. The
rx packet/buffer is handed off from the rx-thread to a set of cores
(service-chaining, pipelining), and each packet-processing core transmits
on its own transmit queue. Some of the packet-processing threads might
queue packets for seconds, even minutes.

I read that in VPP buffer management a buffer has three states - available,
cached (in a worker thread), and used. There is a single global buffer pool
and a per-worker cache pool.
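
To make the states concrete, this is how I picture the structures involved
(a conceptual sketch only; the real definitions live in src/vlib/buffer.h
and the field names may differ):

/* Conceptual sketch only; see src/vlib/buffer.h for the real layout. */
typedef struct
{
  /* "cached" state: buffer indices parked in a worker's private cache */
  u32 cached_buffers[VLIB_BUFFER_POOL_PER_THREAD_CACHE_SZ];
  u32 n_cached;
} vlib_buffer_pool_thread_t;

typedef struct
{
  clib_spinlock_t lock;                /* guards the global free list    */
  u32 *buffers;                        /* "available" state: global pool */
  u32 n_avail;
  vlib_buffer_pool_thread_t *threads;  /* one cache per worker thread    */
} vlib_buffer_pool_t;

/* "used" is simply any buffer index currently owned by a frame, a NIC
 * ring, or (in our case) an application-level queue. */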

Since buffers need to be returned to the pool after packet tx completes, in
this specific scenario (1 rx and N-1 tx threads) we would like buffers to
be returned to the rx thread so that there are always rx buffers available
to receive packets and we don't encounter rx-misses on the NIC.

There is a spinlock that is used to alloc/free a buffer from the global
pool. In this case, since there is no rx on the N-1 threads (they are tx
only), returning buffers to the local cache does not benefit performance.
We would like the buffers to be returned to the global pool, and in fact
directly to the buffer cache of the single rx-thread. I am concerned that
as the number of tx threads grows, more buffers will be returned to the
global pool, which requires the spinlock to free the buffers. The single
rx-thread will run out of cached buffers and will attempt to allocate from
the global pool, thus increasing the chances of spinlock contention overall,
which could potentially hurt performance.
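
For completeness, the allocation side as I understand it (again a
simplified sketch, not the exact code; the name sketch_buffer_alloc is
mine, and the real code also drains/refills the per-thread cache, which I
have omitted):

static u32
sketch_buffer_alloc (vlib_main_t *vm, vlib_buffer_pool_t *bp,
                     u32 *buffers, u32 n_buffers)
{
  vlib_buffer_pool_thread_t *bpt =
    vec_elt_at_index (bp->threads, vm->thread_index);

  /* Fast path: serve the request from the per-thread cache, no lock. */
  if (bpt->n_cached >= n_buffers)
    {
      vlib_buffer_copy_indices (buffers,
                                bpt->cached_buffers + bpt->n_cached - n_buffers,
                                n_buffers);
      bpt->n_cached -= n_buffers;
      return n_buffers;
    }

  /* Slow path: pull from the global pool under the spinlock. On the
   * single rx thread this is where contention with the N-1 tx threads'
   * frees would show up. */
  clib_spinlock_lock (&bp->lock);
  u32 n = clib_min (n_buffers, bp->n_avail);
  vlib_buffer_copy_indices (buffers, bp->buffers + bp->n_avail - n, n);
  bp->n_avail -= n;
  clib_spinlock_unlock (&bp->lock);
  return n;
}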

Do you agree with my characterization of the problem? Or do you think the
problem is not severe?

Do you have any suggestions on how we could optimize buffer allocation in
this case? The goals are:

   - the rx-thread never runs out of rx-buffers
   - buffers in the pool/caches are not left unused
   - spinlock contention in allocating/freeing buffers from the global pool
   is almost 0
   - it should scale as we increase the number of transmit threads/cores,
   e.g. 8, 10, 12, 16, 20, 24

One obvious solution I was thinking of is to reduce the size of the local
buffer cache in the transmit worker threads and increase the local buffer
cache of the single rx thread. Does VPP allow an application to set the
per-worker/thread buffer cache size?
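
From what I can tell, the per-worker cache depth is a compile-time constant
rather than a startup.conf knob (the value below is quoted from memory,
please correct me if I am wrong), so tuning it per thread would seem to
require a source change:

/* src/vlib/buffer.h (value quoted from memory, please double-check):
 * the per-worker cache depth. startup.conf's buffers { buffers-per-numa }
 * sizes the overall pool, but I am not aware of a per-thread cache knob. */
#define VLIB_BUFFER_POOL_PER_THREAD_CACHE_SZ 512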

The application threads (tx-threads) that queue packets are required to
enforce a max or threshold queue depth, and if the threshold is exceeded,
the application flushes out the queued packets.
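
For illustration, the kind of bounded hold queue I mean (all names here are
hypothetical and not existing VPP APIs, except vlib_buffer_free):

#define APP_HOLD_QUEUE_MAX 4096   /* hypothetical depth threshold */

typedef struct
{
  u32 buffer_indices[APP_HOLD_QUEUE_MAX];
  u32 n_queued;
} app_hold_queue_t;

static inline void
app_hold_queue_push (vlib_main_t *vm, app_hold_queue_t *q, u32 bi)
{
  q->buffer_indices[q->n_queued++] = bi;
  if (q->n_queued >= APP_HOLD_QUEUE_MAX)
    {
      /* Flush: here simply freed back to the pool; the real application
       * would more likely hand these to its tx path. */
      vlib_buffer_free (vm, q->buffer_indices, q->n_queued);
      q->n_queued = 0;
    }
}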

Is there any other technique we can use, e.g. after transmitting, having
the NIC move the buffers directly back to the rx thread?

I really appreciate your guidance on optimizing buffer usage and reducing
spinlock contention on tx and rx across cores.

Thank you,

- Pranab K Das




[vpp-dev] Aug 2022 VPP Community Meetings are cancelled

2022-07-27 Thread Dave Wallace

Folks,

Per agreement at the last VPP Community meeting, the August 2022 VPP 
Community meetings are cancelled due to the summer holiday season.


The next VPP Community meeting is scheduled for Tuesday September 13, 
2022 at 8am PDT.


Enjoy the rest of the summer!
-daw-




Re: [vpp-dev] Crash in vlib_buffer_enqueue_next

2022-07-27 Thread Vijay Kumar
Thanks Neale.

Looks like next node index 28 is invalid.
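
For what it's worth, a reminder of where a valid next slot comes from (the
node names below are hypothetical; this is only a fragment showing the two
registration paths, not a complete node):

/* A value placed in nexts[] for vlib_buffer_enqueue_to_next() must be a
 * next *slot* of the calling node, not a global node index. */
typedef enum { MY_NEXT_DROP, MY_NEXT_IP4_LOOKUP, MY_N_NEXT } my_next_t;

VLIB_REGISTER_NODE (my_node) = {
  .name = "my-node",
  .n_next_nodes = MY_N_NEXT,
  .next_nodes = {
    [MY_NEXT_DROP] = "error-drop",
    [MY_NEXT_IP4_LOOKUP] = "ip4-lookup",
  },
};

/* Or, at runtime (e.g. from an init function), ask for a slot:
 *   u32 slot = vlib_node_add_next (vm, my_node.index,
 *                                  ip4_lookup_node.index);
 * and use that returned slot value in nexts[]. */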



Regards


On Wed, 27 Jul 2022, 10:18 Neale Ranns,  wrote:

>
>
>
>
> *From: *vpp-dev@lists.fd.io  on behalf of Vijay
> Kumar via lists.fd.io 
> *Date: *Wednesday, 27 July 2022 at 11:27
> *To: *vpp-dev 
> *Subject: *[vpp-dev] Crash in vlib_buffer_enqueue_next
>
> Hi experts,
>
>
>
> I am seeing this call stack where the enqueue-to-next path crashes due to
> SIGABRT. Please let me know possible reasons for this call stack; I would
> highly appreciate any response related to this bt.
>
>
>
>
>
> Program terminated with signal SIGABRT, Aborted.
>
> #0  __pthread_kill_implementation (threadid=,
> signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
>
> Downloading source file
> /usr/src/debug/glibc-2.34-25.fc35.x86_64/nptl/pthread_kill.c...
>
> 44return INTERNAL_SYSCALL_ERROR_P (ret) ?
> INTERNAL_SYSCALL_ERRNO (ret) : 0;
>
> [Current thread is 1 (Thread 0x7ff3a7fff640 (LWP 113))]
>
> (gdb) bt
>
> #0  __pthread_kill_implementation (threadid=,
> signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
>
> #1  0x7ff79fe828f3 in __pthread_kill_internal (signo=6,
> threadid=) at pthread_kill.c:78
>
> #2  0x7ff79fe356a6 in __GI_raise (sig=sig@entry=6) at
> ../sysdeps/posix/raise.c:26
>
> #3  0x7ff79fe1f865 in __GI_abort () at abort.c:100
>
> #4  0x557244c3d30a in os_exit (code=) at
> /usr/src/debug/vpp-21.06.0-5~g18265fb04_dirty.x86_64/src/vpp/vnet/main.c:477
>
> #5  
>
> #6  vlib_next_frame_change_ownership (node_runtime=0x7ff53dd56500,
> node_runtime=0x7ff53dd56500, next_index=28, vm=0x7ff5269d83c0)
>
> at
> /usr/src/debug/vpp-21.06.0-5~g18265fb04_dirty.x86_64/src/vlib/main.c:344
>
> #7  vlib_get_next_frame_internal (vm=vm@entry=0x7ff5269d83c0,
> node=node@entry=0x7ff53dd56500, next_index=next_index@entry=28,
> allocate_new_next_frame=allocate_new_next_frame@entry=0)
>
> at
> /usr/src/debug/vpp-21.06.0-5~g18265fb04_dirty.x86_64/src/vlib/main.c:418
>
> #8  0x7ff7a02816d0 in enqueue_one (tmp=0x7ff3ac254ec0, n_left=2,
> n_buffers=2, nexts=0x7ff3ac2553c0, buffers=0x7ff53e10f9d0, next_index=28,
> used_elt_bmp=0x7ff3ac254e80, node=0x7ff53dd56500,
>
> vm=0x7ff5269d83c0) at
> /usr/src/debug/vpp-21.06.0-5~g18265fb04_dirty.x86_64/src/vlib/buffer_funcs.c:74
>
> #9  vlib_buffer_enqueue_to_next_fn_skx (vm=0x7ff5269d83c0, node=<optimized
> out>, buffers=<optimized out>, nexts=<optimized out>, count=<optimized out>)
>
> at
> /usr/src/debug/vpp-21.06.0-5~g18265fb04_dirty.x86_64/src/vlib/buffer_funcs.c:165
>
> #10 0x7ff51e7975c7 in ?? () from
> /usr/lib/vpp_plugins/an_ppe_vppctrl_plugin.so
>
>
>
> You’re doing something wrong in here. An invalid next node would be my
> guess.
>
>
>
> /neale
>
>
>
>
>
> #11 0x7ff7a01c62c2 in dispatch_node (last_time_stamp=,
> frame=, dispatch_state=VLIB_NODE_STATE_POLLING,
> type=VLIB_NODE_TYPE_INTERNAL, node=0x7ff53dd56500,
>
> vm=0x7ff5269d83c0) at
> /usr/src/debug/vpp-21.06.0-5~g18265fb04_dirty.x86_64/src/vlib/main.c:1074
>
> #12 dispatch_pending_node (vm=vm@entry=0x7ff5269d83c0,
> pending_frame_index=pending_frame_index@entry=4,
> last_time_stamp=)
>
> at
> /usr/src/debug/vpp-21.06.0-5~g18265fb04_dirty.x86_64/src/vlib/main.c:1252
>
> #13 0x7ff7a01c79c7 in vlib_main_or_worker_loop (is_main=0,
> vm=0x7ff5269d83c0) at
> /usr/src/debug/vpp-21.06.0-5~g18265fb04_dirty.x86_64/src/vlib/main.c:1841
>
> #14 vlib_worker_loop (vm=0x7ff5269d83c0) at
> /usr/src/debug/vpp-21.06.0-5~g18265fb04_dirty.x86_64/src/vlib/main.c:1975
>
> #15 0x7ff7a013c68c in clib_calljmp () at
> /usr/src/debug/vpp-21.06.0-5~g18265fb04_dirty.x86_64/src/vppinfra/longjmp.S:123
>
> #16 0x7ff3a7ffec80 in ?? ()
>
> #17 0x7ff51b2af1f2 in eal_thread_loop.cold () from
> /usr/lib/vpp_plugins/dpdk_plugin.so
>
> #18 0x in ?? ()
>
> (gdb) fr 10
>
> #10 0x7ff51e7975c7 in ?? () from
> /usr/lib/vpp_plugins/an_ppe_vppctrl_plugin.so
>
> (gdb) info args
>
>
>
> 
>
>




Re: [vpp-dev] TCP msg queue full, connections reset issue

2022-07-27 Thread Vijay Kumar
Hi Florin,

Thanks for the quick response. Looking at the app logs closely, we saw it
was rejecting due to a lookup failure for the UE IP, so it is not the first
case, the one related to congestion.



Regards.

On Wed, 27 Jul 2022, 11:42 Florin Coras,  wrote:

> Hi Vijay,
>
> That looks like an accept that either 1) can’t be propagated over the shared
> memory message queue to the app, because the mq is congested, or 2) is
> rejected by a builtin app
>
> Regards,
> Florin
>
> On Jul 26, 2022, at 7:13 PM, Vijay Kumar  wrote:
>
> Hi experts,
>
> We are seeing the below counters being pegged. The scenario is the UEs are
> trying to establish TCP with VPP.
>
> It would be highly appreciated if anyone could tell us why we see the msg
> queue full counter shown below?
>
>
> 1 tcp4-rcv-process   Events not sent for lack of
> msg queue space
> 2 tcp4-output Packets sent
> 1 tcp4-output Resets sent
> 1 tcp4-output Invalid connection
>
>
>
>
>
>
>
>
>
>
> 
>
>




Re: [vpp-dev] TCP msg queue full, connections reset issue

2022-07-27 Thread Florin Coras
Hi Vijay, 

That looks like an accept that either 1) can’t be propagated over the shared
memory message queue to the app, because the mq is congested, or 2) is
rejected by a builtin app.
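
For 2), the rejection typically comes from the app's accept callback,
something like this sketch (the app and helper names are hypothetical; the
vft and callback are from the session layer as I remember them):

static int my_app_lookup_ok (session_t *s);  /* hypothetical app helper */

static int
my_app_accept_callback (session_t *s)
{
  if (!my_app_lookup_ok (s))   /* e.g. an application-level lookup fails */
    return -1;                 /* non-zero rejects the accept; the session
                                * layer then tears the connection down   */
  s->session_state = SESSION_STATE_READY;
  return 0;
}

static session_cb_vft_t my_app_cb_vft = {
  .session_accept_callback = my_app_accept_callback,
  /* ... remaining callbacks ... */
};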

Regards, 
Florin 

> On Jul 26, 2022, at 7:13 PM, Vijay Kumar  wrote:
> 
> Hi experts,
> 
> We are seeing the below counters being pegged. The scenario is the UEs are 
> trying to establish TCP with VPP.
> 
> It would be highly appreciated if anyone could tell us why we see the msg 
> queue full counter shown below?
> 
> 
> 1 tcp4-rcv-process   Events not sent for lack of msg 
> queue space
> 2 tcp4-output Packets sent 
> 1 tcp4-output Resets sent 
> 1 tcp4-output Invalid connection  
>   
> 
> 
> 
>  
> 
> 
> 

