Hi USRP users,

I'm doing a timed transmission, and I'm tracking how many samples get buffered 
before the transmission starts. I do this by calling send() with tx_metadata 
timestamps in the future and accumulating the return value from send() until it 
starts returning 0. Send() then returns 0 until the start time arrives, after 
which it resumes returning non-zero values, indicating samples are again being 
accepted from the host. Based on this accumulated count, the total number of 
buffered samples is only 63,488 (about 248 kB at 4 bytes per sample), despite 
having the Linux kernel network buffer sizes (the net.core.wmem_max and 
wmem_default settings) set to 50 MB. Is this expected? How can I increase the 
number of buffered samples between the host and device? Latency is not a 
concern. Should I be adjusting the samples per buffer or using timeout=0 here?
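
In case it helps to see the measurement concretely, here is a minimal sketch of 
what I'm describing (not my actual program; the device args, stream format, and 
the 5 second lead time are just placeholders):

#include <uhd/usrp/multi_usrp.hpp>
#include <complex>
#include <cstddef>
#include <iostream>
#include <vector>

int main()
{
    // Placeholder device args; the real program configures 4 channels at 50 MSps.
    auto usrp = uhd::usrp::multi_usrp::make("addr=192.168.10.2");
    uhd::stream_args_t stream_args("fc32", "sc16");
    auto tx = usrp->get_tx_stream(stream_args);

    const size_t spb = tx->get_max_num_samps();
    std::vector<std::complex<float>> buff(spb);

    uhd::tx_metadata_t md;
    md.start_of_burst = true;
    md.has_time_spec  = true;
    md.time_spec      = usrp->get_time_now() + uhd::time_spec_t(5.0); // start well in the future

    // Keep pushing samples with a short timeout. Once the buffering between
    // host and device is full, send() returns 0 until the start time arrives.
    size_t buffered = 0;
    while (true) {
        const size_t sent = tx->send(buff.data(), spb, md, 0.1);
        if (sent == 0) break; // buffers are full; waiting for the timed start
        buffered += sent;
        md.start_of_burst = false;
        md.has_time_spec  = false;
    }
    std::cout << "Samples buffered before start: " << buffered << std::endl;
    return 0;
}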

Background:
I'm using an X440, and my end goal is to get at least 4 channels transmitting 
coherently (i.e. a stable phase relationship for an entire burst/session) at 50 
MSps or greater per channel for runs on the order of minutes. The problem I'm 
running into is that at these sample rates it is very likely that one of the 
channels underflows during transmission, and even a single underflow breaks the 
phase relationship with the other channels for the rest of the run, since I'm 
using multithreaded streamers in a custom C++ program. The real core of my 
problem is figuring out why I'm underflowing at all. My hardware setup seems 
more than capable (see below): CPU usage per active core stays below roughly 
50%, the network traffic is (and should be) nowhere near the 4x 10GbE capacity, 
and the file reads from SSD stay far ahead of the sender. Based on my testing, 
I'm virtually certain that a large-ish buffer between my UHD application and 
the X440 would solve all the underflow issues, but right now the buffer appears 
to be only 248 kB, or about 1 ms at 50 MSps. I have occasionally gotten a full 
transmission to complete without a single underflow, and the underflows seem to 
start at random times, which is another indicator that a larger buffer would 
smooth out these inconsistencies.
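
For reference, the underflow events per channel can be watched by polling each 
tx_streamer's async message queue; below is a simplified sketch of that kind of 
monitor (not my exact code, and the function name is just for illustration):

#include <uhd/stream.hpp>
#include <uhd/types/metadata.hpp>
#include <atomic>
#include <iostream>

// Hypothetical monitor loop, run in its own thread per channel.
void monitor_underflows(uhd::tx_streamer::sptr tx, std::atomic<bool>& running)
{
    uhd::async_metadata_t md;
    while (running.load()) {
        if (!tx->recv_async_msg(md, 0.1)) continue; // nothing within 100 ms
        if (md.event_code == uhd::async_metadata_t::EVENT_CODE_UNDERFLOW ||
            md.event_code == uhd::async_metadata_t::EVENT_CODE_UNDERFLOW_IN_PACKET) {
            std::cerr << "Underflow on channel " << md.channel;
            if (md.has_time_spec)
                std::cerr << " at t = " << md.time_spec.get_real_secs() << " s";
            std::cerr << std::endl;
        }
    }
}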

Setup info:

  *   UHD 4.6 on host and device, FPGA is running a customized image (X4_400 
stock image but with the RAM replay block replaced with DDC/DUC)
  *   Host Hardware - AMD Threadripper, cores set to performance, Intel 
E810-CQDA2 NIC configured to split 1x 100GbE port into 4x logical 10GbE ports
  *   Host Software - Ubuntu 20.04, running a C++ program that spawns 2 threads 
per transmit channel: a "producer" that reads a file into a series of very 
large in-memory buffers, and a "consumer" that advances a pointer through those 
buffers and calls send() (a simplified sketch of the consumer loop is below 
this list). The producer threads start early, and I have verified that they 
never fall behind the consumers. I've increased the net.core.wmem_max and 
wmem_default values to 50 MB, enabled tx pause frames on the NIC, maxed out the 
tx/rx descriptors on the NIC, and followed all the other tuning tips and 
tricks. I am not running DPDK, as I didn't think it would be necessary at these 
sample rates, although I could be wrong about that.
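
Here is the simplified shape of one consumer thread mentioned above (a sketch 
only; the struct and names are placeholders, not the actual program):

#include <uhd/stream.hpp>
#include <uhd/types/metadata.hpp>
#include <uhd/types/time_spec.hpp>
#include <algorithm>
#include <complex>
#include <cstddef>

// Placeholder container for the samples the producer thread has already loaded.
struct ChannelData {
    const std::complex<float>* samples;
    size_t total_samps;
};

void consumer_thread(uhd::tx_streamer::sptr tx, const ChannelData& data,
                     uhd::time_spec_t start_time)
{
    const size_t spb = tx->get_max_num_samps();

    uhd::tx_metadata_t md;
    md.start_of_burst = true;
    md.has_time_spec  = true;
    md.time_spec      = start_time; // same start time on every channel for phase coherence

    size_t cursor = 0;
    while (cursor < data.total_samps) {
        const size_t n    = std::min(spb, data.total_samps - cursor);
        const size_t sent = tx->send(data.samples + cursor, n, md, 1.0);
        cursor += sent;
        if (sent > 0) {
            md.start_of_burst = false;
            md.has_time_spec  = false;
        }
    }

    // Tell the device the burst is done.
    md.end_of_burst = true;
    tx->send("", 0, md);
}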

Any help here would be much appreciated, thanks!

Patrick
