Check the network configuration of the cards on all hosts. Do they have the 
same MTUs? Are there errors showing on the interface?  Do regular pings work 
reliably?
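
For the MTU and interface-error part, here is a minimal sketch, assuming Linux hosts (the thread mentions Ubuntu 20.04) and the standard sysfs layout under `/sys/class/net`. The function names, the choice of counters, and the expected MTU value are illustrative assumptions, not anything taken from UHD. It does not cover the ping check.

```python
from pathlib import Path

# Generic per-interface counters exposed by the Linux kernel in
# /sys/class/net/<iface>/statistics/. Which ones matter is an assumption.
ERROR_COUNTERS = ("rx_errors", "tx_errors", "rx_dropped", "tx_dropped")


def read_link_health(iface, sysfs_root="/sys/class/net"):
    """Return the MTU and the error/drop counters of one interface."""
    base = Path(sysfs_root) / iface
    health = {"mtu": int((base / "mtu").read_text())}
    for name in ERROR_COUNTERS:
        health[name] = int((base / "statistics" / name).read_text())
    return health


def find_problems(health, expected_mtu):
    """List findings as strings; an empty list means nothing obvious."""
    problems = []
    if health["mtu"] != expected_mtu:
        problems.append(f"mtu is {health['mtu']}, expected {expected_mtu}")
    for name in ERROR_COUNTERS:
        if health[name] > 0:
            problems.append(f"{name} = {health[name]}")
    return problems
```

Calling `find_problems(read_link_health(iface), expected_mtu)` on each host, with that host's SFP+ interface name and the MTU you intend to use, gives a quick way to compare the working and the failing machines. The counters are cumulative since the interface came up, so what matters is whether they grow during a benchmark run.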

> On Oct 28, 2020, at 1:05 PM, Johannes Demel <[email protected]> wrote:
> 
> Hi Marcus,
> 
> no, I didn't swap cables. I put this on the list of things I will try. 
> Physical access is cumbersome this year.
> 
> Thanks for the hint.
> Do you have more ideas what to check?
> 
> Cheers
> Johannes
> 
>> On 28.10.20 17:49, Marcus D Leech wrote:
>> Have you tried swapping cables to see if the problem follows the cable?
>>>> On Oct 28, 2020, at 12:44 PM, Johannes Demel via USRP-users 
>>>> <[email protected]> wrote:
>>> 
>>> Hi all,
>>> 
>>> we have a couple of N310s in our lab and some of them seem to fail to 
>>> transmit reliably.
>>> 
>>> Each N310 is connected to a host via one of those SFP+ cables that came 
>>> with them from Ettus. We have 3 N310s that are connected via said cables to 
>>> one host each with an Intel X710 DA2 with an AMD TRX3970. All machines run 
>>> Ubuntu 20.04 with all updates.
>>> I use the UHD 3.15LTS branch: UHD_3.15.0.0-7-g8d228dbe
>>> I made sure to check out the very same commit and recompile and install it.
>>> 
>>> On 2 hosts I can run:
>>> `./benchmark_rate --args "addr=192.168.20.213,master_clock_rate=122.88e6" 
>>> --tx_rate 61.44e6 --tx_channels "3" --rx_rate 61.44e6 --rx_channels "0,1"`
>>> The full output is attached at the bottom of this email.
>>> 
>>> What I observe:
>>> - It runs fine on 2 hosts.
>>> - The third host fails.
>>> -- On the third host, an RX-only run works.
>>> -- On the third host, a TX-only run is haunted: cf. full test output.
>>> - We have a server with Intel Xeon 6254 and X722 where I observe the same 
>>> issue
>>> - I switched USRPs between hosts, the issue seems to stick with the host.
>>> 
>>> It started with one host a couple of weeks back. But now our server has
>>> started to fail with the same error, even though the exact same setup used
>>> to work on that machine.
>>> I have been looking into this for quite a while now, and I can't find the
>>> source of the issue.
>>> 
>>> Has anyone had experience with that? I'd really appreciate hints on how to
>>> debug this.
>>> 
>>> 
>>> Cheers
>>> Johannes
>>> 
>>> 
>>> On the working hosts the benchmark rate summary looks like this:
>>> ---------
>>> Benchmark rate summary:
>>>  Num received samples:     1270556340
>>>  Num dropped samples:      0
>>>  Num overruns detected:    0
>>>  Num transmitted samples:  614440368
>>>  Num sequence errors (Tx): 0
>>>  Num sequence errors (Rx): 0
>>>  Num underruns detected:   0
>>>  Num late commands:        0
>>>  Num timeouts (Tx):        0
>>>  Num timeouts (Rx):        0
>>> ---------
>>> 
>>> But on the third device:
>>> ---------
>>> [....]
>>> SUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSU[00:00:16.262123]
>>>  Receiver error: ERROR_CODE_TIMEOUT, continuing...
>>> SUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUU[00:00:16.565159]
>>>  Benchmark complete.
>>> 
>>> 
>>> Benchmark rate summary:
>>>  Num received samples:     66501280
>>>  Num dropped samples:      0
>>>  Num overruns detected:    0
>>>  Num transmitted samples:  154312704
>>>  Num sequence errors (Tx): 3149
>>>  Num sequence errors (Rx): 0
>>>  Num underruns detected:   3156
>>>  Num late commands:        0
>>>  Num timeouts (Tx):        0
>>>  Num timeouts (Rx):        97
>>> ----------
>>> 
>>> We have a server with Intel X722 and Intel Xeon Gold 6252 that reports the 
>>> same issue:
>>> ----------
>>> UUUUUUUU[00:00:16.180094] Receiver error: ERROR_CODE_TIMEOUT, continuing...
>>> US[00:00:16.382393] Benchmark complete.
>>> 
>>> 
>>> Benchmark rate summary:
>>>  Num received samples:     99763328
>>>  Num dropped samples:      0
>>>  Num overruns detected:    0
>>>  Num transmitted samples:  155804944
>>>  Num sequence errors (Tx): 3180
>>>  Num sequence errors (Rx): 0
>>>  Num underruns detected:   164974
>>>  Num late commands:        0
>>>  Num timeouts (Tx):        0
>>>  Num timeouts (Rx):        95
>>> ----------
>>> Here, though, there are even more underruns.
>>> 
>>> 
>>> 
>>> Working output:
>>> ============
>>> [INFO] [UHD] linux; GNU C++ version 9.3.0; Boost_107100; 
>>> UHD_3.15.0.0-7-g8d228dbe
>>> [00:00:00.000002] Creating the usrp device with: 
>>> addr=192.168.20.213,master_clock_rate=122.88e6...
>>> [INFO] [MPMD] Initializing 1 device(s) in parallel with args: 
>>> mgmt_addr=192.168.20.213,type=n3xx,product=n310,serial=319841B,claimed=False,addr=192.168.20.213,master_clock_rate=122.88e6
>>> [INFO] [MPM.PeriphManager] init() called with device args 
>>> `time_source=gpsdo,clock_source=gpsdo,mgmt_addr=192.168.20.213,product=n310,master_clock_rate=122.88e6'.
>>> [INFO] [0/Replay_0] Initializing block control (NOC ID: 0x4E91A00000000004)
>>> [INFO] [0/Radio_0] Initializing block control (NOC ID: 0x12AD100000011312)
>>> [INFO] [0/Radio_1] Initializing block control (NOC ID: 0x12AD100000011312)
>>> [INFO] [0/DDC_0] Initializing block control (NOC ID: 0xDDC0000000000000)
>>> [INFO] [0/DDC_1] Initializing block control (NOC ID: 0xDDC0000000000000)
>>> [INFO] [0/DUC_0] Initializing block control (NOC ID: 0xD0C0000000000002)
>>> [INFO] [0/DUC_1] Initializing block control (NOC ID: 0xD0C0000000000002)
>>> [INFO] [0/FIFO_0] Initializing block control (NOC ID: 0xF1F0000000000000)
>>> [INFO] [0/FIFO_1] Initializing block control (NOC ID: 0xF1F0000000000000)
>>> [INFO] [0/FIFO_2] Initializing block control (NOC ID: 0xF1F0000000000000)
>>> [INFO] [0/FIFO_3] Initializing block control (NOC ID: 0xF1F0000000000000)
>>> Using Device: Single USRP:
>>>  Device: N300-Series Device
>>>  RX Channel: 0
>>>    RX DSP: 0
>>>    RX Dboard: A
>>>    RX Subdev: Magnesium
>>>  RX Channel: 1
>>>    RX DSP: 1
>>>    RX Dboard: A
>>>    RX Subdev: Magnesium
>>>  RX Channel: 2
>>>    RX DSP: 0
>>>    RX Dboard: B
>>>    RX Subdev: Magnesium
>>>  RX Channel: 3
>>>    RX DSP: 1
>>>    RX Dboard: B
>>>    RX Subdev: Magnesium
>>>  TX Channel: 0
>>>    TX DSP: 0
>>>    TX Dboard: A
>>>    TX Subdev: Magnesium
>>>  TX Channel: 1
>>>    TX DSP: 1
>>>    TX Dboard: A
>>>    TX Subdev: Magnesium
>>>  TX Channel: 2
>>>    TX DSP: 0
>>>    TX Dboard: B
>>>    TX Subdev: Magnesium
>>>  TX Channel: 3
>>>    TX DSP: 1
>>>    TX Dboard: B
>>>    TX Subdev: Magnesium
>>> 
>>> [00:00:04.045700] Setting device timestamp to 0...
>>> [INFO] [MULTI_USRP]     1) catch time transition at pps edge
>>> [INFO] [MULTI_USRP]     2) set times next pps (synchronously)
>>> [00:00:05.689405] Testing receive rate 61.440000 Msps on 2 channels
>>> [00:00:05.829315] Testing transmit rate 61.440000 Msps on 1 channels
>>> [00:00:16.180163] Benchmark complete.
>>> 
>>> 
>>> Benchmark rate summary:
>>>  Num received samples:     1270556340
>>>  Num dropped samples:      0
>>>  Num overruns detected:    0
>>>  Num transmitted samples:  614440368
>>>  Num sequence errors (Tx): 0
>>>  Num sequence errors (Rx): 0
>>>  Num underruns detected:   0
>>>  Num late commands:        0
>>>  Num timeouts (Tx):        0
>>>  Num timeouts (Rx):        0
>>> 
>>> 
>>> Done!
>>> =====================
>>> 
>>> _______________________________________________
>>> USRP-users mailing list
>>> [email protected]
>>> http://lists.ettus.com/mailman/listinfo/usrp-users_lists.ettus.com