On 2021-12-01 16:19, Jonathan Pratt wrote:

Thanks to those who helped.

The problem has disappeared. I hesitate to say “resolved” because we don’t know exactly what the problem was. We knew there was a problem with our configuration, but not what it was; we were hoping that someone with more experience could tell us where to look, since we only followed the instructions provided by Ettus online. I will detail our findings in case they are helpful to someone else.

It seems to have been related to the FPGA image, since that was the last thing we changed before behaviour returned to what we expected (i.e. 25MSPS transfer across a 1GbE link without any dropouts).

The set up is:

  * X310 with a 1GbE SFP+ module in socket 0, and a generic but ‘Intel
    compatible’ 10GbE SFP+ module in socket 1. The IP addresses have
    been altered to not cause conflicts on our LAN.
      o The 1GbE connection connects to the LAN
      o The 10GbE connection connects to a 1GbE connection on the Xavier
  * nVidia AGX Xavier with a 2x 10GbE PCIe card inserted. The same
    generic 10GbE SFP+ modules do not work in the PCIe card, so we
    have a 1GbE SFP+ in one of the sockets.
      o The on board 1GbE connects to the LAN
      o The 1GbE on the PCIe card is connected to the 10GbE module on
        the X310
      o The Xavier has UHD 4.0.0 + GNU Radio 3.8 + gr-ettus (RFNoC)
        4.0 installed, which is the setup described in the workshop for
        RFNoC 4
        (https://kb.ettus.com/images/e/e9/rfnoc4_workshop_slides_2020_part_2.pdf)

The X310 can be ‘seen’ by the usrp-probe software either across the LAN or directly, but it always seems to go via the direct link first (it reports the direct IP address when connecting, and there is no loss of data if either LAN cable is unplugged while running). It works similarly via both routes.

We set up a simple USRP source to QT frequency sink flow graph. Any time the sample rate was set above 2MSPS, the letter ‘D’ (dropout = sequence numbers not in sequence) was output repeatedly to the console. This was the only ‘error’. Coincidentally, 2MSPS would be the largest sample rate we would expect to receive reliably if the Ethernet negotiation had resolved to a 100Base-TX link. However, the interface card (on the Xavier) reported a 1000Mb/s link.
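The link-speed reasoning above can be checked with some rough arithmetic. This is only a sketch: it assumes 16-bit I plus 16-bit Q per sample on the wire (32 bits per complex sample) and ignores packet header overhead, so the real requirement is somewhat higher. The function names are made up for illustration.

```python
# Rough payload bit rate needed to stream complex samples over Ethernet.
# Assumption: 32 bits per complex sample (16-bit I + 16-bit Q).
# Packet and protocol header overhead is ignored.

BITS_PER_SAMPLE = 32  # assumed wire format, not taken from the setup above


def required_mbps(sample_rate_sps, bits_per_sample=BITS_PER_SAMPLE):
    """Return the payload bit rate in Mb/s for a given sample rate."""
    return sample_rate_sps * bits_per_sample / 1e6


def fits(sample_rate_sps, link_mbps):
    """True if the payload alone fits within the nominal link rate."""
    return required_mbps(sample_rate_sps) < link_mbps


if __name__ == "__main__":
    for rate in (2e6, 16e6, 25e6):
        print(
            f"{rate / 1e6:5.1f} MSPS needs {required_mbps(rate):6.1f} Mb/s, "
            f"fits 100 Mb/s: {fits(rate, 100)}, fits 1000 Mb/s: {fits(rate, 1000)}"
        )
```

Under these assumptions 2MSPS needs about 64Mb/s, which fits a 100Mb/s link, while 25MSPS needs about 800Mb/s, which only fits a 1GbE link.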

When the X310 arrived, this setup would not work until we updated the FPGA image to one compatible with UHD 4.0.0. This was done following the instructions here: https://files.ettus.com/manual/page_usrp_x3x0.html#x3x0_flash

We tried all possible combinations of Ethernet interfaces with the same results. At some point we stopped using gnuradio-companion in favour of the benchmark_rate application, but the results were the same: dropouts above 2MSPS. We tried a different computer (Windows PC with an Ubuntu VM), with the same results. We then tried a newer version of UHD (4.1.3, I think). In this case the ‘XG’ image was programmed. The results were very similar. The last test was to be a direct link between the on-board Ethernet of the Xavier (rather than via the PCIe Ethernet interfaces) and the 1GbE module on the X310. The Xavier only had the UHD 4.0.0 software installed, so it was necessary to replace the FPGA image with the one for UHD 4.0.0. Initially this was done with the ‘XG’ image (unintentionally), but then we changed back to the ‘HG’ image.

After this we found that the benchmark_rate script would work up to 16MSPS without dropouts, and above 16MSPS it would crash (reset) the Xavier. Thinking this a vast improvement over what we were getting (16MSPS vs 2MSPS), we returned the cables to the configuration described above. Coincidentally, we discovered that gnuradio-companion will now work at 25MSPS without dropouts. So we can get 25MSPS transfer across a 1GbE link from a 10GbE SFP+ module in slot 1 of the X310 to a 1GbE SFP+ module on a PCIe card on the Xavier platform.

What changed? No idea. But it’s working, so there’s no plan to investigate further, unless performance is inadequate when we change to a 10GbE SFP+ module on the Xavier.


A few things to note. The PHYs on the X310 are *FIXED SPEED*, they don't "do" speed negotiation. Speed is established by the version of the firmware that you're running.
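To make the fixed-speed point concrete, here is a small sketch. It assumes the usual X310 image naming, in which HG provides 1GbE on SFP+ port 0 and 10GbE on port 1, and XG provides 10GbE on both ports. The table and the function name are illustrative only and are not part of UHD.

```python
# Nominal SFP+ port speeds (Gb/s) implied by the loaded X310 FPGA image.
# Assumption: HG = 1GbE on port 0 and 10GbE on port 1; XG = 10GbE on both.
PORT_SPEEDS_GBPS = {
    "HG": (1, 10),
    "XG": (10, 10),
}


def port_speed_gbps(image, port):
    """Return the fixed speed of an SFP+ port for a given image flavour."""
    if image not in PORT_SPEEDS_GBPS:
        raise ValueError(f"unknown image flavour: {image!r}")
    if port not in (0, 1):
        raise ValueError(f"X310 has SFP+ ports 0 and 1, got {port!r}")
    return PORT_SPEEDS_GBPS[image][port]


if __name__ == "__main__":
    for image in ("HG", "XG"):
        for port in (0, 1):
            print(f"{image} image, port {port}: {port_speed_gbps(image, port)} GbE")
```

The point of the lookup is that the speed depends only on the image and the port, never on what the module or the far end would prefer.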

Keep in mind that the IP stack on these devices is completely minimal. They don't "do" routing in any meaningful sense. Even though they have Ethernet ports, think of those as a dedicated data channel to a host computer, rather than implying that they can participate like any other network device on a routed network.

If anything between the radio and the host re-orders packets, the UDP-based protocol will not "cope" in any meaningful way. The assumption is that the network connection is quite direct between the radio and the host computer, so no packet-reordering would be expected (nor any significant packet loss). Anything that violates that assumption will cause havoc.
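A rough illustration of why reordering is so damaging: a receiver that only expects each packet's sequence number to be the previous one plus one will flag every out-of-order arrival, much like the repeated ‘D’ on the console. This is a simplified sketch and not UHD's actual code; the 16-bit counter width and the function name are assumptions made for the example.

```python
def count_sequence_errors(seq_numbers, modulus=1 << 16):
    """Count packets whose sequence number is not previous + 1 (mod modulus).

    Assumption: a wrapping counter of 16 bits; the real width may differ.
    """
    errors = 0
    expected = None  # nothing to compare the first packet against
    for seq in seq_numbers:
        if expected is not None and seq != expected:
            errors += 1
        # Resynchronise on whatever arrived, as a simple receiver might.
        expected = (seq + 1) % modulus
    return errors


if __name__ == "__main__":
    print(count_sequence_errors([0, 1, 2, 3]))  # in order
    print(count_sequence_errors([0, 1, 3, 4]))  # one packet lost
    print(count_sequence_errors([0, 2, 1, 3]))  # two packets swapped
```

Note that a single swapped pair produces several errors here, because a receiver this simple has no buffer in which to put packets back in order.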

The problem is that implementing a full, robust IP stack on the devices that is fully compliant with, for example, the "Host Requirements" RFC would mean that vital space in the FPGA would be taken up by all of that "goop", leaving very little room for DSP machinery, either "as shipped" or customer-added via RFNoC. So, the network connections should be thought of as a "high speed sample bus with a very familiar connector".

_______________________________________________
USRP-users mailing list -- [email protected]