Hello, I believe I've figured out what is causing the TDC errors. I (or one of my coworkers) will create an issue in the UHD GitHub repo, but I wanted to post some more info here in case someone else runs into this.
I found that I could reproduce the TDC measurement errors at least somewhat consistently with the following command:

    while true; do uhd_usrp_probe --args="force_reinit=1,master_clock_rate=200000000"; done

I don't think the master clock rate matters; that is just the value I selected. The important part for reproducing the error is force_reinit, which forces the clocks to be set up on every run.

With the filesystem from UHD 4.1.0.4 or earlier installed on an N320, I have never been able to reproduce the TDC error with this loop. With the filesystem from UHD 4.1.0.5-rc1 or later, the same loop produces occasional TDC errors. They occur randomly, but if I leave it running I usually see at least a few per hour. I tended to leave it running overnight and check for errors the next morning.

The problem appears to be related to a change made to the LMK04828 configuration in MPM. In UHD commit d7ee3dcf4a7478a17a094a1be2cba37b98843963, it looks like some register writes were changed to decrease PLL lock time. These registers appear to set how long the phase detector error must stay within a certain window before Lock Detect is asserted. I'm guessing that the reduction in the time (number of clock cycles) required to declare lock is too aggressive: it works most of the time, but not always.

Making the following edits to /usr/lib/python3.7/site-packages/usrp_mpm/dboard_manager/lmk_rh.py seems to fix the issue. Note that this file must be edited on the N320 itself.

Replace:

    (0x15B, 0xC7), # PLL1 PFD: negative slope for active filter / CP = 750 uA
    (0x15C, 0x0F), # PLL1 DLD Count [13:8]

With:

    (0x15B, 0x27), # PLL1 PFD: negative slope for active filter / CP = 750 uA
    (0x15C, 0x10), # PLL1 DLD Count [13:8]

This just undoes the change made in the commit mentioned above, so the LMK04828 requires more time before it declares lock. Maybe some value in between would be a better choice, but I'm leaving it this way for now.
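For anyone who wants a failure count instead of watching the terminal, here is a minimal Python sketch of the same reproduction loop. It is only an illustration: the function and constant names are made up, and the marker string is an assumption taken from the error lines in the log quoted further down this thread. It checks both stdout and stderr, since I have not verified which stream the errors go to.

```python
import subprocess

# Command from the shell loop above; force_reinit=1 redoes clock setup on every probe.
PROBE_CMD = ["uhd_usrp_probe", "--args=force_reinit=1,master_clock_rate=200000000"]

# Assumption: both TDC error lines in the quoted log contain this substring
# ("TDC measurements show a wide range of values!" and
#  "TDC measurement out of expected range!").
TDC_MARKER = "TDC measurement"


def count_tdc_failures(runs, cmd=PROBE_CMD, run=subprocess.run):
    """Run the probe `runs` times and return how many runs printed a TDC error.

    `run` is injectable so the counting logic can be exercised without a radio.
    """
    failures = 0
    for _ in range(runs):
        result = run(cmd, capture_output=True, text=True)
        # Search both streams; either may be None or empty depending on the runner.
        output = (result.stdout or "") + (result.stderr or "")
        if TDC_MARKER in output:
            failures += 1
    return failures
```

On a host with UHD installed, calling count_tdc_failures(100) would run 100 probes and return the number that hit the TDC error. The run count is arbitrary.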
I haven't seen any TDC errors so far.

Thanks,
Jim

________________________________
From: Jim Palladino <j...@gardettoengineering.com>
Sent: Tuesday, May 10, 2022 2:02 PM
To: Marcus D. Leech <patchvonbr...@gmail.com>; USRP-users@lists.ettus.com <usrp-users@lists.ettus.com>
Subject: [USRP-users] Re: N320 TDC measurement errors

Just passing on that I updated an N320 to UHD 4.2.0.0 and ran into the TDC error pretty quickly. I now reverted that radio to 4.1.0.2 and have not seen that error "yet".

Thanks,
Jim

________________________________
From: Jim Palladino <j...@gardettoengineering.com>
Sent: Monday, May 9, 2022 1:08 PM
To: Marcus D. Leech <patchvonbr...@gmail.com>; usrp-users@lists.ettus.com <usrp-users@lists.ettus.com>
Subject: [USRP-users] Re: N320 TDC measurement errors

Thanks, Marcus. I cannot say with 100% certainty, but we had most radios on UHD 4.1.0.2 before and nobody here remembers seeing those errors (ever) until we updated all of them to 4.1.0.5. There have always been issues (according to the others I talked to) with radios not starting properly with some odd error or another that would magically go away with the next attempt. It could be that some of those errors were related to this problem and were presented to the user differently, but I can't say for sure.

If I get a free N320 at some point, I might try reverting it to 4.1.0.2 and keep an eye on its behavior.

Thanks
Jim

________________________________
From: Marcus D. Leech <patchvonbr...@gmail.com>
Sent: Monday, May 9, 2022 12:04 PM
To: usrp-users@lists.ettus.com <usrp-users@lists.ettus.com>
Subject: [USRP-users] Re: N320 TDC measurement errors

On 2022-05-09 11:32, Jim Palladino wrote:

Sorry to bring it up again, but this is really becoming an issue for us, in that we can't seem to use our N320 radios reliably with this TDC measurement error issue. When the TDC error occurs, our program or even uhd_usrp_probe immediately errors out and exits.
If anyone has seen this or has any thoughts on why this might be happening or how to fix it, that would be greatly appreciated.

Thanks,
Jim

Jim:

I'm sorry this is happening to your N320s. Can you confirm that it DOES NOT happen on previous releases? I don't have an N320 here to test with.

I've rattled some internal Ettus/NI cages, but I cannot offer a concrete response time.

________________________________
From: Jim Palladino <j...@gardettoengineering.com>
Sent: Monday, May 2, 2022 12:59 PM
To: USRP-users@lists.ettus.com <usrp-users@lists.ettus.com>
Subject: [USRP-users] N320 TDC measurement errors

Hello,

Ever since updating to UHD 4.1.0.5 (including updating the filesystem and FPGA image on our six N320 USRPs), we occasionally get TDC measurement errors when trying to interact with the radio via UHD. It isn't easily reproducible, but it does happen on different radios maybe once a day or so. I've seen it when using either external time and clock sources or internal (doesn't seem to matter which). Here is an example of the output of a uhd_usrp_probe when this occurs.

----------------------
[INFO] [UHD] linux; GNU C++ version 7.5.0; Boost_106501; UHD_4.1.0.HEAD-0-g6bd0be9c
[DEBUG] [MPMD] Discovering MPM devices on port 49600
[DEBUG] [MPMD] Discovering MPM devices on port 49600
[DEBUG] [MPMD] Discovering MPM devices on port 49600
[DEBUG] [MPMD] Discovering MPM devices on port 49600
[INFO] [MPMD] Initializing 1 device(s) in parallel with args: mgmt_addr=192.168.40.2,type=n3xx,product=n320,serial=31EDED4,fpga=XG,claimed=False,addr=192.168.40.2
[DEBUG] [MPMD] Claiming mboard 0
[DEBUG] [MPMD] Device args: `mgmt_addr=192.168.40.2,type=n3xx,product=n320,serial=31EDED4,fpga=XG,claimed=False,addr=192.168.40.2'.
RPC address: 192.168.40.2
[DEBUG] [MPMD] MPM reports device info: addr=192.168.30.2,claimed=True,connection=remote,dboard_0_pid=338,dboard_0_serial=31EBB6F,dboard_1_pid=338,dboard_1_serial=31EBB94,description=N300-Series Device,eeprom_version=3,fpga=XG,fpga_version=8.0,fpga_version_hash=6bd0be9.clean,fs_version=20211215135436,mender_artifact=v4.1.0.5_n3xx,mpm_sw_version=4.1.0.5-g6bd0be9c,mpm_version=4.0,name=ni-n3xx-31EDED4,pid=16962,product=n320,rev=10,rpc_connection=remote,second_addr=192.168.40.2,serial=31EDED4,type=n3xx
[DEBUG] [MPMD] Found 8 motherboard sensors.
[DEBUG] [MPMD] Initializing mboard 0
[INFO] [MPM.PeriphManager] init() called with device args `fpga=XG,mgmt_addr=192.168.40.2,product=n320,clock_source=internal,time_source=internal'.
[INFO] [MPM.Rhodium-0] init() called with args `fpga=XG,mgmt_addr=192.168.40.2,product=n320,clock_source=internal,time_source=internal'
[INFO] [MPM.Rhodium-1] init() called with args `fpga=XG,mgmt_addr=192.168.40.2,product=n320,clock_source=internal,time_source=internal'
[INFO] [MPM.Rhodium-0.init.LMK04828] LMK initialized and locked!
[ERROR] [MPM.Sync-0] TDC measurements show a wide range of values! Check your clock rates for incompatibilities.
[INFO] [MPM.Rhodium-1.init.LMK04828] LMK initialized and locked!
[ERROR] [RPC] TDC measurement out of expected range!
[INFO] [MPM.Rhodium-1.DAC37J82] DAC PLL Locked!
[INFO] [MPM.Rhodium-1.AD9695] ADC PLL Locked!
[INFO] [MPM.Rhodium-1.init] JESD204B Link Initialization & Training Complete
[ERROR] [MPM.RPCServer] init() failed with error: TDC measurement out of expected range!
Error: RuntimeError: Error during RPC call to `init'. Error message: TDC measurement out of expected range!
----------------------

If I run uhd_usrp_probe again immediately, it always seems to work fine. I don't think this is specific to any of the 3 valid master clock rates, but I've seen this happen after a fresh reboot of an N320 with a uhd_usrp_probe, so it should have been set to default parameters.
I also feel like it happens after a radio hasn't been in use for a while, but I'm not sure if that is always the case.

Does anyone have any idea what might cause this?

Thanks,
Jim

_______________________________________________
USRP-users mailing list -- usrp-users@lists.ettus.com
To unsubscribe send an email to usrp-users-le...@lists.ettus.com