Hi Jeff,

Can you describe the dataflow of your RFNoC graph (which blocks you're using and how they're connected)? For example, is it: host -> SEP -> Your Block -> Radio?

Could you try the latest version of UHD (v4.1.0.5)? There were many bug fixes since the initial release of 4.0. You may also want to regenerate your block to get a new noc_shell.

Those are very small packets (8 32-bit samples). That should be fine, but maybe there's a corner case with really short packets. Maybe you could try coalescing them into larger packets?

If I were debugging this, I'd use chipscope, maybe with some checker logic like you described, to look at the data flowing from the ethernet interface and follow that path to your block to see where the packets are getting dropped. It would also confirm that the packets are making it into the FPGA.

But first I think updating to the latest version is a good idea so we're not chasing a bug that's already been fixed.

Thanks,
Wade

On Mon, Mar 7, 2022 at 10:17 AM Jeffrey P Long <[email protected]> wrote:

> Hi-
>
> I have determined that somewhere upstream from my custom RFNOC component
> the fabric is intermittently dropping a fixed number of packets.
>
> I have a custom transmit waveform encapsulated in a single RFNOC
> component. This waveform component effectively takes about 8 32-bit
> samples of user data and produces an entire transmit burst of close to
> 5 msec in length at a sample rate of 50 MHz. Therefore, a fairly large
> “upsampling” operation for an RFNOC block. This is a timed transmission,
> so I have interface logic that translates the CHDR info and single EOB
> to a series of packets with a timestamp on the first and the EOB set on
> the last packet, along with the appropriate tlast set along the way. I
> can verify this works well and will run without issues for a few minutes
> on startup. I have a similar RX component that receives this
> transmission in an analog loopback approach so I can verify the
> transmission. I have also inserted a packet number in my transmit data
> and have a checker (in the HDL) on the transmit side (upstream of my
> component) to check when an out-of-sequence packet occurs. In chipscope
> I have it triggering when it happens so I can observe this behavior
> independent of the RX process.
>
> Setup: Ubuntu 20 LTS, E320, UHD 4.0.0.0-122-g75f2ba94
>
> Here are some things I have observed:
>
> 1. It will run without an issue for about 1-2 mins on startup. Clean
>    start or re-run does not matter.
>
> 2. It is always 34 source packets that are missing (each is 8 32-bit
>    samples in length) each time it drops.
>
> 3. This never happens back to back, so it looks like something is
>    overflowing upstream; however, it is not perfectly periodic.
>
> 4. If I replace my core tx waveform processing with a simple fifo and
>    allow the 8-sample packets to flow through my processing (no
>    upsampling), it never drops anything. Obviously the large 1-to-many
>    expansion and the resulting stalling of the upstream is not making
>    things happy.
>
> 5. This continues to happen if I totally disable the RX processing.
>
> 6. There is no indication of underruns or lates or other errors coming
>    from the tx_core downstream of my component. I also verified this by
>    chipscoping that component and looking for anything.
>
> Some things I have tried:
>
> 1. I did increase the (info, pyld) fifo sizes on the input side of my
>    component's noc_shell. Did not change the behavior. I did not touch
>    the stream endpoint buffers.
> 2. I am generally running this in host mode; however, I did try cross
>    compiling the app and running in embedded mode on the E320. An
>    interesting observation is that it then becomes exactly 33 packets
>    that are lost each time (weird or telling?).
> 3. If I insert usleeps in the while loop pushing down the data
>    (txstream->send()) I can change the behavior so that it happens less
>    frequently, takes longer to happen the first time, and the number
>    lost can differ from the usual 34. In my HDL I increment the
>    timestamp by 50 msec, so the obvious perfect sleep would be something
>    like 50 msec minus the time the rest of the code takes. Clearly this
>    is hard to tune. Just setting 50 msec eventually causes a LLLLLLate
>    condition. There is a sweet spot somewhere, but without an RTOS this
>    is a waste of time and would not be the right way to fix this.
>
> Any help or insight (things to try) would be greatly appreciated. I am
> out of ideas. My final idea would be to put my own FIFO just in front
> with a level indicator. Fill it up halfway and then monitor it with a
> register to keep it happy. Assuming I could keep up with this polling
> approach, it should keep it happy unless there is a real bug upstream
> and someone is not obeying AXIS protocol. I would think this would be
> unnecessary, however, since RFNOC should not allow something like this
> to happen.
>
> Thanks in advance,
> Jeff Long
>
> _______________________________________________
> USRP-users mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
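
The packet-number check described in the quoted message can also be modeled on the host, for example to post-process packet numbers exported from a chipscope capture. The following is only a minimal Python sketch of that idea, not the HDL checker from the thread: the function name `find_sequence_gaps`, the 32-bit counter width, and the sample data are all assumptions made for illustration.

```python
from typing import Iterable, List, Tuple


def find_sequence_gaps(
    packet_numbers: Iterable[int], counter_bits: int = 32
) -> List[Tuple[int, int, int]]:
    """Report every out-of-sequence packet number.

    Returns a list of (index, expected, missing) tuples, where `index` is the
    position in the input at which the jump was seen, `expected` is the packet
    number that should have arrived, and `missing` is how many packet numbers
    were skipped (computed modulo the assumed counter width).
    """
    modulus = 1 << counter_bits  # assumed counter width, not from the thread
    gaps = []
    expected = None
    for index, number in enumerate(packet_numbers):
        if expected is not None and number != expected:
            # A forward jump of k gives k; a repeated or reordered packet
            # shows up as a very large value because of the modulo.
            missing = (number - expected) % modulus
            gaps.append((index, expected, missing))
        expected = (number + 1) % modulus
    return gaps


# Hypothetical capture: packets 100..133 (34 packets) never arrive.
received = list(range(0, 100)) + list(range(134, 200))
print(find_sequence_gaps(received))  # [(100, 100, 34)]
```

Running a check like this on captures taken at several points along the path (for example right after the ethernet interface and again at the block input) would show between which two points the run of 34 packets disappears.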
