On 08/30/2020 02:44 AM, Nando Pellegrini wrote:
Marcus,
I really do not understand what you are trying to demonstrate. I started long ago trying to use the easy GR blocks with the B200mini, and found out at once that the overhead introduced by the GR blocks was limiting the efficiency, so you discovered the "warm water", as we say in Italy. I was just going to waste some more of my time following your indications, but to be frank I am tired of reporting problems and receiving back questions instead of answers, and considerations more philosophical than technical. The point is that I bought from Ettus a device which was promising, and which advertised a certain level of performance. I have to admit that all was true up to about a year ago, but not anymore. Why? What should I do to see my expectations satisfied? If a USB Linux-based system is not able to sustain your products, what kind of conclusion do you think we are forced to draw?
nando
I'm not sure how I can provide technical support if asking clarifying questions is not acceptable.

You note that "up to a year ago, I could achieve this performance". But what has changed? From your messages it seems that both the OS and the computer hardware have changed, unless I'm misunderstanding what you're saying. When you say performance has changed, do you mean with *exactly the same hardware and OS environment*? If so, the only thing I can think of is that a kernel update has changed the performance envelope of USB support. If you're on an Intel CPU, a kernel update may have turned on something called KPTI (Kernel Page Table Isolation), which often has significant performance implications -- up to 30%. You might check the status of KPTI on your systems, to see if a kernel update has silently turned it on:

https://askubuntu.com/questions/992137/how-to-check-that-kpti-is-enabled-on-my-ubuntu
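For reference, on recent kernels the KPTI status can also be read directly from sysfs. A minimal check, assuming a 4.15-or-newer kernel (the dmesg fallback is for older kernels that only log it at boot):

```shell
# Quick KPTI status check. "Mitigation: PTI" means KPTI is enabled;
# "Not affected" is typical on AMD CPUs.
f=/sys/devices/system/cpu/vulnerabilities/meltdown
if [ -r "$f" ]; then
    cat "$f"
else
    # Older kernels report it in the boot log instead:
    dmesg 2>/dev/null | grep -i 'page table isolation' || echo "KPTI status unknown"
fi
```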

Having said all that, I went into the lab today to try high-rate streaming with a B200mini on a different system. I was able to achieve 56 Msps sustained without overruns.

This is with:

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 58
model name : Intel(R) Xeon(R) CPU E3-1230 V2 @ 3.30GHz

The USB controller was:

03:00.0 USB controller: Etron Technology, Inc. EJ188/EJ198 USB 3.0 Host Controller (prog-if 30 [XHCI])
    Subsystem: Etron Technology, Inc. EJ188/EJ198 USB 3.0 Host Controller
    Flags: bus master, fast devsel, latency 0, IRQ 31
    Memory at df300000 (64-bit, non-prefetchable) [size=32K]
    Capabilities: <access denied>
    Kernel driver in use: xhci_hcd

And kernel version is:

Linux localhost.localdomain 5.7.15-100.fc31.x86_64 #1 SMP Tue Aug 11 17:18:01 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

The UHD version was 3.14.1.0 -- same as on the AMD Phenom II X6 system I tested yesterday.

I used a simple Gnu Radio flow-graph, just like yesterday.

I will go back into the lab later this evening or tomorrow and try the latest RC version of UHD, but my success at 56 Msps with UHD 3.14.1.0 (the version packaged in Fedora 31) indicates that streaming at that rate continues to be possible with the right compute hardware and USB controller.

It is interesting to note that on this system, the CPUs are operating in "Power Save" mode, and KPTI is *enabled*. On the AMD system I tested yesterday, KPTI isn't enabled (because AMD CPUs don't need it), and the CPUs were operating in "Performance" mode. So the performance envelope of the system as a whole plays a significant role.
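As an aside, the frequency-governor setting mentioned above can be inspected per-core; a one-line sketch using the standard Linux cpufreq sysfs path (not every system exposes this interface):

```shell
# Show the active scaling governor for each CPU core, e.g. "powersave"
# or "performance"; fall back gracefully if cpufreq sysfs is absent.
cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor 2>/dev/null \
    || echo "cpufreq sysfs interface not available"
```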




On 8/29/2020 20:51, Marcus D. Leech wrote:
On 08/29/2020 03:35 AM, Nando Pellegrini wrote:
Marcus,
Attached you can find the results of the benchmark test.
I have also compared the behavior with two different CPUs and different USB controllers: USB 3.0 on the older tower PC, USB 3.1 on the laptop. Very strange is the case of the older CPU generating an overflow every minute. The conditions were exactly the same in all tests, with no other visible activity on the machines. Release 14.0 seems a bit better with the benchmark but, sadly, the two UHD releases are not comparable, because 14.0, as soon as it generates an overflow indication, drops into a timeout with no recovery. My final conclusion is that fast sample rates have become unusable for long signal recordings, regardless of software release and PC.
I really hope for a solution.
nando
I played a bit with a B210 on a Fedora 31 system today, and was unable to achieve greater than 37 Msps without overruns.

I constructed a "degenerate-case" Gnu Radio flow graph that was just:

uhd-source-->null-sink

That's roughly equivalent to what benchmark_rate does, and I was forced to do that since F31 doesn't appear to package benchmark_rate and some of the other UHD examples.

This was with UHD 3.14.0.1

The system was an AMD Phenom II X6 1090T.

What I noticed was that above 38 Msps you'd get continuous overruns, and at 38 Msps you'd get a burst of overruns whenever you switched to a new window. This is CLEARLY a system effect, unrelated to UHD at all -- likely contention for memory access, interrupt latency, or PCIe transaction contention. The CPU consumption for the gr-uhd thread servicing the USB interface never rose above 38%.

Note that the UHD transport code is single-threaded. It's tempting to ask "why not make it multi-threaded?" That was tried, several times, a few years back, and performance was *worse* with the UHD transport spread over multiple threads -- probably due to resource contention at the kernel interface.

I'll note that no matter whether I specified sc8, sc12, or sc16 sample sizes, I saw the same behavior. This indicates to me that the limiting factor isn't USB *bandwidth* so much as the offered USB (and, by implication, PCIe) *transaction* load. It is likely that different USB3 controllers make this better or worse, depending on their interrupt behavior, how they do DMA, etc. I did have to use num_recv_frames > 200 to achieve even this.
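For anyone wanting to reproduce this, the transport buffering is set through UHD device arguments; a hypothetical benchmark_rate invocation (the num_recv_frames value of 512 and the device-args string are illustrative, not the exact settings from my test):

```shell
# Stream RX-only at 56 Msps for 60 seconds, with a deeper receive-frame pool
# than the default to ride out scheduling hiccups.
benchmark_rate --args "type=b200,num_recv_frames=512" --rx_rate 56e6 --duration 60
```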

I'll make a general comment that achieving loss-free, "real-time", high-bandwidth streaming on a general-purpose operating system is always going to require a lot of tuning, and not a small amount of good luck. Other high-speed streaming applications are somewhat tolerant of one end saying "hey, stop sending for a bit" -- disk drives, network interfaces, etc. But when you're trying to sample the "real world", you cannot reasonably put it "on hold" while you "catch up", which is why throwing more buffering at this problem generally doesn't work that well. If the offered load exceeds long-term capacity, even by a tiny bit, you will end up "losing". It is clear that "capacity" is only loosely coupled to CPU performance, and is much better represented by overall *system* performance.
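To put the 56 Msps figure in perspective, here is a back-of-the-envelope calculation (the USB 3.0 effective-throughput figure is a rough approximation of mine, not from the measurements above):

```python
# Sustained payload rate for a complex sample stream.
# sc16 samples are 2 x int16 = 4 bytes each.
def stream_rate_bytes(samp_rate_sps, bytes_per_sample=4):
    """Return the sustained payload rate in bytes/second."""
    return samp_rate_sps * bytes_per_sample

rate = stream_rate_bytes(56e6)     # 56 Msps, sc16 over the wire
print(rate / 1e6, "MB/s")          # 224.0 MB/s
# USB 3.0 signals at 5 Gb/s; after 8b/10b line coding and protocol
# overhead, roughly 350-400 MB/s is achievable in practice. So 224 MB/s
# fits in the raw bandwidth -- the hard part is the transaction load
# the host controller and kernel must sustain without pausing.
```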

Over the years, folks have pointed at UHD, hoping that some kind of performance-tuning exercise within UHD will get them the performance their application requires. UHD has been optimized quite a bit over the years (roughly 10 years at this point). But UHD lives within an overall *system*, and it can only do as well as that system allows.







_______________________________________________
USRP-users mailing list
[email protected]
http://lists.ettus.com/mailman/listinfo/usrp-users_lists.ettus.com
