I had the same issues -- the big performance eater in my case was anything that was doing modulo in a tight loop.
So, if you have something like:

    for (int i = 0; i < 1000; i++) { array[i % arr_size] = ... }

you'll take a pretty big hit.

On Wed, Sep 6, 2017 at 4:00 PM, Tom Bereknyei via USRP-users <usrp-users@lists.ettus.com> wrote:

> We ran into a similar issue. Big things that helped us was to move high
> rate dsp calculations to RFNoC.
>
> I've also had luck with volk_profile. It seems to help with some
> workloads.
>
> On Wed, Sep 6, 2017 at 16:53 Philip Balister via USRP-users <usrp-users@lists.ettus.com> wrote:
>
>> On 09/06/2017 04:38 PM, Marcus Müller via USRP-users wrote:
>> > Hi Mr Hamilton,
>> >
>> > So, what you'd want to optimize first depends on what needs the most
>> > optimization. Your x86 program might be a good place to start looking
>> > into what the bottleneck is. If you're running Linux on your x86, I can
>> > heartily recommend `perf`, which is a tool that lets you display live,
>> > record and analyze the points in your code where the program spends most
>> > time.
>>
>> "perf top" gives results pretty quickly.
>>
>> It also sounds like you aren't using both cpu's to the full extent.
>> Maybe there is just one block doing all the work?
>>
>> Also, looking at using rfnoc to do high rate functions to reduce
>> calculations that need doing on the arm is a good plan.
>>
>> Philip
>>
>> >
>> > In general, modern x86 have way larger memory bandwidth and larger CPU
>> > caches, so that alone can become critical, but also things like more
>> > capable SIMD instructions and less hardware-handling overhead.
>> >
>> > I don't know whether this helped you much, but I hope it's a start,
>> > best regards,
>> >
>> > Marcus Müller
>> >
>> > On 09/06/2017 10:06 PM, S Hamilton via USRP-users wrote:
>> >> We're moving an application that we had running on pc hardware with
>> >> the Ettus B210, to the embedded arm E310. On the pc side we were at
>> >> 80% idle cpu when running (intel i5-4570). With armv7 we're down to
>> >> 30% idle, with one of the cores @100% so it's not keeping up.
>> >> Are there any arm specific optimizations that are recommended or
>> >> gotchas.
>> >> We are using the release4 version of the SDK and firmware.
>> >>
>> >> We'd also like to use the complex_to_mag_approx RFNOC block. Is there
>> >> any sample code around to look at.
>> >>
>> >> Thanks,
>>
> --
> Maj Tom Bereknyei
> Defense Digital Service
> t...@dds.mil
> (571) 225-1630

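
On the modulo loop at the top of this message: as far as I know, the Cortex-A9 cores in the E310 have no hardware integer divide instruction, so a `%` whose divisor is only known at run time typically becomes a call into a software division routine on every iteration. That would explain why it costs so much more on the ARM than on the i5. Below is a minimal sketch in C of two common ways around it. The function names (fill_modulo, fill_wrap, fill_mask) and the int32_t element type are made up for illustration and are not taken from anyone's actual application.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Baseline: the pattern from the message above. The divisor is a run-time
 * value, so every iteration pays for a full integer division. */
void fill_modulo(int32_t *array, size_t arr_size, size_t count)
{
    assert(arr_size != 0);
    for (size_t i = 0; i < count; i++) {
        array[i % arr_size] = (int32_t)i;
    }
}

/* Alternative 1: keep a separate write index and wrap it with a compare.
 * Works for any non-zero arr_size and needs no division at all. */
void fill_wrap(int32_t *array, size_t arr_size, size_t count)
{
    assert(arr_size != 0);
    size_t idx = 0;
    for (size_t i = 0; i < count; i++) {
        array[idx] = (int32_t)i;
        if (++idx == arr_size) {
            idx = 0;  /* wrap back to the start of the buffer */
        }
    }
}

/* Alternative 2: if arr_size is a power of two, masking with (arr_size - 1)
 * gives the same result as % for unsigned indices. */
void fill_mask(int32_t *array, size_t arr_size, size_t count)
{
    /* Assumption of this sketch: arr_size is a non-zero power of two. */
    assert(arr_size != 0 && (arr_size & (arr_size - 1)) == 0);
    const size_t mask = arr_size - 1;
    for (size_t i = 0; i < count; i++) {
        array[i & mask] = (int32_t)i;
    }
}
```

All three functions leave the same contents in the buffer. fill_wrap is the general replacement, while fill_mask only applies if you can round the buffer size up to a power of two. It is worth confirming with perf on the target that the modulo really is the hot spot before changing code.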
_______________________________________________ USRP-users mailing list USRP-users@lists.ettus.com http://lists.ettus.com/mailman/listinfo/usrp-users_lists.ettus.com