On Sat, Mar 28, 2015 at 12:50 PM, Andy Walls <[email protected]> wrote:
> On Sat, 2015-03-28 at 14:45 -0400, Andy Walls wrote:
> > Hi Tom:
> >
> > On Sat, 2015-03-28 at 11:12 -0700, Tom Rondeau wrote:
> > > On Sat, Mar 28, 2015 at 11:00 AM, Andy Walls
> > > <[email protected]> wrote:
> > > >
> > > > Can this memmove() be safely skipped
> > > >
> > > > https://github.com/gnuradio/gnuradio/blob/master/gr-qtgui/lib/time_sink_f_impl.cc#L627
> >
> > [snip]
> >
> > > > The volk_32f_convert_64f_u_avx() call is unavoidable as Qwt wants
> > > > doubles for plotting and not floats. But it might also be able to be
> > > > deferred to the very end when the decision to plot is known for sure.
> > > > (But that's more surgery than I care to take on at the moment.)
> > >
> > > But thinking about the volk convert function, that's both copying the
> > > data from the input buffer into the internal buffer as well as
> > > performing the conversion. We can't just hold data in the input since
> > > we don't want to back up the data until we're ready to plot both with
> > > timing and with a full enough buffer -- it's just sampling a section
> > > at a time and drops everything in between.
> >
> > Right.
> >
> > > That part could be converted into a memcpy instead of the volk
> > > convert. Then, when we're ready to plot, we call the volk convert that
> > > also does the move from d_start to 0, so it combines those two
> > > elements.
> >
> > Yeah, that's the surgery part. :) It would require adding a new set of
> > buffers to hold float objects, and then convert them when a
> > determination to plot was made.
> >
> > This also affects the memmove() of the tail for the trigger delay. It
> > would operate on the new set of float buffers (vs the buffers holding
> > doubles).
> >
> > > Thoughts on those proposals?
>
> Your proposal for implementing memcpy() and deferring volk_*() to do the
> conversion and "memmove" in one step is great! :)
>
> I just implemented it, and the time_sink_f thread has gone from 41.5%
> CPU down to 29.1% CPU in my tests. :) memcpy() now dominates the
> thread, but that's to be expected.
>
> With my initial hack:
>
> CPU: Intel Sandy Bridge microarchitecture, speed 3.5e+06 MHz (estimated)
> Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a
> unit mask of 0x00 (No unit mask) count 100000
>
> samples  %        image name                             symbol name
> 78158    39.0737  libvolk.so.0.0.0                       volk_32f_convert_64f_u_avx
> 22777    11.3870  no-vmlinux                             /no-vmlinux
> 13972     6.9851  libgnuradio-qtgui-3.7.7git.so.0.0.0    gr::qtgui::time_sink_f_impl::_test_trigger_slope(float const*) const
> 7781      3.8900  libgnuradio-qtgui-3.7.7git.so.0.0.0    gr::qtgui::time_sink_f_impl::_test_trigger_norm(int, std::vector<void const*, std::allocator<void const*> >)
> 7236      3.6175  libpthread-2.18.so                     pthread_mutex_lock
> 6163      3.0811  libgnuradio-runtime-3.7.7git.so.0.0.0  boost::detail::sp_counted_base::release()
> 5942      2.9706  libpthread-2.18.so                     pthread_mutex_unlock
> 4947      2.4732  libgnuradio-runtime-3.7.7git.so.0.0.0  gr::block_executor::run_one_iteration()
> 3826      1.9127  libgnuradio-runtime-3.7.7git.so.0.0.0  gr::block_detail::input(unsigned int)
> 3555      1.7773  libstdc++.so.6.0.19                    /usr/lib64/libstdc++.so.6.0.19
> 3206      1.6028  libc-2.18.so                           __memmove_ssse3_back
> [...]
>
> With my implementation of your suggestion:
>
> CPU: Intel Sandy Bridge microarchitecture, speed 3.5e+06 MHz (estimated)
> Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a
> unit mask of 0x00 (No unit mask) count 90000
>
> samples  %        image name                             symbol name
> 27595    35.6051  libc-2.18.so                           __memcpy_sse2_unaligned
> 12225    15.7736  no-vmlinux                             /no-vmlinux
> 4051      5.2269  libpthread-2.18.so                     pthread_mutex_lock
> 3739      4.8243  libgnuradio-runtime-3.7.7git.so.0.0.0  boost::detail::sp_counted_base::release()
> 3362      4.3379  libpthread-2.18.so                     pthread_mutex_unlock
> 2876      3.7108  libgnuradio-runtime-3.7.7git.so.0.0.0  gr::block_executor::run_one_iteration()
> 2364      3.0502  libgnuradio-runtime-3.7.7git.so.0.0.0  gr::block_detail::input(unsigned int)
> 2091      2.6980  libstdc++.so.6.0.19                    /usr/lib64/libstdc++.so.6.0.19
> 1388      1.7909  libgnuradio-runtime-3.7.7git.so.0.0.0  gr::tpb_detail::notify_upstream(gr::block_detail*)
> 1138      1.4683  libc-2.18.so                           __memmove_ssse3_back
> [...]
> 2         0.0026  libvolk.so.0.0.0                       __volk_32f_convert_64f_d
> [...]
> 1         0.0013  libvolk.so.0.0.0                       volk_32f_convert_64f_a_avx
>
> Regards,
> Andy

Andy,

Excellent! I've got a few other minor patches for some things, I'll put
this in there too and test on my end as well.

Tom
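
[Editor's sketch, not the patch discussed above.] For readers following the thread, the idea of staging raw floats with memcpy() and deferring the float-to-double conversion until a plot is certain might look roughly like the C++ below. This is a minimal illustration under stated assumptions: the class `deferred_sink`, its members, and the helper `convert_32f_to_64f()` are hypothetical names invented here. The helper is a plain loop standing in for the VOLK convert kernel so the example compiles without VOLK, and the real time_sink_f_impl also handles multiple channels, triggering, and locking, all of which are omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Stand-in for the VOLK float-to-double kernel (assumption: a plain loop
// is used here only so the sketch builds without linking against VOLK).
static void convert_32f_to_64f(double* out, const float* in, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
        out[i] = static_cast<double>(in[i]);
}

// Hypothetical, simplified sink: one channel, no trigger logic, no mutex.
class deferred_sink
{
public:
    explicit deferred_sink(std::size_t size)
        : d_size(size), d_index(0), d_fbuf(2 * size), d_plot(size)
    {
    }

    // Hot path: stage raw floats with memcpy() only, no conversion yet.
    // Returns the number of items actually consumed.
    std::size_t work(const float* in, std::size_t nitems)
    {
        const std::size_t room = d_fbuf.size() - d_index;
        const std::size_t n = nitems < room ? nitems : room;
        if (n > 0)
            std::memcpy(d_fbuf.data() + d_index, in, n * sizeof(float));
        d_index += n;
        return n;
    }

    // True once a full frame beginning at offset `start` has been staged.
    bool ready(std::size_t start) const { return d_index >= start + d_size; }

    // Plot path: convert d_size samples beginning at `start` straight into
    // index 0 of the double buffer, so the conversion and the shift from
    // `start` to 0 happen in a single pass. Staged data is then discarded.
    const std::vector<double>& prepare_plot(std::size_t start)
    {
        assert(ready(start));
        convert_32f_to_64f(d_plot.data(), d_fbuf.data() + start, d_size);
        d_index = 0;
        return d_plot;
    }

private:
    std::size_t d_size;         // samples per plotted frame
    std::size_t d_index;        // number of floats currently staged
    std::vector<float> d_fbuf;  // staging buffer of raw floats
    std::vector<double> d_plot; // doubles handed to the plotting layer
};
```

In this sketch the per-call cost is just a memcpy(), and the conversion runs once per plotted frame instead of once per work call, which is consistent with the profiles above where memcpy replaces the VOLK convert as the dominant symbol.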
_______________________________________________
Discuss-gnuradio mailing list
[email protected]
https://lists.gnu.org/mailman/listinfo/discuss-gnuradio
