On Sat, 2015-03-28 at 14:45 -0400, Andy Walls wrote:
> Hi Tom:
>
> On Sat, 2015-03-28 at 11:12 -0700, Tom Rondeau wrote:
> > On Sat, Mar 28, 2015 at 11:00 AM, Andy Walls
> > <[email protected]> wrote:
> > > Can this memmove() be safely skipped
> > >
> > > https://github.com/gnuradio/gnuradio/blob/master/gr-qtgui/lib/time_sink_f_impl.cc#L627
> [snip]
> > > The volk_32f_convert_64f_u_avx() call is unavoidable as Qwt wants
> > > doubles for plotting and not floats. But it might also be able to be
> > > deferred to the very end when the decision to plot is known for sure.
> > > (But that's more surgery than I care to take on at the moment.)
> >
> > But thinking about the volk convert function, that's both copying the
> > data from the input buffer into the internal buffer as well as
> > performing the conversion. We can't just hold data in the input since
> > we don't want to back up the data until we're ready to plot both with
> > timing and with a full enough buffer -- it's just sampling a section
> > at a time and drops everything in between.
>
> Right.
>
> > That part could be converted into a memcpy instead of the volk
> > convert. Then, when we're ready to plot, we call the volk convert that
> > also does the move from d_start to 0, so it combines those two
> > elements.
>
> Yeah, that's the surgery part. :) It would require adding a new set of
> buffers to hold float objects, and then convert them when a
> determination to plot was made.
>
> This also affects the memmove() of the tail for the trigger delay. It
> would operate on the new set of float buffers (vs the buffers holding
> doubles).
>
> > Thoughts on those proposals?

Your proposal for implementing memcpy() and deferring volk_*() to do the
conversion and "memmove" in one step is great! :)

I just implemented it, and the time_sink_f thread has gone from 41.5% CPU
down to 29.1% CPU in my tests. :) memcpy() now dominates the thread,
but that's to be expected.
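
For anyone following along, here is a rough sketch of the idea discussed above: store raw floats with memcpy() on every call, and only convert to doubles once the decision to plot is made, reading from the start offset and writing to index 0 so the conversion also does the "memmove". This is not the actual time_sink_f_impl code. The class name, the member names (d_fbuf, d_plot, d_count) and the single-channel layout are made up for illustration, and a plain loop stands in for the volk conversion so the example has no external dependencies.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the sink's buffering; not the real
// gr::qtgui::time_sink_f_impl members.
class deferred_plot_buffer
{
public:
    explicit deferred_plot_buffer(std::size_t capacity)
        : d_fbuf(capacity), d_plot(capacity), d_count(0)
    {
    }

    // Called for every chunk of input: a cheap raw copy, no conversion yet.
    // Returns the number of samples actually stored.
    std::size_t store(const float* in, std::size_t n)
    {
        const std::size_t room = d_fbuf.size() - d_count;
        if (n > room)
            n = room;
        if (n == 0)
            return 0;
        std::memcpy(d_fbuf.data() + d_count, in, n * sizeof(float));
        d_count += n;
        return n;
    }

    // Called only once we know we will plot: convert the samples from
    // 'start' onward into the double buffer beginning at index 0. The
    // float-to-double conversion and the shift to the front happen in the
    // same pass. Returns the number of samples converted.
    std::size_t convert_for_plot(std::size_t start)
    {
        if (start > d_count)
            start = d_count;
        const std::size_t n = d_count - start;
        for (std::size_t i = 0; i < n; i++)
            d_plot[i] = static_cast<double>(d_fbuf[start + i]);
        d_count = 0; // float buffer is free for the next capture
        return n;
    }

    const std::vector<double>& plot_data() const { return d_plot; }

private:
    std::vector<float> d_fbuf;  // raw samples as received
    std::vector<double> d_plot; // doubles handed to the plotting code
    std::size_t d_count;        // number of valid floats in d_fbuf
};
```

In the real block the loop in convert_for_plot() would presumably be a call to volk_32f_convert_64f() with the input pointer offset by the start index, and there would be one float buffer per input stream.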

With my initial hack:

> CPU: Intel Sandy Bridge microarchitecture, speed 3.5e+06 MHz (estimated)
> Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit
> mask of 0x00 (No unit mask) count 100000
> samples  %        image name                             symbol name
> 78158    39.0737  libvolk.so.0.0.0                       volk_32f_convert_64f_u_avx
> 22777    11.3870  no-vmlinux                             /no-vmlinux
> 13972    6.9851   libgnuradio-qtgui-3.7.7git.so.0.0.0    gr::qtgui::time_sink_f_impl::_test_trigger_slope(float const*) const
> 7781     3.8900   libgnuradio-qtgui-3.7.7git.so.0.0.0    gr::qtgui::time_sink_f_impl::_test_trigger_norm(int, std::vector<void const*, std::allocator<void const*> >)
> 7236     3.6175   libpthread-2.18.so                     pthread_mutex_lock
> 6163     3.0811   libgnuradio-runtime-3.7.7git.so.0.0.0  boost::detail::sp_counted_base::release()
> 5942     2.9706   libpthread-2.18.so                     pthread_mutex_unlock
> 4947     2.4732   libgnuradio-runtime-3.7.7git.so.0.0.0  gr::block_executor::run_one_iteration()
> 3826     1.9127   libgnuradio-runtime-3.7.7git.so.0.0.0  gr::block_detail::input(unsigned int)
> 3555     1.7773   libstdc++.so.6.0.19                    /usr/lib64/libstdc++.so.6.0.19
> 3206     1.6028   libc-2.18.so                           __memmove_ssse3_back
> [...]

With my implementation of your suggestion:

CPU: Intel Sandy Bridge microarchitecture, speed 3.5e+06 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit
mask of 0x00 (No unit mask) count 90000
samples  %        image name                             symbol name
27595    35.6051  libc-2.18.so                           __memcpy_sse2_unaligned
12225    15.7736  no-vmlinux                             /no-vmlinux
4051     5.2269   libpthread-2.18.so                     pthread_mutex_lock
3739     4.8243   libgnuradio-runtime-3.7.7git.so.0.0.0  boost::detail::sp_counted_base::release()
3362     4.3379   libpthread-2.18.so                     pthread_mutex_unlock
2876     3.7108   libgnuradio-runtime-3.7.7git.so.0.0.0  gr::block_executor::run_one_iteration()
2364     3.0502   libgnuradio-runtime-3.7.7git.so.0.0.0  gr::block_detail::input(unsigned int)
2091     2.6980   libstdc++.so.6.0.19                    /usr/lib64/libstdc++.so.6.0.19
1388     1.7909   libgnuradio-runtime-3.7.7git.so.0.0.0  gr::tpb_detail::notify_upstream(gr::block_detail*)
1138     1.4683   libc-2.18.so                           __memmove_ssse3_back
[...]
2        0.0026   libvolk.so.0.0.0                       __volk_32f_convert_64f_d
[...]
1        0.0013   libvolk.so.0.0.0                       volk_32f_convert_64f_a_avx

Regards,
Andy
