On Sat, Mar 28, 2015 at 12:50 PM, Andy Walls <[email protected]>
wrote:

> On Sat, 2015-03-28 at 14:45 -0400, Andy Walls wrote:
> > Hi Tom:
> >
> >
> > On Sat, 2015-03-28 at 11:12 -0700, Tom Rondeau wrote:
> > > On Sat, Mar 28, 2015 at 11:00 AM, Andy Walls
> > > <[email protected]> wrote:
> >
> > >         Can this memmove() be safely skipped
> > >
> > >         https://github.com/gnuradio/gnuradio/blob/master/gr-qtgui/lib/time_sink_f_impl.cc#L627
> > [snip]
> > >         The volk_32f_convert_64f_u_avx() call is unavoidable as Qwt wants
> > >         doubles for plotting and not floats. But it might also be able to be
> > >         deferred to the very end when the decision to plot is known for sure.
> > >         (But that's more surgery than I care to take on at the moment.)
> >
>
> >
> > >  But thinking about the volk convert function, that's both copying the
> > > data from the input buffer into the internal buffer as well as
> > > performing the conversion. We can't just hold data in the input since
> > > we don't want to back up the data until we're ready to plot both with
> > > timing and with a full enough buffer -- it's just sampling a section
> > > at a time and drops everything in between.
> >
> > Right.
> >
> > >  That part could be converted into a memcpy instead of the volk
> > > convert. Then, when we're ready to plot, we call the volk convert that
> > > also does the move from d_start to 0, so it combines those two
> > > elements.
> >
> > Yeah, that's the surgery part. :)  It would require adding a new set of
> > buffers to hold floats, and then converting them once the decision to
> > plot has been made.
> >
> > This also affects the memmove() of the tail for the trigger delay.  It
> > would operate on the new set of float buffers (vs the buffers holding
> > doubles).
> >
> > > Thoughts on those proposals?
>
> Your proposal for implementing memcpy() and deferring volk_*() to do the
> conversion and "memmove" in one step is great!  :)
>
> I just implemented it, and the time_sink_f thread has gone from 41.5%
> CPU down to 29.1% CPU in my tests. :)  memcpy() now dominates the
> thread, but that's to be expected.
>
>
>
> With my initial hack:
>
> > CPU: Intel Sandy Bridge microarchitecture, speed 3.5e+06 MHz (estimated)
> > Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000
> > samples  %        image name                             symbol name
> > 78158    39.0737  libvolk.so.0.0.0                       volk_32f_convert_64f_u_avx
> > 22777    11.3870  no-vmlinux                             /no-vmlinux
> > 13972     6.9851  libgnuradio-qtgui-3.7.7git.so.0.0.0    gr::qtgui::time_sink_f_impl::_test_trigger_slope(float const*) const
> > 7781      3.8900  libgnuradio-qtgui-3.7.7git.so.0.0.0    gr::qtgui::time_sink_f_impl::_test_trigger_norm(int, std::vector<void const*, std::allocator<void const*> >)
> > 7236      3.6175  libpthread-2.18.so                     pthread_mutex_lock
> > 6163      3.0811  libgnuradio-runtime-3.7.7git.so.0.0.0  boost::detail::sp_counted_base::release()
> > 5942      2.9706  libpthread-2.18.so                     pthread_mutex_unlock
> > 4947      2.4732  libgnuradio-runtime-3.7.7git.so.0.0.0  gr::block_executor::run_one_iteration()
> > 3826      1.9127  libgnuradio-runtime-3.7.7git.so.0.0.0  gr::block_detail::input(unsigned int)
> > 3555      1.7773  libstdc++.so.6.0.19                    /usr/lib64/libstdc++.so.6.0.19
> > 3206      1.6028  libc-2.18.so                           __memmove_ssse3_back
> > [...]
>
> With my implementation of your suggestion:
>
> CPU: Intel Sandy Bridge microarchitecture, speed 3.5e+06 MHz (estimated)
> Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 90000
> samples  %        image name                             symbol name
> 27595    35.6051  libc-2.18.so                           __memcpy_sse2_unaligned
> 12225    15.7736  no-vmlinux                             /no-vmlinux
> 4051      5.2269  libpthread-2.18.so                     pthread_mutex_lock
> 3739      4.8243  libgnuradio-runtime-3.7.7git.so.0.0.0  boost::detail::sp_counted_base::release()
> 3362      4.3379  libpthread-2.18.so                     pthread_mutex_unlock
> 2876      3.7108  libgnuradio-runtime-3.7.7git.so.0.0.0  gr::block_executor::run_one_iteration()
> 2364      3.0502  libgnuradio-runtime-3.7.7git.so.0.0.0  gr::block_detail::input(unsigned int)
> 2091      2.6980  libstdc++.so.6.0.19                    /usr/lib64/libstdc++.so.6.0.19
> 1388      1.7909  libgnuradio-runtime-3.7.7git.so.0.0.0  gr::tpb_detail::notify_upstream(gr::block_detail*)
> 1138      1.4683  libc-2.18.so                           __memmove_ssse3_back
> [...]
> 2         0.0026  libvolk.so.0.0.0                       __volk_32f_convert_64f_d
> [...]
> 1         0.0013  libvolk.so.0.0.0                       volk_32f_convert_64f_a_avx
>
>
> Regards,
> Andy
>


Andy,

Excellent!

I've got a few other minor patches for some things; I'll put this in there
too and test on my end as well.

Tom
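
For anyone reading this thread in the archive, here is a rough sketch of the
idea discussed above: memcpy() the incoming floats into a staging buffer on
every call, and only convert to doubles (starting at the d_start offset) once
the decision to plot has been made. This is not the actual patch. The names
deferred_sink, stage(), ready(), flush_to_plot() and convert_32f_to_64f() are
hypothetical, and the plain loop merely stands in for a call to
volk_32f_convert_64f() so the sketch builds without VOLK.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Stand-in for volk_32f_convert_64f(out, in, n); an assumption made so the
// sketch compiles without linking against VOLK.
static void convert_32f_to_64f(double* out, const float* in, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
        out[i] = static_cast<double>(in[i]);
}

// Hypothetical miniature of the sink's buffering; not the real
// time_sink_f_impl class.
struct deferred_sink {
    std::vector<float> fbuf;  // staging buffer of raw floats
    std::vector<double> plot; // doubles handed to the plotting code
    std::size_t start = 0;    // first sample to plot (d_start in the thread)
    std::size_t end = 0;      // number of valid floats staged so far
    std::size_t size;         // number of points per plot

    explicit deferred_sink(std::size_t n) : fbuf(2 * n), plot(n), size(n) {}

    // Called on every work(): a cheap copy, no float-to-double conversion.
    // Returns how many samples were actually staged.
    std::size_t stage(const float* in, std::size_t n)
    {
        const std::size_t room = fbuf.size() - end;
        const std::size_t k = n < room ? n : room;
        std::memcpy(fbuf.data() + end, in, k * sizeof(float));
        end += k;
        return k;
    }

    // True once a full plot's worth of samples sits at or after 'start'.
    bool ready() const { return end >= start && end - start >= size; }

    // Called only when we have decided to plot: the conversion reads from
    // offset 'start' and writes to offset 0, so it also does the "memmove".
    void flush_to_plot()
    {
        assert(ready());
        convert_32f_to_64f(plot.data(), fbuf.data() + start, size);
        start = 0;
        end = 0;
    }
};
```

The point of the design is that the per-sample conversion cost is paid only
for the samples that actually get drawn, which is consistent with the
profiles above where memcpy() replaces the volk convert at the top.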
