When testing, I used 5 float streams running at over 150 Msps each, with 15 
microsecond bursts of 50 MHz at about 10 microseconds apart.  I used enough x 
points to see two bursts on the GUI.  Normal trigger.  (Free or auto trigger 
might be too taxing.)
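
As a rough sanity check on the point count, here is a small sketch of the
arithmetic. It assumes exactly 150 Msps, exactly 15 microsecond bursts, and
that "10 microseconds apart" means the gap from the end of one burst to the
start of the next. The function name is made up for illustration, and the
result is only a lower bound, since it assumes the display starts right at
the first burst.

```cpp
#include <cassert>
#include <cstdint>

// Samples needed to show two full bursts plus the gap between them.
// Times are in microseconds; the rate is in samples per microsecond
// (150 Msps == 150 samples/us). All values are illustrative assumptions.
constexpr std::uint64_t points_for_two_bursts(std::uint64_t samples_per_us,
                                              std::uint64_t burst_us,
                                              std::uint64_t gap_us)
{
    return samples_per_us * (2 * burst_us + gap_us);
}
```

With those assumptions, points_for_two_bursts(150, 15, 10) comes to 6000
points per stream.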

-Regards
Andy

On March 28, 2015 8:06:08 PM EDT, Tom Rondeau <[email protected]> wrote:
>On Sat, Mar 28, 2015 at 12:50 PM, Andy Walls <[email protected]> wrote:
>
>> On Sat, 2015-03-28 at 14:45 -0400, Andy Walls wrote:
>> > Hi Tom:
>> >
>> >
>> > On Sat, 2015-03-28 at 11:12 -0700, Tom Rondeau wrote:
>> > > On Sat, Mar 28, 2015 at 11:00 AM, Andy Walls
>> > > <[email protected]> wrote:
>> >
>> > >         Can this memmove() be safely skipped
>> > >
>> > >
>> > >         https://github.com/gnuradio/gnuradio/blob/master/gr-qtgui/lib/time_sink_f_impl.cc#L627
>> > [snip]
>> > >         The volk_32f_convert_64f_u_avx() call is unavoidable as Qwt
>> > >         wants doubles for plotting and not floats. But it might also be
>> > >         able to be deferred to the very end when the decision to plot is
>> > >         known for sure.
>> > >         (But that's more surgery than I care to take on at the
>> > >         moment.)
>> >
>>
>> >
>> > >  But thinking about the volk convert function, that's both copying the
>> > > data from the input buffer into the internal buffer as well as
>> > > performing the conversion. We can't just hold data in the input since
>> > > we don't want to back up the data until we're ready to plot both with
>> > > timing and with a full enough buffer -- it's just sampling a section
>> > > at a time and drops everything in between.
>> >
>> > Right.
>> >
>> > >  That part could be converted into a memcpy instead of the volk
>> > > convert. Then, when we're ready to plot, we call the volk convert that
>> > > also does the move from d_start to 0, so it combines those two
>> > > elements.
>> >
>> > Yeah, that's the surgery part. :)  It would require adding a new set of
>> > buffers to hold float objects, and then converting them when a
>> > determination to plot was made.
>> >
>> > This also affects the memmove() of the tail for the trigger delay.  It
>> > would operate on the new set of float buffers (vs the buffers holding
>> > doubles).
>> >
>> > > Thoughts on those proposals?
>>
>> Your proposal for implementing memcpy() and deferring volk_*() to do the
>> conversion and "memmove" in one step is great!  :)
>>
>> I just implemented it, and the time_sink_f thread has gone from 41.5%
>> CPU down to 29.1% CPU in my tests. :)  memcpy() now dominates the
>> thread, but that's to be expected.
>>
>>
>>
>> With my initial hack:
>>
>> > CPU: Intel Sandy Bridge microarchitecture, speed 3.5e+06 MHz (estimated)
>> > Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000
>> > samples  %        image name               symbol name
>> > 78158    39.0737  libvolk.so.0.0.0         volk_32f_convert_64f_u_avx
>> > 22777    11.3870  no-vmlinux               /no-vmlinux
>> > 13972     6.9851  libgnuradio-qtgui-3.7.7git.so.0.0.0 gr::qtgui::time_sink_f_impl::_test_trigger_slope(float const*) const
>> > 7781      3.8900  libgnuradio-qtgui-3.7.7git.so.0.0.0 gr::qtgui::time_sink_f_impl::_test_trigger_norm(int, std::vector<void const*, std::allocator<void const*> >)
>> > 7236      3.6175  libpthread-2.18.so       pthread_mutex_lock
>> > 6163      3.0811  libgnuradio-runtime-3.7.7git.so.0.0.0 boost::detail::sp_counted_base::release()
>> > 5942      2.9706  libpthread-2.18.so       pthread_mutex_unlock
>> > 4947      2.4732  libgnuradio-runtime-3.7.7git.so.0.0.0 gr::block_executor::run_one_iteration()
>> > 3826      1.9127  libgnuradio-runtime-3.7.7git.so.0.0.0 gr::block_detail::input(unsigned int)
>> > 3555      1.7773  libstdc++.so.6.0.19      /usr/lib64/libstdc++.so.6.0.19
>> > 3206      1.6028  libc-2.18.so             __memmove_ssse3_back
>> > [...]
>>
>> With my implementation of your suggestion:
>>
>> CPU: Intel Sandy Bridge microarchitecture, speed 3.5e+06 MHz (estimated)
>> Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 90000
>> samples  %        image name               symbol name
>> 27595    35.6051  libc-2.18.so             __memcpy_sse2_unaligned
>> 12225    15.7736  no-vmlinux               /no-vmlinux
>> 4051      5.2269  libpthread-2.18.so       pthread_mutex_lock
>> 3739      4.8243  libgnuradio-runtime-3.7.7git.so.0.0.0 boost::detail::sp_counted_base::release()
>> 3362      4.3379  libpthread-2.18.so       pthread_mutex_unlock
>> 2876      3.7108  libgnuradio-runtime-3.7.7git.so.0.0.0 gr::block_executor::run_one_iteration()
>> 2364      3.0502  libgnuradio-runtime-3.7.7git.so.0.0.0 gr::block_detail::input(unsigned int)
>> 2091      2.6980  libstdc++.so.6.0.19      /usr/lib64/libstdc++.so.6.0.19
>> 1388      1.7909  libgnuradio-runtime-3.7.7git.so.0.0.0 gr::tpb_detail::notify_upstream(gr::block_detail*)
>> 1138      1.4683  libc-2.18.so             __memmove_ssse3_back
>> [...]
>> 2         0.0026  libvolk.so.0.0.0         __volk_32f_convert_64f_d
>> [...]
>> 1         0.0013  libvolk.so.0.0.0         volk_32f_convert_64f_a_avx
>>
>>
>> Regards,
>> Andy
>>
>
>
>Andy,
>
>Excellent!
>
>I've got a few other minor patches for some things, I'll put this in
>there too and test on my end as well.
>
>Tom
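
The change discussed in the thread (memcpy() the incoming floats on every
call, and only run the float-to-double conversion once a plot is certain, so
that the conversion also does the shift from d_start to 0) can be sketched
roughly as below. This is only an illustration under assumptions, not the
actual time_sink_f_impl code: the class and method names are hypothetical, a
plain loop stands in for volk_32f_convert_64f(), and triggering, locking, and
the trigger-delay tail are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical per-stream buffer pair: floats are staged cheaply, doubles
// are produced only when a plot is actually going to happen.
class deferred_convert_buffer
{
public:
    explicit deferred_convert_buffer(std::size_t size)
        : d_fbuffer(size, 0.0f), d_buffer(size, 0.0), d_index(0)
    {
    }

    // Called for every chunk of input: a plain copy, no conversion yet.
    // Returns the number of items actually stored (limited by free space).
    std::size_t store(const float* in, std::size_t nitems)
    {
        const std::size_t room = d_fbuffer.size() - d_index;
        const std::size_t n = (nitems < room) ? nitems : room;
        if (n > 0) {
            std::memcpy(d_fbuffer.data() + d_index, in, n * sizeof(float));
            d_index += n;
        }
        return n;
    }

    // Called only once the decision to plot has been made. Reading from
    // d_fbuffer[start..] and writing to d_buffer[0..] performs the
    // conversion and the "move to the front" in a single pass.
    // Returns the number of converted points.
    std::size_t convert_for_plot(std::size_t start)
    {
        if (start > d_index)
            return 0;
        const std::size_t n = d_index - start;
        for (std::size_t i = 0; i < n; i++) // stand-in for the VOLK kernel
            d_buffer[i] = static_cast<double>(d_fbuffer[start + i]);
        d_index = 0; // simplification: start over after each plot
        return n;
    }

    const std::vector<double>& plot_data() const { return d_buffer; }

private:
    std::vector<float> d_fbuffer; // staged input samples
    std::vector<double> d_buffer; // what the plotting code (Qwt) would read
    std::size_t d_index;          // number of staged floats
};
```

In this sketch, store() would be called from the work function for each
input stream, and convert_for_plot(d_start) would run only when the trigger
and buffer-fill conditions say a plot will be emitted, so chunks that are
dropped never pay for the conversion.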
_______________________________________________
Discuss-gnuradio mailing list
[email protected]
https://lists.gnu.org/mailman/listinfo/discuss-gnuradio
