Re: [Discuss-gnuradio] Try to improve E100's performance at high sample rate

ziyang Tue, 17 Jan 2012 11:16:44 -0800

On 01/17/2012 07:36 PM, Josh Blum wrote:


On 01/16/2012 09:51 AM, ziyang wrote:

On 01/13/2012 09:30 PM, Josh Blum wrote:

To reduce the computation load of the processor, I tried two methods:
1) modify the gr.quadrature_demod_cf block, replace some multiplication
operations with volk-based operations (gr.multiply and gr.multiply_const
modules in gr_blocks);

I like it. Make sure to contribute patches like that back. :-)

Actually, what I did was write a new quadrature_demod block without
the multiplication and delay operations, and connect extra gr.multiply
and gr.delay blocks in the flow graph instead. My understanding is that
the volk functions take a vector (multiple values) as input, and I
couldn't figure out a way to do a single-item operation in the volk
style.

I don't recommend using the extra blocks; that would probably cause more
overhead. Looking at gr_quadrature_demod_cf::work, it looks like you can
vectorize the operation of the conjugate multiply, then the atan, then
the gain scaler. So, that would be one for loop that operates on 4
samples at a time, and calls 3 volk functions.
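As a rough illustration of the structure described above (editorial sketch,
not code from the thread): the loop below walks the input in small chunks and
runs three whole-chunk stages per iteration, conjugate multiply, then phase
angle, then gain. It is plain Python, so it shows only the shape of the loop
and gives no speedup. In a real block each stage would presumably be a volk
call on the chunk. The function name and the chunk parameter are hypothetical.

```python
import cmath

def quadrature_demod_chunked(samples, gain, chunk=4):
    """Compute gain * arg(x[i] * conj(x[i-1])) for i >= 1, chunk by chunk."""
    out = []
    # Start at index 1: each output needs the previous input sample.
    for start in range(1, len(samples), chunk):
        stop = min(start + chunk, len(samples))
        # Stage 1: conjugate multiply over the whole chunk.
        prod = [samples[i] * samples[i - 1].conjugate()
                for i in range(start, stop)]
        # Stage 2: phase angle (atan2 of imag/real) over the whole chunk.
        phase = [cmath.phase(p) for p in prod]
        # Stage 3: gain scaling over the whole chunk.
        out.extend(gain * ph for ph in phase)
    return out
```

The chunk size changes only how the work is grouped, not the result, so the
same input can be run with chunk=4 and chunk=100 to check that both variants
agree before comparing their speed.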

Josh, thank you for your advice! Before I tried using gr.multiply out of
the block, I actually implemented a demodulation block in a way that's
similar to your suggestion, but the loop operated on 100 samples at a
time. I don't know if it was the 100-samples-vectorization that caused a
bad performance. I will try processing 4 samples at a time.

Also, you may consider timing a particular operation as a performance
metric, rather than counting the number of demodulated packets.

I was wondering if there are examples from which I can learn how to do
this?

Sorry, I guess there isn't much in the way of examples.

You can time individual work functions by adding some code before and
after. We have some high resolution timers in
gruel/include/gruel/high_res_timers.h

So I call the timer functions of high_res_timers.h before and after the
operation in the work function, is that right?

I have also seen people time the block in a simple flow graph with a
null source, head, your_block, null_sink. You can time tb.run() and
compare run duration vs the non-vectorized code.
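The run-duration comparison above could be sketched roughly as follows
(editorial sketch, not code from the thread). The head block passes a fixed
number of items and then stops the graph, so run() returns by itself and the
item count is known. The helper names time_call and throughput are
hypothetical, and the helper times any callable, so it is shown here without
GNU Radio. With a flow graph one would presumably pass the top block's run
method.

```python
import time

def time_call(fn):
    """Return the wall-clock duration of one call to fn(), in seconds."""
    t0 = time.perf_counter()
    fn()  # e.g. tb.run for a null source -> head -> block -> null sink graph
    return time.perf_counter() - t0

def throughput(n_items, seconds):
    """Items processed per second, given the head block's item count."""
    return n_items / seconds
```

Timing the same graph once with the original block and once with the
vectorized one, with the same head count, gives two durations that can be
compared directly. Since no hardware source is in the graph, device setup
time does not enter the measurement.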

-Josh


I have two questions about this:

1) Is the "head" block for generating data for the processing block?

2) The UHD initialization only happens after tb.run() is called, so how
could I isolate the processing time from the total time between tb.run()
and tb.stop()?


Thanks.


Best Regards,

Terry

_______________________________________________
Discuss-gnuradio mailing list
[email protected]
https://lists.gnu.org/mailman/listinfo/discuss-gnuradio


