Re: Accumulating into output_items[0] across multiple general_work()

Wael Farah Fri, 05 Jun 2026 12:14:02 -0700

Hi Dani,

Thanks for your response, very informative!
I guess it's probably reasonable not to base a block implementation on an 
assumption of “it works because I looked at the source code”.
And you’re right, it’s true that exotic buffer implementations are free to hand 
off output buffers however they want to (which I think would need very specific 
reasons, but would still not be contract-breaking).


My block is actually not (just) an FFT, it’s a bit more complicated, I just 
mentioned that for the sake of giving a simple example use case, but your 
insight is valuable.

Thanks again,
Wael

> On Jun 4, 2026, at 11:43 PM, Daniel Estévez <[email protected]> wrote:
> 
> Hi Wael,
> 
> This is a great question. My understanding is basically the same as the 
> conclusion you've arrived at. In practice this works correctly for GR3 
> regular CPU buffers, because the buffer is a ring buffer and the write 
> pointer is not advanced if you return zero as the number of items that your 
> general_work() call has produced. So the next general_work() call sees the 
> same data that the previous call wrote in the output buffer. If I remember 
> correctly, this also works in the same way for GR4 regular CPU buffers.
> 
> However, I think this is going into undocumented assumptions of how the 
> buffer system should work. I don't think this is solidly written down 
> anywhere, but for me the contract that general_work() has with the output 
> buffer is that it should never read data from the output buffer, and it needs 
> to write exactly as many items at the beginning of the buffer as the number 
> of items that are being produced. Anything else might break under an exotic 
> custom buffer implementation. For instance, I could imagine an 
> implementation which hands off an output buffer taken from a common buffer 
> pool, with no guarantee that the buffer is the same in consecutive calls even 
> if no data was produced.
> 
> Since your use case is accumulating many FFT frames, I would say that storing 
> the accumulator as a member in the block class and copying the result to the 
> output buffer whenever the accumulation has finished is quite acceptable. The 
> output data rate is going to be quite low, so the memory copies add very 
> little overhead.
> 
> Another comment is that a very long integration can be realized by cascading 
> multiple shorter integrations, which can be realized with the in-tree 
> Integrate block. For instance you could integrate 1e9 FFTs by cascading three 
> Integrate blocks, each set to an integration of 1e3. When using floating point 
> numbers this approach is also better numerically, because otherwise you are 
> sequentially adding numbers to an accumulator that ends up being 
> approximately 1e9 times larger than the input numbers, which doesn't really 
> work because the float32 machine epsilon is about 1.2e-7.
> 
> Best,
> Dani.
> 
> On 05/06/2026 03:27, Wael Farah wrote:
>> Hi all,
>> I have a block whose output items are running averages over a long 
>> integration (for the sake of simplicity say a power spectrum accumulated 
>> over millions of FFT frames, way too many to hold as one input buffer).
>> The implementation that fell out naturally is: in general_work(), add the 
>> next batch of partial contributions directly into 
>> output_items[0][n_emitted], return 0 while the integration is still 
>> incomplete, and only return n_emitted > 0 when one or more integrations 
>> are done. It's pretty 
>> tempting to do as this would practically avoid an extra memory allocation 
>> for an internal buffer and a memcpy back to output_items when accumulation 
>> is ready.
>> However, this relies on the assumption that returning 0 leaves the write 
>> pointer unadvanced, so the next general_work() call sees the same memory at 
>> output_items[0][n_emitted] and I can keep adding into it.
>> As far as I can tell, this works on current GR (no-op when produce_each(0) 
>> <https://github.com/gnuradio/gnuradio/blob/main/gnuradio-runtime/lib/block_detail.cc#L123-L132> 
>> -> write pointer not updated), but it’s more of an implementation detail 
>> than something documented as part of the scheduler API.
>> Two questions:
>> 1) Is the "accumulate into output_items[0] across calls" pattern supported, 
>> or am I in undocumented/undefined scheduler/buffer behavior territory?
>> 2) If it's not supported, is there any reason beyond "no API guarantee", 
>> e.g. would it break under certain custom buffers or, in the future, GR4?
>> If the answer is just "use an internal buffer," happy to refactor.
>> Thanks!
>> Wael
>
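
For reference, the member-accumulator approach suggested above could be sketched roughly as follows. This is a standalone illustration, not an actual GNU Radio block: the class SpectrumAccumulator, its work() method and all parameter names are hypothetical, and work() only imitates the shape of general_work() (frames in, frames out, counts reported) so that it compiles without GNU Radio. The point it tries to show is that the output buffer is written only when an integration completes and is never read back, so it makes no difference whether consecutive calls are handed the same memory.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the accumulating part of a block.
// Not a GNU Radio class; every name here is illustrative.
class SpectrumAccumulator
{
public:
    SpectrumAccumulator(std::size_t fft_size, std::size_t n_integrate)
        : d_fft_size(fft_size), d_n_integrate(n_integrate), d_acc(fft_size, 0.0f)
    {
        assert(fft_size > 0 && n_integrate > 0);
    }

    // Imitates one general_work() call.
    //   in         : n_frames input frames of fft_size floats each
    //   out        : room for max_out output frames of fft_size floats each
    //   n_consumed : set to the number of input frames actually used
    // Returns the number of finished integrations written to `out`.
    std::size_t work(const float* in,
                     std::size_t n_frames,
                     float* out,
                     std::size_t max_out,
                     std::size_t& n_consumed)
    {
        std::size_t n_produced = 0;
        n_consumed = 0;
        // Stop when the input runs out or the output space is used up.
        while (n_consumed < n_frames && n_produced < max_out) {
            const float* frame = in + n_consumed * d_fft_size;
            for (std::size_t k = 0; k < d_fft_size; ++k) {
                d_acc[k] += frame[k]; // partial sums live in the member only
            }
            ++n_consumed;
            if (++d_count == d_n_integrate) {
                // Integration finished: this is the only write to `out`.
                std::copy(d_acc.begin(), d_acc.end(), out + n_produced * d_fft_size);
                std::fill(d_acc.begin(), d_acc.end(), 0.0f);
                d_count = 0;
                ++n_produced;
            }
        }
        return n_produced;
    }

private:
    std::size_t d_fft_size;
    std::size_t d_n_integrate;
    std::size_t d_count = 0;  // frames accumulated so far
    std::vector<float> d_acc; // state carried across calls
};
```

In a real block the two counts would presumably map onto consume_each() and the return value of general_work(); that mapping is assumed here rather than shown.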

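The numerical remark about cascading shorter integrations can also be illustrated with a small standalone sketch. It assumes IEEE 754 float32 arithmetic with round-to-nearest and no fast-math style reordering, and it uses a constant input of 1.0 purely to make the effect easy to see; the function names direct_sum and cascaded_sum are hypothetical and unrelated to the in-tree Integrate block. Because float32 has a 24-bit significand, a running sum of ones stops growing at 2^24, whereas summing in two stages keeps every partial sum exactly representable in this particular case.

```cpp
#include <cassert>
#include <cstddef>

// Add `n` copies of `value` one by one into a single float32 accumulator.
float direct_sum(std::size_t n, float value)
{
    float acc = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        acc += value;
    }
    return acc;
}

// Two-stage integration: each inner stage sums n_inner values, and the
// outer stage sums n_outer of those inner results.
float cascaded_sum(std::size_t n_inner, std::size_t n_outer, float value)
{
    assert(n_inner > 0 && n_outer > 0);
    float outer = 0.0f;
    for (std::size_t j = 0; j < n_outer; ++j) {
        outer += direct_sum(n_inner, value);
    }
    return outer;
}
```

Under the stated assumptions, direct_sum(1u << 25, 1.0f) stalls at 16777216 (2^24), while cascaded_sum(1u << 12, 1u << 13, 1.0f) reaches the full 33554432. With real spectra the direct sum would show gradual rounding error rather than a hard stall, but the mechanism is the same.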