Re: [VOLK] a += b*c ?

2022-08-18 Thread Marcus Müller
Do we have documentation that an `add` implementation must be able to 
work in-place? Otherwise, we should probably write that down :)


Also, on the API: C99-wise, I'm pretty sure this is a strict aliasing 
rule violation: pointers of different types mustn't point to the same 
data. The compiler is totally allowed to assume the first and second 
arguments to volk_32f_x2_add_32f point to *distinct* objects, and 
hence could optimize as if the (const) a never changes once the 
function has been entered. But that is exactly what happens in the 
in-place call volk_32f_x2_add_32f(a, a, ...). I don't see how this can 
go wrong for an operation like addition, but I honestly think the 
semantics of type_x2_operation_type should be that two inputs, neither 
of which is the output, are passed. If we want in-place kernels, we 
should probably have them separately.
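
For reference, C99 spells a no-overlap promise with `restrict`, not with 
`const`: a `const float *` and a `float *` are compatible types as far as 
the aliasing rules go. A minimal sketch of what separate out-of-place and 
in-place kernels could look like follows. Both function names are 
hypothetical and neither is an existing VOLK kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical out-of-place kernel. `restrict` is the caller's promise
 * that out, a and b do not overlap; passing the same buffer as `out`
 * and `a` would break that promise. */
void add_32f_distinct(float *restrict out, const float *restrict a,
                      const float *restrict b, size_t n)
{
    assert(out != a && out != b); /* partial check of the contract */
    for (size_t i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}

/* Hypothetical in-place kernel. The accumulator is one read-write
 * argument, so no input parameter has to alias the output. */
void add_32f_inplace(float *restrict acc, const float *restrict in, size_t n)
{
    assert(acc != in);
    for (size_t i = 0; i < n; ++i) {
        acc[i] += in[i];
    }
}
```

With this split, the in-place case is written add_32f_inplace(a, b, n) 
instead of passing `a` twice to the out-of-place kernel.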


Cheers,
Marcus


On 8/16/22 14:13, Johannes Demel wrote:

Hi Randall,

in your case,

https://github.com/gnuradio/volk/blob/main/kernels/volk/volk_32f_x2_multiply_32f.h

followed by
https://github.com/gnuradio/volk/blob/main/kernels/volk/volk_32f_x2_add_32f.h

would be the way to go at the moment.
```
volk_32f_x2_multiply_32f(multiply_result, b, c, num_samples);
volk_32f_x2_add_32f(a, a, multiply_result, num_samples);
```

You're welcome to start a new kernel
```
volk_32f_x3_multiply_add_32f(out, a, b, c, num_samples);
```
In fact, it would be a great addition to VOLK.

Cheers
Johannes


On 16.08.22 01:38, Randall Wayth wrote:
Thanks for the suggestions and apologies for not being 100% clear at 
the start.

I'm not looking for a dot product. I'm looking for
a[i] += b[i]*c[i]     specifically for floating point

So it would be the equivalent of IPP's ippsAddProduct_32f.
The application is to apply a window to a set of samples before 
accumulating, to implement a weighted overlap add PFB. In my case the 
samples are real-valued, but I could also see a case for a and b 
being complex, or the case for b being 8 or 16-bit ints with a and c 
being floating point.


Cheers,
Randall




Re: [VOLK] a += b*c ?

2022-08-16 Thread Johannes Demel

Hi Randall,

in your case,

https://github.com/gnuradio/volk/blob/main/kernels/volk/volk_32f_x2_multiply_32f.h

followed by
https://github.com/gnuradio/volk/blob/main/kernels/volk/volk_32f_x2_add_32f.h

would be the way to go at the moment.
```
volk_32f_x2_multiply_32f(multiply_result, b, c, num_samples);
volk_32f_x2_add_32f(a, a, multiply_result, num_samples);
```

You're welcome to start a new kernel
```
volk_32f_x3_multiply_add_32f(out, a, b, c, num_samples);
```
In fact, it would be a great addition to VOLK.
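
A minimal sketch of what a generic (non-SIMD) version of such a kernel 
might look like. The function name is hypothetical, and the semantics 
out[i] = a[i] + b[i] * c[i] are an assumption read off the argument order 
in the proposed signature.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical generic kernel, not part of VOLK:
 * out[i] = a[i] + b[i] * c[i] for i = 0 .. num_samples - 1. */
void multiply_add_32f_generic(float *out, const float *a, const float *b,
                              const float *c, unsigned int num_samples)
{
    assert(out != NULL && a != NULL && b != NULL && c != NULL);
    for (unsigned int i = 0; i < num_samples; ++i) {
        /* a[i] is read before out[i] is written, so out == a also works. */
        out[i] = a[i] + b[i] * c[i];
    }
}
```

Calling it as multiply_add_32f_generic(a, a, b, c, num_samples) gives the 
a += b*c behaviour in one pass, without the intermediate multiply_result 
buffer.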

Cheers
Johannes


On 16.08.22 01:38, Randall Wayth wrote:
Thanks for the suggestions and apologies for not being 100% clear at the 
start.

I'm not looking for a dot product. I'm looking for
a[i] += b[i]*c[i]     specifically for floating point

So it would be the equivalent of IPP's ippsAddProduct_32f.
The application is to apply a window to a set of samples before 
accumulating, to implement a weighted overlap add PFB. In my case the 
samples are real-valued, but I could also see a case for a and b being 
complex, or the case for b being 8 or 16-bit ints with a and c being 
floating point.


Cheers,
Randall




Re: [VOLK] a += b*c ?

2022-08-15 Thread Randall Wayth
Thanks for the suggestions and apologies for not being 100% clear at the
start.
I'm not looking for a dot product. I'm looking for
a[i] += b[i]*c[i] specifically for floating point

So it would be the equivalent of IPP's ippsAddProduct_32f.
The application is to apply a window to a set of samples before
accumulating, to implement a weighted overlap add PFB. In my case the
samples are real-valued, but I could also see a case for a and b being
complex, or the case for b being 8 or 16-bit ints with a and c being
floating point.
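
A minimal sketch of where the a += b*c step would sit in a weighted 
overlap-add PFB front end. The function name, the buffer layout (taps 
consecutive segments of `channels` samples each) and the clearing of the 
accumulator are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical WOLA front end: `samples` and `window` each hold
 * taps * channels floats. The windowed samples are folded into `acc`,
 * which holds `channels` floats and would feed the FFT afterwards. */
void wola_accumulate(float *acc, const float *samples, const float *window,
                     size_t taps, size_t channels)
{
    assert(acc != NULL && samples != NULL && window != NULL);
    for (size_t i = 0; i < channels; ++i) {
        acc[i] = 0.0f;
    }
    for (size_t t = 0; t < taps; ++t) {
        const float *b = samples + t * channels; /* t-th input segment */
        const float *c = window + t * channels;  /* matching window slice */
        for (size_t i = 0; i < channels; ++i) {
            acc[i] += b[i] * c[i]; /* the a += b*c step */
        }
    }
}
```

The inner loop is the part that a vectorised AddProduct-style kernel 
would replace.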

Cheers,
Randall


Re: [VOLK] a += b*c ?

2022-08-15 Thread Johannes Demel

Hi,

the kernels are type-specific. If you want a dot product,

a = \sum_{i=0}^{N-1} b[i]c[i],

use
https://github.com/gnuradio/volk/blob/main/kernels/volk/volk_32fc_x2_dot_prod_32fc.h

If you want the element-wise version,

A [i] += B [i] * C [i], for i = 0...N-1

these two kernels, called one after the other, would implement it:
1. https://github.com/gnuradio/volk/blob/main/kernels/volk/volk_32fc_x2_multiply_32fc.h
2. https://github.com/gnuradio/volk/blob/main/kernels/volk/volk_32fc_x2_add_32fc.h

It would be interesting to see how much faster an integrated kernel 
would be.


The IPP `AddProduct` function:
https://www.intel.com/content/www/us/en/develop/documentation/ipp-dev-reference/top/volume-1-signal-and-data-processing/essential-functions/arithmetic-functions/addproduct.html

I assume IPP aims for:
https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#ig_expand=3202,3202,3202,3206,3206,3206=_mm256_fmadd_ps
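
A minimal sketch of an integrated kernel built on that intrinsic. The 
function name is hypothetical, and the code says nothing about how IPP is 
implemented internally. The SIMD path is only compiled when the compiler 
targets AVX and FMA (for example with -mavx -mfma); otherwise the scalar 
loop does all the work.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__AVX__) && defined(__FMA__)
#include <immintrin.h>
#endif

/* Hypothetical kernel: a[i] += b[i] * c[i] for i = 0 .. n - 1. */
void add_product_32f(float *a, const float *b, const float *c, size_t n)
{
    assert(a != NULL && b != NULL && c != NULL);
    size_t i = 0;
#if defined(__AVX__) && defined(__FMA__)
    for (; i + 8 <= n; i += 8) {
        __m256 va = _mm256_loadu_ps(a + i); /* unaligned loads */
        __m256 vb = _mm256_loadu_ps(b + i);
        __m256 vc = _mm256_loadu_ps(c + i);
        /* _mm256_fmadd_ps(x, y, z) computes x * y + z per lane. */
        va = _mm256_fmadd_ps(vb, vc, va);
        _mm256_storeu_ps(a + i, va);
    }
#endif
    for (; i < n; ++i) { /* tail, or the whole range without FMA */
        a[i] += b[i] * c[i];
    }
}
```

The fused instruction rounds once per element, so results can differ in 
the last bit from a separate multiply followed by an add.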

Which version are you looking for?

Cheers
Johannes




Re: [VOLK] a += b*c ?

2022-08-15 Thread Fons Adriaensen
On Mon, Aug 15, 2022 at 12:03:25PM +0200, Marcus Müller wrote:
> Just to be sure you mean, for N-long b and c,
> 
> a = \sum_{i=0}^{N-1} b[i]c[i],

I suspect the OP meant

A [i] += B [i] * C [i], for i = 0...N-1
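
The two readings side by side, as a minimal scalar sketch. Both function 
names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Reading 1: dot product. N inputs reduce to a single scalar. */
float dot_product_32f(const float *b, const float *c, size_t n)
{
    assert(b != NULL && c != NULL);
    float a = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        a += b[i] * c[i];
    }
    return a;
}

/* Reading 2: element-wise multiply-accumulate. N inputs update N outputs. */
void multiply_accumulate_32f(float *A, const float *B, const float *C,
                             size_t n)
{
    assert(A != NULL && B != NULL && C != NULL);
    for (size_t i = 0; i < n; ++i) {
        A[i] += B[i] * C[i];
    }
}
```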

-- 
FA




Re: [VOLK] a += b*c ?

2022-08-15 Thread Marcus Müller

Just to be sure you mean, for N-long b and c,

a = \sum_{i=0}^{N-1} b[i]c[i],

right?

That's the dot product, and it exists for a couple of input/output 
types; the kernels you're looking for are all called dot_prod.



Cheers,

Marcus

On 8/15/22 04:41, Randall Wayth wrote:

Hi Folks,

Hopefully I am just missing this, but is there a kernel that does 
vectorised a += b*c ?


Something like the IPP "AddProduct" function?

Cheers,
Randall.




[VOLK] a += b*c ?

2022-08-14 Thread Randall Wayth
Hi Folks,

Hopefully I am just missing this, but is there a kernel that does
vectorised a += b*c ?

Something like the IPP "AddProduct" function?

Cheers,
Randall.