Re: [Numpy-discussion] Calling C code that assumes SIMD aligned data.
On Thu, May 5, 2016 at 2:10 PM, Øystein Schønning-Johansen <oyste...@gmail.com> wrote:

> Thanks for your answer, Francesc. Knowing that there is no numpy solution
> saves the work of searching for this. I've not tried the solution described
> at SO, but it looks like a real performance killer. I'd rather try to
> override malloc with glibc's malloc hooks or LD_PRELOAD tricks. Do you think
> that will do it? I'll try it and report back.
>
> Thanks,
> -Øystein

You might take a look at how numpy handles this in `numpy/core/src/umath/simd.inc.src`.

Chuck
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion
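For readers who do not want to open that file: my understanding (an assumption, not something stated in this thread) is that the SIMD loops there do not require aligned input. They handle the leading elements with scalar code until the pointer reaches an aligned address, run the vector loop on the aligned middle, and finish the remainder with scalar code again. The sketch below only illustrates that splitting idea in Python; `split_for_simd` is a hypothetical name, and the real logic lives in C.

```python
import numpy as np


def split_for_simd(arr, alignment=16):
    """Split a 1-D contiguous array into (head, body, tail) views.

    `body` starts on an `alignment`-byte boundary and holds a whole number
    of SIMD vectors; `head` and `tail` are left for a scalar loop.
    Assumes the start address of `arr` is a multiple of its itemsize.
    """
    itemsize = arr.itemsize
    lanes = alignment // itemsize              # elements per SIMD vector
    misalign = -arr.ctypes.data % alignment    # bytes up to the next boundary
    head_len = min(misalign // itemsize, arr.size)
    body_len = (arr.size - head_len) // lanes * lanes
    return (arr[:head_len],
            arr[head_len:head_len + body_len],
            arr[head_len + body_len:])


base = np.arange(64, dtype=np.float32)
a = base[1:]  # start one element past the beginning of the allocation
head, body, tail = split_for_simd(a)
```

If the C activation functions can be changed, peeling like this removes the alignment requirement on the caller altogether.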
Re: [Numpy-discussion] Calling C code that assumes SIMD aligned data.
Thanks for your answer, Francesc. Knowing that there is no numpy solution saves the work of searching for this. I've not tried the solution described at SO, but it looks like a real performance killer. I'd rather try to override malloc with glibc's malloc hooks or LD_PRELOAD tricks. Do you think that will do it? I'll try it and report back.

Thanks,
-Øystein
Re: [Numpy-discussion] Calling C code that assumes SIMD aligned data.
2016-05-05 11:38 GMT+02:00 Øystein Schønning-Johansen:

> Hi!
>
> I've written a little piece of numpy code that does a neural network
> feedforward calculation:
>
>     def feedforward(self, x):
>         for activation, w, b in zip(self.activations, self.weights, self.biases):
>             x = activation(np.dot(w, x) + b)
>         return x
>
> This works fine when my activation functions are in Python. However, I've
> wrapped the activation functions from a C implementation that requires the
> array to be memory aligned (due to SIMD instructions in the C
> implementation). So I need the operation np.dot(w, x) + b to return an
> ndarray where the data pointer is aligned. How can I do that? Is it
> possible at all?

Yes. np.dot() does accept an `out` parameter where you can pass your aligned array. Testing whether numpy is returning you an aligned array is easy:

    In [15]: x = np.arange(6).reshape(2,3)

    In [16]: x.ctypes.data % 16
    Out[16]: 0

but:

    In [17]: x.ctypes.data % 32
    Out[17]: 16

So, in this case NumPy returned a 16-byte aligned array, which should be enough for 128-bit SIMD (SSE family). This kind of alignment is pretty common in modern computers. If you need 256-bit (32-byte) alignment, then you will need to build your container manually. See here for an example:

http://stackoverflow.com/questions/9895787/memory-alignment-for-fast-fft-in-python-using-shared-arrrays

Francesc

> (BTW: the function works correctly about 20% of the time I run it, and
> otherwise it segfaults on the SIMD instruction in the C function)
>
> Thanks,
> -Øystein

--
Francesc Alted
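A minimal sketch of the "build your container manually" route combined with the `out` parameter, in case it helps. The helper name `aligned_empty`, the 32-byte default and the use of `np.tanh` as a stand-in for the wrapped C activations are all assumptions for illustration. The idea is to over-allocate a byte buffer, slice it so the view starts on an aligned address, and write both the product and the bias addition into that view. Note that `np.dot(w, x, out=buf) + b` would not be enough on its own, because the `+ b` allocates a fresh (possibly unaligned) array; hence the in-place `np.add`.

```python
import numpy as np


def aligned_empty(shape, dtype=np.float64, alignment=32):
    """Hypothetical helper: uninitialised array aligned to `alignment` bytes."""
    dtype = np.dtype(dtype)
    nbytes = int(np.prod(shape)) * dtype.itemsize
    # Over-allocate by `alignment` bytes so an aligned start always fits.
    raw = np.empty(nbytes + alignment, dtype=np.uint8)
    offset = -raw.ctypes.data % alignment  # bytes up to the next boundary
    return raw[offset:offset + nbytes].view(dtype).reshape(shape)


def feedforward(x, weights, biases, activations, buffers):
    """Same loop as in the question, with results landing in aligned buffers."""
    for activation, w, b, buf in zip(activations, weights, biases, buffers):
        np.dot(w, x, out=buf)    # product written straight into aligned memory
        np.add(buf, b, out=buf)  # in-place add, so no new array is allocated
        x = activation(buf)
    return x


rng = np.random.default_rng(0)
weights = [rng.random((4, 3)), rng.random((2, 4))]
biases = [rng.random(4), rng.random(2)]
activations = [np.tanh, np.tanh]  # stand-ins for the wrapped C functions

# Allocate the aligned buffers once and reuse them on every call.
buffers = [aligned_empty(b.shape, b.dtype) for b in biases]
x0 = rng.random(3)
result = feedforward(x0, weights, biases, activations, buffers)
```

Because the buffers are allocated once and reused, the per-call cost is the same `dot` and `add` as before, which may ease the performance concern about the manual approach. This assumes weights, inputs and biases share one dtype, since `np.dot` requires `out` to have exactly the dtype and shape of the result.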
[Numpy-discussion] Calling C code that assumes SIMD aligned data.
Hi!

I've written a little piece of numpy code that does a neural network feedforward calculation:

    def feedforward(self, x):
        for activation, w, b in zip(self.activations, self.weights, self.biases):
            x = activation(np.dot(w, x) + b)
        return x

This works fine when my activation functions are in Python. However, I've wrapped the activation functions from a C implementation that requires the array to be memory aligned (due to SIMD instructions in the C implementation). So I need the operation np.dot(w, x) + b to return an ndarray where the data pointer is aligned. How can I do that? Is it possible at all?

(BTW: the function works correctly about 20% of the time I run it, and otherwise it segfaults on the SIMD instruction in the C function)

Thanks,
-Øystein
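The intermittent success is consistent with the result of `np.dot(w, x) + b` landing on a suitably aligned address only by chance, since each call allocates a new array. A rough way to check this on a given machine is to count how often a fresh result meets a given alignment. The function name, array size and trial count below are arbitrary choices for illustration, and the fractions will differ between platforms and allocators.

```python
import numpy as np


def aligned_fraction(n_trials=1000, size=100, alignment=32):
    """Fraction of fresh `np.dot(w, x) + b` results aligned to `alignment` bytes."""
    rng = np.random.default_rng(0)
    w = rng.random((size, size))
    x = rng.random(size)
    b = rng.random(size)
    keep = []  # hold references so freed addresses are not simply recycled
    hits = 0
    for _ in range(n_trials):
        y = np.dot(w, x) + b
        hits += (y.ctypes.data % alignment == 0)
        keep.append(y)
    return hits / n_trials


for alignment in (16, 32, 64):
    print(alignment, aligned_fraction(alignment=alignment))
```

A fraction well below 1.0 at the alignment the C code expects would match the behaviour described above.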