Charles R Harris wrote:
On Wed, Jan 8, 2014 at 2:39 PM, Julian Taylor jtaylor.deb...@googlemail.com
wrote:
...
Another function that could be useful is a |a|**2 function, abs2 perhaps.
Chuck
I use mag_sqr all the time. It should be much faster for complex if computed
via:
x.real**2 + x.imag**2
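As a minimal sketch of the idea (the name `abs2` here is hypothetical, not an existing NumPy function): for complex input, summing the squared real and imaginary parts avoids the square root inside `np.abs` followed by re-squaring.

```python
import numpy as np

def abs2(x):
    """Hypothetical |x|**2 without the intermediate square root."""
    if np.iscomplexobj(x):
        # For complex z, |z|**2 == z.real**2 + z.imag**2; no sqrt needed.
        return x.real**2 + x.imag**2
    return x**2

z = np.array([3 + 4j, 1 - 1j])
print(abs2(z))            # [25.  2.]
print(np.abs(z)**2)       # same values, but computed via sqrt then square
```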
On 08/01/14 21:39, Julian Taylor wrote:
An issue is software emulation of real fma. This can be enabled in the
test ufunc with npfma.set_type(libc).
This is unfortunately incredibly slow, about a factor of 300 on my machine
without hardware fma.
This means we either have a function that is fast
Hi,
It happens frequently that NumPy isn't compiled with all the instructions
available on the machine where it runs, for example in distro builds. So if
the decision is made to use the fast version only when we don't use the
newer instructions, the user needs a way to know that. So the library needs
a function/attribute to
On 8 January 2014 22:39, Julian Taylor jtaylor.deb...@googlemail.comwrote:
As you can see, even without real hardware support it is about 30% faster
than in-place unblocked numpy due to better use of memory bandwidth. It's
even more than two times faster than unoptimized numpy.
I have an i5, and
On Thu, Jan 9, 2014 at 3:54 PM, Daπid davidmen...@gmail.com wrote:
On 8 January 2014 22:39, Julian Taylor jtaylor.deb...@googlemail.comwrote:
As you can see, even without real hardware support it is about 30% faster
than in-place unblocked numpy due to better use of memory bandwidth. It's
even
On Thu, Jan 9, 2014 at 3:50 PM, Frédéric Bastien no...@nouiz.org wrote:
Hi,
It happens frequently that NumPy isn't compiled with all the instructions
available on the machine where it runs, for example in distro builds. So if
the decision is made to use the fast version only when we don't use the
newer instructions,
On Thu, Jan 9, 2014 at 3:30 PM, Julian Taylor
jtaylor.deb...@googlemail.com wrote:
On Thu, Jan 9, 2014 at 3:50 PM, Frédéric Bastien no...@nouiz.org wrote:
How hard would it be to provide the choice to the user? We could
provide 2 functions, like fma_fast() and fma_prec() (for precision)? Or
this
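To see why the precise variant is worth a separate function: a fused multiply-add rounds only once, after the full-precision product, while separate multiply-then-add rounds twice. The sketch below emulates the fused result with exact rational arithmetic (it does not assume any fma function exists in the Python stdlib in use here):

```python
from fractions import Fraction

a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30

# Exact product is 1 - 2**-60, which double precision rounds back to 1.0,
# so multiply-then-subtract loses the tiny term entirely:
two_step = a * b - 1.0
print(two_step)                        # 0.0

# A fused multiply-add keeps the full product until the single final
# rounding; emulate that with exact rationals:
fused = float(Fraction(a) * Fraction(b) - 1)
print(fused)                           # -2**-60, about -8.67e-19
```

This is the kind of cancellation where fma_prec() would give a meaningfully different (and correct) answer from a contracted-or-not fast path.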
Hi,
I have just sent a PR, adding a `return_counts` keyword argument to
`np.unique` that does exactly what the name suggests: counting the number
of times each unique item comes up in the array. It reuses the `flag` array
that is constructed whenever any optional index is requested, extracts the
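Assuming a NumPy build that includes the PR, usage would look like this (the keyword name is taken from the message above):

```python
import numpy as np

a = np.array([2, 1, 2, 3, 2, 1])

# return_counts=True adds a second array: the multiplicity of each
# unique value, aligned with the sorted unique values.
values, counts = np.unique(a, return_counts=True)
print(values)   # [1 2 3]
print(counts)   # [2 3 1]
```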
Apropos Julian's changes https://github.com/numpy/numpy/pull/4177 to use
the PyObject_* allocation suite for some parts of numpy, I posted the
following
I think numpy memory management is due a cleanup. Currently we have
PyDataMem_*
PyDimMem_*
PyArray_*
Plus the malloc, PyMem_*, and PyObject_*
This shouldn't affect Theano, so I have no objection.
Making things faster and more tractable is always good, so I think it
seems like a good idea.
Fred
On Thu, Jan 9, 2014 at 6:21 PM, Charles R Harris
charlesr.har...@gmail.com wrote:
Apropos Julian's changes to use the PyObject_* allocation suite
Good question: where do we stop?
I think, as you do, that fma with guarantees is a good new feature. But
if it is made available, people will want to use it for speed. Some
people won't want to use another library or dependency, and they won't
like having random speedups or slowdowns. So why not
On 10.01.2014 01:49, Frédéric Bastien wrote:
Do you know if those instructions are used automatically by gcc if we
pass the right architecture flags?
they are used if you enable -ffp-contract=fast. Do not set it to `on`;
due to the semantics of C, that is an alias for `off`.
-ffast-math
On Thu, Jan 9, 2014 at 11:21 PM, Charles R Harris
charlesr.har...@gmail.com wrote:
Apropos Julian's changes to use the PyObject_* allocation suite for some
parts of numpy, I posted the following
I think numpy memory management is due a cleanup. Currently we have
PyDataMem_*
PyDimMem_*