Re: [Numpy-discussion] Choosing between NumPy and SciPy functions

2014-10-28 Thread David Cournapeau
On Tue, Oct 28, 2014 at 5:24 AM, Sturla Molden sturla.mol...@gmail.com
wrote:

 Matthew Brett matthew.br...@gmail.com wrote:

  Is this an option for us?  Aren't we a little behind the performance
  curve on FFT after we lost FFTW?

 It does not run on Windows because it uses POSIX APIs to allocate executable
 memory for tasklets, as I understand it.

 By the way, why did we lose FFTW, apart from the GPL? One thing to mention
 here is that MKL supports the FFTW API. If we can use MKL for linalg and
 numpy.dot, I don't see why we cannot use it for FFT.


The problem is APIs: MKL, Accelerate, etc. all implement a standard API for
linear algebra (BLAS/LAPACK), but there is no equivalent standard for FFT, so
you need to reimplement pretty much the whole wrapper for each library.
Unsurprisingly, this meant the code was not well maintained.

In general, wrapping non-standard, non-BSD libraries makes much more sense in
separate packages.

David



 On Mac there is also vDSP in the Accelerate framework, which has an insanely
 fast FFT (also claimed to be faster than FFTW). Since it is a system
 library there should be no license problems.

 There are clearly options if someone wants to work on it and maintain it.

 Sturla


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] FFTS for numpy's FFTs (was: Re: Choosing between NumPy and SciPy functions)

2014-10-28 Thread Jerome Kieffer
On Tue, 28 Oct 2014 04:28:37 +
Nathaniel Smith n...@pobox.com wrote:

 It's definitely attractive. Some potential issues that might need dealing
 with, based on a quick skim:

In my tests, numpy's FFTPACK isn't that bad, considering:
* (virtually) no extra installation overhead
* (virtually) no plan-creation time
* not that much slower per transform

Because plan creation took ages with FFTW, numpy's FFTPACK was often
faster overall.

Cheers,
-- 
Jérôme Kieffer
tel +33 476 882 445


Re: [Numpy-discussion] numpy.i and std::complex

2014-10-28 Thread Robert Kern
On Mon, Oct 27, 2014 at 11:36 PM, Sturla Molden sturla.mol...@gmail.com wrote:
 Robert Kern robert.k...@gmail.com wrote:

 Please stop haranguing the new guy for not knowing things that you
 know.

 I am not doing any of that. You are the only one haranguing here.

I understand that it's not your *intention*, so please take this as a
well-meant caution from an outside observer that it *is* the effect of
your words on other people, and if you intend something else, you may
want to consider your words more carefully. The polite, welcoming
response to someone coming along with a straightforward,
obviously-correct contribution to our SWIG capabilities is "Thank
you!", not "perhaps you overestimate the number of NumPy users who use
Swig". You are entitled to your opinions on the relative merits of
Cython and SWIG and to argue for them, but not every thread mentioning
SWIG is an appropriate forum for hashing out that argument.

-- 
Robert Kern


Re: [Numpy-discussion] FFTS for numpy's FFTs (was: Re: Choosing between NumPy and SciPy functions)

2014-10-28 Thread Charles R Harris
On Tue, Oct 28, 2014 at 1:32 AM, Jerome Kieffer jerome.kief...@esrf.fr
wrote:

 On Tue, 28 Oct 2014 04:28:37 +
 Nathaniel Smith n...@pobox.com wrote:

  It's definitely attractive. Some potential issues that might need dealing
  with, based on a quick skim:

 In my tests, numpy's FFTPACK isn't that bad, considering:
 * (virtually) no extra installation overhead
 * (virtually) no plan-creation time
 * not that much slower per transform

 Because plan creation took ages with FFTW, numpy's FFTPACK was
 often faster overall.

 Cheers,


Ondrej says that f90 fftpack (his mod) runs faster than FFTW. The main
thing missing from fftpack is the handling of transform sizes that are not
products of 2, 3, 4, and 5.

Chuck


Re: [Numpy-discussion] FFTS for numpy's FFTs (was: Re: Choosing between NumPy and SciPy functions)

2014-10-28 Thread David Cournapeau
On Tue, Oct 28, 2014 at 9:19 AM, Charles R Harris charlesr.har...@gmail.com
 wrote:



 On Tue, Oct 28, 2014 at 1:32 AM, Jerome Kieffer jerome.kief...@esrf.fr
 wrote:

 [...]

 Ondrej says that f90 fftpack (his mod) runs faster than fftw.


I would be interested to see the benchmarks for this.

The real issue with FFTW (besides the license) is the need for plan
computation, which is expensive (though not needed for each transform).
Handling this in a way that is user-friendly yet tweakable for advanced
users is not easy, and IMO more appropriate for a separate package.

 The main thing missing from fftpack is the handling of transform sizes
 that are not products of 2,3,4,5.


Strictly speaking, it is handled, just not through an FFT (it falls back to
the brute-force O(N**2) DFT).

I made some experiments with the Bluestein transform to handle prime
transforms on fftpack, but the precision seemed to be an issue. Maybe I
should revive this work (if I still have it somewhere).
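
For readers unfamiliar with it, Bluestein's trick rewrites an arbitrary-length
DFT as a convolution with a "chirp" sequence, which can then be evaluated with
power-of-two FFTs. A minimal NumPy sketch of the idea (illustrative only, not
the experiment mentioned above):

```python
import numpy as np

def bluestein_fft(x):
    # Bluestein's algorithm: a length-n DFT expressed as a linear
    # convolution with a chirp sequence, computed via zero-padded
    # power-of-two FFTs.  Illustrative sketch only.
    x = np.asarray(x, dtype=complex)
    n = x.size
    k = np.arange(n)
    w = np.exp(-1j * np.pi * k * k / n)        # chirp: exp(-i*pi*k^2/n)
    m = 1 << (2 * n - 2).bit_length()          # power of two >= 2n - 1
    a = np.zeros(m, dtype=complex)
    a[:n] = x * w
    b = np.zeros(m, dtype=complex)
    b[:n] = np.conj(w)
    b[m - n + 1:] = np.conj(w[1:])[::-1]       # wrap-around part of the chirp
    conv = np.fft.ifft(np.fft.fft(a) * np.fft.fft(b))
    return w * conv[:n]

# Agrees with the direct FFT, including for prime sizes such as 17:
x = np.random.rand(17)
assert np.allclose(bluestein_fft(x), np.fft.fft(x))
```

In double precision the error is typically far below `allclose` tolerances,
so the precision concern above presumably depends on the exact formulation
used.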

David


Re: [Numpy-discussion] FFTS for numpy's FFTs (was: Re: Choosing between NumPy and SciPy functions)

2014-10-28 Thread Henry Gomersall
On 28/10/14 09:41, David Cournapeau wrote:
 The real issue with fftw (besides the license) is the need for plan
 computation, which is expensive (but not needed for each
 transform). Handling this in a way that is user-friendly while
 tweakable for advanced users is not easy, and IMO more appropriate for
 a separate package.

Just on this, I like to think I've largely solved the issue with:
https://github.com/hgomersall/pyFFTW

If you have suggestions on how it can be improved, I'm all ears (there
are a few things in the pipeline, like making the creation of FFTW
objects for different types of transform more explicit, which is likely
to be the main difference for the next major version).

Cheers,

Henry


Re: [Numpy-discussion] FFTS for numpy's FFTs (was: Re: Choosing between NumPy and SciPy functions)

2014-10-28 Thread Henry Gomersall
On 28/10/14 04:28, Nathaniel Smith wrote:

 - not sure if it can handle non-power-of-two problems at all, or at 
 all efficiently. (FFTPACK isn't great here either but major 
 regressions would be bad.)


From my reading, this seems to be the biggest issue with FFTS, and where
FFTW really wins.

Having a faster algorithm used when it will work, with fallback to 
fftpack (or something else) is a good solution IMO.

Henry


Re: [Numpy-discussion] Choosing between NumPy and SciPy functions

2014-10-28 Thread Stefan van der Walt
Hi Michael

On 2014-10-27 15:26:58, D. Michael McFarland dm...@dmmcf.net wrote:
 What I would like to ask about is the situation this illustrates, where
 both NumPy and SciPy provide similar functionality (sometimes identical,
 to judge by the documentation).  Is there some guidance on which is to
 be preferred?  I could argue that using only NumPy when possible avoids
 unnecessary dependence on SciPy in some code, or that using SciPy
 consistently makes for a single interface and so is less error prone.
 Is there a rule of thumb for cases where SciPy names shadow NumPy names?

I'm not sure if you've received an answer to your question so far. My
advice: use the SciPy functions.  SciPy is often built on more extensive
Fortran libraries not available during NumPy compilation, and I am not
aware of any cases where a function in NumPy is faster or more extensive
than the equivalent in SciPy.

If you want code that falls back gracefully when SciPy is not available,
you may use the ``numpy.dual`` library.
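
For illustration, the same graceful fallback can also be written by hand; this
is a sketch of the pattern ``numpy.dual`` provides (SciPy is used only if it
is importable):

```python
import numpy as np

# Prefer SciPy's Fortran-backed implementation when available,
# otherwise fall back to NumPy's own FFT.
try:
    from scipy.fftpack import fft, ifft
except ImportError:
    from numpy.fft import fft, ifft

x = np.random.rand(128)
assert np.allclose(ifft(fft(x)), x)   # round trip works either way
```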

Regards
Stéfan


Re: [Numpy-discussion] numpy.i and std::complex

2014-10-28 Thread Sturla Molden
Robert Kern robert.k...@gmail.com wrote:
 The polite, welcoming
 response to someone coming along with a straightforward,
 obviously-correct contribution to our SWIG capabilities is "Thank
 you!", not "perhaps you overestimate the number of NumPy users who use
 Swig".

That was a response to something else. As to why this issue with NumPy and
Swig has not been solved before, the OP suggested he might have
overestimated the number of NumPy users who also use std::complex in C++.
Hence my answer: he did not (it is arguably not that uncommon), but maybe
they don't use Swig.



Re: [Numpy-discussion] FFTS for numpy's FFTs (was: Re: Choosing between NumPy and SciPy functions)

2014-10-28 Thread Sturla Molden
Jerome Kieffer jerome.kief...@esrf.fr wrote:

 Because the plan creation was taking ages with FFTW, numpy's FFTPACK was
 often faster (overall)

Matlab switched from FFTPACK to FFTW because the latter was faster in
general. If FFTW is asked to guess a plan (FFTW_ESTIMATE), planning does
not take very long. Actual measurements (FFTW_MEASURE) can be slow, but
they are not required.

Sturla



Re: [Numpy-discussion] FFTS for numpy's FFTs (was: Re: Choosing between NumPy and SciPy functions)

2014-10-28 Thread Sturla Molden
David Cournapeau courn...@gmail.com wrote:

 The real issue with fftw (besides the license) is the need for plan
 computation, which is expensive (but not needed for each transform).

This is not a problem if you tell FFTW to guess a plan instead of making
measurements. FFTPACK needs to set up a look-up table too.

 I made some experiments with the Bluestein transform to handle prime
 transforms on fftpack, but the precision seemed to be an issue. Maybe I
 should revive this work (if I still have it somewhere).

You have it in a branch on Github.


Sturla



Re: [Numpy-discussion] Choosing between NumPy and SciPy functions

2014-10-28 Thread Pierre Barbier de Reuille
I would add one element to the discussion: for some (odd) reason, SciPy is
lacking the functions `rfftn` and `irfftn`, which use half the memory
compared to their complex equivalents `fftn` and `ifftn`. However, I
haven't (yet) seriously tested `scipy.fftpack.fftn` vs. `np.fft.rfftn` to
check whether there is a serious performance gain (besides memory usage).
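
To make the memory point concrete: for real input, ``rfftn`` stores only the
non-redundant half of the spectrum along the last axis. A quick check (sizes
chosen arbitrarily):

```python
import numpy as np

x = np.random.rand(64, 64)     # real input
full = np.fft.fftn(x)          # shape (64, 64), complex
half = np.fft.rfftn(x)         # shape (64, 33): last axis keeps n//2 + 1 bins

assert half.shape == (64, 33)
# The stored half matches the corresponding slice of the full transform;
# the remaining bins are determined by Hermitian symmetry.
assert np.allclose(full[:, :33], half)
```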

Cheers,

Pierre

On Tue Oct 28 2014 at 10:54:00 Stefan van der Walt ste...@sun.ac.za wrote:

 [...]



Re: [Numpy-discussion] Choosing between NumPy and SciPy functions

2014-10-28 Thread Sturla Molden
Pierre Barbier de Reuille pie...@barbierdereuille.net wrote:

 I would add one element to the discussion: for some (odd) reasons, SciPy is
 lacking the functions `rfftn` and `irfftn`, functions using half the memory
 space compared to their non-real equivalent `fftn` and `ifftn`. 

In both NumPy and SciPy the N-dimensional FFTs are implemented in Python.
It is just a Python loop over all the axes, calling fft or rfft on each
axis.
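
The separability this relies on is easy to demonstrate; the sketch below
mirrors that pure-Python structure (``fftn_via_loop`` is a hypothetical name,
not a NumPy function):

```python
import numpy as np

def fftn_via_loop(x):
    # An N-D FFT is separable: apply a 1-D FFT along each axis in turn.
    # This mirrors how numpy/scipy implement fftn as a Python loop.
    out = np.asarray(x, dtype=complex)
    for axis in range(out.ndim):
        out = np.fft.fft(out, axis=axis)
    return out

x = np.random.rand(8, 9, 10)
assert np.allclose(fftn_via_loop(x), np.fft.fftn(x))
```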

 However, I
 haven't (yet) seriously tested `scipy.fftpack.fftn` vs. `np.fft.rfftn` to
 check if there is a serious performance gain (beside memory usage).

Real-valued FFT is implemented via complex-valued FFT. You save half the
memory, but not quite half the computation. Apart from that, the FFT in
SciPy is written in Fortran and the FFT in NumPy is written in C, but they
are algorithmically similar. I don't see any good reason why the Fortran
code in SciPy should be faster than the C code in NumPy. It used to be the
case that Fortran was faster than C, everything else being equal, but
with modern C compilers and CPUs with deep pipelines and branch prediction
this is rarely the case. So I would expect the NumPy rfftn to be slightly
faster than SciPy fftn, but keep in mind that both have a huge Python
overhead.

Sturla



Re: [Numpy-discussion] multi-dimensional c++ proposal

2014-10-28 Thread Sturla Molden
Neal Becker ndbeck...@gmail.com wrote:

 That's harsh!  Do you have any specific features you dislike?  Are you 
 objecting 
 to the syntax?

I have programmed C++ for almost 15 years, but I cannot look at the
proposed code and get a mental image of what it does. It is not a specific
feature, but how the code looks in general. This is e.g. not a problem with
Eigen or Blitz: if you know C++, they are not particularly hard to read. Not
as nice as Fortran or Cython, but still not too bad. Boost multiarray
suffers from not being particularly readable either, but this proposal is
even worse. I expect that scientists and engineers will not use an
unreadable array API. When we write or maintain numerical algorithms we
need to get a mental image of the code, because we actually spend most of
the time reading it.

I agree that C++ needs multidimensional arrays in the STL, but this
proposal will do more harm than good. In particular, it will prevent
adoption of a usable array API, and as a consequence it will fuel the
problem it is trying to solve: C++ programmers will keep using homebrewed
multiarray classes, because there is no obvious replacement in the standard
library.

Sturla



Re: [Numpy-discussion] FFTS for numpy's FFTs (was: Re: Choosing between NumPy and SciPy functions)

2014-10-28 Thread Nathaniel Smith
On 28 Oct 2014 07:32, Jerome Kieffer jerome.kief...@esrf.fr wrote:

 On Tue, 28 Oct 2014 04:28:37 +
 Nathaniel Smith n...@pobox.com wrote:

  It's definitely attractive. Some potential issues that might need
dealing
  with, based on a quick skim:

 In my tests, numpy's FFTPACK isn't that bad, considering:
 * (virtually) no extra installation overhead
 * (virtually) no plan-creation time
 * not that much slower per transform

Well, this is what makes FFTS intriguing :-). It's BSD licensed, so we
could distribute it by default like we do fftpack, it uses cache-oblivious
algorithms so it has no planning step, and even without planning it
benchmarks as faster than FFTW's most expensive planning mode (in the cases
that FFTS supports, i.e. power-of-two transforms).

The paper has lots of benchmark graphs, including measurements of setup
time:
  http://anthonix.com/ffts/preprints/tsp2013.pdf

-n


Re: [Numpy-discussion] FFTS for numpy's FFTs (was: Re: Choosing between NumPy and SciPy functions)

2014-10-28 Thread Eelco Hoogendoorn
If I may 'hijack' the discussion back to the meta-point:

should we be having this discussion on the numpy mailing list at all?

Perhaps the 'batteries included' philosophy made sense in the early days of
numpy; but given that there are several fft libraries with their own pros
and cons, and that most numpy projects will use none of them at all, why
should numpy bundle any of them?

To have a scipy.linalg and scipy.fft makes sense to me, although import
pyfftw or import pyFFTPACK would arguably be better still. Just as in the
case of linear algebra, those different libraries represent meaningful
differences, and if the user wants to paper over those differences with a
named import they are always free to do so themselves, explicitly. To be
sure, the maintenance of quality fft libraries should be part of the
numpy/scipy-stack in some way or another. But I would argue that the core
thing that numpy should do is ndarrays alone.

On Tue, Oct 28, 2014 at 11:11 AM, Sturla Molden sturla.mol...@gmail.com
wrote:

 [...]



Re: [Numpy-discussion] FFTS for numpy's FFTs (was: Re: Choosing between NumPy and SciPy functions)

2014-10-28 Thread David Cournapeau
On Tue, Oct 28, 2014 at 2:31 PM, Nathaniel Smith n...@pobox.com wrote:

 [...]


Nice. In this case, the solution may be to implement the Bluestein
transform to deal with prime/near-prime sizes on top of FFTS.

I did not look much, but it does not obviously support building on Windows
either?

David




Re: [Numpy-discussion] FFTS for numpy's FFTs (was: Re: Choosing between NumPy and SciPy functions)

2014-10-28 Thread Nathaniel Smith
On 28 Oct 2014 14:48, Eelco Hoogendoorn hoogendoorn.ee...@gmail.com
wrote:

 If I may 'hijack' the discussion back to the meta-point:

 should we be having this discussion on the numpy mailing list at all?

Of course we should.

 Perhaps the 'batteries included' philosophy made sense in the early days
 of numpy; but given that there are several fft libraries with their own
 pros and cons, and that most numpy projects will use none of them at all,
 why should numpy bundle any of them?

Certainly there's a place for fancy 3rd-party fft libraries. But fft is
such a basic algorithm that it'd be silly to ask people who just need a
quick one-off fft to go evaluate a bunch of third-party libraries. For many
users, downloading one of these libraries will take longer than just doing
their Fourier transform with an O(N**2) algorithm :-). And besides that
there's tons of existing code that uses np.fft. So np.fft will continue to
exist, and given that it exists we should make it as good as we can.

 To have a scipy.linalg and scipy.fft makes sense to me, although import
 pyfftw or import pyFFTPACK would arguably be better still. Just as in the
 case of linear algebra, those different libraries represent meaningful
 differences, and if the user wants to paper over those differences with a
 named import they are always free to do so themselves, explicitly. To be
 sure, the maintenance of quality fft libraries should be part of the
 numpy/scipy-stack in some way or another. But I would argue that the core
 thing that numpy should do is ndarrays alone.

According to some sort of abstract project planning aesthetics, perhaps.
But I don't see how fractionating numpy into lots of projects would provide
any benefit for users. (If we split numpy into 10 subprojects then probably
7 of them would never release, because we barely have the engineering to do
release management now.)

CS courses often teach that more modular = more better. That's because
they're desperate to stop newbies from creating balls of mush, though, not
because it's the whole truth :-). It's always true that an organized
codebase is better than a ball of mush, but abstraction barriers,
decoupling, etc. have real and important costs, and this needs to be taken
into account. (See e.g. the Torvalds/Tanenbaum debate.)

And in any case, this ship sailed a long time ago.

-n


Re: [Numpy-discussion] FFTS for numpy's FFTs (was: Re: Choosing between NumPy and SciPy functions)

2014-10-28 Thread David Cournapeau
On Tue, Oct 28, 2014 at 3:06 PM, David Cournapeau courn...@gmail.com
wrote:

 [...]


OK, I took a quick look at it, and it will take a significant effort to
make FFTS work at all with MSVC on Windows:

- the code is not C89-compatible
- it uses code generation built on POSIX APIs; one would need to port that
part to the Win32 API as well
- the test suite looks really limited (round-tripping only)

The codebase does not seem particularly well written either (but neither is
FFTPACK to be fair).

Nothing impossible (looks like Sony at least uses this code on windows:
https://github.com/anthonix/ffts/issues/27#issuecomment-40204403), but not
a 2 hours thing either.

David


Re: [Numpy-discussion] Why ndarray provides four ways to flatten?

2014-10-28 Thread Alexander Belopolsky
On Mon, Oct 27, 2014 at 9:41 PM, Yuxiang Wang yw...@virginia.edu wrote:

 In my opinion - because they don't do the same thing, especially when
 you think in terms in lower-level.

 ndarray.flat returns an iterator; ndarray.flatten() returns a copy;
 ndarray.ravel() only makes copies when necessary; ndarray.reshape() is
 more general purpose, even though you can use it to flatten arrays.


Out of the four ways, I find x.flat the most confusing.  Unfortunately, it
is also the most obvious name for the operation (and ravel is the least,
but that is the fault of the English language, where "to ravel" means "to
unravel").  What x.flat returns is not really an iterator.  It is some
hybrid between a view and an iterator.  Consider this:

>>> x = numpy.arange(6).reshape((2, 3))
>>> i = x.flat
>>> i.next()
0
>>> i.next()
1
>>> i.next()
2

So far no surprises, but what should i[0] return now?  If you think of i as
a C pointer you would expect 3, but

>>> i[0]
0

What is worse, the above resets the index, and now

>>> i.index
0

OK, so now I expect that i[5] will reset the index to 5, but no

>>> i[5]
5
>>> i.index
0

When would you prefer to use x.flat over x.ravel()?

Is x.reshape(-1) always equivalent to x.ravel()?
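
(A quick experiment suggests they behave the same: both give the values in
C order, as a view when the memory layout allows it and as a copy otherwise:)

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
t = a.T                                   # non-contiguous view of a

# Same values, same (C) order:
assert (a.ravel() == a.reshape(-1)).all()
assert (t.ravel() == t.reshape(-1)).all()

# C-contiguous input: both return views...
assert np.shares_memory(a.ravel(), a)
assert np.shares_memory(a.reshape(-1), a)

# ...non-contiguous input: both must copy.
assert not np.shares_memory(t.ravel(), t)
assert not np.shares_memory(t.reshape(-1), t)
```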

What is x.flat.copy()?  Is it the same as x.flatten()?  Why does flatiter
even have a .copy() method?  Isn't  i.copy() the same as i.base.flatten(),
only slower?

And  with all these methods, I still don't have the one that would flatten
any array including a nested array like this:

>>> x = np.array([np.arange(2), np.arange(3), np.arange(4)])

I need yet another function here, for example

>>> np.hstack(x)
array([0, 1, 0, 1, 2, 0, 1, 2, 3])

and what if I want to flatten a higher dimensional nested array, say

>>> y = np.array([x[:1], x[:2], x])

can I do better than

>>> np.hstack(np.hstack(y))
array([0, 1, 0, 1, 0, 1, 2, 0, 1, 0, 1, 2, 0, 1, 2, 3])

?
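
(One slightly better option is a small recursive helper — ``deep_flatten``
is a hypothetical name, not a NumPy function — which handles any nesting
depth in one call:)

```python
import numpy as np

def deep_flatten(a):
    # Recursively flatten a possibly nested object array into 1-D.
    a = np.asarray(a)
    if a.dtype == object:
        return np.concatenate([deep_flatten(e) for e in a.ravel()])
    return a.ravel()

# dtype=object makes the ragged construction explicit:
x = np.array([np.arange(2), np.arange(3), np.arange(4)], dtype=object)
assert deep_flatten(x).tolist() == [0, 1, 0, 1, 2, 0, 1, 2, 3]

y = np.array([x[:1], x[:2], x], dtype=object)
assert deep_flatten(y).size == 16
```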


Re: [Numpy-discussion] FFTS for numpy's FFTs (was: Re: Choosing between NumPy and SciPy functions)

2014-10-28 Thread Sturla Molden
Eelco Hoogendoorn hoogendoorn.ee...@gmail.com wrote:

 Perhaps the 'batteries included' philosophy made sense in the early days of
 numpy; but given that there are several fft libraries with their own pros
 and cons, and that most numpy projects will use none of them at all, why
 should numpy bundle any of them?

Because sometimes we just need to compute a DFT, just like we sometimes
need to compute a sine or an exponential, and it does that job perfectly
well. It is not always about speed. Just typing np.fft.fft(x) is convenient.

Sturla



Re: [Numpy-discussion] Why ndarray provides four ways to flatten?

2014-10-28 Thread Nathaniel Smith
On 28 Oct 2014 16:58, Alexander Belopolsky ndar...@mac.com wrote:

 On Mon, Oct 27, 2014 at 9:41 PM, Yuxiang Wang yw...@virginia.edu wrote:

 In my opinion - because they don't do the same thing, especially when
 you think in terms in lower-level.

 ndarray.flat returns an iterator; ndarray.flatten() returns a copy;
 ndarray.ravel() only makes copies when necessary; ndarray.reshape() is
 more general purpose, even though you can use it to flatten arrays.


 Out of the four ways, I find x.flat the most confusing.

I too would be curious to know why .flat exists (beyond "it seemed like a
good idea at the time" ;-)). I've always treated it as some weird legacy
thing and ignored it, and this has worked out well for me.

Is there any real problem where .flat is really the best solution? Should
we deprecate it, or at least warn people off from it officially?

-n


Re: [Numpy-discussion] FFTS for numpy's FFTs (was: Re: Choosing between NumPy and SciPy functions)

2014-10-28 Thread Daniele Nicolodi
On 28/10/14 16:50, David Cournapeau wrote:
 Nothing impossible (looks like Sony at least uses this code on windows:
 https://github.com/anthonix/ffts/issues/27#issuecomment-40204403), but
 not a 2 hours thing either.

One of the downsides of the BSD license :)

Cheers,
Daniele




Re: [Numpy-discussion] Why ndarray provides four ways to flatten?

2014-10-28 Thread Alan G Isaac
On 10/28/2014 1:25 PM, Nathaniel Smith wrote:
 I too would be curious to know why .flat exists (beyond "it seemed like a
 good idea at the time" ;-))


How else would you iterate over all items of a multidimensional array?
As an example application, use it to assign to an arbitrary diagonal.
(It can be sliced.)
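
For instance, since a flatiter supports slicing, assigning to the main
diagonal of an n x n array is a one-liner with stride n + 1:

```python
import numpy as np

a = np.zeros((4, 4))
a.flat[::5] = 1.0        # stride n + 1 = 5 hits the main diagonal
assert np.array_equal(a, np.eye(4))

# An offset start picks out another diagonal:
b = np.zeros((4, 4))
b.flat[1::5] = 7.0       # elements (0, 1), (1, 2), (2, 3)
assert b[0, 1] == b[1, 2] == b[2, 3] == 7.0
```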

I don't recall the specifics at the moment, but I've been happy to
have it in the past.

Alan Isaac


Re: [Numpy-discussion] Why ndarray provides four ways to flatten?

2014-10-28 Thread Stephan Hoyer
On Tue, Oct 28, 2014 at 10:25 AM, Nathaniel Smith n...@pobox.com wrote:

 I too would be curious to know why .flat exists (beyond it seemed like a
 good idea at the time ;-)). I've always treated it as some weird legacy
 thing and ignored it, and this has worked out well for me.

 Is there any real problem where .flat is really the best solution? Should
 we deprecate it, or at least warn people off from it officially?

.flat lets you iterate over all elements of an N-dimensional array as if it
were 1D, without ever needing to make a copy of the array. In contrast,
ravel() and reshape(-1) cannot always avoid a copy, because they need to
return another ndarray.

np.nditer is a reasonable alternative to .flat (and it's documented as
such), but it's a rather inelegant, kitchen-sink type function.
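
The no-copy behaviour is easy to verify on a non-contiguous array, where
ravel() is forced to copy but .flat still writes through to the original
data:

```python
import numpy as np

t = np.arange(6).reshape(2, 3).T      # non-contiguous view

# ravel() has to copy here, so the result shares no memory with t...
assert not np.shares_memory(t.ravel(), t)

# ...while .flat indexes the original data directly, no copy involved:
t.flat[0] = 99
assert t[0, 0] == 99
```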


Re: [Numpy-discussion] FFTS for numpy's FFTs (was: Re: Choosing between NumPy and SciPy functions)

2014-10-28 Thread Stefan van der Walt
On 2014-10-28 19:37:17, Daniele Nicolodi dani...@grinta.net wrote:
 On 28/10/14 16:50, David Cournapeau wrote:
 Nothing impossible (looks like Sony at least uses this code on windows:
 https://github.com/anthonix/ffts/issues/27#issuecomment-40204403), but
 not a 2 hours thing either.

 One of the downsides of the BSD license :)

Perhaps one of the upsides, as they may be willing to contribute back if
asked nicely.

Stéfan
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] FFTS for numpy's FFTs (was: Re: Choosing between NumPy and SciPy functions)

2014-10-28 Thread Daniele Nicolodi
On 28/10/14 18:44, Stefan van der Walt wrote:
 On 2014-10-28 19:37:17, Daniele Nicolodi dani...@grinta.net wrote:
 On 28/10/14 16:50, David Cournapeau wrote:
 Nothing impossible (looks like Sony at least uses this code on windows:
 https://github.com/anthonix/ffts/issues/27#issuecomment-40204403), but
 not a 2 hours thing either.

 One of the downsides of the BSD license :)
 
 Perhaps one of the upsides, as they may be willing to contribute back if
 asked nicely.

If it were GPL or similar, they would have to, and there would be no
need to ask nicely.

Cheers,
Daniele

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Why ndarray provides four ways to flatten?

2014-10-28 Thread Alan G Isaac
On 10/28/2014 1:42 PM, Stephan Hoyer wrote:
 np.nditer is a reasonable alternative to .flat (and it's documented as such), 
 but it's a rather inelegant, kitchen-sink type function.


I'm not sure what "reasonable" means here,
other than "possible to use, in principle."

In particular, `flat` is much more elegant,
and includes an automatic guarantee that the
iterations will be in C-contiguous style.
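
A quick sketch of that guarantee (not from the original post):

```python
import numpy as np

# A Fortran-ordered array: memory layout differs from C order
a = np.array([[1, 2], [3, 4]], order='F')

# .flat always iterates in C (row-major) order, regardless of layout
assert list(a.flat) == [1, 2, 3, 4]
```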

Alan

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] [ANN] Python for Scientific Computing conference in Boulder, CO; April'15

2014-10-28 Thread Fernando Perez
Hi folks,

a colleague from NCAR in Boulder just sent me this link about a conference
they are organizing in the spring:

https://sea.ucar.edu/conference/2015

I figured this might be of interest to many on these lists.  The actual
call isn't up yet, so if you're interested, watch that site for an upcoming
call when they post it (I'm not directly involved, just passing the message
along).

Cheers

f

-- 
Fernando Perez (@fperez_org; http://fperez.org)
fperez.net-at-gmail: mailing lists only (I ignore this when swamped!)
fernando.perez-at-berkeley: contact me here for any direct mail
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] [ANN] Python for Scientific Computing conference in Boulder, CO; April'15

2014-10-28 Thread Jean-Baptiste Marquette

Le 28 oct. 2014 à 19:09, Fernando Perez fperez@gmail.com a écrit :

 a colleague from NCAR in Boulder just sent me this link about a conference 
 they are organizing in the spring:
 


Wrong year on the web page: April 13 - 17, 2014

Cheers,
JB



___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] [ANN] Python for Scientific Computing conference in Boulder, CO; April'15

2014-10-28 Thread Fernando Perez
thanks, reported.

On Tue, Oct 28, 2014 at 11:23 AM, Jean-Baptiste Marquette marqu...@iap.fr
wrote:


 Le 28 oct. 2014 à 19:09, Fernando Perez fperez@gmail.com a écrit :

 a colleague from NCAR in Boulder just sent me this link about a conference
 they are organizing in the spring:


 Wrong year on the web page: *April 13 - 17, 2014*

 Cheers,
 JB




 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion




-- 
Fernando Perez (@fperez_org; http://fperez.org)
fperez.net-at-gmail: mailing lists only (I ignore this when swamped!)
fernando.perez-at-berkeley: contact me here for any direct mail
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Generalize hstack/vstack -- stack; Block matrices like in matlab

2014-10-28 Thread Nathaniel Smith
On 28 Oct 2014 18:34, Stefan Otte stefan.o...@gmail.com wrote:

 Hey,

 In the last weeks I tested `np.asarray(np.bmat())` as `stack`
 function and it works quite well. So the question persists: if `bmat`
 already offers something like `stack` should we even bother
 implementing `stack`? More code leads to more
 bugs and maintenance work. (However, the current implementation is
 only 5 lines, and using `bmat` would reduce that even more.)

In the long run we're trying to reduce usage of np.matrix and ideally
deprecate it entirely. So yes, providing ndarray equivalents of matrix
functionality (like bmat) is valuable.
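
A sketch of the `np.asarray(np.bmat(...))` idiom under discussion
(illustrative only):

```python
import numpy as np

A = np.ones((2, 2))
B = np.zeros((2, 2))

# np.bmat assembles a block matrix; np.asarray converts the
# np.matrix result back to a plain ndarray
stacked = np.asarray(np.bmat([[A, B], [B, A]]))

assert stacked.shape == (4, 4)
assert type(stacked) is np.ndarray  # not np.matrix
```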

-n
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Why ndarray provides four ways to flatten?

2014-10-28 Thread Din Vadhia
It would be nice if there were a single meta numpy.flatten() function with
kind: {'ravel', 'flatten', 'flat', 'reshape'} options, similar to the
numpy.sort() function's kind: {'quicksort', 'mergesort', 'heapsort'} options.
It would also make it easier to select the best option for each problem by
reading the doc in one place.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Memory efficient alternative for np.loadtxt and np.genfromtxt

2014-10-28 Thread Chris Barker
A few thoughts:

1) yes, a faster, more memory efficient text file parser would be great.
Yeah, if your workflow relies on parsing lots of huge text files, you
probably need another workflow. But it's a really really common thing to
need to do -- why not do it fast?

 2) you are describing a special case where you know the data size
 a priori (e.g. not streaming), dtypes are readily apparent from a small
 sample case, and in general your data is not messy 

sure -- that's a special case, but it's a really common special case (OK --
without the "know your data size" part, anyway...)

3)

 Someone also posted some code or the draft thereof for using resizable
 arrays quite a while ago, which would
 reduce the memory overhead for very large arrays.


That may have been me -- I have a resizable array class, both pure python
and not-quite finished Cython version. In practice, if you add stuff to the
array row by row (or item by item), it's no faster than putting it all in a
list and then converting to an array -- but it IS more memory efficient,
which seems to be the issue here. Let me know if you want it -- I really
need to get it up on gitHub one of these days.

My take: for fast parsing of big files you need:

To do the parsing/converting in C -- what's wrong with good old fscanf, at
least for the basic types -- it's pretty darn fast.

Memory efficiency -- something like my growable array is not all that hard
to implement and pretty darn quick -- you just do the usual trick:
over-allocate a bit of memory, and when it gets full re-allocate a larger chunk.
It turns out, at least on the hardware I tested on, that the performance is
not very sensitive to how much you over allocate -- if it's tiny (1
element) performance really sucks, but once you get to a 10% or so (maybe
less) over-allocation, you don't notice the difference.
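
The over-allocation trick can be sketched in a few lines (a hypothetical
minimal version, not the actual accumulator class mentioned below):

```python
import numpy as np

class GrowableArray:
    """Minimal 1-D growable array that over-allocates by ~25%."""

    def __init__(self, dtype=float):
        self._buf = np.empty(16, dtype=dtype)
        self._len = 0

    def append(self, value):
        if self._len == self._buf.shape[0]:
            # full: re-allocate a ~25% larger chunk and copy over
            new = np.empty(int(self._buf.shape[0] * 1.25) + 1,
                           dtype=self._buf.dtype)
            new[:self._len] = self._buf
            self._buf = new
        self._buf[self._len] = value
        self._len += 1

    @property
    def array(self):
        # a view of just the filled portion
        return self._buf[:self._len]

ga = GrowableArray()
for i in range(1000):
    ga.append(i)
assert ga.array.shape == (1000,) and ga.array[-1] == 999
```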

Keep the auto-figuring out of the structure / dtypes separate from the
high-speed parsing code. I'd say write the high-speed parsing code first --
that requires specification of the data types and structure; then, if you
want, write some nice pure python code that tries to auto-detect all that.
If it's a small file, it's fast regardless. If it's a large file, then the
overhead of the fancy parsing will be lost, and you'll want the line by
line parsing to be as fast as possible.

From a quick look, it seems that the pandas code is pretty nice -- maybe
the 2X memory footprint should be ignored.

-Chris










 Cheers,
 Derek



 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion




-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Memory efficient alternative for np.loadtxt and np.genfromtxt

2014-10-28 Thread Nathaniel Smith
On 28 Oct 2014 20:10, Chris Barker chris.bar...@noaa.gov wrote:

 Memory efficiency -- somethign like my growable array is not all that
hard to implement and pretty darn quick -- you just do the usual trick_
over allocate a bit of memory, and when it gets full re-allocate a larger
chunk.

Can't you just do this with regular numpy using .resize()? What does your
special class add? (Just curious.)
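
(For reference, a minimal sketch of what `ndarray.resize()` itself does;
`refcheck=False` skips the reference-count safety check:)

```python
import numpy as np

a = np.arange(4)
a.resize(8, refcheck=False)   # grow in place; new elements are zero-filled
assert list(a) == [0, 1, 2, 3, 0, 0, 0, 0]
```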

 From a quick look, it seems that the pandas code is pretty nice -- maybe
the 2X memory footprint should be ignored.

+1

It's fun to sit around and brainstorm clever implementation strategies, but
Wes already went ahead and implemented all the tricky bits, and optimized
them too. No point in reinventing the wheel.

(Plus as I pointed out upthread, it's entirely likely that this 2x
overhead is based on a misunderstanding/oversimplification of how virtual
memory works, and the actual practical overhead is much lower.)

-n
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Memory efficient alternative for np.loadtxt and np.genfromtxt

2014-10-28 Thread Benjamin Root
As a bit of an aside, I have just discovered that for fixed-width text
data, numpy's text readers seem to edge out pandas' read_fwf(), and numpy
has the advantage of being able to specify the dtypes ahead of time (seems
that the pandas version just won't allow it, which means I end up with
float64's and object dtypes instead of float32's and |S12 dtypes where I
want them).

Cheers!
Ben Root


On Tue, Oct 28, 2014 at 4:09 PM, Chris Barker chris.bar...@noaa.gov wrote:

 [full quote of Chris Barker's message above snipped]


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Memory efficient alternative for np.loadtxt and np.genfromtxt

2014-10-28 Thread Julian Taylor
On 28.10.2014 21:24, Nathaniel Smith wrote:
 On 28 Oct 2014 20:10, Chris Barker chris.bar...@noaa.gov
 mailto:chris.bar...@noaa.gov wrote:

 Memory efficiency -- somethign like my growable array is not all that
 hard to implement and pretty darn quick -- you just do the usual trick_
 over allocate a bit of memory, and when it gets full re-allocate a
 larger chunk.
 
 Can't you just do this with regular numpy using .resize()? What does
 your special class add? (Just curious.)
 
 From a quick look, it seems that the pandas code is pretty nice --
 maybe the 2X memory footprint should be ignored.
 
 +1
 
 It's fun to sit around and brainstorm clever implementation strategies,
 but Wes already went ahead and implemented all the tricky bits, and
 optimized them too. No point in reinventing the wheel.
 

just to throw it in there, astropy recently also added a faster ascii
file reader:
https://groups.google.com/forum/#!topic/astropy-dev/biCgb3cF0v0
not familiar with how it compares to pandas.

How is pandas' support for unicode text files?
Unicode is the big weak point of numpy's current text readers and needs
to be addressed.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Memory efficient alternative for np.loadtxt and np.genfromtxt

2014-10-28 Thread Chris Barker
On Tue, Oct 28, 2014 at 1:24 PM, Nathaniel Smith n...@pobox.com wrote:

  Memory efficiency -- somethign like my growable array is not all that
 hard to implement and pretty darn quick -- you just do the usual trick_
 over allocate a bit of memory, and when it gets full re-allocate a larger
 chunk.

 Can't you just do this with regular numpy using .resize()? What does your
 special class add? (Just curious.)

It uses resize under the hood -- it just adds the bookkeeping for the
over-allocation, etc., and lets you access the data as though it wasn't
over-allocated.

Like I said, not that difficult.

I haven't touched it for a while, but if you are curious I just threw it up
on GitHub:

https://github.com/PythonCHB/NumpyExtras

You want accumulator.py -- there is also a Cython version that I didn't
quite finish... In theory, it should be faster in some cases by
reducing the need to round-trip between numpy and python data types...

In practice, I don't think I got it to a point where I could do real-world
profiling.

 It's fun to sit around and brainstorm clever implementation strategies, but
 Wes already went ahead and implemented all the tricky bits, and optimized
 them too. No point in reinventing the wheel.

 (Plus as I pointed out upthread, it's entirely likely that this 2x
 overhead is based on a misunderstanding/oversimplification of how virtual
 memory works, and the actual practical overhead is much lower.)

good point.

-CHB


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] FFTS for numpy's FFTs (was: Re: Choosing between NumPy and SciPy functions)

2014-10-28 Thread Stefan van der Walt
On 2014-10-28 19:55:57, Daniele Nicolodi dani...@grinta.net wrote:
 On 28/10/14 18:44, Stefan van der Walt wrote:
 On 2014-10-28 19:37:17, Daniele Nicolodi dani...@grinta.net wrote:
 On 28/10/14 16:50, David Cournapeau wrote:
 Nothing impossible (looks like Sony at least uses this code on windows:
 https://github.com/anthonix/ffts/issues/27#issuecomment-40204403), but
 not a 2 hours thing either.

 One of the downsides of the BSD license :)
 
 Perhaps one of the upsides, as they may be willing to contribute back if
 asked nicely.

 If it were GPL or similar, they would have to, and there would be no
 need to ask nicely.

But then they would not have written the code to start off with, so that
point is moot.

Stéfan
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Why ndarray provides four ways to flatten?

2014-10-28 Thread Alexander Belopolsky
On Tue, Oct 28, 2014 at 1:42 PM, Stephan Hoyer sho...@gmail.com wrote:

 .flat lets you iterate over all elements of a N-dimensional array as if it
 was 1D, without ever needing to make a copy of the array. In contrast,
 ravel() and reshape(-1) cannot always avoid a copy, because they need to
 return another ndarray.


In some cases ravel() returns a copy where a view can be easily constructed.
For example,

 x = np.arange(10)
 y = x[::2]
 y.ravel().flags['OWNDATA']
True

Interestingly, in the same case reshape(-1) returns a view:

 y.reshape(-1).flags['OWNDATA']
False

(This suggests at least a documentation bug - numpy.ravel documentation
says that it is equivalent to reshape(-1).)

It is only in situations like this

 a = np.arange(16).reshape((4,4))
 a[1::2,1::2].ravel()
array([ 5,  7, 13, 15])

where a flat view cannot be an ndarray, but .flat can still return something
that is at least duck-typing compatible with ndarray (if not an ndarray
subclass) and behaves as a view into the original data.

My preferred design would be for x.flat to return a flat view into x.  This
would be consistent with the way .T and .real attributes are defined and
close enough to .imag.  An obvious way to obtain a flat copy would be
x.flat.copy().  Once we have this, ravel() and flatten() can be deprecated
and reshape(-1) discouraged.
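
The view-like behaviour of .flat that this design builds on can be
sketched as follows (illustrative, not from the original message):

```python
import numpy as np

x = np.arange(16).reshape(4, 4)

# .flat of a strided selection acts as a writable view into x
fv = x[1::2, 1::2].flat
fv[0] = -1                    # writes propagate back to x
assert x[1, 1] == -1

# flatten(), by contrast, always returns a copy
c = x.flatten()
c[0] = 99
assert x[0, 0] == 0
```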

I think this would be backward compatible except for rather questionable
situations like this:

 i = x.flat
 list(i)
[0, 1, 2, 3, 4, 0, 6, 7, 8, 9]
 list(i)
[]
 np.array(i)
array([0, 1, 2, 3, 4, 0, 6, 7, 8, 9])
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Why ndarray provides four ways to flatten?

2014-10-28 Thread Nathaniel Smith
On Wed, Oct 29, 2014 at 12:37 AM, Alexander Belopolsky ndar...@mac.com wrote:

 On Tue, Oct 28, 2014 at 1:42 PM, Stephan Hoyer sho...@gmail.com wrote:

 .flat lets you iterate over all elements of a N-dimensional array as if it
 was 1D, without ever needing to make a copy of the array. In contrast,
 ravel() and reshape(-1) cannot always avoid a copy, because they need to
 return another ndarray.


 In some cases ravel() returns a copy where a view can be easily constructed.
 For example,

 x = np.arange(10)
 y = x[::2]
 y.ravel().flags['OWNDATA']
 True

 Interestingly, in the same case reshape(-1) returns a view:

 y.reshape(-1).flags['OWNDATA']
 False

 (This suggests at least a documentation bug - numpy.ravel documentation says
 that it is equivalent to reshape(-1).)

Well, that's disturbing. Why have one implementation when you can have three...

 It is only in situations like this

 a = np.arange(16).reshape((4,4))
 a[1::2,1::2].ravel()
 array([ 5,  7, 13, 15])

 where flat view cannot be an ndarray, but .flat can still return something
 that is at least duck-typing compatible with ndarray (if not an ndarray
 subclass) and behaves as a view into original data.

 My preferred design would be for x.flat to return a flat view into x.  This
 would be consistent with the way .T and .real attributes are defined and
 close enough to .imag.

.flat cannot return a flat view analogous to .T, .real, .imag, because
those attributes return ndarray views, and .flat can't guarantee that.

OTOH trying to make .flat into a full duck-compatible ndarray-like
type is a non-starter; it would take a tremendous amount of work for
no clear gain.

Counter-proposal: document that .flat is only for iteration and should
be avoided otherwise, and add a copy = {True, False, if-needed}
kwarg to flatten/ravel/reshape. And the only difference between ravel
and flatten is the default value of this argument. (And while we're at
it, make it so that their implementation is literally to just call
.reshape.)
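
A hypothetical sketch of that copy keyword (not an actual NumPy API;
`np.shares_memory` is used here to detect whether a copy was made):

```python
import numpy as np

def ravel(a, copy="if-needed"):
    """Hypothetical ravel() with an explicit copy policy."""
    flat = a.reshape(-1)                      # a view when possible
    made_copy = not np.shares_memory(a, flat)
    if copy is False and made_copy:
        raise ValueError("cannot ravel this array without a copy")
    if copy is True and not made_copy:
        flat = flat.copy()
    return flat

x = np.arange(6).reshape(2, 3)
assert np.shares_memory(x, ravel(x))                 # view is possible
assert not np.shares_memory(x, ravel(x, copy=True))  # forced copy
```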

-n

-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Why ndarray provides four ways to flatten?

2014-10-28 Thread Alexander Belopolsky
On Tue, Oct 28, 2014 at 9:23 PM, Nathaniel Smith n...@pobox.com wrote:

 OTOH trying to make .flat into a full duck-compatible ndarray-like
 type is a non-starter; it would take a tremendous amount of work for
 no clear gain.


I don't think so - I think all the heavy lifting is already done in
flatiter.  The missing parts are mostly trivial things like .size or .shape
or can be fudged by coercing to true ndarray using existing
flatiter.__array__ method.

It would be more interesting, however, if we could always return a true
ndarray view.  How is the ndarray.diagonal() view implemented in 1.9?  Can
something similar be used to create a flat view?
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Why ndarray provides four ways to flatten?

2014-10-28 Thread Nathaniel Smith
On 29 Oct 2014 01:47, Alexander Belopolsky ndar...@mac.com wrote:


 On Tue, Oct 28, 2014 at 9:23 PM, Nathaniel Smith n...@pobox.com wrote:

 OTOH trying to make .flat into a full duck-compatible ndarray-like
 type is a non-starter; it would take a tremendous amount of work for
 no clear gain.


 I don't think so - I think all the heavy lifting is already done in
flatiter.  The missing parts are mostly trivial things like .size or .shape
or can be fudged by coercing to true ndarray using existing
flatiter.__array__ method.

Now try .resize()... The full ndarray API is vast, and niggling problems
would create endless maintenance issues. If your API is going to be that
leaky then it's better not to have it at all.

 It would be more interesting however if we could always return a true
ndarray view.  How is ndarray.diagonal() view implemented in 1.9?  Can
something similar be used to create a flat view?

.diagonal has no magic, it just turns out that the diagonal of any strided
array is also expressible as a strided array. (Specifically, new_strides =
(sum(old_strides),).) There is no analogous theorem for flattening.
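
That stride relation can be checked directly (a sketch; in NumPy >= 1.9,
.diagonal() returns a read-only view):

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

a = np.arange(9).reshape(3, 3)

d = a.diagonal()
assert d.strides == (sum(a.strides),)   # new_strides = (sum(old_strides),)

# the same view built explicitly with as_strided
d2 = as_strided(a, shape=(3,), strides=(sum(a.strides),))
assert (d2 == d).all()
```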

-n
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Deprecate pkgload, PackageLoader

2014-10-28 Thread Charles R Harris
Hi All,

It is proposed to deprecate, then remove, pkgload and PackageLoader.

Complaints? Cries of Anguish?

Chuck
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion