Re: [Numpy-discussion] NumPy 1.11.2 released

2016-10-03 Thread Evgeni Burovski
Thank you Chuck!
On 04.10.2016 at 5:15, "Charles R Harris" wrote:

> *Hi All,*
>
> I'm pleased to announce the release of Numpy 1.11.2. This release
> supports Python 2.6 - 2.7, and 3.2 - 3.5 and fixes bugs and regressions
> found in Numpy 1.11.1.  Wheels for Linux, Windows, and OSX can be found
> on PyPI. Sources are available on both PyPI and Sourceforge.
>
> Thanks to all who were involved in this release. Contributors and merged
> pull requests are listed below.
>
>
> *Contributors to v1.11.2*
>
>- Allan Haldane
>- Bertrand Lefebvre
>- Charles Harris
>- Julian Taylor
>- Loïc Estève
>- Marshall Bockrath-Vandegrift +
>- Michael Seifert +
>- Pauli Virtanen
>- Ralf Gommers
>- Sebastian Berg
>- Shota Kawabuchi +
>- Thomas A Caswell
>- Valentin Valls +
>- Xavier Abellan Ecija +
>
> A total of 14 people contributed to this release. People with a "+" by
> their names contributed a patch for the first time.
>
> *Pull requests merged for v1.11.2*
>
>- #7736: Backport 4619, BUG: many functions silently drop keepdims kwarg
>- #7738: Backport 5706, ENH: add extra kwargs and update doc of many MA...
>- #7778: DOC: Update Numpy 1.11.1 release notes.
>- #7793: Backport 7515, BUG: MaskedArray.count treats negative axes incorrectly
>- #7816: Backport 7463, BUG: fix array too big error for wide dtypes.
>- #7821: Backport 7817, BUG: Make sure npy_mul_with_overflow_ detects...
>- #7824: Backport 7820, MAINT: Allocate fewer bytes for empty arrays.
>- #7847: Backport 7791, MAINT,DOC: Fix some imp module uses and update...
>- #7849: Backport 7848, MAINT: Fix remaining uses of deprecated Python...
>- #7851: Backport 7840, Fix ATLAS version detection
>- #7870: Backport 7853, BUG: Raise RuntimeError when reloading numpy is...
>- #7896: Backport 7894, BUG: construct ma.array from np.array which contains...
>- #7904: Backport 7903, BUG: fix float16 type not being called due to...
>- #7917: BUG: Production install of numpy should not require nose.
>- #7919: Backport 7908, BLD: Fixed MKL detection for recent versions of...
>- #7920: Backport #7911: BUG: fix for issue#7835 (ma.median of 1d)
>- #7932: Backport 7925, Monkey-patch _msvccompile.gen_lib_option like...
>- #7939: Backport 7931, BUG: Check for HAVE_LDOUBLE_DOUBLE_DOUBLE_LE in...
>- #7953: Backport 7937, BUG: Guard against buggy comparisons in generic...
>- #7954: Backport 7952, BUG: Use keyword arguments to initialize Extension...
>- #7955: Backport 7941, BUG: Make sure numpy globals keep identity after...
>- #7972: Backport 7963, BUG: MSVCCompiler grows 'lib' & 'include' env...
>- #7990: Backport 7977, DOC: Create 1.11.2 release notes.
>- #8005: Backport 7956, BLD: remove __NUMPY_SETUP__ from builtins at end...
>- #8007: Backport 8006, DOC: Update 1.11.2 release notes.
>- #8010: Backport 8008, MAINT: Remove leftover imp module imports.
>- #8012: Backport 8011, DOC: Update 1.11.2 release notes.
>- #8020: Backport 8018, BUG: Fixes return for np.ma.count if keepdims...
>- #8024: Backport 8016, BUG: Fix numpy.ma.median.
>- #8031: Backport 8030, BUG: fix np.ma.median with only one non-masked...
>- #8032: Backport 8028, DOC: Update 1.11.2 release notes.
>- #8044: Backport 8042, BUG: core: fix bug

Re: [Numpy-discussion] automatically avoiding temporary arrays

2016-10-03 Thread Marten van Kerkwijk
Note that numpy does store some larger arrays already, in the fft
module. (In fact, this was a cache of unlimited size until #7686.) It
might not be bad if the same cache were used more generally.

That said, if newer versions of python are offering ways of doing this
better, maybe that is the best way forward.

-- Marten
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NumPy 1.11.2 released

2016-10-03 Thread Matthew Brett
On Mon, Oct 3, 2016 at 7:15 PM, Charles R Harris wrote:
> Hi All,
>
> I'm pleased to announce the release of Numpy 1.11.2. This release supports
> Python 2.6 - 2.7, and 3.2 - 3.5 and fixes bugs and regressions found in
> Numpy 1.11.1.  Wheels for Linux, Windows, and OSX can be found on PyPI.
> Sources are available on both PyPI and Sourceforge.
>
> Thanks to all who were involved in this release. Contributors and merged
> pull requests are listed below.
>
>
> Contributors to v1.11.2
>
> Allan Haldane
> Bertrand Lefebvre
> Charles Harris
> Julian Taylor
> Loïc Estève
> Marshall Bockrath-Vandegrift +
> Michael Seifert +
> Pauli Virtanen
> Ralf Gommers
> Sebastian Berg
> Shota Kawabuchi +
> Thomas A Caswell
> Valentin Valls +
> Xavier Abellan Ecija +
>
> A total of 14 people contributed to this release. People with a "+" by their
> names contributed a patch for the first time.
>
> Pull requests merged for v1.11.2
>
> #7736: Backport 4619, BUG: many functions silently drop keepdims kwarg
> #7738: Backport 5706, ENH: add extra kwargs and update doc of many MA...
> #7778: DOC: Update Numpy 1.11.1 release notes.
> #7793: Backport 7515, BUG: MaskedArray.count treats negative axes
> incorrectly
> #7816: Backport 7463, BUG: fix array too big error for wide dtypes.
> #7821: Backport 7817, BUG: Make sure npy_mul_with_overflow_ detects...
> #7824: Backport 7820, MAINT: Allocate fewer bytes for empty arrays.
> #7847: Backport 7791, MAINT,DOC: Fix some imp module uses and update...
> #7849: Backport 7848, MAINT: Fix remaining uses of deprecated Python...
> #7851: Backport 7840, Fix ATLAS version detection
> #7870: Backport 7853, BUG: Raise RuntimeError when reloading numpy is...
> #7896: Backport 7894, BUG: construct ma.array from np.array which
> contains...
> #7904: Backport 7903, BUG: fix float16 type not being called due to...
> #7917: BUG: Production install of numpy should not require nose.
> #7919: Backport 7908, BLD: Fixed MKL detection for recent versions of...
> #7920: Backport #7911: BUG: fix for issue#7835 (ma.median of 1d)
> #7932: Backport 7925, Monkey-patch _msvccompile.gen_lib_option like...
> #7939: Backport 7931, BUG: Check for HAVE_LDOUBLE_DOUBLE_DOUBLE_LE in...
> #7953: Backport 7937, BUG: Guard against buggy comparisons in generic...
> #7954: Backport 7952, BUG: Use keyword arguments to initialize Extension...
> #7955: Backport 7941, BUG: Make sure numpy globals keep identity after...
> #7972: Backport 7963, BUG: MSVCCompiler grows 'lib' & 'include' env...
> #7990: Backport 7977, DOC: Create 1.11.2 release notes.
> #8005: Backport 7956, BLD: remove __NUMPY_SETUP__ from builtins at end...
> #8007: Backport 8006, DOC: Update 1.11.2 release notes.
> #8010: Backport 8008, MAINT: Remove leftover imp module imports.
> #8012: Backport 8011, DOC: Update 1.11.2 release notes.
> #8020: Backport 8018, BUG: Fixes return for np.ma.count if keepdims...
> #8024: Backport 8016, BUG: Fix numpy.ma.median.
> #8031: Backport 8030, BUG: fix np.ma.median with only one non-masked...
> #8032: Backport 8028, DOC: Update 1.11.2 release notes.
> #8044: Backport 8042, BUG: core: fix bug in NpyIter buffering with
> discontinuous...
> #8046: Backport 8045, DOC: Update 1.11.2 release notes.

Thanks very much for doing all the release work, and congratulations on the release.

Cheers,

Matthew


Re: [Numpy-discussion] Dropping sourceforge for releases.

2016-10-03 Thread Charles R Harris
On Sun, Oct 2, 2016 at 5:53 PM, Vincent Davis wrote:

> +1, I am very skeptical of anything on SourceForge, it negatively impacts
> my opinion of any project that requires me to download from sourceforge.
>
>
> On Saturday, October 1, 2016, Charles R Harris wrote:
>
>> Hi All,
>>
>> Ralf has suggested dropping sourceforge as a NumPy release site. There
>> was discussion of doing that some time back but we have not yet done it.
>> Now that we put wheels up on PyPI for all supported architectures,
>> SourceForge is not needed. I note that there are still some 15,000
>> downloads a week from the site, so it is still used.
>>
>> Thoughts?
>>
>> Chuck
>>
>
I've uploaded the NumPy 1.11.2 release to sourceforge and made a note on
the summary page that that will be the last release to be found there.

Chuck


Re: [Numpy-discussion] automatically avoiding temporary arrays

2016-10-03 Thread Pauli Virtanen
Mon, 03 Oct 2016 15:07:28 -0400, Benjamin Root wrote:
> With regards to arguments about holding onto large arrays, I would like
> to emphasize that my original suggestion mentioned weakref'ed numpy
> arrays.
> Essentially, the idea is to claw back only the raw memory blocks during
> that limbo period between discarding the numpy array python object and
> when python garbage-collects it.

CPython afaik deallocates immediately when the refcount hits zero. It's 
relatively rare that you have arrays hanging around waiting for cycle 
breaking by gc. If you have them hanging around, I don't think it's 
possible to distinguish these from other arrays without running the gc.

Note also that an "is an array in use" check probably always requires 
Julian's stack based hack since you cannot rely on the refcount.
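
A minimal sketch of the distinction above, assuming CPython (other interpreters may behave differently). The helper names freed_immediately, cycle_needs_gc and Holder are made up for illustration. An array with no cycles is released as soon as its last reference goes away, while an array kept alive by a reference cycle survives until the cycle collector runs:

```python
import gc
import weakref

import numpy as np


def freed_immediately():
    """True if an array is deallocated as soon as its refcount hits zero."""
    arr = np.zeros(1000)
    ref = weakref.ref(arr)  # does not keep the array alive
    del arr                 # last strong reference is gone
    return ref() is None


class Holder:
    """Hypothetical object that refers to itself, forming a cycle."""

    def __init__(self):
        self.data = np.zeros(1000)
        self.me = self


def cycle_needs_gc():
    """Return (alive before gc.collect(), alive after gc.collect())."""
    gc.disable()  # keep the automatic collector from running in between
    try:
        holder = Holder()
        ref = weakref.ref(holder.data)
        del holder                      # the cycle keeps holder.data alive
        alive_before = ref() is not None
        gc.collect()                    # break the cycle explicitly
        alive_after = ref() is not None
    finally:
        gc.enable()
    return alive_before, alive_after
```

On CPython, freed_immediately() is expected to return True, and cycle_needs_gc() to report that the array was still alive before the collection and gone after it.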

Pauli



Re: [Numpy-discussion] automatically avoiding temporary arrays

2016-10-03 Thread Benjamin Root
With regards to arguments about holding onto large arrays, I would like to
emphasize that my original suggestion mentioned weakref'ed numpy arrays.
Essentially, the idea is to claw back only the raw memory blocks during
that limbo period between discarding the numpy array python object and when
python garbage-collects it.

Ben Root

On Mon, Oct 3, 2016 at 2:43 PM, Julian Taylor  wrote:

> On 03.10.2016 20:23, Chris Barker wrote:
> >
> >
> > On Mon, Oct 3, 2016 at 3:16 AM, Julian Taylor wrote:
> >
> > the problem with this approach is that we don't really want numpy
> > hogging on to hundreds of megabytes of memory by default so it would
> > need to be a user option.
> >
> >
> > indeed -- but one could set an LRU cache to be very small (few items,
> > not small memory), and then it get used within expressions, but not hold
> > on to much outside of expressions.
>
> numpy doesn't see the whole expression so we can't really do much.
> (technically we could in 3.5 by using pep 523, but that would be a
> larger undertaking)
>
> >
> > However, is the allocation the only (Or even biggest) source of the
> > performance hit?
> >
>
> on large arrays the allocation is insignificant. What does cost some
> time is faulting the memory into the process which implies writing zeros
> into the pages (a page at a time as it is being used).
> By storing memory blocks in numpy we would save this portion. This is
> really the job of the libc, but these are usually tuned for general
> purpose workloads and thus tend to give back memory back to the system
> much earlier than numerical workloads would like.
>
> Note that numpy already has a small memory block cache but its only used
> for very small arrays where the allocation cost itself is significant,
> it is limited to a couple megabytes at most.


Re: [Numpy-discussion] automatically avoiding temporary arrays

2016-10-03 Thread Julian Taylor
On 03.10.2016 20:23, Chris Barker wrote:
> 
> 
> On Mon, Oct 3, 2016 at 3:16 AM, Julian Taylor wrote:
> 
> the problem with this approach is that we don't really want numpy
> hogging on to hundreds of megabytes of memory by default so it would
> need to be a user option.
> 
> 
> indeed -- but one could set an LRU cache to be very small (few items,
> not small memory), and then it get used within expressions, but not hold
> on to much outside of expressions.

numpy doesn't see the whole expression so we can't really do much.
(technically we could in 3.5 by using pep 523, but that would be a
larger undertaking)

> 
> However, is the allocation the only (Or even biggest) source of the
> performance hit?
>  

On large arrays the allocation is insignificant. What does cost some
time is faulting the memory into the process, which implies writing zeros
into the pages (a page at a time as it is being used).
By storing memory blocks in numpy we would save this portion. This is
really the job of the libc, but these are usually tuned for general
purpose workloads and thus tend to give memory back to the system
much earlier than numerical workloads would like.

Note that numpy already has a small memory block cache, but it's only used
for very small arrays where the allocation cost itself is significant;
it is limited to a couple of megabytes at most.
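
One rough way to look at the page-faulting cost described above is to time the allocation, the first write to the buffer, and a second write separately. This is only a sketch: the function name time_first_and_second_touch is made up for illustration, and the numbers depend heavily on the OS, the libc allocator, and whether the memory handed out was already mapped into the process, so no particular result is claimed.

```python
import time

import numpy as np


def time_first_and_second_touch(n_bytes=64 * 1024 * 1024):
    """Time allocation, first write, and second write of one buffer."""
    t0 = time.perf_counter()
    buf = np.empty(n_bytes, dtype=np.uint8)  # reserves memory, writes nothing
    t1 = time.perf_counter()
    buf.fill(1)  # first touch: pages may have to be faulted in here
    t2 = time.perf_counter()
    buf.fill(2)  # second touch: pages are already mapped
    t3 = time.perf_counter()
    return {
        "allocate": t1 - t0,
        "first_fill": t2 - t1,
        "second_fill": t3 - t2,
    }


if __name__ == "__main__":
    for name, seconds in time_first_and_second_touch().items():
        print(f"{name}: {seconds * 1e3:.3f} ms")
```

If the first fill comes out noticeably slower than the second, that difference is roughly the cost a block cache inside numpy would avoid.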


Re: [Numpy-discussion] automatically avoiding temporary arrays

2016-10-03 Thread Chris Barker
On Mon, Oct 3, 2016 at 3:16 AM, Julian Taylor  wrote:

> the problem with this approach is that we don't really want numpy
> hogging on to hundreds of megabytes of memory by default so it would
> need to be a user option.
>

Indeed -- but one could set an LRU cache to be very small (few items, not
small memory), and then it gets used within expressions, but does not hold
on to much outside of expressions.
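
A rough sketch of what such a small cache might look like, purely as an illustration. ScratchCache and its methods are hypothetical names, not an existing numpy API, and this says nothing about how numpy would hook it into expression evaluation. The cache is bounded by item count rather than by bytes, as suggested above:

```python
from collections import OrderedDict

import numpy as np


class ScratchCache:
    """Hypothetical cache of scratch buffers, bounded by number of items."""

    def __init__(self, max_items=2):
        self.max_items = max_items
        self._buffers = OrderedDict()  # (shape, dtype) -> ndarray

    def take(self, shape, dtype):
        """Return a cached buffer of this shape/dtype, else allocate one."""
        shape = (shape,) if isinstance(shape, int) else tuple(shape)
        key = (shape, np.dtype(dtype))
        buf = self._buffers.pop(key, None)
        if buf is None:
            buf = np.empty(shape, dtype=dtype)
        return buf

    def give_back(self, buf):
        """Store a buffer for reuse, evicting the oldest entry when full."""
        key = (buf.shape, buf.dtype)
        self._buffers[key] = buf        # one buffer per key; newer replaces older
        self._buffers.move_to_end(key)  # mark as most recently returned
        while len(self._buffers) > self.max_items:
            self._buffers.popitem(last=False)  # drop least recently returned

    def __len__(self):
        return len(self._buffers)
```

With max_items=2 the cache never holds more than two buffers, however large they are, so anything returned outside of a tight expression is evicted quickly.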

However, is the allocation the only (Or even biggest) source of the
performance hit?

If you generate a temporary as a result of an operation, rather than doing
it in-place, that temporary needs to be allocated, but it also means that
an additional array needs to be pushed through the processor -- and that
can make a big performance difference too.

I'm not entirely sure how to profile this correctly, but this seems to
indicate that the allocation is cheap compared to the operations (for a
million-element array).

* Regular old temporary creation

In [24]: def f1(arr1, arr2):
    ...:     result = arr1 + arr2
    ...:     return result

In [26]: %timeit f1(arr1, arr2)
1000 loops, best of 3: 1.13 ms per loop

* Completely in-place, no allocation of an extra array

In [27]: def f2(arr1, arr2):
    ...:     arr1 += arr2
    ...:     return arr1

In [28]: %timeit f2(arr1, arr2)
1000 loops, best of 3: 755 µs per loop

So that's about 30% faster

* allocate a temporary that isn't used -- but should catch the creation cost

In [29]: def f3(arr1, arr2):
    ...:     result = np.empty_like(arr1)
    ...:     arr1 += arr2
    ...:     return arr1

In [30]: %timeit f3(arr1, arr2)
1000 loops, best of 3: 756 µs per loop

only a µs slower!

Profiling is hard, and I'm not good at it, but this seems to indicate that
the allocation is cheap.
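
For anyone wanting to repeat the comparison outside IPython, here is a self-contained sketch of the same three functions. The array setup (two float arrays of ones) and the best_time helper are assumptions, since the session above does not show how arr1 and arr2 were created. One caveat, in line with Julian's point elsewhere in this thread: the buffer in f3 is never written to, so its pages may never be faulted in, and the measurement may understate the real cost of a temporary.

```python
import timeit

import numpy as np


def f1(arr1, arr2):
    """Regular temporary creation."""
    result = arr1 + arr2
    return result


def f2(arr1, arr2):
    """Completely in-place, no extra array."""
    arr1 += arr2
    return arr1


def f3(arr1, arr2):
    """In-place, plus an allocation that is never touched."""
    unused = np.empty_like(arr1)
    arr1 += arr2
    return arr1


def best_time(func, n=1_000_000, repeat=3, number=50):
    """Best per-call time in seconds over `repeat` runs of `number` calls."""
    arr1 = np.ones(n)  # assumed setup; not shown in the original session
    arr2 = np.ones(n)
    timer = timeit.Timer(lambda: func(arr1, arr2))
    return min(timer.repeat(repeat=repeat, number=number)) / number


if __name__ == "__main__":
    for func in (f1, f2, f3):
        print(f"{func.__name__}: {best_time(func) * 1e6:.0f} µs per call")
```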

-CHB


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov


[Numpy-discussion] ANN: pandas v0.19.0 released

2016-10-03 Thread Joris Van den Bossche
Hi all,

I'm happy to announce pandas 0.19.0 has been released.
This is a major release from 0.18.1 and includes a number of API changes,
several new features, enhancements, and performance improvements along with
a large number of bug fixes. See the Whatsnew file
for more information. We recommend that all users upgrade to this version.

This is the work of 5 months of development by 117 contributors. A big
thank you to all contributors!

Joris

---

*What is it:*

pandas is a Python package providing fast, flexible, and expressive data
structures designed to make working with “relational” or “labeled” data
both easy and intuitive. It aims to be the fundamental high-level building
block for doing practical, real world data analysis in Python.
Additionally, it has the broader goal of becoming the most powerful and
flexible open source data analysis / manipulation tool available in any
language.

*Highlights of the 0.19.0 release include:*

   - New method merge_asof for asof-style time-series joining.
   - The .rolling() method is now time-series aware.
   - read_csv now supports parsing Categorical data.
   - A function union_categoricals has been added for combining
   categoricals.
   - PeriodIndex now has its own period dtype, and has been changed to be
   more consistent with other Index classes.
   - Sparse data structures gained enhanced support of int and bool dtypes.
   - Comparison operations with Series no longer ignore the index; see the
   Whatsnew file for an overview of the API changes.
   - Introduction of a pandas development API for utility functions.
   - Deprecation of Panel4D and PanelND. We recommend representing these
   types of n-dimensional data with the xarray package.
   - Removal of the previously deprecated modules pandas.io.data,
   pandas.io.wb, pandas.tools.rplot.
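
A short sketch of three of the highlights above, using made-up data. The import location pandas.api.types for union_categoricals is the one used by current pandas and is an assumption here; it may differ in 0.19.0 itself.

```python
import pandas as pd
from pandas.api.types import union_categoricals  # location in current pandas

# merge_asof: for each trade, pick the most recent quote at or before it.
trades = pd.DataFrame({
    "time": pd.to_datetime(["2016-10-03 09:00:01", "2016-10-03 09:00:04"]),
    "price": [10.0, 10.5],
})
quotes = pd.DataFrame({
    "time": pd.to_datetime(["2016-10-03 09:00:00",
                            "2016-10-03 09:00:03",
                            "2016-10-03 09:00:05"]),
    "bid": [9.9, 10.4, 10.6],
})
joined = pd.merge_asof(trades, quotes, on="time")

# Time-aware rolling: the window is a 2-second span, not a row count.
series = pd.Series(
    [1.0, 2.0, 3.0],
    index=pd.to_datetime(["2016-10-03 09:00:00",
                          "2016-10-03 09:00:01",
                          "2016-10-03 09:00:05"]),
)
rolled = series.rolling("2s").sum()

# union_categoricals: combine categoricals with different category sets.
combined = union_categoricals([
    pd.Categorical(["a", "b"]),
    pd.Categorical(["b", "c"]),
])
```

Here joined pairs each trade with the last quote not later than it, rolled sums only the values that fall within the two seconds up to each timestamp, and combined carries the union of the two category sets.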

See the Whatsnew file for more information.

*How to get it:*

Source tarballs and windows/mac/linux wheels are available on PyPI (thanks
to Christoph Gohlke for the windows wheels, and to Matthew Brett for
setting up the mac/linux wheels).
Conda packages are already available via the conda-forge channel (conda
install pandas -c conda-forge). It will be available on the main channel
shortly.

*Issues:*

Please report any issues on our issue tracker:
https://github.com/pydata/pandas/issues

*Thanks to all the contributors:*

   - adneu
   - Adrien Emery
   - agraboso
   - Alex Alekseyev
   - Alex Vig
   - Allen Riddell
   - Amol
   - Amol Agrawal
   - Andy R. Terrel
   - Anthonios Partheniou
   - babakkeyvani
   - Ben Kandel
   - Bob Baxley
   - Brett Rosen
   - c123w
   - Camilo Cota
   - Chris
   - chris-b1
   - Chris Grinolds
   - Christian Hudon
   - Christopher C. Aycock
   - Chris Warth
   - cmazzullo
   - conquistador1492
   - cr3
   - Daniel Siladji
   - Douglas McNeil
   - Drewrey Lupton
   - dsm054
   - Eduardo Blancas Reyes
   - Elliot Marsden
   - Evan Wright
   - Felix Marczinowski
   - Francis T. O’Donovan
   - Gábor Lipták
   - Geraint Duck
   - gfyoung
   - Giacomo Ferroni
   - Grant Roch
   - Haleemur Ali
   - harshul1610
   - Hassan Shamim
   - iamsimha
   - Iulius Curt
   - Ivan Nazarov
   - jackieleng
   - Jeff Reback
   - Jeffrey Gerard
   - Jenn Olsen
   - Jim Crist
   - Joe Jevnik
   - John Evans
   - John Freeman
   - John Liekezer
   - Johnny Gill
   - John W. O’Brien
   - John Zwinck
   - Jordan Erenrich
   - Joris Van den Bossche
   - Josh Howes
   - Jozef Brandys
   - Kamil Sindi
   - Ka Wo Chen
   - Kerby Shedden
   - Kernc
   - Kevin Sheppard
   - Matthieu Brucher
   - Maximilian Roos
   - Michael Scherer
   - Mike Graham
   - Mortada Mehyar
   - mpuels
   - Muhammad Haseeb Tariq
   - Nate George
   - Neil Parley
   - Nicolas Bonnotte
   - OXPHOS
   - Pan Deng / Zora
   - Paul
   - Pauli Virtanen
   - Paul Mestemaker
   - Pawel Kordek
   - Pietro Battiston
   - pijucha
   - Piotr Jucha
   - priyankjain
   - Ravi Kumar Nimmi
   -