[Numpy-discussion] Proposal: stop providing official win32 downloads (for now)

2015-12-18 Thread Nathaniel Smith
Hi all,

I'm wondering what people think of the idea of us (= numpy) stopping
providing our "official" win32 builds (the "superpack installers"
distributed on sourceforge) starting with the next release.

These builds are:

- low quality: they're linked to an old & untuned build of ATLAS, so
linear algebra will be dramatically slower than builds using MKL or
OpenBLAS. They're win32 only and will never support win64. They're
using an ancient version of gcc. They will never support python 3.5 or
later.

- a dead end: there's a lot of work going on to solve the windows
build problem, and hopefully we'll have something better in the
short-to-medium-term future; but, any solution will involve throwing
out the current system entirely and switching to a new toolchain,
wheel-based distribution, etc.

- a drain on our resources: producing these builds is time-consuming
and finicky; I'm told that these builds alone are responsible for a
large proportion of the energy spent preparing each release, and take
away from other things that our release managers could be doing (e.g.
QA and backporting fixes).

So the idea would be that for 1.11, we create a 1.11 directory on
sourceforge and upload one final file: a README explaining the
situation, a pointer to the source releases on pypi, and some links to
places where users can find better-supported windows builds (Gohlke's
page, Anaconda, etc.). I think this would serve our users better than
the current system, while also freeing up a drain on our resources.

Thoughts?

-n

-- 
Nathaniel J. Smith -- http://vorpus.org
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Proposal: stop providing official win32 downloads (for now)

2015-12-18 Thread Peter Cock
On Fri, Dec 18, 2015 at 9:12 AM, Nathaniel Smith  wrote:

> Hi all,
>
> I'm wondering what people think of the idea of us (= numpy) stopping
> providing our "official" win32 builds (the "superpack installers"
> distributed on sourceforge) starting with the next release.
>
> These builds are:
>
> - low quality: they're linked to an old & untuned build of ATLAS, so
> linear algebra will be dramatically slower than builds using MKL or
> OpenBLAS. They're win32 only and will never support win64. They're
> using an ancient version of gcc. They will never support python 3.5 or
> later.
>
> - a dead end: there's a lot of work going on to solve the windows
> build problem, and hopefully we'll have something better in the
> short-to-medium-term future; but, any solution will involve throwing
> out the current system entirely and switching to a new toolchain,
> wheel-based distribution, etc.
>
> - a drain on our resources: producing these builds is time-consuming
> and finicky; I'm told that these builds alone are responsible for a
> large proportion of the energy spent preparing each release, and take
> away from other things that our release managers could be doing (e.g.
> QA and backporting fixes).
>
> So the idea would be that for 1.11, we create a 1.11 directory on
> sourceforge and upload one final file: a README explaining the
> situation, a pointer to the source releases on pypi, and some links to
> places where users can find better-supported windows builds (Gohlke's
> page, Anaconda, etc.). I think this would serve our users better than
> the current system, while also freeing up a drain on our resources.
>
> Thoughts?
>
> -n
>


Hi Nathaniel,

Speaking as a downstream library (Biopython) using the NumPy
C API, we have to ensure binary compatibility with your releases.

We've continued to produce our own 32-bit Windows installers -
originally the .exe kind (from python setup.py bdist_wininst) but
now also .msi (from python setup.py bdist_msi).

However, in the absence of an official 64-bit Windows NumPy
installer, we've simply pointed people at Christoph Gohlke's stack
http://www.lfd.uci.edu/~gohlke/pythonlibs/ and will likely also start
to recommend using Anaconda.

This means we don't have any comparable download metrics
to gauge 32-bit vs 64-bit Windows usage, but personally I'm
quite happy for NumPy to phase out their 32-bit Windows
installers (and then we can do the same).

I hope we can follow NumPy's lead with wheel distribution etc.

Thanks,

Peter


Re: [Numpy-discussion] Proposal: stop providing official win32 downloads (for now)

2015-12-18 Thread Charles R Harris
On Fri, Dec 18, 2015 at 2:12 AM, Nathaniel Smith  wrote:

> Hi all,
>
> I'm wondering what people think of the idea of us (= numpy) stopping
> providing our "official" win32 builds (the "superpack installers"
> distributed on sourceforge) starting with the next release.
>
> These builds are:
>
> - low quality: they're linked to an old & untuned build of ATLAS, so
> linear algebra will be dramatically slower than builds using MKL or
> OpenBLAS. They're win32 only and will never support win64. They're
> using an ancient version of gcc. They will never support python 3.5 or
> later.
>
> - a dead end: there's a lot of work going on to solve the windows
> build problem, and hopefully we'll have something better in the
> short-to-medium-term future; but, any solution will involve throwing
> out the current system entirely and switching to a new toolchain,
> wheel-based distribution, etc.
>
> - a drain on our resources: producing these builds is time-consuming
> and finicky; I'm told that these builds alone are responsible for a
> large proportion of the energy spent preparing each release, and take
> away from other things that our release managers could be doing (e.g.
> QA and backporting fixes).
>

Once numpy-vendor is set up, preparing and running the builds takes about
fifteen minutes on my machine. That assumes familiarity with the process; a
first-time user will spend significantly more time. Most of the work in a
release is keeping track of reported bugs and fixing them. Tracking
deprecations and such also takes time.


> So the idea would be that for 1.11, we create a 1.11 directory on
> sourceforge and upload one final file: a README explaining the
> situation, a pointer to the source releases on pypi, and some links to
> places where users can find better-supported windows builds (Gohlke's
> page, Anaconda, etc.). I think this would serve our users better than
> the current system, while also freeing up a drain on our resources.
>

What about beta releases? I have nothing against offloading part of the
release process, but if we do, we need to determine how to coordinate it
among the different parties, which might be something of a time sink in
itself.

Chuck


Re: [Numpy-discussion] numpy.power -> numpy.random.choice Probabilities don't sum to 1

2015-12-18 Thread Nathaniel Smith
On Fri, Dec 18, 2015 at 1:25 PM, Ryan R. Rosario  wrote:
> Hi,
>
> I have a matrix whose entries I must raise to a certain power and then 
> normalize by row. After I do that, when I pass some rows to 
> numpy.random.choice, I get a ValueError: probabilities do not sum to 1.
>
> I understand that floating point is not perfect, and my matrix is so large 
> that I cannot use np.longdouble because I will run out of RAM.
>
> As an example on a smaller matrix:
>
> np.power(mymatrix, 10, out=mymatrix)
> row_normalized = np.apply_along_axis(lambda x: x / np.sum(x), 1, mymatrix)

I'm sorry I don't have a solution to your actual problem off the top
of my head, but it's probably helpful in general to know that a better
way to write this would be just

  row_normalized = mymatrix / np.sum(mymatrix, axis=1, keepdims=True)

apply_along_axis is slow and can almost always be replaced by a
broadcasting expression like this.
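For concreteness, a small side-by-side of the two approaches (the matrix values here are arbitrary, just for illustration):

```python
import numpy as np

mymatrix = np.arange(12, dtype=np.float64).reshape(3, 4) + 1.0

# Slow: apply_along_axis calls the Python lambda once per row.
slow = np.apply_along_axis(lambda x: x / np.sum(x), 1, mymatrix)

# Fast: keepdims=True leaves the row sums with shape (3, 1), so
# broadcasting divides every row by its own sum in one vectorized step.
fast = mymatrix / np.sum(mymatrix, axis=1, keepdims=True)

assert np.allclose(slow, fast)
```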

> sums = row_normalized.sum(axis=1)
> sums[np.where(sums != 1)]

And here you can just write

  sums[sums != 1]

i.e. the call to where() isn't doing anything useful.
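As for the original tolerance error: one workaround that often suffices in practice is to renormalize each row in float64 immediately before sampling, since np.random.choice validates the probabilities' sum in double precision. This is an illustrative sketch, not a recipe from this thread; the `row` data below is made up:

```python
import numpy as np

# Hypothetical stand-in for one row of the large float32 matrix.
rng = np.random.RandomState(0)
row = rng.rand(1000).astype(np.float32) ** 10

# float32 row sums can land slightly off 1.0 even after normalizing.
# Upcasting just the one row being sampled to float64 and renormalizing
# keeps memory use low while satisfying choice's sum check.
p = row.astype(np.float64)
p /= p.sum()

sample = np.random.choice(len(p), size=1, p=p)
assert 0 <= sample[0] < len(p)
```

Only one row is upcast at a time, so this avoids holding a float64 (or longdouble) copy of the entire matrix in RAM.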

-n

-- 
Nathaniel J. Smith -- http://vorpus.org


Re: [Numpy-discussion] Proposal: stop providing official win32 downloads (for now)

2015-12-18 Thread Ian Henriksen
On Fri, Dec 18, 2015 at 3:27 PM Nathaniel Smith  wrote:

> On Dec 18, 2015 2:22 PM, "Ian Henriksen" <
> insertinterestingnameh...@gmail.com> wrote:
> >
> > An appveyor setup is a great idea. An appveyor build matrix with the
> > various supported MSVC versions would do a lot more to prevent
> > compatibility issues than periodically building installers with old
> versions of
> > MinGW. The effort toward a MinGW-based build is valuable, but having a
> > CI system test for MSVC compatibility will be valuable regardless of
> where
> > things go with that.
>
> Yes, definitely. Would you by chance have any interest in getting this set
> up?
>
> -n


I'll take a look at setting that up. On the other hand, getting everything
working with the various MSVC versions isn't likely to be a smooth sailing
process, so I can't guarantee anything.

Best,
-Ian


[Numpy-discussion] Introducing outer/orthogonal indexing to numpy

2015-12-18 Thread Sebastian Berg
Hello all,

sorry for cross posting (discussion should go to the numpy list). But I
would like to get a bit of discussion on the introduction of (mostly)
two new ways to index numpy arrays. This would also define a way for
code working with different array-likes, some of which implement outer
indexing (e.g. xray and dask, I believe), to avoid ambiguity.

The new methods are (names up for discussion):
  1. arr.oindex[...]
  2. arr.vindex[...]

The difference being that `oindex` will return outer/orthogonal type
indexing, while `vindex` would be a (hopefully) less confusing variant
of "fancy" indexing.

The biggest reason for introducing this is to provide `oindex`
for situations such as:
   >>> arr = np.arange(25).reshape((5, 5))
   >>> arr[[0, 1], [1, 2]]
   array([1, 7])
   >>> # While most might expect the result to be:
   >>> arr.oindex[[0, 1], [1, 2]]
   array([[1, 2],
          [6, 7]])
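Until such an attribute exists, today's NumPy can already produce the outer-style result with np.ix_, which is roughly what `oindex` would make ergonomic. A sketch based on the example above:

```python
import numpy as np

arr = np.arange(25).reshape((5, 5))

# Current "fancy" indexing pairs the index arrays element-wise:
assert arr[[0, 1], [1, 2]].tolist() == [1, 7]

# np.ix_ builds broadcastable index arrays, producing the
# outer/orthogonal result the proposed arr.oindex would return:
assert arr[np.ix_([0, 1], [1, 2])].tolist() == [[1, 2], [6, 7]]
```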

To provide backwards compatibility the current plan is to also introduce
`arr.legacy_index[...]` or similar, with the (long term) plan to force
the users to explicitly choose `oindex`, `vindex`, or `legacy_index` if
the indexing operation is otherwise not well defined.

There are still some open questions for me, for example:
 * the exact timeline (should we start deprecation immediately, etc.)
 * the handling of boolean indexing arrays
 * questions that might crop up about other array-likes/subclasses
 * are there related indexing needs that we are forgetting?

More details on the current status of my NEP, which has a lot of examples,
can be found at:
https://github.com/numpy/numpy/pull/6256/files?short_path=01e4dd9#diff-01e4dd9d2ecf994b24e5883f98f789e6
and comments are very welcome.

There is a fully functional implementation available at
https://github.com/numpy/numpy/pull/6075 and you can test it using
(after cloning numpy):

git fetch upstream pull/6075/head:pr-6075 && git checkout pr-6075;
python runtests.py --ipython
# Inside ipython (to see the deprecations):
import warnings; warnings.simplefilter("always")


My current hope for going forward is to get clear feedback of what is
wanted, for the naming and generally from third party module people, so
that we can polish up the NEP and the community can accept it.
With good feedback, I think we may be able to get the new attributes
into 1.11.

So if you are interested and have suggestions for the names,
or have thoughts about subclasses, or... please share your thoughts! :)

Regards,

Sebastian




[Numpy-discussion] 7th Annual Scientific Software Days Conference (25-26 February, 2016 -- Austin, TX)

2015-12-18 Thread Damon McDougall
The 7th Annual Scientific Software Days Conference (SSD) targets users
and developers of scientific software. The conference will be held at the
University of Texas at Austin, Thursday Feb 25 - Friday Feb 26, 2016, and
focuses on two themes:

 a) sharing best practices across scientific software communities;
 b) sharing the latest tools and technology relevant to scientific
software.

Past keynote speakers include Greg Wilson (2008), Victoria Stodden
(2009), Steve Easterbrook (2010), Fernando Perez (2011), Will Schroeder
(2012), and Neil Chue Hong (2013).

This year's list of speakers includes:

- Brian Adams (Sandia, Dakota):
http://www.sandia.gov/~briadam/index.html
- Iain Dunning (MIT, Julia Project): http://iaindunning.com/
- Victor Eijkhout (TACC): http://pages.tacc.utexas.edu/~eijkhout/
- Robert van de Geijn (keynote, UT Austin, libflame):
https://www.cs.utexas.edu/users/rvdg/
- Jeff Hammond (Intel, nwchem): https://jeffhammond.github.io/
- Mark Hoemmen (keynote, Sandia, Trilinos):
https://plus.google.com/+MarkHoemmen
- James Howison (UT Austin): http://james.howison.name/
- Fernando Perez (Berkeley, IPython): http://fperez.org/
- Cory Quammen (Kitware, Paraview/VTK):
http://www.kitware.com/company/team/quammen.html
- Ridgway Scott (UChicago, FEniCS): http://people.cs.uchicago.edu/~ridg/
- Roy Stogner (UT Austin, LibMesh):
https://scholar.google.com/citations?user=XcurJI0J

In addition, we solicit poster submissions that share novel uses of
scientific software.  Please send an abstract of less than 250 words to
ssd-organiz...@googlegroups.com.

Limited travel funding for students and early career researchers who
present posters will be available.

Early-bird registration fees (before Feb 10th):
Students: $35
Everyone else: $50

Late registration fees (Feb 10th onwards):
Students: $55
Everyone else: $70

More details, including how to register, will appear on the website in
the coming weeks: http://scisoftdays.org/

Regards,
S. Fomel (UTexas), T. Isaac (UChicago), M. Knepley (Rice), R. Kirby
(Baylor), Y. Lai (UTexas), K. Long (Texas Tech), D. McDougall (UTexas),
J. Stewart (Sandia)


Re: [Numpy-discussion] Proposal: stop providing official win32 downloads (for now)

2015-12-18 Thread Ralf Gommers
On Fri, Dec 18, 2015 at 5:55 PM, Charles R Harris  wrote:

>
>
> On Fri, Dec 18, 2015 at 2:12 AM, Nathaniel Smith  wrote:
>
>> Hi all,
>>
>> I'm wondering what people think of the idea of us (= numpy) stopping
>> providing our "official" win32 builds (the "superpack installers"
>> distributed on sourceforge) starting with the next release.
>>
>
+1 from me. Despite the number of downloads still being high, I don't think
there's too much value in these binaries anymore. We have been recommending
Anaconda/Canopy for a couple of years now, and that's almost always a much
better option for users.


>
>> These builds are:
>>
>> - low quality: they're linked to an old & untuned build of ATLAS, so
>> linear algebra will be dramatically slower than builds using MKL or
>> OpenBLAS. They're win32 only and will never support win64. They're
>> using an ancient version of gcc. They will never support python 3.5 or
>> later.
>>
>> - a dead end: there's a lot of work going on to solve the windows
>> build problem, and hopefully we'll have something better in the
>> short-to-medium-term future; but, any solution will involve throwing
>> out the current system entirely and switching to a new toolchain,
>> wheel-based distribution, etc.
>>
>> - a drain on our resources: producing these builds is time-consuming
>> and finicky; I'm told that these builds alone are responsible for a
>> large proportion of the energy spent preparing each release, and take
>> away from other things that our release managers could be doing (e.g.
>> QA and backporting fixes).
>>
>
> Once numpy-vendor is set up, preparing and running the builds take about
> fifteen minutes on my machine.
>

Well, it builds, but the current setup is just broken. Try building a binary
and running the tests - you should find that there's a segfault in the
np.fromfile tests (see https://github.com/scipy/scipy/issues/5540). And
that kind of thing is incredibly painful to debug and fix.


> That assumes familiarity with the process, a first time user will spend
> significantly more time. Most of the work  in a release is keeping track of
> reported bugs and fixing them. Tracking deprecations and such also takes
> time.
>
>
>> So the idea would be that for 1.11, we create a 1.11 directory on
>> sourceforge and upload one final file: a README explaining the
>> situation, a pointer to the source releases on pypi, and some links to
>> places where users can find better-supported windows builds (Gohlke's
>> page, Anaconda, etc.). I think this would serve our users better than
>> the current system, while also freeing up a drain on our resources.
>>
>
> What about beta releases? I have nothing against offloading part of the
> release process, but if we do, we need to determine how to coordinate it
> among the different parties, which might be something of a time sink in
> itself.
>

We need to ensure that the MSVC builds work. But that's not new; that was
always necessary for a release. Christoph has always tested beta/rc
releases, which is super helpful, but we need to get Appveyor CI working
soon.

Ralf


Re: [Numpy-discussion] Proposal: stop providing official win32 downloads (for now)

2015-12-18 Thread Ian Henriksen
On Fri, Dec 18, 2015 at 2:51 PM Ralf Gommers  wrote:

> On Fri, Dec 18, 2015 at 5:55 PM, Charles R Harris <
> charlesr.har...@gmail.com> wrote:
>
>>
>>
>> On Fri, Dec 18, 2015 at 2:12 AM, Nathaniel Smith  wrote:
>>
>>> Hi all,
>>>
>>> I'm wondering what people think of the idea of us (= numpy) stopping
>>> providing our "official" win32 builds (the "superpack installers"
>>> distributed on sourceforge) starting with the next release.
>>>
>>
> +1 from me. Despite the number of downloads still being high, I don't
> think there's too much value in these binaries anymore. We have been
> recommending Anaconda/Canopy for a couple of years now, and that's almost
> always a much better option for users.
>
>
>>
>>> These builds are:
>>>
>>> - low quality: they're linked to an old & untuned build of ATLAS, so
>>> linear algebra will be dramatically slower than builds using MKL or
>>> OpenBLAS. They're win32 only and will never support win64. They're
>>> using an ancient version of gcc. They will never support python 3.5 or
>>> later.
>>>
>>> - a dead end: there's a lot of work going on to solve the windows
>>> build problem, and hopefully we'll have something better in the
>>> short-to-medium-term future; but, any solution will involve throwing
>>> out the current system entirely and switching to a new toolchain,
>>> wheel-based distribution, etc.
>>>
>>> - a drain on our resources: producing these builds is time-consuming
>>> and finicky; I'm told that these builds alone are responsible for a
>>> large proportion of the energy spent preparing each release, and take
>>> away from other things that our release managers could be doing (e.g.
>>> QA and backporting fixes).
>>>
>>
>> Once numpy-vendor is set up, preparing and running the builds take about
>> fifteen minutes on my machine.
>>
>
> Well, it builds but the current setup is just broken. Try building a
> binary and running the tests - you should find that there's a segfault in
> the np.fromfile tests (see https://github.com/scipy/scipy/issues/5540).
> And that kind of thing is incredibly painful to debug and fix.
>
>
>> That assumes familiarity with the process, a first time user will spend
>> significantly more time. Most of the work  in a release is keeping track of
>> reported bugs and fixing them. Tracking deprecations and such also takes
>> time.
>>
>>
>>> So the idea would be that for 1.11, we create a 1.11 directory on
>>> sourceforge and upload one final file: a README explaining the
>>> situation, a pointer to the source releases on pypi, and some links to
>>> places where users can find better-supported windows builds (Gohlke's
>>> page, Anaconda, etc.). I think this would serve our users better than
>>> the current system, while also freeing up a drain on our resources.
>>>
>>
>> What about beta releases? I have nothing against offloading part of the
>> release process, but if we do, we need to determine how to coordinate it
>> among the different parties, which might be something of a time sink in
>> itself.
>>
>
> We need to ensure that the MSVC builds work. But that's not new, that was
> always necessary for a release. Christophe has always tested beta/rc
> releases which is super helpful, but we need to get Appveyor CI to work
> soon.
>
> Ralf
>
>


An appveyor setup is a great idea. An appveyor build matrix with the
various supported MSVC versions would do a lot more to prevent
compatibility issues than periodically building installers with old
versions of MinGW. The effort toward a MinGW-based build is valuable, but
having a CI system test for MSVC compatibility will be valuable regardless
of where things go with that.

Best,
-Ian


Re: [Numpy-discussion] Proposal: stop providing official win32 downloads (for now)

2015-12-18 Thread Nathaniel Smith
On Dec 18, 2015 2:22 PM, "Ian Henriksen" <
insertinterestingnameh...@gmail.com> wrote:
>
> An appveyor setup is a great idea. An appveyor build matrix with the
> various supported MSVC versions would do a lot more to prevent
> compatibility issues than periodically building installers with old
versions of
> MinGW. The effort toward a MinGW-based build is valuable, but having a
> CI system test for MSVC compatibility will be valuable regardless of where
> things go with that.

Yes, definitely. Would you by chance have any interest in getting this set
up?

-n


[Numpy-discussion] numpy.power -> numpy.random.choice Probabilities don't sum to 1

2015-12-18 Thread Ryan R. Rosario
Hi,

I have a matrix whose entries I must raise to a certain power and then 
normalize by row. After I do that, when I pass some rows to 
numpy.random.choice, I get a ValueError: probabilities do not sum to 1.

I understand that floating point is not perfect, and my matrix is so large that 
I cannot use np.longdouble because I will run out of RAM.

As an example on a smaller matrix:

np.power(mymatrix, 10, out=mymatrix)
row_normalized = np.apply_along_axis(lambda x: x / np.sum(x), 1, mymatrix)
sums = row_normalized.sum(axis=1)
sums[np.where(sums != 1)]

array([ 0.9994,  0.9994,  1.0012, ...,  0.9994,
        0.9994,  0.9994], dtype=float32)

np.random.choice(range(row_normalized.shape[0]), 1, p=row_normalized[0, :])
…
ValueError: probabilities do not sum to 1


I also tried the normalize function in sklearn.preprocessing and have the same 
problem.

Is there a way to avoid this problem without having to make manual
adjustments to get the row sums to equal exactly 1?

— Ryan


Re: [Numpy-discussion] Proposal: stop providing official win32 downloads (for now)

2015-12-18 Thread Benjamin Root
I believe that a lot can be learned from matplotlib's recent foray into
appveyor. Don't hesitate to ask questions on our dev mailing list (I wasn't
personally involved, so I don't know what was learned).

Cheers!
Ben Root

On Fri, Dec 18, 2015 at 5:27 PM, Nathaniel Smith  wrote:

> On Dec 18, 2015 2:22 PM, "Ian Henriksen" <
> insertinterestingnameh...@gmail.com> wrote:
> >
> > An appveyor setup is a great idea. An appveyor build matrix with the
> > various supported MSVC versions would do a lot more to prevent
> > compatibility issues than periodically building installers with old
> versions of
> > MinGW. The effort toward a MinGW-based build is valuable, but having a
> > CI system test for MSVC compatibility will be valuable regardless of
> where
> > things go with that.
>
> Yes, definitely. Would you by chance have any interest in getting this set
> up?
>
> -n
>