from:"Travis Oliphant"

Re: [Numpy-discussion] guvectorize, a helper for writing generalized ufuncs

2016-09-13 Thread Travis Oliphant

There has been some discussion on the Numba mailing list as well about a
version of guvectorize that doesn't compile for testing and flexibility.

Having this be inside NumPy itself seems ideal.
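
For illustration only, a non-compiling helper along these lines might look
like the sketch below. It assumes a NumPy release in which np.vectorize
accepts a signature argument; the name guvectorize_py is hypothetical and is
not an existing NumPy or Numba function.

```python
import numpy as np


def guvectorize_py(signature):
    """Hypothetical helper: wrap a pure-Python core function as a
    gufunc-like callable without compiling anything."""
    def decorator(func):
        # np.vectorize loops over the non-core dimensions in Python and
        # calls func once per core-dimension slice.
        return np.vectorize(func, signature=signature)
    return decorator


@guvectorize_py("(n),(n)->()")
def inner1d(a, b):
    # Core operation on 1-d slices: a plain dot product.
    return np.sum(a * b)


x = np.arange(12.0).reshape(3, 4)
y = np.ones(4)
result = inner1d(x, y)  # y is broadcast against the 3 rows of x
```

Every call to the core function still goes through Python, so a helper like
this is aimed at testing and flexibility rather than speed.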

-Travis


On Tue, Sep 13, 2016 at 12:59 PM, Stephan Hoyer <sho...@gmail.com> wrote:

> On Tue, Sep 13, 2016 at 10:39 AM, Nathan Goldbaum <nathan12...@gmail.com>
> wrote:
>
>> I'm curious whether you have a plan to deal with the Python function
>> call overhead. Numba gets around this by JIT-compiling Python functions -
>> is there something analogous you can do in NumPy or will this always be
>> limited by the overhead of repeatedly calling a Python implementation of
>> the "core" operation?
>>
>
> I don't think there is any way to avoid this in NumPy proper, but that's
> OK (it's similar to the existing overhead of vectorize).
>
> Numba already has guvectorize (and its own version of vectorize as well),
> which already does exactly this.
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> https://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>


-- 

*Travis Oliphant, PhD*
*Co-founder and CEO*


@teoliphant
512-222-5440
http://www.continuum.io
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Custom Dtype/Units discussion

2016-07-11 Thread Travis Oliphant

 while I originally
wrote NumPy (borrowing heavily from Numeric and drawing inspiration from
Numarray and receiving a lot of help for specific modules from many of
you), the community has continued to develop NumPy and now has a proper
governance model.   I am now simply an interested NumPy user and previous
NumPy developer who finally has some concrete ideas to share based on work
that I have been funding, leading, and encouraging for the past several
years.

I am still very interested in helping NumPy progress, but we are also going
to be taking these ideas to create a general concept of the "buffer
protocol in Python" that enables cross-language code-sharing and more
code re-use for data analytics among language communities. This is the
concept of "data-fabric", which is pre-alpha vapor-ware at this point, but
some ideas are expressed at http://datashape.pydata.org and here:
https://github.com/blaze/datafabric and it is something DyND is enabling.

NumPy itself has a clear governance model and whether NumPy (the project)
adopts any of the new array-computing concepts I am proposing will depend
on this community's decisions as well as work done by motivated developers
willing to work on prototypes. I will be willing to help get funding for
someone motivated to work on this.

Best,

-Travis




> Chuck
>

Re: [Numpy-discussion] Changing FFT cache to a bounded LRU cache

2016-06-01 Thread Travis Oliphant

Hi all,

At Continuum we are trying to coordinate with Intel about releasing our
patches from Accelerate upstream as well rather than having them redo
things we have already done but have just not been able to open source yet.

Accelerate also uses GPU accelerated FFTs and it would be nice if there
were a supported NumPy-way of plugging in these optimized approaches.
This is not a trivial thing to do, though, and there are a lot of design
choices.

We have been giving away Accelerate to academics since it was released but
have asked companies to pay for it as a means of generating money to
support open source. Several things that used to be in Accelerate only
are now already in open-source (e.g. cuda.jit, guvectorize, target='cuda'
and target='parallel' in numba.vectorize). I expect this trend will
continue. The FFT enhancements are another item on the list of
things to make open source.

I, for one, welcome Intel's contributions and am enthusiastic about their
joining the Python development community. In many cases it would be
better if they would just pay a company that already has built and tested
this capability to release it than develop things themselves yet again.
Any encouragement that can be given to Intel to move in this
direction would help.

Many companies are now supporting open-source.   Even those that sell some
software are still contributing overall to ensure that the total amount of
useful open-source software available is increasing.

Best,

-Travis

On Wed, Jun 1, 2016 at 7:42 PM, Nathaniel Smith <n...@pobox.com> wrote:

> On Jun 1, 2016 4:47 PM, "David Cournapeau" <courn...@gmail.com> wrote:
> >
> >
> >
> > On Tue, May 31, 2016 at 10:36 PM, Sturla Molden <sturla.mol...@gmail.com>
> > wrote:
> >>
> >> Joseph Martinot-Lagarde <contreba...@gmail.com> wrote:
> >>
> >> > The problem with FFTW is that its license is more restrictive (GPL),
> >> > and because of this may not be suitable everywhere numpy.fft is.
> >>
> >> A lot of us use NumPy linked with MKL or Accelerate, both of which have
> >> some really nifty FFTs. And the license issue is hardly any worse than
> >> linking with them for BLAS and LAPACK, which we do anyway. We could
> >> extend numpy.fft to use MKL or Accelerate when they are available.
> >
> >
> > That's what we used to do in scipy, but it was a PITA to maintain.
> > Contrary to blas/lapack, fft does not have a standard API, hence exposing
> > a consistent API in python, including data layout, involved quite a bit
> > of work.
> >
> > It is better to expose those through 3rd party APIs.
>
> Fwiw Intel's new python distribution thing has numpy patched to use mkl
> for fft, and they're interested in pushing the relevant changes upstream.
>
> I have no idea how maintainable their patches are, since I haven't seen
> them -- this is just from talking to people here at pycon.
>
> -n
>

[Numpy-discussion] Blog-post that explains what Blaze actually is and where Pluribus project now lives.

2016-03-29 Thread Travis Oliphant

I have emailed this list in the past explaining what is driving my open
source efforts now.

Here is a blog-post that may help some of you understand a little bit of
the history of Blaze, DyND, Numba, and other related developments as they
relate to scaling up and scaling out array-computing in Python.

http://technicaldiscovery.blogspot.com/2016/03/anaconda-and-hadoop-story-of-journey.html

This post and these projects do not have anything to do with the future of
the NumPy and/or SciPy projects which are now in great hands guiding their
community-driven development.

The post is, however, a discussion of additional projects that will
hopefully benefit some of you as well, and for which your feedback and
assistance is welcome.

Best,

-Travis



Re: [Numpy-discussion] Changes to generalized ufunc core dimension checking

2016-03-19 Thread Travis Oliphant

On Wed, Mar 16, 2016 at 3:07 PM, Charles R Harris <charlesr.har...@gmail.com
> wrote:

>
>
> On Wed, Mar 16, 2016 at 1:48 PM, Travis Oliphant <tra...@continuum.io>
> wrote:
>
>>
>>
>> On Wed, Mar 16, 2016 at 12:55 PM, Nathaniel Smith <n...@pobox.com> wrote:
>>
>>> Hi Travis,
>>>
>>> On Mar 16, 2016 9:52 AM, "Travis Oliphant" <tra...@continuum.io> wrote:
>>> >
>>> > Hi everyone,
>>> >
>>> > Can you help me understand why the stricter changes to generalized
>>> ufunc argument checking no longer allow scalars to be interpreted as
>>> 1-d arrays in the core-dimensions?
>>> >
>>> > Is there a way to specify in the core-signature that scalars should be
>>> allowed and interpreted in those cases as an array with all the elements
>>> the same?   This seems like an important feature.
>>>
>>> Can you share some example of when this is useful?
>>>
>>
>> Being able to implicitly broadcast scalars to arrays is the core-function
>> of broadcasting. This is still very useful when you have a core-kernel
>> and want to pass in a scalar for many of the arguments.   It seems that at
>> least in that case, automatic broadcasting should be allowed --- as it
>> seems clear what is meant.
>>
>> While you can use the broadcast* features to get the same effect with the
>> current code-base, this is not intuitive to a user who is used to having
>> scalars interpreted as arrays in other NumPy operations.
>>
>
> The `@` operator doesn't allow that.
>
>
>>
>> It used to automatically happen and a few people depended on it in
>> several companies and so the 1.10 release broke their code.
>>
>> I can appreciate that in the general case, allowing arbitrary
>> broadcasting on the internal core dimensions can create confusion.  But,
>> scalar broadcasting still makes sense.
>>
>
> Mixing array multiplications with scalar broadcasting is looking for
> trouble. Array multiplication needs strict dimensions and having stacked
> arrays and vectors was one of the prime objectives of gufuncs. Perhaps what
> we need is a more precise notation for broadcasting, maybe `*` or some such
> addition to the signatures to indicate that scalar broadcasting is
> acceptable.
>

I think that is a good idea. Let the user decide if scalar broadcasting
is acceptable for their function.

Here is a simple concrete example where scalar broadcasting makes sense:

A 1-d dot product (the core of np.inner)   (k), (k) -> ()

A user would assume they could call this function with a scalar in either
argument and have it broadcast to a 1-d array. Of course, if both
arguments are scalars, then it doesn't make sense.

Having a way for the user to allow scalar broadcasting seems sensible and a
nice compromise.

-Travis



>  
>
> Chuck
>
>

[Numpy-discussion] Changes to generalized ufunc core dimension checking

2016-03-19 Thread Travis Oliphant

Hi everyone,

Can you help me understand why the stricter changes to generalized ufunc
argument checking no longer allow scalars to be interpreted as 1-d
arrays in the core-dimensions?

Is there a way to specify in the core-signature that scalars should be
allowed and interpreted in those cases as an array with all the elements
the same?   This seems like an important feature.

Here's an example:

myfunc with core-signature (t),(k),(k) -> (t)

called with myfunc(arr1, arr2, scalar2).

This used to work in 1.9 and before and scalar2 was interpreted as a 1-d
array the same size as arr2.   It no longer works with 1.10.0 but I don't
see why that is an improvement.

Thoughts?   Is there a work-around that doesn't involve creating a 1-d
array the same size as arr2 and filling it with scalar2?

Thanks.

-Travis




Re: [Numpy-discussion] Changes to generalized ufunc core dimension checking

2016-03-19 Thread Travis Oliphant

On Wed, Mar 16, 2016 at 12:55 PM, Nathaniel Smith <n...@pobox.com> wrote:

> Hi Travis,
>
> On Mar 16, 2016 9:52 AM, "Travis Oliphant" <tra...@continuum.io> wrote:
> >
> > Hi everyone,
> >
> > Can you help me understand why the stricter changes to generalized ufunc
> argument checking no longer allow scalars to be interpreted as 1-d
> arrays in the core-dimensions?
> >
> > Is there a way to specify in the core-signature that scalars should be
> allowed and interpreted in those cases as an array with all the elements
> the same?   This seems like an important feature.
>
> Can you share some example of when this is useful?
>

Being able to implicitly broadcast scalars to arrays is the core-function
of broadcasting. This is still very useful when you have a core-kernel
and want to pass in a scalar for many of the arguments.   It seems that at
least in that case, automatic broadcasting should be allowed --- as it
seems clear what is meant.

While you can use the broadcast* features to get the same effect with the
current code-base, this is not intuitive to a user who is used to having
scalars interpreted as arrays in other NumPy operations.

It used to automatically happen and a few people depended on it in several
companies and so the 1.10 release broke their code.

I can appreciate that in the general case, allowing arbitrary broadcasting
on the internal core dimensions can create confusion.  But, scalar
broadcasting still makes sense.

> A better workaround would be to use one of the np.broadcast* functions to
> request exactly the broadcasting you want and make an arr2-sized view of
> the scalar. In this case where you presumably (?) want to allow the last
> two arguments to be broadcast against each other arbitrarily:
>
> arr2, arr3 = np.broadcast_arrays(arr2, scalar)
> myufunc(arr1, arr2, arr3)
>
> A little wordier than implicit broadcasting, but not as bad as manually
> creating an array, and like implicit broadcasting the memory overhead is
> O(1) instead of O(size).
>

Thanks for the pointer (after I wrote the email this solution also occurred
to me).   I think adding back automatic broadcasting for the scalar
case makes a lot of sense as well, however.   What do people think of that?

It would also help to add this example to the documentation as a work-around
for people whose code breaks with the new changes.
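
As a rough sketch of what such a documentation example could look like,
using the core-signature from the original question: myfunc here is only a
hypothetical stand-in built with np.vectorize (assuming a NumPy release that
supports its signature argument), and the core operation is made up for the
example.

```python
import numpy as np


def _core(t, a, b):
    # Made-up core operation for signature (t),(k),(k)->(t):
    # scale t by the dot product of the two length-k vectors.
    return t * np.dot(a, b)


# Hypothetical stand-in for a real gufunc with the same core signature.
myfunc = np.vectorize(_core, signature="(t),(k),(k)->(t)")

arr1 = np.arange(3.0)
arr2 = np.arange(4.0)
scalar2 = 2.0

# Explicitly request the broadcasting: scalar_b is an arr2-sized view of
# the scalar, so no arr2-sized copy is allocated.
arr2_b, scalar_b = np.broadcast_arrays(arr2, scalar2)
out = myfunc(arr1, arr2_b, scalar_b)
```

The broadcast view has a zero stride along its only axis, which is why the
memory overhead stays constant regardless of the size of arr2.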

Thanks,

-Travis

> -n
>

Re: [Numpy-discussion] Changes to generalized ufunc core dimension checking

2016-03-18 Thread Travis Oliphant

On Thu, Mar 17, 2016 at 4:41 PM, Stephan Hoyer <sho...@gmail.com> wrote:

> On Thu, Mar 17, 2016 at 1:04 AM, Travis Oliphant <tra...@continuum.io>
> wrote:
>
>> I think that is a good idea. Let the user decide if scalar
>> broadcasting is acceptable for their function.
>>
>> Here is a simple concrete example where scalar broadcasting makes sense:
>>
>>
>> A 1-d dot product (the core of np.inner)   (k), (k) -> ()
>>
>> A user would assume they could call this function with a scalar in either
>> argument and have it broadcast to a 1-d array. Of course, if both
>> arguments are scalars, then it doesn't make sense.
>>
>> Having a way for the user to allow scalar broadcasting seems sensible and
>> a nice compromise.
>>
>> -Travis
>>
>
> To generalize a little bit, consider the entire family of weighted
> statistical functions (mean, std, median, etc.). For example, the gufunc
> version of np.average is basically equivalent to np.inner with a bit of
> preprocessing.
>
> Arguably, it *could* make sense to broadcast weights when given a scalar:
> np.average(values, weights=1.0 / len(values)) is pretty unambiguous.
>
> That said, adding an explicit "scalar broadcasting OK flag" seems like a
> hack that will need even more special logic (e.g., so we can error if both
> arguments to np.inner are scalars).
>
> Multiple dispatch for gufunc core signatures seems like the cleaner
> solution. If you want np.inner to handle scalars, you need to supply core
> signatures (k),()->() and (),(k)->() along with (k),(k)->(). This is
> similar to the vision of three core signatures for np.matmul: (i),(i,j)->(j),
> (i,j),(j)->(i) and (i,j),(j,k)->(i,k).
>
> Maybe someone will even eventually get around to adding an axis/axes
> argument so we can specify these core dimensions explicitly. Writing
> np.inner(a, b, axes=((-1,), ())) could trigger the (k),()->() signature
> even if the second argument is not a scalar (it should be broadcast against
> "a" instead).
>

That's a great idea!

Adding multiple-dispatch capability for this case could also solve a lot of
issues that right now prevent generalized ufuncs from being the mechanism
of implementation of *all* NumPy functions.
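
To make the idea concrete, here is a minimal pure-Python sketch of
dispatching over the three core signatures mentioned above. It is not an
existing NumPy mechanism: inner_dispatch is a hypothetical name, and the
choice of signature is made simply from the number of dimensions of each
argument.

```python
import numpy as np


def inner_dispatch(a, b):
    """Hypothetical dispatcher over (k),(k)->(), (k),()->() and (),(k)->()."""
    a = np.asarray(a)
    b = np.asarray(b)
    if a.ndim == 0 and b.ndim == 0:
        # No core dimension anywhere, so none of the signatures applies.
        raise ValueError("at least one argument needs a core dimension")
    if b.ndim == 0:
        # (k),()->(): the scalar acts as a constant vector of length k.
        return a.sum(axis=-1) * b
    if a.ndim == 0:
        # (),(k)->()
        return a * b.sum(axis=-1)
    # (k),(k)->(): ordinary inner product over the last axis.
    return (a * b).sum(axis=-1)


values = np.array([1.0, 2.0, 3.0, 4.0])
# Scalar weights, as in the np.average example quoted above.
mean_like = inner_dispatch(values, 1.0 / len(values))
```

Erroring when both arguments are scalars falls out of the dispatch itself,
without a separate "scalar broadcasting OK" flag.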

-Travis





>

Re: [Numpy-discussion] 1.11.0 release notes.

2016-01-20 Thread Travis Oliphant

Impressive work! Thank you for all the hard work that went into these
improvements and releases.

-Travis


On Wed, Jan 20, 2016 at 12:32 PM, Charles R Harris <
charlesr.har...@gmail.com> wrote:

> Hi All,
>
> I've put up a PR with revised 1.11.0 release notes at
> https://github.com/numpy/numpy/pull/7073. I would appreciate it if anyone
> involved in the 1.11 release would take a look and note anything missing
> that they think should be included or things that are misrepresented.
>
> Chuck
>

Re: [Numpy-discussion] Should I use pip install numpy in linux?

2016-01-15 Thread Travis Oliphant

On Thu, Jan 14, 2016 at 12:58 PM, Matthew Brett <matthew.br...@gmail.com>
wrote:

> On Thu, Jan 14, 2016 at 9:14 AM, Chris Barker - NOAA Federal
> <chris.bar...@noaa.gov> wrote:
> >>> Also, you have the problem that there is one PyPI -- so where do you
> >>> put your nifty wheels that depend on other binary wheels? You may need
> >>> to fork every package you want to build :-(
> >>
> >> Is this a real problem or a theoretical one? Do you know of some
> >> situation where this wheel to wheel dependency will occur that won't
> >> just be solved in some other way?
> >
> > It's real -- at least during the whole bootstrapping period. Say I
> > build a nifty hdf5 binary wheel -- I could probably just grab the name
> > "libhdf5" on PyPI. So far so good. But the goal here would be to have
> > netcdf and pytables and GDAL and who knows what else then link against
> > that wheel. But those projects are all supported by different people,
> > that all have their own distribution strategy. So where do I put
> > binary wheels of each of those projects that depend on my libhdf5
> > wheel? _maybe_ I would put it out there, and it would all grow
> > organically, but neither the culture nor the tooling support that
> > approach now, so I'm not very confident you could gather adoption.
>
> I don't think there's a very large amount of cultural work - but some
> to be sure.
>
> We already have the following on OSX:
>
> pip install numpy scipy matplotlib scikit-learn scikit-image pandas h5py
>
> where all the wheels come from pypi.  So, I don't think this is really
> outside our range, even if the problem is a little more difficult for
> Linux.
>
> > Even beyond the adoption period, sometimes you need to do stuff in
> > more than one way -- look at the proliferation of channels on
> > Anaconda.org.
> >
> > This is more likely to work if there is a good infrastructure for
> > third parties to build and distribute the binaries -- e.g.
> > Anaconda.org.
>
> I thought that Anaconda.org allows pypi channels as well?
>

It does:   http://pypi.anaconda.org/

-Travis


>
> Matthew

Re: [Numpy-discussion] Should I use pip install numpy in linux?

2016-01-15 Thread Travis Oliphant

to do the same.   The only
thing I have heard are "chicken-and-egg" stories that come down to "we want
people to be able to use pip."   So, good, then let's make it so that pip
can install conda packages and that conda packages with certain
restrictions can be hosted on pypi or anywhere else that you have an
"index".   At least if there were valid reasons they could be addressed.
But, this head-in-the-sand attitude towards a viable technology that is
freely available is really puzzling to me.

There are millions of downloads of Anaconda and many millions of downloads
of conda packages each year.   That is just with one company doing it.
There could be many millions more with other companies and organizations
hosting conda packages and indexes. The conda user-base is already very
large.   A great benefit to the Python ecosystem would be to allow pip
users and conda users to share each other's work --- rather than to spend
time reproducing work that is already done and freely available.

-Travis





> -n
>

Re: [Numpy-discussion] Fast Access to Container of Numpy Arrays on Disk?

2016-01-14 Thread Travis Oliphant

__test
>>
>> So, the final disk usage is quite similar to NPZ, but it can store and
>> retrieve lots faster.  Also, the data decompression speed is on par with
>> using no compression.  This is because bcolz uses Blosc behind the scenes,
>> which is much faster than zlib (used by NPZ) --and sometimes faster than a
>> memcpy().  However, even though we are doing I/O against the disk, this
>> dataset is so small that it fits in the OS filesystem cache, so the
>> benchmark is actually checking I/O at memory speeds, not disk speeds.
>>
>> In order to do a more real-life comparison, let's use a dataset that is
>> much larger than the amount of memory in my laptop (8 GB):
>>
>> $ PYTHONPATH=. python key-store.py -f bcolz -m 100 -k 5000 -d
>> /media/faltet/docker/__test -l 0
>> ## Checking method: bcolz (via ctable(clevel=0, cname='blosclz')
>> 
>> Building database.  Wait please...
>> Time (creation) --> 133.650
>> Retrieving 100 keys in arbitrary order...
>> Time (   query) --> 2.881
>> Number of elements out of getitem: 91907396
>> faltet@faltet-Latitude-E6430:~/blosc/bcolz$ du -sh
>> /media/faltet/docker/__test
>>
>> 39G /media/faltet/docker/__test
>>
>> and now, with compression on:
>>
>> $ PYTHONPATH=. python key-store.py -f bcolz -m 100 -k 5000 -d
>> /media/faltet/docker/__test -l 9
>> ## Checking method: bcolz (via ctable(clevel=9, cname='blosclz')
>> 
>> Building database.  Wait please...
>> Time (creation) --> 145.633
>> Retrieving 100 keys in arbitrary order...
>> Time (   query) --> 1.339
>> Number of elements out of getitem: 91907396
>> faltet@faltet-Latitude-E6430:~/blosc/bcolz$ du -sh
>> /media/faltet/docker/__test
>>
>> 12G /media/faltet/docker/__test
>>
>> So, we are still seeing the 3x compression ratio.  But the interesting
>> thing here is that the compressed version takes roughly half the time of
>> the uncompressed one (13 ms/query vs 29 ms/query).  In this case I was
>> using an SSD (hence the low query times), so the compression advantage is
>> even more noticeable than when using memory as above (as expected).
>>
>> But anyway, this is just a demonstration that you don't need heavy tools
>> to achieve what you want.  And as a corollary, (fast) compressors can save
>> you not only storage, but processing time too.
>>
>> Francesc
>>
>>
>> 2016-01-14 11:19 GMT+01:00 Nathaniel Smith <n...@pobox.com>:
>>
>>> I'd try storing the data in hdf5 (probably via h5py, which is a more
>>> basic interface without all the bells-and-whistles that pytables
>>> adds), though any method you use is going to be limited by the need to
>>> do a seek before each read. Storing the data on SSD will probably help
>>> a lot if you can afford it for your data size.
>>>
>>> On Thu, Jan 14, 2016 at 1:15 AM, Ryan R. Rosario <r...@bytemining.com>
>>> wrote:
>>> > Hi,
>>> >
>>> > I have a very large dictionary that must be shared across processes
>>> and does not fit in RAM. I need access to this object to be fast. The key
>>> is an integer ID and the value is a list containing two elements, both of
>>> them numpy arrays (one has ints, the other has floats). The key is
>>> sequential, starts at 0, and there are no gaps, so the “outer” layer of
>>> this data structure could really just be a list with the key actually being
>>> the index. The lengths of each pair of arrays may differ across keys.
>>> >
>>> > For a visual:
>>> >
>>> > {
>>> > key=0:
>>> > [
>>> > numpy.array([1,8,15,…, 16000]),
>>> > numpy.array([0.1,0.1,0.1,…,0.1])
>>> > ],
>>> > key=1:
>>> > [
>>> > numpy.array([5,6]),
>>> > numpy.array([0.5,0.5])
>>> > ],
>>> > …
>>> > }
>>> >
>>> > I’ve tried:
>>> > -   manager proxy objects, but the object was so big that
>>> low-level code threw an exception due to format and monkey-patching wasn’t
>>> successful.
>>> > -   Redis, which was far too slow due to setting up connections
>>> and data conversion etc.
>>> > -   Numpy rec arrays + memory mapping, but there is a restriction
>>> that the numpy arrays in each “column” must be o

Re: [Numpy-discussion] Should I use pip install numpy in linux?

2016-01-11 Thread Travis Oliphant

Anaconda "build environment" was set up by Ilan and me. Aaron helped to
build packages while he was at Continuum but spent most of his time on the
open-source conda project.

It is important to understand the difference between Anaconda and conda in
this respect.   Anaconda is a particular dependency foundation that
Continuum supports and releases -- it will have a particular set of
expected libraries on each platform (we work to keep this fairly limited
and on Linux currently use CentOS 5 as the base).

conda is a general package manager that is open-source and that anyone can
use to produce a set of consistent binaries (there can be many conda-based
distributions).   It solves the problem of multiple binary dependency
chains generally using the concept of "features".  This concept of
"features" allows you to create environments with different base
dependencies.

What packages you install when you "conda install" depends on which
channels you are pointing to and which features you have "turned on" in the
environment.   It's a general system that extends the notions that were
started by the PyPA.

-Travis

On Sun, Jan 10, 2016 at 10:14 PM, Robert McGibbon <rmcgi...@gmail.com>
wrote:

> > > Right. There's a small problem which is that the base linux system
> > > isn't just "CentOS 5", it's "CentOS 5 and here's the list of libraries
> > > that you're allowed to link to: ...", where that list is empirically
> > > chosen to include only stuff that really is installed on ~all linux
> > > machines and for which the ABI really has been stable in practice over
> > > multiple years and distros (so e.g. no OpenSSL).
> > >
> > > Does anyone know who maintains Anaconda's linux build environment?
>
> > I strongly suspect it was originally set up by Aaron Meurer. Who
> > maintains it now that he is no longer at Continuum is a good question.
>
> From looking at all of the external libraries referenced by binaries
> included in Anaconda and the conda repos, I am not confident that they
> have a totally strict policy here, or at least not one that is enforced
> by tooling. The sonames I listed here
> <https://mail.scipy.org/pipermail/numpy-discussion/2016-January/074602.html>
> cover all of the external dependencies used by the latest Anaconda
> release, but earlier releases and other conda-installable packages from
> the default channel are not so strict.
>
> -Robert
>

Re: [Numpy-discussion] isfortran compatibility in numpy 1.10.

2015-10-30 Thread Travis Oliphant

As I posted to the github issue, I support #2 as it is the original
meaning. The most common case of isfortran that I recall was to support
transpositions that needed to occur before calling Fortran-compiled linear
algebra routines.

However, with that said, you could also reasonably do #1 and likely have no
real problem --- because transposing a 1-d array doesn't have any effect.

In NumPy 1.0.1, isfortran was intended to be True only for arrays with
a.ndim > 1. Thus, it would have been possible for someone to rely on that
invariant for some other reason.

With relaxed stride checking, this invariant changed because isfortran was
implemented by returning True if the F_Contiguous flag was set but the
C_Contiguous flag was not (this was only ever previously possible for
a.ndim > 1).

If you choose to go with #1, please emphasize in the release notes that
isfortran now does not assume a.ndim > 1 but is simply short-hand for
a.flags.f_contiguous.
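
To make the difference between the two options concrete, here is a small
sketch. The function names are hypothetical and only mirror the two
candidate definitions under discussion; they are not NumPy functions.

```python
import numpy as np


def isfortran_option1(a):
    # Option #1: simply short-hand for the F-contiguous flag.
    return bool(a.flags.f_contiguous)


def isfortran_option2(a):
    # Option #2: the original meaning, which also requires a.ndim > 1.
    return bool(a.flags.f_contiguous and a.ndim > 1)


v = np.arange(5)                                   # contiguous 1-d array
m = np.asfortranarray(np.arange(6).reshape(2, 3))  # Fortran-ordered 2-d array

# The two options agree on m and differ only for the 1-d array v.
results = {
    "v": (isfortran_option1(v), isfortran_option2(v)),
    "m": (isfortran_option1(m), isfortran_option2(m)),
}
```

The only case where the answers differ is the 1-d array, which is exactly
the case where a transpose would have no effect anyway.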

-Travis

On Fri, Oct 30, 2015 at 5:12 PM, Charles R Harris <charlesr.har...@gmail.com
> wrote:

> Hi All,
>
> The isfortran function calls a.fnc (Fortran-Not-C), which is implemented
> as  F_CONTIGUOUS &&  !C_CONTIGUOUS. Before relaxed stride checking
> contiguous multidimensional arrays could not be both and contiguous 1-D
> arrays were always CONTIGUOUS, but this is no longer the case.
> Consequently current isfortran breaks backward compatibility. There are two
> suggested solutions
>
>    1. Return `a.flags.f_contiguous`. This differs for 1-D arrays, but is
>    most consistent with the name isfortran.
>    2. Return `a.flags.f_contiguous and a.ndim > 1`, which would be
>    backward compatible.
>
> It is also possible to start with 2. but add a FutureWarning and later
> move to 1, which is my preferred solution. See gh-6590
> <https://github.com/numpy/numpy/issues/6590> for the issue.
>
> Thoughts?
>

Re: [Numpy-discussion] Numpy Generalized Ufuncs: Pointer Arithmetic and Segmentation Faults (Debugging?)

2015-10-25 Thread Travis Oliphant

Two things that might help you create generalized ufuncs:

1) Look at Numba --- it makes it very easy to write generalized ufuncs in
simple Python code.  Numba will compile to machine code so it can be as
fast as writing in C.   Here is the documentation for that specific
feature:
http://numba.pydata.org/numba-doc/0.21.0/user/vectorize.html#the-guvectorize-decorator.
One wart of the interface is that scalars need to be treated as
1-element 1-d arrays (but still use '()' in the signature).

2) Look at the linear algebra module in NumPy which now wraps a bunch of
linear-algebra based generalized ufuncs (all written in C):
https://github.com/numpy/numpy/blob/master/numpy/linalg/umath_linalg.c.src
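
To make the signature notation concrete, here is a minimal sketch of the `(i),(j),(),()->()` layout from the question quoted below. It is an illustration under stated assumptions, not Numba code and not the poster's implementation: it uses `np.vectorize` with its `signature` argument, which loops in pure Python and is therefore slow. The core function `_dtw_core`, the scalar parameters `window` (band half-width) and `p` (exponent on the pointwise distance), and the plain banded DTW recurrence are all hypothetical choices made for this example.

```python
import numpy as np

def _dtw_core(x, y, window, p):
    # Core operation on ONE pair of 1-D series. `window` and `p` are
    # hypothetical stand-ins for the two scalar parameters in the signature.
    n, m = len(x), len(y)
    w = max(int(window), abs(n - m))   # band must be wide enough to reach (n, m)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - w), min(m, i + w) + 1):
            d = abs(x[i - 1] - y[j - 1]) ** p
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

# Two vector inputs, two scalar inputs ('()' in the signature), one scalar output.
dist_dtw = np.vectorize(_dtw_core, signature="(i),(j),(),()->()")

batch = np.array([[0.0, 1.0, 2.0], [1.0, 1.0, 1.0]])   # two series of length 3
query = np.array([0.0, 1.0, 1.0, 2.0])                 # one series of length 4
print(dist_dtw(batch, query, 2, 1))                    # broadcasts over the batch
```

A pure-Python version like this can also serve as a reference when unit testing a compiled implementation, since both should agree on small inputs.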

-Travis



On Sun, Oct 25, 2015 at 7:06 AM, <eleanore.yo...@artorg.unibe.ch> wrote:

> Dear Numpy maintainers and developers,
>
> Thanks for providing such a great numerical library!
>
> I’m currently trying to implement the Dynamic Time Warping metric as a set
> of generalised numpy ufuncs, but unfortunately, I have lasting issues with
> pointer arithmetic and segmentation faults. Is there any way that I can
> use GDB or some such to debug a python/numpy extension? Furthermore: is it
> necessary to use pointer arithmetic to access the function arguments (as
> seen on http://docs.scipy.org/doc/numpy/user/c-info.ufunc-tutorial.html)
> or is element access (operator[]) also permissible?
>
> To break it down quickly, I need to have a fast DTW distance function
> dist_dtw() with two vector inputs (broadcasting should be possible), two
> scalar parameters and one scalar output (signature: (i), (j), (), () -> ())
> usable in python for a 1-Nearest Neighbor classification algorithm. The
> extension also implements two functions compute_envelope() and
> piecewise_mean_reduction() which are used for lower-bounding based on Keogh
> and Ratanamahatana, 2005. The source code is available at
> http://pastebin.com/MunNaP7V and the prominent segmentation fault happens
> somewhere in the chain dist_dtw() —> meta_dtw_dist() —> slow_dtw_dist(),
> but I fail to pin it down.
>
> Aside from my primary questions, I wonder how to approach
> errors/exceptions and unit testing when developing numpy ufuncs. Are there
> any examples apart from the numpy manual that I could use as reference
> implementations of generalised numpy ufuncs?
>
> I would greatly appreciate some insight into properly developing
> generalised ufuncs.
>
> Best,
> Eleanore
>
>

[Numpy-discussion] Let's move forward with the current governance document.

2015-10-02 Thread Travis Oliphant

Hi everyone,

After some further thought and spending quite a bit of time re-reading the
discussion on a few threads, I now believe that my request to be on the
steering council might be creating more trouble than it's worth.
Nothing matters to me more than seeing NumPy continue to grow and improve.

So, I'm switching my position to supporting the adoption of the governance
model outlined and just contributing as I can outside the steering council.
The people on the steering council are committed to the success of NumPy
and will do a great job --- they already have in contributing to the
community over the past year(s).   We can always revisit the question in a
year if difficulties arise with the model.

If my voice and other strong voices remain outside the council, perhaps we
can all help ensure that the intended community governance of NumPy does in
fact happen, and that most decisions continue to be made in the open.

I had the pleasure last night of meeting one of the new NumPy core
contributors, Allan Haldane.   This only underscored my confidence in
everyone who is contributing to NumPy today.   This confidence has already
been established by watching the great contributions of many talented
developers who have given their time and talents to the project over the
past several years.

I hope that we can move on from the governance discussion and continue to
promote the success of the project together.

Best,

-Travis

Re: [Numpy-discussion] composition of the steering council (was Re: Governance model request)

2015-09-29 Thread Travis Oliphant

Thanks for the candid discussion and for expressing concerns freely.

I think Nathaniel's "parenting" characterization of NumPy from me is pretty
accurate.   I do feel a responsibility for the *stuff* that's out there,
and that is what drives me.   I do see the many contributions from others
and really learn from them as well.

I have seen conversations on this list and others that contain
characterizations of history that I don't agree with, and which affect
decisions that are being made --- and so I feel compelled to try and share
my view.

I'm in a situation now where at least for 6 months or so I can help with
NumPy more than I have been able to for 7 years.

Focusing on the initial governance text, my issues are that

1) 1 year of inactivity as the threshold for removal from the council is too
short for a long-running project like NumPy --- somewhere between 2 and 4
years would be more appropriate.   I suppose 1 year of inactivity is fine if
that is defined only as "failure to vote on matters before the council".

2) The seed council should not just be recent contributors but should
include as many people as are willing to help who have a long history with
the project.

3) I think people who contribute significantly generally should be able to
re-join the steering council more easily than "going through the 1-year
vetting process" again --- they would have to be approved by the current
steering council but it should not be automatically disallowed (thus
requiring the equivalent of an amendment to change it).

I applaud the fact that the steering council will not be and should not be
used except when absolutely necessary and for limited functions.

Thanks,

-Travis

On Tue, Sep 29, 2015 at 4:06 AM, Nathaniel Smith <n...@pobox.com> wrote:

> On Fri, Sep 25, 2015 at 7:15 AM, Thomas Caswell <tcasw...@gmail.com>
> wrote:
> > To respond to the devil's advocate:
> >
> > Creating this organizational framework is a one-time boot-strapping
> > event. You could use wording like "The initial council will include
> > those who have made significant contributions to numpy in the past and
> > want to be on it" or "The initial council will be constructed by
> > invitation by Nathaniel and Chuck".  More objective criteria should be
> > used going forward, but in terms of getting things spun up quickly,
> > doing things by fiat is probably ok. I am not even sure that the method
> > by which the initial group is formed needs to go into the governing
> > document.
>
> The problem is that according to the current text, not only is Travis
> ineligible to join the council (it's a little weird to put people on
> the seed council who wouldn't be eligible to join it normally, but
> okay, sure) -- he's not even eligible to stay on the council once he
> joins. So we would need to change the text no matter what.
>
> Which we can do, if we decide that that's what we need to do to
> accomplish what we want. It's our text, after all. I think it's
> extremely important though that what we actually do, and what we write
> down saying we will do, somehow match. Otherwise this whole exercise
> has no point.
>
> -n
>
> --
> Nathaniel J. Smith -- http://vorpus.org

Re: [Numpy-discussion] composition of the steering council (was Re: Governance model request)

2015-09-24 Thread Travis Oliphant

>
>
>
> [1] Sorry to "footnote" this, but I think I am probably rudely repeating
> myself and frankly do **not want this to be discussed**. It is just to
> try to be fully clear where I come from:
> Until SciPy 2015, I could list many people on this list who have shown
> more direct involvement in numpy than Travis since I joined and have no
> affiliation to numpy. If Travis had been new to the community at the
> time, I would be surprised if I would even recognize his name.
> I know this is only half the picture and Travis already mentioned
> another side, but this is what I mostly saw even if it may be a harsh
> and rude assessment.
>
>
I do understand this.   That's actually why I'm speaking up, because I
don't think my activity has been understood by many people who have joined
this list only recently.   I don't want to interfere with your activity or
impede your progress, or to be asked permission for anything.   In fact, I
want to understand how to best use my limited time to support things.

You in particular are interested in indexing and fixing it --- the current
code is there for a reason and some of the issues being discussed today
have been discussed before --- though we have the benefit of hindsight now.


I have mostly been behind the scenes helping people since about 2010 ---
but still thinking a lot about NumPy, the downstream community, integration
with other libraries, and where things could go. I don't have the time
to commit major code changes, but I do have the time to contribute
perspective and even a design idea or two from time to time.   Obviously,
nobody has to listen.

I understand and appreciate that there are a lot of people that have
contributed code and discussion since 2009 and to them it probably seems
I'm just popping in and out --- and if you only look at the contributor log
you can wonder "who is this guy...". But, I did do *a lot* of work to
get NumPy off the ground.   Quite a bit of that work was very lonely with
people interested in the merger but pretty skeptical until the work was
nearly done (and then many people helped finish it and get it working and
tested). I wish I had been a better architect at the time (I can see
now many things that would have been done differently).   But, I'm still
proud of the work I did in creating a foundation many could build on --- at
the time nobody else was stepping up to do the job.

Since that time, I have remained very interested in the success of NumPy
and supporting the many *users* of NumPy. What I most bring to the
current community is having observed many, many uses of NumPy in the wild
--- from people who would never post to this list and whose use-cases are
absent from discussion or misunderstood.  I also bring knowledge about
the wider Python ecosystem and the broader world outside of NumPy alone.
The group is free to take my ideas and/or contributions or leave them.
And I am also free to just review pull requests and contribute if and when
I might.

Best,

-Travis








>
> >
> > Chuck
> >
> >
> >
> >
> >

Re: [Numpy-discussion] Governance model request

2015-09-23 Thread Travis Oliphant

On Wed, Sep 23, 2015 at 3:02 AM, Fernando Perez <fperez@gmail.com>
wrote:

> Hi all,
>
> I would like to pitch in here, I am sorry that I didn't have the time
> before...
>
> First, I want to disclose that recently Continuum made a research gift to
> the Jupyter project; we were just now writing up a blog post to acknowledge
> this, but in light of this discussion, I feel that I should say this up
> front so folks can gauge any potential bias accordingly.
>
>
> On Tue, Sep 22, 2015 at 3:44 AM, Travis Oliphant <tra...@continuum.io>
> wrote:
>
>> I'm actually offended that so many at BIDS seem eager to crucify my
>> intentions when I've done nothing but give away my time, my energy, my
>> resources, and my sleep to NumPy for many, many years.   I guess if your
>> intent is to drive me away, then you are succeeding.
>
>
> Travis, first, I'd like to kindly ask you not to conflate BIDS, an
> institution where a large number of people work, with the personal opinions
> of some, who happen to work there but in this case are speaking only for
> themselves.  You say "so many at BIDS", but as far as I know, your
> disagreements are with Stefan and Nathaniel (Matthew doesn't work at
> BIDS).  You are painting with a very wide brush the work of many people,
> and in the process, unfairly impacting others who have nothing to do with
> this.
>

I accept that criticism and apologize for doing that.   My *human* side was
coming out, and I was not being fair.  In my head, though, I was also
trying to illustrate how some seemed to be doing the same thing for
Continuum or other companies.   This did not come out very artfully in the
early morning hours.   I'm sorry.   BIDS is doing a lot for the
community --- the recent DS4DS workshop, for example, was a spectacularly
useful summit --- I hope that many different write-ups and reports of the
event make their way out into the world.

>
>
> 1. I hope the discussion can move past the suspicion and innuendo about
> Continuum and Travis.  I haven't always agreed with how Travis communicates
> some of his ideas, and I've said it to him in such instances (e.g. this
> weekend, as I myself was surprised at how his last round of comments had
> landed on the list a few days back).  But I also have worked closely with
> him for years because I know that he has proven, not in words, but in
> actions, that he has the best interests of our community at heart, and that
> he is willing to try and do everything in his power to help whenever he
> can.
>

I really hope it's just a perception problem (perhaps on my end).   There
are challenges with working in the commercial world (there are a lot of
things to do that have nothing to do with the technology creation) and
communicating on open-source mailing lists.   As many have noticed,
despite my intentions to contribute, I really can't do the same level of
contribution personally that I could when I was a student and a professor
and had more time.

However, I think that it is also under-appreciated (or mis-understood) how
much time I have spent training and helping people who have
contributed instead.   It's important to me to build a company that can
sponsor people to work on open-source (in a community setting).   We are
still working on that, but it has been my intent.  So far, it's actually
easier to sponsor new projects than it is to sponsor people on old
projects.   I am quite sure that if Continuum had put 3 people full time on
NumPy in 2012, there would have been a lot of back-lash and
mis-understanding.   That's why we didn't do it.   The collateral effect
of that was the creation of other tools that could be somewhat competitive
with NumPy long term -- or not.

I'd like to learn how to work with the community in an optimal way so that
everyone benefits --- and progress happens.  That's also why we created
Numfocus --- though it is ironic that NumPy has been one of the last
projects to actually sign up and be a formally sponsored project.

> 2. Conflicts of interest are a fact of life, in fact, I would argue that
> every healthy and sufficiently interconnected community eventually *should*
> have conflicts of interest. They are a sign that there is activity across
> multiple centers of interest, and individuals with connections in multiple
> areas of the community.  And we *want* folks who are engaged enough
> precisely to have such interests!
>
> For conflict of interest management, we don't need to reinvent the wheel,
> this is actually something where our beloved institutions, blessed be their
> bureaucratic souls, have tons of training materials that happen to be not
> completely useless.  Most universities and the national labs have
> information on COIs that provides guidelines, and Numpy could include in
>

Re: [Numpy-discussion] Steering Committee Size

2015-09-23 Thread Travis Oliphant

On Wed, Sep 23, 2015 at 3:25 AM, Sebastian Berg 
wrote:

> Hi,
>
> Trying to figure out at least a bit from  the discussions. While I am
> happy with the draft, I wonder if someone has some insights about some
> questions:
>
> 1. How large crowds have examples of working well with apache style voting?
>

I don't have experience with this, other than on the Python mailing list,
where it works fine.


> 2. How large do we expect numpy steering council to be (I have always
> thought about 10).
>

I don't know.  I don't think this is set as far as I'm aware.

> 3. More on opinions: how large does the community feel is too large (so
> that we should maybe elect people)?
>
> And, maybe more as a discussion point, does the community feel that those
> who would be/are effectively now in the Steering Council do not sufficiently
> represent old-time contributors who were not active in the past year(s)?
>

As I mentioned before, I am happy to serve on the initial seed council to
help transition more fully to this style of governance.

-Travis

Re: [Numpy-discussion] Governance model request

2015-09-23 Thread Travis Oliphant

>
>
> One last time, it was *not* a personal reference to you: the only reason I
> mentioned your names was because of the Berkeley clarification regarding
> BIDS that I asked of Travis, that's all.  If that comment hadn't been made,
>  I would not have made any mention whatsoever of anyone in particular.  I
> apologize for not foreseeing that this would have made you feel singled
> out, in retrospect, I should have.
>
> In my mind, it was the opposite, as I felt that you had every right to
> express whatever opinions you have speaking for yourselves, independent of
> your affiliations, and I was simply asking Travis to separate individuals
> from institutions.  But I should have realized that calling anyone out by
> name in a context like this is a bad idea regardless.
>
>
This was my fault for not being more careful in my words.   I felt multiple
things when I wrote my emails that led to incorrectly chosen words --- but
mostly I was feeling unappreciated, attacked, and accused.   I'm sure now
that was not intended --- but there have been mis-understandings.  I expect
they will happen again.   I know that if we listen to each other and trust
one another, then even though we may see the world differently and frame
solutions differently, we can work to coordinate on an important technical
activity together.

In retrospect, my initial email requesting inclusion on the seed council
could have been worded better (as there were multiple things conflated
together).   I am responding to the actual text of the governance document
in the other thread so as to clarify what my proposal actually is in the
context of that document.

-Travis

Re: [Numpy-discussion] interpretation of the draft governance document (was Re: Governance model request)

2015-09-23 Thread Travis Oliphant

Hi Nathaniel,

Thanks for the clarifications.   Is the governance document committed to
the repository?   I keep looking for it and have a hard time finding it ---
I think I read it last in an email.

In this way, I could make Pull Requests to the governance document if there
are concrete suggestions for change, and then have them reviewed in the
standard way.

I'm hopeful that a few tweaks to the document would satisfy all my
concerns.

Thanks,

-Travis




On Wed, Sep 23, 2015 at 1:04 PM, Nathaniel Smith <n...@pobox.com> wrote:

> Hi Travis,
>
> On Tue, Sep 22, 2015 at 3:08 AM, Travis Oliphant <tra...@continuum.io>
> wrote:
> >
> >
> > On Tue, Sep 22, 2015 at 4:33 AM, Nathaniel Smith <n...@pobox.com> wrote:
> >>
> >> On Tue, Sep 22, 2015 at 1:24 AM, Travis Oliphant <tra...@continuum.io>
> >> wrote:
> >>>
> >>> I actually do agree with your view of the steering council as
> >>> usually not really being needed.   You are creating a straw-man by
> >>> indicating otherwise.   I don't believe a small council should do
> >>> anything *except* resolve disputes that cannot be resolved without one.
> >>> Like you, I would expect that would almost never happen --- but I would
> >>> argue that extrapolating from Debian's experience is not actually
> >>> relevant here.
> >>
> >>
> >> To be clear, Debian was only one example -- what I'm extrapolating from
> >> is every community-driven F/OSS project that I'm aware of.
> >>
> >> It's entirely possible my data set is incomplete -- if you have some
> >> other examples that you think would be better to extrapolate from, then
> >> I'd be genuinely glad to hear them. You may have noticed that I'm a bit
> >> of an enthusiast on this topic :-).
> >>
> >
> >
> > Yes, you are much better at that than I am.   I'm not even sure where I
> > would look for this kind of data.
> >
> >>>
> >>>
> >>>
> >>> So, if the steering council is not really needed then why have it at
> >>> all? Let's just eliminate the concept entirely.
> >>>
> >>
> >> In my view, the reasons for having such a council are:
> >> 1) The framework is useful even if you never use it, because it means
> >> people can run "what if" scenarios in their mind and make decisions on
> >> that basis. In the US legal system, only a vanishingly small fraction of
> >> cases go to the Supreme Court -- but the rules governing the Supreme
> >> Court have a huge effect on all cases, because people can reason about
> >> what would happen *if* they tried to appeal to the Supreme Court.
> >
> >
> > O.K.  That is a good point.   I can see the value in that.
> >
> >
> >>
> >> 2) It provides a formal structure for interfacing with the outside
> >> world. E.g., one can't do anything with money or corporate contributions
> >> without having some kind of written-down and enforceable rules for
> >> making decisions (even if in practice you always stick to the "everyone
> >> is equal and we govern by consensus" part of the rules).
> >
> >
> > O.K.
> >
> >>
> >> 3) There are rare but important cases where discussions have to be had
> >> in private. The main one is "personnel decisions" like inviting people
> >> to join the council; another example Fernando has mentioned to me is
> >> that when they need to coordinate a press release between the project
> >> and a funding body, the steering council reviews the press release
> >> before it goes public.
> >
> >
> > O.K.
> >
> >
> >>
> >> That's pretty much it, IMO.
> >>
> >> The framework we all worked out at the dev meeting in Austin seems to
> >> handle these cases well AFAICT.
> >
> >
> > How did we "all" work it out when not everyone was there?   This is
> > where I get lost.   You talk about community decision making and yet any
> > actual decision is always made by a subset of the community.   I suppose
> > you just rely on the "if nobody complains then it's o.k." rule?   That
> > really only works if the project is moving slowly.
>
> By "all" I just meant "all of us who were there" (which was a majority
> of the active maintainers + a number of other interested parties --
> the list of attendees is in the meeting notes if you're cur

Re: [Numpy-discussion] composition of the steering council (was Re: Governance model request)

2015-09-23 Thread Travis Oliphant

>
> Regarding the seed council, I just tried to pick an objective
> criterion and an arbitrary date that seemed generally in keeping with
> idea of "should be active in the last 1-to-2-years-ish". Fiddling with
> the exact date in particular makes very little difference -- between
> pushing it back to 2 years ago today or forward to 1 year ago today,
> the only thing that changes is whether Pauli makes the list or not.
> (And Pauli is obviously a great council candidate, though I don't know
> whether he even wants to be on it.)
>
> > Personally, I have no idea how big the council should be. Too big, and
> > there is no point, consensus is harder to reach the larger the group,
> > and the main (only?) role of the council is to resolve issues where
> > consensus has not been reached in the larger community. But what is
> > too big?
>
>
> > As for make-up of the council, I think we need to expand beyond people
> > who have recently contributed core code.
> >
> > Yes, the council does need to have expertise to make technical
> > decisions, but if you think about the likely contentious issues like
> > ABI breakage, a core-code focused view is incomplete. So there should
> > be representation by:
> >
> > Someone(s) with a long history of working with the code -- that
> > institutional memory of why decisions were made the way they were
> > could be key.
>
> Sure -- though I can't really imagine any way of framing a rule like
> this that *wouldn't* be satisfied by Chuck + Ralf + Pauli, so my guess
> is that such a rule would not actually have any effect on the council
> membership in practice.
>

As the original author of NumPy, I would like to be on the seed council as
long as it is larger than 7 people.   That is my proposal.   I don't need
to be a permanent member, but I do believe I have enough history that I can
understand issues even if I haven't been working on code directly.

I think I do bring history and context that could be helpful on
occasion.   In addition, if a matter is important enough to even be brought
to the attention of this council, I would like to be involved in the
discussion about it.

It's a simple change to the text --- basically an explanation that Travis
requested to be on the seed council.

-Travis

Re: [Numpy-discussion] interpretation of the draft governance document (was Re: Governance model request)

2015-09-23 Thread Travis Oliphant

Council -- see below.
>
> Steering Council
> 
>
> The Project will have a Steering Council that consists of Project
> Contributors who have produced contributions that are substantial in
> quality and quantity, and sustained over at least one year. The overall
> role of the Council is to ensure, with input from the Community, the
> long-term well-being of the project, both technically and as a community.
>
> During the everyday project activities, council members participate in all
> discussions, code review and other project activities as peers with all
> other Contributors and the Community. In these everyday activities, Council
> Members do not have any special power or privilege through their membership
> on the Council. However, it is expected that because of the quality and
> quantity of their contributions and their expert knowledge of the Project
> Software and Services that Council Members will provide useful guidance,
> both technical and in terms of project direction, to potentially less
> experienced contributors.
>
> The Steering Council and its Members play a special role in certain
> situations. In particular, the Council may, if necessary:
>
> -   Make decisions about the overall scope, vision and direction of the
> project.
> -   Make decisions about strategic collaborations with other organizations
> or individuals.
> -   Make decisions about specific technical issues, features, bugs and
> pull requests. They are the primary mechanism of guiding the code review
> process and merging pull requests.
> -   Make decisions about the Services that are run by The Project and
> manage those Services for the benefit of the Project and Community.
> -   Update policy documents such as this one.
> -   Make decisions when regular community discussion doesn’t produce
> consensus on an issue in a reasonable time frame.
>
> However, the Council's primary responsibility is to facilitate the
> ordinary community-based decision making procedure described above. If we
> ever have to step in and formally override the community for the health of
> the Project, then we will do so, but we will consider reaching this point
> to indicate a failure in our leadership.
>
> ### Council decision making
>
> If it becomes necessary for the Steering Council to produce a formal
> decision, then they will use a form of the [Apache Foundation voting
> process](https://www.apache.org/foundation/voting.html). This is a
> formalized version of consensus, in which +1 votes indicate agreement, -1
> votes are vetoes (and must be accompanied with a rationale, as above), and
> one can also vote fractionally (e.g. -0.5, +0.5) if one wishes to express
> an opinion without registering a full veto. These numeric votes are also
> often used informally as a way of getting a general sense of people's
> feelings on some issue, and should not normally be taken as formal votes. A
> formal vote only occurs if explicitly declared, and if this does occur then
> the vote should be held open for long enough to give all interested Council
> Members a chance to respond -- at least one week.
>
> In practice, we anticipate that for most Steering Council decisions (e.g.,
> voting in new members) a more informal process will suffice.
>
> ### Council membership
>
> To become eligible to join the Steering Council, an individual must be a
> Project Contributor who has produced contributions that are substantial in
> quality and quantity, and sustained over at least one year. Potential
> Council Members are nominated by existing Council members and voted upon by
> the existing Council after asking if the potential Member is interested and
> willing to serve in that capacity. The Council will be initially formed
> from the set of existing Core Developers who, as of late 2015, have been
> significantly active over the last year.
>
>
Concretely, I'm asking to be included in this initial council, so a simple
addition such as "along with Travis Oliphant who is the original author of
NumPy" would suffice.   If other long-time contributors to the code-base
also want to be on this initial seed council, I think that would make sense
as well.


> When considering potential Members, the Council will look at candidates
> with a comprehensive view of their contributions. This will include but is
> not limited to code, code review, infrastructure work, mailing list and
> chat participation, community help/building, education and outreach, design
> work, etc. We are deliberately not setting arbitrary quantitative metrics
> (like “100 commits in this repo”) to avoid encouraging behavior that plays
> to the metrics rather than the project’s overall well-being. We want to
> encourage a diverse array of backgrounds, viewpoints and talents in our
> team,

Re: [Numpy-discussion] composition of the steering council (was Re: Governance model request)

2015-09-23 Thread Travis Oliphant

On Wed, Sep 23, 2015 at 6:19 PM, Charles R Harris <charlesr.har...@gmail.com
> wrote:

>
>
> On Wed, Sep 23, 2015 at 3:42 PM, Chris Barker <chris.bar...@noaa.gov>
> wrote:
>
>> On Wed, Sep 23, 2015 at 2:21 PM, Travis Oliphant <tra...@continuum.io>
>> wrote:
>>
>>
>>> As the original author of NumPy, I would like to be on the seed council
>>> as long as it is larger than 7 people.   That is my proposal.
>>>
>>
>> Or the seed council could invite Travis to join as its first order of
>> business :-)
>>
>> Actually, maybe that's a way to handle it -- declare that the first order
>> of business for the seed council is to expand the council.
>>
>
> Perhaps we should specify a yearly meeting to review the past year and
> nominate people for commit rights and council membership. Long term, we
> might also want to start removing commit rights, perhaps by adding a team
> category on github with restricted rights -- committer emeritus, so to
> speak.
>

That's a pretty good idea, actually.



>
>
>> I'd still like some guidelines (suggestions) for history and at least one
>> major dependent-on-numpy rep. Travis would certainly meet the history
>> requirement -- and maybe the other, too. :-)
>>
>>
>>> It's a simple change to the text --- basically an explanation that
>>> Travis requested to be on the seed council.
>>>
>>
>> I'd rather the final draft of the document didn't name names, but no
>> biggie.
>>
>
I'm fine with that too --- except you will need to name the initial seed
council.

-Travis




>
> Chuck
>

Re: [Numpy-discussion] Governance model request

2015-09-22 Thread Travis Oliphant

Thank you for posting that draft as it is a useful comparison to borrow
from.   I think Nathaniel's original document is a great start.   Perhaps
some tweaks along the lines of what you and Matt have suggested could also
be useful.

I agree that my proposal is mostly about altering the governance model,
mixed with some concern about being "automatically disqualified" from a
council that can decide the future of NumPy if things don't move forward.

-Travis


On Tue, Sep 22, 2015 at 12:57 AM, Stefan van der Walt <stef...@berkeley.edu>
wrote:

> On 2015-09-20 11:20:28, Travis Oliphant <tra...@continuum.io> wrote:
> > I would recommend three possible adjustments to the steering council
> > concept.
> >
> > 1 - define a BDFL for the council.  I would nominate Chuck Harris
> >
> > 2 - limit the council to 3 people.  I would nominate Chuck, Nathaniel,
> > and Pauli.
> >
> > 3 - add me as a permanent member of the steering council.
>
> I would split the above into two parts: a suggestion on how to change
> the governance model (first half of 1 and 2) and then some thoughts on
> what to do once those changes have been made (latter half of 1 and 2, as
> well as 3).
>
> For now, since those changes are not in place yet, it's probably best
> to focus on the governance model.
>
> I would agree that one person (or a very small group) is best suited to
> "getting things unstuck".  And, personally, I believe it best for that
> person/persons to be elected by the community (whatever we define "the
> community" to be)---which is what I presume you suggested when you
> mentioned nominating candidates.
>
> Since Matthew mentioned the governance proposal we're working on, here
> is a very early draft:
>
>
> https://github.com/stefanv/skimage-org/blob/governance_proposal/governance.md
>
> As I said, this is still a work-in-progress--comments are welcome.
> E.g., the weighting element in the voting has to be fine tuned (but was
> put in place to prevent rapid take-overs).
>
> Essentially, we need:
>
> - a way for community members to express disagreement without being
>   ousted,
> - protection against individuals who want to exert disproportional
>   influence,
> - protection against those in leadership roles who cause the project
>   long-term harm,
> - and a way for the community to change the direction of the project if
>   they so wished.
>
> Stéfan

Re: [Numpy-discussion] 1.10.0rc1 coming tomorrow, 22 Sept.

2015-09-22 Thread Travis Oliphant

Of course it will be 1.10.0 final where all the problems will show up
suddenly :-)

Perhaps we can get to where we are testing Anaconda against beta releases
better.

-Travis


On Mon, Sep 21, 2015 at 5:19 PM, Charles R Harris <charlesr.har...@gmail.com> wrote:

> Hi All,
>
> Just a heads up. The lack of reported problems in 1.10.0b1 has been
> stunning.
>
> Chuck
>

Re: [Numpy-discussion] Commit rights for Allan Haldane

2015-09-22 Thread Travis Oliphant

Excellent news!   Welcome Allan.

-Travis


On Tue, Sep 22, 2015 at 1:54 PM, Charles R Harris <charlesr.har...@gmail.com> wrote:

> Hi All,
>
> Allan Haldane has been given commit rights. Here's to the new member of
> the team.
>
> Chuck
>

Re: [Numpy-discussion] Governance model request

2015-09-22 Thread Travis Oliphant

On Tue, Sep 22, 2015 at 1:20 PM, Stefan van der Walt 
wrote:

>
>
> I guess we've gone off the rails pretty far at this point, so let me at
> least take a step back, and make sure that you know that:
>
> - I have never doubted that your intentions for NumPy are anything but
>   good (I know they are!),
> - I *want* the community to be a welcoming place for companies to
>   contribute (otherwise, I guess I'd not be such a fervent supporter of
>   the scientific eco-system using the BSD license), and
> - I love your enthusiasm for the project.  After all, that is a big part
>   of what inspired me to become involved in the first place.
>
> My goal is not to spread uncertainty, fear nor doubt—if that was the
> perception left, I apologize.
>
> I'll re-iterate that I wanted to highlight a concern about the
> interactions of a (somewhat weakly cohesive) community and strong,
> driven personalities such as yourself backed by a formidable amount of
> development power.  No matter how good your intentions are, there are
> risks involved in this kind of interaction, and if we fail to even
> *admit* that, we are in trouble.
>
> Lest the above be read in a negative light again, let me state it
> up-front: *I don't think you will hijack the project, use it for your
> own gain, or attempt to do anything you don't believe to be in the best
> interest of NumPy.* What I'm saying is that we absolutely need to move
> forward in a way that brings everyone along, and makes everyone rest
> assured that their voice will be heard.
>
>
Thank you for the clarification.   I'm sorry that I started to question
your intentions.   I agree that everyone should rest assured that their
voice will be heard.   I have been and continue to be a staunch advocate
for the voices that are not even on this mailing list.


> Also, please know that I have not discussed these matters with Nathaniel
> behind the scenes, other than an informal hour-long discussion about his
> original governance proposal.  There is no BIDS conspiracy or attempts
> at crucifixion.  After all, you were an invited guest speaker at an
> event I organized this weekend, since I value your opinion and insights.
>
>
Thank you.   I'm sorry for implying otherwise.   That was wrong of me.   I
know we are just trying to bring all the voices to the table.

Best,

-Travis

Re: [Numpy-discussion] Governance model request

2015-09-22 Thread Travis Oliphant

I am not upset nor was I ever upset about discussing the possibility of
conflict of interest.   Of course it can be discussed --- but it should be
discussed directly about specific things --- and as others have said it is
generally easily handled when it actually could arise.   The key is to
understand affiliations.   We should not do things in the community that
actually encourage people to hide their affiliations for fear of backlash
or bias.

I was annoyed at the insinuation that conflict of interest is a
company-only problem that academics are somehow immune to.

I was upset about accusations and mis-interpretations of my activities and
those of my colleagues on behalf of the community.

On Tue, Sep 22, 2015 at 1:48 PM, Matthew Brett <matthew.br...@gmail.com>
wrote:

> Hi,
>
> On Tue, Sep 22, 2015 at 11:20 AM, Stefan van der Walt
> <stef...@berkeley.edu> wrote:
> > Hi Travis
> >
> > On 2015-09-22 03:44:12, Travis Oliphant <tra...@continuum.io> wrote:
> >> I'm actually offended that so many at BIDS seem eager to crucify my
> >> intentions when I've done nothing but give away my time, my energy, my
> >> resources, and my sleep to NumPy for many, many years.   I guess if your
> >> intent is to drive me away, then you are succeeding.
> >
> > I guess we've gone off the rails pretty far at this point, so let me at
> > least take a step back, and make sure that you know that:
> >
> > - I have never doubted that your intentions for NumPy are anything but
> >   good (I know they are!),
> > - I *want* the community to be a welcoming place for companies to
> >   contribute (otherwise, I guess I'd not be such a fervent supporter of
> >   the scientific eco-system using the BSD license), and
> > - I love your enthusiasm for the project.  After all, that is a big part
> >   of what inspired me to become involved in the first place.
> >
> > My goal is not to spread uncertainty, fear nor doubt—if that was the
> > perception left, I apologize.
> >
> > I'll re-iterate that I wanted to highlight a concern about the
> > interactions of a (somewhat weakly cohesive) community and strong,
> > driven personalities such as yourself backed by a formidable amount of
> > development power.  No matter how good your intentions are, there are
> > risks involved in this kind of interaction, and if we fail to even
> > *admit* that, we are in trouble.
> >
> > Lest the above be read in a negative light again, let me state it
> > up-front: *I don't think you will hijack the project, use it for your
> > own gain, or attempt to do anything you don't believe to be in the best
> > interest of NumPy.* What I'm saying is that we absolutely need to move
> > forward in a way that brings everyone along, and makes everyone rest
> > assured that their voice will be heard.
> >
> > Also, please know that I have not discussed these matters with Nathaniel
> > behind the scenes, other than an informal hour-long discussion about his
> > original governance proposal.  There is no BIDS conspiracy or attempts
> > at crucifixion.  After all, you were an invited guest speaker at an
> > event I organized this weekend, since I value your opinion and insights.
> >
> > Either way, let me again apologize if my suggested lack of insight hurt
> > people's feelings.  I can only hope that, in educating me, we all learn
> > a few lessons.
>
> I'm also in favor of taking a step back.
>
> The point is, that a sensible organization and a sensible leader has
> to take the possibility of conflict of interest into account.  They
> also have to consider the perception of a conflict of interest.
>
> It is the opposite of sensible, to respond to this with 'how dare you'
> or by asserting that this could never happen or by saying that we
> shouldn't talk about that in case people get frightened.  I point you
> again to Linus' interview [1].  He is not upset that he has been
> insulted by the implication of conflict of interest, he soberly
> accepts that this will always be an issue, with companies in
> particular, and goes out of his way to address that in an explicit and
> reasonable way.
>
> Cheers,
>
> Matthew
>
> [1] http://www.bbc.com/news/technology-18419231

Re: [Numpy-discussion] Governance model request

2015-09-22 Thread Travis Oliphant

On Tue, Sep 22, 2015 at 2:16 AM, Stefan van der Walt <stef...@berkeley.edu>
wrote:

> Hi Travis
>
> On 2015-09-21 23:29:12, Travis Oliphant <tra...@continuum.io> wrote:
> >   1) nobody believes that the community should be forced to adopt numba as
> > part of ufunc core yet --- but this could happen someday, just as Cython is
> > now being adopted even though it was proposed 8 years ago that it "could be
> > adopted".  That's a red herring.
>
> Yes, I'd like to clarify: I was not against including any specific
> technology in NumPy.  I was highlighting that there may be different
> motivations for members of the general community and those working for,
> say, Continuum, to get certain features adopted.
>

This is what I'm calling you out on.  Why?   I think that is an unfair
statement and inaccurate.   The general community includes Continuum,
Enthought, Microsoft, Intel, various hedge funds, investment banks and
companies large and small.   Are you saying that people should not be
upfront about their affiliations with a company?  That if they are not
academics, then they should not participate in the discussion?   It is hard
enough to be at a company and get time to contribute effort back to an open
source project.   We should not be questioning people's motives just
*because* they are at a company.   We should not assume people cannot think
in terms of the success of the project, just because they are at a company.

Their proposals and contributions can be evaluated on their merits and
value --- so this whole discussion seems to be just revealing an
anti-company paranoia rather than helping understand the actual concern.

> >   2) I have stated that breaking the ABI is of little consequence because
> > of conda as well as other tools.   I still believe that.  This has nothing
> > to do with any benefit Continuum might or might not receive because of
> > conda.   Everyone else who wants to make a conda-based distribution
> > (Cloudera, Microsoft, Intel, ...) or use conda also benefits.
> > I don't think the community realizes the damage that is done with FUD like
> > this.  There are real implications.  It halts progress, creates confusion,
> > and I think ultimately damages the community.
>
> This is an old argument, and the reason why we have extensive measures
> in place to guard against ABI breakage.  But, reading what you wrote
> above, I would like to understand better what FUD you are referring to,
> because I, rightly or wrongly, believe there is a real concern here that
> is being glossed over.
>

I don't know which is the "old argument".   Anyway, old arguments can still
be right.  The fact is that not breaking the ABI has caused real damage to
the community.  NumPy was never designed to not have its ABI broken for
over a decade. We have some attempts to guard against ABI breakage ---
but they are not perfect.

We have not moved the code-base forward for fear of breaking the ABI.
When it was hard to update your Python installation that was a concern.
There are very few cases where this is still the concern (conda is a big
part of it but not the only part as other distros and approaches for easily
updating the install exist) --- having this drive major architecture
decisions is a serious mistake in my mind, and causes a lot more work than
it should.

The FUD I'm talking about is the anti-company FUD that has influenced
discussions in the past.   I really hope that we can move past this.

>
> > I don't see how.   None of these have been proposed for integrating into
> > NumPy.   I don't see how integrating numba into NumPy benefits Continuum
> > at all.  It's much easier for us to keep it separate.   At this point
> > Continuum doesn't have an opinion about integrating DyND into NumPy or
> > not.
>
> I think that touches, tangentially at least, on the problem.  If an
> employee of Continuum were steering NumPy, and the company developed an
> opinion on those integrations, would such a person not feel compelled to
> toe the company line?  (Whether the company is Continuum or another is
> besides the point—I am only trying to understand the dynamics of working
> for a company and leading an open source project that closely interacts
> with their products.)
>

O.K.  if you are honestly asking this question out of inexperience, then I
can at least help you understand because perhaps that is the problem
(creating a straw-man that doesn't exist).   I have never seen a motivated
open source developer at a company who "toes the company line" within a
community project that is accepted long term.   All that would do is drive
the developer out of the company and be a sure-fire way to make sure their
contributions are not accepted.   I know that a

Re: [Numpy-discussion] Governance model request

2015-09-22 Thread Travis Oliphant

bit" to differentiate
people in the community.

-Travis


On Tue, Sep 22, 2015 at 2:11 AM, Nathaniel Smith <n...@pobox.com> wrote:

> On Mon, Sep 21, 2015 at 9:20 AM, Travis Oliphant <tra...@continuum.io>
> wrote:
> >
> > I wrote my recommendations quickly before heading on a plane.   I hope
> > the spirit of them was caught correctly.   I also want to re-emphasize
> > that I completely understand that the Steering Council is not to be making
> > decisions that often and almost all activity will be similar to what it is
> > now --- discussion, debate, proposals, and pull-requests --- that is a good
> > thing.
> >
> > However, there is a need for leadership to help unstick things and move
> > the project forward from time to time because quite often doing *something*
> > can be better than trying to please everyone with a voice.   My concerns
> > about how to do this judgment have 2 major components:
> >
> > 1) The need for long-term consistency --- a one-year horizon on defining
> > this group is too short in my mind for a decades-old project like NumPy.
> > 2) The group that helps unstick things needs to be small (1, 3, or 5 at
> > the most)
>
> For reference, the rules for steering council membership were taken
> directly from those used by the Jupyter project, and their steering
> council currently has 10 people, making it larger than the "seed
> council" proposed in the numpy document:
> https://github.com/jupyter/governance/blob/master/people.md
>
> > We could call this group the "adjudication group" rather than the
> > "Steering Council" as well.   I could see that having a formal method of
> > changing that "adjudication group" would be a good idea as well (and
> > perhaps that formal vote could be made by a vote of a group of active
> > contributors.   In that case, I would define active as having a time-window
> > of 5 years instead of just 1).
>
> I may be misreading things, but I'm getting the impression that the
> active "adjudication group" you envision is radically different from
> the "steering council" as envisioned by the current governance
> document. It is also, I think, radically different from anything I've
> ever seen in a functioning community-run FOSS project and frankly it's
> something where if I saw a project using this model, it would make me
> extremely wary about contributing.
>
> The key point that I think differs is that you envision that this
> "adjudication group" will actually intervene into discussions and make
> formal decisions in situations other than true irreconcilable crises,
> which in my estimation happen approximately never. The only two kinds
> of F/OSS projects that I can think of that run like this are (a)
> projects that are not really community driven at all, but rather run
> as internal company projects that happen to have a public repository,
> (b) massive projects like Debian and Fedora that have to manage
> literally thousands of contributors, and thus have especially robust
> backstop procedures to handle the rare truly irreconcilable situation.
>
> E.g., the Debian CTTE acts as an "adjudication group" in the way it
> sounds like you envision it: on a regular basis, irreconcilable
> arguments in Debian get taken to them to decide, and they issue a
> ruling. By some back of the envelope calculations, it looks like they
> issue approximately 0.002 rulings per debian-contributor-year [1][2].
> If we assume crudely that irreconcilable differences scale linearly
> with the size of a project, this suggests that a ~20 person project
> like NumPy should require a ruling ~once every 20 years.
>
> Or quoting myself from the last thread about this [3]:
> ] Or on the other end of things, you have e.g. Subversion, which had an
> ] elaborate defined governance system with different levels of
> ] "core-ness", a voting system, etc. -- and they were 6 years into the
> ] project before they had their first vote. (The vote was on the crucial
> ] technical decision of whether to write function calls like "f ()" or
> ] "f()".)
>
> These are two real projects and how they really work. And even in
> projects that do have a BDFL, the successful ones almost never use
> this power to actually "unstick things" (i.e., use their formal power
> to resolve a discussion). Consider PEP 484, Guido's somewhat
> controversial type hints proposal: rather than use his power to move
> the debate along, he explicitly delegated his power to one of the
> idea's strongest critics [4].
>
> Of course, things do get stuck. But the only time that getting them
> unstuck needs or even benefits from the existence of a formal
> "unstick

Re: [Numpy-discussion] Governance model request

2015-09-22 Thread Travis Oliphant

>
>
>
> > May? Can you elaborate? More speculation. My own position is that
> > these projects want to integrate with NumPy, not the
> > converse. Regardless of my opinion, can you actually make any specific
> > arguments, one way or the other? What if some integrations
> > actually make more sense for the community? Is this simply a dogmatic
> > ideological position that anything whatsoever that benefits both NumPy
> > and Continuum simultaneously is bad, on principle? That's fine, as
> > such, but let's make that position explicit if that's all it is.
>
> No, I don't have such a dogmatic ideological position.  I think,
> however, that it is somewhat unimaginative to propose that there are no
> potential conflicts whatsoever.
>
> I am happy if we can find solutions that benefit both numpy and any
> company out there.  But in the end, I'm sure you'd agree that we want
> the decisions that lead to such solutions to be taken in the best
> interest of the project, and not be weighed by ulterior motives of
> any sort.  In the end, even the *perception* that that is not the case
> can be very harmful.
>

I will only comment on the last point.   I completely agree that the
*perception* that this is not the case can be harmful.

But, what concerns me is where this perception comes from --- from actual
evidence of anything that is not in the best interests of the project ---
or just ideological differences of opinion about the way the world works
and the perceptions around open source and markets.   It is quite easy for
someone to spread FUD about companies that contribute to open source ---
and it has the effect of discouraging companies from continuing to
contribute to community projects.   This removes a huge amount of
potential support from projects.

In NumPy's case in particular, this kind of attitude basically guarantees
that I won't be able to contribute effectively and potentially even people
I fund to contribute might not be accepted --- not because we can't
faithfully participate in the same spirit that we have always contributed
to SciPy and NumPy and other open source projects --- but because people
are basically going to question things just because.

What exactly do you need me to say to get you to believe that I have
nothing but the best interests of array computing in Python at heart?

The only thing that is different between me today and me 18 years ago is
that 1) I have more resources now, 2) I have more knowledge about computer
science and software architecture and 3) I have more experience with how
NumPy gets used.   All I can do is continue to try and make things better
the best way I know how.

-Travis



Re: [Numpy-discussion] Governance model request

2015-09-22 Thread Travis Oliphant

On Tue, Sep 22, 2015 at 1:07 AM, Stefan van der Walt <stef...@berkeley.edu>
wrote:

> On 2015-09-21 22:15:55, Bryan Van de Ven <bry...@continuum.io> wrote:
> > Beyond that, what (even in a broad sense) is an example of a goal that
> > "Continuum might need" that would conceivably do detriment to the
> > NumPy community? That it be faster? Simpler to maintain? Easier to
> > extend? Integrate better with more OS projects? Attract new active
> > developers? Receive more financial support? Grow its user base even
> > more?
>
> I don't know how productive it is to dream up examples, but it's not
> very hard to do.  Currently, e.g., the community is not ready to adopt
> numba as part of the ufunc core.  But it's been stated by some that,
> with so many people running Conda, breaking the ABI is of little
> consequence.  And then it wouldn't be much of a leap to think that numba
> is an acceptable dependency.
>

A couple of things to help clarify:

  1) nobody believes that the community should be forced to adopt numba as
part of ufunc core yet --- but this could happen someday, just as Cython is
now being adopted even though it was proposed 8 years ago that it "could be
adopted".   That's a red herring.

  2) I have stated that breaking the ABI is of little consequence because
of conda as well as other tools.   I still believe that.  This has nothing
to do with any benefit Continuum might or might not receive because of
conda.   Everyone else who wants to make a conda-based distribution
(Cloudera, Microsoft, Intel, ...) or use conda also benefits.
I don't think the community realizes the damage that is done with FUD like
this.  There are real implications.  It halts progress, creates confusion,
and I think ultimately damages the community.

Numba being an acceptable dependency means a lot more than conda --- it's
dependent on LLVM compiled support which would have to be carefully tested
--- first as only an optional dependency for many years.

>
> There's a broad range of Continuum projects that intersect with what
> NumPy does: numba, DyND, dask and Odo to name a few.  Integrating them
> into NumPy may make a lot more sense for someone from Continuum than for
> other members of the community.
>

I don't see how.   None of these have been proposed for integrating into
NumPy.   I don't see how integrating numba into NumPy benefits Continuum
at all.  It's much easier for us to keep it separate.   At this point
Continuum doesn't have an opinion about integrating DyND into NumPy or not.

These projects will all succeed or fail on their own based on users' needs.
  Whether or not they ever become a part of NumPy will depend on whether
they are useful as such, not because a person at Continuum is part of a
steering committee (with other people on it).

I know that you were responding to a specific question by Bryan as to how
there could be a conflict of interest for Continuum and NumPy development.
I don't think this is a useful conversation --- we could dream up all
kinds of conflicts of interest for BIDS and NumPy too (e.g. perhaps BIDS
really wants Spark to take over and for NumPy to have special connections
to Spark).   Are we to not allow anyone at BIDS to participate in the
steering council because of their other interests?

But remember, the original point is whether or not someone from Continuum
(or I presume any company and not just singling out Continuum for special
treatment) should be on the steering council.   Are you really arguing
that they shouldn't because there are other projects Continuum is working
on that have some overlap with NumPy?   I really hope you don't actually
believe that.

-Travis

> Stéfan

Re: [Numpy-discussion] 1.10.0rc1 coming tomorrow, 22 Sept.

2015-09-22 Thread Travis Oliphant

Absolutely it would be good if others can test.  All I was suggesting is
that we do run a pretty decent set of tests upon build and that would be
helpful.

If the numpy build recipes are not available, it is only because they have
not been updated to use conda-build yet.  If somebody wants to volunteer to
convert all of our internal recipes to conda-build recipes so they could be
open source --- we would welcome the help.

But, it's not just the numpy recipes, it's the downstream binaries and
their test-suite as well that is useful to run.   I am hoping we will have
something automatic here in the next few months on anaconda.org that will
make this easier -- but no promises at this point.

-Travis

On Tue, Sep 22, 2015 at 2:19 AM, Nathaniel Smith <n...@pobox.com> wrote:

> On Sep 21, 2015 11:51 PM, "Travis Oliphant" <tra...@continuum.io> wrote:
> >
> > Of course it will be 1.10.0 final where all the problems will show up
> suddenly :-)
> >
> > Perhaps we can get to where we are testing Anaconda against beta
> releases better.
>
> The most useful thing would actually not even involve you doing any more
> testing, but just if you could make builds available so that end-users
> could easily conda install the prereleases and do their own testing against
> their own choice. In principle I guess we could provide our own binstar
> channel for this, but it's difficult given that AFAIK rebuilding numpy in
> conda requires also rebuilding the whole stack, and the numpy build recipes
> are still proprietary.
>
> -n
>

Re: [Numpy-discussion] Governance model request

2015-09-22 Thread Travis Oliphant

s part of the ufunc core.  But it's been stated by some that,
> >
> > Who are you speaking for? The entire community? Under what mandate?
> >
> >> with so many people running Conda, breaking the ABI is of little
> >> consequence.  And then it wouldn't be much of a leap to think that numba
> >> is an acceptable dependency.
> >
> > The current somewhat concrete proposal I am aware of involves funding
> > cleaning up dtypes. Is there another concrete, credible proposal to make
> > Numba a dependency of NumPy that you can refer to? If not, why are we mired
> > in hypotheticals?
> >
> >> There's a broad range of Continuum projects that intersect with what
> >> NumPy does: numba, DyND, dask and Odo to name a few.  Integrating them
> >> into NumPy may make a lot more sense for someone from Continuum than for
> >> other members of the community.
> >
> > May? Can you elaborate? More speculation. My own position is that these
> projects want to integrate with NumPy, not the converse. Regardless of my
> opinion, can you actually make any specific arguments, one way or the
> other? What if some integrations actually make more sense for the
> community? Is this simply a dogmatic ideological position that anything
> whatsoever that benefits both NumPy and Continuum simultaneously is bad, on
> principle? That's fine, as such, but let's make that position explicit if
> that's all it is.
> >
> > Bryan
>
>
>
> --
> Nathaniel J. Smith -- http://vorpus.org

Re: [Numpy-discussion] Governance model request

2015-09-22 Thread Travis Oliphant

On Tue, Sep 22, 2015 at 4:33 AM, Nathaniel Smith <n...@pobox.com> wrote:

> On Tue, Sep 22, 2015 at 1:24 AM, Travis Oliphant <tra...@continuum.io>
> wrote:
>
>> I actually do agree with your view of the steering council as
>> usually not really being needed.   You are creating a straw-man by
>> indicating otherwise.   I don't believe a small council should do anything
>> *except* resolve disputes that cannot be resolved without one.  Like you, I
>> would expect that would almost never happen --- but I would argue that
>> extrapolating from Debian's experience is not actually relevant here.
>>
>
> To be clear, Debian was only one example -- what I'm extrapolating from is
> every community-driven F/OSS project that I'm aware of.
>
> It's entirely possible my data set is incomplete -- if you have some other
> examples that you think would be better to extrapolate from, then I'd be
> genuinely glad to hear them. You may have noticed that I'm a bit of an
> enthusiast on this topic :-).
>
>

Yes, you are much better at that than I am.   I'm not even sure where I
would look for this kind of data.


>
>>
> So, if the steering council is not really needed then why have it at all?
>> Let's just eliminate the concept entirely.
>>
>>
> In my view, the reasons for having such a council are:
> 1) The framework is useful even if you never use it, because it means
> people can run "what if" scenarios in their mind and make decisions on that
> basis. In the US legal system, only a vanishingly small fraction of cases
> go to the Supreme Court -- but the rules governing the Supreme Court have a
> huge effect on all cases, because people can reason about what would happen
> *if* they tried to appeal to the Supreme Court.
>

O.K.  That is a good point.   I can see the value in that.



> 2) It provides a formal structure for interfacing with the outside world.
> E.g., one can't do anything with money or corporate contributions without
> having some kind of written-down and enforceable rules for making decisions
> (even if in practice you always stick to the "everyone is equal and we
> govern by consensus" part of the rules).
>

O.K.


> 3) There are rare but important cases where discussions have to be had in
> private. The main one is "personnel decisions" like inviting people to join
> the council; another example Fernando has mentioned to me is that when they
> need to coordinate a press release between the project and a funding body,
> the steering council reviews the press release before it goes public.
>

O.K.



> That's pretty much it, IMO.
>
> The framework we all worked out at the dev meeting in Austin seems to
> handle these cases well AFAICT.
>

How did we "all" work it out when not everyone was there?   This is where I
get lost.   You talk about community decision making and yet any actual
decision is always made by a subset of the community.   I suppose you just
rely on the "if nobody complains then it's o.k." rule?   That really only works if
the project is moving slowly.


> But there are real questions that have to have an answer or an approach to
>> making a decision.  The answer to these questions cannot really be a vague
>> notion of "lack of vigorous opposition by people who read the mailing list"
>> which then gets bandied about as "the community decided this."   The NumPy
>> user base is far, far larger than the number of people that read this list.
>>
>
> According to the dev meeting rules, no particularly "vigorous opposition"
> is required -- anyone who notices that something bad is happening can write
> a single email and stop an idea dead in its tracks, with only the steering
> council able to overrule. We expect this will rarely if ever happen,
> because the threat will be enough to keep everyone honest and listening,
> but about the only way we could possibly be *more* democratic is if we
> started phoning up random users at home to ask their opinion.
>

O.K.  so how long is the time allowed for this kind of opposition to be
noted?



>
> This is actually explicitly designed to prevent the situation where
> whoever talks the loudest and longest wins, and to put those with more and
> less time available on an equal footing.
>
>
>> For better or for worse, we will always be subject to the "tyranny of who
>> has time to contribute lately".   Fundamentally, I would argue that this
>> kind of "tyranny" should at least be tempered by additional considerations
>> from long-time contributors who may also be acting more indirectly than is
>> measured by a simple git log.
>>
>
> I guess I am mi

Re: [Numpy-discussion] Governance model request

2015-09-21 Thread Travis Oliphant

ctive developers, who make sure the project does not go off
> the rails.   I think you'd be an excellent and obvious trustee, in
> that model.
>

I like the trustee model too and think such an addition to the NumPy
concept would help alleviate my concerns about actually being on a
"steering committee" but my preferred outcome is actually that the agreed
upon steering council be smaller and that people who have a right to vote
on things like the make-up of the steering committee be comprised of people
who have been significantly involved in the past 3 years (not just the past
one year).

-Travis




>
> Cheers,
>
> Matthew
>
>
> [1] http://www.bbc.com/news/technology-18419231

Re: [Numpy-discussion] Governance model request

2015-09-21 Thread Travis Oliphant

I wrote my recommendations quickly before heading on a plane.   I hope the
spirit of them was caught correctly.   I also want to re-emphasize that I
completely understand that the Steering Council is not to be making
decisions that often and almost all activity will be similar to how it is now
--- discussion, debate, proposals, and pull-requests --- that is a good
thing.

However, there is a need for leadership to help unstick things and move the
project forward from time to time because quite often doing *something* can
be better than trying to please everyone with a voice.   My concerns about
how to exercise this judgment have 2 major components:

1) The need for long-term consistency --- a one-year horizon on defining
this group is too short in my mind for a decades-old project like NumPy.
2) The group that helps unstick things needs to be small (1, 3, or 5 at the
most)

We could call this group the "adjudication group" rather than the "Steering
Council" as well.   I could see that having a formal method of changing
that "adjudication group" would be a good idea as well (and perhaps that
formal vote could be made by a vote of a group of active contributors.   In
that case, I would define active as having a time-window of 5 years instead
of just 1).

Thanks,

-Travis




On Mon, Sep 21, 2015 at 2:39 AM, Sebastian Berg <sebast...@sipsolutions.net>
wrote:

> On Mo, 2015-09-21 at 11:32 +0200, Sebastian Berg wrote:
> > On So, 2015-09-20 at 11:20 -0700, Travis Oliphant wrote:
> > > After long conversations at BIDS this weekend and after reading the
> > > entire governance document,  I realized that the steering council is
> > > very large and I don't agree with the mechanism by which it is
> > > chosen.
> > >
> >
> > Hmmm, well I never had the impression that the steering council would be
> > huge. But maybe you are right, and if it is, I could imagine something
> > like option 2, but vote based (could possibly dual use those in charge
> > of NumFOCUS relations, we had even discussed this possibility) which
> > would have final say if necessary (could mean that the contributors
> > definition could be broadened a bit).
> > However, I am not sure this is what you suggested, because for me it
> > should be a regular vote (if just because I am scared of having to make
> > the right pick). And while I will not block this if others agree, I am
> > currently not comfortable with either picking a BDFL (sorry guys :P) or
> > very fond of an oligarchy for life.
> >
> > Anyway, I still don't claim to have a good grasp on these things, but
> > without a vote, it seems a bit what Matthew warned about.
> >
> > One thing I could imagine is something like an "Advisory Board", without
> > (much) formal power. If we had a voted Steering Council, it could be the
> > former members + old time contributors which we would choose now. These
> > could be invited to meetings at the very least.
> >
> > Just my current, probably not well thought out thoughts on the matter.
> > But neither of your three options feel very obvious to me unfortunately.
> >
> > - Sebastian
> >
> >
> > > A one year time frame is pretty short in the context of a two decades
> > > old project and I believe the current council has too few people who
> > > have been around the community long enough to help unstick difficult
> > > situations if that were necessary.
> > >
> > > I would recommend three possible adjustments to the steering council
> > > concept.
> > >
> > > 1 - define a BDFL for the council.  I would nominate Chuck Harris
> > >
> > > 2 - limit the council to 3 people.  I would nominate Chuck, Nathaniel,
> > > and Pauli.
> > >
> > > 3 - add me as a permanent member of the steering council.
> > >
>
> Though, maybe you should be in the steering council in any case even by
> the current rules. Maybe you were not too active for a while, but I
> doubt you will quite stop doing stuff on numpy soon
>
>
> > > Writing NumPy was a significant amount of work.  I have been working
> > > indirectly or directly in support of NumPy continuously since I wrote
> > > it.  While I don't actively participate all the time, I still have a
> > > lot of knowledge, context, and experience in how NumPy is used, why it
> > > is the way it is, and how things could be better.  I also work with
> > > people directly who have and will contribute regularly.
> > >
> > > I am formally requesting that the steering council concept be adjusted
> > > in one of these three ways.
> > >

[Numpy-discussion] Governance model request

2015-09-20 Thread Travis Oliphant

After long conversations at BIDS this weekend and after reading the entire
governance document,  I realized that the steering council is very large
and I don't agree with the mechanism by which it is chosen.

A one year time frame is pretty short in the context of a two decades old
project and I believe the current council has too few people who have been
around the community long enough to help unstick difficult situations if
that were necessary.

I would recommend three possible adjustments to the steering council
concept.

1 - define a BDFL for the council.  I would nominate Chuck Harris

2 - limit the council to 3 people.  I would nominate Chuck, Nathaniel, and
Pauli.

3 - add me as a permanent member of the steering council.

Writing NumPy was a significant amount of work.  I have been working
indirectly or directly in support of NumPy continuously since I wrote it.
While I don't actively participate all the time, I still have a lot of
knowledge, context, and experience in how NumPy is used, why it is the way
it is, and how things could be better.  I also work with people directly
who have and will contribute regularly.

I am formally requesting that the steering council concept be adjusted in
one of these three ways.

Thanks,

Travis

Re: [Numpy-discussion] The process I intend to follow for any proposed changes to NumPy

2015-09-18 Thread Travis Oliphant

Hey Chris (limiting to NumPy only),

I've had some great conversations with Nathaniel in the past few days and
I'm glad he posted his thoughts so that there is no confusion about
governance or what I was implying.

With respect to governance, I'm very supportive of what everyone is doing
in organizing a governance document and approach and appreciate the effort
of Nathaniel and others to move this forward.   Nothing I said was meant to
imply differently.   I'm sorry if it made anyone nervous.

I'm a very enthusiastic person when I get an idea of what to do.   I like
to see things implemented.   In this case, it also turns out that in terms
of overall architecture, my ideas are actually very similar to Nathaniel's
ideas.   That's a good sign.   We have different tactical approaches as to
how to move forward, but I think it's a good thing to note that we see a
very similar path forward.   Nothing will be done in NumPy itself except
via pull-request and review.

My approach for the ideas I'm pursuing will be to organize people around
two new prototype packages I'm calling memtype and gufunc.   The purpose of
these is to allow playing with the design and ideas quickly before looking
at how to put them into NumPy itself --- there will also be some training
involved in getting people up to speed.   There was a long discussion
today at this BIDS data-structures for data-science summit, part of which
covered how to improve NumPy's dtype system.   I would love to see these
independent objects evolve into independent packages that could even go
into the Python standard library.   Not everyone agrees that is the best
idea, but regardless of whether this happens or not, the intent is to do
work that could go into NumPy now.

I look forward to the activity.

-Travis

On Mon, Sep 14, 2015 at 10:46 AM, Chris Barker <chris.bar...@noaa.gov>
wrote:

> Travis,
>
> I'm sure you appreciate that this might all look a bit scary, given the
> recent discussion about numpy governance.
>
> But it's an open-source project, and I, at least, fully understand that
> going through a big process is NOT the way to get a new idea tried out and
> implemented. So I think this is a great development -- I know I want
> to see something like this dtype work done.
>
> So, as someone who has been around this community for a long time, and
> dependent on Numeric, numarray, and numpy over the years, this looks like a
> great development.
>
> And, in fact, with the new governance effort -- I think less scary --
> people can go off and work on a branch or fork, do good stuff, and we, as a
> community, can be assured that API (or even ABI) changes won't be thrust
> upon us unawares :-)
>
> As for the technical details -- I get a bit lost, not fully understanding
> the current dtype system either, but do your ideas take us in the direction
> of having dtypes independent of the container and ufunc machinery -- and
> thus easier to create new dtypes (even in Python?) 'cause that would be
> great.
>
> I hope you find the partner you're looking for -- that's a challenge!
>
> -Chris
>
>
>
>
> --
>
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR(206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115   (206) 526-6317   main reception
>
> chris.bar...@noaa.gov
>

[Numpy-discussion] The process I intend to follow for any proposed changes to NumPy

2015-09-13 Thread Travis Oliphant

Hey all,

I just wanted to clarify that I am very excited about a few ideas I have
--- but I don't have time myself to engage in the community process to get
these changes into NumPy. However, those are real processes --- I've
been coaching a few people in those processes for the past several years
already.

So, rather than do nothing, what I'm looking to do is to work with a few
people who I can share my ideas with, get excited about the ideas, and then
who will work with the community to get them implemented.   That's what I
was announcing and talking about yesterday --- looking for interested
people who want to work on NumPy *with* the NumPy community.

In my enthusiasm, I realize that some may have mis-understood my
intention.  There is no 'imminent' fork, nor am I planning on doing some
crazy amount of work that I then try to force on other developers of NumPy.


What I'm planning to do is find people to train on the NumPy code base (people
to increase the diversity of the developers would be ideal -- but hard to
accomplish).  I plan to train them on NumPy based on my experience, and on
what I think should be done --- and then have *them* work through the
community process and engage with others to get consensus (hopefully not
losing too much in translation in the process --- but instead getting even
better).

During that process I will engage as a member of the community and help
write NEPs and other documents and help clarify where it makes sense as I
can.   I will be filtering for people that actually want to see NumPy get
better.   Until I identify the people and work with them, it will be hard
to tell how this will best work.   So, stay tuned.

If all goes well, what you should see in a few weeks time are specific
proposals, a branch or two, and the beginnings of some pull requests.   If
you don't see that, then I will not have found the right people to help me,
and we will all continue to go back to searching.

While I'm expecting the best, in the worst case, we get additional people
who know the NumPy code base and can help squash bugs as well as implement
changes that are desired.   Three things are needed if you want to
participate in this:  1) A willingness to work with the open source
community, 2) a deep knowledge of C and in particular CPython's brand of C,
and 3) a willingness to engage with me, do a mind-meld and dump around the
NumPy code base, and then improve on what is in my head with the rest of
the community.

Thanks,

-Travis

[Numpy-discussion] Just to this list --- more details of the approach

2015-09-12 Thread Travis Oliphant

Hey all,

To the NumPy list only,  I'll at least give the highlights of the surgical
approach I would like to get someone to work on -- I can help mentor and
guide.   These are just the highlights, but it should give someone familiar
with the code the general gist.  There are some details to work out, of
course, but it could be done.

It may be very similar to what Nathaniel is contemplating --- except I
think breaking the ABI is the only way to really do this --- could be wrong
but I'm not willing to risk *not* just breaking the ABI.

1) Create a new meta-type in C (call it dtype)
2) Create Python Classes (in C) that are instances of this meta-type for
each "kind" of data-type
3) Make PyArray_Descr * be a reference to one of these new objects (which
can be built either in C or Python) and should be published outside NumPy
as well.
4) Remove most of the "per-type function calls" in PyArray_ArrFuncs ---
instead replacing those with the Generalized Ufunc equivalents and expand
the capability of Generalized Ufuncs
5) Keep the Array Scalar Types but change them so that they also use the
dtype meta-type as their foundation and mixin an array-methods type.
 Also, have these be in a separate project from NumPy itself.
6) The current void* would be replaced with real Python classes instead of
structured arrays being shoved through a single data-type.
7) The documented ways to spell a dtype would be reduced --- but backwards
compatibility would be preserved.
8) Make sure Numba can create these Descriptor objects with Ahead of Time
Compilation and start to move code of NumPy to Numba
9) Ensure the Generalized Ufunc framework can take the data-type as an
argument so that *all* data-types can participate in the general
multi-method approach.
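
Steps 1, 2 and 9 above can be sketched in pure Python. This is only an
illustration under stated assumptions: the proposal is for C-level objects,
and the names dtype, int32, float64, register and lookup below are
hypothetical, not an existing NumPy API.

```python
class dtype(type):
    """Hypothetical meta-type: each 'kind' of data type is a class of this type."""

    def __new__(mcls, name, bases, namespace, *, itemsize=0, kind="V"):
        cls = super().__new__(mcls, name, bases, namespace)
        cls.itemsize = itemsize   # bytes per element
        cls.kind = kind           # one-letter kind code, as NumPy dtypes have
        return cls

    def __init__(cls, name, bases, namespace, **kwargs):
        # Swallow the keyword arguments already consumed by __new__.
        super().__init__(name, bases, namespace)

    def __repr__(cls):
        return f"dtype({cls.__name__}, itemsize={cls.itemsize})"


# Step 2: concrete data types are classes that are instances of the meta-type.
class int32(metaclass=dtype, itemsize=4, kind="i"):
    pass


class float64(metaclass=dtype, itemsize=8, kind="f"):
    pass


# Step 9 (illustrative only): functions are looked up by the data types of
# their arguments, so any data type can take part in the dispatch.
_loops = {}


def register(name, *dtypes):
    def decorator(func):
        _loops[(name, dtypes)] = func
        return func
    return decorator


def lookup(name, *dtypes):
    return _loops[(name, dtypes)]


@register("add", int32, int32)
def add_int32(a, b):
    # Wrap the result to 32-bit two's complement.
    return ((a + b + 2**31) % 2**32) - 2**31
```

With these definitions, isinstance(int32, dtype) holds, and
lookup("add", int32, int32) returns the function registered for that pair
of data types.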

There is more to it, but that is the basic idea.   Please forgive me if I
can't respond to any feedback from the list in a timely way.  I will as I
can.
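
For readers unfamiliar with the Generalized Ufuncs mentioned in items 4 and
9: a generalized ufunc operates on sub-arrays ("core dimensions") rather
than on scalars, and loops over the remaining dimensions. A minimal sketch,
assuming a NumPy version whose np.vectorize accepts the signature argument
(1.12 or later). The names _inner and inner are illustrative, and this
Python-level wrapper is not the C machinery the proposal would change.

```python
import numpy as np


def _inner(a, b):
    # Core operation: receives two 1-d arrays of equal length n.
    return np.sum(a * b)


# "(n),(n)->()" declares two 1-d core inputs and a scalar core output.
inner = np.vectorize(_inner, signature="(n),(n)->()")

x = np.arange(6.0).reshape(2, 3)   # two rows, each of core length 3
y = np.ones(3)                     # broadcast against every row of x
result = inner(x, y)               # one scalar per row of x
```

The core function still runs once per row in Python, so this shows the
calling convention only, not the performance of a compiled loop.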

-Travis




[Numpy-discussion] Looking for a developer who will work with me for at least 6 months to fix NumPy's dtype system.

2015-09-12 Thread Travis Oliphant

Hi all,

Apologies for cross-posting, but I need to get the word out and twitter
doesn't provide enough explanation.

I've been working on a second edition of my "Guide to NumPy" book.   It's
been a time-pressured activity, but it's helped me put more meat around my
ideas for how to fix NumPy's dtype system -- which I've been contemplating
off and on for 8 years.

I'm pretty sure I know exactly how to do it --- in a way that fits more
cleanly into Python.  It will take 3-6 months and will have residual
efforts needed that will last another 6 months --- making more types
available with NumPy, improving calculations etc.

This work will be done completely in public view and allow for public
comment.   It will not solve *all* of NumPy's problems, but it will put
NumPy's dtype system on the footing it in retrospect should have been put
on in the first place (if I had known then what I know now).

It won't be a grandiose rewrite.   It will be a pretty surgical fix to a
few key places in the code. However, it will break the ABI and require
recompilation of NumPy extensions (and so would need to be called NumPy
2.0).   This is unavoidable, but I don't see any problem with breaking the
ABI today given how easy it is to get distributions of Python these days
from a variety of sources (including using conda --- but not only using
conda).

For those that remember what happened in Python dev land, the changes will
be similar to when Guido changed Python 1.5.2 to Python 2.0.

I can mentor and work closely with someone who will work on this and we
will invite full participation and feedback from whomever in the community
also wants to participate --- but I can't do it myself full time (and it
needs someone full time+).   Fortunately, I can pay someone to do it if
they are willing to commit at least 6 months (it is not required to work at
Continuum for this, but you can have a job at Continuum if you want one).

I'm only looking for people who have enough experience with C or preferably
the Python C-API. You also have to *want* to work on this.   You need to be
willing to work with me on the project directly and work to have a
mind-meld with my ideas which will undoubtedly give rise to additional
perspectives and ideas for later work for you.

When I wrote NumPy 1.0, I put in 80+ hour weeks for about 6 months or more
and then 60+ hour weeks for another year.  I was pretty obsessed with it.   This
won't need quite that effort, but it will need something like it. Being
able to move to Austin is a plus but not required.   I can sponsor a visa
for the right candidate as well (though it's not guaranteed you will get
one with the immigration policies what they are).

This is a labor of love for so many of us and my desire to help the dtype
situation in NumPy comes from the same space that my desire to work on
NumPy in the first place came from.  I will be interviewing people to work
on this as not everyone who may want to will really be qualified to do it
--- especially with so many people writing Cython these days instead of
good-ole C-API code :-)

Feel free to spread the news to anyone you can.   I won't say more until
I've found someone to work with me on this --- because I won't have the
time to follow-up with any questions or comments.   Even if I can't find
someone I will publish the ideas --- but that also takes time and effort
that is in short supply for me right now.

If there is someone willing to fund this work, please let me know as well
-- that could free up more of my time.

Best,

-Travis



Re: [Numpy-discussion] Notes from the numpy dev meeting at scipy 2015

2015-08-26 Thread Travis Oliphant

 as we both have
the same ultimate interests in seeing array-computing in Python improve.
I just don't support *major* changes without breaking the ABI without a
whole lot of proof that it is possible (without hackiness).  You have
mentioned on your roadmap a lot of what I would consider *major* changes.
  Some of it you describe how to get there.   The most important change
(improving the dtype system) you don't.

Part of my point is that we now *know* how to improve the dtype system.
Let's do it.   Let's not try yet again to do it differently inside an old
system designed by a scientist who didn't understand type-theory or type
systems (that was me by the way).   Look at data-shape in the blaze
project.   Take that and build a Python type-system that also outputs
struct-string syntax for memory-views.  That's the data-description system
that NumPy should be using --- not trying to hack on a mixed array-scalar,
dtype-object system that may never support everything we now know is
needed.
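
As a rough illustration of "a type description that outputs struct-string
syntax", here is a minimal sketch using only the standard library. The
mapping _CODES and the function struct_format are hypothetical names made up
for this example; they are not part of blaze, datashape, or NumPy, and the
sketch covers only the basic struct-module subset of the format syntax.

```python
import struct

# Hypothetical mapping from descriptive type names to struct-module codes.
_CODES = {"int32": "i", "int64": "q", "float32": "f", "float64": "d"}


def struct_format(fields, byteorder="<"):
    """Return a struct format string for a list of (name, type_name) pairs."""
    return byteorder + "".join(_CODES[type_name] for _, type_name in fields)


# A record described at the Python level...
record = [("id", "int32"), ("value", "float64")]

# ...and the same layout as a struct format string.
fmt = struct_format(record)

# The format string is enough to pack data and to read it back from a buffer.
packed = struct.pack(fmt, 7, 2.5)
view = memoryview(packed)
unpacked = struct.unpack(fmt, view)
```

Here the description lives in ordinary Python data and the memory layout is
derived from it, which is the direction the paragraph above argues for.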

Trying to increment from where we are now will only lead to a
sub-optimal outcome and unfortunate instability when we already know what
to do differently.   I doubt I will convince you --- certainly not via
email.   I apologize in advance that I likely won't be able to respond in
depth to any more questions that are really just "prove to me that I can't"
kinds of questions.  Of course I can't prove that.   All I'm saying is that
to me the evidence and my experience leads me to not be able to support
major changes like you have proposed without also intentionally breaking
the ABI (and thus calling it NumPy 2.0).

If I find time to write, I will try to use it to outline more specifically
what I think is a better approach to array- and table-computing in Python
that keeps the stability of NumPy and adds new features using different
approaches.

-Travis






 On Tue, Aug 25, 2015 at 12:00 PM, Travis Oliphant <tra...@continuum.io>
 wrote:

 Thanks for the write-up Nathaniel.   There is a lot of great detail and
 interesting ideas here.

 I am very eager to understand how to help NumPy and the wider
 community move forward however I can (my passions on this have not changed
 since 1999, though what I myself spend time on has changed).

 There are a lot of ways to think about approaching this, though.   It's
 hard to get all the ideas on the table, and it was unfortunate we couldn't
 get all the core NumPy devs together in person to have this
 discussion as there are still a lot of questions unanswered and a lot of
 thought that has gone into other approaches that was not brought up or
 represented in the meeting (how does Numba fit into this, what about
 data-shape, dynd, memory-views and Python type system, etc.).   If NumPy
 becomes just an interface-specification, then why don't we just do that
 *outside* NumPy itself in a way that doesn't jeopardize the stability of
 NumPy today?   These are some of the real questions I have.   I will try
 to write up my thoughts in more depth soon, but  I won't be able to respond
 in-depth right now.   I just wanted to comment because Nathaniel said I
 disagree which is only partly true.

 The three most important things for me are 1) let's make sure we have
 representation from as wide of the community as possible (this is really
 hard), 2) let's look around at the broader community and the prior art that
 is happening in this space right now and 3) let's not pretend we are going
 to be able to make all this happen without breaking ABI compatibility.
 Let's just break ABI compatibility with NumPy 2.0 *and* have as much
 fidelity with the API and semantics of current NumPy as possible (though
 there will be some changes necessary long-term).

 I don't think we should intentionally break ABI if we can avoid it, but I
 also don't think we should spend inordinate amounts of time trying to
 pretend that we won't break ABI (for at least some people), and most
 importantly we should not pretend *not* to break the ABI when we actually
 do.   We did this once before with the roll-out of date-time, and it was
 really unnecessary.   When I released NumPy 1.0, there were several
 things that I knew should be fixed very soon (NumPy was never designed to
 not break ABI).   Those problems are still there.   Now that we have
 quite a bit better understanding of what NumPy *should* be (there have been
 tremendous strides in understanding and community size over the past 10
 years), let's actually make the infrastructure we think will last for the
 next 20 years (instead of trying to shoe-horn new ideas into a 20-year old
 code-base that wasn't designed for it).

 NumPy is a hard code-base.  It has been since Numeric days in 1995. I
 could be wrong, but my guess is that we will be passed by as a community if
 we don't seize the opportunity to build something better than we can build
 if we are forced to use a 20 year old code-base.

 It is more important to not break people's

Re: [Numpy-discussion] Notes from the numpy dev meeting at scipy 2015

2015-08-25 Thread Travis Oliphant

On Tue, Aug 25, 2015 at 3:58 PM, Charles R Harris <charlesr.har...@gmail.com>
wrote:



 On Tue, Aug 25, 2015 at 1:00 PM, Travis Oliphant <tra...@continuum.io>
 wrote:

 Thanks for the write-up Nathaniel.   There is a lot of great detail and
 interesting ideas here.

 <snip>



 I think that summarizes my main concerns.  I will write-up more forward
 thinking ideas for what else is possible in the coming weeks.   In the mean
 time, thanks for keeping the discussion going.  It is extremely exciting to
 see the help people have continued to provide to maintain and improve
 NumPy.   It will be exciting to see what the next few years bring as
 well.


 I think the only thing that looks even a little bit like a numpy 2.0 at
 this time is dynd. Rewriting numpy, let alone producing numpy 2.0 is a
 major project. Dynd is 2.5+ years old, 3500+ commits in, and still in
 progress.  If there is a decision to pursue Dynd I could support that, but
 I think we would want to think deeply about how to make the transition as
 painless as possible. It would be good at this point to get some feedback
 from people currently using dynd. IIRC, part of the reason for starting
 dynd was the perception that it was not possible to evolve numpy without
 running into compatibility road blocks. Travis, could you perhaps summarize
 the thinking that went into the decision to make dynd a separate project?


I think it would be best if Mark Wiebe speaks up here.   I can explain why
Continuum supported DyND with some fraction of Mark's time for a few years
and give my perspective, but ultimately DyND is Mark's story to tell (and a
few talented people have now joined him in the effort).  Mark Wiebe was a
productive NumPy developer.   He was one of a few people that jumped in on
the code-base and made substantial and significant changes and came to
understand just how hard it can be to develop in the NumPy code-base.
He also is a C++ developer who really likes the beauty and power of that
language (which definitely biases his NumPy work, but he did put a lot of
effort into making NumPy better).  Before Peter and I started Continuum,
Mark had begun the DyND project as an example of a general-purpose dynamic
array library that could be used by any dynamic language to make arrays.

In the early days of Continuum, we spent time from at least Mark W, Bryan
Van de Ven, Jay Borque, and Francesc Alted looking at how to extend NumPy
to add 1) categorical data-types, 2) variable-length strings, and 3) better
date-time types.   Bryan, a good developer who has gone on to be a
primary developer of Bokeh, spent quite a bit of time and had a prototype of
categoricals *nearly* working.   He did not like working on the NumPy
code-base at all.  He struggled with it and found it very difficult to
extend.   He worked closely with Mark Wiebe, who helped him the best he
could.   What took him 4 weeks in NumPy took him 3 days in DyND to build.
I think that experience convinced both him and Mark W that working with
the NumPy code-base would take too long to make significant progress.

Also, during 2012 I was trying to help with release-management (though I
ended up just hiring Ondrej Certek to actually do the work and he did a
great job of getting a release of NumPy out the door --- thanks to much
help from many of you).   At that point, I realized very clearly that
what I could best do was to try and get more resources for
open source and for the NumPy stack rather than work on the code directly.
We also did work with several clients that helped me realize just how
many disruptive changes had happened from 1.4 to 1.7 for extensive users of
NumPy (much more than would be justified from the "we don't break the ABI"
mantra that was the stated goal).

We also realized that the kind of experimentation we wanted to do in the
first 2 years of Continuum would just not be possible on the NumPy
code-base and the need for getting community buy-in on every decision would
slow us down too much --- as we had to iterate rapidly on so many things
and find our center as a startup.   It also would not be fair to the NumPy
community.   Our decision to do *all* of our exploration outside the
NumPy code base was basically because 1) the kinds of changes we wanted
ultimately were potentially dramatic and disruptive, 2) it would be too
difficult and time-consuming to decide all things in public discussions with
the NumPy community --- especially when some things were experimental, 3) tying
ourselves to releases of NumPy would be difficult at that time, and 4) the
design of the NumPy code-base makes it difficult to contribute to --- both
Mark W and Bryan V felt they could make progress *much* faster in a new
code-base.

Continuum did not have enough start-up funding to devote significant time
on DyND in the early days.   So Mark rallied what resources he could and
we supported him the best we could and he made progress.  My only real
requirement with sponsoring his work when we did

Re: [Numpy-discussion] Notes from the numpy dev meeting at scipy 2015

2015-08-25 Thread Travis Oliphant

On Tue, Aug 25, 2015 at 3:58 PM, Charles R Harris charlesr.har...@gmail.com
 wrote:



 On Tue, Aug 25, 2015 at 1:00 PM, Travis Oliphant tra...@continuum.io
 wrote:

 Thanks for the write-up Nathaniel.   There is a lot of great detail and
 interesting ideas here.

 snip



 There are at least 3 areas of compatibility (ABI, API, and semantic).
  ABI-compatibility is a non-feature in today's world.   There are so many
 distributions of the NumPy stack (and conda makes it trivial for anyone to
 build their own or for you to build one yourself).   Making less-optimal
 software-engineering choices because of fear of breaking the ABI is not
 something I'm supportive of at all.   We should not break ABI every
 release, but a release every 3 years that breaks ABI is not a problem.

 API compatibility should be much more sacrosanct, but it is also
 something that can be managed.   Any NumPy 2.0 should definitely
 support the full NumPy API (though there could be deprecated swaths).   I
 think the community has done well in using deprecation and limiting the
 public API to make this more manageable and I would love to see a NumPy 2.0
 that solidifies a future-oriented API along with a back-ward compatible API
 that is also available.

 Semantic compatibility is the hardest.   We have already broken this on
 multiple occasions throughout the 1.x NumPy releases.  Every time you
 change the code, this can change.   This is what I fear will cause deep
 instability over the course of many years.   These are things like the
 casting rule details, the effect of indexing changes, and any change to the
 calculation approaches.   It is and has been the most at risk during any
 code-changes.   My view is that a NumPy 2.0 (with a new low-level
 architecture) confines these changes to a single release rather than
 unavoidably spreading them out over many, many releases.
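
One way a downstream project might guard against the semantic drift described above is to pin the behaviours it relies on in a small test. The sketch below is only an illustration, not something proposed in this thread; the function name check_semantics is made up, and the particular rules checked (two promotion results, safe vs. unsafe casting, slice-is-a-view vs. fancy-index-is-a-copy) are simply ones I am confident hold across NumPy releases.

```python
import numpy as np


def check_semantics():
    """Assert a few casting and indexing behaviours that user code may rely on."""
    # Type promotion: int32 cannot be represented exactly in float32,
    # so the combined type is float64; int8 fits into float32.
    assert np.result_type(np.int32, np.float32) == np.float64
    assert np.result_type(np.int8, np.float32) == np.float32

    # Casting rules: float64 -> int32 is not a "safe" cast (the default),
    # but it is allowed when the caller opts in to "unsafe" casting.
    assert not np.can_cast(np.float64, np.int32)
    assert np.can_cast(np.float64, np.int32, casting="unsafe")

    # Indexing: a basic slice is a view on the same memory,
    # while indexing with a list of integers makes a copy.
    a = np.arange(6)
    assert np.shares_memory(a, a[1:4])
    assert not np.shares_memory(a, a[[1, 2, 3]])
    return True


check_semantics()
```

A test like this does not prevent a semantic change, but it turns a silent one into a visible failure at upgrade time.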

 I think that summarizes my main concerns.  I will write-up more forward
 thinking ideas for what else is possible in the coming weeks.   In the mean
 time, thanks for keeping the discussion going.  It is extremely exciting to
 see the help people have continued to provide to maintain and improve
 NumPy.   It will be exciting to see what the next few years bring as
 well.


 I think the only thing that looks even a little bit like a numpy 2.0 at
 this time is dynd. Rewriting numpy, let alone producing numpy 2.0 is a
 major project. Dynd is 2.5+ years old, 3500+ commits in, and still in
 progress.  If there is a decision to pursue Dynd I could support that, but
 I think we would want to think deeply about how to make the transition as
 painless as possible. It would be good at this point to get some feedback
 from people currently using dynd. IIRC, part of the reason for starting
 dynd was the perception that it was not possible to evolve numpy without
 running into compatibility road blocks. Travis, could you perhaps summarize
 the thinking that went into the decision to make dynd a separate project?


 Thanks Chuck.   I'll do this in a separate email, but I just wanted to
point out that when I say NumPy 2.0, I'm actually only specifically talking
about a release of NumPy that breaks ABI compatibility --- not some
potential re-write.   I'm not ruling that out, but I'm not necessarily
implying such a thing by saying NumPy 2.0.


 snip

 Chuck


 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion




-- 

*Travis Oliphant*
*Co-founder and CEO*


@teoliphant
512-222-5440
http://www.continuum.io

Re: [Numpy-discussion] Notes from the numpy dev meeting at scipy 2015

2015-08-25 Thread Travis Oliphant

 rewriting the world.

   Obviously there are still a lot of details to work out, though. But
   overall, there was widespread agreement that this is one of the #1
   pain points for our users (e.g. it's the single main request from
   pandas), and fixing it is very high priority.

   Some features that would become straightforward to implement
   (e.g. even in third-party libraries) if this were fixed:
   - missing value support
   - physical unit tracking (meters / seconds -> array of velocity;
 meters + seconds -> error)
   - better and more diverse datetime representations (e.g. datetimes
 with attached timezones, or using funky geophysical or
 astronomical calendars)
   - categorical data
   - variable length strings
   - strings-with-encodings (e.g. latin1)
   - forward mode automatic differentiation (write a function that
 computes f(x) where x is an array of float64; pass that function
 an array with a special dtype and get out both f(x) and f'(x))
   - probably others I'm forgetting right now
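
To make the forward-mode automatic differentiation item concrete, here is a minimal sketch of the idea using today's object dtype rather than a real custom dtype. The Dual class and the function f are hypothetical names invented for this illustration; a proper dtype would store the value/derivative pairs unboxed, whereas this version calls back into Python for every element.

```python
import numpy as np


class Dual:
    """A value together with its derivative (a "dual number")."""

    def __init__(self, val, der=0.0):
        self.val = float(val)
        self.der = float(der)

    @staticmethod
    def _lift(x):
        # Treat plain numbers as constants (derivative 0).
        return x if isinstance(x, Dual) else Dual(x)

    def __add__(self, other):
        other = Dual._lift(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual._lift(other)
        # Product rule: (uv)' = u v' + u' v
        return Dual(self.val * other.val,
                    self.val * other.der + self.der * other.val)

    __rmul__ = __mul__


def f(x):
    """Written as if x were an ordinary float64 array: x**3 + 2x."""
    return x * x * x + 2.0 * x


# Seed each element with derivative 1.0 to differentiate with respect to x.
xs = np.array([Dual(v, 1.0) for v in (0.0, 1.0, 2.0)], dtype=object)
ys = f(xs)

values = [y.val for y in ys]   # f(x)  at 0, 1, 2
derivs = [y.der for y in ys]   # f'(x) at 0, 1, 2, i.e. 3x**2 + 2
```

The same function f works unchanged on a plain float64 array, which is the property the list item is after.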

   I should also note that there was one substantial objection to this
   plan, from Travis Oliphant (in discussions later in the
   conference). I'm not confident I understand his objections well
   enough to reproduce them here, though -- perhaps he'll elaborate.


 Money
 =====

   There was an extensive discussion on the topic of: if we had money,
   what would we do with it?

   This is partially motivated by the realization that there are a
   number of sources that we could probably get money from, if we had a
   good story for what we wanted to do, so it's not just an idle
   question.

   Points of general agreement:

   - Doing the in-person meeting was a good thing. We should plan to do
 that again, at least once a year. So one thing to spend money on
 is travel subsidies to make sure that happens and is productive.

   - While it's tempting to imagine hiring junior people for the more
 frustrating/boring work like maintaining buildbots, release
 infrastructure, updating docs, etc., this seems difficult to do
 realistically with our current resources -- how do we hire for
 this, who would manage them, etc.?

   - On the other hand, the general feeling was that if we found the
 money to hire a few more senior people who could take care of
 themselves more, then that would be good and we could
 realistically absorb that extra work without totally unbalancing
 the project.

 - A major open question is how we would recruit someone for a
   position like this, since apparently all the obvious candidates
   who are already active on the NumPy team already have other
   things going on. [For calibration on how hard this can be: NYU
   has apparently had an open position for a year with the job
   description of "come work at NYU full-time with a
   private-industry-competitive-salary on whatever your personal
   open-source scientific project is" (!) and still is having an
   extremely difficult time filling it:
   [http://cds.nyu.edu/research-engineer/]]

 - General consensus was that there isn't much to be done
   about this, though, except try it and see.

 - (By the way, if you're someone who's reading this and
   potentially interested in like a postdoc or better working on
   numpy, then let's talk...)


 More specific changes to numpy that had general consensus, but don't
 really fit into a high-level roadmap
 ====================================================================

   - Resolved: we should merge multiarray.so and umath.so into a single
 extension module, so that they can share utility code without the
 current awkward contortions.

   - Resolved: we should start hiding new fields in the ufunc and dtype
 structs as soon as possible going forward. (I.e. they would not be
 present in the version of the structs that are exposed through the
 C API, but internally we would use a more detailed struct.)
 - Maybe we should even go ahead and hide the subset of the
   existing fields that are really internal details that no-one
   should be using. If we did this without changing anything else
   then it would preserve ABI (the fields would still be where
   existing compiled extensions expect them to be, if any such
   extensions exist) while breaking API (trying to compile such
   extensions would give a clear error), so would be a smoother
   ramp if we think we need to eventually break those fields for
   real. (As discussed above, there are a bunch of fields in the
   dtype base class that only make sense for specific dtype
   subclasses, e.g. only record dtypes need a list of field names,
   but right now all dtypes have one anyway. So it would be nice to
   remove these from the base class entirely, but that is
   potentially ABI-breaking.)

   - Resolved

Re: [Numpy-discussion] Copyright status of NumPy binaries on Windows/OS X

2014-10-08 Thread Travis Oliphant

Only on Windows does free Anaconda link against the MKL.   But, you are
correct, that the MKL-linked binaries can only be re-distributed if the
person or entity doing the re-distribution has a valid MKL license from
Intel.
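
For anyone wanting to check which BLAS/LAPACK a particular NumPy binary was built against before redistributing it, NumPy can report its own build configuration. This is only a quick sketch; the format of the printed report differs between NumPy versions and builds, so no particular output is shown here.

```python
import numpy as np

# Print the build-time configuration, including the BLAS/LAPACK libraries
# (for example MKL, OpenBLAS or ATLAS) that this NumPy binary was linked with.
np.show_config()
```

The report describes how the binary was built; it does not replace reading the license terms of whatever library it names.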

Microsoft has actually released their Visual Studio 2008 compiler stack so
that OpenBLAS and ATLAS could be compiled on Windows for these platforms as
well.   I would be very interested to see conda packages for these
libraries which should be pretty straightforward to build.

-Travis


On Wed, Oct 8, 2014 at 1:12 PM, Carl Kleffner cmkleff...@gmail.com wrote:

 Hi Travis,

 the Anaconda binaries (free packages as well as the non-free addons) link
 against Intel MKL - not against ATLAS. Are these binaries really freely
 redistributable as stated?

 The lack of numpy/scipy 64bit windows binaries with opensource blas/lapack
 was one of the main reasons to start with the development of a
 dedicated mingw-w64 based compiler toolchain to support OpenBLAS / ATLAS
 based binaries on windows.

 Cheers,

 carlkl



 2014-10-08 1:32 GMT+02:00 Travis Oliphant tra...@continuum.io:

 Hey Andrew,

 You can use any of the binaries from Anaconda and redistribute them as
 long as you cite Anaconda --- i.e. tell your users that they are using
 Anaconda-derived binaries. The Anaconda binaries link against ATLAS.

 The binaries are all at http://repo.continuum.io/pkgs/

 In case you weren't aware:

 Another way you can build and distribute an application is to build a
 'conda' meta-package which lists all the dependencies.   If you add to this
 meta-package 1) an icon and 2) an entry-point, then your application will
 automatically show up in the Anaconda Launcher (see this blog-post:
 http://www.continuum.io/blog/new-launcher ) and anyone with the Anaconda
 Launcher app can install/update your package by clicking on the icon next
 to it.

 Users can also install your package with conda install or using the
 conda-gui.

 Best,

 -Travis


 On Mon, Oct 6, 2014 at 11:54 AM, Andrew Collette 
 andrew.colle...@gmail.com wrote:

 Hi all,

 I am working with the HDF Group on a new open-source viewer program
 for HDF5 files, powered by NumPy, h5py, and wxPython.  On Windows,
 since people don't typically have Python installed, we are looking to
 distribute the application using PyInstaller, which embeds
 dependencies like NumPy.  Likewise for OS X (using Py2App).

 We would like to make sure we don't accidentally include
 non-open-source components... I recall there was some discussion here
 about using the Intel math libraries for binary releases on various
 platforms.  Do the releases on SourceForge or PyPI use any proprietary
 code?  We'd like to avoid building NumPy ourselves if we can avoid it.

 Apologies if this is explained somewhere, but I couldn't find it.

 Thanks!
 Andrew Collette




 --

 Travis Oliphant
 CEO
 Continuum Analytics, Inc.
 http://www.continuum.io








-- 

Travis Oliphant
CEO
Continuum Analytics, Inc.
http://www.continuum.io

Re: [Numpy-discussion] Copyright status of NumPy binaries on Windows/OS X

2014-10-08 Thread Travis Oliphant

Ah, yes,  I hadn't realized that OpenBLAS could not be compiled with Visual
Studio. Thanks for that explanation.

Also, I had heard that 32bit mingw on Windows could still produce 64-bit
binaries. It looks like there are OpenBLAS binaries available for
Windows 32 and Windows 64 (two flavors). It should be straightforward
to take those binaries and make conda (or wheel) packages out of them.

A good mingw64 stack for Windows would be great and benefits many
communities.


On Wed, Oct 8, 2014 at 4:46 PM, Sturla Molden sturla.mol...@gmail.com
wrote:

 Travis Oliphant tra...@continuum.io wrote:

  Microsoft has actually released their Visual Studio 2008 compiler stack
 so
  that OpenBLAS and ATLAS could be compiled on Windows for these platforms
 as
  well.   I would be very interested to see conda packages for these
  libraries which should be pretty straightforward to build.

 OpenBLAS does not compile with Microsoft compilers because of AT&T assembly
 syntax. You need to use a GNU compiler and you also need to have a GNU
 environment. OpenBLAS is easy to build on Windows with MinGW (with
 gfortran) and MSYS. Carl's toolchain ensures that the binaries are
 compatible with the Python binaries from Python.org.

 Sturla





-- 

Travis Oliphant
CEO
Continuum Analytics, Inc.
http://www.continuum.io

Re: [Numpy-discussion] Copyright status of NumPy binaries on Windows/OS X

2014-10-07 Thread Travis Oliphant

Hey Andrew,

You can use any of the binaries from Anaconda and redistribute them as long
as you cite Anaconda --- i.e. tell your users that they are using
Anaconda-derived binaries. The Anaconda binaries link against ATLAS.

The binaries are all at http://repo.continuum.io/pkgs/

In case you weren't aware:

Another way you can build and distribute an application is to build a
'conda' meta-package which lists all the dependencies.   If you add to this
meta-package 1) an icon and 2) an entry-point, then your application will
automatically show up in the Anaconda Launcher (see this blog-post:
http://www.continuum.io/blog/new-launcher ) and anyone with the Anaconda
Launcher app can install/update your package by clicking on the icon next
to it.

Users can also install your package with conda install or using the
conda-gui.

Best,

-Travis


On Mon, Oct 6, 2014 at 11:54 AM, Andrew Collette andrew.colle...@gmail.com
wrote:

 Hi all,

 I am working with the HDF Group on a new open-source viewer program
 for HDF5 files, powered by NumPy, h5py, and wxPython.  On Windows,
 since people don't typically have Python installed, we are looking to
 distribute the application using PyInstaller, which embeds
 dependencies like NumPy.  Likewise for OS X (using Py2App).

 We would like to make sure we don't accidentally include
 non-open-source components... I recall there was some discussion here
 about using the Intel math libraries for binary releases on various
 platforms.  Do the releases on SourceForge or PyPI use any proprietary
 code?  We'd like to avoid building NumPy ourselves if we can avoid it.

 Apologies if this is explained somewhere, but I couldn't find it.

 Thanks!
 Andrew Collette




-- 

Travis Oliphant
CEO
Continuum Analytics, Inc.
http://www.continuum.io

Re: [Numpy-discussion] Custom dtypes without C -- or, a standard ndarray-like type

2014-09-24 Thread Travis Oliphant

This could actually be done by using the structured dtype pretty easily.
The hard work would be improving the ufunc and generalized ufunc mechanism
to handle structured data-types. Numba actually provides some of this
already, so if you have NumPy + Numba you can do this sort of thing now.
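
As a rough illustration of the structured-dtype approach, the sketch below packs a value and its uncertainty into one element. The field names and the add_measurements helper are made up for this example. Because ordinary ufuncs do not operate on structured elements, the arithmetic is written out field by field, which is exactly the "hard work" that would have to move into the ufunc machinery.

```python
import numpy as np

# A structured dtype standing in for a small custom data type
# (hypothetical field names: a measured value and its uncertainty).
measurement = np.dtype([("value", np.float64), ("error", np.float64)])


def add_measurements(a, b):
    """Add two measurement arrays, combining errors in quadrature."""
    out = np.empty(a.shape, dtype=measurement)
    out["value"] = a["value"] + b["value"]
    out["error"] = np.hypot(a["error"], b["error"])
    return out


a = np.array([(1.0, 3.0), (2.0, 0.0)], dtype=measurement)
b = np.array([(10.0, 4.0), (20.0, 0.5)], dtype=measurement)
total = add_measurements(a, b)
```

The data stays in one contiguous array with a well-defined memory layout, so it can be saved, sliced, and passed to compiled code like any other ndarray.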

-Travis






On Wed, Sep 24, 2014 at 12:08 PM, Chris Barker chris.bar...@noaa.gov
wrote:

 On Tue, Sep 23, 2014 at 4:40 AM, Eric Moore e...@redtetrahedron.org
 wrote:

  Improving the dtype system requires working on c code.


 yes -- it sure does. But I think that is a bit of a Red Herring. I'm
 barely competent in C, and don't like it much, but the real barrier to
 entry for  me is not that it's in C, but that it's really complex and hard
 to hack on, as it wasn't designed to support custom dtypes, etc. from the
 start. There is a lot of ugly code in there that has been hacked in to
 support various functionality over time. If there was a clean
 dtype-extension system in C, then A) it wouldn't be bad C to write, and B)
 would be pretty easy to make a Cython-wrapped version.

 Travis gave a nice vision for the future, but in the meantime, I'm
 wondering:

 Could we hack in a generic "custom dtype" dtype object into the current
 system that would delegate everything to the dtype object -- in a truly
 object-oriented way. I'm imagining that this custom dtype object would be a
 pyObject and thus very hackable, easy to make a new subclass, etc --
 essentially like making a new class in python that emulates one of the
 built-in type interfaces.

 This would be slow as a dog -- if inside that C loop, numpy would have to
 call out to python to do anything, maybe as simple as arithmetic, but it
 would be a clean, extensible system, and a good way for folks to plug in and
 try out new dtypes when performance didn't matter, or as prototypes for
 something that would get plugged in at the C level later once the API was
 worked out.

 Is this even possible without too much hacking to the current dtype
 system? Would it be as simple as adding a bit to the object dtype?
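
The existing object dtype already behaves somewhat like this: arithmetic on an object array calls back into each element's Python operators. The sketch below uses a hypothetical Quantity class (not part of NumPy) to show both the appeal and the cost. Every element operation is a Python call, and the array itself knows nothing about the units it holds.

```python
import numpy as np


class Quantity:
    """A number with a unit string; purely illustrative."""

    def __init__(self, value, unit):
        self.value = value
        self.unit = unit

    def __add__(self, other):
        if not isinstance(other, Quantity) or other.unit != self.unit:
            raise TypeError("incompatible units")
        return Quantity(self.value + other.value, self.unit)

    def __truediv__(self, other):
        return Quantity(self.value / other.value,
                        "{}/{}".format(self.unit, other.unit))


def quantities(values, unit):
    """Build a 1-D object array of Quantity elements."""
    out = np.empty(len(values), dtype=object)
    for i, v in enumerate(values):
        out[i] = Quantity(v, unit)
    return out


dist = quantities([10.0, 20.0], "m")
time = quantities([2.0, 4.0], "s")

# NumPy loops in C but calls Quantity.__truediv__ for every element.
speed = dist / time

# Adding metres to seconds fails inside Quantity.__add__.
try:
    dist + time
    mixed_units_rejected = False
except TypeError:
    mixed_units_rejected = True
```

What the proposal adds over this is a dtype object that owns the behaviour, so that the unit would be a property of the array rather than of each boxed element.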

 -Chris

 --

 Christopher Barker, Ph.D.
 Oceanographer

 Emergency Response Division
 NOAA/NOS/ORR(206) 526-6959   voice
 7600 Sand Point Way NE   (206) 526-6329   fax
 Seattle, WA  98115   (206) 526-6317   main reception

 chris.bar...@noaa.gov





-- 

Travis Oliphant
CEO
Continuum Analytics, Inc.
http://www.continuum.io

Re: [Numpy-discussion] Custom dtypes without C -- or, a standard ndarray-like type

2014-09-23 Thread Travis Oliphant

On Sun, Sep 21, 2014 at 6:50 PM, Stephan Hoyer sho...@gmail.com wrote:

 pandas has some hacks to support custom types of data which numpy
 can't handle well enough or at all. Examples include datetime and
 Categorical [1], and others like GeoArray [2] that haven't made it into
 pandas yet.

 Most of these look like numpy arrays but with custom dtypes and type
 specific methods/properties. But clearly nobody is particularly excited
 about writing the C necessary to implement custom dtypes [3]. Nor do
 we need the ndarray ABI.

 In many cases, writing C may not actually even be necessary for
 performance reasons, e.g., categorical can be fast enough just by wrapping
 an integer ndarray for the internal storage and using vectorized
 operations. And even if it is necessary, I think we'd all rather write
 Cython than C.
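
A minimal sketch of the "wrap an integer ndarray" idea follows. It is not pandas' actual Categorical implementation; the class and method names are chosen only for illustration. The labels are stored once, the data is an array of integer codes, and every operation is a vectorized call on those codes.

```python
import numpy as np


class Categorical:
    """Toy categorical: unique labels plus an integer code per element."""

    def __init__(self, values):
        # categories is the sorted array of unique labels;
        # codes[i] is the index of values[i] within categories.
        self.categories, self.codes = np.unique(
            np.asarray(values), return_inverse=True)

    def __len__(self):
        return len(self.codes)

    def __getitem__(self, i):
        return self.categories[self.codes[i]]

    def equals(self, label):
        """Boolean mask of elements equal to label, computed on the codes."""
        matches = np.flatnonzero(self.categories == label)
        if matches.size == 0:
            return np.zeros(len(self.codes), dtype=bool)
        return self.codes == matches[0]

    def value_counts(self):
        counts = np.bincount(self.codes, minlength=len(self.categories))
        return dict(zip(self.categories.tolist(), counts.tolist()))


cat = Categorical(["b", "a", "b", "c", "a", "b"])
mask = cat.equals("b")
counts = cat.value_counts()
```

Nothing here needs C, but nothing here is an ndarray either, which is the interoperability gap the message goes on to describe.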

 It's great for pandas to write its own ndarray-like wrappers (*not*
 subclasses) that work with pandas, but it's a shame that there isn't a
 standard interface like the ndarray to make these arrays useable for the
 rest of the scientific Python ecosystem. For example, pandas has loads of
 fixes for np.datetime64, but nobody seems to be up for porting them to
 numpy (I doubt it would be easy).

 I know these sort of concerns are not new, but I wish I had a sense of
 what the solution looks like. Is anyone actively working on these issues?
 Does the fix belong in numpy, pandas, blaze or a new project? I'd love to
 get a sense of where things stand and how I could help -- without writing
 any C :).


Hey Stephan,

There are not easy answers to your questions.   The reason is that NumPy's
dtype system is not extensible enough with its fixed set of builtin
data-types and its bolted-on user-defined datatypes.   The implementation
was adapted from the *descriptor* notion that was in Numeric (written
almost 20 years ago). While a significant improvement over Numeric, the
dtype system in NumPy still has several limitations:

1) it was not designed to add new fundamental data-types without
breaking the ABI (most of the ABI breakage between 1.3 and 1.7 due to the
addition of np.datetime has been pushed to a small corner but it is still
there).

2) The user-defined data-type system which is present is not well
tested and likely incomplete:  it was the best I could come up with at the
time NumPy first came out with a bit of input from people like Fernando
Perez and Francesc Alted.

3) It is far easier than in Numeric to add new data-types (that was a
big part of the effort of NumPy), but it is still not as easy as one would
like to add new data-types (either fundamental ones requiring recompilation
of NumPy or 'user-defined' data-types requiring C-code).

I believe this system has served us well, but it needs to be replaced
eventually.  I think it can be replaced fairly seamlessly in a largely
backward compatible way (though requiring re-compilation of dependencies).
   Fixing the dtype system is a fundamental effort behind several projects
we are working on at Continuum:  datashape, dynd, and numba.   These
projects are addressing fundamental limitations in a way that can lead to a
significantly improved framework for scientific and tabular computing in
Python.

In the mean-time, NumPy can continue to improve in small ways and in
orthogonal ways (like the new __numpy_ufunc__ mechanism which allows ufuncs
to work more seamlessly with different kinds of array-like objects).
This kind of effort, as well as the improved buffer protocol in Python,
means that multiple array-like objects can co-exist and use each other's
data.   Right now, I think that is the best way to address the
data-type limitations of NumPy.
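
The __numpy_ufunc__ hook mentioned here eventually shipped in NumPy 1.13 under the name __array_ufunc__. The sketch below uses that later name; the Tagged class is hypothetical and handles only plain ufunc calls with a single output, so treat it as an outline of the mechanism rather than a complete array-like.

```python
import numpy as np


class Tagged:
    """An array-like that is not an ndarray subclass but works with ufuncs."""

    def __init__(self, data, label):
        self.data = np.asarray(data)
        self.label = label

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Only plain calls such as np.sqrt(x) are supported in this sketch;
        # reductions, out= arguments, etc. are declined.
        if method != "__call__" or "out" in kwargs:
            return NotImplemented
        # Unwrap any Tagged inputs, run the ufunc on the raw arrays,
        # then re-wrap the result so the label is carried along.
        raw = [x.data if isinstance(x, Tagged) else x for x in inputs]
        return Tagged(ufunc(*raw, **kwargs), self.label)


t = Tagged([1.0, 4.0, 9.0], "areas")
roots = np.sqrt(t)                            # dispatches to Tagged
shifted = np.add(np.array([1.0, 1.0, 1.0]), t)  # also works as 2nd operand
```

Both results are Tagged objects, so the wrapper survives a trip through NumPy's own functions without NumPy knowing anything about it.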

Another small project is possible today --- one could today use Numba or
Cython to generate user-defined data-types for existing NumPy.   That would
be an interesting project and would certainly help to understand the
limitations of the user-defined data-type framework without making people
write C-code.   You could use a meta-class and some code-generation
techniques so that by defining a particular class you end-up with a
user-defined data-type for NumPy.

Even while we have been addressing the fundamental limitations of NumPy
with our new tools at Continuum, replacing NumPy is a big undertaking
because of its large user-base.   While I personally think that NumPy could
be replaced for new users as early as next year with a combination of dynd
and numba, the big install base of NumPy means that many people (including
the company I work with, Continuum) will be supporting NumPy 1.X and Pandas
and the rest of the NumPy-Stack for many years to come.

So, even if you see me working and advocating new technology, that should
never be construed as somehow ignoring or abandoning the current technology
base.   I remain deeply interested in the success of the scientific
computing community --- even though I am not currently contributing a lot
of code directly myself.   As

Re: [Numpy-discussion] SciPy 2014 BoF NumPy Participation

2014-06-04 Thread Travis Oliphant

 projects can use as a
data-type-description mini-language:
https://github.com/ContinuumIO/datashape

I think that a really good project for an enterprising young graduate
student, post-doc, or professor (who is willing to delay their PhD or risk
their tenure) would be to re-write the ufunc system using more modern
techniques and put generalized ufuncs front and center as Nathaniel
described.
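
For readers who have not met generalized ufuncs: the core function works on whole sub-arrays (described by a signature such as "(n),(n)->()") and the machinery loops and broadcasts over the remaining dimensions. A pure-Python way to try the idea is the signature argument of np.vectorize, available since NumPy 1.12. The function names below are made up, and this is a convenience wrapper with Python-level call overhead, not the rewritten ufunc system being suggested.

```python
import numpy as np


def _dot_core(a, b):
    """Core operation: inner product of two 1-D vectors of equal length."""
    return float(np.sum(a * b))


# "(n),(n)->()" says: consume two length-n vectors, produce one scalar.
# Any leading dimensions are looped over and broadcast automatically.
row_dot = np.vectorize(_dot_core, signature="(n),(n)->()")

x = np.arange(6.0).reshape(2, 3)     # two rows of length 3
w = np.array([1.0, 0.0, -1.0])       # one weight vector, broadcast to both rows
out = row_dot(x, w)
```

Here out has shape (2,): one inner product per row of x.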

It sounds like many agree that we can improve the ufunc object
implementation.   A new ufunc system is an entirely achievable goal and
could even be shipped as an add-on project external to NumPy for
several years before being adopted fully.   I know at least 4 people with
demo-ware versions of a new ufunc-object that could easily replace current
NumPy ufuncs eventually.   If you are interested in that, I would love to
share what I know with you.

After spending quite a bit of time thinking about this over the past 2
years, interacting with many in the user community outside of this list,
and working with people as they explore a few options --- I do have a fair
set of opinions.   But, there are also a lot of possibilities and many
opportunities.  I'm looking forward to seeing what emerges in the coming
months and years and cooperating where possible with others having
overlapping interests.

Best,

-Travis




On Tue, Jun 3, 2014 at 6:08 PM, Kyle Mandli kyle.man...@gmail.com wrote:

 Hello everyone,

 As one of the co-chairs in charge of organizing the birds-of-a-feather
 sessions at the SciPy conference this year, I wanted to solicit through
 the NumPy list to see if we could get enough interest to hold a NumPy
 centered BoF this year.  The BoF format would be up to those who would lead
 the discussion; a couple of ideas used in the past include picking out a
 few of the lead devs to be on a panel and have a Q&A type of session or an
 open Q&A with perhaps an audience-guided list of topics.  I can help
 facilitate organization of something but we would really like to get
 something organized this year (last year NumPy was the only major project
 that was not really represented in the BoF sessions).

 Thanks!

 Kyle Mandli (and via proxy Matt McCormick)







-- 

Travis Oliphant
CEO
Continuum Analytics, Inc.
http://www.continuum.io

Re: [Numpy-discussion] big-bangs versus incremental improvements (was: Re: SciPy 2014 BoF NumPy Participation)

2014-06-04 Thread Travis Oliphant

Believe me, I'm all for incremental changes if it is actually possible and
doesn't actually cost more.  It's also why I've been silent until now about
anything we are doing being a candidate for a NumPy 2.0.  I understand the
challenges of getting people to change.  But, features and solid
improvements *will* get people to change --- especially if their new
library can be used along with the old library and the transition can be
done gradually. Python 3's struggle is the lack of features.

At some point there *will* be a NumPy 2.0.   What features go into NumPy
2.0, how much backward compatibility is provided, and how much porting is
needed to move your code from NumPy 1.X to NumPy 2.X is the real user
question --- not whether it is characterized as incremental change or
re-write.   What I call a "re-write" and what you call an
"incremental-change" are two points on a spectrum and likely overlap
significantly if we really compared what we are thinking about.

One huge benefit that came out of the numeric / numarray / numpy transition
that we mustn't forget about was actually the extended buffer protocol and
memory view objects.  This really does allow multiple array objects to
co-exist and libraries to use the object that they prefer in a way that did
not exist when Numarray / numeric / numpy came out.   So, we shouldn't be
afraid of that world.   The existence of easy package managers to update
environments to try out new features and have applications on a single
system that use multiple versions of the same library is also something
that didn't exist before and that will make any transition easier for
users.
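
A small sketch of what the buffer protocol makes possible: two unrelated container types looking at the same memory without copying. It uses only the standard library's array module and NumPy; the variable names are illustrative.

```python
import array

import numpy as np

# A standard-library array exposes its memory through the buffer protocol,
# so NumPy can wrap it without copying.
raw = array.array("d", [1.0, 2.0, 3.0])
shared = np.frombuffer(raw, dtype=np.float64)

# A change made through one object is visible through the other.
raw[0] = 10.0
first_seen_by_numpy = shared[0]

# The reverse direction: a memoryview over an ndarray reports the same
# element format and shape, and tracks later changes to the array.
owned = np.arange(3.0)
view = memoryview(owned)
owned[1] = 42.0
second_seen_by_view = view[1]
```

Any other array library that speaks the same protocol can take part in this exchange, which is what lets several array objects co-exist.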

One thing I regret about my original work on NumPy is that I didn't
have the foresight, skill, and understanding to put more effort into a more
extended and better designed multiple-dispatch system so that multiple
array objects could participate together in an expression flow.   The
__numpy_ufunc__ mechanism gives enough capability in that direction that it
may be better now.

Ultimately, I don't disagree that NumPy can continue to exist in
incremental change mode ( though if you are swapping out whole swaths of
C-code for Cython code --- it sounds a lot like a re-write) as long as
there are people willing to put the effort into changing it.   I think this
is actually benefited by the existence of other array objects that are
pushing the feature envelope without the constraints --- in much the same
way that the Python standard library is benefitted by many versions of
different capabilities being tried out before moving into the standard
library.

I remain optimistic that things will continue to improve in multiple ways
--- if a little messier than any of us would conceive individually.   It
*is* great to see all the PR's coming from multiple people on NumPy and all
the new energy around improving things whether great or small.

Best,

-Travis


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] PEP 465 has been accepted / volunteers needed

2014-04-08 Thread Travis Oliphant

Congratulations!  This is definitely a big step for array-computing with
Python.  Working with the Python devs to implement a PEP can be a
tremendous opportunity to increase your programming awareness and ability
--- as well as make some good friends.

This is a great way to get involved with both Python and the NumPy
community and have a big impact.  If you are in a position to devote
several hours a week to the task, then you won't find a better opportunity
to contribute.

Best,

-Travis



On Apr 7, 2014 6:24 PM, Nathaniel Smith n...@pobox.com wrote:

 Hey all,

 Guido just formally accepted PEP 465:
   https://mail.python.org/pipermail/python-dev/2014-April/133819.html
   http://legacy.python.org/dev/peps/pep-0465/#implementation-details

 Yay.

 The next step is to implement it, in CPython and in numpy. I have time
 to advise on this, but not to do it myself, so, any volunteers? Ever
 wanted to hack on the interpreter itself, with BDFL guarantee your
 patch will be accepted (if correct)?

 The todo list for CPython is here:
 http://legacy.python.org/dev/peps/pep-0465/#implementation-details
 There's one open question which is where the type slots should be
 added. I'd just add them to PyNumberMethods and then if someone
 objects during patch review it can be changed.
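 For readers who have not used the operator yet, here is a minimal sketch of what PEP 465 means at the Python level: `a @ b` calls `__matmul__`, falling back to `__rmatmul__` on the right operand.  The `Mat` class is a hypothetical toy written for this note, not NumPy's implementation, and it needs Python 3.5 or later.

```python
class Mat:
    """Toy 2-D matrix backed by nested lists (hypothetical, for illustration)."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def __matmul__(self, other):
        # Called for `self @ other`.
        if not isinstance(other, Mat):
            return NotImplemented
        cols = list(zip(*other.rows))
        return Mat([[sum(x * y for x, y in zip(r, c)) for c in cols]
                    for r in self.rows])

    def __rmatmul__(self, other):
        # Called for `other @ self` when `other` does not know about Mat.
        return Mat(other) @ self

    def __eq__(self, other):
        return isinstance(other, Mat) and self.rows == other.rows


a = Mat([[1, 2], [3, 4]])
identity = Mat([[1, 0], [0, 1]])
product = a @ identity            # uses Mat.__matmul__
from_list = [[0, 1], [1, 0]] @ a  # list has no __matmul__, so Mat.__rmatmul__ runs
```

 The type slots discussed above are the C-level counterparts of these methods.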

 -n

 --
 Nathaniel J. Smith
 Postdoctoral researcher - Informatics - University of Edinburgh
 http://vorpus.org

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] It looks like Py 3.5 will include a dedicated infix matrix multiply operator

2014-03-14 Thread Travis Oliphant

Congratulations Nathaniel!

This is great news!

Well done on starting the process and taking things forward.

Travis
On Mar 14, 2014 7:51 PM, Nathaniel Smith n...@pobox.com wrote:

 Well, that was fast. Guido says he'll accept the addition of '@' as an
 infix operator for matrix multiplication, once some details are ironed
 out:
   https://mail.python.org/pipermail/python-ideas/2014-March/027109.html
   http://legacy.python.org/dev/peps/pep-0465/

 Specifically, we need to figure out whether we want to make an
 argument for a matrix power operator (@@), and what
 precedence/associativity we want '@' to have. I'll post two separate
 threads to get feedback on those in an organized way -- this is just a
 heads-up.
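 For reference, PEP 465 as finally accepted made '@' left-associative with the same precedence as '*', and did not add '@@'.  A small tracing class (the hypothetical `Sym` below, written for this note) makes the resulting grouping visible on Python 3.5 or later.

```python
class Sym:
    """Records how an expression was grouped (hypothetical helper)."""

    def __init__(self, text):
        self.text = text

    def __matmul__(self, other):
        return Sym(f"({self.text} @ {other.text})")

    def __mul__(self, other):
        return Sym(f"({self.text} * {other.text})")

    def __add__(self, other):
        return Sym(f"({self.text} + {other.text})")


a, b, c = Sym("a"), Sym("b"), Sym("c")

chained = (a @ b @ c).text   # left-associative grouping
mixed = (a * b @ c).text     # same precedence as *, so evaluated left to right
summed = (a + b @ c).text    # binds tighter than +
```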

 -n

 --
 Nathaniel J. Smith
 Postdoctoral researcher - Informatics - University of Edinburgh
 http://vorpus.org

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Indexing changes in 1.9

2014-02-03 Thread Travis Oliphant

Hey Sebastian,

I didn't mean to imply that you would need to necessarily work on it.   But
the work Jay has done could use review.

There are also conversations to have about what to do to resolve the
ambiguity that led to the current behavior.

Thank you for all the great work on the indexing code paths.

Is there a roadmap for 1.9?

Travis
 On Feb 3, 2014 1:26 PM, Sebastian Berg sebast...@sipsolutions.net
wrote:

 On Sun, 2014-02-02 at 13:11 -0600, Travis Oliphant wrote:
  This sounds like great and welcome work and improvements.
 
  Does it make sense to also do something about the behavior of advanced
  indexing when slices are interleaved between lists and integers?
 
  I know that Jay Bourque has some preliminary work to fix this.  There
  are some straightforward fixes -- like doing iterative application
  of indexing in those cases which would be more sensical in the cases
  where current code gets tripped up.
 

 I guess you are talking about the funky transposing logic and maybe the
 advanced indexing logic as such? I didn't really think about changing
 any of that, not sure if we easily can?
 Personally, I always wondered if it would make sense to add some new
 type of indexing mechanism to switch to R/matlab style non-advanced
 integer-array indexing. I don't think this will make it substantially
 easier to do (the basic logic remains the same -- we need an
 extra/different preparation and then transpose the result differently),
 though it might be a bit more obvious where/how to plug it in.

 But it seems very unlikely I will look into that in the near future (but
 if someone wants hints on how to go about it, just ask).

 - Sebastian

  Travis
 
  On Feb 2, 2014 11:07 AM, Charles R Harris
  charlesr.har...@gmail.com wrote:
  Sebastian has done a lot of work to refactor/rationalize numpy
  indexing. The changes are extensive enough that it would be
  good to have more public review, so here is the release note.
 
  The NumPy indexing has seen a complete rewrite in this version. This
  makes most advanced integer indexing operations much faster and should
  have no other implications.
  However some subtle changes and deprecations were introduced in
  advanced indexing operations:

    * Boolean indexing into scalar arrays will always return a new 1-d
      array. This means that ``array(1)[array(True)]`` gives
      ``array([1])`` and not the original array.
    * Advanced indexing into one dimensional arrays used to have
      (undocumented) special handling regarding repeating the value
      array in assignments when the shape of the value array was too
      small or did not match. Code using this will raise an error. For
      compatibility you can use ``arr.flat[index] = values``, which uses
      the old code branch.
    * The iteration order over advanced indexes used to be always
      C-order. In NumPy 1.9 the iteration order adapts to the inputs and
      is not guaranteed (with the exception of a *single* advanced index
      which is never reversed for compatibility reasons). This means
      that the result is undefined if multiple values are assigned to
      the same element. An example for this is
      ``arr[[0, 0], [1, 1]] = [1, 2]``, which may set ``arr[0, 1]`` to
      either 1 or 2.
    * Equivalent to the iteration order, the memory layout of the
      advanced indexing result is adapted for faster indexing and cannot
      be predicted.
    * All indexing operations return a view or a copy. No indexing
      operation will return the original array object.
    * In the future Boolean array-likes (such as lists of python bools)
      will always be treated as Boolean indexes and Boolean scalars
      (including python `True`) will be a legal *boolean* index. At this
      time, this is already the case for scalar arrays to allow the
      general ``positive = a[a > 0]`` to work when ``a`` is zero
      dimensional.
    * In NumPy 1.8 it was possible to use `array(True)` and
      `array(False)` equivalent to 1 and 0 if the result

Re: [Numpy-discussion] Indexing changes in 1.9

2014-02-02 Thread Travis Oliphant

This sounds like great and welcome work and improvements.

Does it make sense to also do something about the behavior of advanced
indexing when slices are interleaved between lists and integers?

I know that Jay Bourque has some preliminary work to fix this.  There are
some straightforward fixes -- like doing iterative application of indexing
in those cases which would be more sensical in the cases where current code
gets tripped up.
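
To make the interleaving issue concrete, the sketch below compares result shapes for a 4-d array.  It only shows the documented NumPy behavior and what "iterative application" would give instead; it is not the proposed fix itself, and the array and variable names are illustrative.

```python
import numpy as np

a = np.zeros((3, 4, 5, 6))

# Advanced indexes (the integer 0 and the list) are adjacent:
# the broadcast dimension stays where the indexes were.
adjacent = a[:, 0, [0, 1]].shape

# A slice sits between them: the broadcast dimension moves to the front.
separated = a[0, :, [0, 1]].shape

# Applying the indexes one at a time keeps the axes in their original order.
iterative = a[0][:, [0, 1]].shape
```

The `separated` case is the one that tends to surprise people, because the length-2 axis jumps ahead of the sliced axis.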

Travis
On Feb 2, 2014 11:07 AM, Charles R Harris charlesr.har...@gmail.com
wrote:

 Sebastian has done a lot of work to refactor/rationalize numpy indexing.
 The changes are extensive enough that it would be good to have more public
 review, so here is the release note.

 The NumPy indexing has seen a complete rewrite in this version. This makes
 most advanced integer indexing operations much faster and should have no
 other implications.
 However some subtle changes and deprecations were introduced in advanced
 indexing operations:

   * Boolean indexing into scalar arrays will always return a new 1-d
     array. This means that ``array(1)[array(True)]`` gives
     ``array([1])`` and not the original array.
   * Advanced indexing into one dimensional arrays used to have
     (undocumented) special handling regarding repeating the value array
     in assignments when the shape of the value array was too small or
     did not match. Code using this will raise an error. For
     compatibility you can use ``arr.flat[index] = values``, which uses
     the old code branch.
   * The iteration order over advanced indexes used to be always C-order.
     In NumPy 1.9 the iteration order adapts to the inputs and is not
     guaranteed (with the exception of a *single* advanced index which is
     never reversed for compatibility reasons). This means that the
     result is undefined if multiple values are assigned to the same
     element. An example for this is ``arr[[0, 0], [1, 1]] = [1, 2]``,
     which may set ``arr[0, 1]`` to either 1 or 2.
   * Equivalent to the iteration order, the memory layout of the advanced
     indexing result is adapted for faster indexing and cannot be
     predicted.
   * All indexing operations return a view or a copy. No indexing
     operation will return the original array object.
   * In the future Boolean array-likes (such as lists of python bools)
     will always be treated as Boolean indexes and Boolean scalars
     (including python `True`) will be a legal *boolean* index. At this
     time, this is already the case for scalar arrays to allow the general
     ``positive = a[a > 0]`` to work when ``a`` is zero dimensional.
   * In NumPy 1.8 it was possible to use `array(True)` and `array(False)`
     equivalent to 1 and 0 if the result of the operation was a scalar.
     This will raise an error in NumPy 1.9 and, as noted above, treated as
     a boolean index in the future.
   * All non-integer array-likes are deprecated, object arrays of custom
     integer like objects may have to be cast explicitly.
   * The error reporting for advanced indexing is more informative, however
     the error type has changed in some cases. (Broadcasting errors of
     indexing arrays are reported as `IndexError`)
   * Indexing with more than one ellipsis (`...`) is deprecated.
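
 A few of the points in the list above can be checked directly.  The sketch below assumes NumPy 1.9 or later and covers only the items that are easy to observe; the variable names are illustrative.

```python
import numpy as np

# Boolean indexing into a zero-dimensional array gives a new 1-d array.
scalar = np.array(1)
picked = scalar[np.array(True)]

# The same rule is what lets `a[a > 0]` work when `a` is zero dimensional.
a = np.array(5)
positive = a[a > 0]

# Indexing never hands back the original array object, only a view or a copy.
arr = np.arange(6)
whole = arr[...]

# A value array whose shape does not match is an error in advanced assignment.
try:
    arr[[0, 1, 2]] = [10, 20]
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc
```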


 Thoughts?

 Chuck



___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] ufunc overrides

2013-07-10 Thread Travis Oliphant

Hey Blake,

To be clear, my blog-post is just a pre-NEP and should not be perceived as
something that will transpire in NumPy anytime soon.  You should take it
as a "hey everyone, I think I know how to solve this problem, but I have no
time to do it, but wanted to get the word out to those who might have the
time."

I think the multi-method approach I outline is the *right* thing to do for
NumPy. Another attribute on ufuncs would be a bit of a hack (though
easier to implement). But, on the other hand, the current ufunc
attributes are also a bit of a hack.

While my overall proposal is to make *all* functions in NumPy (and SciPy
and Scikits) multimethods, I think it's actually pretty straightforward and
a more contained problem to make all *ufuncs* multi-methods. I think that
could fit in a summer of code project.

I don't think it would be that difficult to make all ufuncs multi-methods
that dispatch based on the Python type (they are already multi-methods
based on the array dtype).  You could basically take the code from
Guido's essay or from the PEAK-Rules multi-method implementation or from the
links below and integrate it with a wrapped version of the current ufuncs
(or do a bit more glue and modify the ufunc_call function in 'C' directly
and get nice general multi-methods for ufuncs).

Of course, you would need to define a decorator that NumPy users could use
to register their multi-method implementation with the ufunc. But, this
again would not be too difficult. Look for examples and inspiration at
the following places:

http://alexgaynor.net/2010/jun/26/multimethods-python/
https://pypi.python.org/pypi/typed.py
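
As a rough sketch of the idea above (dispatch on the Python types of the arguments, with a decorator for registering implementations), here is a toy multimethod in pure Python.  The names `MultiMethod`, `register`, and `Masked` are hypothetical and written for this note; this is not NumPy API, and a real version would need proper handling of subclasses and ambiguity.

```python
class MultiMethod:
    """Dispatch on the Python types of all positional arguments (toy sketch)."""

    def __init__(self, name):
        self.name = name
        self.registry = []  # list of (type signature, implementation)

    def register(self, *types):
        # Decorator that users would call to add an implementation.
        def decorator(func):
            self.registry.append((types, func))
            return func
        return decorator

    def __call__(self, *args):
        # Registration order decides ties; the first compatible signature wins.
        for types, func in self.registry:
            if len(types) == len(args) and all(
                isinstance(arg, typ) for arg, typ in zip(args, types)
            ):
                return func(*args)
        raise TypeError(f"{self.name}: no implementation for "
                        f"{tuple(type(a).__name__ for a in args)}")


add = MultiMethod("add")


class Masked:
    """Hypothetical user-defined container."""

    def __init__(self, value, mask):
        self.value, self.mask = value, mask


@add.register(Masked, Masked)
def _add_masked(x, y):
    return Masked(x.value + y.value, x.mask or y.mask)


@add.register(float, float)
def _add_float(x, y):
    return x + y


plain = add(1.5, 2.0)
masked = add(Masked(1.0, False), Masked(2.0, True))
```

A wrapped ufunc would play the role of `add` here, with the existing dtype-based loop selection as the fallback implementation.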

I really think this would be a great addition to NumPy (it would simplify a
lot of cruft around masked arrays, character arrays, etc.) and be quite
useful. I wish you the best. I can't promise I will have time to
help, but I will try to chime in the best I can.

Best regards,

-Travis

On Wed, Jul 10, 2013 at 10:29 PM, Blake Griffith blake.a.griff...@gmail.com
wrote:

Hello NumPy,

Part of my GSoC is compatibility with SciPy's sparse matrices and NumPy's
ufuncs. Currently there is no feasible way to do this without changing
ufuncs a bit.

I've been considering a mechanism to override ufuncs based on checking the
ufunc's arguments for a __ufunc_override__ attribute, then handing off the
operation to a function specified by that attribute. I prototyped this in
python and did a demo in a blog post here:
http://cwl.cx/posts/week-6-ufunc-overrides.html
This is similar to a previously discussed, but never implemented change:
http://mail.scipy.org/pipermail/numpy-discussion/2011-June/056945.html
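
The attribute check described above can be sketched in a few lines of pure Python.  The shape of the hook is an assumption made for illustration: here `__ufunc_override__` is a method taking the function and the argument tuple, which may differ from the prototype in the linked blog post.  `overridable` and `Sparse` are hypothetical names.

```python
import math


def overridable(func):
    """Wrap `func` so arguments can take over via `__ufunc_override__`."""
    def wrapper(*args):
        for arg in args:
            handler = getattr(arg, "__ufunc_override__", None)
            if handler is not None:
                # Hand the whole operation to the first argument that asks.
                return handler(func, args)
        return func(*args)
    return wrapper


sqrt = overridable(math.sqrt)


class Sparse:
    """Hypothetical sparse vector storing only its nonzero entries."""

    def __init__(self, entries):
        self.entries = dict(entries)

    def __ufunc_override__(self, func, args):
        # Apply the function to stored values only (valid when func(0) == 0).
        return Sparse({k: func(v) for k, v in self.entries.items()})


plain = sqrt(9.0)
sparse = sqrt(Sparse({0: 4.0, 7: 16.0}))
```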

However it seems like the ufunc machinery might be ripped out and replaced
with a true multi-method implementation soon. See Travis' blog post:

http://technicaldiscovery.blogspot.com/2013/07/thoughts-after-scipy-2013-and-specific.html
So I'd like to make my changes as forward compatible as possible. However
I'm not sure what I should even consider here, or how forward compatible my
current implementation is. Thoughts?

Until then, I'm writing up a nep, it is still pretty incomplete, it can be
found here:

https://github.com/cowlicks/numpy/blob/ufunc-override/doc/neps/ufunc-overrides.rst


Travis Oliphant
Continuum Analytics, Inc.
http://www.continuum.io
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] timezones and datetime64

2013-04-03 Thread Travis Oliphant

Mark Wiebe and I are both still tracking NumPy development and can provide
context and even help when needed.  Apologies if we've left a different
impression.   We have to be prudent about the time we spend as we have
other projects we are pursuing as well, but we help clients with NumPy
issues all the time and are eager to continue to improve the code base.

It seems to me that the biggest issue is just the automatic conversion that
is occurring on string or date-time input.   We should stop using the local
time-zone (explicit is better than implicit strikes again) and not use any
time-zone unless time-zone information is provided in the string.  I am
definitely +1 on that.

It may be necessary to carry around another flag in the data-type to
indicate whether or not the date-time is naive (not time-zone aware) or
time-zone aware so that string printing does not print a time-zone if it
didn't have one to begin with as well.
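
The naive versus time-zone-aware distinction proposed above already exists in Python's own datetime module, which makes a convenient model for the behavior being asked for.  The sketch below uses only the standard library (Python 3.7 or later for `fromisoformat`) and says nothing about how datetime64 itself is or would be implemented.

```python
from datetime import datetime, timezone

# No offset in the string: keep the value naive instead of assuming local time.
naive = datetime.fromisoformat("2014-01-01T00:00:00")

# Offset present in the string: the value becomes time-zone aware.
aware = datetime.fromisoformat("2014-01-01T00:00:00+00:00")

# Printing round-trips what was given; a naive value gains no offset.
naive_text = naive.isoformat()
aware_text = aware.isoformat()

# Ordering a naive value against an aware one is refused outright.
try:
    naive < aware
    ordering_error = None
except TypeError as exc:
    ordering_error = exc

# Dropping the offset explicitly makes the two comparable again.
same_wall_clock = aware.replace(tzinfo=None) == naive
```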

If others agree that this is the best way forward, then Mark or I can
definitely help contribute a patch.

Best,

-Travis



On Wed, Apr 3, 2013 at 9:38 AM, Dave Hirschfeld
dave.hirschf...@gmail.com wrote:

 Nathaniel Smith njs at pobox.com writes:

 
  On Wed, Apr 3, 2013 at 2:26 PM, Dave Hirschfeld
  dave.hirschfeld at gmail.com wrote:
  
   This isn't acceptable for my use case (in a multinational company) and I
   found no reasonable way around it other than bypassing the numpy
   conversion entirely by setting the dtype to object, manually parsing the
   strings and creating an array from the list of datetime objects.
 
  Wow, that's truly broken. I'm sorry.
 
  I'm skeptical that just switching to UTC everywhere is actually the
  right solution. It smells like one of those solutions that's simple,
  neat, and wrong. (I don't know anything about calendar-time series
  handling, so I have no ability to actually judge this stuff, but
  wouldn't one problem be if you want to know about business days/hours?
  You lose the original day-of-year once you move everything to UTC.)
  Maybe datetime dtypes should be parametrized by both granularity and
  timezone? Or we could just declare that datetime64 is always
  timezone-naive and adjust the code to match?
 
  I'll CC the pandas list in case they have some insight. Unfortunately
  AFAIK no-one who's regularly working on numpy at this point works with
  datetimes, so we have limited ability to judge solutions... please
  help!
 
  -n
 

 I think simply setting the timezone to UTC if it's not specified would
 solve 99% of use cases because IIUC the internal representation is UTC so
 numpy would be doing no conversion of the dates that were passed in. It was
 the conversion which was the source of the error in my example.

 The only potential issue with this is that the dates might take along an
 incorrect UTC timezone, making it more difficult to work with naive
 datetimes.
 e.g.

 In [42]: d = np.datetime64('2014-01-01 00:00:00', dtype='M8[ns]')

 In [43]: d
 Out[43]: numpy.datetime64('2014-01-01T00:00:00+')

 In [44]: str(d)
 Out[44]: '2014-01-01T00:00:00+'

 In [45]: pydate(str(d))
 Out[45]: datetime.datetime(2014, 1, 1, 0, 0, tzinfo=tzutc())

 In [46]: pydate(str(d)) == datetime.datetime(2014, 1, 1)
 Traceback (most recent call last):

   File "<ipython-input-46-abfc0fee9b97>", line 1, in <module>
 pydate(str(d)) == datetime.datetime(2014, 1, 1)

 TypeError: can't compare offset-naive and offset-aware datetimes


 In [47]: pydate(str(d)) == datetime.datetime(2014, 1, 1, tzinfo=tzutc())
 Out[47]: True

 In [48]: pydate(str(d)).replace(tzinfo=None) == datetime.datetime(2014, 1, 1)
 Out[48]: True


 In this case it may be best to have numpy not try to set the timezone at
 all if none was specified. Given the internal representation is UTC I'm not
 sure this is feasible though so defaulting to UTC may be the best solution.

 Regards,
 Dave






-- 
---
Travis Oliphant
Continuum Analytics, Inc.
http://www.continuum.io
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] A new webpage promoting Compiler technology for CPython

2013-02-16 Thread Travis Oliphant

We should take this discussion off list.

Please email me directly if you have questions.  But,  we are open to
listing all of these tools.
On Feb 16, 2013 10:46 AM, Massimo DiPierro massimo.dipie...@gmail.com
wrote:

 Thank you.

 Should this be listed: https://github.com/mdipierro/ocl ?

 It is based on meta (which is listed) and pyopencl (which is listed, only
 used to run with opencl) and has some overlap with Cython and Pyjamas
 although it is not based on any of them.
 It is minimalist in scope: it only converts to C/JS/OpenCL a common subset
 of those languages. But it does what it advertises. It is written in pure
 python and implemented in a single file.

 Massimo

 On Feb 16, 2013, at 10:13 AM, Ronan Lamy wrote:

 On 16/02/2013 16:08, Massimo DiPierro wrote:

 Sorry for injecting... Which page is this about?


 http://compilers.pydata.org/
 Cf. the post I answered to.

 On Feb 16, 2013, at 9:59 AM, Ronan Lamy wrote:


 On 15/02/2013 07:11, Travis Oliphant wrote:

 This page is specifically for Compiler projects that either integrate
 with or work directly with the CPython run-time which is why PyPy is not
 presently listed.  The PyPy project is a great project but we just felt
 that we wanted to explicitly create a collection of links to compilation
 projects that are accessible from CPython which are likely less well known.

 I won't argue here with the exclusion of PyPy, but RPython is definitely
 compiler technology that runs on CPython 2.6/2.7. For now, it is only
 accessible from a source checkout of PyPy but that will soon change and
 pip install rpython isn't far off.

 Since it's a whole tool chain, it has a wealth of functionalities,
 though they aren't always well-documented or easy to access from the
 outside: bytecode analysis, type inference, several GC implementations,
 a JIT generator, assemblers for several architectures, ...

 Cheers,
 Ronan


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] A new webpage promoting Compiler technology for CPython

2013-02-16 Thread Travis Oliphant

I only meant off the NumPy list as it seems this is off-topic for this
forum.

I thought I made clear in the rest of the paragraph that we would *love*
this contribution.   I recommend a pull request.

If you want to discuss this in public, let's have the discussion over at
numfo...@googlegroups.com until a more specific list is created.


On Sat, Feb 16, 2013 at 6:14 PM, Fernando Perez fperez@gmail.com wrote:

 On Sat, Feb 16, 2013 at 3:56 PM, Travis Oliphant tra...@continuum.io
 wrote:
  We should take this discussion off list.

 Just as a bystander interested in this: why?  It seems that OCL is
 within the scope of what's being proposed and another entrant into the
 vibrant new world of compiler-extended machinery for fast numerical
 work in cpython, so I suspect I'm not the only numpy user curious to
 know the answer on-list.

 I know sometimes there are legitimate reasons to take a discussion
 off-list, but in this case it seemed to be a perfectly reasonable
 question that also made me curious (as I only learned of OCL thanks to
 this discussion).

 Cheers,

 f

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

[Numpy-discussion] A new webpage promoting Compiler technology for CPython

2013-02-14 Thread Travis Oliphant

Hey all, 

With Numba and Blaze we have been doing a lot of work on what essentially is 
compiler technology and realizing more and more that we are treading on ground 
that has been plowed before with many other projects.   So, we wanted to create 
a web-site and perhaps even a mailing list or forum where people could 
coordinate and communicate about compiler projects, compiler tools, and ways to 
share efforts and ideas.

The website is:  http://compilers.pydata.org/

This page is specifically for Compiler projects that either integrate with or 
work directly with the CPython run-time which is why PyPy is not presently 
listed.  The PyPy project is a great project but we just felt that we wanted to 
explicitly create a collection of links to compilation projects that are 
accessible from CPython which are likely less well known.

But that is just where we started from.   The website is intended to be a 
community website constructed from a github repository.   So, we welcome pull 
requests from anyone who would like to see the website updated to reflect their 
related project.  Jon Riehl (Mython, PyFront, ROFL, and many other
interesting projects) and Stephen Diehl (Blaze) and I will be moderating the 
pull requests to begin with.   But, we welcome others with similar interests to 
participate in that effort of moderation.

The github repository is here:  https://github.com/pydata/compilers-webpage

This is intended to be a community website for information spreading, and so we 
welcome any and all contributions.  

Thank you,

Travis Oliphant


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] ANN: NumPy 1.7.0rc1 release

2012-12-28 Thread Travis Oliphant

Fantastic job everyone! Hats off to you Ondrej!

-Travis

On Dec 28, 2012, at 6:02 PM, Ondřej Čertík wrote:

 Hi,
 
 I'm pleased to announce the availability of the first release candidate of
 NumPy 1.7.0rc1.
 
 Sources and binary installers can be found at
 https://sourceforge.net/projects/numpy/files/NumPy/1.7.0rc1/
 
 We have fixed all issues known to us since the 1.7.0b2 release.
 The only remaining issue is a documentation improvement:
 
 https://github.com/numpy/numpy/issues/561
 
 Please test this release and report any issues on the numpy-discussion
 mailing list. If there are no more problems, we'll release the final
 version soon. I'll wait at least a week and please write me an email
 if you need more time for testing.
 
 I would like to thank Sebastian Berg, Ralf Gommers, Han Genuit,
 Nathaniel J. Smith, Jay Bourque, Gael Varoquaux, Mark Wiebe,
 Matthew Brett, Skipper Seabold, Peter Cock, Charles Harris, Frederic,
 Gabriel, Luis Pedro Coelho, Pauli Virtanen, Travis E. Oliphant
 and cgohlke for sending patches and fixes for this release since
 1.7.0b2.
 
 Cheers,
 Ondrej
 
 P.S. Source code is uploaded to sourceforge, and I'll upload the
 rest of the Windows and Mac binaries in a few hours as they finish building.

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Future of numpy (was: DARPA funding for Blaze and passing the NumPy torch)

2012-12-21 Thread Travis Oliphant


On Dec 20, 2012, at 7:39 PM, Nathaniel Smith wrote:

 On Thu, Dec 20, 2012 at 11:46 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Travis - I think you are suggesting that there should be no  one
 person in charge of numpy, and I think this is very unlikely to work
 well.   Perhaps there are good examples of well-led projects where
 there is not a clear leader, but I can't think of any myself at the
 moment.  My worry would be that, without a clear leader, it will be
 unclear how decisions are made, and that will make it very hard to
 take strategic decisions.
 
 Curious; my feeling is the opposite, that among mature and successful
 FOSS projects, having a clear leader is the uncommon case. GCC
 doesn't, Glibc not only has no leader but they recently decided to get
 rid of their formal steering committee, I'm pretty sure git doesn't,
 Apache certainly doesn't, Samba doesn't really, etc. As usual Karl
 Fogel has sensible comments on this:
  http://producingoss.com/en/consensus-democracy.html
 
 In practice the main job of a successful FOSS leader is to refuse to
 make decisions, nudge people to work things out, and then if they
 refuse to work things out tell them to go away until they do:
  https://lwn.net/Articles/105375/
 and what actually gives people influence in a project is the respect
 of the other members. The former stuff is stuff anyone can do, and the
 latter isn't something you can confer or take away with a vote.
 

I will strongly voice my opinion that NumPy does not need an official single
leader.  What it needs are committed, experienced, service-oriented
developers and users who are willing to express their concerns and requests
because they are used to being treated well.  It also needs new developers
who are willing to dive into code, contribute to discussions, tackle issues,
make pull requests, and review pull requests.  As people do this regularly,
the leaders of the project will emerge as they have done in the past.

Even though I called out three people explicitly --- there are many more
contributors to NumPy whose voices deserve attention.  But, you don't need me
to point out the obvious to what the Github record shows about who is
shepherding NumPy these days.  But, the Github record is not the only one
that matters.  I would love to see NumPy developers continue to pay attention
to and deeply respect the users (especially of downstream projects that depend
on NumPy).

I plan to continue using NumPy myself and plan to continue to encourage others
around me to contribute patches, fixes and features.   Obviously, there are
people who have rights to merge pull-requests to the repository.  But, this
group seems always open to new, willing help.  From a practical matter, this
group is the head development group of the official NumPy fork.  I believe
this group will continue to be open enough to new, motivated contributors which
will allow it to grow to the degree that such developers are available.



 Nor do we necessarily have a great track record for executive
 decisions actually working things out.
 
 -n

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

[Numpy-discussion] DARPA funding for Blaze and passing the NumPy torch

2012-12-16 Thread Travis Oliphant

Hello all, 

There is a lot happening in my life right now and I am spread quite thin among 
the various projects that I take an interest in. In particular, I am 
thrilled to publicly announce on this list that Continuum Analytics has 
received DARPA funding (to the tune of at least $3 million) for Blaze, Numba, 
and Bokeh which we are writing to take NumPy, SciPy, and visualization into the 
domain of very large data sets.  This is part of the XDATA program, and I
will be taking an active role in it.  You can read more about Blaze here:
http://blaze.pydata.org.   You can read more about XDATA here:  
http://www.darpa.mil/Our_Work/I2O/Programs/XDATA.aspx  

I personally think Blaze is the future of array-oriented computing in Python.   
I will be putting efforts and resources next year behind making that case.   
How it interacts with future incarnations of NumPy, Pandas, or other projects 
is an interesting and open question.  I have no doubt the future will be a rich 
ecosystem of interoperating array-oriented data-structures. I invite anyone 
interested in Blaze to participate in the discussions and development at 
https://groups.google.com/a/continuum.io/forum/#!forum/blaze-dev or watch the 
project on our public GitHub repo:  https://github.com/ContinuumIO/blaze.  
Blaze is being incubated under the ContinuumIO GitHub project for now, but
I hope it will receive its own GitHub project page later next year.
Development of Blaze is early but we are moving rapidly with it (and have
deliverable deadlines --- thus while we will welcome input and pull requests we
won't have a ton of time to respond to simple queries until at least May or
June).  There is more that we are working on behind the scenes with respect to
Blaze that will be coming out next year as well but isn't quite ready to show
yet.

As I look at the coming months and years, my time for direct involvement in
NumPy development is therefore only going to get smaller. As a result it is
not appropriate that I remain as head steward of the NumPy project (a term I
prefer to BFD12 or anything else). I'm sure that it is apparent that while
I've tried to help personally where I can this year on the NumPy project, my
role has been more one of coordination, seeking funding, and providing expert
advice on certain sections of code. I fundamentally agree with Fernando
Perez that the responsibility of care-taking open source projects is one of
stewardship --- something akin to public service. I have tried to emulate
that belief this year --- even while not always succeeding.

It is time for me to make official what is already becoming apparent to
observers of this community, namely, that I am stepping down as someone who
might be considered head steward for the NumPy project and officially leaving
the development of the project in the hands of others in the community. I
don't think the project actually needs a new head steward --- especially from
a development perspective. Instead I see a lot of strong developers
offering key opinions for the project as well as a great set of new developers
offering pull requests.

My strong suggestion is that development discussions of the project continue on
this list with consensus among the active participants being the goal for
development. I don't think 100% consensus is a rigid requirement --- but
certainly a super-majority should be the goal, and serious changes should not
be made without a clear consensus. I would pay special attention to
under-represented people (users with intense usage of NumPy but small voices on
this list). There are many of them. If you push me for specifics then at
this point in NumPy's history, I would say that if Chuck, Nathaniel, and Ralf
agree on a course of action, it will likely be a good thing for the project.
I suspect that even if only 2 of the 3 agree at one time it might still be a
good thing (but I would expect more detail and discussion). There are others
whose opinion should be sought as well: Ondrej Certik, Perry Greenfield,
Robert Kern, David Cournapeau, Francesc Alted, and Mark Wiebe, to name a few.
For some questions, I might even seek input from people like
Konrad Hinsen and Paul Dubois --- if they have time to give it. I will still
be willing to offer my view from time to time and if I am asked.

Greg Wilson (of Software Carpentry fame) asked me recently what letter I would
have written to myself 5 years ago. What would I tell myself to do given the
knowledge I have now? I've thought about that for a bit, and I have some
answers. I don't know if these will help anyone, but I offer them as
hopefully instructive:

1) Do not promise to not break the ABI of NumPy --- and in fact
emphasize that it will be broken at least once in the 1.X series. NumPy was
designed to add new data-types --- but not without breaking the ABI. NumPy
has

[Numpy-discussion] www.numpy.org home page

2012-12-13 Thread Travis Oliphant

For people interested in the www.numpy.org home page:

Jon Turner has officially transferred the www.numpy.org domain to NumFOCUS.
Thank you, Jon, for this donation and for being a care-taker of the
domain-name. We have set up the domain registration to point to
numpy.github.com and I've changed the CNAME in that repository to www.numpy.org.

I've sent an email to have the numpy.scipy.org page redirect to
www.numpy.org.

The NumPy home page can still be edited in this repository:  
g...@github.com:numpy/numpy.org.git.   Pull requests are always welcome --- 
especially pull requests that improve the look and feel of the web-page. 

Two content changes that we need to make a decision about are:

1) whether or not to put links to books published (Packt publishing, for
example, has offered a higher percentage of their revenues if we put a prominent
link on www.numpy.org)
2) whether or not to accept "Sponsored by" links on the home page for
donations to the project (e.g. Continuum Analytics has sponsored Ondrej's release
management, other companies have sponsored pull requests, other companies may
want to provide donations and we would want to recognize their contributions to
the numpy project).

These decisions should be made by the NumPy community, which in my mind means the
interested people on this list. Who is interested in this kind of discussion?
 

We could have these discussions on this list or on the 
numfo...@googlegroups.com list and keep this list completely technical (which I 
prefer, but I will do whatever the consensus is).  

Best regards,

-Travis
   




Re: [Numpy-discussion] Proposal to drop python 2.4 support in numpy 1.8

2012-12-13 Thread Travis Oliphant

A big +1 from me  --- but I don't have anyone I know using 2.4 anymore

-Travis

On Dec 13, 2012, at 10:34 AM, Charles R Harris wrote:

 Time to raise this topic again. Opinions welcome.
 
 Chuck

Re: [Numpy-discussion] Speed bottlenecks on simple tasks - suggested improvement

2012-12-02 Thread Travis Oliphant

Raul, 

This is *fantastic work*. While many optimizations were done 6 years ago as
people started to convert their code, that kind of report has trailed off in
the last few years. I have not seen this kind of speed-comparison for some
time --- but I think it's definitely beneficial.

NumPy still has quite a bit that can be optimized. I think your example is
really great. Perhaps it's worth making a C-API macro out of the short-cut
to the attribute string so it can be used by others. It would be interesting
to see where your other slow-downs are. I would be interested to see if the
slow-math of float64 is hurting you. It would be possible, for example, to
do a simple subclass of the ndarray that overloads a[integer] to be the same
as array.item(integer). The latter syntax returns python objects (i.e.
floats) instead of array scalars.
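
A minimal sketch of that subclass idea, assuming Python 3 and a current NumPy.
The class name ItemArray is hypothetical and not part of NumPy; only a plain
integer index on a 1-d array takes the short-cut, everything else falls through
to the normal ndarray indexing:

```python
import numpy as np


class ItemArray(np.ndarray):
    """Hypothetical ndarray subclass: a[integer] returns a Python scalar."""

    def __getitem__(self, index):
        # Short-cut only for a single integer index on a 1-d array.
        if self.ndim == 1 and isinstance(index, (int, np.integer)):
            return self.item(index)
        # Slices, tuples, masks, etc. use the regular ndarray machinery.
        return super().__getitem__(index)


a = np.arange(5, dtype=np.float64).view(ItemArray)
print(type(a[0]))    # Python float rather than numpy.float64
print(type(a[1:3]))  # still an ItemArray
```

Whether this actually helps depends on how much time is spent in scalar math
versus indexing, so it is worth profiling before adopting it.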

Also, it would not be too difficult to add fast-math paths for int64, float32,
and float64 scalars (so they don't go through ufuncs but do scalar-math like
the float and int objects in Python).


A related thing we've been working on lately that might help you is Numba
(http://numba.pydata.org), which can speed up functions that have code like
a[0] < 4.

Numba will translate the expression a[0] < 4 to a machine-code address-lookup
and math operation, which is *much* faster when a is a NumPy array. Presently
this requires you to wrap your function call in a decorator:

    from numba import autojit

    @autojit
    def function_to_speed_up(...):
        pass

In the near future (2-4 weeks), numba will grow the experimental ability to
basically replace all your function calls with @autojit versions in a Python
function. I would love to see something like this work:

    python -m numba filename.py

to get an effective autojit on all the filename.py functions (and optionally on
all python modules it imports). The autojit works out of the box today ---
you can get Numba from PyPI (or inside of the completely free Anaconda CE) to
try it out.

Best, 

-Travis




On Dec 2, 2012, at 7:28 PM, Raul Cota wrote:

 Hello,
 
 First a quick summary of my problem and at the end I include the basic 
 changes I am suggesting to the source (they may benefit others)
 
 I am ages behind in times and I am still using Numeric in Python 2.2.3. 
 The main reason why it has taken so long to upgrade is because NumPy 
 kills performance on several of my tests.
 
 I am sorry if this topic has been discussed before. I tried parsing the 
 mailing list and also google and all I found were comments related to 
 the fact that such is life when you use NumPy for small arrays.
 
 In my case I have several thousands of lines of code where data 
 structures rely heavily on Numeric arrays but it is unpredictable if the 
 problem at hand will result in large or small arrays. Furthermore, once 
 the vectorized operations complete, the values could be assigned into 
 scalars and just do simple math or loops. I am fairly sure the core of 
 my problems is that the 'float64' objects start propagating all over the 
 program data structures (not in arrays) and they are considerably slower 
 for just about everything when compared to the native python float.
 
 Conclusion, it is not practical for me to do a massive re-structuring of 
 code to improve speed on simple things like a[0] < 4 (assuming a is 
 an array) which is about 10 times slower than b < 4 (assuming b is a 
 float)
 
 
 I finally decided to track down the problem and I started by getting 
 Python 2.6 from source and profiling it in one of my cases. By far the 
 biggest bottleneck came out to be PyString_FromFormatV which is a 
 function to assemble a string for a Python error caused by a failure to 
 find an attribute when multiarray calls PyObject_GetAttrString. This 
 function seems to get called way too often from NumPy. The real 
 bottleneck of trying to find the attribute when it does not exist is not 
 that it fails to find it, but that it builds a string to set a Python 
 error. In other words, something as simple as a[0] < 3.5 internally 
 results in a call to set a python error.
 
 I downloaded NumPy code (for Python 2.6) and tracked down all the calls 
 like this,
 
     ret = PyObject_GetAttrString(obj, "__array_priority__");
 
 and changed to
     if (PyList_CheckExact(obj) || (Py_None == obj) ||
         PyTuple_CheckExact(obj) ||
         PyFloat_CheckExact(obj) ||
         PyInt_CheckExact(obj) ||
         PyString_CheckExact(obj) ||
         PyUnicode_CheckExact(obj)) {
         //Avoid expensive calls when I am sure the attribute
         //does not exist
         ret = NULL;
     }
     else {
         ret = PyObject_GetAttrString(obj, "__array_priority__");
     }
 
 
 
 ( I think I found about 7 spots )
 
 
 I also noticed (not as bad in my case) that calls to PyObject_GetBuffer 
 also resulted in Python errors being set thus unnecessarily slower code.
 
 
 With this change, something like this,

Re: [Numpy-discussion] Z-ordering (Morton ordering) for numpy

2012-11-24 Thread Travis Oliphant

This is pretty cool. Something like this would be interesting to play with.
There are some algorithms that are faster with z-order arrays. The code is
simple enough and small enough that I could see putting it in NumPy. What do
others think?
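
For readers who want to experiment, here is a small bit-interleaving sketch
that reproduces the ordering in Todd's examples quoted below. It is not the
code from the rubik repository; the helper names morton_key and zindex are
hypothetical, and the first axis is assumed to supply the lowest bit, which is
what the quoted output implies:

```python
import numpy as np


def morton_key(index, bits):
    """Interleave the bits of each coordinate; axis 0 supplies the lowest bit."""
    code = 0
    ndim = len(index)
    for bit in range(bits):
        for axis, coord in enumerate(index):
            code |= ((coord >> bit) & 1) << (bit * ndim + axis)
    return code


def zindex(shape):
    """Return the indices of an array of the given shape in Z order."""
    bits = max(int(n - 1).bit_length() for n in shape)
    return sorted(np.ndindex(*shape), key=lambda ix: morton_key(ix, bits))


arr = np.arange(16).reshape(4, 4)
z_elements = [int(arr[ix]) for ix in zindex(arr.shape)]
print(z_elements)
```

Sorting every index is fine for a demonstration but would be too slow for large
arrays; an iterator that generates the codes directly would be the better
design for anything that lands in NumPy.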

-Travis



On Nov 24, 2012, at 1:03 PM, Gamblin, Todd wrote:

 Hi all,
 
 In the course of developing a network mapping tool I'm working on, I also 
 developed some python code to do arbitrary-dimensional z-order (morton order) 
 for ndarrays.  The code is here:
 
   https://github.com/tgamblin/rubik/blob/master/rubik/zorder.py
 
 There is a function to put the elements of an array in Z order, and another 
 one to enumerate an array's elements in Z order.  There is also a ZEncoder 
 class that can generate Z-codes for arbitrary dimensions and bit widths.
 
 I figure this is something that would be generally useful.  Any interest in 
 having this in numpy?  If so, what should the interface look like and can you 
 point me to a good spot in the code to add it?
 
 I was thinking it might make sense to have a Z-order iterator for ndarrays, 
 kind of like ndarray.flat.  i.e.:
 
    arr = np.empty([4,4], dtype=int)
    arr.flat = range(arr.size)
    for elt in arr.zorder:
        print elt,
    0 4 1 5 8 12 9 13 2 6 3 7 10 14 11 15
 
 Or an equivalent to ndindex:
 
    arr = np.empty([4,4], dtype=int)
    arr.flat = range(arr.size)
    for ix in np.zindex(arr.shape):
        print ix,
    (0, 0) (1, 0) (0, 1) (1, 1) (2, 0) (3, 0) (2, 1) (3, 1) (0, 2) (1, 2)
    (0, 3) (1, 3) (2, 2) (3, 2) (2, 3) (3, 3)
 
 Thoughts?
 
 -Todd
 __
 Todd Gamblin, tgamb...@llnl.gov, http://people.llnl.gov/gamblin2
 CASC @ Lawrence Livermore National Laboratory, Livermore, CA, USA
 

[Numpy-discussion] NumFOCUS has received 501(c)3 status

2012-11-05 Thread Travis Oliphant

Hello all, 

I'm really happy to report that NumFOCUS has received its 501(c)3 status from
the IRS. You can now make tax-deductible donations to NumFOCUS for the
support of NumPy. We will put a NumPy-specific button on the home-page of
NumPy soon so you can specifically direct your funds. But, for now you can
go to http://numfocus.org/donate and be confident that your funds will support:

1) Continuous integration 
2) The John Hunter Technical fellowships (which are awards made to 
students and post-docs and their mentors who will contribute substantially to a 
supported project during a 3-18 month period).  
3) Equipment grants
4) Development sprints 
5) Student travel to conferences 
6) Project specific grants

For example, most of Ondrej's time to work on the release of NumPy 1.7.0 has 
been paid for by donations to NumFOCUS from Continuum Analytics. 

NumFOCUS is also seeking nominations for 5 new board members (to bring the 
total to 9).   If you would like to nominate someone please subscribe to 
numfo...@googlegroups.com (by sending an email to 
numfocus+subscr...@googlegroups.com) and then send your nomination.   
Alternatively, you can email me or one of the other directors directly. 

Best, 

-Travis



[Numpy-discussion] 1.7.0 release

2012-11-05 Thread Travis Oliphant

Hey all, 

Ondrej has been tied up finishing his PhD for the past several weeks. He is
defending his work shortly and should be available to continue to help with the
1.7.0 release around the first of December. He and I have been in contact
during this process, and I've been helping where I can. Fortunately, other
NumPy developers have been active closing tickets and reviewing pull requests,
which has helped the process substantially.

The release has taken us longer than we expected, but I'm really glad that 
we've received the bug-reports and issues that we have seen because it will 
help the 1.7.0 release be a more stable series.   Also, the merging of the Trac 
issues with Git has exposed over-looked problems as well and will hopefully 
encourage more Git-focused participation by users. 

We are targeting getting the final release of 1.7.0 out by mid December (based
on Ondrej's availability). But, I would like to find out which issues are
seen as blockers by people on this list. I think most of the issues that I
had as blockers have been resolved. If there are no more remaining blockers,
then we may be able to accelerate the final release of 1.7.0 to just after
Thanksgiving.

Best regards,

-Travis





Re: [Numpy-discussion] ticket 2228: Scientific package seeing ABI change in 1.6.x

2012-11-04 Thread Travis Oliphant


On Nov 4, 2012, at 1:31 PM, Ralf Gommers wrote:

 
 
 
 On Wed, Oct 31, 2012 at 1:05 PM, Charles R Harris charlesr.har...@gmail.com 
 wrote:
 
 
 On Tue, Oct 30, 2012 at 9:26 PM, Travis Oliphant tra...@continuum.io wrote:
 The NPY_CHAR is not a real type.   There are no type-coercion functions 
 attached to it nor ufuncs nor a full dtype object.  However, it is used 
 to mimic old Numeric character arrays (especially for copying a string).  
 
 It should have been deprecated before changing the ABI.  I don't think it was 
 realized that it was part of the ABI (mostly for older codes that depended on 
 Numeric).   I think it was just another oversight that inserting type-codes 
 changes this part of the ABI. 
 
 The positive side is that it's a small part of the ABI and not many codes 
 should depend on it.   At this point, I'm not sure what can be done, except 
 to document that NPY_CHAR has been deprecated in 1.7.0 and remove it in 1.8.0 
 to avoid future ABI difficulties.
 
 The short answer is that codes that use NPY_CHAR must be recompiled to be 
 compatible with 1.6.0.
 
 
 IIRC, it was proposed to remove it at one point, but the STScI folks wanted 
 to keep it because their software depended on it.
 
 I can't find that discussion in the list archives. If you know who from STScI 
 to ask about this, can you do so? 
 
 Is replacing NPY_CHAR with NPY_STRING supposed to just work?

No, it's a little more complicated than that, but not too much. Code that
uses the NPY_CHAR type can be changed fairly easily to use the NPY_STRING type,
but it does take some re-writing. The NPY_CHAR field was added so that code
written for Numeric (like ScientificPython's netcdf reader) would continue to
just work with no changes and behave similarly to how it behaved with
Numeric's character type.

Unfortunately, adding it to the end of the TYPE list does prevent adding any 
more types without breaking at least this part of the ABI.  
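
To make the mechanism concrete, here is a small illustration in Python rather
than C. The member lists are hypothetical and heavily abbreviated (the real
NPY_TYPES enum is much longer); the point is only that members placed after the
insertion point get renumbered, while members before it keep their values:

```python
from enum import IntEnum

# Hypothetical, abbreviated type lists; not the real NPY_TYPES contents.
before = ["NPY_STRING", "NPY_UNICODE", "NPY_VOID",
          "NPY_NTYPES", "NPY_NOTYPE", "NPY_CHAR"]
after = ["NPY_STRING", "NPY_UNICODE", "NPY_VOID",
         "NPY_DATETIME", "NPY_TIMEDELTA",   # new types inserted here
         "NPY_NTYPES", "NPY_NOTYPE", "NPY_CHAR"]

Old = IntEnum("Old", before, start=0)
New = IntEnum("New", after, start=0)

# Print each name with its old and new numeric value.
for name in before:
    print(name, int(Old[name]), int(New[name]))
```

An extension compiled against the old numbering keeps passing the old value for
NPY_CHAR, which the newer library then interprets as a different entry.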

-Travis



Re: [Numpy-discussion] ticket 2228: Scientific package seeing ABI change in 1.6.x

2012-10-30 Thread Travis Oliphant

The NPY_CHAR is not a real type.   There are no type-coercion functions 
attached to it nor ufuncs nor a full dtype object.  However, it is used to 
mimic old Numeric character arrays (especially for copying a string).  

It should have been deprecated before changing the ABI.  I don't think it was 
realized that it was part of the ABI (mostly for older codes that depended on 
Numeric).   I think it was just another oversight that inserting type-codes 
changes this part of the ABI. 

The positive side is that it's a small part of the ABI and not many codes
should depend on it. At this point, I'm not sure what can be done, except to
document that NPY_CHAR has been deprecated in 1.7.0 and remove it in 1.8.0 to
avoid future ABI difficulties.

The short answer is that codes that use NPY_CHAR must be recompiled to be
compatible with 1.6.0.

-Travis






On Oct 30, 2012, at 8:46 PM, Charles R Harris wrote:

 
 
 On Tue, Oct 30, 2012 at 4:08 PM, Ralf Gommers ralf.gomm...@gmail.com wrote:
 Hi,
 
 Ticket 2228 says ABI was broken in 1.6.x. Specifically, NPY_CHAR in the 
 NPY_TYPES enum seems to have been moved. Can anyone comment on why the 3 
 datetime related values were inserted instead of appended?
 
 I don't know, although having NPY_CHAR after  NPY_NTYPES goes back to 1.0.3
 
 
 NPY_NTYPES,
 NPY_NOTYPE,
 
 NPY_CHAR,  /* special flag */
 
 And I expect it was desired to keep it there on the expectation that there 
 was a reason for it. The decision not to append was in 1.4.0
 
 NPY_DATETIME, NPY_TIMEDELTA,
 NPY_NTYPES,
 NPY_NOTYPE,
 NPY_CHAR, /* special flag */
 
 And probably due to Robert Kern or Travis, IIRC who worked on getting it in.
 
 I don't see a good way to get around the ABI break, I think the question 
 going forward needs to be whether we leave it after NPY_NTYPES or make it 
 part of the unchanging ABI, and I suspect we need to know what the 'special 
 flag' comment means before we can make that decision. My suspicion is that it 
 wasn't considered a real numeric type, but rather a flag marking a special 
 string type, in which case it probably doesn't really belong among the types, 
 which I think is also indicated by NPY_NOTYPE. Moving NPY_CHAR could have 
 implications we would want to check, but I'd generally favor moving it all 
 else being equal.
 
 Chuck 

Re: [Numpy-discussion] Issue tracking

2012-10-23 Thread Travis Oliphant


On Oct 23, 2012, at 9:58 AM, David Cournapeau wrote:

 On Tue, Oct 23, 2012 at 5:05 AM, Thouis (Ray) Jones tho...@gmail.com wrote:
 On Fri, Oct 19, 2012 at 9:34 PM, Thouis (Ray) Jones tho...@gmail.com wrote:
 On Fri, Oct 19, 2012 at 4:46 PM, Thouis (Ray) Jones tho...@gmail.com 
 wrote:
 On Fri, Oct 19, 2012 at 11:20 AM, Thouis (Ray) Jones tho...@gmail.com 
 wrote:
 I started the import with the oldest 75 and newest 125 Trac issues,
 and will wait a few hours to do the rest to allow feedback, just in
 case something is broken that I haven't noticed.
 
 I did make one change to better emulate Trac behavior.  Some Trac
 usernames are also email addresses, which Trac anonymizes in its
 display.  I decided it was safer to do the same.
 
 The import is running again, though I've been having some failures in
 a few comments and general hangs (these might be network related).
 I'm keeping track of which issues might have had difficulties.
 
 @endolith noticed that I didn't correctly relink #XXX trac id numbers
 to github id numbers (both trac and github create links
 automatically), so that will have to be handled by a postprocessing
 script (which it probably would have, anyway, since the github # isn't
 known before import).
 
 Import has finished.
 
 The following trac #s had issues in creating the comments (I think due
 to network problems): 182, 297, 619, 621, 902, 904, 909, 913, 914, 915,
 1044, 1526.  I'll review them and see if I can pull in anything
 missing.
 
 I'll also work on a script for updating the trac crossrefs to github 
 crossrefs.
 
 In the no good deed goes unpunished category, I accidentally logged
 in as myself (rather than numpy-gitbot) and pushed about 500 issues,
 so now I receive updates whenever one of them gets changed.  At least
 most of them were closed, already...
 
 I just updated the cross-issue-references to use github rather than
 Trac id numbers.  Stupidly, I may have accidentally removed comments
 that were added in the last few days to  issues moved from trac to
 github.  Hopefully not, or at least not many.
 
 It's probably a good idea to turn off Trac, soon, to keep too many new
 bugs from needing to be ported, and old bugs being commented on.  The
 latter is more of a pain to deal with.
 
 I will look into making the NumPy trac read-only. It should not be too
 complicated to extend Pauli's code to redirect the tickets part to
 github issues.
 
 Have we decided what to do with the wiki content ?
 

I believe there is a wiki dump command in trac wiki. We should put that
content linked off the numpy pages at github.

Thanks for helping with this. 

-Travis




 David

Re: [Numpy-discussion] Issue tracking

2012-10-19 Thread Travis Oliphant

Kudos, Ray!

Very impressive and useful work.

-Travis

On Oct 19, 2012, at 10:20 AM, Thouis (Ray) Jones wrote:

 I started the import with the oldest 75 and newest 125 Trac issues,
 and will wait a few hours to do the rest to allow feedback, just in
 case something is broken that I haven't noticed.
 
 I did make one change to better emulate Trac behavior.  Some Trac
 usernames are also email addresses, which Trac anonymizes in its
 display.  I decided it was safer to do the same.
 
 Ray Jones

[Numpy-discussion] Announcing Anaconda version 1.1

2012-10-19 Thread Travis Oliphant

I just wanted to let everyone know about our new release of Anaconda which
now has Spyder and Matplotlib working for Mac OS X and Windows.

Right now, it's the best way to get the pre-requisites for Numba --- though
I recommend getting the latest Numba from github as Numba is still under
active development.

Anaconda 1.1 Announcement

Continuum Analytics, Inc. is pleased to announce the release of Anaconda
Pro 1.1, which extends Anaconda’s programming capabilities to the desktop.
Anaconda Pro now includes an IDE (Spyder <http://spyder-ide.blogspot.com/>)
and plotting capabilities (Matplotlib <http://matplotlib.org/>), as well as
optimized versions of Numba Pro <https://store.continuum.io/cshop/numbapro> and
IOPro <https://store.continuum.io/cshop/iopro>.

With these enhancements, Anaconda Pro is a complete solution for server-side
computation or client-side development.  It is equally well-suited for
supercomputers or for training in a classroom.

Available for Windows, Mac OS X, and Linux, Anaconda is a Python
distribution for scientific computing, engineering simulation, and business
intelligence & data management.  It includes the most popular numerical and
scientific libraries used by scientists, engineers, and data analysts, with
a single integrated and flexible installer.

Continuum Analytics offers Enterprise-level support for Anaconda, covering
both its open source libraries as well as the included commercial libraries
from Continuum.

For more information, to download a trial version of Anaconda Pro, or
download the completely free Anaconda CE, click
here <https://store.continuum.io/cshop/anaconda>.

Best regards,

-Travis


[Numpy-discussion] A change with minor compatibility questions

2012-10-17 Thread Travis Oliphant

Hey all, 

https://github.com/numpy/numpy/pull/482

is a pull request that changes the hash function for numpy void scalars.
These are the objects returned from fully indexing a structured array:
array[i] if array is a 1-d structured array.

Currently their hash function just hashes the pointer to the underlying data.
This means that void scalars can be used as keys in a dictionary but the
behavior is non-intuitive because another void scalar with the same data but
pointing to a different region of memory will hash differently.

The pull request makes it so that two void scalars with the same data will hash
to the same value (using the same algorithm as a tuple hash). This pull
request also only allows read-only scalars to be hashed.

There is a small chance this will break someone's code if they relied on this 
behavior.  I don't believe anyone is currently relying on this behavior -- but 
I've been proven wrong before.   What do people on this list think?   

Should we raise a warning in the next release when a hash function on a void
scalar is called, or just make the change, put it in the release notes, and make
a few people change their code if needed? The problem was identified by a
couple of current NumPy users, which is why I think that people who have
tried using numpy void scalars as keys aren't doing it right now but are
instead converting them to tuples first.
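
For anyone unfamiliar with that workaround, a minimal sketch of the
convert-to-tuple approach follows. The field names and values are made up for
illustration; it relies only on .item() returning a plain Python tuple for a
structured scalar, which does not depend on the outcome of the pull request:

```python
import numpy as np

# Two rows hold the same data but live at different memory locations.
arr = np.zeros(3, dtype=[("a", np.int64), ("b", np.float64)])
arr[0] = (1, 2.5)
arr[2] = (1, 2.5)

counts = {}
for row in arr:
    key = row.item()  # plain Python tuple, hashed by value
    counts[key] = counts.get(key, 0) + 1

print(counts)
```

Because the key is a copy of the data, later writes to the array cannot change
the hash of something already stored in the dictionary.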

-Travis


Re: [Numpy-discussion] A change with minor compatibility questions

2012-10-17 Thread Travis Oliphant


On Oct 17, 2012, at 12:48 PM, Dag Sverre Seljebotn wrote:

 On 10/17/2012 06:56 PM, Dag Sverre Seljebotn wrote:
 On 10/17/2012 05:22 PM, Travis Oliphant wrote:
 Hey all,
 
 https://github.com/numpy/numpy/pull/482
 
 is  a pull request that changes the hash function for numpy void
 scalars.   These are the objects returned from fully indexing a
 structured array:  array[i] if array is a 1-d structured array.
 
 Currently their hash function just hashes the pointer to the underlying
 data. This means that void scalars can be used as keys in a
 dictionary but the behavior is non-intuitive because another void scalar
 with the same data but pointing to a different region of memory will
 hash differently.
 
 The pull request makes it so that two void scalars with the same data
 will hash to the same value (using the same algorithm as a tuple hash).
 This pull request also only allows read-only scalars to be hashed.
 
 There is a small chance this will break someone's code if they relied on
 this behavior.  I don't believe anyone is currently relying on this
 behavior -- but I've been proven wrong before.   What do people on this
 list think?
 
 I support working on fixing this, but if I understand your fix correctly
 this change just breaks things in a different way.
 
 Specifically, in this example:
 
 arr = np.ones(4, dtype=[('a', np.int64)])
 x = arr[0]
 d = { x : 'value' }
 arr[0]['a'] = 4
 print d[x]
 
 Does the last line raise a KeyError? If I understand correctly it does.
 
 Argh. I overlooked both Travis' second commit, and the explicit mention 
 of read-only above.
 
 Isn't it possible to produce a read-only array from a writeable one 
 though, and so get a read-only scalar whose underlying value can still 
 change?

Yes, it is possible to do that (just like it is currently possible to change a 
tuple with a C-extension or even Cython or a string with NumPy). 

We won't be able to prevent people from writing code that will have odd 
behavior, but we can communicate correctly about what one should do. 
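
A short sketch of the situation Dag describes, using a plain integer array
instead of a structured one to keep it brief. The variable names are
illustrative only:

```python
import numpy as np

arr = np.ones(4, dtype=np.int64)

# A read-only view shares memory with the writeable original.
ro = arr.view()
ro.flags.writeable = False

# Writing through the view is refused...
try:
    ro[0] = 7
    write_refused = False
except ValueError:
    write_refused = True

# ...but the data can still change underneath it via the original array.
arr[0] = 4
print(write_refused, ro[0])
```

So the read-only flag protects against writes through that particular object,
not against changes to the underlying buffer.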

-Travis



 
 Anyway, sorry about being so quick to post.
 
 Dag Sverre

Re: [Numpy-discussion] Behavior of .base

2012-10-01 Thread Travis Oliphant


On Oct 1, 2012, at 9:11 AM, Jim Bosch wrote:

 On 09/30/2012 03:59 PM, Travis Oliphant wrote:
 Hey all,
 
 In a github-discussion with Gael and Nathaniel, we came up with a proposal 
 for .base that we should put before this list. Traditionally, .base has 
 always pointed to None for arrays that owned their own memory and to the 
 most immediate array object parent for arrays that did not own their own 
 memory.   There was a long-standing issue related to running out of stack 
 space that this behavior created.
 
 Recently this behavior was altered so that .base always points to the 
 original object holding the memory (something exposing the buffer 
 interface).   This created some problems for users who relied on the fact 
 that most of the time .base pointed to an instance of an array object.
 
 The proposal here is to change the behavior of .base for arrays that don't 
 own their own memory so that the .base attribute of an array points to the 
 most original object that is still an instance of the type of the array.
   This would go into the 1.7.0 release so as to correct the issues reported.
 
 What are reactions to this proposal?
 
 
 In the past, I've relied on putting arbitrary Python objects in .base in 
 my C++ to NumPy conversion code to make sure reference counting for 
 array memory works properly.  In particular, I've used Python CObjects 
 that hold boost::shared_ptrs, which don't even have a buffer interface. 
  So it sounds like I may be a few steps behind on the rules of what 
 actually should go in .base.
 

This should still work, nothing has been proposed to change this use-case.

 I'm very concerned that if we do demand that .base always point to a 
 NumPy array (rather than an arbitrary Python object or even just one 
 with a buffer interface), there's no longer any way for a NumPy array to 
 hold data allocated by something other than NumPy.

I don't recall a suggestion to demand that .base always point to a NumPy array.
The suggestion is that a view of a view of an array that has your
boost::shared_ptr as a PyCObject pointed to by base will have its base point
to the first array instead of the PyCObject (which is where the recent change
made it point).
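
To make the view-of-a-view case concrete, here is a small sketch. It is only an illustration: `owner` is a hypothetical stand-in for an external memory owner (a bytearray rather than a PyCObject), and the printed results assume a NumPy release that implements the behavior proposed in this thread (1.7 or later).

```python
import numpy as np

# Hypothetical external owner of the memory: any object exposing the
# buffer interface stands in here for the PyCObject discussed above.
owner = bytearray(16)

# First array wrapping the external memory; it does not own its data.
first = np.frombuffer(owner, dtype=np.uint8)

# A view of a view of that array.
view = first[2:]
view_of_view = view[1:]

# .base stops at the first array instead of walking all the way down
# to the non-array object that actually holds the memory.
print(view_of_view.base is first)            # True
print(isinstance(first.base, np.ndarray))    # False
print(first.flags.owndata)                   # False
```

The first array still keeps the external owner alive through its own .base, so the reference-counting use case is unaffected.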

 
 If I want to put external memory in a NumPy array and indicate that it's 
 owned by some non-NumPy Python object, what is the recommended way to do 
 that?

The approach you took is still the way I would recommend doing that.  There may 
be other suggestions. 

-Travis

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

[Numpy-discussion] Behavior of .base

2012-09-30 Thread Travis Oliphant

Hey all, 

In a github-discussion with Gael and Nathaniel, we came up with a proposal for 
.base that we should put before this list.  Traditionally, .base has always 
pointed to None for arrays that owned their own memory and to the most 
immediate array object parent for arrays that did not own their own memory.   
There was a long-standing issue related to running out of stack space that this 
behavior created. 

Recently this behavior was altered so that .base always points to the 
original object holding the memory (something exposing the buffer interface).  
 This created some problems for users who relied on the fact that most of the 
time .base pointed to an instance of an array object. 

The proposal here is to change the behavior of .base for arrays that don't own 
their own memory so that the .base attribute of an array points to the most 
original object that is still an instance of the type of the array.  This 
would go into the 1.7.0 release so as to correct the issues reported.

What are reactions to this proposal? 

-Travis


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Behavior of .base

2012-09-30 Thread Travis Oliphant

We are not talking about changing it back.  The change in 1.6 caused problems 
that need to be addressed.

Can you clarify your concerns?  The proposal is not a major change to the 
behavior on master, but it does fix a real issue.

--
Travis Oliphant
(on a mobile)
512-826-7480


On Sep 30, 2012, at 3:30 PM, Han Genuit hangen...@gmail.com wrote:

 On Sun, Sep 30, 2012 at 9:59 PM, Travis Oliphant tra...@continuum.io wrote:
 Hey all,
 
 In a github-discussion with Gael and Nathaniel, we came up with a proposal 
 for .base that we should put before this list.  Traditionally, .base has 
 always pointed to None for arrays that owned their own memory and to the 
 most immediate array object parent for arrays that did not own their own 
 memory.   There was a long-standing issue related to running out of stack 
 space that this behavior created.
 
 Recently this behavior was altered so that .base always points to the 
 original object holding the memory (something exposing the buffer 
 interface).   This created some problems for users who relied on the fact 
 that most of the time .base pointed to an instance of an array object.
 
 The proposal here is to change the behavior of .base for arrays that don't 
 own their own memory so that the .base attribute of an array points to the 
 most original object that is still an instance of the type of the array.
   This would go into the 1.7.0 release so as to correct the issues reported.
 
 What are reactions to this proposal?
 
 -Travis
 
 
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion
 
 I think the current behaviour of the .base attribute is much more
 stable and predictable than past behaviour. For views for instance,
 this makes sure you don't hold references of 'intermediate' views, but
 always point to the original *base* object. Also, I think a lot of
 internal logic depends on this behaviour, so I am not in favour of
 changing this back (yet) again.
 
 Also, considering that this behaviour already exists in past versions
 of NumPy, namely 1.6, and is very fundamental to how arrays work, I
 find it strange that it is now up for change in 1.7 at the last
 minute.
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Behavior of .base

2012-09-30 Thread Travis Oliphant

I think you are misunderstanding the proposal.   The proposal is to traverse 
the views as far as you can but stop just short of having base point to an 
object of a different type.

This fixes the infinite chain of views problem but also fixes the problem 
sklearn was having with base pointing to an unexpected mmap object.

--
Travis Oliphant
(on a mobile)
512-826-7480


On Sep 30, 2012, at 3:50 PM, Han Genuit hangen...@gmail.com wrote:

 On Sun, Sep 30, 2012 at 10:35 PM, Travis Oliphant tra...@continuum.io wrote:
 We are not talking about changing it back.  The change in 1.6 caused 
 problems that need to be addressed.
 
 Can you clarify your concerns?  The proposal is not a major change to the 
 behavior on master, but it does fix a real issue.
 
 --
 Travis Oliphant
 (on a mobile)
 512-826-7480
 
 
 On Sep 30, 2012, at 3:30 PM, Han Genuit hangen...@gmail.com wrote:
 
 On Sun, Sep 30, 2012 at 9:59 PM, Travis Oliphant tra...@continuum.io 
 wrote:
 Hey all,
 
 In a github-discussion with Gael and Nathaniel, we came up with a proposal 
 for .base that we should put before this list.  Traditionally, .base has 
 always pointed to None for arrays that owned their own memory and to the 
 most immediate array object parent for arrays that did not own their own 
 memory.   There was a long-standing issue related to running out of stack 
 space that this behavior created.
 
 Recently this behavior was altered so that .base always points to the 
 original object holding the memory (something exposing the buffer 
 interface).   This created some problems for users who relied on the fact 
 that most of the time .base pointed to an instance of an array object.
 
 The proposal here is to change the behavior of .base for arrays that don't 
 own their own memory so that the .base attribute of an array points to 
 the most original object that is still an instance of the type of the 
 array.  This would go into the 1.7.0 release so as to correct the 
 issues reported.
 
 What are reactions to this proposal?
 
 -Travis
 
 
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion
 
 I think the current behaviour of the .base attribute is much more
 stable and predictable than past behaviour. For views for instance,
 this makes sure you don't hold references of 'intermediate' views, but
 always point to the original *base* object. Also, I think a lot of
 internal logic depends on this behaviour, so I am not in favour of
 changing this back (yet) again.
 
 Also, considering that this behaviour already exists in past versions
 of NumPy, namely 1.6, and is very fundamental to how arrays work, I
 find it strange that it is now up for change in 1.7 at the last
 minute.
 
 
 Well, the current behaviour makes sure you can have an endless chain
 of views derived from each other without keeping a copy of each view
 alive. If I understand correctly, you propose to change this behaviour
 to where it would keep a copy of each view alive.. My concern is that
 the problems that occurred from the 1.6 change are now seen as
 paramount above a correct implementation. There are problems with
 backward compatibility, but most of these are due to lack of
 documentation and testing. And now there will be a lot of people
 depending on the new behaviour, which is also something to take into
 account.
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Behavior of .base

2012-09-30 Thread Travis Oliphant



--
Travis Oliphant
(on a mobile)
512-826-7480


On Sep 30, 2012, at 4:00 PM, Han Genuit hangen...@gmail.com wrote:

 On Sun, Sep 30, 2012 at 10:55 PM, Travis Oliphant tra...@continuum.io wrote:
 I think you are misunderstanding the proposal.   The proposal is to traverse 
 the views as far as you can but stop just short of having base point to an 
 object of a different type.
 
 This fixes the infinite chain of views problem but also fixes the problem 
 sklearn was having with base pointing to an unexpected mmap object.
 
 --
 Travis Oliphant
 (on a mobile)
 512-826-7480
 
 
 On Sep 30, 2012, at 3:50 PM, Han Genuit hangen...@gmail.com wrote:
 
 On Sun, Sep 30, 2012 at 10:35 PM, Travis Oliphant tra...@continuum.io 
 wrote:
 We are not talking about changing it back.  The change in 1.6 caused 
 problems that need to be addressed.
 
 Can you clarify your concerns?  The proposal is not a major change to the 
 behavior on master, but it does fix a real issue.
 
 --
 Travis Oliphant
 (on a mobile)
 512-826-7480
 
 
 On Sep 30, 2012, at 3:30 PM, Han Genuit hangen...@gmail.com wrote:
 
 On Sun, Sep 30, 2012 at 9:59 PM, Travis Oliphant tra...@continuum.io 
 wrote:
 Hey all,
 
 In a github-discussion with Gael and Nathaniel, we came up with a 
 proposal for .base that we should put before this list.
 Traditionally, .base has always pointed to None for arrays that owned 
 their own memory and to the most immediate array object parent for 
 arrays that did not own their own memory.   There was a long-standing 
 issue related to running out of stack space that this behavior created.
 
 Recently this behavior was altered so that .base always points to the 
 original object holding the memory (something exposing the buffer 
 interface).   This created some problems for users who relied on the 
 fact that most of the time .base pointed to an instance of an array 
 object.
 
 The proposal here is to change the behavior of .base for arrays that 
 don't own their own memory so that the .base attribute of an array 
 points to the most original object that is still an instance of the 
 type of the array.  This would go into the 1.7.0 release so as to 
 correct the issues reported.
 
 What are reactions to this proposal?
 
 -Travis
 
 
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion
 
 I think the current behaviour of the .base attribute is much more
 stable and predictable than past behaviour. For views for instance,
 this makes sure you don't hold references of 'intermediate' views, but
 always point to the original *base* object. Also, I think a lot of
 internal logic depends on this behaviour, so I am not in favour of
 changing this back (yet) again.
 
 Also, considering that this behaviour already exists in past versions
 of NumPy, namely 1.6, and is very fundamental to how arrays work, I
 find it strange that it is now up for change in 1.7 at the last
 minute.
 
 
 Well, the current behaviour makes sure you can have an endless chain
 of views derived from each other without keeping a copy of each view
 alive. If I understand correctly, you propose to change this behaviour
 to where it would keep a copy of each view alive.. My concern is that
 the problems that occurred from the 1.6 change are now seen as
 paramount above a correct implementation. There are problems with
 backward compatibility, but most of these are due to lack of
 documentation and testing. And now there will be a lot of people
 depending on the new behaviour, which is also something to take into
 account.
 
 
 Ah, sorry, I get it. You mean to make sure that base is an object of
 type ndarray. No problems there. :-)

Yes.  Exactly.   I realize I didn't explain it very well.  For a subtype it 
would ensure base is a subtype. 
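
A minimal sketch of the subtype case, assuming a NumPy release that implements this proposal (1.7 or later); `MyArray` is a hypothetical subclass used only for illustration:

```python
import numpy as np


class MyArray(np.ndarray):
    """Hypothetical ndarray subclass, used only for illustration."""


plain = np.arange(10)
sub = plain.view(MyArray)     # subclass view onto a plain ndarray
nested = sub[1:][2:]          # view of a view of the subclass

# The chain of views is collapsed, but only as far as the last object
# that is still of the same type as the array itself.
print(nested.base is sub)               # True
print(type(nested.base) is MyArray)     # True
print(sub.base is plain)                # True
```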

Thanks for feedback.

Travis 


 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Behavior of .base

2012-09-30 Thread Travis Oliphant

It sounds like there are no objections and this has a strong chance of fixing
the problems.  We will put it on the TODO list for the 1.7.0 release.

-Travis




On Sep 30, 2012, at 9:30 PM, Charles R Harris wrote:

 
 
 On Sun, Sep 30, 2012 at 1:59 PM, Travis Oliphant tra...@continuum.io wrote:
 Hey all,
 
 In a github-discussion with Gael and Nathaniel, we came up with a proposal 
 for .base that we should put before this list.  Traditionally, .base has 
 always pointed to None for arrays that owned their own memory and to the 
 most immediate array object parent for arrays that did not own their own 
 memory.   There was a long-standing issue related to running out of stack 
 space that this behavior created.
 
 Recently this behavior was altered so that .base always points to the 
 original object holding the memory (something exposing the buffer 
 interface).   This created some problems for users who relied on the fact 
 that most of the time .base pointed to an instance of an array object.
 
 The proposal here is to change the behavior of .base for arrays that don't 
 own their own memory so that the .base attribute of an array points to the 
 most original object that is still an instance of the type of the array. 
  This would go into the 1.7.0 release so as to correct the issues reported.
 
 What are reactions to this proposal?
 
 
 It sounds like this would solve the problem in the short term, but it is a 
 bit of a hack in that the behaviour is more complicated than either the 
 original or the current version. So I could see this in 1.7, but it might be 
 preferable in the long term to work out what attributes are needed to solve 
 Gael's problem more directly.
 
 Chuck 
 
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Making numpy sensible: backward compatibility please

2012-09-28 Thread Travis Oliphant

Thank you for expressing this voice, Gael.  It is an important perspective.
The main reason that 1.7 has taken so long to get released is that I'm
concerned about these kinds of changes and really want to either remove them or
put in adequate warnings prior to moving forward.

It's a long and complex process. Thanks for providing feedback when you
encounter problems so that we can do our best to address them. I agree that
we should be much more cautious about semantic changes in the 1.X series of
NumPy.  How we handle situations where 1.6 changed things from 1.5 and the
change wasn't reported until now is an open question and depends on the
particular problem in question.  I agree that we should be much more cautious
about changes (particularly semantic changes that will break existing code).

-Travis

On Sep 28, 2012, at 4:23 PM, Gael Varoquaux wrote:

Hi numpy developers,

First of all, thanks a lot for the hard work you put in numpy. I know
very well that maintaining such a core library is a lot of effort and a
service to the community. But with great dedication, comes great
responsibility :).

I find that Numpy is a bit of a wild horse, a moving target. I have just
fixed a fairly nasty bug in scikit-learn [1] that was introduced by a
change of semantics in ordering when doing copies with numpy. I have been
running, working on, and developing scikit-learn while tracking numpy's
development tree and, as far as I can tell, I never saw warnings raised
in our code that something was going to change, or had changed.

In other settings, changes in array inheritance and 'base' propagation
have made impossible some of our memmap-related use cases that used to work
under previous numpy [2]. Others have been hitting difficulties related
to these changes in behavior [3]. Not to mention the new casting rules
(default: 'same_kind') that break a lot of code, or the ABI change that,
while not done on purpose, ended up causing us a lot of pain.

My point here is that having code that works and gives correct results
with new releases of numpy is more challenging than it should be. I
cannot claim that I disagree with the changes that I mention above. They
were all implemented for a good reason and can all be considered as
overall improvements to numpy. However the situation is that given a
complex codebase relying on numpy that works at a time t, the chances
that it works flawlessly at time t + 1y are thin. I am not too proud that
we managed to release scikit-learn 0.12 with a very ugly bug under numpy
1.7. That happened although we have 90% of test coverage, buildbots under
different numpy versions, and a lot of people, including me, using our
development tree on a day to day basis with bleeding edge numpy. Most
code in research settings or R&D industry does not benefit from such
software engineering and I believe is much more likely to suffer from
changes in numpy.

I think that this is a cultural issue: priority is not given to stability
and backward compatibility. I think that this culture is very much
ingrained in the Python world, that likes iteratively cleaning its
software design. For instance, I have the feeling that in the
scikit-learn, we probably fall in the same trap. That said, such a
behavior cannot fare well for a base scientific environment. People tell
me that if they take old matlab code, the odds that it will still work
are much higher than with Python code. As a geek, I tend to reply that we
get a lot out of this mobility, because we accumulate less cruft.
However, in research settings, for reproducibility reasons, one needs to
be able to pick up an old codebase and trust its results without knowing
its intricacies.

From a practical standpoint, I believe that people implementing large
changes to the numpy codebase, or any other core scipy package, should
think really hard about their impact. I do realise that the changes are
discussed on the mailing lists, but there is a lot of activity to follow
and I don't believe that it is possible for many of us to monitor the
discussions. Also, putting more emphasis on backward compatibility is
possible. For instance, the 'order' parameter added to np.copy could have
defaulted to the old behavior, 'K', for a year, with a
DeprecationWarning, same thing for the casting rules.

Thank you for reading this long email. I don't mean it to be a complaint
about the past, but more a suggestion on something to keep in mind when
making changes to core projects.

Cheers,

Gaël

[1]
https://github.com/scikit-learn/scikit-learn/commit/7842748cf777412c506a8c0ed28090711d3a3783

[2]
http://mail.scipy.org/pipermail/numpy-discussion/2012-September/063985.html

[3] http://mail.scipy.org/pipermail/numpy-discussion/2012-July/063126.html

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org

Re: [Numpy-discussion] Making numpy sensible: backward compatibility please

2012-09-28 Thread Travis Oliphant


On Sep 28, 2012, at 4:53 PM, Henry Gomersall wrote:

 On Fri, 2012-09-28 at 16:43 -0500, Travis Oliphant wrote:
 I agree that we should be much more cautious about semantic changes in
 the 1.X series of NumPy.  How we handle situations where 1.6 changed
 things from 1.5 and wasn't reported until now is an open question and
 depends on the particular problem in question.  I agree that we
 should be much more cautious about changes (particularly semantic
 changes that will break existing code). 
 
 One thing I noticed in my (short and shallow) foray into numpy
 development was the rather limited scope of the tests in the area I
 touched (fft). I know not the extent to which this is true across the
 code base, but I know from experience the value of a truly exhaustive
 test set (every line tested for every condition). Perhaps someone with a
 deeper knowledge could comment on this?

Thank you for bringing this up.  It is definitely a huge flaw of NumPy that it
does not have more extensive testing.  It is a result of the limited resources
under which NumPy has been developed.  We are trying to correct this problem
over time --- but it takes time.  In the meantime, there is a huge install
base of code out there which acts as a de-facto test suite of NumPy.   We just
need to make sure those tests actually get run on new versions of NumPy and we
get reports back of failures --- especially when subtle changes have taken
place in the way things work (iteration in ufuncs and coercion rules being the
most obvious).   This results in longer release cycles if releases contain code
that significantly changes the way things work (removed APIs, altered coercion
rules, etc.)

The alteration of the semantics of how the base attribute works is a good 
example.  Everyone felt it was a good idea to have the .base attribute point to 
the actual object holding the memory (and it fixed a well-known example of how 
you could crash Python by building up a stack of array-object references). 
However, our fix created a problem for code that uses memmap objects and relied 
on the fact that the .base attribute would hold a reference to the most recent 
*memmap* object.   This was an unforeseen problem with our change.   

On the other hand, change is a good thing and we don't want NumPy to stop 
getting improvements.   We just have to be careful that we don't allow our 
enthusiasm for new features and changes to over-ride our responsibility to 
end-users.   I appreciate the efforts of all the NumPy developers in working 
through the inevitable debates that differences in perspective on that 
fundamental trade-off will bring.  

Best, 

-Travis


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Making numpy sensible: backward compatibility please

2012-09-28 Thread Travis Oliphant

Maybe it still can, but you have to tell us details :-)

In general numpy development just needs more people keeping track of these
things. If you want to keep an open source stack functional sometimes you
have to pay a tax of your time to making sure the levels below you will
continue to suit your needs.

Thanks for the thorough and thoughtful response. Well spoken...

-Travis

-n

Thank you for reading this long email. I don't mean it to be a complaint
about the past, but more a suggestion on something to keep in mind when
making changes to core projects.

Cheers,

Gaël

[1]
https://github.com/scikit-learn/scikit-learn/commit/7842748cf777412c506a8c0ed28090711d3a3783

[2]
http://mail.scipy.org/pipermail/numpy-discussion/2012-September/063985.html

[3] http://mail.scipy.org/pipermail/numpy-discussion/2012-July/063126.html

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] np.array execution path

2012-09-22 Thread Travis Oliphant

Check to see if this expression is true

no is o

In the first case no and o are the same object
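
A quick way to run that check, as a sketch using the same names as the session quoted below:

```python
import numpy as np

o = np.ones(3)
no = np.asarray(o, order='C')

# o is already C-contiguous, so asarray hands back the very same object
# instead of creating a new array or a view.
print(no is o)             # True

# That is why OWNDATA is set on no and why writing to no changes o.
print(no.flags.owndata)    # True
no[:] = 10
print(o[0] == 10)          # True
```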


Travis 

--
Travis Oliphant
(on a mobile)
512-826-7480


On Sep 22, 2012, at 1:01 PM, Sebastian Berg sebast...@sipsolutions.net wrote:

 Hi,
 
 I have a bit of trouble figuring this out. I would have expected
 np.asarray(array) to go through ctors, PyArray_NewFromArray, but it
 seems to me it does not, so which execution path is exactly taken here?
 The reason I am asking is that I want to figure out this behavior/bug,
 and I really am not sure which function is responsible:
 
 In [69]: o = np.ones(3)
 
 In [70]: no = np.asarray(o, order='C')
 
 In [71]: no[:] = 10
 
 In [72]: o # OK, o was changed in place:
 Out[72]: array([ 10.,  10.,  10.])
 
 In [73]: no.flags # But no claims to own its data!
 Out[73]: 
  C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False
 
 In [74]: no = np.asarray(o, order='F')
 
 In [75]: no[:] = 11
 
 In [76]: o # Here asarray actually returned a real copy!
 Out[76]: array([ 10.,  10.,  10.])
 
 
 Thanks,
 
 Sebastian
 
 
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] specifying numpy as dependency in your project, install_requires

2012-09-21 Thread Travis Oliphant


On Sep 21, 2012, at 3:13 PM, Ralf Gommers wrote:

 Hi,
 
 An issue I keep running into is that packages use:
 install_requires = ["numpy"]
 or
 install_requires = ['numpy >= 1.6']
 
 in their setup.py. This simply doesn't work a lot of the time. I actually 
 filed a bug against patsy for that 
 (https://github.com/pydata/patsy/issues/5), but Nathaniel is right that it 
 would be better to bring it up on this list.
 
 The problem is that if you use pip, it doesn't detect numpy (may work better 
 if you had installed numpy with setuptools) and tries to automatically 
 install or upgrade numpy. That won't work if users don't have the right 
 compiler. Just as bad would be that it does work, and the user didn't want to 
 upgrade for whatever reason.
 
 This isn't just my problem; at Wes' pandas tutorial at EuroScipy I saw other 
 people have the exact same problem. My recommendation would be to not use 
 install_requires for numpy, but simply do something like this in setup.py:
 
 try:
     import numpy
 except ImportError:
     raise ImportError("my_package requires numpy")
 
 or 
 
 try:
     from numpy.version import short_version as npversion
 except ImportError:
     raise ImportError("my_package requires numpy")
 if npversion < '1.6':
     raise ImportError("Numpy version is %s; required is version >= 1.6" % 
                       npversion)
 
 Any objections, better ideas? Is there a good place to put it in the numpy 
 docs somewhere?

I agree.   I would recommend against using install_requires.   
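
One caveat with the version check quoted above: comparing version strings directly misorders components with two digits, so '1.10' sorts before '1.6'. A possible variant is sketched below. It is not from the thread; `release_tuple` and `require_numpy` are hypothetical helper names, and `my_package` is the same placeholder used in the quoted snippet.

```python
import re


def release_tuple(version):
    """Turn the leading 'X.Y.Z' part of a version string into a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError("unrecognised version string: %r" % (version,))
    return tuple(int(part) for part in match.group(0).split("."))


def require_numpy(minimum=(1, 6)):
    """Raise ImportError unless an installed numpy is at least `minimum`."""
    try:
        import numpy
    except ImportError:
        raise ImportError("my_package requires numpy")
    if release_tuple(numpy.__version__) < minimum:
        raise ImportError(
            "Numpy version is %s; required is version >= %s"
            % (numpy.__version__, ".".join(str(n) for n in minimum))
        )


# Plain string comparison puts 1.10.0 before 1.6 ...
print("1.10.0" < "1.6")                                  # True
# ... while comparing integer tuples orders them correctly.
print(release_tuple("1.10.0") < release_tuple("1.6"))    # False
```

Calling `require_numpy()` near the top of setup.py would then play the same role as the snippets quoted above.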

-Travis


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] tests for casting table? (was: Numpy 1.7b1 API change cause big trouble)

2012-09-20 Thread Travis Oliphant

Here are a couple of scripts that might help (I used them to compare casting 
tables between various versions of NumPy): 

Casting Table Creation Script

import numpy as np

# The ufuncs NumPy uses to implement the Python numeric operators.
operators = np.set_numeric_ops().values()
types = '?bhilqpBHILQPfdgFDGO'
to_check = ['add', 'divide', 'minimum', 'maximum', 'remainder', 'true_divide',
            'logical_or', 'bitwise_or', 'right_shift', 'less', 'equal']
operators = [op for op in operators if op.__name__ in to_check]


def type_wrap(op):
    # Wrap op so that it returns the dtype character of its result,
    # or 'X' if the operation fails for this pair of operands.
    def func(obj1, obj2):
        try:
            result = op(obj1, obj2)
            char = result.dtype.char
        except Exception:
            char = 'X'
        return char

    return func

def coerce():
    result = {}
    for op in operators:
        d = {}
        name = op.__name__
        print name
        op = type_wrap(op)
        for type1 in types:
            s1 = np.dtype(type1).type(2)
            a1 = np.dtype(type1).type([1,2,3])
            for type2 in types:
                s2 = np.dtype(type2).type(1)
                a2 = np.dtype(type2).type([2,3,4])
                codes = []
                # scalar op scalar
                codes.append(op(s1, s2))
                # scalar op array
                codes.append(op(s1, a2))
                # array op scalar
                codes.append(op(a1, s2))
                # array op array
                codes.append(op(a1, a2))
                d[type1,type2] = codes
        result[name] = d

    #for check_key in to_check:
    #    for key in result.keys():
    #        if key == check_key:
    #            continue
    #        if result[key] == result[check_key]:
    #            del result[key]
    #assert set(result.keys()) == set(to_check)
    return result

import sys
if sys.maxint > 2**33:
    bits = 64
else:
    bits = 32

def write():
    import cPickle
    # Binary mode, since pickle protocol 2 is a binary format.
    f = open('coercion-%s-%sbit.pkl'%(np.__version__, bits),'wb')
    cPickle.dump(coerce(),f,protocol=2)
    f.close()

if __name__ == '__main__':
    write()





Comparison Script


import numpy as np


def compare(result1, result2):
    for op in result1.keys():
        print op
        if op not in result2:
            # Nothing to compare against; skip instead of raising KeyError.
            print op, " not in the second"
            continue
        table1 = result1[op]
        table2 = result2[op]
        if table1 == table2:
            print "Tables are the same"
        else:
            if set(table1.keys()) != set(table2.keys()):
                print "Keys are not the same"
                continue
            for key in table1.keys():
                if table1[key] != table2[key]:
                    print "Different at ", key, ": ", table1[key], table2[key]

import cPickle
import sys

if __name__ == '__main__':
    name1 = 'coercion-1.5.1-64bit.pkl'
    name2 = 'coercion-1.6.1-64bit.pkl'

    if len(sys.argv) > 1:
        name1 = 'coercion-%s-64bit.pkl' % sys.argv[1]
    if len(sys.argv) > 2:
        name2 = 'coercion-%s-64bit.pkl' % sys.argv[2]
    result1 = cPickle.load(open(name1, 'rb'))
    result2 = cPickle.load(open(name2, 'rb'))
    compare(result1, result2)
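
For readers on a current Python 3 and NumPy, here is a much smaller sketch of the same idea. It is not a port of the scripts above: it assumes `np.promote_types` as the source of truth, so it records type-level promotion only and does not capture the scalar-versus-array rules the scripts above exercise. The names `promotion_table` and `differences` are hypothetical.

```python
import numpy as np

# A subset of the type characters used in the scripts above.
types = '?bhilqBHILQfdgFDG'


def promotion_table(type_chars=types):
    """Map each (char1, char2) pair to the dtype char NumPy promotes it to."""
    table = {}
    for c1 in type_chars:
        for c2 in type_chars:
            promoted = np.promote_types(np.dtype(c1), np.dtype(c2))
            table[c1, c2] = promoted.char
    return table


def differences(table1, table2):
    """Return the keys whose entries differ between two tables."""
    return sorted(k for k in table1 if table1[k] != table2.get(k))


table = promotion_table()
print(table['h', 'f'])             # int16 with float32
print(table['f', 'D'])             # float32 with complex128
print(differences(table, table))   # []
```

Pickling `promotion_table()` under two NumPy versions and passing the results to `differences` gives a comparison analogous to the one the second script performs.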



On Sep 20, 2012, at 3:09 PM, Nathaniel Smith wrote:

 On Mon, Sep 17, 2012 at 10:22 AM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,
 
 On Sun, Sep 9, 2012 at 6:12 PM, Frédéric Bastien no...@nouiz.org wrote:
 The third is related to the change to the casting rules in numpy. Before,
 a scalar complex128 * vector float32 gave a vector of dtype
 complex128. Now it gives a vector of complex64. The reason is that now
 a scalar of a different category only changes the category, not the
 precision. I would consider it a must that we warn clearly about this
 interface change. Most people won't see it, but people who optimize
 their code heavily could depend on such a thing.
 
 It seems to me that it would be a very good idea to put the casting
 table results into the tests to make sure we are keeping track of this
 kind of thing.
 
 I'm happy to try to do it if no-one else more qualified has time.
 
 I haven't seen any PRs show up from anyone else in the last few days,
 and this would indeed be an excellent test to have, so that would be
 awesome.
 
 -n
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Regression: in-place operations (possibly intentional)

2012-09-18 Thread Travis Oliphant


On Sep 18, 2012, at 1:47 PM, Charles R Harris wrote:

 
 
 On Tue, Sep 18, 2012 at 11:39 AM, Benjamin Root ben.r...@ou.edu wrote:
 
 
 On Mon, Sep 17, 2012 at 9:33 PM, Charles R Harris charlesr.har...@gmail.com 
 wrote:
 
 
 On Mon, Sep 17, 2012 at 3:40 PM, Travis Oliphant tra...@continuum.io wrote:
 
 On Sep 17, 2012, at 8:42 AM, Benjamin Root wrote:
 
  Consider the following code:
 
  import numpy as np
  a = np.array([1, 2, 3, 4, 5], dtype=np.int16)
  a *= float(255) / 15
 
  In v1.6.x, this yields:
  array([17, 34, 51, 68, 85], dtype=int16)
 
  But in master, this throws an exception about failing to cast via same_kind.
 
  Note that numpy was smart about this operation before, consider:
  a = np.array([1, 2, 3, 4, 5], dtype=np.int16)
  a *= float(128) / 256
 
  yields:
  array([0, 1, 1, 2, 2], dtype=int16)
 
  Of course, this is different than if one does it in a non-in-place manner:
  np.array([1, 2, 3, 4, 5], dtype=np.int16) * 0.5
 
  which yields an array with floating point dtype in both versions.  I can 
  appreciate the arguments for preventing this kind of implicit casting 
  between non-same_kind dtypes, but I argue that because the operation is 
  in-place, then I (as the programmer) am explicitly stating that I desire to 
  utilize the current array to store the results of the operation, dtype and 
  all.  Obviously, we can't completely turn off this rule (for example, an 
  in-place addition between integer array and a datetime64 makes no sense), 
  but surely there is some sort of happy medium that would allow these sort 
  of operations to take place?
 
  Lastly, if it is determined that it is desirable to allow in-place 
  operations to continue working like they have before, I would like to see 
  such a fix in v1.7 because if it isn't in 1.7, then other libraries (such 
  as matplotlib, where this issue was first found) would have to change their 
  code anyway just to be compatible with numpy.
 
 I agree that in-place operations should allow different casting rules.  There 
 are different opinions on this, of course, but generally this is how NumPy 
 has worked in the past.
 
 We did decide to change the default casting rule to same_kind but making an 
 exception for in-place seems reasonable.
 
 I think that in these cases same_kind will flag what are most likely 
 programming errors and sloppy code. It is easy to be explicit and doing so 
 will make the code more readable because it will be immediately obvious what 
 the multiplicand is without the need to recall what the numpy casting rules 
 are in this exceptional case. IISTR several mentions of this before (Gael?), 
 and in some of those cases it turned out that bugs were being turned up. 
 Catching bugs with minimal effort is a good thing.
 
 Chuck 
 
 
 True, it is quite likely to be a programming error, but then again, there are 
 many cases where it isn't.  Is the problem strictly that we are trying to 
 downcast the float to an int, or is it that we are trying to downcast to a 
 lower precision?  Is there a way for one to explicitly relax the same_kind 
 restriction?
 
 I think the problem is down casting across kinds, with the result that floats 
 are truncated and the imaginary parts of complex values might be discarded. That 
 is, the value, not just the precision, of the rhs changes. So I'd favor an 
 explicit cast in code like this, i.e., cast the rhs to an integer.
 
 It is true that this forces downstream to code up to a higher standard, but I 
 don't see that as a bad thing, especially if it exposes bugs. And it isn't 
 difficult to fix.

Shouldn't we be issuing a warning, though, even if the desire is to change 
the casting rules? Requiring every codebase that breaks to be upgraded 
seems like a hard thing to ask of someone going straight from 1.6 to 1.7. 
That's what I'm opposed to.

All of these efforts move NumPy toward use as a library rather than the 
interactive environment where it started, which is a good direction to move. 
But managing this move in the context of a very large user community is the 
challenge we have.

-Travis
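
A warning-based transition of the kind asked for here could look roughly like the sketch below. It is not NumPy's implementation: the helper inplace_multiply and its warning text are hypothetical, and it assumes a NumPy version that supports the casting keyword on ufuncs and np.can_cast.

```python
import warnings

import numpy as np


def inplace_multiply(a, factor):
    """Hypothetical helper: multiply `a` in place, warning on a cross-kind cast."""
    factor_dtype = np.asarray(factor).dtype
    if not np.can_cast(factor_dtype, a.dtype, casting="same_kind"):
        warnings.warn(
            "in-place multiply casts %s to %s, which is not a same_kind cast"
            % (factor_dtype, a.dtype),
            DeprecationWarning,
            stacklevel=2,
        )
    # casting="unsafe" keeps the permissive behavior while the warning is shown.
    np.multiply(a, factor, out=a, casting="unsafe")
    return a


a = np.array([1, 2, 3, 4, 5], dtype=np.int16)
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    inplace_multiply(a, float(255) / 15)
print(a, len(caught))
```

The operation still succeeds, so existing code keeps running for a release while users are told what will change.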




 
 Chuck 
 

Re: [Numpy-discussion] Regression: in-place operations (possibly intentional)

2012-09-18 Thread Travis Oliphant


On Sep 18, 2012, at 2:44 PM, Charles R Harris wrote:

 
 
 On Tue, Sep 18, 2012 at 1:35 PM, Benjamin Root ben.r...@ou.edu wrote:
 
 
 On Tue, Sep 18, 2012 at 3:25 PM, Charles R Harris charlesr.har...@gmail.com 
 wrote:
 
 
 On Tue, Sep 18, 2012 at 1:13 PM, Benjamin Root ben.r...@ou.edu wrote:
 
 
 On Tue, Sep 18, 2012 at 2:47 PM, Charles R Harris charlesr.har...@gmail.com 
 wrote:
 
 
 On Tue, Sep 18, 2012 at 11:39 AM, Benjamin Root ben.r...@ou.edu wrote:
 
 
 On Mon, Sep 17, 2012 at 9:33 PM, Charles R Harris charlesr.har...@gmail.com 
 wrote:
 
 
 On Mon, Sep 17, 2012 at 3:40 PM, Travis Oliphant tra...@continuum.io wrote:
 
 On Sep 17, 2012, at 8:42 AM, Benjamin Root wrote:
 
  Consider the following code:
 
  import numpy as np
  a = np.array([1, 2, 3, 4, 5], dtype=np.int16)
  a *= float(255) / 15
 
  In v1.6.x, this yields:
  array([17, 34, 51, 68, 85], dtype=int16)
 
  But in master, this throws an exception about failing to cast via same_kind.
 
  Note that numpy was smart about this operation before, consider:
  a = np.array([1, 2, 3, 4, 5], dtype=np.int16)
  a *= float(128) / 256
 
  yields:
  array([0, 1, 1, 2, 2], dtype=int16)
 
  Of course, this is different than if one does it in a non-in-place manner:
  np.array([1, 2, 3, 4, 5], dtype=np.int16) * 0.5
 
  which yields an array with floating point dtype in both versions.  I can 
  appreciate the arguments for preventing this kind of implicit casting 
  between non-same_kind dtypes, but I argue that because the operation is 
  in-place, then I (as the programmer) am explicitly stating that I desire to 
  utilize the current array to store the results of the operation, dtype and 
  all.  Obviously, we can't completely turn off this rule (for example, an 
  in-place addition between integer array and a datetime64 makes no sense), 
  but surely there is some sort of happy medium that would allow these sort 
  of operations to take place?
 
  Lastly, if it is determined that it is desirable to allow in-place 
  operations to continue working like they have before, I would like to see 
  such a fix in v1.7 because if it isn't in 1.7, then other libraries (such 
  as matplotlib, where this issue was first found) would have to change their 
  code anyway just to be compatible with numpy.
 
 I agree that in-place operations should allow different casting rules.  There 
 are different opinions on this, of course, but generally this is how NumPy 
 has worked in the past.
 
 We did decide to change the default casting rule to same_kind but making an 
 exception for in-place seems reasonable.
 
 I think that in these cases same_kind will flag what are most likely 
 programming errors and sloppy code. It is easy to be explicit and doing so 
 will make the code more readable because it will be immediately obvious what 
 the multiplicand is without the need to recall what the numpy casting rules 
 are in this exceptional case. IISTR several mentions of this before (Gael?), 
 and in some of those cases it turned out that bugs were being turned up. 
 Catching bugs with minimal effort is a good thing.
 
 Chuck 
 
 
 True, it is quite likely to be a programming error, but then again, there are 
 many cases where it isn't.  Is the problem strictly that we are trying to 
 downcast the float to an int, or is it that we are trying to downcast to a 
 lower precision?  Is there a way for one to explicitly relax the same_kind 
 restriction?
 
 I think the problem is down casting across kinds, with the result that floats 
 are truncated and the imaginary parts of complex values might be discarded. That 
 is, the value, not just the precision, of the rhs changes. So I'd favor an 
 explicit cast in code like this, i.e., cast the rhs to an integer.
 
 It is true that this forces downstream to code up to a higher standard, but I 
 don't see that as a bad thing, especially if it exposes bugs. And it isn't 
 difficult to fix.
 
 Chuck 
 
 
 Mind you, in my case, casting the rhs as an integer before doing the 
 multiplication would be a bug, since our value for the rhs is usually between 
 zero and one.  Multiplying first by the integer numerator before dividing by 
 the integer denominator would likely cause issues with overflowing the 16 bit 
 integer.
 
 
 For the case in point I'd do
 
 In [1]: a = np.array([1, 2, 3, 4, 5], dtype=np.int16)
 
 In [2]: a //= 2 
 
 In [3]: a
 Out[3]: array([0, 1, 1, 2, 2], dtype=int16) 
 
 Although I expect you would want something different in practice. But the 
 current code already looks fragile to me and I think it is a good thing you 
 are taking a closer look at it. If you really intend going through a float, 
 then it should be something like
 
 a = (a*(float(128)/256)).astype(np.int16)
 
 Chuck
 
 
 And thereby losing the memory benefit of an in-place multiplication?
 
 What makes you think you are getting that? I'd have to check the numpy C 
 source, but I expect the multiplication is handled just as I wrote it out. I 
 don't recall any

Re: [Numpy-discussion] Regression: in-place operations (possibly intentional)

2012-09-18 Thread Travis Oliphant

 
   
 That is sort of the point of all this.  We are using 16 bit integers because 
 we wanted to be as efficient as possible and didn't need anything larger.  
 Note, that is what we changed the code to, I am just wondering if we are 
 being too cautious.  The casting kwarg looks to be what I might want, though 
 it isn't as clean as just writing an *= statement.
 
 
 I think even there you will have an intermediate float array followed by a 
 cast.
 
 This is true, but it is done in chunks of a fixed size (controllable by a 
 thread-local variable or keyword argument to the ufunc).
 
 How difficult would it be to change in-place operations back to the unsafe 
 default?
 
 Probably not too difficult, but I think it would be a mistake. What keyword 
 argument are you referring to? In the current case, I think what is wanted is 
 a scaling function that will actually do things in place. The matplotlib 
 folks would probably be happier with the result if they simply coded up a 
 couple of small Cython routines to do that.

http://docs.scipy.org/doc/numpy/reference/ufuncs.html#ufunc

In particular, the extobj keyword argument or the thread-local variable at 
umath.UFUNC_PYVALS_NAME

But the problem is not just for matplotlib. Matplotlib is showing a symptom 
of the problem of just changing the default casting mode in one release. I 
think this is too stark a change for a single minor release without some 
kind of glide path or warning system.

I think we need to change in-place multiplication back to unsafe and then put 
in the release notes that we are planning on changing this for 1.8.   It would 
be ideal if we could raise a warning when unsafe castings occur. 

-Travis
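
For reference, the casting keyword mentioned above can be combined with an explicit output array, as in this small sketch. The array values are taken from the example in the thread. How the intermediate values are buffered is as described in the discussion and is not verified here.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5], dtype=np.int16)

# Explicit ufunc call: results are written back into `a`, and the
# float -> int16 cast is allowed because casting="unsafe" is requested.
np.multiply(a, float(128) / 256, out=a, casting="unsafe")
print(a)

# The explicit alternative from the thread builds a full float temporary
# and returns a new array instead of reusing the original one.
b = np.array([1, 2, 3, 4, 5], dtype=np.int16)
c = (b * (float(128) / 256)).astype(np.int16)
print(c, c is b)

# Size, in elements, of the buffer ufuncs use for such operations.
print(np.getbufsize())
```

Both forms give the same values; the difference is whether a full-size float temporary and a new output array are allocated.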



Re: [Numpy-discussion] Regression: in-place operations (possibly intentional)

2012-09-17 Thread Travis Oliphant


On Sep 17, 2012, at 8:42 AM, Benjamin Root wrote:

 Consider the following code:
 
 import numpy as np
 a = np.array([1, 2, 3, 4, 5], dtype=np.int16)
 a *= float(255) / 15
 
 In v1.6.x, this yields:
 array([17, 34, 51, 68, 85], dtype=int16)
 
 But in master, this throws an exception about failing to cast via same_kind.
 
 Note that numpy was smart about this operation before, consider:
 a = np.array([1, 2, 3, 4, 5], dtype=np.int16)
 a *= float(128) / 256

 yields:
 array([0, 1, 1, 2, 2], dtype=int16)
 
 Of course, this is different than if one does it in a non-in-place manner:
 np.array([1, 2, 3, 4, 5], dtype=np.int16) * 0.5
 
 which yields an array with floating point dtype in both versions.  I can 
 appreciate the arguments for preventing this kind of implicit casting between 
 non-same_kind dtypes, but I argue that because the operation is in-place, 
 then I (as the programmer) am explicitly stating that I desire to utilize the 
 current array to store the results of the operation, dtype and all.  
 Obviously, we can't completely turn off this rule (for example, an in-place 
 addition between integer array and a datetime64 makes no sense), but surely 
 there is some sort of happy medium that would allow these sort of operations 
 to take place?
 
 Lastly, if it is determined that it is desirable to allow in-place operations 
 to continue working like they have before, I would like to see such a fix in 
 v1.7 because if it isn't in 1.7, then other libraries (such as matplotlib, 
 where this issue was first found) would have to change their code anyway just 
 to be compatible with numpy.

I agree that in-place operations should allow different casting rules.  There 
are different opinions on this, of course, but generally this is how NumPy has 
worked in the past.  

We did decide to change the default casting rule to same_kind but making an 
exception for in-place seems reasonable. 

-Travis
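
The two behaviors being compared can be checked with a short script such as the one below. It is only a sketch: whether the in-place statement raises depends on the NumPy version. The 1.6 series accepted it, while versions that apply same_kind to in-place operations raise a TypeError.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5], dtype=np.int16)

# Out-of-place: the result gets a floating point dtype.
out_of_place = a * 0.5
print(out_of_place.dtype.kind)

# In-place: the float result has to be cast back into the int16 array.
try:
    a *= float(255) / 15
    raised = False
except TypeError:
    # Seen when the output cast is checked under the same_kind rule.
    raised = True
print(raised, a.dtype)
```

Either way the array keeps its int16 dtype; the question in the thread is only whether the cast back is allowed silently.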





Re: [Numpy-discussion] Change in behavior of np.concatenate for upcoming release

2012-09-15 Thread Travis Oliphant

I was working on the same fix, so when I saw your code was similar I merged it. 
It needs to be back-ported to 1.7.0.

Thanks,

-Travis

On Sep 15, 2012, at 11:06 AM, Han Genuit wrote:

 Okay, sent in a pull request: https://github.com/numpy/numpy/pull/443.

Re: [Numpy-discussion] Change in behavior of np.concatenate for upcoming release

2012-09-15 Thread Travis Oliphant

It's very nice to get your help. I hope I haven't inappropriately set 
expectations :-)

-Travis

On Sep 15, 2012, at 3:14 PM, Han Genuit wrote:

 Yeah, that merge was fast. :-)
 
 Regards,
 Han

Re: [Numpy-discussion] Obscure code in concatenate code path?

2012-09-13 Thread Travis Oliphant


On Sep 13, 2012, at 8:40 AM, Nathaniel Smith wrote:

 On Thu, Sep 13, 2012 at 11:12 AM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,
 
 While writing some tests for np.concatenate, I ran foul of this code:
 
    if (axis >= NPY_MAXDIMS) {
        ret = PyArray_ConcatenateFlattenedArrays(narrays, arrays, NPY_CORDER);
    }
    else {
        ret = PyArray_ConcatenateArrays(narrays, arrays, axis);
    }
 
 in multiarraymodule.c
 
 How deeply weird


This is expected behavior. It's how the concatenate Python function manages 
to handle axis=None to flatten the arrays before concatenation. This has 
been in NumPy since 1.0 and should not be changed without deprecation warnings, 
which I am -0 on. 

Now, it is true that the C-API could have been written differently (I think 
this is what Mark was trying to encourage) so that there are two C-API 
functions and they are dispatched separately from the array_concatenate method 
depending on whether or not a None is passed in.   But, the behavior is 
documented and has been for a long time. 

Reference PyArray_AxisConverter (which turns a None Python argument into an 
axis=NPY_MAXDIMS). This is consistent behavior throughout the C-API. 

-Travis
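
At the Python level, the documented axis=None behavior referred to above looks like this. The arrays are made up for illustration.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6]])

# axis=None: each input is flattened before concatenation.
flat = np.concatenate([a, b], axis=None)
print(flat)            # [1 2 3 4 5 6]

# An integer axis concatenates along that dimension instead.
stacked = np.concatenate([a, b], axis=0)
print(stacked.shape)   # (3, 2)
```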






Re: [Numpy-discussion] Obscure code in concatenate code path?

2012-09-13 Thread Travis Oliphant

 
 
 This is expected behavior.   It's how the concatenate Python function 
 manages to handle axis=None to flatten the arrays before concatenation.
 This has been in NumPy since 1.0 and should not be changed without 
 deprecation warnings which I am -0 on.
 
 Now, it is true that the C-API could have been written differently (I think 
 this is what Mark was trying to encourage) so that there are two C-API 
 functions and they are dispatched separately from the array_concatenate 
 method depending on whether or not a None is passed in.   But, the behavior 
 is documented and has been for a long time.
 
 Reference PyArray_AxisConverter (which turns a None Python argument into 
 an axis=NPY_MAXDIMS). This is consistent behavior throughout the C-API.
 
 How about something like:
 
 #define NPY_NONE_AXIS NPY_MAXDIMS
 
 to make it clearer what is intended?

+1

-Travis
