[Numpy-discussion] Allowing slices as arguments for ndarray.take

2014-01-16 Thread Stephan Hoyer
There was a discussion last year about slicing along specified axes in numpy arrays: http://mail.scipy.org/pipermail/numpy-discussion/2012-April/061632.html I'm finding that slicing along specified axes is a common task for me when writing code to manipulate N-D arrays. The method ndarray.take ba

Re: [Numpy-discussion] [RFC] should we argue for a matrix power operator, @@?

2014-03-15 Thread Stephan Hoyer
Speaking only for myself (and as someone who has regularly used matrix powers), I would not expect matrix power as @@ to follow from matrix multiplication as @. I do agree that matrix power is the only reasonable use for @@ (given @), but it's still not something I would be confident enough to know

Re: [Numpy-discussion] Transparently reading complex arrays from netcdf4

2014-03-29 Thread Stephan Hoyer
Hi Glenn, My usual strategy for this sort of thing is to make a light-weight wrapper class which reads and converts values when you access them. For example: class WrapComplex(object): def __init__(self, nc_var): self.nc_var = nc_var def __getitem__(self, item): return se
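The wrapper idea described above can be sketched roughly as follows. The snippet in the archive is cut off, so this is not the original code: the storage layout (real and imaginary parts along a trailing axis of length 2) is an assumption made for illustration, and a plain ndarray stands in for the netCDF variable.

```python
import numpy as np


class WrapComplex:
    """Convert a real-valued variable to complex values on access.

    Assumption (not from the thread): the wrapped variable stores the
    real and imaginary parts along a trailing axis of length 2.
    """

    def __init__(self, nc_var):
        self.nc_var = nc_var

    def __getitem__(self, item):
        # Index the underlying variable first so that only the requested
        # values are read, then combine the two trailing components.
        raw = np.asarray(self.nc_var[item])
        return raw[..., 0] + 1j * raw[..., 1]


# A plain ndarray stands in for a netCDF4.Variable here.
stored = np.array([[1.0, 2.0], [3.0, -4.0]])
wrapped = WrapComplex(stored)
values = wrapped[:]
```

The sketch only handles indexers that leave the trailing axis intact; a fuller wrapper would also need to expose attributes such as shape and dtype.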

Re: [Numpy-discussion] Transparently reading complex arrays from netcdf4

2014-03-29 Thread Stephan Hoyer
t then also being smart about taking advantage of the mmap > when possible. But perhaps your solution is the best compromise. > > Thanks again, > Glenn > On Mar 29, 2014 10:59 PM, "Stephan Hoyer" wrote: > >> Hi Glenn, >> >> My usual strategy for this sort o

Re: [Numpy-discussion] Transparently reading complex arrays from netcdf4

2014-03-30 Thread Stephan Hoyer
the array interface is built in for free. > > Thanks, > Glenn > On Mar 30, 2014 2:18 AM, "Stephan Hoyer" wrote: > >> Hi Glenn, >> >> Here is a full example of how we wrap a netCDF4.Variable object, >> implementing all of its ndarray-like methods: >

Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-04-11 Thread Stephan Hoyer
On Fri, Apr 11, 2014 at 3:56 PM, Charles R Harris wrote: > Are we in a position to start looking at implementation? If so, it would > be useful to have a collection of test cases, i.e., typical uses with > specified results. That should also cover conversion from/(to?) > datetime.datetime. > Ind

Re: [Numpy-discussion] min depth to nonzero in 3d array

2014-04-17 Thread Stephan Hoyer
Hi Alan, You can abuse np.argmax to calculate the first nonzero element in a vectorized manner: import numpy as np A = (2 * np.random.rand(100, 50, 50)).astype(int) Compare: np.argmax(A != 0, axis=0) np.array([[np.flatnonzero(A[:,i,j])[0] for j in range(50)] for i in range(50)]) You'll also wa
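The comparison above can be checked with a small, self-contained sketch. The array shape and the seeded generator are choices made here for illustration; the last layer is forced to be nonzero so that the loop version never indexes an empty result.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(0, 2, size=(20, 5, 6))
A[-1] = 1  # guarantee at least one nonzero value in every column

# Vectorized: argmax returns the index of the first True along axis 0.
first_vectorized = np.argmax(A != 0, axis=0)

# Reference: explicit loop over the two trailing axes.
first_loop = np.array(
    [[np.flatnonzero(A[:, i, j])[0] for j in range(A.shape[2])]
     for i in range(A.shape[1])]
)

# argmax also returns 0 for columns that are entirely zero, so a mask
# like this one is needed to tell the two cases apart.
has_nonzero = (A != 0).any(axis=0)
```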

Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-04-18 Thread Stephan Hoyer
On Mon, Apr 14, 2014 at 11:59 AM, Chris Barker wrote: > > - datetime64 objects with high precision (e.g., ns) can't compare to >> datetime objects. >> > > That's a problem, but how do you think it should be handled? My thought is > that it should round to microseconds, and then compare -- kind of l

Re: [Numpy-discussion] Find Daily max - create lists using date and add hourly data to that list for the day

2014-05-21 Thread Stephan Hoyer
Hello anonymous, I recently wrote a package "xray" (http://xray.readthedocs.org/) specifically to make it easier to work with high-dimensional labeled data, as often found in NetCDF files. Xray has a groupby method for grouping over subsets of your data, which would seem well suited to what you're

Re: [Numpy-discussion] Fwd: [Python-ideas] PEP pre-draft: Support for indexing with keyword arguments

2014-07-02 Thread Stephan Hoyer
NumPy doesn't have named axes, but perhaps it should. See, for example, Fernando Perez's datarray prototype (https://github.com/fperez/datarray) or my project, xray (https://github.com/xray/xray). Syntactical support for indexing an axis by name would make using named axes much more readable. For

Re: [Numpy-discussion] Fast way to convert (nested) list to numpy object array?

2014-07-03 Thread Stephan Hoyer
On Thu, Jul 3, 2014 at 5:36 AM, Marc Hulsman wrote: > This can however go wrong. Say that we have nested variable length > lists, what sometimes happens is that part of the data has > (by chance) only fixed length nested lists, while another part has > variable length nested lists. If we then unp

Re: [Numpy-discussion] String type again.

2014-07-17 Thread Stephan Hoyer
On Mon, Jul 14, 2014 at 10:00 AM, Olivier Grisel wrote: > 2014-07-13 19:05 GMT+02:00 Alexander Belopolsky : > > I've been toying with the idea of creating an array type for interned > > strings. In many applications dealing with large arrays of variable size > > strings, the strings come from a

Re: [Numpy-discussion] Proposed new feature for numpy.einsum: repeated output subscripts as diagonal

2014-08-14 Thread Stephan Hoyer
I think this would be a very nice addition. On Thu, Aug 14, 2014 at 12:21 PM, Benjamin Root wrote: > You had me at Kronecker delta... :-) +1 > > > On Thu, Aug 14, 2014 at 3:07 PM, Pierre-Andre Noel < > noel.pierre.an...@gmail.com> wrote: > >> (I created issue 4965 earlier today on this topic, an

Re: [Numpy-discussion] Generalize hstack/vstack --> stack; Block matrices like in matlab

2014-09-08 Thread Stephan Hoyer
On Mon, Sep 8, 2014 at 10:00 AM, Benjamin Root wrote: > Btw, on a somewhat related note, whoever can implement ndarray to be able > to use views from other ndarrays stitched together would get a fruit basket > from me come the holidays and possibly naming rights for the next kid... > Ben, you sh

[Numpy-discussion] Custom dtypes without C -- or, a standard ndarray-like type

2014-09-21 Thread Stephan Hoyer
pandas has some hacks to support custom types of data that numpy can't handle well enough or at all. Examples include datetime and Categorical [1], and others like GeoArray [2] that haven't made it into pandas yet. Most of these look like numpy arrays but with custom dtypes and type specific

[Numpy-discussion] ANN: xray 0.3 released

2014-09-21 Thread Stephan Hoyer
I'm pleased to announce the v0.3 release for xray, N-D labeled arrays and datasets in Python. xray is an open source project and Python package that aims to bring the labeled data power of pandas to the physical sciences, by providing N-dimensional variants of the core pandas data structures, Seri

Re: [Numpy-discussion] Custom dtypes without C -- or, a standard ndarray-like type

2014-09-22 Thread Stephan Hoyer
On Sun, Sep 21, 2014 at 8:31 PM, Nathaniel Smith wrote: > For cases where people genuinely want to implement a new array-like > types (e.g. DataFrame or scipy.sparse) then numpy provides a fair > amount of support for this already (e.g., the various hooks that allow > things like np.asarray(mydf)

Re: [Numpy-discussion] Proposal: add ndarray.keys() to return dtype.names

2014-09-30 Thread Stephan Hoyer
I like this idea. But I am -1 on returning None if the array is unstructured. I expect .keys(), if present, to always return an iterable. In fact, this would break some of my existing code, which checks for the existence of "keys" as a way to do duck typed checks for dictionary like objects (e.g.,

Re: [Numpy-discussion] Proposal: add ndarray.keys() to return dtype.names

2014-09-30 Thread Stephan Hoyer
On Tue, Sep 30, 2014 at 1:22 PM, Eelco Hoogendoorn < hoogendoorn.ee...@gmail.com> wrote: > On more careful reading of your words, I think we agree; indeed, if keys() > is present it should return an iterable; but I don't think it should be > present for non-structured arrays. > Indeed, I think we

Re: [Numpy-discussion] 0/0 == 0?

2014-10-03 Thread Stephan Hoyer
On Thu, Oct 2, 2014 at 11:29 PM, Nathaniel Smith wrote: > The seterr warning system makes a lot of sense for IEEE754 floats, > which are specifically designed so that 0/0 has a unique well-defined > answer. For ints though this seems really broken to me. 0 / 0 = 0 is > just the wrong answer. It w

Re: [Numpy-discussion] use ufunc for arbitrary positional arguments?

2014-10-10 Thread Stephan Hoyer
On Fri, Oct 10, 2014 at 11:23 AM, Benjamin Root wrote: > I have a need to "and" together an arbitrary number of boolean arrays. > np.logical_and() expects only two positional arguments. There has got to be > some sort of easy way to just and these together using the ufunc mechanism, > right? > D
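The reply above is cut off in the archive, so the following is not necessarily what was suggested. One common way to "and" together any number of boolean arrays with the ufunc machinery is the reduce method of np.logical_and:

```python
import numpy as np

a = np.array([True, True, False, True])
b = np.array([True, False, False, True])
c = np.array([True, True, True, True])

# The list is stacked into a (3, 4) array and reduced along axis 0.
combined = np.logical_and.reduce([a, b, c])
```

This works for any number of equally shaped arrays; arrays of different but compatible shapes would have to be broadcast to a common shape first.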

Re: [Numpy-discussion] Request for enhancement to numpy.random.shuffle

2014-10-12 Thread Stephan Hoyer
On Sun, Oct 12, 2014 at 10:56 AM, Jaime Fernández del Río < jaime.f...@gmail.com> wrote: > > Just to add some noise to a productive conversation: if you add a 'copy' > flag to shuffle, then all the functionality is in one place, and > 'permutation' can either be deprecated, or trivially implemented

Re: [Numpy-discussion] numpy.mean slicing in a netCDF file

2014-10-14 Thread Stephan Hoyer
Hi Fadzil, My strong recommendation is that you don't just use numpy/netCDF4 to process your data, but rather use one of a multitude of packages that have been developed specifically to facilitate working with labeled data from netCDF files: - Iris: http://scitools.org.uk/iris/ - CDAT: http://uvcd

[Numpy-discussion] Add an axis argument to generalized ufuncs?

2014-10-17 Thread Stephan Hoyer
Yesterday I created a GitHub issue proposing adding an axis argument to numpy's gufuncs: https://github.com/numpy/numpy/issues/5197 I was told I should repost this on the mailing list, so here's the recap: I would like to write generalized ufuncs (probably using numba), to create fast functions s

Re: [Numpy-discussion] Add an axis argument to generalized ufuncs?

2014-10-19 Thread Stephan Hoyer
On Sat, Oct 18, 2014 at 6:46 PM, Nathaniel Smith wrote: > One thing we'll have to watch out for is that for reduction operations > (which are basically gufuncs with (n)->() signatures), we already > allow axis=(0,1) to mean "reshape axes 0 and 1 together into one big > axis, and then use that as

Re: [Numpy-discussion] Add an axis argument to generalized ufuncs?

2014-10-19 Thread Stephan Hoyer
On Sun, Oct 19, 2014 at 6:43 AM, Nathaniel Smith wrote: > I feel strongly that we should come up with a syntax that is > unambiguous even *without* looking at the gufunc signature. It's easy > for the computer to disambiguate stuff like this, but it'd be cruel to > ask people trying to skim throu

Re: [Numpy-discussion] Why ndarray provides four ways to flatten?

2014-10-28 Thread Stephan Hoyer
On Tue, Oct 28, 2014 at 10:25 AM, Nathaniel Smith wrote: > I too would be curious to know why .flat exists (beyond "it seemed like a > good idea at the time" ;-)). I've always treated it as some weird legacy > thing and ignored it, and this has worked out well for me. > > Is there any real proble

Re: [Numpy-discussion] Why ndarray provides four ways to flatten?

2014-10-29 Thread Stephan Hoyer
On Wed, Oct 29, 2014 at 2:16 AM, Sebastian Berg wrote: > On Di, 2014-10-28 at 14:03 -0400, Alan G Isaac wrote: > I don't really like flat (it is a pretty old part of numpy), but I > agree, while you can force nditer to be C-contiguous, nditer has its own > problems and is also pretty complex. I w

Re: [Numpy-discussion] Finding values in an array

2014-11-27 Thread Stephan Hoyer
On Thu, Nov 27, 2014 at 10:15 PM, Alexander Belopolsky wrote: > I probably miss something very basic, but how given two arrays a and b, > can I find positions in a where elements of b are located? If a were > sorted, I could use searchsorted, but I don't want to get valid positions > for element

[Numpy-discussion] Add a function to broadcast arrays to a given shape to numpy's stride_tricks?

2014-12-06 Thread Stephan Hoyer
I recently wrote a function to manually broadcast an ndarray to a given shape according to numpy's broadcasting rules (using strides): https://github.com/xray/xray/commit/7aee4a3ed2dfd3b9aff7f3c5c6c68d51df2e3ff3 The same functionality can be done pretty straightforwardly with np.broadcast_arrays, bu

Re: [Numpy-discussion] Add a function to broadcast arrays to a given shape to numpy's stride_tricks?

2014-12-10 Thread Stephan Hoyer
On Sun, Dec 7, 2014 at 11:31 PM, Pierre Haessig wrote: > Instead of putting this function in stride_tricks (which is quite > hidden), could it be added instead as a boolean flag to the existing > `reshape` method ? Something like: > > x.reshape(y.shape, broadcast=True) > > What other people think

Re: [Numpy-discussion] Add a function to broadcast arrays to a given shape to numpy's stride_tricks?

2014-12-10 Thread Stephan Hoyer
On Wed, Dec 10, 2014 at 4:00 PM, Nathaniel Smith wrote: > 2) Add a broadcast_to(arr, shape) function, which broadcasts the array > to exactly the shape given, or else errors out if this is not > possible. > I like np.broadcast_to as a new function. We can document it alongside broadcast and broa
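For reference, np.broadcast_to as it exists in current NumPy behaves as described in the proposal. A short sketch, with example shapes chosen here:

```python
import numpy as np

row = np.arange(3)

# Broadcast to exactly the requested shape; the result is a read-only
# view that repeats the data through a zero stride, without copying.
expanded = np.broadcast_to(row, (4, 3))

# An incompatible target shape is an error rather than a silent reshape.
try:
    np.broadcast_to(row, (4, 2))
    raised = False
except ValueError:
    raised = True
```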

Re: [Numpy-discussion] Add a function to broadcast arrays to a given shape to numpy's stride_tricks?

2014-12-11 Thread Stephan Hoyer
On Thu, Dec 11, 2014 at 8:17 AM, Sebastian Berg wrote: > One option > would also be to have something like: > > np.common_shape(*arrays) > np.broadcast_to(array, shape) > # (though I would like many arrays too) > > and then broadcast_arrays could be implemented in terms of these two. > It looks

Re: [Numpy-discussion] Add a function to broadcast arrays to a given shape to numpy's stride_tricks?

2014-12-12 Thread Stephan Hoyer
On Fri, Dec 12, 2014 at 5:48 AM, Jaime Fernández del Río < jaime.f...@gmail.com> wrote: > np.broadcast is the Python object of the old iterator. It may be a better > idea to write all of these functions using the new one, np.nditer: > > def common_shape(*args): > return np.nditer(args).shape[:

Re: [Numpy-discussion] Add a function to broadcast arrays to a given shape to numpy's stride_tricks?

2014-12-12 Thread Stephan Hoyer
On Fri, Dec 12, 2014 at 6:25 AM, Jaime Fernández del Río < jaime.f...@gmail.com> wrote: > it seems that all the functionality that has been discussed are one-liners > using nditer: do we need new functions, or better documentation? > I think there is utility to adding a new function or two (my in

Re: [Numpy-discussion] Add a function to broadcast arrays to a given shape to numpy's stride_tricks?

2015-01-03 Thread Stephan Hoyer
Here is an update on a new function for broadcasting arrays to a given shape (now named np.broadcast_to). I have a pull request up for review, which has received some feedback now: https://github.com/numpy/numpy/pull/5371 There is still at least one design decision to settle: should we expose "br

Re: [Numpy-discussion] Datetime again

2015-01-28 Thread Stephan Hoyer
On Wed, Jan 28, 2015 at 5:13 PM, Chris Barker wrote: > I tend to agree with Nathaniel that a ndarray subclass is less than ideal > -- they tend to get ugly fast. But maybe that is the only way to do > anything in Python, short of a major refactor to be able to write a dtype > in Python -- which w

[Numpy-discussion] New function: np.stack?

2015-02-05 Thread Stephan Hoyer
There are two usual ways to combine a sequence of arrays into a new array: 1. concatenated along an existing axis 2. stacked along a new axis For 1, we have np.concatenate. For 2, we have np.vstack, np.hstack, np.dstack and np.column_stack. For arrays with arbitrary dimensions, there is the np.arr
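The difference between the two ways of combining arrays can be seen from the resulting shapes. This sketch uses np.stack as it exists in current NumPy; the input shapes are chosen here for illustration.

```python
import numpy as np

arrays = [np.zeros((2, 3)) for _ in range(4)]

# 1. Concatenate along an existing axis: the number of dimensions is kept.
joined = np.concatenate(arrays, axis=0)

# 2. Stack along a new axis: the result gains one dimension.
stacked_first = np.stack(arrays, axis=0)
stacked_last = np.stack(arrays, axis=-1)
```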

Re: [Numpy-discussion] converting a list of tuples into an array of tuples?

2015-02-09 Thread Stephan Hoyer
It appears that the only reliable way to do this may be to use a loop to modify an object array in place. Pandas has a version of this written in Cython: https://github.com/pydata/pandas/blob/c1a0dbc4c0dd79d77b2a34be5bc35493279013ab/pandas/lib.pyx#L342 To quote Wes McKinney "Seriously can't belie

Re: [Numpy-discussion] Matrix Class

2015-02-11 Thread Stephan Hoyer
On Wed, Feb 11, 2015 at 9:19 AM, Sebastian Berg wrote: > On Mi, 2015-02-11 at 11:38 -0500, cjw wrote: > No, I just mean the fact that a matrix is always 2D. This makes some > things like some indexing operations awkward and some functions that > expect a numpy array (but think they can handle sub

Re: [Numpy-discussion] Objects exposing the array interface

2015-02-25 Thread Stephan Hoyer
On Wed, Feb 25, 2015 at 1:24 PM, Jaime Fernández del Río < jaime.f...@gmail.com> wrote: > 1. When converting these objects to arrays using PyArray_Converter, if > the arrays returned by any of the array interfaces is not C contiguous, > aligned, and writeable, a copy that is will be made. Proper a

Re: [Numpy-discussion] Objects exposing the array interface

2015-02-25 Thread Stephan Hoyer
On Wed, Feb 25, 2015 at 2:48 PM, Jaime Fernández del Río < jaime.f...@gmail.com> wrote: > I am not really sure what the behavior of __array__ should be. The link > to the subclassing docs I gave before indicates that it should be possible > to write to it if it is writeable (and probably pandas sh

Re: [Numpy-discussion] [SciPy-User] Congratulations to Chris Barker...

2015-03-02 Thread Stephan Hoyer
Indeed, congratulations Chris! Are there plans to write a vectorized version for NumPy? :) On Mon, Mar 2, 2015 at 2:28 PM, Nathaniel Smith wrote: > ...on the acceptance of his PEP! PEP 485 adds a math.isclose function > to the standard library, encouraging people to do numerically more > reason

[Numpy-discussion] ANN: xray v0.4 released

2015-03-03 Thread Stephan Hoyer
I'm pleased to announce a major release of xray, v0.4. xray is an open source project and Python package that aims to bring the labeled data power of pandas to the physical sciences, by providing N-dimensional variants of the core pandas data structures. Our goal is to provide a pandas-like and p

Re: [Numpy-discussion] Custom __array_interface__ error

2015-03-13 Thread Stephan Hoyer
In my experience writing ndarray-like objects, you likely want to implement __array__ instead of __array_interface__. The former gives you full control to create the ndarray yourself. On Fri, Mar 13, 2015 at 7:22 AM, Daniel Smith wrote: > Greetings everyone, > I have a new project that deals wit
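A minimal sketch of the __array__ approach follows. The class name and its contents are hypothetical; the optional copy parameter is included only so the signature is also accepted by newer NumPy releases.

```python
import numpy as np


class LazySquares:
    """Hypothetical ndarray-like object that builds its data on demand."""

    def __init__(self, n):
        self.n = n

    def __array__(self, dtype=None, copy=None):
        # Full control over how the ndarray is created.
        result = np.arange(self.n) ** 2
        return result if dtype is None else result.astype(dtype)


obj = LazySquares(4)
as_array = np.asarray(obj)
as_float = np.asarray(obj, dtype=float)
```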

[Numpy-discussion] numpy.stack -- which function, if any, deserves the name?

2015-03-15 Thread Stephan Hoyer
In the past months there have been two proposals for new numpy functions using the name "stack": 1. np.stack for stacking like np.asarray(np.bmat(...)) http://thread.gmane.org/gmane.comp.python.numeric.general/58748/ https://github.com/numpy/numpy/pull/5057 2. np.stack for stacking along an arbit

Re: [Numpy-discussion] numpy.stack -- which function, if any, deserves the name?

2015-03-16 Thread Stephan Hoyer
On Mon, Mar 16, 2015 at 1:50 AM, Stefan Otte wrote: > Summarizing, my proposal is mostly concerned how to create block > arrays from given arrays. I don't care about the name "stack". I just > used "stack" because it replaced hstack/vstack for me. Maybe "bstack" > for block stack, or "barray" for

Re: [Numpy-discussion] GSoC students: please read

2015-03-23 Thread Stephan Hoyer
On Mon, Mar 23, 2015 at 2:21 PM, Ralf Gommers wrote: > It's great to see that this year there are a lot of students interested in > doing a GSoC project with Numpy or Scipy. So far five proposals have been > submitted, and it looks like several more are being prepared now. > Hi Ralf, Is there a

Re: [Numpy-discussion] Improve Numpy Datetime Functionality for Gsoc

2015-03-24 Thread Stephan Hoyer
The most recent discussion about datetime64 was back in March and April of last year: http://mail.scipy.org/pipermail/numpy-discussion/2014-March/thread.html#69554 http://mail.scipy.org/pipermail/numpy-discussion/2014-April/thread.html#69774 In addition to unfortunate timezone handling, datetime64

Re: [Numpy-discussion] Advanced indexing: "fancy" vs. orthogonal

2015-04-02 Thread Stephan Hoyer
On Wed, Apr 1, 2015 at 7:06 AM, Jaime Fernández del Río < jaime.f...@gmail.com> wrote: > Is there any other package implementing non-orthogonal indexing aside from > numpy? > I think we can safely say that NumPy's implementation of broadcasting indexing is unique :). The issue is that many other

Re: [Numpy-discussion] Advanced indexing: "fancy" vs. orthogonal

2015-04-02 Thread Stephan Hoyer
On Thu, Apr 2, 2015 at 11:03 AM, Eric Firing wrote: > Fancy indexing is a horrible design mistake--a case of cleverness run > amok. As you can read in the Numpy documentation, it is hard to > explain, hard to understand, hard to remember. Well put! I also failed to correctly predict your exampl

Re: [Numpy-discussion] Advanced indexing: "fancy" vs. orthogonal

2015-04-03 Thread Stephan Hoyer
On Fri, Apr 3, 2015 at 10:59 AM, Jaime Fernández del Río < jaime.f...@gmail.com> wrote: > I have an all-Python implementation of an OrthogonalIndexer class, loosely > based on Stephan's code plus some axis remapping, that provides all the > needed functionality for getting and setting with orthogo

Re: [Numpy-discussion] Advanced indexing: "fancy" vs. orthogonal

2015-04-03 Thread Stephan Hoyer
On Fri, Apr 3, 2015 at 4:54 PM, Nathaniel Smith wrote: > Unfortunately, AFAICT this means our only options here are to have > some kind of backcompat break in numpy, some kind of backcompat break > in pandas, or to do nothing and continue indefinitely with the status > quo where the same indexing

Re: [Numpy-discussion] Bug in np.nonzero / Should index returning functions return ndarray subclasses?

2015-05-09 Thread Stephan Hoyer
With regards to np.where -- shouldn't where be a ufunc, so subclasses or other array-likes can control its behavior with __numpy_ufunc__? As for the other indexing functions, I don't have a strong opinion about how they should handle subclasses. But it is certainly tricky to attempt to handl

Re: [Numpy-discussion] Proposed deprecations for 1.10: dot corner cases

2015-05-11 Thread Stephan Hoyer
On Sat, May 9, 2015 at 1:26 PM, Nathaniel Smith wrote: > I'd like to suggest that we go ahead and add deprecation warnings to > the following operations. This doesn't commit us to changing anything > on any particular time scale, but it gives us more options later. > These both get a strong +1 f

Re: [Numpy-discussion] Proposed deprecations for 1.10: dot corner cases

2015-05-11 Thread Stephan Hoyer
On Mon, May 11, 2015 at 2:53 PM, Alan G Isaac wrote: > I agree that where `@` and `dot` differ in behavior, this should be > clearly documented. > I would hope that the behavior of `dot` would not change. Even if np.dot never changes (and indeed, perhaps it should not), issuing these warnings s

Re: [Numpy-discussion] matmul needs some clarification.

2015-06-03 Thread Stephan Hoyer
On Sat, May 30, 2015 at 3:23 PM, Charles R Harris wrote: > The problem arises when multiplying a stack of matrices times a vector. > PEP465 defines this as appending a '1' to the dimensions of the vector and > doing the defined stacked matrix multiply, then removing the last dimension > from the

[Numpy-discussion] ANN: xray v0.5

2015-06-11 Thread Stephan Hoyer
I'm pleased to announce version 0.5 of xray, N-D labeled arrays and datasets in Python. xray is an open source project and Python package that aims to bring the labeled data power of pandas to the physical sciences, by providing N-dimensional variants of the core pandas data structures. These data

Re: [Numpy-discussion] Flag for np.tile to use as_strided to reduce memory

2015-06-19 Thread Stephan Hoyer
On Fri, Jun 19, 2015 at 10:39 AM, Sebastian Berg wrote: > No, what tile does cannot be represented that way. If it was possible > you can achieve the same using `np.broadcast_to` basically, which was > just added though. There are some other things you can do, like rolling > window (adding dimens

Re: [Numpy-discussion] Numpy helper function for __getitem__?

2015-08-23 Thread Stephan Hoyer
I don't think NumPy has a function like this (at least, not exposed to Python), but I wrote one for xray, "expanded_indexer", that you are welcome to borrow: https://github.com/xray/xray/blob/v0.6.0/xray/core/indexing.py#L10 Stephan On Sunday, Aug 23, 2015 at 7:54 PM, Fabien , wrote:

Re: [Numpy-discussion] Numpy helper function for __getitem__?

2015-08-26 Thread Stephan Hoyer
Indeed, the helper function I wrote for xray was not designed to handle None/np.newaxis or non-1d Boolean indexers, because those are not valid indexers for xray objects. I think it could be straightforwardly extended to handle None simply by not counting them towards the total number of dimensions

Re: [Numpy-discussion] np.sign and object comparisons

2015-08-31 Thread Stephan Hoyer
On Mon, Aug 31, 2015 at 1:23 AM, Sebastian Berg wrote: > That would be my gut feeling as well. Returning `NaN` could also make > sense, but I guess we run into problems since we do not know the input > type. So `None` seems like the only option here I can think of right > now. > My inclination i

Re: [Numpy-discussion] Notes from the numpy dev meeting at scipy 2015

2015-09-03 Thread Stephan Hoyer
From my perspective, a major advantage to dtypes is composability. For example, it's hard to write a library like dask.array (out of core arrays) that can support holding any conceivable ndarray subclass (like MaskedArray or quantity), but handling arbitrary dtypes is quite straightforward -- and

Re: [Numpy-discussion] Governance model request

2015-09-22 Thread Stephan Hoyer
On Tue, Sep 22, 2015 at 2:33 AM, Travis Oliphant wrote: > The FUD I'm talking about is the anti-company FUD that has influenced > discussions in the past. I really hope that we can move past this. > I have mostly stayed out of the governance discussion, in deference to how new I am in this co

Re: [Numpy-discussion] interpretation of the draft governance document (was Re: Governance model request)

2015-09-23 Thread Stephan Hoyer
Travis -- have you included all your email addresses in your GitHub profile? When I type git shortlog -ne, I see 2063 commits from your Continuum address that seem to be missing from the contributors page on github. Generally speaking, the git logs tend to be more reliable for these counts. On

Re: [Numpy-discussion] Sign of NaN

2015-09-29 Thread Stephan Hoyer
On Tue, Sep 29, 2015 at 8:13 AM, Charles R Harris wrote: > Due to a recent commit, Numpy master now raises an error when applying the > sign function to an object array containing NaN. Other options may be > preferable, returning NaN for instance, so I would like to open the topic > for discussio

Re: [Numpy-discussion] Cython-based OpenMP-accelerated quartic polynomial solver

2015-10-06 Thread Stephan Hoyer
On Tue, Oct 6, 2015 at 1:14 AM, Daπid wrote: > One idea: what about creating a "parallel numpy"? There are a few > algorithms that can benefit from parallelisation. This library would mimic > Numpy's signature, and the user would be responsible for choosing the > single threaded or the parallel o

[Numpy-discussion] Make all comparisons with NaT false?

2015-10-11 Thread Stephan Hoyer
Currently, NaT (not a time) does not have any special treatment when used in comparison with datetime64/timedelta64 objects. This means that it's equal to itself, and treated as the smallest possible value in comparisons, e.g., NaT == NaT and NaT < any_other_time. To me, this seems a little crazy

[Numpy-discussion] Making datetime64 timezone naive

2015-10-12 Thread Stephan Hoyer
As has come up repeatedly over the past few years, nobody seems to be very happy with the way that NumPy's datetime64 type parses and prints datetimes in local timezones. The tentative consensus from last year's discussion was that we should make datetime64 timezone naive, like the standard librar

Re: [Numpy-discussion] Making datetime64 timezone naive

2015-10-13 Thread Stephan Hoyer
On Mon, Oct 12, 2015 at 12:38 AM, Nathaniel Smith wrote: > > One possible strategy here would be to do some corpus analysis to find > out whether anyone is actually using it, like I did for the ufunc ABI > stuff: > https://github.com/njsmith/codetrawl > https://github.com/njsmith/ufunc-abi-an

[Numpy-discussion] Deprecating unitless timedelta64 and "safe" casting of integers to timedelta64

2015-10-13 Thread Stephan Hoyer
As part of the datetime64 cleanup I've been working on over the past few days, I noticed that NumPy's casting rules for np.datetime64('NaT') were not working properly: https://github.com/numpy/numpy/pull/6465 This led to my discovery that NumPy currently supports unit-less timedeltas (e.g., "np.ti

Re: [Numpy-discussion] when did column_stack become C-contiguous?

2015-10-18 Thread Stephan Hoyer
Looking at the git logs, column_stack appears to have been that way (creating a new array with concatenate) since at least NumPy 0.9.2, way back in January 2006: https://github.com/numpy/numpy/blob/v0.9.2/numpy/lib/shape_base.py#L271 Stephan

Re: [Numpy-discussion] Making datetime64 timezone naive

2015-10-19 Thread Stephan Hoyer
On Mon, Oct 19, 2015 at 12:34 PM, Chris Barker wrote: > Also -- I think we are at phase one of a (at least) two step process: > > 1) clean up datetime64 just enough that it is useful, and less error-prone > -- i.e. have it not pretend to support anything other than naive datetimes. > > 2) Do it r

Re: [Numpy-discussion] Nansum function behavior

2015-10-23 Thread Stephan Hoyer
Hi Charles, You should read the previous discussion about this issue on GitHub: https://github.com/numpy/numpy/issues/1721 For what it's worth, I do think the new definition of nansum is more consistent. If you want to preserve NaN if there are no non-NaN values, you can often calculate this des
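The message is cut off before it says how to preserve NaN, so the following is only one possible sketch and not necessarily the approach the author had in mind. The helper name nansum_keep_nan is hypothetical.

```python
import numpy as np


def nansum_keep_nan(values, axis=None):
    """Like np.nansum, but return NaN where every input value is NaN."""
    values = np.asarray(values, dtype=float)
    total = np.nansum(values, axis=axis)
    all_nan = np.isnan(values).all(axis=axis)
    return np.where(all_nan, np.nan, total)


data = np.array([[1.0, np.nan], [np.nan, np.nan]])
plain = np.nansum(data, axis=1)         # all-NaN row sums to 0.0
preserved = nansum_keep_nan(data, axis=1)  # all-NaN row stays NaN
```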

[Numpy-discussion] Proposal for a new function: np.moveaxis

2015-11-04 Thread Stephan Hoyer
I've put up a pull request implementing a new function, np.moveaxis, as an alternative to np.transpose and np.rollaxis: https://github.com/numpy/numpy/pull/6630 This functionality has been discussed (even the exact function name) several times over the years, but it never made it into a pull reque
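For comparison, here is how the same axis move looks with np.moveaxis (as it exists in current NumPy), np.transpose and np.rollaxis. The array shape is chosen here for illustration.

```python
import numpy as np

x = np.zeros((3, 4, 5))

# Move the first axis to the end; the other axes keep their order.
moved = np.moveaxis(x, 0, -1)

# The same result written with the older functions.
transposed = np.transpose(x, (1, 2, 0))
rolled = np.rollaxis(x, 0, 3)
```

With np.moveaxis the source and destination positions are stated directly, whereas np.transpose needs the full permutation and np.rollaxis takes the position to roll the axis in front of.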

Re: [Numpy-discussion] array of random numbers fails to construct

2015-12-08 Thread Stephan Hoyer
On Sun, Dec 6, 2015 at 3:55 PM, Allan Haldane wrote: > > I've also often wanted to generate large datasets of random uint8 and > uint16. As a workaround, this is something I have used: > > np.ndarray(100, 'u1', np.random.bytes(100)) > > It has also crossed my mind that np.random.randint and np.r
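The quoted workaround builds a uint8 array directly from random bytes. An equivalent sketch using np.frombuffer is shown next to it; the length 100 comes from the quoted example, and the use of np.frombuffer is an alternative added here, not something from the thread.

```python
import numpy as np

# The workaround quoted above: wrap 100 random bytes as a uint8 array.
quoted = np.ndarray(100, 'u1', np.random.bytes(100))

# An alternative spelling of the same idea.
alternative = np.frombuffer(np.random.bytes(100), dtype=np.uint8)
```

Both arrays are backed by an immutable bytes object, so they are read-only; call .copy() if a writable array is needed.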

Re: [Numpy-discussion] Dynamic array list implementation

2015-12-23 Thread Stephan Hoyer
We have a type similar to this (a typed list) internally in pandas, although it is restricted to a single dimension and far from feature complete -- it only has .append and a .to_array() method for converting to a 1d numpy array. Our version is written in Cython, and we use it for performance re

Re: [Numpy-discussion] Fast Access to Container of Numpy Arrays on Disk?

2016-01-14 Thread Stephan Hoyer
On Thu, Jan 14, 2016 at 8:26 AM, Travis Oliphant wrote: > I don't know enough about xray to know whether it supports this kind of > general labeling to be able to build your entire data-structure as an x-ray > object. Dask could definitely be used to process your data in an easy to > describe m

Re: [Numpy-discussion] Fast Access to Container of Numpy Arrays on Disk?

2016-01-14 Thread Stephan Hoyer
On Thu, Jan 14, 2016 at 2:30 PM, Nathaniel Smith wrote: > The reason I didn't suggest dask is that I had the impression that > dask's model is better suited to bulk/streaming computations with > vectorized semantics ("do the same thing to lots of data" kinds of > problems, basically), whereas it

Re: [Numpy-discussion] Software Capabilities of NumPy in Our Tensor Survey Paper

2016-01-15 Thread Stephan Hoyer
Robert beat me to it on einsum, but also check tensordot for general tensor contraction. On Fri, Jan 15, 2016 at 9:30 AM, Nathaniel Smith wrote: > On Jan 15, 2016 8:36 AM, "Li Jiajia" wrote: >> >> Hi all, >> I’m a PhD student in Georgia Tech. Recently, we’re working on a survey > paper about t
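As a small illustration of the two functions mentioned above, the same contraction can be written with either np.einsum or np.tensordot. The shapes and the contracted axes are chosen here as an example.

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)
b = np.arange(12).reshape(4, 3)

# Contract axes 1 and 2 of `a` with axes 1 and 0 of `b`.
via_einsum = np.einsum('ijk,kj->i', a, b)
via_tensordot = np.tensordot(a, b, axes=([1, 2], [1, 0]))
```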

[Numpy-discussion] ANN: xarray (formerly xray) v0.7.0 released

2016-01-21 Thread Stephan Hoyer
ort for reading GRIB, HDF4 and other file formats via PyNIO For more details, read the full release notes: http://xarray.pydata.org/en/stable/whats-new.html Contributors to this release: Antony Lee Fabien Maussion Joe Hamman Maximilian Roos Stephan Hoyer Takeshi Kanmae femtotrader I'd a

Re: [Numpy-discussion] GSoC?

2016-02-10 Thread Stephan Hoyer
On Wed, Feb 10, 2016 at 3:02 PM, Ralf Gommers wrote: > OK first version: > https://github.com/scipy/scipy/wiki/GSoC-2016-project-ideas > I kept some of the ideas from last year, but removed all potential mentors > as the same people may not be available this year - please re-add > yourselves wher

Re: [Numpy-discussion] Deprecating `numpy.iterable`

2016-02-11 Thread Stephan Hoyer
We certainly can (and probably should) deprecate this, but we can't remove it for a very long time. np.iterable is used in a lot of third party code. On Wed, Feb 10, 2016 at 7:09 PM, Joseph Fox-Rabinovitz < jfoxrabinov...@gmail.com> wrote: > I have created a PR to deprecate `np.iterable` > (http

Re: [Numpy-discussion] GSoC?

2016-02-16 Thread Stephan Hoyer
On Wed, Feb 10, 2016 at 4:22 PM, Chris Barker wrote: > We might consider adding "improve duck typing for numpy arrays" >> > > care to elaborate on that one? > > I know it has come up on here that it would be good to have some code in numpy > itself that made it easier to make array-like objects (i.e.

Re: [Numpy-discussion] Generalized flip function

2016-02-28 Thread Stephan Hoyer
I also think this is a good idea -- the generalized flip is much more numpythonic than the specialized 2d versions. On Fri, Feb 26, 2016 at 11:36 AM Joseph Fox-Rabinovitz < jfoxrabinov...@gmail.com> wrote: > If nothing else, this is a nice complement to the generalized `stack` > function. > >
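
A generalized flip along an arbitrary axis can be written with plain slicing, as in the sketch below. The function name `flip_axis` is hypothetical and this is not necessarily the implementation proposed in the thread; it only shows how the 2-D helpers `np.flipud` and `np.fliplr` fall out as special cases.

```python
import numpy as np

def flip_axis(a, axis):
    """Reverse `a` along one axis using slicing (hypothetical sketch)."""
    a = np.asarray(a)
    index = [slice(None)] * a.ndim          # select everything on every axis
    index[axis] = slice(None, None, -1)     # step -1 reverses the chosen axis
    return a[tuple(index)]                  # basic slicing, so a view is returned

x = np.arange(24).reshape(2, 3, 4)
flipped_first = flip_axis(x, 0)    # same result as np.flipud(x)
flipped_second = flip_axis(x, 1)   # same result as np.fliplr(x)
flipped_last = flip_axis(x, -1)    # no specialized 2-D helper covers this axis
```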

Re: [Numpy-discussion] fromnumeric.py internal calls

2016-02-28 Thread Stephan Hoyer
I think this is an improvement, but I do wonder if there are libraries out there that use *args instead of **kwargs to handle these extra arguments. Perhaps it's worth testing this change against third party array libraries that implement their own array like classes? Off the top of my head, maybe

Re: [Numpy-discussion] Changes to generalized ufunc core dimension checking

2016-03-18 Thread Stephan Hoyer
On Thu, Mar 17, 2016 at 1:04 AM, Travis Oliphant wrote: > I think that is a good idea. Let the user decide if scalar broadcasting > is acceptable for their function. > > Here is a simple concrete example where scalar broadcasting makes sense: > > A 1-d dot product (the core of np.inner) (k),

Re: [Numpy-discussion] Changes to generalized ufunc core dimension checking

2016-03-19 Thread Stephan Hoyer
On Thu, Mar 17, 2016 at 2:49 PM, Travis Oliphant wrote: > That's a great idea! > > Adding multiple-dispatch capability for this case could also solve a lot > of issues that right now prevent generalized ufuncs from being the > mechanism of implementation of *all* NumPy functions. > > -Travis > F

Re: [Numpy-discussion] Changes to generalized ufunc core dimension checking

2016-03-20 Thread Stephan Hoyer
On Thu, Mar 17, 2016 at 3:28 PM, Jaime Fernández del Río < jaime.f...@gmail.com> wrote: > Would the logic for such a thing be consistent? E.g. how do you decide if > the user is requesting (k),(k)->(), or (k),()->() with broadcasting over a > non-core dimension of size k in the second argument? Wh

Re: [Numpy-discussion] Numpy arrays shareable among related processes (PR #7533)

2016-04-11 Thread Stephan Hoyer
On Mon, Apr 11, 2016 at 5:39 AM, Matěj Týč wrote: > * ... I do see some value in providing a canonical right way to > construct shared memory arrays in NumPy, but I'm not very happy with > this solution, ... terrible code organization (with the global > variables): > * I understand that, however

Re: [Numpy-discussion] Changing the behavior of (builtins.)round (via the __round__ dunder) to return an integer

2016-04-13 Thread Stephan Hoyer
On Wed, Apr 13, 2016 at 12:42 AM, Antony Lee wrote: > (Note that I am suggesting to switch to the new behavior regardless of the > version of Python.) > I would lean towards making this change only for Python 3. This is arguably more consistent with Python than changing the behavior on Python 2.

Re: [Numpy-discussion] Changing the behavior of (builtins.)round (via the __round__ dunder) to return an integer

2016-04-13 Thread Stephan Hoyer
On Wed, Apr 13, 2016 at 8:06 AM, wrote: > > The difference is that Python 3 has long ints, (and doesn't have to > overflow, AFAICS) > This is a good point. But if your float is so big that rounding it to an integer would overflow int64, rounding is already a no-op. I'm sure this has been done
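
The point about large floats can be checked numerically: the gap between adjacent float64 values reaches 1.0 at 2**52, so every larger float64 already holds an integer value and rounding cannot change it. The sketch below uses `np.spacing` to show this; the specific test value is an arbitrary choice.

```python
import numpy as np

# Gap between adjacent float64 values at two magnitudes.
gap_at_2_52 = np.spacing(2.0 ** 52)   # 1.0: from here on, all floats are whole numbers
gap_at_2_63 = np.spacing(2.0 ** 63)   # 2048.0: just past the int64 range

# A float just beyond the int64 maximum is already an exact integer.
big = 2.0 ** 63 + gap_at_2_63
unchanged = np.round(big) == big

# Python 3's built-in round returns an arbitrary-precision int, so no overflow.
as_python_int = round(big)
```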

[Numpy-discussion] Proposal: numpy.random.random_seed

2016-05-16 Thread Stephan Hoyer
I have recently encountered several use cases for randomly generating random number seeds: 1. When writing a library of stochastic functions that take a seed as an input argument, and some of these functions call multiple other such stochastic functions. Dask is one such example [1]. 2. When a libr

Re: [Numpy-discussion] Proposal: numpy.random.random_seed

2016-05-16 Thread Stephan Hoyer
) offset = np.arange(size) return (base + offset) % (2 ** 32) In principle, I believe this could generate the full 2 ** 32 unique seeds without any collisions. Cryptography experts, please speak up if I'm mistaken here. On Mon, May 16, 2016 at 8:54 PM, Stephan Hoyer wrote: >
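
The visible fragment adds consecutive offsets to a base value and wraps the result modulo 2 ** 32. A self-contained sketch of that idea follows; because the start of the original function is cut off in this listing, the function name `sequential_seeds`, its signature, and the way `base` is drawn are all assumptions rather than the proposal's actual code.

```python
import numpy as np

def sequential_seeds(size, random_state=None):
    """Return `size` distinct 32-bit seeds (hypothetical sketch).

    Assumption: `base` is a single random 32-bit value; the original
    message may have drawn it differently.
    """
    rng = np.random.RandomState(random_state)
    base = int(rng.randint(0, 2 ** 32, dtype=np.int64))
    offset = np.arange(size, dtype=np.int64)
    # Consecutive integers stay distinct modulo 2 ** 32 while size <= 2 ** 32.
    return (base + offset) % (2 ** 32)

seeds = sequential_seeds(1000, random_state=0)
```

Distinct seeds follow from simple arithmetic here; whether consecutive seeds give statistically independent streams is a separate question that this sketch does not address.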

Re: [Numpy-discussion] Proposal: numpy.random.random_seed

2016-05-17 Thread Stephan Hoyer
On Tue, May 17, 2016 at 12:18 AM, Robert Kern wrote: > On Tue, May 17, 2016 at 4:54 AM, Stephan Hoyer wrote: > > 1. When writing a library of stochastic functions that take a seed as an > input argument, and some of these functions call multiple other such > stochastic functio

Re: [Numpy-discussion] Integers to integer powers

2016-05-24 Thread Stephan Hoyer
On Tue, May 24, 2016 at 9:41 AM, Alan Isaac wrote: > What exactly is the argument against *always* returning float > (even for positive integer exponents)? > If we were starting over from scratch, I would agree with you, but the int ** 2 example feels quite compelling to me. I would guess there

Re: [Numpy-discussion] Integers to integer powers

2016-05-24 Thread Stephan Hoyer
On Tue, May 24, 2016 at 10:31 AM, Alan Isaac wrote: > Yes, but that one case is trivial: a*a an_explicit_name ** 2 is much better than an_explicit_name * an_explicit_name, though.

[Numpy-discussion] NumPy 1.11 docs

2016-05-28 Thread Stephan Hoyer
These still are missing from the SciPy.org page, several months after the release. What do we need to do to keep these updated? Is there someone at Enthought we should ping? Or do we really just need to transition to different infrastructure?

Re: [Numpy-discussion] NumPy 1.11 docs

2016-05-30 Thread Stephan Hoyer
Awesome, thanks Ralf! On Sun, May 29, 2016 at 1:13 AM Ralf Gommers wrote: > On Sun, May 29, 2016 at 4:35 AM, Stephan Hoyer wrote: > >> These still are missing from the SciPy.org page, several months after the >> release. >> > > Thanks Stephan, that needs fixing.

Re: [Numpy-discussion] ENH: compute many inner products quickly

2016-06-05 Thread Stephan Hoyer
If possible, I'd love to add new functions for "generalized ufunc" linear algebra, and then deprecate (or at least discourage) using the older versions with inferior broadcasting rules. Adding a new keyword arg means we'll be stuck with an awkward API for a long time to come. There are three types
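
To make the thread subject concrete, the sketch below computes many inner products at once in two ways that rely on broadcasting over a leading "stack" axis: `np.einsum` and `np.matmul`. It illustrates existing NumPy functions only and is not the new API discussed in this message; the array sizes are arbitrary.

```python
import numpy as np

rng = np.random.RandomState(0)
a = rng.rand(1000, 3)   # 1000 vectors of length 3
b = rng.rand(1000, 3)

# Row-wise inner products: sum over the shared index j for each row i.
inner_einsum = np.einsum('ij,ij->i', a, b)

# The same with matmul: treat each row as a (1, 3) @ (3, 1) product and
# let matmul broadcast over the leading axis of length 1000.
inner_matmul = np.matmul(a[:, None, :], b[:, :, None])[:, 0, 0]

reference = (a * b).sum(axis=1)
```

By contrast, `np.dot(a, b.T)` would compute all 1000 x 1000 pairwise products, which is the kind of broadcasting behavior the message calls inferior for this use.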
