Re: [Numpy-discussion] offset in fill diagonal

2017-01-21 Thread Ian Henriksen
On Sat, Jan 21, 2017 at 9:23 AM Julian Taylor <jtaylor.deb...@googlemail.com>
wrote:

> On 21.01.2017 16:10, josef.p...@gmail.com wrote:
> > Is there a simple way to fill in diagonal elements of an array, for
> > diagonals other than the main one?
> >
> > As far as I can see, the diagxxx functions that have offset can only
> > read and not inplace modify, and the functions for modifying don't have
> > offset and only allow changing the main diagonal.
> >
> > Usecase: creating banded matrices (2-D arrays) similar to toeplitz.
> >
>
> you can construct index arrays or boolean masks to index using the
> np.tri* functions.
> e.g.
>
> a = np.arange(5*5).reshape(5,5)
> band = np.tri(5, 5, 1, dtype=bool) & ~np.tri(5, 5, -2, dtype=bool)
> a[band] = -1


You can slice the array you're filling before passing it to fill_diagonal.
For example:

import numpy as np
a = np.zeros((4, 4))
b = np.ones(3)
np.fill_diagonal(a[1:], b)
np.fill_diagonal(a[:,1:], -b)

yields

array([[ 0., -1.,  0.,  0.],
       [ 1.,  0., -1.,  0.],
       [ 0.,  1.,  0., -1.],
       [ 0.,  0.,  1.,  0.]])
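
For a reusable version, the same slicing trick can be wrapped up in a small
helper (a sketch; fill_diagonal_offset is a hypothetical name, not a numpy
function):

import numpy as np

def fill_diagonal_offset(a, val, offset=0):
    # Positive offsets slice columns (super-diagonals); negative offsets
    # slice rows (sub-diagonals). fill_diagonal writes into the view.
    if offset >= 0:
        np.fill_diagonal(a[:, offset:], val)
    else:
        np.fill_diagonal(a[-offset:], val)

a = np.zeros((4, 4))
fill_diagonal_offset(a, 1.0, offset=-1)   # same as np.fill_diagonal(a[1:], 1.0)
fill_diagonal_offset(a, -1.0, offset=1)   # same as np.fill_diagonal(a[:, 1:], -1.0)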

Hope this helps,

Ian Henriksen


Re: [Numpy-discussion] State-of-the-art to use a C/C++ library from Python

2016-08-31 Thread Ian Henriksen
On Wed, Aug 31, 2016 at 3:57 PM Jason Newton <nev...@gmail.com> wrote:

> Hey Ian - I hope I gave Cython a fair comment, but I have to add the
> disclaimer that your capability to understand/implement those
> solutions/workarounds in that project is greatly enhanced by your knowing
> the innards of Cython from being a core developer on the Cython project.
> This doesn't detract from DyND's accomplishments (if anything it means
> Cython users should look there for how to use C++ with Cython and the
> workarounds used + shortcomings), but I would not expect everyone to want
> to jump through those hoops to get things working without a firm
> understanding of Cython's edges, and all this potential for special/hack
> adaptation code is still something to keep in mind when comparing to
> something more straightforward and easier to understand coming from a more
> pure C/C++ side, where things are a bit more dangerous and fairly more
> verbose but make play with the language and environment first-class (like
> Boost.Python/pybind).  Since this thread is a survey over the state and
> options, it's my intent just to make sure readers have something to bear
> in mind for the current pros/cons of the approaches.
>
> -Jason
>

No offense taken at all. I'm actually not a Cython developer, just a
frequent
contributor. That said, knowing the compiler internals certainly helps when
finding
workarounds and building intermediate interfaces. My main point was just
that, in
my experience, Cython has worked well for many things beyond plain C
interfaces
and that workarounds (hackery entirely aside) for any missing features are
usually
manageable. Given that my perspective is a bit different in that regard, it
seemed
worth chiming in on the discussion. I suppose the moral of the story is
that there's
still not a clear cut "best" way of building wrappers and that your mileage
may vary
depending on what features you need.
Thanks,
Ian Henriksen


Re: [Numpy-discussion] State-of-the-art to use a C/C++ library from Python

2016-08-31 Thread Ian Henriksen
We use Cython very heavily in DyND's Python bindings. It has worked well
for us
even when working with some very modern C++. That said, a lot depends on
exactly which C++ features you want to expose as a part of the interface.
Interfaces that require things like non-type template parameters or variadic
templates will often need some extra C++ code to work them into something
that Cython can understand. In my experience, those particular limitations
haven't been that hard to work with.
Best,
Ian Henriksen

On Wed, Aug 31, 2016 at 12:20 PM Jason Newton <nev...@gmail.com> wrote:

> I just wanted to follow up on the C++ side of OP email - Cython has quite
> a few difficulties working with C++ code at the moment.  It's really more
> of a C solution most of the time and you must split things up into a mostly
> C call interface (that is, C code Cython can call) and limit
> exposure/complications with templates and complex C++11+ constructs.  This
> may change in the longer term but in the near, that is the state.
>
> I used to use Boost.Python but I'm getting my feet wet with Pybind (which
> is basically the same api but works more as you expect it to with its
> signature/type plumbing (including std::shared_ptr handling), with some
> other C++11 based improvements, and is header only + submodule friendly!).
> I also remembered ndarray thanks to Neal's post but I haven't figured out
> how to leverage it better than pybind, at the moment.  I'd be interested to
> see ndarray gain support for pybind interoperability...
>
> -Jason
>
> On Wed, Aug 31, 2016 at 1:08 PM, David Morris <otha...@othalan.net> wrote:
>
>> On Wed, Aug 31, 2016 at 2:28 PM, Michael Bieri <mibi...@gmail.com> wrote:
>>
>>> Hi all
>>>
>>> There are several ways on how to use C/C++ code from Python with NumPy,
>>> as given in http://docs.scipy.org/doc/numpy/user/c-info.html .
>>> Furthermore, there's at least pybind11.
>>>
>>> I'm not quite sure which approach is state-of-the-art as of 2016. How
>>> would you do it if you had to make a C/C++ library available in Python
>>> right now?
>>>
>>> In my case, I have a C library with some scientific functions on
>>> matrices and vectors. You will typically call a few functions to configure
>>> the computation, then hand over some pointers to existing buffers
>>> containing vector data, then start the computation, and finally read back
>>> the data. The library also can use MPI to parallelize.
>>>
>>
>> I have been delighted with Cython for this purpose.  Great integration
>> with NumPy (you can access numpy arrays directly as C arrays), very python
>> like syntax and amazing performance.
>>
>> Good luck,
>>
>> David
>>


Re: [Numpy-discussion] Deprecating silent truncation of floats when assigned to int array

2016-06-13 Thread Ian Henriksen
Personally, I think this is a great idea. +1 to more informative errors.
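
For anyone updating code, the explicit spellings should keep working (a small
sketch; I'm assuming explicit casts remain supported, which the proposal
doesn't touch):

import numpy as np

a = np.arange(10)
a[3] = int(1.5)        # explicit truncation toward zero: no error
a[4] = np.int64(2.7)   # likewise explicit on the numpy side
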
Best,
Ian Henriksen

On Mon, Jun 13, 2016 at 2:11 PM Nathaniel Smith <n...@pobox.com> wrote:

> It was recently pointed out:
>
>   https://github.com/numpy/numpy/issues/7730
>
> that this code silently truncates floats:
>
> In [1]: a = np.arange(10)
>
> In [2]: a.dtype
> Out[2]: dtype('int64')
>
> In [3]: a[3] = 1.5
>
> In [4]: a[3]
> Out[4]: 1
>
> The proposal is that we should deprecate this, and eventually turn it
> into an error. Any objections?
>
> We recently went through a similar deprecation cycle for in-place
> operations, i.e., this used to silently truncate but now raises an
> error:
>
> In [1]: a = np.arange(10)
>
> In [2]: a += 1.5
> ---------------------------------------------------------------------------
> TypeError                                 Traceback (most recent call last)
> <ipython-input-2-...> in <module>()
> ----> 1 a += 1.5
>
> TypeError: Cannot cast ufunc add output from dtype('float64') to
> dtype('int64') with casting rule 'same_kind'
>
> so the proposal here is to extend this to regular assignment.
>
> -n
>
> --
> Nathaniel J. Smith -- https://vorpus.org


Re: [Numpy-discussion] Integers to integer powers, let's make a decision

2016-06-10 Thread Ian Henriksen
On Fri, Jun 10, 2016 at 12:01 PM Nathaniel Smith <n...@pobox.com> wrote:

> On Jun 10, 2016 10:50, "Alan Isaac" <alan.is...@gmail.com> wrote:
> >
> > On 6/10/2016 1:34 PM, Nathaniel Smith wrote:
> >>
> >> You keep pounding on this example. It's a fine example, but, c'mon. **2
> is probably at least 100x more common in real source code. Maybe 1000x more
> common. Why should we break the
> >> common case for your edge case?
> >
> >
> >
> > It is hardly an "edge case".
> > Again, **almost all** integer combinations overflow: that's the point.
>
> When you say "almost all", you're assuming inputs that are uniformly
> sampled integers. I'm much more interested in what proportion of calls to
> the ** operator involve inputs that can overflow, and in real life those
> inputs are very heavily biased towards small numbers.
>
> (I also think we should default to raising an error on overflow in
> general, with a seterr switch to turn it off when desired. But that's
> another discussion...)
>
> -n
>
Another thing that would need separate discussion...
Making 64 bit integers default in more cases would help here.
Currently arange gives 32 bit integers on 64 bit Windows, but
64 bit integers on 64 bit Linux/OSX. Using size_t (or even
int64_t) as the default size would help with overflows in
the more common use cases. It's a hefty backcompat
break, but 64 bit systems are much more common now,
and using 32 bit integers on 64 bit windows is a bit odd.
Anyway, hopefully that's not too off-topic.
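
To make the platform difference concrete (a minimal illustration; the dtype
comments just restate the behavior described above):

import numpy as np

a = np.arange(10)
print(a.dtype)   # int32 on 64-bit Windows, int64 on 64-bit Linux/OSX

b = np.arange(10, dtype=np.int64)   # explicit and portable
print(b[9] ** 15)   # 205891132094649: fits in int64, overflows int32
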
Best,
Ian Henriksen


Re: [Numpy-discussion] Integers to integer powers, let's make a decision

2016-06-10 Thread Ian Henriksen
On Fri, Jun 10, 2016 at 12:42 AM Nathaniel Smith <n...@pobox.com> wrote:

> On Mon, Jun 6, 2016 at 1:17 PM, Charles R Harris
> <charlesr.har...@gmail.com> wrote:
> >
> >
> >
> > On Mon, Jun 6, 2016 at 2:11 PM, Marten van Kerkwijk <
> m.h.vankerkw...@gmail.com> wrote:
> >>
> >> Hi Chuck,
> >>
> >> I consider either proposal an improvement, but among the two I favour
> returning float for `**`, because, like for `/`, it ensures one gets
> closest to the (mathematically) true answer in most cases, and makes
> duck-typing that much easier -- I'd like to be able to do x** y without
> having to worry whether x and y are python scalars or numpy arrays of
> certain type.
> >>
> >> I do agree with Nathaniel that it would be good to check what actually
> breaks. Certainly, if anybody is up to making a PR that implements either
> suggestion, I'd gladly check whether it breaks anything in astropy.
> >>
> >> I  should add that I have no idea how to assuage the fear that new code
> would break with old versions of numpy, but on the other hand, I don't know
> its validity either, as it seems one either develops larger projects for
> multiple versions and tests, or writes more scripty things for whatever the
> current versions are. Certainly, by this argument I better not start using
> the new `@` operator!
> >>
> >> I do think the argument that for division it was easier because there
> was `//` already available is a red herring: here one can use `np.power(a,
> b, dtype=...)` if one really needs to.
> >
> >
> > It looks to me like users want floats, while developers want the easy
> path of raising an error. Darn those users, they just make life sooo
> difficult...
>
> I dunno, with my user hat on I'd be incredibly surprised / confused /
> annoyed if an innocent-looking expression like
>
>   np.arange(10) ** 2
>
> started returning floats... having exact ints is a really nice feature
> of Python/numpy as compared to R/Javascript, and while it's true that
> int64 can overflow, there are also large powers that can be more
> precisely represented as int64 than float.
>
> -n
>
>
>
This is very much my line of thinking as well. Generally when I'm doing
operations with integers, I expect integer output, regardless of floor
division and overflow.
There's a lot to both sides of the argument though. Python's arbitrary
precision integers alleviate overflow concerns very nicely, but forcing
float output for people who actually want integers is not at all ideal
either.
Best,
Ian Henriksen


Re: [Numpy-discussion] ndarray.T2 for 2D transpose

2016-04-11 Thread Ian Henriksen
On Mon, Apr 11, 2016 at 5:24 PM Chris Barker <chris.bar...@noaa.gov> wrote:

> On Fri, Apr 8, 2016 at 4:37 PM, Ian Henriksen <
> insertinterestingnameh...@gmail.com> wrote:
>
>
>> If we introduced the T2 syntax, this would be valid:
>>
>> a @ b.T2
>>
>> It makes the intent much clearer.
>>
>
> would:
>
> a @ colvector(b)
>
> work too? or is T2  generalized to more than one column? (though I suppose
> colvector() could support more than one column also -- weird though that
> might be.)
>
> -CHB
>

Right, so I've opted to withdraw my support for having the T2 syntax prepend
dimensions when the array has fewer than two dimensions. Erroring out in
the 1D
case addresses my concerns well enough. The colvec/rowvec idea seems nice
too, but it matters a bit less to me, so I'll leave that discussion open
for others to
follow up on. Having T2 be a broadcasting transpose is a bit more general
than any
semantics for rowvec/colvec that I can think of. Here are specific arrays
for which the expression a @ b.T2 can only be handled using some sort of
transpose:

a = np.random.rand(2, 3, 4)
b = np.random.rand(2, 1, 3, 4)

Using these inputs, the expression

a @ b.T2

would have the shape (2, 2, 3, 3).

All the T2 property would be doing is a transpose that has similar
broadcasting
semantics to matmul, solve, inv, and the other gufuncs. The primary
difference
with those other functions is that transposes would be done as views
whereas the
other operations, because of the computations they perform, all have to
return new
output arrays.
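
A sketch of those semantics in present-day numpy (t2 here is a stand-in for
the proposed property, which does not exist):

import numpy as np

def t2(arr):
    # Broadcasting transpose: swap the last two axes, returning a view,
    # and refuse arrays that have no pair of axes to swap.
    if arr.ndim < 2:
        raise ValueError("broadcasting transpose needs at least 2 dimensions")
    return np.swapaxes(arr, -1, -2)

a = np.random.rand(2, 3, 4)
b = np.random.rand(2, 1, 3, 4)
print((a @ t2(b)).shape)   # (2, 2, 3, 3), as described above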

Hope this helps,

-Ian Henriksen


Re: [Numpy-discussion] ndarray.T2 for 2D transpose

2016-04-08 Thread Ian Henriksen
On Fri, Apr 8, 2016 at 4:04 PM Alan Isaac <alan.is...@gmail.com> wrote:

> On 4/8/2016 5:13 PM, Nathaniel Smith wrote:
> > he doesn't want 2d matrices, he wants
> > tools that make it easy to work with stacks of 2d matrices stored in
> > 2-or-more-dimensional arrays.
>
>
> Like `map`?
>
> Alan Isaac
>
>
Sorry if there's any misunderstanding here.

Map doesn't really help much. That'd only be good for dealing with three
dimensional cases and you'd get a list of arrays, not a view with the
appropriate
axes swapped.

np.einsum('...ji', a)
np.swapaxes(a, -1, -2)
np.rollaxis(a, -1, -2)

all do the right thing, but they are all fairly verbose for such a simple
operation.

Here's a simple example of when such a thing would be useful.
With 2D arrays you can write
a.dot(b.T)

If you want to have that same operation follow the existing gufunc
broadcasting
semantics you end up having to write one of the following

np.einsum('...ij,...kj', a, b)
a @ np.swapaxes(b, -1, -2)
a @ np.rollaxis(b, -1, -2)

None of those are very concise, and, when I look at them, I don't usually
think
"that does a.dot(b.T)."

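For what it's worth, the spellings above do agree with the stacked version
of a.dot(b.T); a quick check:

import numpy as np

a = np.random.rand(5, 3, 4)
b = np.random.rand(5, 3, 4)
x = np.einsum('...ij,...kj', a, b)
y = a @ np.swapaxes(b, -1, -2)
print(x.shape, np.allclose(x, y))   # (5, 3, 3) True
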
If we introduced the T2 syntax, this would be valid:

a @ b.T2

It makes the intent much clearer. This helps readability even more when
you're
trying to put together something that follows a larger equation while still
broadcasting correctly.

Does this help make the use cases a bit clearer?

Best,

-Ian Henriksen


Re: [Numpy-discussion] ndarray.T2 for 2D transpose

2016-04-08 Thread Ian Henriksen
On Thu, Apr 7, 2016 at 4:04 PM Stéfan van der Walt <stef...@berkeley.edu>
wrote:

> On 7 April 2016 at 11:17, Chris Barker <chris.bar...@noaa.gov> wrote:
> > np.col_vector(arr)
> >
> > which would be a synonym for np.reshape(arr, (-1,1))
> >
> > would that make anyone happy?
>
> I'm curious to see use cases where this doesn't solve the problem.
>
> The most common operations that I run into:
>
> colvec = lambda x: np.c_[x]
>
> x = np.array([1, 2, 3])
> A = np.arange(9).reshape((3, 3))
>
>
> 1) x @ x   (equivalent to x @ colvec(x))
> 2) A @ x  (equivalent to A @ colvec(x), apart from the shape)
> 3) x @ A
> 4) x @ colvec(x)  -- gives an error, but perhaps this should work and
> be equivalent to np.dot(colvec(x), rowvec(x)) ?
>
> If (4) were changed, 1D arrays would mostly* be interpreted as row
> vectors, and there would be no need for a rowvec function.  And we
> already do that kind of magic for (2).
>
> Stéfan
>
> * not for special case (1)
>
>
Thinking this over a bit more, I think a broadcasting transpose that errors
out on
arrays that are less than 2D would cover the use cases of which I'm aware.
The
biggest things to me are having a broadcasting 2D transpose and having some
form of transpose that doesn't silently pass 1D arrays through unchanged.

Adding properties like colvec and rowvec has less bearing on the use cases
I'm
aware of, but they both provide nice syntax sugar for common reshape
operations.
It seems like it would cover all the needed cases for simplifying
expressions
involving matrix multiplication. It's not totally clear what the semantics
should be
for higher dimensional arrays or 2D arrays that already have a shape
incompatible
with the one desired.
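
For concreteness, the sugar under discussion amounts to thin reshape
wrappers (a sketch; colvec/rowvec are hypothetical, not numpy attributes):

import numpy as np

colvec = lambda x: np.reshape(x, (-1, 1))
rowvec = lambda x: np.reshape(x, (1, -1))

x = np.array([1, 2, 3])
print(colvec(x).shape, rowvec(x).shape)   # (3, 1) (1, 3)
print((colvec(x) @ rowvec(x)).shape)      # (3, 3), an outer product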

Best,
-Ian Henriksen


Re: [Numpy-discussion] ndarray.T2 for 2D transpose

2016-04-07 Thread Ian Henriksen
On Thu, Apr 7, 2016 at 1:53 PM <josef.p...@gmail.com> wrote:

> On Thu, Apr 7, 2016 at 3:26 PM, Ian Henriksen <
> insertinterestingnameh...@gmail.com> wrote:
>
>> On Thu, Apr 7, 2016 at 12:31 PM <josef.p...@gmail.com> wrote:
>>
>>> write unit tests with non square 2d arrays and the exception / test
>>> error shows up fast.
>>>
>>> Josef
>>>
>>>
>> Absolutely, but good programming practices don't totally obviate helpful
>> error
>> messages.
>>
>
> The current behavior is perfectly well defined, and I don't want a lot of
> warnings showing up because .T works suddenly only for ndim != 1.
> I make lots of mistakes during programming. But shape mismatch are usually
> very fast to catch.
>
> If you want safe programming, then force everyone to use only 2-D like in
> matlab. It would have prevented me from making many mistakes.
>
> >>> np.array(1).T
> array(1)
>
> another noop. Why doesn't it convert it to 2d?
>
> Josef
>
>
I think we've misunderstood each other. Sorry if I was unclear. As I've
understood the discussion thus far, "raising an error" refers to raising an
error when
a 1D array is passed used with the syntax a.T2 (for swapping the last two
dimensions?). As far as whether or not a.T should raise an error for 1D
arrays, that
ship has definitely already sailed. I'm making the case that there's value
in having
an abbreviated syntax that helps prevent errors from accidentally using a
1D array,
not that we should change the existing semantics.

Cheers,

-Ian


Re: [Numpy-discussion] ndarray.T2 for 2D transpose

2016-04-07 Thread Ian Henriksen
On Thu, Apr 7, 2016 at 12:31 PM  wrote:

> write unit tests with non square 2d arrays and the exception / test error
> shows up fast.
>
> Josef
>
>
Absolutely, but good programming practices don't totally obviate helpful
error
messages.

Best,
-Ian


Re: [Numpy-discussion] ndarray.T2 for 2D transpose

2016-04-07 Thread Ian Henriksen
On Thu, Apr 7, 2016 at 12:18 PM Chris Barker <chris.bar...@noaa.gov> wrote:

> On Thu, Apr 7, 2016 at 10:00 AM, Ian Henriksen <
> insertinterestingnameh...@gmail.com> wrote:
>
>> Here's another example that I've seen catch people now and again.
>>
>> A = np.random.rand(100, 100)
>> b =  np.random.rand(10)
>> A * b.T
>>
>
> typo? that was supposed to be
>
> b =  np.random.rand(100). yes?
>

Hahaha, thanks, yes, in describing a common typo I demonstrated another
one. At
least this one doesn't fail silently.


>
> This is exactly what someone else referred to as the expectations of
> someone that comes from MATLAB, and doesn't yet "get" that 1D arrays are 1D
> arrays.
>
> All of this is EXACTLY the motivation for the matrix class -- which never
> took off, and was never complete (it needed a row and column vector
> implementation, if you ask me). But I think the reason it didn't take off is
> that it really isn't that useful, yet is different enough from regular
> arrays to be a greater source of confusion. And it was decided that all
> people REALLY wanted was an obvious way to get matrix multiply, which we
> now have with @.
>

Most of the cases I've seen this error have come from people unfamiliar with
matlab who, like I said, weren't tracking dimensions quite as carefully as
they
should have. That said, it's just anecdotal evidence. I wouldn't be at all
surprised if
this were an issue for matlab users as well.

As far as the matrix class goes, we really shouldn't be telling anyone to
use that
anymore.


>
> So this discussion brings up that we also need an easy an obvious way to
> make a column vector --
>
> maybe:
>
> np.col_vector(arr)
>
> which would be a synonym for np.reshape(arr, (-1,1))
>
> would that make anyone happy?
>
> NOTE: having transposing a 1D array raise an exception would help remove a
> lot  of the confusion, but it may be too late for that
>

>
> In this case the user pretty clearly meant to be broadcasting along the
>> rows of A
>> rather than along the columns, but the code fails silently.
>>
>
> hence the exception idea
>
>
Yep. An exception may be the best way forward here. My biggest objection is
that
the current semantics make it easy for people to silently get unintended
behavior.


> maybe a warning?
>
> -CHB
>
>
-Ian Henriksen


Re: [Numpy-discussion] ndarray.T2 for 2D transpose

2016-04-07 Thread Ian Henriksen
On Wed, Apr 6, 2016 at 3:21 PM Nathaniel Smith <n...@pobox.com> wrote:

> Can you elaborate on what you're doing that you find verbose and
> confusing, maybe paste an example? I've never had any trouble like
> this doing linear algebra with @ or dot (which have similar semantics
> for 1d arrays), which is probably just because I've had different use
> cases, but it's much easier to talk about these things with a concrete
> example in front of us to put everyone on the same page.
>
> -n
>

Here's another example that I've seen catch people now and again.

A = np.random.rand(100, 100)
b =  np.random.rand(10)
A * b.T

In this case the user pretty clearly meant to be broadcasting along the
rows of A
rather than along the columns, but the code fails silently. When an issue
like this
gets mixed into a larger series of broadcasting operations, the error
becomes
difficult to find. This error isn't necessarily unique to beginners either.
It's a
common typo that catches intermediate users who know about broadcasting
semantics but weren't keeping close enough track of the dimensionality of
the
different intermediate expressions in their code.
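
Spelled out on a small example (the shapes here are chosen so the values
make the silent column/row mix-up visible):

import numpy as np

A = np.ones((3, 3))
b = np.array([1.0, 2.0, 3.0])
print(A * b.T)          # identical to A * b: .T is a no-op on 1-D arrays,
                        # so the *columns* get scaled
print(A * b[:, None])   # explicit column vector: the *rows* get scaled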

Best,

-Ian Henriksen


Re: [Numpy-discussion] ndarray.T2 for 2D transpose

2016-04-06 Thread Ian Henriksen
On Tue, Apr 5, 2016 at 9:14 PM Nathaniel Smith <n...@pobox.com> wrote:

> On Tue, Apr 5, 2016 at 7:11 PM, Todd <toddr...@gmail.com> wrote:
> > When you try to transpose a 1D array, it does nothing.  This is the
> correct
> > behavior, since transposing a 1D array is meaningless.  However, this
> can
> > often lead to unexpected errors since this is rarely what you want.  You
> can
> > convert the array to 2D, using `np.atleast_2d` or `arr[None]`, but this
> > makes simple linear algebra computations more difficult.
> >
> > I propose adding an argument to transpose, perhaps called `expand` or
> > `expanddim`, which if `True` (it is `False` by default) will force the
> array
> > to be at least 2D.  A shortcut property, `ndarray.T2`, would be the same
> as
> > `ndarray.transpose(True)`.
>
> An alternative that was mentioned in the bug tracker
> (https://github.com/numpy/numpy/issues/7495), possibly by me, would be
> to have arr.T2 act as a stacked-transpose operator, i.e. treat an arr
> with shape (..., n, m) as being a (...)-shaped stack of (n, m)
> matrices, and transpose each of those matrices, so the output shape is
> (..., m, n). And since this operation intrinsically acts on arrays
> with shape (..., n, m) then trying to apply it to a 0d or 1d array
> would be an error.
>
> -n
>
> --
> Nathaniel J. Smith -- https://vorpus.org
>

I agree that we could really use a shorter syntax for a broadcasting
transpose.
Swapaxes is far too verbose for something that should be so common now that
we've introduced the new matmul operator.

That said, the fact that 1-D vectors are conceptually so similar to row
vectors
makes transposing a 1-D array a potential pitfall for a lot of people. When
broadcasting along the leading dimension, a (n) shaped array and a (1, n)
shaped
array are already treated as equivalent. Treating a 1-D array like a row
vector for
transposes seems like a reasonable way to make things more intuitive for
users.

Rather than raising an error for arrays with fewer than two dimensions, the
new syntax could be made equivalent to np.swapaxes(np.atleast_2d(arr), -1, -2).
From the standpoint of broadcasting semantics, using atleast_2d can be viewed
as allowing broadcasting along the inner dimensions. Though that's not a
common thing, at least there's a precedent.

The only downside I can see with allowing T2 to call atleast_2d is that it
would make things like A @ b and A @ b.T2 equivalent when b is one-dimensional.
That's
already the case with our current syntax though. There's some inherent
design
tension between the fact that broadcasting usually prepends ones to fill in
missing
dimensions and the fact that our current linear algebra semantics often
treat rows
as columns, but making 1-D arrays into rows makes a lot of sense as far as
user
experience goes.
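
In other words, the variant being proposed here is (with T2 still a
hypothetical name):

import numpy as np

t2 = lambda arr: np.swapaxes(np.atleast_2d(arr), -1, -2)

b = np.arange(3)
print(t2(b).shape)   # (3, 1): the 1-D array is treated as a row vector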

Great ideas everyone!

Best,
-Ian Henriksen


Re: [Numpy-discussion] ATLAS build errors

2016-03-28 Thread Ian Henriksen
On Sat, Mar 26, 2016 at 3:06 PM Matthew Brett <matthew.br...@gmail.com>
wrote:

> Hi,
>
> I'm working on building manylinux wheels for numpy, and I ran into
> unexpected problems with a numpy built against the ATLAS 3.8 binaries
> supplied by CentOS 5.
>
> I'm working on the manylinux docker container [1]
>
> To get ATLAS, I'm doing `yum install atlas-devel` which gets the
> default CentOS 5 ATLAS packages.
>
> I then build numpy.  Local tests work fine, but when I test on travis,
> I get these errors [2]:
>
> ======================================================================
> ERROR: test_svd_build (test_regression.TestRegression)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File
> "/home/travis/build/matthew-brett/manylinux-testing/venv/lib/python2.7/site-packages/numpy/linalg/tests/test_regression.py",
> line 56, in test_svd_build
> u, s, vh = linalg.svd(a)
>   File
> "/home/travis/build/matthew-brett/manylinux-testing/venv/lib/python2.7/site-packages/numpy/linalg/linalg.py",
> line 1359, in svd
> u, s, vt = gufunc(a, signature=signature, extobj=extobj)
> ValueError: On entry to DGESDD parameter number 12 had an illegal value
>
> ======================================================================
> FAIL: test_lapack (test_build.TestF77Mismatch)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File
> "/home/travis/build/matthew-brett/manylinux-testing/venv/lib/python2.7/site-packages/numpy/testing/decorators.py",
> line 146, in skipper_func
> return f(*args, **kwargs)
>   File
> "/home/travis/build/matthew-brett/manylinux-testing/venv/lib/python2.7/site-packages/numpy/linalg/tests/test_build.py",
> line 56, in test_lapack
> information.""")
> AssertionError: Both g77 and gfortran runtimes linked in lapack_lite !
> This is likely to
> cause random crashes and wrong results. See numpy INSTALL.txt for more
> information.
>
>
> Sure enough, scipy built the same way segfaults or fails to import (see
> [2]).
>
> I get no errors for an openblas build.
>
> Does anyone recognize these?   How should I modify the build to avoid them?
>
> Cheers,
>
> Matthew
>
>
> [1] https://github.com/pypa/manylinux
> [2] https://travis-ci.org/matthew-brett/manylinux-testing/jobs/118712090


The error regarding parameter 12 of dgesdd sounds a lot like
https://github.com/scipy/scipy/issues/5039 where the issue was that the
LAPACK
version was too old. CentOS 5 is pretty old, so I wouldn't be surprised if
that were
the case here too.
In general, you can't expect Linux distros to have a uniform shared object
interface
for LAPACK, so you don't gain much by using the version that ships with
CentOS 5 beyond not having to compile it all yourself. It might be better
to use a
newer LAPACK built from source with the older toolchains already there.
Best,
-Ian Henriksen


Re: [Numpy-discussion] AppVeyor

2015-12-22 Thread Ian Henriksen
On Tue, Dec 22, 2015 at 12:14 AM Ralf Gommers 
wrote:

> That would be quite useful I think. 32/64-bit issues are mostly orthogonal
> to py2/py3 ones, so maybe only a 32-bit Python 3.5 build to keep things fast?
>

Done in https://github.com/numpy/numpy/pull/6874.
Hope this helps,
-Ian


Re: [Numpy-discussion] AppVeyor

2015-12-21 Thread Ian Henriksen
>
> Also, am I correct that these are win64 builds only? Anyone know if it
> would be easy to add win32?
>

It'd be really easy to add 32 bit builds. The main reason I didn't was
because appveyor only
gives one concurrent build job for free, and I didn't want to slow things
down too much. I can get
a PR up for 32 bit builds too if you'd like.

Some background: I based the initial setup off of the dynd-python appveyor
setup
(https://github.com/libdynd/dynd-python/blob/master/appveyor.yml) where we
do one 32 and one
64 bit build. For fancier selection of Python installation, there's a demo
at
https://github.com/ogrisel/python-appveyor-demo that looks promising as
well. I avoided that
initially because it's a lot more complex than just a single appveyor.yml
file.

Best,
-Ian


Re: [Numpy-discussion] AppVeyor

2015-12-21 Thread Ian Henriksen
>
> The Python 3 build runs much faster than the Python 2. You can close and
>> reopen my testing PR to check what happens if you enable the numpy project.
>
>
I'm not sure why this is the case. MSVC 2015 is generally better about a
lot of things, but it's
surprising that the speed difference is so large.
Best,
-Ian


Re: [Numpy-discussion] Proposal: stop providing official win32 downloads (for now)

2015-12-18 Thread Ian Henriksen
On Fri, Dec 18, 2015 at 3:27 PM Nathaniel Smith <n...@pobox.com> wrote:

> On Dec 18, 2015 2:22 PM, "Ian Henriksen" <
> insertinterestingnameh...@gmail.com> wrote:
> >
> > An appveyor setup is a great idea. An appveyor build matrix with the
> > various supported MSVC versions would do a lot more to prevent
> > compatibility issues than periodically building installers with old
> versions of
> > MinGW. The effort toward a MinGW-based build is valuable, but having a
> > CI system test for MSVC compatibility will be valuable regardless of
> where
> > things go with that.
>
> Yes, definitely. Would you by chance have any interest in getting this set
> up?
>
> -n


I'll take a look at setting that up. On the other hand, getting everything
working with the various MSVC versions isn't likely to be a smooth sailing
process, so I can't guarantee anything.

Best,
-Ian


Re: [Numpy-discussion] Proposal: stop providing official win32 downloads (for now)

2015-12-18 Thread Ian Henriksen
On Fri, Dec 18, 2015 at 2:51 PM Ralf Gommers  wrote:

> On Fri, Dec 18, 2015 at 5:55 PM, Charles R Harris <
> charlesr.har...@gmail.com> wrote:
>
>>
>>
>> On Fri, Dec 18, 2015 at 2:12 AM, Nathaniel Smith  wrote:
>>
>>> Hi all,
>>>
>>> I'm wondering what people think of the idea of us (= numpy) stopping
>>> providing our "official" win32 builds (the "superpack installers"
>>> distributed on sourceforge) starting with the next release.
>>>
>>
> +1 from me. Despite the number of downloads still being high, I don't
> think there's too much value in these binaries anymore. We have been
> recommending Anaconda/Canopy for a couple of years now, and that's almost
> always a much better option for users.
>
>
>>
>>> These builds are:
>>>
>>> - low quality: they're linked to an old & untuned build of ATLAS, so
>>> linear algebra will be dramatically slower than builds using MKL or
>>> OpenBLAS. They're win32 only and will never support win64. They're
>>> using an ancient version of gcc. They will never support python 3.5 or
>>> later.
>>>
>>> - a dead end: there's a lot of work going on to solve the windows
>>> build problem, and hopefully we'll have something better in the
>>> short-to-medium-term future; but, any solution will involve throwing
>>> out the current system entirely and switching to a new toolchain,
>>> wheel-based distribution, etc.
>>>
>>> - a drain on our resources: producing these builds is time-consuming
>>> and finicky; I'm told that these builds alone are responsible for a
>>> large proportion of the energy spent preparing each release, and take
>>> away from other things that our release managers could be doing (e.g.
>>> QA and backporting fixes).
>>>
>>
>> Once numpy-vendor is set up, preparing and running the builds take about
>> fifteen minutes on my machine.
>>
>
> Well, it builds but the current setup is just broken. Try building a
> binary and running the tests - you should find that there's a segfault in
> the np.fromfile tests (see https://github.com/scipy/scipy/issues/5540).
> And that kind of thing is incredibly painful to debug and fix.
>
>
>> That assumes familiarity with the process, a first time user will spend
>> significantly more time. Most of the work  in a release is keeping track of
>> reported bugs and fixing them. Tracking deprecations and such also takes
>> time.
>>
>>
>>> So the idea would be that for 1.11, we create a 1.11 directory on
>>> sourceforge and upload one final file: a README explaining the
>>> situation, a pointer to the source releases on pypi, and some links to
>>> places where users can find better-supported windows builds (Gohlke's
>>> page, Anaconda, etc.). I think this would serve our users better than
>>> the current system, while also freeing up a drain on our resources.
>>>
>>
>> What about beta releases? I have nothing against offloading part of the
>> release process, but if we do, we need to determine how to coordinate it
>> among the different parties, which might be something of a time sink in
>> itself.
>>
>
> We need to ensure that the MSVC builds work. But that's not new, that was
> always necessary for a release. Christophe has always tested beta/rc
> releases which is super helpful, but we need to get Appveyor CI to work
> soon.
>
> Ralf
>
>


An appveyor setup is a great idea. An appveyor build matrix with the
various supported MSVC versions would do a lot more to prevent
compatibility issues than periodically building installers with old
versions of
MinGW. The effort toward a MinGW-based build is valuable, but having a
CI system test for MSVC compatibility will be valuable regardless of where
things go with that.

Best,
-Ian


Re: [Numpy-discussion] Numpy, BLAS, and CBLAS questions

2015-07-13 Thread Ian Henriksen
On Mon, Jul 13, 2015 at 11:08 AM Nathaniel Smith <n...@pobox.com> wrote:

 I think that if you can make this work then it would be great (who doesn't
 like less code that does more?), but that as a practical matter
 accomplishing this may be difficult. AFAIK there simply is no way to write
 generic code to probe the system for any BLAS, except by explicitly
 having a list of supported BLASes and checking for each in sequence. Given
 that new BLAS libraries appear at a rate of ~1/year, and that we already
 support most of the viable ones, the current approach is practically
 reasonable even if non-ideal in principle.

Agreed. From my experience, the existing BLAS and LAPACK implementations
are only mostly interchangeable. There may be some small differences
between the libraries that will need some special casing. On the other
hand, if you can get it to work properly, go for it. Some generalization
and simplification is probably possible.

 Keep in mind that any solution needs to support weird systems too,
 including Windows. I'm not sure we can assume that all BLAS libraries are
 ABI compatible either. Debian/Ubuntu make sure that this is true for the
 ones they ship, but not all systems are so carefully managed. For LAPACK in
 particular I'm pretty sure there are multiple ABIs in active use that scipy
 supports via code in numpy.distutils. (Hopefully someone with more
 expertise will speak up.)

Yes, there are ABI patches in SciPy that put together a uniform ABI for all
currently supported BLAS and LAPACK libraries (
https://github.com/scipy/scipy/tree/master/scipy/_build_utils). With a
little luck, some of these other libraries won't need as much special
casing, but, I don't know offhand.
-Ian


Re: [Numpy-discussion] Numpy, BLAS, and CBLAS questions

2015-07-13 Thread Ian Henriksen
On Mon, Jul 13, 2015 at 7:55 AM Nathaniel Smith <n...@pobox.com> wrote:

 On Jul 13, 2015 1:44 AM, Eric Martin <e...@ericmart.in> wrote:
 
  My procedure questions:
  Is the installation procedure I outlined above reasonable, or does it
 contain steps that could/should be removed? Having to edit Numpy source
 seems sketchy to me. I largely came up with this procedure by looking up
 tutorials online and by trial and error. I don't want to write
 documentation that encourages people to do something in a non-optimal way,
 so if there is a better way to do this, please let me know.

 I'll leave others with more knowledge to answer your other questions, but
 if you're interested in making it easy for others to link numpy against
 these libraries, I'd suggest modifying the numpy source further, by
 submitting a patch that teaches numpy.distutils how to detect and link
 against these libraries automatically :-). There are several libraries we
 already know how to do this for that you can compare against for reference,
 and this will also help for other libraries that use blas and
 numpy.distutils, like scipy.

Supporting Eigen sounds like a great idea. BLIS would be another
one worth supporting at some point. As far as implementation goes, it may
be helpful to look at https://github.com/numpy/numpy/pull/3642 and
https://github.com/numpy/numpy/pull/4191 for the corresponding set of
changes for OpenBLAS. That could be a good starting point.
Thanks for bringing this up!
-Ian Henriksen

  Eigen has excellent performance. On my i5-5200U (Broadwell) CPU, I
 found Eigen BLAS compiled with AVX and FMA instructions to take 3.93s to
 multiply 2 4000x4000 double matrices with a single thread, while my install
 of Numpy from ubuntu took 9s (and used 4 threads on my 2 cores). My Ubuntu
 numpy appears to be built against libblas, which I think is the reference
 implementation.

 If you're using the numpy packages distributed by Ubuntu, then it should
 be possible to switch to openblas just by apt installing openblas and then
 maybe fiddling with update-alternatives.

 -n



Re: [Numpy-discussion] PR added: frozen dimensions in gufunc signatures

2015-06-22 Thread Ian Henriksen
On Fri, Aug 29, 2014 at 2:55 AM Jaime Fernández del Río 
<jaime.f...@gmail.com> wrote:

 On Thu, Aug 28, 2014 at 5:40 PM, Nathaniel Smith <n...@pobox.com> wrote:

 Some thoughts:


 But, for your computed dimension idea I'm wondering if what we should
 do instead is just let a gufunc provide a C callback that looks at the
 input array dimensions and explicitly says somehow which dimensions it
 wants to treat as the core dimensions and what its output shapes will
 be. There's no rule that we have to extend the signature mini-language
 to be Turing complete, we can just use C :-).

 It would be good to have a better motivation for computed gufunc
 dimensions, though. Your all pairwise cross products example would
 be *much* better handled by implementing the .outer method for binary
 gufuncs: pairwise_cross(a) == cross.outer(a, a). This would make
 gufuncs more consistent with ufuncs, plus let you do
 all-pairwise-cross-products between two different sets of cross
 products, plus give us all-pairwise-matrix-products for free, etc.


 The outer for binary gufuncs sounds like a good idea. A reduce for binary
 gufuncs that allow it (like square matrix multiplication) would also be
 nice. But going back to the original question, the pairwise whatevers were
 just an example: one could come up with several others, e.g.:

 (m),(n)->($p),($q) with $p = m - n and $q = n - 1, could be (I think)
 the signature of a polynomial division gufunc
 (m),(n)->($p), with $p = m - n + 1, could be the signature of a
 convolution or correlation gufunc
 (m)->($n), with $n = m / 2, could be some form of downsampling gufunc


 While you're messing around with the gufunc dimension matching logic,
 any chance we can tempt you to implement the optional dimensions
 needed to handle '@', solve, etc. elegantly? The rule would be that
 you can write something like
    (n?,k),(k,m?)->(n?,m?)
 and the ? dimensions are allowed to take on an additional value,
 "nothing at all". If there's no dimension available in the input, then
 we act like it was reshaped to add a dimension with shape 1, and then
 in the output we squeeze this dimension out again. I guess the rules
 would be that (1) in the input, you can have ? dimensions at the
 beginning or the end of your shape, but not both at the same time, (2)
 any dimension that has a ? in one place must have it in all places,
 (3) when checking argument conformity, "nothing at all" only matches
 against "nothing at all", not against 1; this is because if we allowed
 (n?,m),(n?,m)->(n?,m) to be applied to two arrays with shapes (5,) and
 (1, 5), then it would be ambiguous whether the output should have
 shape (5,) or (1, 5).


 I definitely do not mind taking a look into it. I need to think a little
 more about the rules to convince myself that there is a consistent set of
 them that we can use. I also thought there may be a performance concern,
 that you may want to have different implementations when dimensions are
 missing, not automatically add a 1 and then remove it. It doesn't seem to
 be the case with either `np.dot` or `np.solve`, so maybe I am being
 overly cautious.

 Thanks for your comments and ideas. I have a feeling there are some nice
 features hidden in here, but I can't seem to figure out what should they be
 on my own.

 Jaime

 --
 (\__/)
 ( O.o)
 (  ) This is Bunny. Copy Bunny into your signature and help him in his
 plans for world domination.


I'm not sure where this is at, given the current amount of work that is
coming from the 1.10 release, but this sounds like a really great idea.
Computed/fixed dimensions would allow gufuncs for things like:
- polynomial multiplication, division, differentiation, and integration
- convolutions
- views of different types (see the corresponding discussion at
http://permalink.gmane.org/gmane.comp.python.numeric.general/59847).
Some of these examples would work better with gufuncs that can construct
views and have an axes keyword, but this is exactly the kind of
functionality that would be really great to have.
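
To make the need for computed dimensions concrete: convolution (equivalently,
polynomial multiplication) has a core output length that depends on both
inputs, m + n - 1, which no fixed signature can express:

import numpy as np

# (1 + 2x + 3x^2) * (1 - x), coefficients in ascending order
a = np.array([1.0, 2.0, 3.0])
b = np.array([1.0, -1.0])
c = np.convolve(a, b)
print(c, c.shape)   # [ 1.  1.  1. -3.], shape (4,) == (3 + 2 - 1,)
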
Thanks for the great work!
-Ian Henriksen


Re: [Numpy-discussion] How to Force Storage Order

2015-03-31 Thread Ian Henriksen
On Tue, Mar 31, 2015, 12:50 AM Klemm, Michael <michael.kl...@intel.com>
wrote:

 Dear all,

 I have found a bug in one of my codes and the way it passes a Numpy matrix
 to MKL's dgemm routine.  Up to now I was assuming that the matrices are
 using C order.  I guess I have to correct this assumption :-).

 I have found that the numpy.linalg.svd algorithm creates the resulting U,
 sigma, and V matrices with Fortran storage.  Is there any way to force
 these kinds of algorithms to not change the storage order?  That would make
 passing the matrices to the native dgemm operation much easier.

 Cheers,
 -michael

 Dr.-Ing. Michael Klemm
 Senior Application Engineer
 Software and Services Group
 Developer Relations Division
 Phone   +49 89 9914 2340
 Cell+49 174 2417583


 Intel GmbH
 Dornacher Strasse 1
 85622 Feldkirchen/Muenchen, Deutschland
 Sitz der Gesellschaft: Feldkirchen bei Muenchen
 Geschaeftsfuehrer: Christian Lamprechter, Hannes Schwaderer, Douglas Lusk
 Registergericht: Muenchen HRB 47456
 Ust.-IdNr./VAT Registration No.: DE129385895
 Citibank Frankfurt a.M. (BLZ 502 109 00) 600119052



Why not just call the algorithm on the transpose of the original array?
That will transpose and reverse the order of the SVD, but taking the
transposes once the algorithm is finished will ensure they are C ordered.
You could also use np.ascontiguousarray on the output arrays, though that
results in unnecessary copies that change the memory layout.
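
A sketch of the transpose trick (if the factors come back Fortran ordered as
described, their transposes are C ordered and no copies are made):

import numpy as np

A = np.random.rand(5, 3)
u, s, vt = np.linalg.svd(A.T, full_matrices=False)   # factor A.T instead
U, Vt = vt.T, u.T                                    # so A == U @ diag(s) @ Vt
print(np.allclose(A, (U * s) @ Vt))                  # True
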
Best of luck!
-Ian Henriksen


Re: [Numpy-discussion] Views of a different dtype

2015-02-03 Thread Ian Henriksen
On Tue Feb 03 2015 at 1:47:34 PM Jaime Fernández del Río 
<jaime.f...@gmail.com> wrote:

 On Tue, Feb 3, 2015 at 8:59 AM, Sebastian Berg <sebast...@sipsolutions.net>
  wrote:

 On Di, 2015-02-03 at 07:18 -0800, Jaime Fernández del Río wrote:
 
 snip
 
 
 
  Do you have a concrete example of what a non (1, 1) array that fails
  with relaxed strides would look like?
 
 
  If we used, as right now, the array flags as a first choice point, and
  only if none is set try to determine it from the strides/dimensions
  information, I fail to imagine any situation where the end result
  would be worse than now. I don't think that a little bit of
  predictable surprise in an advanced functionality is too bad. We
  could start raising in the face of ambiguity, "we refuse to guess"
  errors, even for the current behavior you show above, but that is more
  likely to trip people by not giving them any simple workaround, which
  it seems to me would be "add a .T if all dimensions are 1" in some
  particular situations. Or are you thinking of something more serious
  than a shape mismatch when you write about breaking current code?
 

 Yes, I am talking only about wrongly shaped results for some fortran
 order arrays. A (20, 1) fortran order complex array being viewed as
 float, will with relaxed strides become a (20, 2) array instead of a
 (40, 1) one.


 That is a limitation of the current implementation too, and happens
 already whenever relaxed strides are in place. Which is the default for
 1.10, right?

 Perhaps giving 'view' an 'order' or 'axis' kwarg could make sense after
 all? It should probably be more of a hint of what to do (fortran vs c) when
 in doubt. C would prioritize last axis, F the first, and we could even
 add a raise option to have it fail if the axis cannot be inferred from
 the strides and shape. Current behavior would is equivalent to what C
 would do.

 Jaime



IMHO, the best option would be something like this:

- When changing to a type with smaller itemsize, add a new axis after the
others so the resulting array is C contiguous (unless a different axis is
specified by a keyword argument). The advantage here is that if you index
the new view using the old indices for an entry, you get an array showing
its representation in the new type.
- No shape change for views with the same itemsize
- When changing to a type with a larger itemsize, collapse along the last
axis unless a different axis is specified, throwing an error if the array's
layout does not allow collapsing along that axis.

The last point essentially is just adding an axis argument. I like that
idea because it gives users the ability to do all that the array's memory
layout allows in a clear and concise way. Throwing an error if the default
axis doesn't work would be a good way to prevent strange bugs from
happening when the default behavior is expected.

The first point would be a break in backwards compatibility, so I'm not
sure if it's feasible at this point. The advantage would be that all
arrays returned when using this functionality would be contiguous along the
last axis. The shape of the new array would be independent of the memory
layout of the original one. This would also be a much cleaner way to ensure
that views of a different type are always reversible while still allowing
for relaxed strides.
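
For reference, the current behavior on a C-contiguous array expands the last
axis (a minimal illustration; the axis keyword discussed above is
hypothetical):

import numpy as np

a = np.zeros((20, 1), dtype=np.complex128)
print(a.view(np.float64).shape)   # (20, 2); the Fortran-ordered case is
                                  # where the ambiguity discussed above bites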

Either way, thanks for looking into this. It's a great feature to have
available.

-Ian Henriksen





 
  If there are any real loopholes in expanding this functionality, then
  lets not do it, but we know we have at least one user unsatisfied with
  the current performance, so I really think it is worth trying. Plus,
  I'll admit to that, messing around with some of these stuff deep
  inside the guts of the beast is lots of fun! ;)
 

 I do not think there are loopholes with expanding this functionality. I
 think there have been regressions when we turned relaxed strides on,
 because suddenly the fortran order array might be expanded along a 1-sized
 axis, because it is also C order. So I wonder if we can fix these
 regressions and at the same time maybe provide a more intuitive approach
 than using the memory order blindly.


 - Sebastian

 
 I guess the main consideration for this is that we may be stuck with
 stuff b/c of backwards compatibility. Can you maybe say a little bit
 about what is allowed now, and what constraints that puts on things?
 E.g. are we already grovelling around in strides and picking random

Re: [Numpy-discussion] Best way to expose std::vector to be used with numpy

2014-10-14 Thread Ian Henriksen
On Mon, Oct 13, 2014 at 8:39 PM, Charles R Harris <charlesr.har...@gmail.com>
 wrote:



 On Mon, Oct 13, 2014 at 12:54 PM, Sebastian Berg 
 <sebast...@sipsolutions.net> wrote:

 On Mo, 2014-10-13 at 13:35 +0200, Daniele Nicolodi wrote:
  Hello,
 
  I have a C++ application that collects float, int or complex data in a
  possibly quite large std::vector. The application has some SWIG
  generated python wrappers that expose this vector to python. However,
  the standard way in which SWIG exposes the data is to create a touple
  and pass this to python, where it is very often converted to a numpy
  array for processing. Of course this is not efficient.
 
  I would like therefore to generate a smarter python interface. In python
  3 I would without doubt go for implementing the buffer protocol, which
  enables seamless interfacing with numpy. However, I need to support also
  python 2 and there the buffer protocol is not as nice.

 Isn't the new buffer protocol in python 2.6 or 2.7 already? There is at
 least a memoryview object in python 2, which maybe could be used to the
 same effect?


 No memoryview in python2.6, but the older buffer protocol is there. Is
 Boost Python an option?

 snip

 Chuck



Here's an idea on how to avoid the buffer protocol entirely.
It won't be perfectly optimal, but it is a simple solution that
may be good enough.

It could be that the shape and type inference and the Python
loops that are run when converting the large tuple to a NumPy
array are the main sticking points. In that case, the primary
bottleneck could be eliminated by looping and copying at the C level.
You could use Cython and just copy the needed information
manually into the array. Here's a rough outline of what the
Cython file might look like:

# distutils: language = c++
# distutils: libraries = (whatever shared library you need)
# distutils: library_dirs = (folders where the libraries are)
# distutils: include_dirs = (wherever the headers are)
# cython: boundscheck = False
# cython: wraparound = False

import numpy as np
from libcpp.vector cimport vector

# Here do the call to your C++ library
# This can be done by declaring the
# functions or whatever else you need in
# a cdef extern from "header file" block
# and then calling the functions to get
# the vector you need.

cdef arr_from_vector(vector[int] v):
    cdef:
        Py_ssize_t i
        # np.intc is the numpy dtype that matches a C int
        int[::1] a = np.empty(v.size(), dtype=np.intc)
    for i in range(v.size()):
        a[i] = v[i]
    return np.asarray(a)

def get_data():
    # Run your C++ computation here to produce the vector,
    # then copy it into a NumPy array.
    return arr_from_vector(v)
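
A minimal build script for the sketch above (assuming the code is saved as
vec_wrap.pyx; the file name is arbitrary):

# setup.py
from setuptools import setup
from Cython.Build import cythonize

setup(ext_modules=cythonize("vec_wrap.pyx"))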


Re: [Numpy-discussion] (no subject)

2014-09-18 Thread Ian Henriksen
On Thu, Sep 18, 2014 at 1:30 PM, Jonathan Helmus <jjhel...@gmail.com> wrote:

 On 09/18/2014 12:44 PM, Petr Viktorin wrote:
   On Thu, Sep 18, 2014 at 7:14 PM, Jonathan Helmus <jjhel...@gmail.com>
 wrote:
  On 09/18/2014 12:01 PM, Chris Barker wrote:
 
  Well,
 
  First of all, numpy and the python math module have a number of
 differences
  when it comes to handling these kind of special cases -- and I think
 that:
 
  1) numpy needs to do what makes the most sense for numpy and NOT mirror
 the
  math lib.
  Sure.
 
  2) the use-cases of the math lib and numpy are different, so they maybe
  _should_ have different handling of this kind of thing.
  If you have a reason for the difference, I'd like to hear it.
 
   3) I'm not sure that the core devs think these kinds of issues are
  "wrong" enough to break backward compatibility in subtle ways.
  I'd be perfectly fine with it being documented and tested (in CPython)
  as either a design mistake or design choice.
 
  But it's a fun topic in any case, and maybe numpy's behavior could be
  improved.
  My vote is that NumPy is correct here. I see no reason why
  float('inf') / 1
  and
  float('inf') // 1
  should return different results.
 
  Well, one argument is that floor division is supposed to return an
 integer
  value, and that inf is NOT an integer value. The integral part of
 infinity
  doesn't exist and thus is Not a Number.
 
 
  But nan is not an integer value either:
 
   >>> float('inf') // 1
   nan
   >>> int(float('inf') // 1)
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   ValueError: cannot convert float NaN to integer
 
  Perhaps float('inf') // 1 should raise a ValueError directly since
 there is
  no proper way to perform the floor division on infinity.
   inf is not even a *real* number; a lot of operations don't make
  mathematical sense on it. But most are defined anyway, and quite
  sanely.

 But in IEEE-754 inf is a valid floating point number (whereas NaN is
 not) and has well defined arithmetic, specifically inf / 1 == inf and
 RoundToIntegral(inf) == inf.  In the numpy example, the
 numpy.array(float('inf')) statement creates an array containing a
 float32 or float64 representation of inf.  After this I would expect a
 floor division to return inf since that is what IEEE-754 arithmetic
 specifies.

 For me the question is if the floor division should also perform a cast
 to an integer type. Since inf cannot be represented in most common
 integer formats this should raise an exception.  Since // does not
 normally perform a cast, for example type(float(5) // 2) == float, the
 point is moot.

 The real question is if Python floats follow IEEE-754 arithmetic or
 not.  If they do not, what standard are they going to follow?

  - Jonathan Helmus


Agreed. It's definitely best to follow the IEEE conventions. That will be
the most commonly expected behavior, especially in ambiguous cases like
this.
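
The difference under discussion, side by side (numpy follows the IEEE-style
floor(inf) == inf, while CPython floats return nan):

import numpy as np

print(float('inf') // 1)            # nan (CPython)
print(np.array(float('inf')) // 1)  # inf (numpy)
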
-Ian Henriksen