Re: [Numpy-discussion] Floor division on int returns float

2016-04-12 Thread T J
Thanks Eric.

Also relevant: https://github.com/numba/numba/issues/909

Looks like Numba has found a way to avoid this edge case.



On Monday, April 4, 2016, Eric Firing <efir...@hawaii.edu> wrote:

> On 2016/04/04 9:23 AM, T J wrote:
>
>> I'm on NumPy 1.10.4 (mkl).
>>
>>  >>> np.uint(3) // 2   # 1.0
>>  >>> 3 // 2   # 1
>>
>> Is this behavior expected? It's certainly not desired from my
>> perspective. If this is not a bug, could someone explain the rationale
>> to me?
>>
>> Thanks.
>>
>
> I agree that it's almost always undesirable; one would reasonably expect
> some sort of int.  Here's what I think is going on:
>
> The odd behavior occurs only with np.uint, which is np.uint64, and when
> the denominator is a signed int.  The problem is that if the denominator is
> negative, the result will be negative, so it can't have the same type as
> the unsigned numerator.  Furthermore, if the denominator is -1, the result
> will be minus the numerator, and that can't be represented by np.uint or
> np.int.  Therefore the result is returned as np.float64.  The promotion
> rules are based on what *could* happen in an operation, not on what *is*
> happening in a given instance.
>
> Eric
>
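
For reference, a minimal sketch of the promotion Eric describes, plus one way
to sidestep it by keeping both operands unsigned (results are from the NumPy
1.10-era behavior reported above and may differ in later releases):

import numpy as np

# Mixing an unsigned numerator with a (possibly negative) Python int
# promotes to float64, because the result of // with a negative
# denominator could not be represented in the unsigned type.
print((np.uint64(3) // 2).dtype)             # float64
print(np.uint64(3) // 2)                     # 1.0

# Keeping both operands unsigned keeps the result an integer type.
print((np.uint64(3) // np.uint64(2)).dtype)  # uint64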


[Numpy-discussion] Floor division on int returns float

2016-04-04 Thread T J
I'm on NumPy 1.10.4 (mkl).

>>> np.uint(3) // 2   # 1.0
>>> 3 // 2   # 1

Is this behavior expected? It's certainly not desired from my perspective.
If this is not a bug, could someone explain the rationale to me?

Thanks.


Re: [Numpy-discussion] Round away from zero (towards +/- infinity)

2014-10-03 Thread T J
It does, but it is not portable. That's why I was hoping NumPy might think
about supporting more rounding algorithms.

On Thu, Oct 2, 2014 at 10:00 PM, John Zwinck jzwi...@gmail.com wrote:

 On 3 Oct 2014 07:09, T J tjhn...@gmail.com wrote:
 
  Any bites on this?
 
  On Wed, Sep 24, 2014 at 12:23 PM, T J tjhn...@gmail.com wrote:
  Python's round function goes away from zero, so I am looking for the
 NumPy equivalent (and using vectorize() seems undesirable). In this sense,
 it seems that having a ufunc for this type of rounding could be helpful.
 
  Aside: Is there interest in a more general around() that allows users
 to specify alternative tie-breaking rules, with the default staying 'round
 half to nearest even'? [1]
  ---
  [1]
 http://stackoverflow.com/questions/16000574/tie-breaking-of-round-with-numpy

 I like the solution given in that Stack Overflow post, namely using ctypes
 to call fesetround(). Does that work for you?





[Numpy-discussion] 0/0 == 0?

2014-10-02 Thread T J
Hi, I'm using NumPy 1.8.2:

In [1]: np.array(0) / np.array(0)
Out[1]: 0

In [2]: np.array(0) / np.array(0.0)
Out[2]: nan

In [3]: np.array(0.0) / np.array(0)
Out[3]: nan

In [4]: np.array(0.0) / np.array(0.0)
Out[4]: nan


In [5]: 0/0
---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
<ipython-input-6-6549dea6d1ae> in <module>()
----> 1 0/0

ZeroDivisionError: integer division or modulo by zero



Out[1] seems odd. I get the right value in 1.8.1.  Was this fixed for 1.9.0?
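
For what it's worth, one way to keep a silent 0 (or nan) from slipping through
is to turn these cases into exceptions; a minimal sketch using np.errstate:

import numpy as np

# Treat divide-by-zero and invalid (0/0) operations as hard errors
# instead of silently returning 0 or nan.
with np.errstate(divide='raise', invalid='raise'):
    np.array(0.0) / np.array(0.0)   # raises FloatingPointError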


Re: [Numpy-discussion] Round away from zero (towards +/- infinity)

2014-10-02 Thread T J
Any bites on this?

On Wed, Sep 24, 2014 at 12:23 PM, T J tjhn...@gmail.com wrote:

 Is there a ufunc for rounding away from zero? Or do I need to do

 x2 = sign(x) * ceil(abs(x))

 whenever I want to round away from zero? Maybe the following is better?

 x_ceil = ceil(x)
 x_floor = floor(x)
 x2 = where(x >= 0, x_ceil, x_floor)

 Python's round function goes away from zero, so I am looking for the NumPy
 equivalent (and using vectorize() seems undesirable). In this sense, it
 seems that having a ufunc for this type of rounding could be helpful.

 Aside: Is there interest in a more general around() that allows users to
 specify alternative tie-breaking rules, with the default staying 'round
 half to nearest even'? [1]

 Also, what is the difference between NumPy's fix() and trunc() functions?
 It seems like they achieve the same goal. trunc() was added in 1.3.0. So is
 fix() just legacy?

 ---
 [1]
 http://stackoverflow.com/questions/16000574/tie-breaking-of-round-with-numpy




[Numpy-discussion] Round away from zero (towards +/- infinity)

2014-09-24 Thread T J
Is there a ufunc for rounding away from zero? Or do I need to do

x2 = sign(x) * ceil(abs(x))

whenever I want to round away from zero? Maybe the following is better?

x_ceil = ceil(x)
x_floor = floor(x)
x2 = where(x >= 0, x_ceil, x_floor)

Python's round function goes away from zero, so I am looking for the NumPy
equivalent (and using vectorize() seems undesirable). In this sense, it
seems that having a ufunc for this type of rounding could be helpful.
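
For reference, a vectorized round-away-from-zero can be written with existing
ufuncs; a minimal sketch:

import numpy as np

def round_away_from_zero(x):
    # take the ceiling of the magnitude, then restore the original sign
    return np.copysign(np.ceil(np.abs(x)), x)

print(round_away_from_zero(np.array([-1.2, -0.5, 0.5, 1.2])))
# [-2. -1.  1.  2.]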

Aside: Is there interest in a more general around() that allows users to
specify alternative tie-breaking rules, with the default staying 'round
half to nearest even'? [1]

Also, what is the difference between NumPy's fix() and trunc() functions?
It seems like they achieve the same goal. trunc() was added in 1.3.0. So is
fix() just legacy?
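
On the fix()/trunc() question, both round toward zero and agree on values; the
practical difference, as I understand it, is that trunc() is a true ufunc while
fix() is a thin Python-level wrapper:

import numpy as np

x = np.array([-1.7, -0.2, 0.2, 1.7])
print(np.fix(x))                        # [-1. -0.  0.  1.]
print(np.trunc(x))                      # [-1. -0.  0.  1.]
print(isinstance(np.trunc, np.ufunc))   # True
print(isinstance(np.fix, np.ufunc))     # False -- fix is a Python wrapper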

---
[1]
http://stackoverflow.com/questions/16000574/tie-breaking-of-round-with-numpy


[Numpy-discussion] Missing Data

2014-03-26 Thread T J
What is the status of:

   https://github.com/numpy/numpy/blob/master/doc/neps/missing-data.rst

and of missing data in Numpy, more generally?

Is np.ma.array still the state-of-the-art way to handle missing data? Or
has something better and more comprehensive been put together?
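
For context, the masked-array route I'm referring to, in a minimal sketch:

import numpy as np

x = np.ma.array([1.0, 2.0, 3.0, 4.0], mask=[False, True, False, False])

print(x.mean())          # 2.666...  : masked entries are skipped in reductions
print(x + 10)            # [11.0 -- 13.0 14.0] : the mask propagates elementwise
print(x.filled(np.nan))  # [ 1. nan  3.  4.]   : or fill to get a plain ndarray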


[Numpy-discussion] Vectorize and ufunc attribute

2013-03-12 Thread T J
Prior to 1.7, I had working compatibility code such as the following:


if has_good_functions:
    # http://projects.scipy.org/numpy/ticket/1096
    from numpy import logaddexp, logaddexp2
else:
    logaddexp = vectorize(_logaddexp, otypes=[numpy.float64])
    logaddexp2 = vectorize(_logaddexp2, otypes=[numpy.float64])

    # Run these at least once so that .ufunc.reduce exists
    logaddexp([1.,2.,3.],[1.,2.,3.])
    logaddexp2([1.,2.,3.],[1.,2.,3.])

    # And then make reduce available at the top level
    logaddexp.reduce = logaddexp.ufunc.reduce
    logaddexp2.reduce = logaddexp2.ufunc.reduce


The point was that I wanted to treat the output of vectorize as a hacky
drop-in replacement for a ufunc.  In 1.7, I discovered that vectorize had
changed (https://github.com/numpy/numpy/pull/290), and now there is no
longer a ufunc attribute at all.

Should this be added back in?  Besides hackish drop-in replacements, I see
value in being able to call reduce, accumulate, etc. (when possible) on
the output of vectorize().


Re: [Numpy-discussion] Vectorize and ufunc attribute

2013-03-12 Thread T J
On Tue, Mar 12, 2013 at 9:59 AM, Bradley M. Froehle
brad.froe...@gmail.com wrote:

 T J:

 You may want to look into `numpy.frompyfunc` (
 http://docs.scipy.org/doc/numpy/reference/generated/numpy.frompyfunc.html
 ).


Yeah, that's better, but it doesn't respect the output type of the function.
It would be nice if this supported the otypes keyword.
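
In the meantime, a workaround sketch: wrap np.frompyfunc and cast the
object-dtype result back to the desired type (the _logaddexp helper below is
just an illustrative stand-in for the real scalar function):

import math
import numpy as np

def _logaddexp(x, y):
    # illustrative scalar implementation of log(exp(x) + exp(y))
    m = max(x, y)
    return m + math.log(math.exp(x - m) + math.exp(y - m))

_ufunc = np.frompyfunc(_logaddexp, 2, 1)   # a real ufunc, but object output

def logaddexp(x, y):
    # cast the object-dtype result back to float64, emulating otypes
    return np.asarray(_ufunc(x, y), dtype=np.float64)

# reduce/accumulate remain available on the underlying ufunc:
total = _ufunc.reduce(np.array([1.0, 2.0, 3.0]))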


Re: [Numpy-discussion] Fwd: [numpy] ENH: Initial implementation of a 'neighbor' calculation (#303)

2012-10-13 Thread T J
On Fri, Oct 12, 2012 at 1:04 PM, Sturla Molden stu...@molden.no wrote:

 I'm still rather sure GIS functionality belongs in scipy.spatial instead
 of numpy.


From the link:


FocalMax

Finds the highest value for each cell location on an input grid within
a specified neighborhood and sends it to the corresponding cell
location on the output grid.


Isn't this much more generic than GIS?  Maybe the criterion should be:
"Would you expect to find extensive discussion of this functionality
within a standard text on spatial data structures?"  Sliding windows and
nice boundary handling seem less spatial to me, and much further from
B-trees, quadtrees, kd-trees, persistent data structures, etc.  Can you
elaborate your position a bit more?  Sorry if I'm being dense.
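
For what it's worth, the FocalMax operation quoted above can already be
expressed with scipy.ndimage; a quick sketch:

import numpy as np
from scipy import ndimage

grid = np.random.rand(5, 5)

# highest value within a 3x3 neighborhood around every cell,
# i.e. a FocalMax with a square window
focal_max = ndimage.maximum_filter(grid, size=3, mode='nearest')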


Re: [Numpy-discussion] Meta: help, devel and stackoverflow

2012-06-30 Thread T J
On Sat, Jun 30, 2012 at 1:26 PM, josef.p...@gmail.com wrote:

 just some statistics

 http://stackoverflow.com/questions/tagged/numpy
 769 followers, 2,850  questions tagged

 a guess: average response time for regular usage question far less than an
 hour

 http://stackoverflow.com/questions/tagged/scipy
 446 followers, 991 questions tagged



Yes, they are frequently very quick and precise.

To provide yet another data point, I will mention that I used to be an avid
follower of comp.text.tex.  I would post questions there and also read it
for knowledge.  Now, I use http://tex.stackexchange.com/ almost
exclusively.  I know many others have done the same.  I've also noticed a
number of LaTeX gurus using the stackexchange site more and more.  Try
googling a LaTeX (esp TikZ) question.  Would you rather read through an
archived newsgroup (mailing list in NumPy's case) or have a webpage with
useful features, embedded images, etc?

jdh noticed this as well: the majority of the messages to numpy-discussion
in the last 2 months have not been usage questions but decisions,
releases, debates, etc.  Personally, I would push for the stackexchange
solution over a 'user' mailing list.  That said, comp.text.tex and
tex.stackexchange.com coexist just fine---it just means there is redundancy
and not the good kind IMO.


Re: [Numpy-discussion] Meta: help, devel and stackoverflow

2012-06-30 Thread T J
On Sat, Jun 30, 2012 at 1:50 PM, srean srean.l...@gmail.com wrote:


 Anecdotal data-point:
 I have been  happy with SO in general. It works for certain types of
 queries very well. OTOH if the answer to the question is known only to
 a few and he/she does not happen to be online at the time the question
 was posted, and he/she does not pull such possible questions by
 key-words, that question is all but history.

 The difference is that on a mailing list questions are pushed on to
 people who might be able to answer it, whereas in SO model people have
 to actively seek questions they want to answer. Unanticipated, niche
 questions tend to disappear.


Isn't that what the various sections are for?

http://stackoverflow.com/questions?sort=newest

http://stackoverflow.com/questions?sort=unanswered

And then, if you want modification-by-modification updates:

http://stackoverflow.com/questions?sort=active

Entries are sorted by date and you can view as many pages worth as are
available.


Re: [Numpy-discussion] Meta: help, devel and stackoverflow

2012-06-28 Thread T J
On Thu, Jun 28, 2012 at 3:23 PM, Fernando Perez fperez@gmail.com wrote:

 On Thu, Jun 28, 2012 at 3:06 PM, srean srean.l...@gmail.com wrote:
   What I like about having two lists is that on one hand it does not
  prevent me or you from participating in both, on the other hand it
  allows those who dont want to delve too deeply in one aspect or the
  other, the option of a cleaner inbox, or the option of having separate
  inboxes. I for instance would like to be in both the lists, perhaps
  mostly as a lurker, but still would want to have two different folders
  just for better organization.

 I just want to mention that even as a project leader, I benefit from
 this: when I'm swamped, I simply ignore the user list.  Not a nice
 thing to do, perhaps, but given the choice between moving the project
 forward and helping a new user, with often very limited time, I think
 it's the best solution possible.  Of course I do help in the user list
 when I can, but I mostly encourage more experienced users to help new
 ones, so that our small dev team can spend its limited time moving the
 project forward.


I'm okay with having two lists as it does filtering for me, but this seems
like a sub-optimal solution.

Observation: Some people would like to apply labels to incoming messages.
Reality: Email was not really designed for that.

We can hack it by using two different email addresses, but why not just
keep this list as is and make a concentrated effort to promote the use of
2.0 technologies, like stackoverflow/askbot/etc?  There, people can put as
many tags as desired on questions: matrix, C-API, iteration, etc.
Potentially, these tags would streamline everyone's workflow.  The
stackoverflow setup also makes it easier for users to search for solutions
to common questions, and know that the top answer is still an accurate
answer.  [No one likes finding old invalid solutions.]  The reputation
system and up/down votes also help new users figure out which responses to
trust.

As others have explained, it does seem that there are distinct types of
discussions that take place on this list.

1)  There are community discussions/debates.

Examples are the NA discussion, the bug tracker, release schedule, ABI/API
changes, matrix rank tolerance too low, lazy evaluation, etc.   These are
clearly mailing-list topics.   If you look at all the messages for the last
two(!) months, it seems like this type of message has been the dominant
type.

2) There are also standard questions.

Recent examples are "memory allocation at assignment", "dot() function
question", "not expected output of fill_diagonal", and "silly isscalar
question".  These messages seem much more suited to the stackoverflow
environment.  In fact, I'd be happy if we redirected such questions to
stackoverflow.  This has the added benefit that responses to such questions
will stay on topic.  Note that if a stackoverflow question seeds a
discussion, then someone can start a new thread on the mailing list which
cites the stackoverflow question.

tl;dr

Keep this list the same, and push user questions to stackoverflow instead
of pushing them to a user list.


Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

2012-05-23 Thread T J
On Wed, May 23, 2012 at 4:16 PM, Kathleen M Tacina 
kathleen.m.tac...@nasa.gov wrote:

 On Wed, 2012-05-23 at 17:31 -0500, Nathaniel Smith wrote:

 On Wed, May 23, 2012 at 10:53 PM, Travis Oliphant tra...@continuum.io
 wrote: To be clear, I'm not opposed to the change, and it looks like we
 should go forward. In my mind it's not about developers vs. users as
 satisfying users is the whole point.   The purpose of NumPy is not to make
 its developers happy :-).   But, users also want there to *be* developers on
 NumPy so developer happiness is not irrelevant. In this case, though, there
 are consequences for users because of the double copy if a user wants to make
 their code future proof.   We are always trading off predicted
 user-experiences.  I hope that we all don't have the same perspective on
 every issue or more than likely there aren't enough voices being heard from
 real users.
 I'm not really worried about users who have a problem with the double-copy.
 It's a totally legitimate concern, but anyone who has that concern has already
 understood the issues well enough to be able to take care of themselves, and
 decided that it's worth the effort to special-case this. They can check
 whether the returned array has .base set to tell whether it's an array or a
 view, use a temporary hack to check for the secret warning flag in
 arr.flags.num, check the numpy version, all sorts of things to get them
 through the one version where this matters. The suggestion in the docs to make
 a copy is not exactly binding :-).
 -- Nathaniel


 As a real user, if I care about whether an array arr2 is a copy or a
 view, I usually either check arr2.flags.owndata or append copy() to the
 statement that created arr2, e.g., arr2 = arr.diagonal().copy().

 Numpy rules on views vs. copies sometimes require a bit of thought, and so
 I'll frequently just check the flags or make a copy instead of thinking.
 (More foolproof :).)




It seems that there are a number of ways to check if an array is a view.
Do we have a preferred way in the API that is guaranteed to stay
available?  Or are all of the various methods here to stay?
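
For concreteness, the two checks mentioned in this thread, in a minimal sketch:

import numpy as np

a = np.arange(9).reshape(3, 3)
view = a[0]           # unambiguously a view
copy = a[0].copy()    # unambiguously a copy

print(view.base is a)          # True  -- a view keeps a reference to its parent
print(view.flags.owndata)      # False
print(copy.flags.owndata)      # True  -- a copy owns its data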


Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

2012-05-11 Thread T J
On Fri, May 11, 2012 at 1:12 PM, Mark Wiebe mwwi...@gmail.com wrote:

 On Fri, May 11, 2012 at 2:18 PM, Pauli Virtanen p...@iki.fi wrote:

  11.05.2012 17:54, Frédéric Bastien wrote:
  In Theano we use a view, but that is not relevant as it is the
  compiler that tells what is in-place. So this is invisible to the user.

  What about a parameter to diagonal() that defaults to returning a view
  that is not writable, as you said. The user can then choose what they
  want and this doesn't break the interface.
 [clip]

 Agreed, it seems this is the appropriate way to go here:
 `diagonal(copy=True)`. A more obscure alternative would be to add a
 separate method that returns a view.


 This looks like the best way to deal with it, yes.

 Cheers,
 Mark



 I don't think changing the default behavior in a later release is a good
 idea. It's a sort of an API wart, but IMHO better that than subtle code
 breakage.



copy=True seems fine, but is this the final plan?   What about long term,
should diag() eventually be brought in line with transpose() and reshape()
so that it is a view by default?  Changing default behavior is certainly
not something that should be done all the time, but it *can* be done if
deprecated appropriately.  A more consistent API is better than one with
warts (if this particular issue is actually seen as a wart).


Re: [Numpy-discussion] Numpy governance update

2012-02-15 Thread T J
On Wed, Feb 15, 2012 at 12:45 PM, Alan G Isaac alan.is...@gmail.com wrote:


 for the core developers.  The right way to produce a
 governance structure is to make concrete proposals and
 show how these proposals are in the interest of the
 *developers* (as well as of the users).


At this point, it seems to me that Matthew is simply trying to make the
case that ==some== governance structure should be (or should be more
clearly) set up.  Even this fairly modest goal seems to be receiving
blowback from some.

Perhaps a specific proposal would be more convincing to those in
opposition, but I'd like to think that the merits of
a governance structure could be appreciated without having the specific
details of what such a structure would look like.  Matthew's links
certainly make a good case, IMO.  He has also described a number of
scenarios where a governance structure would be helpful (even if we think
the likelihood of such scenarios is small).


Re: [Numpy-discussion] in the NA discussion, what can we agree on?

2011-11-04 Thread T J
On Fri, Nov 4, 2011 at 11:59 AM, Pauli Virtanen p...@iki.fi wrote:

 I have a feeling that if you don't start by mathematically defining the
 scalar operations first, and only after that generalize them to arrays,
 some conceptual problems may follow.


Yes.  I was going to mention this point as well.


 For shorthand, we can refer to the above choices with the nomenclature

shorthand ::= propagation destructivity payload_type
propagation ::= P | N
destructivity ::= d | n | s
payload_type ::= S | E | C

 That makes 2 * 3 * 3 = 18 different ways to construct consistent
 behavior. Some of them might make sense, the problem is to find out which
:)


This is great for the discussion, IMO.  The self-destructive assignment
hasn't come up at all, so I'm guessing we can probably ignore it.

---

Can you be a bit more explicit on the payload types?  Let me try, respond
with corrections if necessary.

S is singleton and in the case of missing data, we take it to mean that
we only care that data is missing and not *how* missing the data is.

>>> x = MISSING
>>> -x  # unary
MISSING
>>> x + 3  # binary
MISSING

E means that we acknowledge that we want to track the how, but that we
aren't interested in it. So raise an error.
In the case of ignored data, we might have:

>>> x = 2
>>> ignore(x)
>>> x
IGNORED(2)
>>> -x
Error
>>> x + 3
Error

C means that we acknowledge that we want to track the how, and that we
are interested in it.  So do the computations.

>>> x = 2
>>> ignore(x)
>>> -x
IGNORED(-2)
>>> x + 3
IGNORED(5)

Did I get that mostly right?

 NAN and NA apparently fall into the PdS class.


Here is where I think we need to be a bit more careful.  It is true that we
want NAN and MISSING to propagate, but then we additionally want to ignore
it sometimes.  This is precisely why we have functions like nansum.
Although people are well-aware of this desire, I think this thread has
largely conflated the issues when discussing propagation.

To push this forward a bit, can I propose that IGNORE behave as:   PnC

>>> x = np.array([1, 2, 3])
>>> y = np.array([10, 20, 30])
>>> ignore(x[2])
>>> x
[1, IGNORED(2), 3]
>>> x + 2
[3, IGNORED(4), 5]
>>> x + y
[11, IGNORED(22), 33]
>>> z = x.sum()
>>> z
IGNORED(6)
>>> unignore(z)
>>> z
6
>>> x.sum(skipIGNORED=True)
4

When done in this fashion, I think it is perfectly fine for masks to be
unmasked.
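
For comparison, this is roughly how today's np.ma behaves on the
elementwise-versus-reduction split being discussed (a quick sketch; the
IGNORED syntax above is of course hypothetical):

import numpy as np

x = np.ma.array([1, 2, 3], mask=[0, 0, 1])
y = np.ma.array([10, 20, 30])

print(x + 2)    # [3 4 --]   : the mask propagates through elementwise ops
print(x + y)    # [11 22 --]
print(x.sum())  # 3          : reductions skip masked entries by default
print(x.data)   # [1 2 3]    : the underlying payload can still be recovered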


Re: [Numpy-discussion] in the NA discussion, what can we agree on?

2011-11-04 Thread T J
On Fri, Nov 4, 2011 at 1:03 PM, Gary Strangman
str...@nmr.mgh.harvard.edu wrote:


 To push this forward a bit, can I propose that IGNORE behave as:   PnC

  x = np.array([1, 2, 3])
  y = np.array([10, 20, 30])
  ignore(x[2])
  x
 [1, IGNORED(2), 3]
  x + 2
 [3, IGNORED(4), 5]
  x + y
 [11, IGNORED(22), 33]
  z = x.sum()
  z
 IGNORED(6)
  unignore(z)
  z
 6
  x.sum(skipIGNORED=True)
 4


 In my mind, IGNORED items should be skipped by default (i.e., skipIGNORED
 seems redundant ... isn't that what ignoring is all about?). Thus I might
 instead suggest the opposite (default) behavior at the end:


  x = np.array([1, 2, 3])
 y = np.array([10, 20, 30])
 ignore(x[2])
 x

 [1, IGNORED(2), 3]

 x + 2

 [3, IGNORED(4), 5]

 x + y

 [11, IGNORED(22), 33]

 z = x.sum()
 z

 4

 unignore(x).sum()

 6

 x.sum(keepIGNORED=True)

 6

 (Obviously all the syntax is totally up for debate.)



I agree that it would be ideal if the default were to skip IGNORED values,
but that behavior seems inconsistent with its propagation properties (such
as when adding arrays with IGNORED values).  To illustrate, when we did
x+2, we were stating that:

IGNORED(2) + 2 == IGNORED(4)

which means that we propagated the IGNORED value.  If we were to skip them
by default, then we'd have:

IGNORED(2) + 2 == 2

To be consistent, then it seems we also should have had:

>>> x + 2
[3, 2, 5]

which I think we can agree is not so desirable.   What this seems to come
down to is that we tend to want different behavior when we are doing
reductions, and that for IGNORED data, we want it to propagate in every
situation except for a reduction (where we want to skip over it).

I don't know if there is a well-defined way to distinguish reductions from
the other operations.  Would it hold for generalized ufuncs?  Would it hold
for other functions which might return arrays instead of scalars?


Re: [Numpy-discussion] in the NA discussion, what can we agree on?

2011-11-04 Thread T J
On Fri, Nov 4, 2011 at 2:41 PM, Pauli Virtanen p...@iki.fi wrote:

 04.11.2011 20:49, T J wrote:
 [clip]
  To push this forward a bit, can I propose that IGNORE behave as:   PnC

 The *n* classes can be a bit confusing in Python:

 ### PnC

   x = np.array([1, 2, 3])
   y = np.array([4, 5, 6])
   ignore(y[1])
   z = x + y
   z
 np.array([5, IGNORE(7), 9])
   x += y # NB: x[1] := x[1] + y[1]
   x
 np.array([5, 2, 3])

 ***


Interesting.


 I think I defined the destructive and non-destructive in a different
 way than earlier in the thread. Maybe this behavior from np.ma is closer
 to what was meant earlier:

   x = np.ma.array([1, 2, 3], mask=[0, 0, 1])
   y = np.ma.array([4, 5, 6], mask=[0, 1, 1])
   x += y
   x
 masked_array(data = [5 -- --],
  mask = [False  True  True],
fill_value = 99)
   x.data
 array([5, 2, 3])


 Let's call this (since I botched and already reserved the letter n :)

 (m) mark-ignored

 a := SPECIAL_1
 # -> a == SPECIAL_a ; the payload of the RHS is neglected,
 # the assigned value has the original LHS
 # as the payload



Does this behave as expected for x + y (as opposed to the inplace
operation)?

>>> z = x + y
>>> z
np.array([5, IGNORED(2), IGNORED(3)])
>>> x += y
np.array([5, IGNORED(2), IGNORED(3)])

However, doesn't this have the issue that Nathaniel brought up earlier:
commutativity

unignore(x + y) != unignore(y + x)
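
To make that concrete with today's np.ma (assuming the in-place behavior
Pauli showed above, where data is only updated at positions where the result
is unmasked), the payloads you see after unmasking depend on the order of the
in-place operations; a quick sketch:

import numpy as np

x = np.ma.array([1, 2, 3], mask=[0, 1, 0])
y = np.ma.array([10, 11, 12], mask=[0, 0, 1])

a = x.copy(); a += y
b = y.copy(); b += x

print(a.data)   # [11  2  3]
print(b.data)   # [11 11 12]  -- "unmasking" a and b gives different payloads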


Re: [Numpy-discussion] in the NA discussion, what can we agree on?

2011-11-04 Thread T J
On Fri, Nov 4, 2011 at 2:29 PM, Nathaniel Smith n...@pobox.com wrote:

 On Fri, Nov 4, 2011 at 1:22 PM, T J tjhn...@gmail.com wrote:
  I agree that it would be ideal if the default were to skip IGNORED
 values,
  but that behavior seems inconsistent with its propagation properties
 (such
  as when adding arrays with IGNORED values).  To illustrate, when we did
  x+2, we were stating that:
 
  IGNORED(2) + 2 == IGNORED(4)
 
  which means that we propagated the IGNORED value.  If we were to skip
 them
  by default, then we'd have:
 
  IGNORED(2) + 2 == 2
 
  To be consistent, then it seems we also should have had:
 
  x + 2
  [3, 2, 5]
 
  which I think we can agree is not so desirable.   What this seems to come
  down to is that we tend to want different behavior when we are doing
  reductions, and that for IGNORED data, we want it to propagate in every
  situation except for a reduction (where we want to skip over it).
 
  I don't know if there is a well-defined way to distinguish reductions
 from
  the other operations.  Would it hold for generalized ufuncs?  Would it
 hold
  for other functions which might return arrays instead of scalars?

 Continuing my theme of looking for consensus first... there are
 obviously a ton of ugly corners in here. But my impression is that at
 least for some simple cases, it's clear what users want:

  a = [1, IGNORED(2), 3]
 # array-with-ignored-values + unignored scalar only affects unignored
 values
  a + 2
 [3, IGNORED(2), 5]
 # reduction operations skip ignored values
  np.sum(a)
 4

 For example, Gary mentioned the common idiom of wanting to take an
 array and subtract off its mean, and he wants to do that while leaving
 the masked-out/ignored values unchanged. As long as the above cases
 work the way I wrote, we will have

  np.mean(a)
 2
  a -= np.mean(a)
  a
 [-1, IGNORED(2), 1]

 Which I'm pretty sure is the result that he wants. (Gary, is that
 right?) Also numpy.ma follows these rules, so that's some additional
 evidence that they're reasonable. (And I think part of the confusion
 between Lluís and me was that these are the rules that I meant when I
 said non-propagating, but he understood that to mean something
 else.)

 So before we start exploring the whole vast space of possible ways to
 handle masked-out data, does anyone see any reason to consider rules
 that don't have, as a subset, the ones above? Do other rules have any
 use cases or user demand? (I *love* playing with clever mathematics
 and making things consistent, but there's not much point unless the
 end result is something that people will use :-).)


I guess I'm just confused on how one, in principle, would distinguish the
various forms of propagation that you are suggesting (ie for reductions).
I also don't think it is good that we lack commutativity.  If we disallow
unignoring, then yes, I agree that what you wrote above is what people
want.  But if we are allowed to unignore, then I do not.

Also, how does something like this get handled?

>>> a = [1, 2, IGNORED(3), NaN]

If I were to say, "What is the mean of 'a'?", then I think most of the time
people would want 1.5.  I guess if we kept nanmean around, then we could do:

>>> a -= np.nanmean(a)
[-.5, .5, IGNORED(3), NaN]

Sorry if this is considered digging deeper than consensus.  I'm just
curious if arrays having NaNs in them, in addition to IGNORED, causes
problems.


Re: [Numpy-discussion] in the NA discussion, what can we agree on?

2011-11-04 Thread T J
On Fri, Nov 4, 2011 at 3:38 PM, Nathaniel Smith n...@pobox.com wrote:

 On Fri, Nov 4, 2011 at 3:08 PM, T J tjhn...@gmail.com wrote:
  On Fri, Nov 4, 2011 at 2:29 PM, Nathaniel Smith n...@pobox.com wrote:
  Continuing my theme of looking for consensus first... there are
  obviously a ton of ugly corners in here. But my impression is that at
  least for some simple cases, it's clear what users want:
 
   a = [1, IGNORED(2), 3]
  # array-with-ignored-values + unignored scalar only affects unignored
  values
   a + 2
  [3, IGNORED(2), 5]
  # reduction operations skip ignored values
   np.sum(a)
  4
 
  For example, Gary mentioned the common idiom of wanting to take an
  array and subtract off its mean, and he wants to do that while leaving
  the masked-out/ignored values unchanged. As long as the above cases
  work the way I wrote, we will have
 
   np.mean(a)
  2
   a -= np.mean(a)
   a
  [-1, IGNORED(2), 1]
 
  Which I'm pretty sure is the result that he wants. (Gary, is that
  right?) Also numpy.ma follows these rules, so that's some additional
  evidence that they're reasonable. (And I think part of the confusion
  between Lluís and me was that these are the rules that I meant when I
  said non-propagating, but he understood that to mean something
  else.)
 
  So before we start exploring the whole vast space of possible ways to
  handle masked-out data, does anyone see any reason to consider rules
  that don't have, as a subset, the ones above? Do other rules have any
  use cases or user demand? (I *love* playing with clever mathematics
  and making things consistent, but there's not much point unless the
  end result is something that people will use :-).)
 
  I guess I'm just confused on how one, in principle, would distinguish the
  various forms of propagation that you are suggesting (ie for reductions).

 Well, numpy.ma does work this way, so certainly it's possible to do.
 At the code level, np.add() and np.add.reduce() are different entry
 points and can behave differently.


I see your point, but that seems like just an API difference with a bad
name.  reduce() is just calling add() a bunch of times, so it seems like it
should behave as add() does.  That we can create different behaviors with
various assignment rules (like Pauli's 'm' for mark-ignored), only makes it
more confusing to me.

a = 1
a += 2
a += IGNORE

b = 1 + 2 + IGNORE

I think having a == b is essential.  If they can be different, that will
only lead to confusion.  On this point alone, does anyone think it is
acceptable to have a != b?



 OTOH, it might be that it's impossible to do *while still maintaining
 other things we care about*... but in that case we should just shake
 our fists at the mathematics and then give up, instead of coming up
 with an elegant system that isn't actually useful. So that's why I
 think we should figure out what's useful first.


Agreed.  I'm on the same page.



  I also don't think it is good that we lack commutativity.  If we disallow
  unignoring, then yes, I agree that what you wrote above is what people
  want.  But if we are allowed to unignore, then I do not.

 I *think* that for the no-unignoring (also known as MISSING) case,
 we have a pretty clear consensus that we want something like:

  a + 2
 [3, MISSING, 5]
  np.sum(a)
 MISSING
  np.sum(a, skip_MISSING=True)
 4

 (Please say if you disagree, but I really hope you don't!) This case
 is also easier, because we don't even have to allow a skip_MISSING
 flag in cases where it doesn't make sense (e.g., unary or binary
 operations) -- it's a convenience feature, so no-one will care if it
 only works when it's useful ;-).


Yes, in agreement.  I was talking specifically about the IGNORE case.   And
my point is that if we allow people to remove the IGNORE flag and see the
original data (and if the payloads are computed), then we should care about
commutativity:

 x = [1, IGNORE(2), 3]
 x2 = x.copy()
 y = [10, 11, IGNORE(12)]
 z = x + y
 a = z.sum()
 x += y
 b = x.sum()
 y += x2
 c = y.sum()

So, we should have:  a == b == c.
Additionally, if we allow users to unignore data, then we should have:

 x = [1, IGNORE(2), 3]
 x2 = x.copy()
 y = [10, 11, IGNORE(12)]
 x += y
 aa = unignore(x).sum()
 y += x2
 bb = unignore(y).sum()
 aa == bb
True

Is there agreement on this?


 Also, how does something like this get handled?
 
  a = [1, 2, IGNORED(3), NaN]
 
  If I were to say, What is the mean of 'a'?, then I think most of the
 time
  people would want 1.5.


 I would want NaN! But that's because the only way I get NaN's is when
 I do dumb things like compute log(0), and again, I want my code to
 tell me that I was dumb instead of just quietly making up a
 meaningless answer.


That's definitely field-specific then.  In probability, 0 log(0) = 0 is a
common idiom.  In NumPy, 0 * log(0) gives NaN, so you'd want to ignore
them when summing.

Re: [Numpy-discussion] in the NA discussion, what can we agree on?

2011-11-04 Thread T J
On Fri, Nov 4, 2011 at 4:29 PM, Pauli Virtanen p...@iki.fi wrote:

 04.11.2011 23:29, Pauli Virtanen wrote:
 [clip]
  As the definition concerns only what happens on assignment, it does not
  have problems with commutativity.

 This is of course then not really true in a wider sense, as an example
 from T J shows:

 a = 1
 a += IGNORE(3)
 # -> a := a + IGNORE(3)
 # -> a := IGNORE(4)
 # -> a == IGNORE(1)

 which is different from

 a = 1 + IGNORE(3)
 # -> a == IGNORE(4)

 Damn, it seemed so good. Probably anything except destructive assignment
 leads to problems like this with propagating special values.



Ok...with what I understand now, it seems like for almost all operations:

   MISSING : PdS
   IGNORED : PdC  (this gives commutativity when unignoring data points)

When you want some sort of reduction, we want to change the behavior for
IGNORED so that it skips the IGNORED values by default. Personally, I still
believe that this inconsistent behavior warrants a new method name.  What
I mean is:

>>> x = np.array([1, IGNORED(2), 3])
>>> y = x.sum()
>>> z = x[0] + x[1] + x[2]

To say that y != z will only be a source of confusion. To remedy, we force
people to be explicit, even if they'll need to be explicit 99% of the time:

>>> q = x.sum(skipIGNORED=True)

Then we can have y == z and y != q.  To make the 99% use case easier, we
provide a new method which passes the keyword for us.



With PdS and PdC it seems rather clear to me why MISSING should be
implemented as a bit pattern and IGNORED implemented using masks.  Setting
implementation details aside and going back to Nathaniel's original
biggest *un*resolved question, I am now convinced that these (IGNORED and
MISSING) should be distinct API concepts, and both distinct from NaN
with floating point dtypes.  The NA implementation in NumPy does not seem
to match either of these (IGNORED and MISSING) exactly.  One cannot, as far
as I know, unignore an element marked as NA.


Re: [Numpy-discussion] in the NA discussion, what can we agree on?

2011-11-04 Thread T J
On Fri, Nov 4, 2011 at 6:31 PM, Pauli Virtanen p...@iki.fi wrote:

 05.11.2011 00:14, T J wrote:
 [clip]
   a = 1
   a += 2
   a += IGNORE
   b = 1 + 2 + IGNORE
 
  I think having a == b is essential.  If they can be different, that will
  only lead to confusion.  On this point alone, does anyone think it is
  acceptable to have a != b?

 It seems to me that requiring this sort of a thing gives some
 limitations on how array operations should behave.

 An acid test for proposed rules: given two arrays `a` and `b`,

     a = [1, 2, IGNORED(3), IGNORED(4)]
     b = [10, IGNORED(20), 30, IGNORED(40)]

 (a) Are the following pieces of code equivalent:

     print unmask(a + b)
     a += 42
     a += b
     print unmask(a)

 and

     print unmask(b + a)
     a += b
     a += 42
     print unmask(a)

 (b) Are the following two statements equivalent (wrt. ignored values):

     a += b
     a[:] = a + b

 For np.ma (a) is false whereas (b) is true. For arrays containing nans,
 on the other hand (a) and (b) are both true (but of course, in this case
 values cannot be unmasked).

 Is there a way to define operations so that (a) is true, while retaining
 the desired other properties of arrays with ignored values?


I thought that PdC satisfied (a) and (b).
Let me show you what I thought they were. Perhaps I am not being
consistent. If so, point out my mistake.

(A)  You are making two comparisons.

(A1)  Does  unmask(a+b) == unmask(b + a) ?

Yes.  They both equal:

   unmask([11, IGNORED(22), IGNORED(33), IGNORED(44)])
 =
   [11, 22, 33, 44]


(A2) Is 'a' the same after a += 42; a += b and  a += b; a += 42?

Yes.

>>> a = [1, 2, IGNORED(3), IGNORED(4)]
>>> b = [10, IGNORED(20), 30, IGNORED(40)]
>>> a += 42; a
[43, 44, IGNORED(45), IGNORED(46)]
>>> a += b; a
[53, IGNORED(64), IGNORED(75), IGNORED(86)]

vs

>>> a = [1, 2, IGNORED(3), IGNORED(4)]
>>> b = [10, IGNORED(20), 30, IGNORED(40)]
>>> a += b; a
[11, IGNORED(22), IGNORED(33), IGNORED(44)]
>>> a += 42; a
[53, IGNORED(64), IGNORED(75), IGNORED(86)]


For part (B), I thought we were in agreement that inplace assignment should
be defined so that:

>>> a += b

is equivalent to:

>>> tmp = a + b
>>> a = tmp

If so, this definitely holds.


Have I missed something?  Probably.  Please spell it out for me.


Re: [Numpy-discussion] in the NA discussion, what can we agree on?

2011-11-04 Thread T J
On Fri, Nov 4, 2011 at 8:03 PM, Nathaniel Smith n...@pobox.com wrote:

 On Fri, Nov 4, 2011 at 7:43 PM, T J tjhn...@gmail.com wrote:
  On Fri, Nov 4, 2011 at 6:31 PM, Pauli Virtanen p...@iki.fi wrote:
  An acid test for proposed rules: given two arrays `a` and `b`,
 
  a = [1, 2, IGNORED(3), IGNORED(4)]
 b = [10, IGNORED(20), 30, IGNORED(40)]
 [...]
  (A1)  Does  unmask(a+b) == unmask(b + a) ?
 
  Yes.  They both equal:
 
 unmask([11, IGNORED(22), IGNORED(33), IGNORED(44)])
   =
 [11, 22, 33, 44]

 Again, I really don't think you're going to be able to sell an API where
  [2] + [IGNORED(20)] == [IGNORED(22)]
 I mean, it's not me you have to convince, it's Gary, Pierre, maybe
 Benjamin, Lluís, etc. So I could be wrong. But you might want to
 figure that out first before making plans based on this...


But this is how np.ma currently does it, except that it doesn't compute the
payload---it just calls it IGNORED.
And it seems that this generalizes the way people want it to:

>>> z = [2, 4] + [IGNORED(20), 3]
>>> z
[IGNORED(24), 7]
>>> z.sum(skip_ignored=True)   # True could be the default
7
>>> z.sum(skip_ignored=False)
IGNORED(31)

I guess I am confused because it seems that you implicitly used this same
rule here:

Say we have
   a = np.array([1, IGNORED(2), 3])
   b = np.array([10, 20, 30])
(Here I'm using IGNORED(2) to mean a value that is currently
ignored, but if you unmasked it it would have the value 2.)

Then we have:

# non-propagating **or** propagating, doesn't matter:

>>> a + 2

[3, IGNORED(2), 5]


That is, element-wise, you had to have done:

IGNORED(2) + 2 --> IGNORED(2).

I said it should be equal to IGNORED(4), but the result is still some form
of ignore.  Sorry if I am missing the bigger picture at this point; it's
late and a Friday.


[Numpy-discussion] Example Usage of Neighborhood Iterator in Cython

2011-10-17 Thread T J
I recently put together a Cython example which uses the neighborhood
iterator.  It was trickier than I thought it would be, so I thought to
share it with the community.  The function takes a 1-dimensional array
and returns a 2-dimensional array of neighborhoods in the original
array. This is somewhat similar to the functionality provided by
segment_axis (http://projects.scipy.org/numpy/ticket/901), but I
believe this is slightly different in that the neighborhood can extend to
the left of the current index (assuming circular boundaries).  Keep in
mind that this is just an example, and normal usage probably is not
concerned with creating a new array.

External link:  http://codepad.org/czRIzXQl

--

import numpy as np
cimport numpy as np

cdef extern from "numpy/arrayobject.h":

    ctypedef extern class numpy.flatiter [object PyArrayIterObject]:
        cdef int nd_m1
        cdef np.npy_intp index, size
        cdef np.ndarray ao
        cdef char *dataptr

    # This isn't exposed to the Python API.
    # So we can't use the same approach we used to define flatiter
    ctypedef struct PyArrayNeighborhoodIterObject:
        int nd_m1
        np.npy_intp index, size
        np.PyArrayObject *ao    # note the change from np.ndarray
        char *dataptr

    object PyArray_NeighborhoodIterNew(flatiter it, np.npy_intp* bounds,
                                       int mode, np.ndarray fill_value)
    int PyArrayNeighborhoodIter_Next(PyArrayNeighborhoodIterObject *it)
    int PyArrayNeighborhoodIter_Reset(PyArrayNeighborhoodIterObject *it)

    object PyArray_IterNew(object arr)
    void PyArray_ITER_NEXT(flatiter it)
    np.npy_intp PyArray_SIZE(np.ndarray arr)

    cdef enum:
        NPY_NEIGHBORHOOD_ITER_ZERO_PADDING,
        NPY_NEIGHBORHOOD_ITER_ONE_PADDING,
        NPY_NEIGHBORHOOD_ITER_CONSTANT_PADDING,
        NPY_NEIGHBORHOOD_ITER_CIRCULAR_PADDING,
        NPY_NEIGHBORHOOD_ITER_MIRROR_PADDING

np.import_array()

def windowed(np.ndarray[np.int_t, ndim=1] arr, bounds):

    cdef flatiter iterx = <flatiter>PyArray_IterNew(<object>arr)
    cdef np.npy_intp size = PyArray_SIZE(arr)
    cdef np.npy_intp* boundsPtr = [bounds[0], bounds[1]]
    cdef int hoodSize = bounds[1] - bounds[0] + 1

    # Create the Python object and keep a reference to it
    cdef object niterx_ = PyArray_NeighborhoodIterNew(iterx,
        boundsPtr, NPY_NEIGHBORHOOD_ITER_CIRCULAR_PADDING, None)
    cdef PyArrayNeighborhoodIterObject *niterx = \
        <PyArrayNeighborhoodIterObject *>niterx_

    cdef int i, j
    cdef np.ndarray[np.int_t, ndim=2] hoods

    hoods = np.empty((arr.shape[0], hoodSize), dtype=np.int)
    for i in range(iterx.size):
        for j in range(niterx.size):
            # dataptr is a char*, so cast it to the element type
            hoods[i, j] = (<np.int_t*>niterx.dataptr)[0]
            PyArrayNeighborhoodIter_Next(niterx)
        PyArray_ITER_NEXT(iterx)
        PyArrayNeighborhoodIter_Reset(niterx)
    return hoods

def test():
    x = np.arange(10)
    print x
    print
    print windowed(x, [-1, 3])
    print
    print windowed(x, [-2, 2])


--

If you run test(), this is what you should see:

[0 1 2 3 4 5 6 7 8 9]

[[9 0 1 2 3]
 [0 1 2 3 4]
 [1 2 3 4 5]
 [2 3 4 5 6]
 [3 4 5 6 7]
 [4 5 6 7 8]
 [5 6 7 8 9]
 [6 7 8 9 0]
 [7 8 9 0 1]
 [8 9 0 1 2]]

[[8 9 0 1 2]
 [9 0 1 2 3]
 [0 1 2 3 4]
 [1 2 3 4 5]
 [2 3 4 5 6]
 [3 4 5 6 7]
 [4 5 6 7 8]
 [5 6 7 8 9]
 [6 7 8 9 0]
 [7 8 9 0 1]]

windowed(x, [0, 2]) is almost like segment_axis(x, 3, 2, end='wrap').


[Numpy-discussion] Non-rectangular neighborhoods

2011-10-08 Thread T J
While reading the documentation for the neighborhood iterator, it
seems that it can only handle rectangular neighborhoods.  Have I
understood this correctly?  If it is possible to do non-rectangular
regions, could someone post an example/sketch of how to do this?
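
(For reference, one workaround outside the iterator API is scipy.ndimage,
whose filters accept an arbitrary boolean footprint; a small sketch:)

import numpy as np
from scipy import ndimage

x = np.arange(25).reshape(5, 5)

# a plus-shaped (non-rectangular) neighborhood expressed as a footprint
footprint = np.array([[0, 1, 0],
                      [1, 1, 1],
                      [0, 1, 0]], dtype=bool)

# maximum over the plus-shaped neighborhood at each position,
# with circular (wrap-around) boundary handling
result = ndimage.maximum_filter(x, footprint=footprint, mode='wrap')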


Re: [Numpy-discussion] random number generator, entropy and pickling

2011-04-25 Thread T J
On Mon, Apr 25, 2011 at 9:57 AM, Gael Varoquaux
gael.varoqu...@normalesup.org wrote:

 We thought that we could simply have a PRNG per object, as in:

    def __init__(self, prng=None):
        if prng is None:
            prng = np.random.RandomState()
        self.prng = prng

 I don't like this option, because it means that with a given piece of
 code, setting the seed of numpy's PRNG isn't enough to make it
 reproducible.

 I couldn't retrieve a handle on a picklable instance for the global PRNG.

 The only option I can see would be to use the global numpy PRNG to seed
 an instance specific RandomState, as in:

    def __init__(self, prng=None):
        if prng is None:
            prng = np.random.RandomState(np.random.random())
        self.prng = prng

 That way seeding the global PRNG really does control the full random
 number generation. I am wondering if it would have an adverse consequence
 on the entropy of the stream of random numbers. Does anybody have
 suggestions? Advices?


If code A relies on code B (e.g., some numpy function) and code B
changes, then the stream of random numbers will no longer be the same.
 The point here is that the user wrote code A but depended on code B,
and even though code A was unchanged, their random numbers were not
the same.

The situation is improved if scikits.learn used its own global
RandomState instance.  Then code A will at least give the same stream
of random numbers for a fixed version of scikits.learn.  It should be
made very clear though that the data stream cannot be expected to be
the same across versions.

As to each object having its own RandomState instance, I definitely
see that it makes restoring the overall state of a piece of code
harder, but perhaps utility functions could make this easier.  I can
imagine that users might want to arbitrarily set the seed for a
particular object in the midst of a larger piece of code.  Perhaps
the user is testing known failure cases of algorithm A interacting
with algorithm B.  If the user wants to loop over known seeds which
cause algorithm A to fail but needs algorithm B to keep its seed
fixed, then it seems like having a global seed makes this more
difficult.  In that sense, it might be desirable to have independent
prngs. On the other hand, maybe that is an uncommon use case that
could be handled through manually setting the seed.
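
A sketch of the kind of utility function I have in mind (the name and the
use of NumPy's private global RandomState handle are illustrative
assumptions, not an existing API):

import numbers
import numpy as np

def check_random_state(seed):
    # Turn seed into a RandomState instance.
    if seed is None:
        # fall back to NumPy's global generator; note this is a private
        # attribute, so treat it as an implementation detail
        return np.random.mtrand._rand
    if isinstance(seed, (numbers.Integral, np.integer)):
        return np.random.RandomState(seed)
    if isinstance(seed, np.random.RandomState):
        return seed
    raise ValueError("%r cannot be used to seed a RandomState" % (seed,))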

Post back on what you guys decide.


Re: [Numpy-discussion] einsum

2011-01-26 Thread T J
On Wed, Jan 26, 2011 at 5:02 PM, Joshua Holbrook
josh.holbr...@gmail.com wrote:
 Ah, sorry for misunderstanding.  That would actually be very difficult,
 as the iterator required a fair bit of fixes and adjustments to the core.
 The new_iterator branch should be 1.5 ABI compatible, if that helps.

 I see. Perhaps the fixes and adjustments can/should be included with
 numpy standard, even if the Einstein notation package is made a
 separate module.

snip
 Indeed, I would like to be able to install and use einsum() without
 having to install another version of numpy. Even if it depends on
 features of a new numpy, it'd be nice to have it be a separate module.

I don't really understand the desire to have this single function
exist in a separate package.  If it requires the new version of NumPy,
then you'll have to install/upgrade either way...and if it comes as
part of that new NumPy, then you are already set.  Doesn't a separate
package complicate things unnecessarily?  It would make sense to me if
einsum consisted of many functions (such as Bottleneck).


[Numpy-discussion] Build failure at rev8246

2010-05-21 Thread T J
Hi,

I tried upgrading today and had trouble building numpy (after rm -rf
build).  My full build log is here:

http://www.filedump.net/dumped/build1274454454.txt

If someone can point me in the right direction, I'd appreciate it very
much.  Two excerpts from the log file:


Running from numpy source directory.
numpy/core/setup_common.py:86:
MismatchCAPIWarning: API mismatch detected, the C API version numbers
have to be updated. Current C api version is 4, with checksum
59750b518272c8987f02d66445afd3f1, but recorded checksum for C API
version 4 in codegen_dir/cversions.txt is
3d8940bf7b0d2a4e25be4338c14c3c85. If functions were added in the C
API, you have to update C_API_VERSION  in numpy/core/setup_common.pyc.
  MismatchCAPIWarning)


In file included from numpy/core/src/multiarray/multiarraymodule_onefile.c:36:
numpy/core/src/multiarray/buffer.c: At top level:
numpy/core/src/multiarray/buffer.c:715: error: conflicting types for
‘_descriptor_from_pep3118_format’
numpy/core/src/multiarray/common.c:221: note: previous implicit
declaration of ‘_descriptor_from_pep3118_format’ was here
numpy/core/src/multiarray/buffer.c: In function
‘_descriptor_from_pep3118_format’:
numpy/core/src/multiarray/buffer.c:751: warning: assignment from
incompatible pointer type
In file included from numpy/core/src/multiarray/multiarraymodule_onefile.c:45:
numpy/core/src/multiarray/multiarraymodule.c: In function ‘initmultiarray’:
numpy/core/src/multiarray/multiarraymodule.c:3062: warning: implicit
declaration of function ‘_numpymemoryview_init’
error: Command gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv
-O2 -Wall -Wstrict-prototypes -fPIC -Inumpy/core/include
-Ibuild/src.linux-i686-2.6/numpy/core/include/numpy
-Inumpy/core/src/private -Inumpy/core/src -Inumpy/core
-Inumpy/core/src/npymath -Inumpy/core/src/multiarray
-Inumpy/core/src/umath -Inumpy/core/include -I/usr/include/python2.6
-Ibuild/src.linux-i686-2.6/numpy/core/src/multiarray
-Ibuild/src.linux-i686-2.6/numpy/core/src/umath -c
numpy/core/src/multiarray/multiarraymodule_onefile.c -o
build/temp.linux-i686-2.6/numpy/core/src/multiarray/multiarraymodule_onefile.o
failed with exit status 1


Hope this helps.


Re: [Numpy-discussion] Build failure at rev8246

2010-05-21 Thread T J
On Fri, May 21, 2010 at 8:51 AM, Pauli Virtanen p...@iki.fi wrote:
 Fri, 21 May 2010 08:09:55 -0700, T J wrote:
 I tried upgrading today and had trouble building numpy (after rm -rf
 build).  My full build log is here:

     http://www.filedump.net/dumped/build1274454454.txt


 Your SVN checkout might be corrupted, containing a mix of old and new
 files. Try building from a clean checkout.


That was it!

When I did an svn update, I noticed conflicts all over the place and
accepted with 'tf' but I guess that was not enough.  Given that I
haven't really touched this directory, what was the likely cause of
this corruption?  The last time I svn update'd was a few months ago.


Re: [Numpy-discussion] pareto docstring

2010-05-11 Thread T J
On Mon, May 10, 2010 at 8:37 PM,  josef.p...@gmail.com wrote:

 I went googling and found a new interpretation

 numpy.random.pareto is actually the Lomax distribution also known as Pareto 2,
 Pareto (II) or Pareto Second Kind distribution


Great!


 So, from this it looks like numpy.random does not have a Pareto
 distribution, only Lomax, and the confusion came maybe because
 somewhere in the history the (II) (second kind) got dropped in the
 explanations.

 and actually it is in scipy.stats.distributions, but without rvs

 # LOMAX (Pareto of the second kind.)
 #  Special case of Pareto of the first kind (location=-1.0)


I understand the point with this last comment, but I think it can be
confusing in that the Pareto (of the first kind) has no location
parameter and people might think you are referring to the Generalized
Pareto distribution.  I think it's much clearer to say:

 # Special case of the Pareto of the first kind, but shifted to the
 left by 1.   x --> x + 1



  2) Modify numpy/random/mtrand/distributions.c in the following way:

 double rk_pareto(rk_state *state, double a)
 {
   //return exp(rk_standard_exponential(state)/a) - 1;
   return 1.0 / pow(rk_double(state), 1.0 / a);
 }

 I'm not an expert on random number generators, but using the uniform
 distribution
 as in
 http://en.wikipedia.org/wiki/Pareto_distribution#Generating_a_random_sample_from_Pareto_distribution
 and your Devroye reference seems better than basing it on the relationship to
 the exponential distribution
 http://en.wikipedia.org/wiki/Pareto_distribution#Relation_to_the_exponential_distribution



Correct.  The exp relationship was for the existing implementation
(which corresponds to the Lomax).  I commented that line out and just
used 1/U^(1/a).


 I think without changing the source we can rewrite the docstring that
 this is Lomax (or
 Pareto of the Second Kind), so that at least the documentation is less
 misleading.

 But I find calling it Pareto very confusing, and I'm not the only one anymore,
 (and I don't know if anyone has used it assuming it is classical Pareto),
 so my preferred solution would be

 * rename numpy.random.pareto to numpy.random.lomax
 * and create a real (classical, first kind) pareto distribution (even
 though it's just
  adding or subtracting 1, once we know it)


I personally have used numpy.random.pareto thinking it was the Pareto
distribution of the first kind---which led to this post in the first
place.  So, I'm in strong agreement.  While doing this, perhaps we
should increase functionality and allow users the ability to specify
the scale of the distribution (instead of just the shape)?
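
For concreteness, a sketch of what a classical (first-kind) Pareto with an
explicit scale m could look like, via the inverse-transform method (the
function name and signature are purely illustrative):

import numpy as np

def pareto_first_kind(a, m=1.0, size=None, rng=np.random):
    # classical Pareto(shape=a, scale=m): pdf a*m**a / x**(a+1) for x >= m
    u = 1.0 - rng.uniform(size=size)   # uniform on (0, 1], avoids division by zero
    return m / u ** (1.0 / a)          # inverse CDF: m * u**(-1/a)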

I can make a ticket for this and give a stab at creating the necessary patch.



 What's the backwards compatibility policy with very confusing names in numpy?


It seems reasonable that we might have to follow the deprecation
route, but I'd be happier with a faster fix.

1.5
  - Provide numpy.random.lomax.  Make numpy.random.pareto raise a
DeprecationWarning and then call lomax.
2.0 (if there is no 1.6)
  - Make numpy.random.pareto behave as Pareto distribution of 1st kind.

Immediately though, we can modify the docstring that is currently in
there to make the situation clear, instructing users how they can
generate samples from the standard Pareto distribution.  This is the
first patch I'll submit.  Perhaps it is better to only change the
docstring and then save all changes in functionality for 2.0.
Deferring to others on this one...


Re: [Numpy-discussion] pareto docstring

2010-05-10 Thread T J
On Sun, May 9, 2010 at 4:49 AM,  josef.p...@gmail.com wrote:

 I think this is the same point, I was trying to make last year.

 Instead of renormalizing, my conclusion was the following,
 (copied from the mailinglist August last year)

 
 my conclusion:
 -
 What numpy.random.pareto actually produces, are random numbers from a
 pareto distribution with lower bound m=1, but location parameter
 loc=-1, that shifts the distribution to the left.

 To actually get useful  random numbers (that are correct in the usual
 usage http://en.wikipedia.org/wiki/Pareto_distribution), we need to
 add 1 to them.
 stats.distributions doesn't use mtrand.pareto

 rvs_pareto = 1 + numpy.random.pareto(a, size)

 

 I still have to work though the math of your argument, but maybe we
 can come to an agreement how the docstrings (or the function) should
 be changed, and what numpy.random.pareto really means.

 Josef
 (grateful, that there are another set of eyes on this)




Yes, I think my renormalizing statement is incorrect as it is really
just sampling from a different pdf altogether.  See the following image:

http://www.dumpt.com/img/viewer.php?file=q9tfk7ehxsw865vn067c.png

It plots histograms of the various implementations against the pdfs.
Summarizing:

The NumPy implementation is based on (Devroye p. 262).  The pdf listed
there is:

a / (1+x)^(a+1)

This differs from the standard Pareto pdf:

a / x^(a+1)

It also differs from the pdf of the generalized Pareto distribution,
with scale=1 and location=0:

(1 + a x)^(-1/a - 1)

And it also differs from the pdf of the generalized Pareto
distribution with scale=1 and location=-1  or location=1.

random.paretovariate and scipy.stats.pareto sample from the standard
Pareto, and this is the desired behavior, IMO.  Its true that 1 +
np.random.pareto provides the fix, but I think we're better off
changing the underlying implementation.  Devroye has a more recent
paper:

  http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.85.8760

which states the Pareto distribution in the standard way.  So I think
it is safe to make this change.  Backwards compatibility might be the
only argument for not making this change.  So here is my proposal:

  1) Remove every mention of the generalized Pareto distribution from
the docstring.  As far as I can see, the generalized Pareto
distribution does not reduce to the standard Pareto at all.  We can
still mention scipy.stats.distributions.genpareto and
scipy.stats.distributions.pareto.  The former is something different
and the latter will (now) be equivalent to the NumPy function.

  2) Modify numpy/random/mtrand/distributions.c in the following way:

double rk_pareto(rk_state *state, double a)
{
   //return exp(rk_standard_exponential(state)/a) - 1;
   return 1.0 / pow(rk_double(state), 1.0 / a);
}

Does this sound good?
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] pareto docstring

2010-05-08 Thread T J
The docstring for np.pareto says:

This is a simplified version of the Generalized Pareto distribution
(available in SciPy), with the scale set to one and the location set to
zero. Most authors default the location to one.

and also:

The probability density for the Pareto distribution is

.. math:: p(x) = \frac{am^a}{x^{a+1}}

where :math:`a` is the shape and :math:`m` the location

These two statements seem to be in contradiction.  I think what was
meant is that m is the scale, rather than the location.  For if m were
equal to zero, as the first portion of the docstring states, then the
entire pdf would be zero for all shapes a > 0.




Also,  I'm not quite understanding how the stated pdf is actually the
same as the pdf for the generalized pareto with the scale=1 and
location=0.  By the wikipedia definition of the generalized Pareto
distribution, if we take \sigma=1 (scale equal to one)  and \mu=0
(location equal to zero), then we get:

(1 + a x)^(-1/a - 1)

which is normalized over $x \in (0, \infty)$.  If we compare this to
the distribution stated in the docstring (with m=1)

a x^{-a-1}

we see that it is normalized over $x \in (1, \infty)$.  And indeed,
the distribution requires x > scale = 1.

If we integrate the generalized Pareto (with scale=1, location=0) over
$x \in (1, \infty)$ then we have to re-normalize.  So should the
docstring say:

   This is a simplified version of the Generalized Pareto distribution
   (available in Scipy), with the scale set to one, the location set to zero,
   and the distribution re-normalized over the range (1, \infty). Most
   authors default the location to one.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Remove duplicate columns

2010-05-06 Thread T J
Hi,

Is there a way to sort the columns in an array?  I need to sort it so
that I can easily go through and keep only the unique columns.
ndarray.sort(axis=1) doesn't do what I want as it destroys the
relative ordering between the various columns. For example, I would
like:

[[2,1,3],
 [3,5,1],
 [0,3,1]]

to go to:

[[1,2,3],
 [5,3,1],
 [3,0,1]]

(swap the first and second columns).  So I want to treat the columns
as objects and sort them.  I can do this if I convert to a python
list, but I was hoping to avoid doing that because I ultimately need
to do element-wise bitwise operations.

Thanks!
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Remove duplicate columns

2010-05-06 Thread T J
On Thu, May 6, 2010 at 10:36 AM,  josef.p...@gmail.com wrote:

 there is a thread last august on unique rows which might be useful,
 and a thread in Dec 2008 for sorting rows

 something like

 np.unique1d(c.view([('',c.dtype)]*c.shape[1])).view(c.dtype).reshape(-1,c.shape[1])

 maybe it's np.unique with numpy 1.4.


The thread is useful:

  http://www.mail-archive.com/numpy-discussion@scipy.org/msg19830.html

I'll have to see if it is quicker for me to just do:

 y = x.transpose().tolist()
 y.sort()
 x = np.array(y).transpose()
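
A pure-NumPy alternative might be np.lexsort (just a sketch, starting from
the original array x; columns are ordered by the first row, with later rows
breaking ties):

    order = np.lexsort(x[::-1])    # last key passed is the primary key
    x = x[:, order]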
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] What should be the value of nansum of nan's?

2010-04-28 Thread T J
On Mon, Apr 26, 2010 at 10:03 AM, Charles R Harris
charlesr.har...@gmail.com wrote:


 On Mon, Apr 26, 2010 at 10:55 AM, Charles R Harris
 charlesr.har...@gmail.com wrote:

 Hi All,

 We need to make a decision for ticket #1123 regarding what nansum should
 return when all values are nan. At some earlier point it was zero, but
 currently it is nan, in fact it is nan whatever the operation is. That is
 consistent, simple and serves to mark the array or axis as containing all
 nans. I would like to close the ticket and am a bit inclined to go with the
 current behaviour although there is an argument to be made for returning 0
 for the nansum case. Thoughts?


 To add a bit of context, one could argue that the results should be
 consistent with the equivalent operations on empty arrays and always be
 non-nan.

 In [1]: nansum([])
 Out[1]: nan

 In [2]: sum([])
 Out[2]: 0.0


This seems like an obvious one to me.  What is the spirit of nansum?


Return the sum of array elements over a given axis treating
Not a Numbers (NaNs) as zero.


Okay.  So NaNs in an array are treated as zeros and the sum is
performed as one normally would perform it starting with an initial
sum of zero.  So if all values are NaN, then we add nothing to our
original sum and still return 0.

I'm not sure I understand the argument that it should return NaN.  It
is counter to the *purpose* of nansum.   Also, if one wants to
determine if all values in an array are NaN, isn't there another way?
Let's keep (or make) those distinct operations, as they are definitely
distinct concepts.
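
(For instance, a quick sketch of the check I have in mind, for an array a:)

    np.isnan(a).all()          # True if every element is NaN
    np.isnan(a).all(axis=0)    # per-column version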
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] How do I ensure numpy headers are present in setup.py?

2010-04-05 Thread T J
On Mon, Apr 5, 2010 at 11:28 AM, Robert Kern robert.k...@gmail.com wrote:
 On Mon, Apr 5, 2010 at 13:26, Erik Tollerud erik.tolle...@gmail.com wrote:
 Hmm, unfortunate.  So the best approach then is probably just to tell
 people to install numpy first, then my package?

 Yup.


And really, this isn't that unreasonable.  Not only does this make
users more aware of their environment (ie, the distinction between
your package and the major numerical package in Python), but its so
much cleaner.  With the combined approach, any NumPy installation
problems would be (frequently) associated with your package.  On the
other hand, if there are any NumPy installation problems in the
separated approach, at least your package is blameless.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Bug in logaddexp2.reduce

2010-03-31 Thread T J
Hi,

I'm getting some strange behavior with logaddexp2.reduce:

from itertools import permutations
import numpy as np
x = np.array([-53.584962500721154, -1.5849625007211563, -0.5849625007211563])
for p in permutations([0,1,2]):
    print p, np.logaddexp2.reduce(x[list(p)])

Essentially, the result depends on the order of the array...and we get
nans in the bad orders.  Likely, this also affects logaddexp.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in logaddexp2.reduce

2010-03-31 Thread T J
On Wed, Mar 31, 2010 at 10:30 AM, T J tjhn...@gmail.com wrote:
 Hi,

 I'm getting some strange behavior with logaddexp2.reduce:

 from itertools import permutations
 import numpy as np
 x = np.array([-53.584962500721154, -1.5849625007211563, -0.5849625007211563])
 for p in permutations([0,1,2]):
    print p, np.logaddexp2.reduce(x[list(p)])

 Essentially, the result depends on the order of the array...and we get
 nans in the bad orders.  Likely, this also affects logaddexp.


Sorry, forgot version information:

$ python -c "import numpy; print numpy.__version__"
1.5.0.dev8106
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in logaddexp2.reduce

2010-03-31 Thread T J
On Wed, Mar 31, 2010 at 1:21 PM, Charles R Harris
charlesr.har...@gmail.com wrote:

 Looks like roundoff error.


So this is expected behavior?

In [1]: np.logaddexp2(-1.5849625007211563, -53.584962500721154)
Out[1]: -1.5849625007211561

In [2]: np.logaddexp2(-0.5849625007211563, -53.584962500721154)
Out[2]: nan

In [3]: np.log2(np.exp2(-0.5849625007211563) + np.exp2(-53.584962500721154))
Out[3]: -0.58496250072115608

Shouldn't the result at least behave nicely and just return the larger value?
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in logaddexp2.reduce

2010-03-31 Thread T J
On Wed, Mar 31, 2010 at 3:36 PM, Charles R Harris
charlesr.har...@gmail.com wrote:
 So this is expected behavior?

 In [1]: np.logaddexp2(-1.5849625007211563, -53.584962500721154)
 Out[1]: -1.5849625007211561

 In [2]: np.logaddexp2(-0.5849625007211563, -53.584962500721154)
 Out[2]: nan

 I don't see that

 In [1]: np.logaddexp2(-0.5849625007211563, -53.584962500721154)
 Out[1]: -0.58496250072115619

 In [2]: np.logaddexp2(-1.5849625007211563, -53.584962500721154)
 Out[2]: -1.5849625007211561

 What system are you running on.


$ python --version
Python 2.6.4

$ uname -a
Linux localhost 2.6.31-20-generic-pae #58-Ubuntu SMP Fri Mar 12
06:25:51 UTC 2010 i686 GNU/Linux
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in logaddexp2.reduce

2010-03-31 Thread T J
On Wed, Mar 31, 2010 at 3:38 PM, David Warde-Farley d...@cs.toronto.edu wrote:
 Unfortunately there's no good way of getting around order-of-
 operations-related rounding error using the reduce() machinery, that I
 know of.


That seems reasonable, but receiving a nan, in this case, does not.
Are my expectations unreasonable?

Would sorting the values before reducing be an ugly(-enough) solution?
 It seems to fix it in this particular case.  Or is the method you
posted in the link below a better solution?

 I'd like to see a 'logsumexp' and 'logsumexp2' in NumPy instead, using
 the generalized ufunc architecture, to do it over an arbitrary
 dimension of an array, rather than needing to invoke 'reduce' on
 logaddexp. I tried my hand at writing one but I couldn't manage to get
 it working for some reason, and I haven't had time to revisit it: 
 http://mail.scipy.org/pipermail/numpy-discussion/2010-January/048067.html


I saw this original post and was hoping someone would post a response.
 Maybe now...
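
For what it's worth, here is a sketch of the usual max-shift trick in base
2; it is roughly what I would expect a 'logsumexp2' to do (not tested
against the ufunc machinery):

    import numpy as np

    def logsumexp2(x):
        m = x.max()
        # shift by the max so exp2 never overflows; -inf entries become 0
        return m + np.log2(np.exp2(x - m).sum())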
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in logaddexp2.reduce

2010-03-31 Thread T J
On Wed, Mar 31, 2010 at 7:06 PM, Charles R Harris
charlesr.har...@gmail.com wrote:

 That is a 32 bit kernel, right?


Correct.

Regarding the config.h, which config.h?  I have a numpyconfig.h.
Which compilation options should I obtain and how?  When I run
setup.py, I see:

C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2
-Wall -Wstrict-prototypes -fPIC

compile options: '-Inumpy/core/include
-Ibuild/src.linux-i686-2.6/numpy/core/include/numpy
-Inumpy/core/src/private -Inumpy/core/src -Inumpy/core
-Inumpy/core/src/npymath -Inumpy/core/src/multiarray
-Inumpy/core/src/umath -Inumpy/core/include -I/usr/include/python2.6
-Ibuild/src.linux-i686-2.6/numpy/core/src/multiarray
-Ibuild/src.linux-i686-2.6/numpy/core/src/umath -c'

Here is my show_config().

atlas_threads_info:
  NOT AVAILABLE
blas_opt_info:
libraries = ['f77blas', 'cblas', 'atlas']
library_dirs = ['/usr/lib/sse2']
define_macros = [('ATLAS_INFO', '\\3.6.0\\')]
language = c
include_dirs = ['/usr/include']
atlas_blas_threads_info:
  NOT AVAILABLE
lapack_opt_info:
libraries = ['lapack', 'f77blas', 'cblas', 'atlas']
library_dirs = ['/usr/lib/sse2/atlas', '/usr/lib/sse2']
define_macros = [('ATLAS_INFO', '\\3.6.0\\')]
language = f77
include_dirs = ['/usr/include']
atlas_info:
libraries = ['lapack', 'f77blas', 'cblas', 'atlas']
library_dirs = ['/usr/lib/sse2/atlas', '/usr/lib/sse2']
language = f77
include_dirs = ['/usr/include']
lapack_mkl_info:
  NOT AVAILABLE
blas_mkl_info:
  NOT AVAILABLE
atlas_blas_info:
libraries = ['f77blas', 'cblas', 'atlas']
library_dirs = ['/usr/lib/sse2']
language = c
include_dirs = ['/usr/include']
mkl_info:
  NOT AVAILABLE

$ gcc --version
gcc (Ubuntu 4.4.1-4ubuntu9) 4.4.1
Copyright (C) 2009 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Uninformative Error

2010-01-10 Thread T J
When passing in a list of longs and asking that the dtype be a float
(yes, losing precision), the error message is uninformative whenever
the long is larger than the largest float.

 x = 
 181626642333486640664316511479918087634811756599984861278481913634852446858952226941059178462566942027148832976486383692715763966132465634039844094073670028044755150133224694791817752891901042496950233943249209777416692569138779593594686170807571874640682826295728116325492852625325418526603207268018328608840
 array(x, dtype=float)
OverflowError: long int too large to convert to float

 array([x], dtype=float)
ValueError: setting an array element with a sequence.

The first error is informative, but the second is not and will occur
anytime one tries to convert a python list containing longs which are
too long.  Is there a way this error message could be made more
helpful?
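
(A workaround sketch, so the informative error surfaces instead of the
generic one -- convert the elements explicitly first:)

    np.array([float(v) for v in [x]])
    # raises OverflowError: long int too large to convert to float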
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Repeated dot products

2009-12-12 Thread T J
Hi,

Suppose I have an array of shape:  (n, k, k).  In this case, I have n
k-by-k matrices.  My goal is to compute the product of a (potentially
large) user-specified selection (with replacement) of these matrices.
For example,

   x = [0,1,2,1,3,3,2,1,3,2,1,5,3,2,3,5,2,5,3,2,1,3,5,6]

says that I want to take the 0th matrix and dot it with the 1st matrix
and dot that product with the 2nd matrix and dot that product with the
1st matrix again and so on...

Essentially, I am looking for efficient ways of doing this.  It seems
like what I *need* is for dot to be a ufunc with a reduce() operator.
Then I would construct an array of the matrices, as specified by the
input x.  For now, I am using a python loop and this is unbearable.

 prod = np.eye(k)
 for i in x:
...  prod = dot(prod, matrices[i])
...

Is there a better way?
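
The tersest version I have found is still a Python-level loop in disguise
(a sketch, using the names above; reduce is a builtin in Python 2):

    prod = reduce(np.dot, matrices[x], np.eye(k))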
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Row-wise dot product?

2009-09-07 Thread T J
Is there a better way to achieve the following, perhaps without the
python for loop?

 x.shape
(1,3)
 y.shape
(1,3)
 z = empty(len(x))
 for i in range(1):
...z[i] = dot(x[i], y[i])
...
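
(One loop-free way, at the cost of a temporary the size of x -- a sketch:)

    z = (x * y).sum(axis=-1)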
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Row-wise dot product?

2009-09-07 Thread T J
On Mon, Sep 7, 2009 at 3:27 PM, T Jtjhn...@gmail.com wrote:
 On Mon, Sep 7, 2009 at 7:09 AM, Hans-Andreas Engeleng...@deshaw.com wrote:
 If you wish to avoid the extra memory allocation implied by `x*y'
 and get a ~4x speed-up, you can use a generalized ufunc
 (numpy = 1.3, stolen from the testcases):

   z = numpy.core.umath_tests.inner1d(x, y)


 This is exactly what I was hoping for.  Now, I can also keep an array
 of vectors and apply a rotation matrix to each vector.


I spoke too soon.  inner1d will not allow me to rotate each row in the
array.  Is there another function that will help with this?  If I'm
understanding the signature for generalized ufuncs, it looks like I
need:  (i),(i,j)->(i)

Or perhaps I am just being dense.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Row-wise dot product?

2009-09-07 Thread T J
On Mon, Sep 7, 2009 at 3:43 PM, T Jtjhn...@gmail.com wrote:
 Or perhaps I am just being dense.


Yes.  I just tried to reinvent standard matrix multiplication.
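
(That is, for a rotation matrix R and an array of row vectors x, a sketch:)

    rotated = np.dot(x, R.T)    # rotate every row of x by R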
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Row-wise dot product?

2009-09-07 Thread T J
On Mon, Sep 7, 2009 at 7:09 AM, Hans-Andreas Engeleng...@deshaw.com wrote:
 If you wish to avoid the extra memory allocation implied by `x*y'
 and get a ~4x speed-up, you can use a generalized ufunc
 (numpy = 1.3, stolen from the testcases):

   z = numpy.core.umath_tests.inner1d(x, y)


This is exactly what I was hoping for.  Now, I can also keep an array
of vectors and apply a rotation matrix to each vector.

Hopefully, these use cases serve as good proof of why the generalized
ufunc machinery is useful.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Specifying Index Programmatically

2009-08-08 Thread T J
I have an array, and I need to index it like so:

z[...,x,:]

How can I write code which will index z, as above, when x is not known
ahead of time. For that matter, the particular dimension I am querying
is not known either.  In case this is still confusing, I am looking
for the NumPy way to do:

   z[(...,5,:)]

  z[(:, 3, :, 5, ...)]

  z[(1, ..., 5)]

Basically, I want to be able to pre-construct what should appear
inside the [].  The numbers are no problem, but I'm having trouble
with the ellipsis and colon.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] reduce function of vectorize doesn't respect dtype?

2009-08-08 Thread T J
On Fri, Aug 7, 2009 at 11:54 AM, T Jtjhn...@gmail.com wrote:
 The reduce function of ufunc of a vectorized function doesn't seem to
 respect the dtype.

 def a(x,y): return x+y
 b = vectorize(a)
 c = array([1,2])
 b(c, c)  # use once to populate b.ufunc
 d = b.ufunc.reduce(c)
 c.dtype, type(d)
 dtype('int32'), <type 'int'>

 c = array([[1,2,3],[4,5,6]])
 b.ufunc.reduce(c)
 array([5, 7, 9], dtype=object)


So is this a bug? Or am I doing something wrong?  In the second example

 d = b.ufunc.reduce(c)
 type(d[0])
<type 'int'>
 d.dtype
dtype('object')
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Specifying Index Programmatically

2009-08-08 Thread T J
On Sat, Aug 8, 2009 at 8:54 PM, Neil Martinsen-Burrelln...@wartburg.edu wrote:

 The ellipsis is a built-in python constant called Ellipsis.  The colon
 is a slice object, again a python built-in, called with None as an
 argument.  So, z[...,2,:] == z[Ellipsis,2,slice(None)].


Very helpful!  Thank you.  I didn't run into this information in any
of the indexing tutorials I ran through.  If I get some time, I'll try
to add it.
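
For the record, the pre-constructed indices I was after end up looking like
this (a sketch, using the examples from my original message):

    idx = (Ellipsis, 5, slice(None))
    z[idx]                                   # same as z[..., 5, :]

    idx = (slice(None), 3, slice(None), 5, Ellipsis)
    z[idx]                                   # same as z[:, 3, :, 5, ...]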
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Indexing with a list...

2009-08-08 Thread T J
 z = array([1,2,3,4])
 z[[1]]
array([2])
 z[(1,)]
2

I'm just curious: What is the motivation for this differing behavior?

Is it a necessary consequence of, for example, the following:

 z[z < 3]
array([1,2])
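
(For concreteness, the two index types side by side -- a tuple is treated
as a multi-axis index, while a list triggers fancy indexing:)

    z[(1,)]    # same as z[1] -- plain indexing, returns a scalar
    z[[1]]     # fancy indexing, returns an array of the selected elements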
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Vectorize ufunc

2009-08-07 Thread T J
I was wondering why vectorize doesn't make the ufunc available at the
topmost level

 def a(x,y):  return x + y
 b = vectorize(a)
 b.reduce

Instead, the ufunc is stored at b.ufunc.

Also, b.ufunc.reduce() doesn't seem to exist until I *use* the
vectorized function at least once.  Can this be changed so that it
exists right away (and preferably at b.reduce instead of
b.ufunc.reduce)?
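
(A workaround sketch: np.frompyfunc builds a real ufunc up front, though
the results come back with object dtype:)

    b = np.frompyfunc(a, 2, 1)
    b.reduce(np.array([1, 2, 3]))    # 6, as a Python int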
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] reduce function of vectorize doesn't respect dtype?

2009-08-07 Thread T J
The reduce function of ufunc of a vectorized function doesn't seem to
respect the dtype.

 def a(x,y): return x+y
 b = vectorize(a)
 c = array([1,2])
 b(c, c)  # use once to populate b.ufunc
 d = b.ufunc.reduce(c)
 c.dtype, type(d)
dtype('int32'), <type 'int'>

 c = array([[1,2,3],[4,5,6]])
 b.ufunc.reduce(c)
array([5, 7, 9], dtype=object)

My goal is to use the output of vectorize() as if it is actually a ufunc.
So I'd really like to just type:   b.reduce, b.accumulate, etc.
And I don't want to have to write (after using it once to populate b.ufunc):
 b.ufunc.reduce(c).astype(numpy.int32)
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] dot documentation

2009-08-07 Thread T J
Hi,  the documentation for dot says that a value error is raised if:

If the last dimension of a is not the same size as the
second-to-last dimension of b.

(http://docs.scipy.org/doc/numpy/reference/generated/numpy.dot.htm)

This doesn't appear to be the case:

 a = array([[1,2],[3,4]])
 b = array([1,2])
 dot(a,b)
array([5,11])

I can see *how* 5,11 is obtained, but it seems this should have raised
a ValueError since the 2 != 1.  So the actual code must do something
more involved.  When I think about broadcasting, it seems that maybe b
should have been broadcasted to:

--  array([[1,2],[1,2]])

and then the multiplication done as normal (but this would give a 2x2 result).

Can someone explain this to me?
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] dot documentation

2009-08-07 Thread T J
Oh.  b.shape = (2,).  So I suppose the second to last dimension is, in
fact, the last dimension...and 2 == 2.
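
(In other words -- a sketch:)

    import numpy as np
    a = np.array([[1, 2], [3, 4]])
    b = np.array([1, 2])
    np.dot(a, b)    # -> array([ 5, 11]); same as (a * b).sum(axis=1)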

nvm

On Fri, Aug 7, 2009 at 2:19 PM, T Jtjhn...@gmail.com wrote:
 Hi,  the documentation for dot says that a value error is raised if:

    If the last dimension of a is not the same size as the
 second-to-last dimension of b.

 (http://docs.scipy.org/doc/numpy/reference/generated/numpy.dot.htm)

 This doesn't appear to be the case:

 a = array([[1,2],[3,4]])
 b = array([1,2])
 dot(a,b)
 array([5,11])

 I can see *how* 5,11 is obtained, but it seems this should have raised
 a ValueError since the 2 != 1.  So the actual code must do something
 more involved.  When I think about broadcasting, it seems that maybe b
 should have been broadcasted to:

 --  array([[1,2],[1,2]])

 and then the multiplication done as normal (but this would give a 2x2 result).

 Can someone explain this to me?

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Repeated Dot Operations

2009-07-20 Thread T J
Hi,

Is there a good way to perform dot on an arbitrary list of arrays
which avoids using a loop?  Here is what I'd like to avoid:

# m1, m2, m3  are arrays
 out = np.eye(m1.shape[0])
 prod = [m1, m2, m3, m1, m2, m3, m3, m2]
 for m in prod:
... out = np.dot(out, m)
...

I was hoping for something like np.dot.reduce(prod).
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] python numpy code many times slower than c++

2009-01-20 Thread T J
On Tue, Jan 20, 2009 at 6:57 PM, Neal Becker ndbeck...@gmail.com wrote:
 It seems the big chunks of time are used in data conversion between numpy
 and my own vectors classes.  Mine are wrappers around boost::ublas.  The
 conversion must be falling back on a very inefficient method since there is no
 special code to handle numpy vectors.

 Not sure what is the best solution.  It would be _great_ if I could make
 boost::python objects that export a buffer interface, but I have absolutely
 no idea how to do this (and so far noone else has volunteered any info on
 this).


I'm not sure if I've understood everything here, but I think that
pyublas provides exactly what you need.

http://tiker.net/doc/pyublas/
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] New ufuncs

2008-11-10 Thread T J
On Mon, Nov 10, 2008 at 4:05 PM, Charles R Harris
[EMAIL PROTECTED] wrote:

 I added log2 and exp2. I still need to do the complex versions. I think
 logaddexp2 should go in also to compliment these.

Same here, especially since logaddexp is present.  Or was the idea
that both logaddexp and logaddexp2 should be moved to scipy.special?

 Note that MPL also defines
 log2 and their version has slightly different properties, i.e., it returns
 integer values for integer powers of two.


I'm just curious now.  Can someone comment on the difference in the
implementation just committed versus that in cephes?

http://projects.scipy.org/scipy/scipy/browser/trunk/scipy/special/cephes/exp2.c

The difference won't matter to me as far as usage goes, but I was curious.
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] New ufuncs

2008-11-09 Thread T J
On Thu, Nov 6, 2008 at 3:01 PM, T J [EMAIL PROTECTED] wrote:
 On Thu, Nov 6, 2008 at 2:36 PM, Charles R Harris
 [EMAIL PROTECTED] wrote:
 I could add exp2, log2, and logaddexp2 pretty easily. Almost too easily, I
 don't want to clutter up numpy with a lot of functions. However, if there is
 a community for these functions I will put them in.


 I worry about clutter as well.  Note that scipy provides log2 and exp2
 already (scipy.special).  So I think only logaddexp2 would be needed
 and (eventually) logdotexp2.  Maybe scipy.special is a better place
 than in numpy?  Then perhaps the clutter could be avoidedthough
 I'm probably not the best one to ask for advice on this.  I will
 definitely use the functions and I suspect many others will as
 well---where ever they are placed.

Since no one commented further on this, can we go ahead and add
logaddexp2?  Once in svn, we can always deal with 'location' later---I
just don't want it to get forgotten.
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] atlas not found, why?

2008-11-07 Thread T J
On Fri, Nov 7, 2008 at 2:16 AM, David Cournapeau
[EMAIL PROTECTED] wrote:

 And you have no site.cfg at all ?


Wow. I was too focused on the current directory and didn't realize I
had an old site.cfg in ~/.

Two points:

1) Others (myself included) might catch such silliness sooner if the
location of the cfg file were printed to screen during setup.py.  As
of now, I get:

Running from numpy source directory.
non-existing path in 'numpy/distutils': 'site.cfg'
F2PY Version 2_5986
...

2) The current system_info.py says:


The file 'site.cfg' is looked for in

1) Directory of main setup.py file being run.
2) Home directory of user running the setup.py file as ~/.numpy-site.cfg
3) System wide directory (location of this file...)


Doesn't this mean that it should *not* have picked up my ~/site.cfg?


Anyway, I can now report that ATLAS is detected without defining any
environment variables.  Thanks for all the help!
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] atlas not found, why?

2008-11-07 Thread T J
On Fri, Nov 7, 2008 at 1:58 AM, T J [EMAIL PROTECTED] wrote:

 That the fortran wrappers were compiled using g77 is also apparent via
 what is printed out during setup when ATLAS is detected:

 gcc -pthread _configtest.o -L/usr/lib/atlas -llapack -lblas -o _configtest
 ATLAS version 3.6.0 built by root on Fri Jan  9 15:57:20 UTC 2004:
   UNAME: Linux intech67 2.4.20 #1 SMP Fri Jan 10 18:29:51 EST
 2003 i686 GNU/Linux
   INSTFLG  :
   MMDEF: /fix/g/camm/atlas3-3.6.0/CONFIG/ARCHS/P4SSE2/gcc/gemm
   ARCHDEF  : /fix/g/camm/atlas3-3.6.0/CONFIG/ARCHS/P4SSE2/gcc/misc
   F2CDEFS  : -DAdd__ -DStringSunStyle
   CACHEEDGE: 1048576
   F77  : /usr/bin/g77, version GNU Fortran (GCC) 3.3.3 20031229
 (prerelease) (Debian)
   F77FLAGS : -fomit-frame-pointer -O
   CC   : /usr/bin/gcc, version gcc (GCC) 3.3.3 20031229
 (prerelease) (Debian)
   CC FLAGS : -fomit-frame-pointer -O3 -funroll-all-loops
   MCC  : /usr/bin/gcc, version gcc (GCC) 3.3.3 20031229
 (prerelease) (Debian)
   MCCFLAGS : -fomit-frame-pointer -O
 success!


That was intended to be a question.
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] atlas not found, why?

2008-11-07 Thread T J
On Fri, Nov 7, 2008 at 1:48 AM, David Cournapeau
[EMAIL PROTECTED] wrote:

 It works for me on Intrepid (64 bits). Did you install
 libatlas3gf-base-dev ? (the names changed in intrepid).


I fear I am overlooking something obvious.

$ sudo aptitude search libatlas
p   libatlas-3dnow-dev  -
Automatically Tuned Linear Algebra Software,3dnow static
v   libatlas-3gf.so -
i   libatlas-base-dev   -
Automatically Tuned Linear Algebra Software,generic static
p   libatlas-cpp-0.6-1  - The
protocol library of the World Forge project - runtime libs
p   libatlas-cpp-0.6-1-dbg  - The
protocol library of the World Forge project - debugging libs
p   libatlas-cpp-0.6-dev- The
protocol library of the World Forge project - header files
p   libatlas-cpp-doc- The
protocol library of the World Forge project - documentation
p   libatlas-doc-
Automatically Tuned Linear Algebra Software,documentation
i A libatlas-headers-
Automatically Tuned Linear Algebra Software,C header files
p   libatlas-sse-dev-
Automatically Tuned Linear Algebra Software,SSE1 static
i   libatlas-sse2-dev   -
Automatically Tuned Linear Algebra Software,SSE2 static
p   libatlas-test   -
Automatically Tuned Linear Algebra Software,test programs
v   libatlas.so.3   -
v   libatlas.so.3gf -
p   libatlas3gf-3dnow   -
Automatically Tuned Linear Algebra Software,3dnow shared
i A libatlas3gf-base-
Automatically Tuned Linear Algebra Software,generic shared
p   libatlas3gf-sse -
Automatically Tuned Linear Algebra Software,SSE1 shared
i   libatlas3gf-sse2-
Automatically Tuned Linear Algebra Software,SSE2 shared

It looks like I have the important ones:
   libatlas-base-dev
   libatlas-headers
   libatlas-sse2-dev
   libatlas3gf-base
   libatlas3gf-sse2

But I don't see libatlas3gf-base-dev anywhere.  Is this on a special
repository?  My sources.list is:

# Intrepid Final Release Repository
deb http://archive.ubuntu.com/ubuntu intrepid main restricted universe
multiverse
deb-src http://archive.ubuntu.com/ubuntu intrepid main restricted
universe multiverse #Added by software-properties

# Intrepid Security Updates
deb http://archive.ubuntu.com/ubuntu intrepid-security main restricted
universe multiverse
deb-src http://archive.ubuntu.com/ubuntu intrepid-security main
restricted universe multiverse #Added by software-properties

# Intrepid Bugfix Updates
deb http://archive.ubuntu.com/ubuntu intrepid-updates main restricted
universe multiverse
deb-src http://archive.ubuntu.com/ubuntu intrepid-updates main
restricted universe multiverse #Added by software-properties


I suppose it is possible that something is messed up on my end since I
upgraded from 8.04.
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] atlas not found, why?

2008-11-07 Thread T J
On Fri, Nov 7, 2008 at 1:26 AM, David Cournapeau
[EMAIL PROTECTED] wrote:
 David Cournapeau wrote:

 Ok, I took a brief look at this: I forgot that Ubuntu and Debian added
 an aditional library suffix to libraries depending on gfortran ABI. I
 added support for this in numpy.distutils - which was looking for
 libraries explicitly; could you retry *without* a site.cfg ? It
 should work, now,


 And it won't, because I am afraid the Ubuntu atlas package is broken in
 8.10... They use the gfortran  ABI, but they built the fortran wrappers
 with g77, according to ATL_buildinfo.

 https://bugs.launchpad.net/ubuntu/+source/atlas/+bug/295051

 I would strongly advise using atlas on Ubuntu 8.10 until this bug is
 solved: it means any numpy/scipy code using linear algebra is
 potentially broken (segfault, wrong results).


Intended:
   '''I would strongly advise *against* using atlas on Ubuntu 8.10.'''   :-)

That the fortran wrappers were compiled using g77 is also apparent via
what is printed out during setup when ATLAS is detected:

gcc -pthread _configtest.o -L/usr/lib/atlas -llapack -lblas -o _configtest
ATLAS version 3.6.0 built by root on Fri Jan  9 15:57:20 UTC 2004:
   UNAME: Linux intech67 2.4.20 #1 SMP Fri Jan 10 18:29:51 EST
2003 i686 GNU/Linux
   INSTFLG  :
   MMDEF: /fix/g/camm/atlas3-3.6.0/CONFIG/ARCHS/P4SSE2/gcc/gemm
   ARCHDEF  : /fix/g/camm/atlas3-3.6.0/CONFIG/ARCHS/P4SSE2/gcc/misc
   F2CDEFS  : -DAdd__ -DStringSunStyle
   CACHEEDGE: 1048576
   F77  : /usr/bin/g77, version GNU Fortran (GCC) 3.3.3 20031229
(prerelease) (Debian)
   F77FLAGS : -fomit-frame-pointer -O
   CC   : /usr/bin/gcc, version gcc (GCC) 3.3.3 20031229
(prerelease) (Debian)
   CC FLAGS : -fomit-frame-pointer -O3 -funroll-all-loops
   MCC  : /usr/bin/gcc, version gcc (GCC) 3.3.3 20031229
(prerelease) (Debian)
   MCCFLAGS : -fomit-frame-pointer -O
success!

And the problem with this, as you've mentioned before, is that g77 and
gfortran are not abi compatible.  But isn't this issue separate from
the autodetection of atlas without a site.cfg?  With r5986, atlas is
still only detected if I declare ATLAS:

$ ATLAS=/usr/lib python setup.py build

versus

$ unset ATLAS; python setup.py build
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] New ufuncs

2008-11-06 Thread T J
On Thu, Nov 6, 2008 at 1:48 PM, Charles R Harris
[EMAIL PROTECTED] wrote:
 What is your particular interest in these other bases and why would
 they be better than working in base e and converting at the end?

The interest is in information theory, where quantities are
(standardly) represented in bits.  So log2 quantities are often stored
by the user and then passed into functions or classes.  The main
reason I'd like to shy away from conversions is that I also make use
of generators/iterators and having next() convert to bits before each
yield is not ideal (as these things are often slow enough and will be
called many times).

The only one I could see really having a fast implementation is log2.

No disagreement here :)
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] New ufuncs

2008-11-06 Thread T J
On Thu, Nov 6, 2008 at 2:36 PM, Charles R Harris
[EMAIL PROTECTED] wrote:
 I could add exp2, log2, and logaddexp2 pretty easily. Almost too easily, I
 don't want to clutter up numpy with a lot of functions. However, if there is
 a community for these functions I will put them in.


I worry about clutter as well.  Note that scipy provides log2 and exp2
already (scipy.special).  So I think only logaddexp2 would be needed
and (eventually) logdotexp2.  Maybe scipy.special is a better place
than in numpy?  Then perhaps the clutter could be avoidedthough
I'm probably not the best one to ask for advice on this.  I will
definitely use the functions and I suspect many others will as
well---where ever they are placed.
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] New ufuncs

2008-11-05 Thread T J
On Wed, Nov 5, 2008 at 12:00 PM, Charles R Harris
[EMAIL PROTECTED] wrote:
 Hmm I wonder if the base function should be renamed logaddexp, then
 logsumexp would apply to the reduce method. Thoughts?


As David mentioned, logsumexp is probably the traditional name, but as
the earlier link shows, it also goes by logadd.  Given the distinction
between add (a ufunc) and sum (something done over an axis) within
numpy, it seems that logadd or logaddexp is probably a more fitting
name.  So long as it is documented, I doubt it matters much though...
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] New ufuncs

2008-11-04 Thread T J
On Tue, Nov 4, 2008 at 9:37 PM, Anne Archibald
[EMAIL PROTECTED] wrote:
 2008/11/5 Charles R Harris [EMAIL PROTECTED]:
 Hi All,

 I'm thinking of adding some new ufuncs. Some possibilities are

 expadd(a,b) = exp(a) + exp(b) -- For numbers stored as logs:

 Surely this should be log(exp(a)+exp(b))? That would be extremely useful, yes.


+1

But shouldn't it be called 'logadd', for adding values which are stored as logs?

http://www.lri.fr/~pierres/donn%E9es/save/these/torch/docs/manual/logAdd.html

I would also really enjoy a logdot  function, to be used when working
with arrays whose elements are log values.
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] atlas not found, why?

2008-11-03 Thread T J
Numpy doesn't seem to be finding my atlas install.  Have I done
something wrong or misunderstood?

$ cd /usr/lib
$ ls libatlas*
libatlas.a  libatlas.so  libatlas.so.3gf  libatlas.so.3gf.0
$ ls libf77*
libf77blas.a  libf77blas.so  libf77blas.so.3gf  libf77blas.so.3gf.0
$ ls libcblas*
libcblas.a  libcblas.so  libcblas.so.3gf  libcblas.so.3gf.0
$ ls liblapack*
liblapack-3.a   liblapack.aliblapack_atlas.so   
liblapack_atlas.so.3gf.0  liblapackgf-3.so  liblapack.so.3gf
liblapack-3.so  liblapack_atlas.a  liblapack_atlas.so.3gf
liblapackgf-3.a  liblapack.so  liblapack.so.3gf.0


Since these are all in the standard locations, I am building without a
site.cfg.  Here is the beginning info:

Running from numpy source directory.
non-existing path in 'numpy/distutils': 'site.cfg'
F2PY Version 2_5972
blas_opt_info:
blas_mkl_info:
  libraries mkl,vml,guide not found in /usr/local/lib
  libraries mkl,vml,guide not found in /usr/lib
  NOT AVAILABLE

atlas_blas_threads_info:
Setting PTATLAS=ATLAS
  NOT AVAILABLE

atlas_blas_info:
  NOT AVAILABLE

/tmp/numpy/numpy/distutils/system_info.py:1340: UserWarning:
Atlas (http://math-atlas.sourceforge.net/) libraries not found.
Directories to search for the libraries can be specified in the
numpy/distutils/site.cfg file (section [atlas]) or by setting
the ATLAS environment variable.
  warnings.warn(AtlasNotFoundError.__doc__)
blas_info:
  libraries blas not found in /usr/local/lib
  FOUND:
libraries = ['blas']
library_dirs = ['/usr/lib']
language = f77

  FOUND:
libraries = ['blas']
library_dirs = ['/usr/lib']
define_macros = [('NO_ATLAS_INFO', 1)]
language = f77

lapack_opt_info:
lapack_mkl_info:
mkl_info:
  libraries mkl,vml,guide not found in /usr/local/lib
  libraries mkl,vml,guide not found in /usr/lib
  NOT AVAILABLE

  NOT AVAILABLE

atlas_threads_info:
Setting PTATLAS=ATLAS
numpy.distutils.system_info.atlas_threads_info
  NOT AVAILABLE

atlas_info:
numpy.distutils.system_info.atlas_info
  NOT AVAILABLE

/tmp/numpy/numpy/distutils/system_info.py:1247: UserWarning:
Atlas (http://math-atlas.sourceforge.net/) libraries not found.
Directories to search for the libraries can be specified in the
numpy/distutils/site.cfg file (section [atlas]) or by setting
the ATLAS environment variable.
  warnings.warn(AtlasNotFoundError.__doc__)
lapack_info:
  libraries lapack not found in /usr/local/lib
  FOUND:
libraries = ['lapack']
library_dirs = ['/usr/lib']
language = f77

  FOUND:
libraries = ['lapack', 'blas']
library_dirs = ['/usr/lib']
define_macros = [('NO_ATLAS_INFO', 1)]
language = f77
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] atlas not found, why?

2008-11-03 Thread T J
On Mon, Nov 3, 2008 at 10:46 AM, T J [EMAIL PROTECTED] wrote:

 Since these are all in the standard locations, I am building without a
 site.cfg.  Here is the beginning info:


Apparently, this is not enough.  Only if I also set the ATLAS
environment variable am I able to get this working as expected.  So
while I now have this working, I'd still like to understand why this
is necessary.  This is for Ubuntu 8.10.  My previous experience was
that no site.cfg was necessary.  This is still the case, but you also
need ATLAS defined.

I was also expecting that the following site.cfg would avoid requiring
ATLAS to be defined, but it did not.

[DEFAULT]
library_dirs = /usr/lib:/usr/lib/sse2
include_dirs = /usr/local/include

[blas_opt]
libraries = f77blas, cblas, atlas

[lapack_opt]
libraries = lapack, f77blas, cblas, atlas

So can someone explain why I *must* define ATLAS.  I tried a number of
variations on site.cfg and could not get numpy to find atlas with any
of them.
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Installation Trouble

2008-11-02 Thread T J
Sorry, wrong list.

On Sun, Nov 2, 2008 at 11:34 AM, T J [EMAIL PROTECTED] wrote:
 Hi,

 I'm having trouble installing PyUblas 0.93.1 (same problems from the
 current git repository).  I'm in ubuntu 8.04 with standard boost
 packages (1.34.1, I believe).  Do you have any suggestions?

 Thanks!

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Combinations of objects (?)

2008-10-20 Thread T J
On Mon, Oct 20, 2008 at 2:20 AM, A. G. wrote:
 one well attached to 2 or more units). Is there any simple way in
 numpy (scipy?) in which I can get the number of possible combinations
 of wells attached to the different 3 units, without repetitions? For
 example, I could have all 60 wells attached to unit number 1, then 59
 on unit 1 and 1 on unit 3 and so on...


From:  http://tinyurl.com/6oeyx8

def boxings(n, k):
    seq, i = [n]*k + [0], k
    while i:
        yield tuple(seq[i] - seq[i+1] for i in xrange(k))
        i = seq.index(0) - 1
        seq[i:k] = [seq[i] - 1] * (k-i)


Example:

 from scipy import factorial as f
 x = list(boxings(60,3))
 len(x)
1891
 f(60+3-1) / f(60) / f(3-1)
1891.0


That thread contains another solution using itertools.
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] SWIG, typemaps, 2D argout arrays

2008-10-14 Thread T J
On Tue, Oct 14, 2008 at 1:02 AM, Sebastian Haase [EMAIL PROTECTED] wrote:
 b) I don't want to use Python / numpy API code in the C functions I'm
 wrapping - so I limit myself to input arrays! Since array memory
 does not distinguish between input or output (assuming there is no
 copying needed because of alignment or contiguity issues) the only
 implication of this is that I have to allocate the array outside the C
 functions.

To clarify, are you saying that you don't make any use of numpy.i?
The functions I'm writing are not making any explicit use of the
Python/numpy API, but I'm sure it shows up in the *_wrap.c file that
SWIG creates.

 input-conversion transparently for my --- do you need to handle
 non-contiguous arrays ?

Probably not.


 HTH, I'm eager to learn here ;-)


Thanks
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] SWIG, typemaps, 2D argout arrays

2008-10-13 Thread T J
Hi,

I'm new to using SWIG and my reading of numpy_swig.pdf tells me that
the following typemap does not exist:

  (int* ARGOUT_ARRAY2, int DIM1, int DIM2)

What is the recommended way to output a 2D array?  It seems like I should use:

  (int* ARGOUT_ARRAY1, int DIM1)

and then provide a python function which reshapes the 1D array?  Is it
correct that there will be insignificant performance disadvantages to
this?  Also, is there any way to do this in an automated fashion?  My
first thought is that I'd need to create this function outside of the
python module that SWIG creates.
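
(Roughly what I have in mind -- a sketch with made-up names: _mymod is the
hypothetical SWIG-generated module and its compute() uses the
(int* ARGOUT_ARRAY1, int DIM1) typemap:)

    import _mymod

    def compute(n, m):
        flat = _mymod.compute(n * m)   # 1-D output from the wrapped C function
        return flat.reshape(n, m)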

Thanks!
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Test failures on 2.6

2008-10-05 Thread T J
Hi,

I'm getting a couple of test failures with Python 2.6, Numpy 1.2.0, Nose 0.10.4:



nose version 0.10.4
..FK..
 
.../share/home/me/usr/lib/python2.6/site-packages/numpy/lib/tests/test_io.py:68:
SyntaxWarning: assertion is always true, perhaps remove parentheses?
  assert(c.readlines(),
./share/home/me/usr/lib/python2.6/site-packages/numpy/ma/tests/test_core.py:1315:
SyntaxWarning: assertion is always true, perhaps remove parentheses?
  assert(store._mask, True)
/home/me/usr/lib/python2.6/site-packages/numpy/ma/tests/test_core.py:1322:
SyntaxWarning: assertion is always true, perhaps remove parentheses?
  assert(store._mask, True)
/home/me/usr/lib/python2.6/site-packages/numpy/ma/tests/test_core.py:1989:
SyntaxWarning: assertion is always true, perhaps remove parentheses?
  assert(test.mask, [0,1,0,0,0,0,0,0,0,0])
...E
==
ERROR: Tests the min/max functions with explicit outputs
--
Traceback (most recent call last):
  File /home/me/usr/lib/python2.6/site-packages/numpy/ma/tests/test_core.py,
line 653, in test_minmax_funcs_with_output
result = npfunc(xm,axis=0,out=nout)
  File /home/me/usr/lib/python2.6/site-packages/numpy/core/fromnumeric.py,
line 1525, in amin
return amin(axis, out)
  File /home/me/usr/lib/python2.6/site-packages/numpy/ma/core.py,
line 2978, in min
np.putmask(out, newmask, np.nan)
ValueError: cannot convert float NaN to integer

==
FAIL: test_umath.TestComplexFunctions.test_against_cmath
--
Traceback (most recent call last):
  File 
/home/me/usr/lib/python2.6/site-packages/nose-0.10.4-py2.6.egg/nose/case.py,
line 182, in runTest
self.test(*self.arg)
  File 
/home/me/usr/lib/python2.6/site-packages/numpy/core/tests/test_umath.py,
line 268, in test_against_cmath
    assert abs(a - b) < atol, "%s %s: %s; cmath: %s"%(fname,p,a,b)
AssertionError: arcsin 2: (1.57079632679-1.31695789692j); cmath:
(1.57079632679+1.31695789692j)

--
Ran 1726 tests in 8.856s

FAILED (KNOWNFAIL=1, errors=1, failures=1)
nose.result.TextTestResult run=1726 errors=1 failures=1
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Log Arrays

2008-05-08 Thread T J
Hi,

For precision reasons, I almost always need to work with arrays whose
elements are log values.  My thought was that it would be really neat
to have a 'logarray' class implemented in C or as a subclass of the
standard array class.  Here is a sample of how I'd like to work with
these objects:

 x = array([-2,-2,-3], base=2)
 y = array([-1,-2,-inf], base=2)
 z = x + y
 z
array([-0.415037499279, -1.0, -3])
 z = x * y
 z
array([-3, -4, -inf])
 z[:2].sum()
-2.41503749928

This would do a lot for the code I write, and some of the numerical
stability issues would be handled more appropriately.  For example, the
logsum function is frequently handled as:

 log(x + y)  ==  log(x) + log(1 + exp(log(y) - log(x)) )

when log(x) > log(y).   So the goal of the above function is to return
log(x + y)  using only the logarithms of x and y.
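
(A sketch of that identity with plain NumPy, for natural logs; it assumes
at least one of the two inputs is finite:)

    import numpy as np

    def logsum(logx, logy):
        a = np.maximum(logx, logy)
        b = np.minimum(logx, logy)
        # log(x + y) computed from log(x) and log(y) without leaving log space
        return a + np.log1p(np.exp(b - a))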


Does this sound like a good idea?
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Log Arrays

2008-05-08 Thread T J
On Thu, May 8, 2008 at 12:26 AM, T J [EMAIL PROTECTED] wrote:

   x = array([-2,-2,-3], base=2)
   y = array([-1,-2,-inf], base=2)
   z = x + y
   z
  array([-0.415037499279, -1.0, -3])
   z = x * y
   z
  array([-3, -4, -inf])
   z[:2].sum()
  -2.41503749928


Whoops

s/array/logarray/
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Log Arrays

2008-05-08 Thread T J
On 5/8/08, Anne Archibald [EMAIL PROTECTED] wrote:
 Is logarray really the way to handle it, though? it seems like you
 could probably get away with providing a logsum ufunc that did the
 right thing. I mean, what operations does one want to do on logarrays?

 add -> logsum
 subtract -> ?
 multiply -> add
 mean -> logsum/N
 median -> median
 exponentiate to recover normal-space values -> exp
 str -> ?


That's about it, as far as my usage goes.  Additionally, I would also
benefit from:

   logdot
   logouter

In addition to the elementwise operations, it would be nice to have

   logsum along an axis
   logprod along an axis
   cumlogsum
   cumlogprod

Whether these are through additional ufuncs or through a subclass is
not so much of an issue for me---either would be a huge improvement to
the current situation.  One benefit of a subclass, IMO, is that it
maintains the feel of non-log arrays.  That is, when I want to
multiply two logarrays, I just do x*y rather than x+y, but I can
understand arguments that this might not be desirable.
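
And for completeness, a sketch of the logdot I keep wanting -- numerically
stable via the same max-shift trick, but it builds an intermediate of shape
(i, k, j), so it trades memory for precision:

    import numpy as np

    def logdot(a, b):
        # log-space matrix product: log(exp(a) . exp(b)) for 2-D a and b
        m = a[:, :, np.newaxis] + b[np.newaxis, :, :]
        mx = m.max(axis=1)
        return mx + np.log(np.exp(m - mx[:, np.newaxis, :]).sum(axis=1))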
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion