Re: [Numpy-discussion] Meta: help, devel and stackoverflow

2012-06-30 Thread John Hunter
On Fri, Jun 29, 2012 at 2:20 PM, Jim Vickroy jim.vick...@noaa.gov wrote:

 As a lurker and user, I too wish for a distinct numpy-users list.  -- jv


This thread is a perfect example of why another list is needed.  It's
currently 42 semi-philosophical posts about what kind of community numpy
should be and what kinds of lists or stacks should serve it.  There needs
to be a place where people can ask simple 'how do I do x in numpy'
questions without having to wade through hundreds of posts about release
cycles, community input, process, and decisions about ABI and API
compatibility in point versus major releases.  Most people just don't care
-- they just want to be reasonably sure that the developers do care and are
doing it right.  And if they want to participate or observe these
discussions, they know where to go.  It's like sausage making -- the more
people get an inside look at how the sausage is made, the more they are
afraid to eat it.

In mpl we have a devel list and a users list.  Preparing for a release, we
might have a hundred emails about PR status and breakers and release cycles
and god knows what.  The users list gets "rc1 is ready for testing", "rc2
is ready for testing", and "v1.1.1 is released".  That's about all most
people want to know about our release process.

JDH
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Dropping support for Python 2.4 in NumPy 1.8

2012-06-28 Thread John Hunter
On Thu, Jun 28, 2012 at 7:25 AM, Travis Oliphant tra...@continuum.io wrote:
 Hey all,

 I'd like to propose dropping support for Python 2.4 in NumPy 1.8 (not the 1.7 
 release).      What does everyone think of that?


As a tangential point, MPL is dropping support for python2.4 in its
next major release.  As such we have put a lot of effort into making
our upcoming point release extremely stable since it is likely to be
the last 2.4 release.  Our next major release (either designated 1.2
or 2.0, TBD) will have python3 support, and it seemed too much to try
and support python versions from 2.4 on up.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NumPy 1.7 release delays

2012-06-27 Thread John Hunter
 Some examples would be nice. A lot of people did move already. And I haven't
 seen reports of those that tried and got stuck. Also, Debian and Python(x,
 y) have 1.6.2, EPD has 1.6.1.

In my company, the numpy for our production python install is well
behind 1.6.  In the world of trading, the upgrade cycle can be slow,
because when people have production trading systems that are working
and running stably, they have little or no incentive to upgrade.  I
know Travis has been doing a lot of consulting inside major banks and
investment houses, and these are probably the kinds of people he sees
regularly.  You also have a fair amount of personnel turnover over the
years, so that the developer who wrote the trading system may have
moved on, and an upgrade which breaks the code is difficult to repair
because the original developers are gone.  So people are loath to
upgrade.  It is certainly true that deprecations that have lived for a
single point release cycle have not been vetted by a large part of the
user community.

In my group, we try to stay as close to the bleeding edge as possible
so as to not fall behind and make an upgrade painful, but we are not
the rule.

JDH
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Created NumPy 1.7.x branch

2012-06-26 Thread John Hunter
On Tue, Jun 26, 2012 at 3:27 PM, Thouis (Ray) Jones tho...@gmail.com wrote:
 +1 !

 Speaking as someone trying to get started in contributing to numpy, I
 find this discussion extremely off-putting.  It's childish,
 meaningless, and spiteful, and I think it's doing more harm than any
 possible good that could come out of continuing it.

Hey Thouis,

Just chiming in to encourage you not to get discouraged.  There is a
large, mostly silent majority who feel just the same way you do, it's
just that they are silent precisely because they want to write good
code and contribute and not participate in long, unproductive email
threads that border on flame wars.  You've made helpful comments here
already advising people to take this offlist.  After that there is
nothing much to do but roll up your sleeves, make some pull requests,
and engage in a worthwhile discussion about work.  There are lots of
people here who will engage you on that.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] all elements equal

2012-03-05 Thread John Hunter
On Mon, Mar 5, 2012 at 1:29 PM, Keith Goodman kwgood...@gmail.com wrote:


 I[8] np.allclose(a, a[0])
 O[8] False
 I[9] a = np.ones(10)
 I[10] np.allclose(a, a[0])
 O[10] True


One disadvantage of using a[0] as a proxy is that the result depends on the
ordering of a

  (a.max() - a.min()) < epsilon

is an alternative that avoids this.  Another good use case for a minmax
func.
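
As a minimal sketch of the order-independent check (the function name
is mine; np.ptp computes max - min in one pass)::

    import numpy as np

    def all_equal(a, epsilon=1e-12):
        # True when all elements of a are within epsilon of each other
        return np.ptp(a) < epsilon

    a = np.ones(10)
    print all_equal(a)                   # True
    print all_equal(np.random.rand(10))  # almost surely False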
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Proposed Roadmap Overview

2012-02-29 Thread John Hunter
On Wed, Feb 29, 2012 at 1:20 PM, Neal Becker ndbeck...@gmail.com wrote:


 Much of Linus's complaints have to do with the use of c++ in the _kernel_.
 These objections are quite different for an _application_.  For example,
 there
 are issues with the need for support libraries for exception handling.
  Not an
 issue for an application.

Actually, the thread was on the git mailing list, and many of
his complaints were addressing the appropriateness of C++ for git
development.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Proposed Roadmap Overview

2012-02-28 Thread John Hunter
On Sat, Feb 18, 2012 at 5:09 PM, David Cournapeau courn...@gmail.comwrote:


 There are better languages than C++ that has most of the technical
 benefits stated in this discussion (rust and D being the most
 obvious ones), but whose usage is unrealistic today for various
 reasons: knowledge, availability on esoteric platforms, etc… A new
 language is completely ridiculous.



I just saw this for the first time today: Linus Torvalds on C++ (
http://harmful.cat-v.org/software/c++/linus).  The post is from 2007 so
many of you may have seen it, but I thought it was entertaining enough and
on-topic enough with this thread that I'd share it in case you haven't.


The point he makes:

  In other words, the only way to do good, efficient, and system-level and
  portable C++ ends up to limit yourself to all the things that
are basically
  available in C

was interesting to me because the best C++ library I have ever worked with
(agg) imports *nothing* except standard C libs (no standard template
library).  In fact, the only includes external to itself
are math.h, stdlib.h, stdio.h, and string.h.

To shoehorn Jamie Zawinski's famous regex quote (
http://regex.info/blog/2006-09-15/247).  Some people, when confronted with
a problem, think “I know, I'll use boost.”   Now they have two problems.

Here is the Linus post:

From: Linus Torvalds torvalds at linux-foundation.org
Subject: Re: [RFC] Convert builin-mailinfo.c to use The Better String
Library.
Newsgroups: gmane.comp.version-control.git
Date: 2007-09-06 17:50:28 GMT (2 years, 14 weeks, 16 hours and 36 minutes
ago)

On Wed, 5 Sep 2007, Dmitry Kakurin wrote:

 When I first looked at Git source code two things struck me as odd:
 1. Pure C as opposed to C++. No idea why. Please don't talk about
portability,
 it's BS.

*YOU* are full of bullshit.

C++ is a horrible language. It's made more horrible by the fact that a lot
of substandard programmers use it, to the point where it's much much
easier to generate total and utter crap with it. Quite frankly, even if
the choice of C were to do *nothing* but keep the C++ programmers out,
that in itself would be a huge reason to use C.

In other words: the choice of C is the only sane choice. I know Miles
Bader jokingly said to piss you off, but it's actually true. I've come
to the conclusion that any programmer that would prefer the project to be
in C++ over C is likely a programmer that I really *would* prefer to piss
off, so that he doesn't come and screw up any project I'm involved with.

C++ leads to really really bad design choices. You invariably start using
the nice library features of the language like STL and Boost and other
total and utter crap, that may help you program, but causes:

 - infinite amounts of pain when they don't work (and anybody who tells me
   that STL and especially Boost are stable and portable is just so full
   of BS that it's not even funny)

 - inefficient abstracted programming models where two years down the road
   you notice that some abstraction wasn't very efficient, but now all
   your code depends on all the nice object models around it, and you
   cannot fix it without rewriting your app.

In other words, the only way to do good, efficient, and system-level and
portable C++ ends up to limit yourself to all the things that are
basically available in C. And limiting your project to C means that people
don't screw that up, and also means that you get a lot of programmers that
do actually understand low-level issues and don't screw things up with any
idiotic object model crap.

So I'm sorry, but for something like git, where efficiency was a primary
objective, the advantages of C++ is just a huge mistake. The fact that
we also piss off people who cannot see that is just a big additional
advantage.

If you want a VCS that is written in C++, go play with Monotone. Really.
They use a real database. They use nice object-oriented libraries.
They use nice C++ abstractions. And quite frankly, as a result of all
these design decisions that sound so appealing to some CS people, the end
result is a horrible and unmaintainable mess.

But I'm sure you'd like it more than git.

Linus
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Numpy governance update

2012-02-16 Thread John Hunter
On Thu, Feb 16, 2012 at 7:26 PM, Alan G Isaac alan.is...@gmail.com wrote:

 On 2/16/2012 7:22 PM, Matthew Brett wrote:
  This has not been an encouraging episode in striving for consensus.

 I disagree.
 Failure to reach consensus does not imply lack of striving.


Hey Alan, thanks for your thoughtful and nuanced views.  I agree  with
everything you've said, but have a few additional points.

At the risk of wading into a thread that has grown far too long, and
echoing Eric's comments that the idea of governance is murky at best
when there is no provision for enforceability, I have a few comments.
Full disclosure: Travis has asked me and I have agreed to serve on
a board for numfocus, the not-for-profit arm of his efforts to
promote numpy and related tools.  Although I have no special numpy
developer chops, as the original author of matplotlib, which is one of
the leading numpy clients, he asked me to join his organization as a
community representative.  I support his efforts, and so agreed to
join the numfocus board.

My first and most important point is that the subtext of many postings here
about the fear of undue and inappropriate influence of Continuum under
Travis' leadership is far overblown.  Travis created numpy -- it is
his baby.  Undeniably, he created it by standing on the shoulders of
giants: Jim Hugunin, Paul Dubois, Perry Greenfield and his team, and
many others.  But the idea that we need to guard against the
possibility that his corporate interests will compromise his interests
in what is best for numpy is academic at best.

As someone who has created a significant project in the realm of
scientific computing in Python, I can tell you that it is something
I take quite a bit of pride in and it is very important to me that the
project thrives as it was intended to: as a free, open-source,
best-practice way of doing science.  I know Travis well enough to know
he feels the same way -- numpy doing well is *at least* as important to
him as his company doing well.  All of his recent actions to start a
company and foundation which focuses resources on numpy and related
tools reinforce that view.  If he had a different temperament, he
wouldn't have devoted five to ten years of his life to Numeric, scipy
and numpy.  He is a BDFL for a reason: he has earned our trust.

And he has proven his ability to lead when *almost everyone* was
against him.  At the height of the Numeric/numarray split, and I was
deeply involved in this as the mpl author because we had a numerix
compatibility layer to allow users to use one or the other, Travis
proposed writing numpy to solve both camp's problems.  I really can't
remember a single individual who supported him.  What I remember is
the cacophony of voices who though this was a bad idea, because of the
third fork problem.  But Travis forged ahead, on his own, wrote
numpy, and re-united the Numeric and numarray camps.  And
all-the-while he maintained his friendship with the numarray
developers (Perry Greenfield who led the numarray development effort
has also been invited by Travis to the numfocus board, as has Fernando
Perez and Jarrod Millman).  Although MPL at the time agreed to support
a third version in its numerix compatibility layer for numpy, I can
thankfully say we have since dropped support for the compatibility
layer entirely as we all use numpy now.  This to me is the distilled
essence of leadership, against the voices of the masses, and it bears
remembering.

I have two more points I want to make: one is on democracy, and one is
on corporate control.  On corporate control: there have been a number
of posts in this thread about the worries and dangers that Continuum
poses as the corporate sponsor of numpy development, about how this
may cause numpy to shift from a model of a few loosely connected,
decentralized cadre of volunteers to a centrally controlled steering
committee of programmers who are controlled by corporate headquarters
and who make all their decisions around the water cooler unobserved by
the community of users.

I want to make a connection to something that happened in the history
of matplotlib development, something that is not strictly analogous
but I think close enough to be informative.  Sometime around 2005,
Perry Greenfield, who heads the development team of the Space
Telescope Science Institute (STScI) that is charged with processing
the Hubble image pipeline, emailed me that he was considering using
matplotlib as their primary image visualization tool.  I can't tell
you how excited I was at the time.  The idea of having institutional
sponsorship from someone as prestigious and resourceful as STScI was
hugely motivating.  I worked feverishly for months to add stuff they
needed: better rendering, better image support, mathtext and lots
more.  But more importantly, Perry was offering to bring institutional
support to my project: well qualified full-time employees who
dedicated a significant part of their time to matplotlib
development. He 

Re: [Numpy-discussion] wanted: decent matplotlib alternative

2011-10-13 Thread John Hunter




On Oct 13, 2011, at 4:21 PM, Zachary Pincus zachary.pin...@yale.edu wrote:

 I keep meaning to use matplotlib as well, but every time I try I also get 
 really turned off by the matlabish interface in the examples. I get that it's 
 a selling point for matlab refugees, but I find it counterintuitive in the 
 same way Christoph seems to.
 
 I'm glad to hear the OO interface isn't as clunky as it looks on some of the 
 doc pages, though. This is good news. Can anyone point out any good 
 tutorials/docs on using matplotlib idiomatically via its OO interface?
 
 

I would start with these examples

http://matplotlib.sourceforge.net/examples/api/index.html

These examples use pyplot only for figure generation, mostly because this is 
the easiest way to get a Figure instance correctly wired across user interface 
toolkits, but use the API for everything else. 

And this tutorial, which explains the central object hierarchy:

http://matplotlib.sourceforge.net/users/artists.html

For a deeper dive, these tutorials may be of interest too:

http://matplotlib.sourceforge.net/users/transforms_tutorial.html

http://matplotlib.sourceforge.net/users/path_tutorial.html


http://matplotlib.sourceforge.net/users/event_handling.html
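
As a minimal sketch of the style those examples use -- pyplot only to
create the Figure, the object API for everything else::

    import numpy as np
    import matplotlib.pyplot as plt

    fig = plt.figure()
    ax = fig.add_subplot(111)           # an Axes instance
    t = np.linspace(0, 2 * np.pi, 200)
    line, = ax.plot(t, np.sin(t))       # a Line2D instance
    ax.set_xlabel('t')
    ax.set_ylabel('sin(t)')
    plt.show()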


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] segfault on complex array on solaris x86

2011-08-17 Thread John Hunter
On Wed, Apr 13, 2011 at 8:50 AM, John Hunter jdh2...@gmail.com wrote:
 On Sat, Jan 15, 2011 at 7:28 AM, Ralf Gommers
 ralf.gomm...@googlemail.com wrote:
 I've opened http://projects.scipy.org/numpy/ticket/1713 so this doesn't get
 lost.

 Just wanted to bump this -- bug still exists in numpy HEAD 2.0.0.dev-fe3852f

Just wanted to mention that this segfault still exists in
2.0.0.dev-4386275 and I updated the ticket at

http://projects.scipy.org/numpy/ticket/1713

with a much simpler test script. Basically::

  import numpy as np
  xn = np.exp(2j)

is causing a segfault on my solaris platform
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] segfault on complex array on solaris x86

2011-04-13 Thread John Hunter
On Sat, Jan 15, 2011 at 7:28 AM, Ralf Gommers
ralf.gomm...@googlemail.com wrote:
 I've opened http://projects.scipy.org/numpy/ticket/1713 so this doesn't get
 lost.

Just wanted to bump this -- bug still exists in numpy HEAD 2.0.0.dev-fe3852f
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] segfault on complex array on solaris x86

2011-01-05 Thread John Hunter
jo...@udesktop253:~ gcc --version
gcc (GCC) 3.4.3 (csl-sol210-3_4-branch+sol_rpath)
Copyright (C) 2004 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

jo...@udesktop253:~ uname -a
SunOS udesktop253 5.10 Generic_142910-17 i86pc i386 i86pc

jo...@udesktop253:~ cat test.py
import numpy as np
print np.__version__
fs = 1000
t = np.linspace(0, 0.3, 301)
A = np.array([2, 8]).reshape(-1, 1)
f = np.array([150, 140]).reshape(-1, 1)
xn = (A * np.exp(2j * np.pi * f * t)).sum(axis=0)

jo...@udesktop253:~ python test.py
2.0.0.dev-9451260
Segmentation Fault (core dumped)
jo...@udesktop253:~

jo...@udesktop253:~ sudo pstack /var/core/core.python.957
core '/var/core/core.python.957' of 9397:   python test.py
 febf1928 cexp (0, 0, 0, 0, 8060ab0, 84321ac) + 1b0
 fe9657e0 npy_cexp (80458e0, 0, 0, 0, 0, 84e2530) + 30
 fe95064f nc_exp   (8045920, 84e72a0, 8045978, 8045920, 10, 10) + 3f
 fe937d5b PyUFunc_D_D (84e2530, 84e20f4, 84e25b0, fe950610, 1, 0) + 5b
 fe95e818 PyUFunc_GenericFunction (81e96e0, 807deac, 0, 80460b8, 2, 2) + 448
 fe95fb10 ufunc_generic_call (81e96e0, 807deac, 0, fe98a820) + 70
 feeb2d78 PyObject_Call (81e96e0, 807deac, 0, 80a24ec, 8061c08, 0) + 28
 fef11900 PyEval_EvalFrame (80a2394, 81645a0, 8079824, 8079824) + 146c
 fef17708 PyEval_EvalCodeEx (81645a0, 8079824, 8079824, 0, 0, 0) + 620
 fef178af PyEval_EvalCode (81645a0, 8079824, 8079824, 8061488, fef3d9ee, 0)
+ 2f
 fef3d095 PyRun_FileExFlags (feb91c98, 804687b, 101, 8079824, 8079824, 1) +
75
 fef3d9ee PyRun_SimpleFileExFlags (feb91c98, 804687b, 1, 80465a8, fef454a1,
804687b) + 172
 fef3e4fd PyRun_AnyFileExFlags (feb91c98, 804687b, 1, 80465a8) + 61
 fef454a1 Py_Main  (1, 80466b8, feb1cf35, fea935a1, 29, feb96750) + 9d9
 08050862 main (2, 80466b8, 80466c4) + 22
 08050758 _start   (2, 8046874, 804687b, 0, 8046883, 80468ad) + 60
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] recarray to csv

2010-09-03 Thread John Hunter
2010/9/3 Guillaume Chérel guillaume.c.che...@gmail.com:
  Great, Thank you. I also found out about csv2rec. I've been missing
 these two a lot.

Some other handy rec functions in mlab

http://matplotlib.sourceforge.net/examples/misc/rec_groupby_demo.html
http://matplotlib.sourceforge.net/examples/misc/rec_join_demo.html
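
For instance, a minimal sketch of rec_groupby on toy data (the data
and field names are mine; the stats argument is a sequence of
(attr, func, outname) tuples)::

    import numpy as np
    import matplotlib.mlab as mlab

    r = np.rec.fromrecords([(1, 2.0), (1, 4.0), (2, 6.0)], names='key,val')
    # group rows by 'key', reducing 'val' with np.mean into 'val_mean'
    summary = mlab.rec_groupby(r, ('key',), (('val', np.mean, 'val_mean'),))
    print summary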

JDH
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] recarray to csv

2010-09-03 Thread John Hunter
On Fri, Sep 3, 2010 at 8:50 AM, Benjamin Root ben.r...@ou.edu wrote:

 Why is this function in matplotlib?  Wouldn't it be more useful in numpy?

I tend to add stuff I write to matplotlib.  mlab was initially a
repository of matlab-like functions that were not available in numpy
(load, save, linspace, psd, cohere, polyfit, polyval, prctile, ...).
I've always encouraged numpy developers to harvest what they want and
move them into numpy, and many of these functions have been moved.
Once they make it into stable numpy, we deprecate them and eventually
remove them from mlab.  Many of the rec functions have been ported to
numpy in numpy.lib.recfunctions.  There are some differences,
particularly in csv2rec (mpl handles date parsing), and I rely
heavily on all these functions, so I have not ported all my code to
use numpy yet.  We should start the process of deprecating the ones
that have been ported and have API and functional compatibility.

JDH
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] save data to csv with column names

2010-08-16 Thread John Hunter
2010/8/16 Guillaume Chérel guillaume.c.che...@gmail.com:
 Hello,

 I'd like to know if there is an easy way to save a list of 1D arrays to a
 csv file, with the first line of the file being the column names.

 I found the following, but I can't get to save the column names:

 data = rec.array([X1,X2,X3,X4], names=['n1','n2','n3','n4'])
 savetxt(filename, data, delimiter=',', fmt=['%i','%d','%f','%f'])

import matplotlib.mlab as mlab
mlab.rec2csv(data, 'myfile.csv')
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] [SciPy-Dev] Good-bye, sort of

2010-08-13 Thread John Hunter
On Fri, Aug 13, 2010 at 11:47 AM, David Goldsmith
d.l.goldsm...@gmail.com wrote:
 2010/7/30 Stéfan van der Walt ste...@sun.ac.za

 Hi David

 Best of luck with your new position! I hope they don't make you program
 too much MATLAB!

 After several years now of writing Python and now having written my first
 on-the-job 15 operational MATLAB LOC, all of which are string, cell array,
 and file processing, I'm ready to say: MATLAB: what a PITA! :-(

Ahh, cell arrays, they bring back memories.  Makes you pine for a
dictionary, no?

JDH
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] [SciPy-Dev] Good-bye, sort of (John Hunter)

2010-08-13 Thread John Hunter
On Fri, Aug 13, 2010 at 3:41 PM, Benjamin Root ben.r...@ou.edu wrote:
 @Josh: Awesome name.  Very fitting...

 Another thing that I really love about matplotlib that drove me nuts in
 Matlab was being unable to use multiple colormaps in the same figure.

Funny -- this was one of the *first* things I thought about when
writing mpl.  That limitation drove me nuts too.

And while we're dissing matlab, the one-function-per-file development
process that matlab seems to inspire (I did it too) is hard to fathom
in retrospect.

http://github.com/jesusabdullah/methlabs/blob/master/modules/extra/elemsof.m

JDH
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] isinf raises in inf

2010-07-15 Thread John Hunter
I am seeing a problem on Solaris since I upgraded to svn HEAD.
np.isinf does not handle np.inf.  See ipython session below.  I am not
seeing this problem w/ HEAD on an ubuntu linux box I tested on

In [1]: import numpy as np

In [2]: np.__version__
Out[2]: '2.0.0.dev8480'

In [3]: x = np.inf
np.infnp.info   np.infty

In [3]: x = np.inf

In [4]: np.isinf(x)
Warning: invalid value encountered in isinf
Out[4]: True

In [5]: np.seter
np.seterr  np.seterrcall  np.seterrobj

In [5]: np.seterr(all='raise')
Out[5]: {'over': 'print', 'divide': 'print', 'invalid': 'print',
'under': 'ignore'}

In [6]: np.isinf(x)
---
FloatingPointErrorTraceback (most recent call last)

/home/titan/johnh/<ipython console>

FloatingPointError: invalid value encountered in isinf

In [7]: !uname -a
SunOS udesktop191 5.10 Generic_139556-08 i86pc i386 i86pc

In [43]: !gcc --version
gcc (GCC) 3.4.3 (csl-sol210-3_4-branch+sol_rpath)
Copyright (C) 2004 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.


Posted on tracker:

http://projects.scipy.org/numpy/ticket/1547
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] isinf raises in inf

2010-07-15 Thread John Hunter
On Thu, Jul 15, 2010 at 6:14 PM, Eric Firing efir...@hawaii.edu wrote:
 Is it certain that the Solaris compiler lacks isinf?  Is it possible
 that it has it, but it is not being detected?

Just to clarify, I'm not using the sun compiler, but gcc-3.4.3 on solaris x86
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] isinf raises in inf

2010-07-15 Thread John Hunter
On Thu, Jul 15, 2010 at 7:11 PM, John Hunter jdh2...@gmail.com wrote:
 On Thu, Jul 15, 2010 at 6:14 PM, Eric Firing efir...@hawaii.edu wrote:
 Is it certain that the Solaris compiler lacks isinf?  Is it possible
 that it has it, but it is not being detected?

 Just to clarify, I'm not using the sun compiler, but gcc-3.4.3 on solaris x86

Correction: the version of gcc I compiled numpy with is different than
the one in my default path.  The version I compiled numpy with is

   /opt/app/g++lib6/gcc-4.2/bin/gcc --version
  gcc (GCC) 4.2.2

running on solaris 5.10

JDH
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] isinf raises in inf

2010-07-15 Thread John Hunter
On Thu, Jul 15, 2010 at 7:27 PM, Charles R Harris
charlesr.har...@gmail.com wrote:


 On Thu, Jul 15, 2010 at 6:11 PM, John Hunter jdh2...@gmail.com wrote:

 On Thu, Jul 15, 2010 at 6:14 PM, Eric Firing efir...@hawaii.edu wrote:
  Is it certain that the Solaris compiler lacks isinf?  Is it possible
  that it has it, but it is not being detected?

 Just to clarify, I'm not using the sun compiler, but gcc-3.4.3 on solaris
 x86

 Might be related to this thread.  What version of numpy are you using?

svn HEAD (2.0.0.dev8480)

After reading the thread you suggested, I tried forcing the

  CFLAGS=-DNPY_HAVE_DECL_ISFINITE

flag to be set, but this is apparently a bad idea for my platform...

  File "/home/titan/johnh/dev/lib/python2.4/site-packages/numpy/core/__init__.py",
line 5, in ?
line 5, in ?
import multiarray
ImportError: ld.so.1: python: fatal: relocation error: file
/home/titan/johnh/dev/lib/python2.4/site-packages/numpy/core/multiarray.so:
symbol isfinite: referenced symbol not found

so while I think my bug is related to that thread, I don't see
anything in that thread to help me fix my problem.  Or am I missing
something?
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] cannot set dtype on record array with |O4 datetime records

2010-07-14 Thread John Hunter
I use record arrays extensively with python datetimes, which works if
you pass in a list of lists of data with the names.  numpy can
accurately infer the dtypes and create a usable record array.  Eg,

import datetime
import numpy as np
rows = [ [datetime.date(2001,1,1), 12, 23.],
 [datetime.date(2002,1,1), 10, 13.],
 [datetime.date(2003,1,1), -2, 1.],
 ]

r1 = np.rec.fromrecords(rows, names='a,b,c')

print r1.dtype

prints out: [('a', '|O4'), ('b', 'i4'), ('c', 'f8')]

but if I want to speed things up by providing the dtype, numpy raises
ValueError:

dtype = [('a', '|O4'), ('b', 'i4'), ('c', 'f8')]

r2 = np.rec.fromrecords(rows, dtype=dtype)

/home/titan/johnh/test.py
 12 dtype = [('a', '|O4'), ('b', 'i4'), ('c', 'f8')]
 13
---> 14 r2 = np.rec.fromrecords(rows, dtype=dtype)
 15
 16

/home/titan/johnh/dev/lib/python2.4/site-packages/numpy/core/records.pyc
in fromrecords(recList, dtype, shape, formats, names, titles, aligned,
byteorder)
610
611 try:
--> 612 retval = sb.array(recList, dtype=descr)
613 except TypeError:  # list of lists instead of list of tuples
614 if (shape is None or shape == 0):

ValueError: Setting void-array with object members using buffer.
WARNING: Failure executing file: test.py


Running from svn HEAD:

In [2]: numpy.__version__
Out[2]: '2.0.0.dev8480'


Here is a complete script::

import datetime
import numpy as np
rows = [ [datetime.date(2001,1,1), 12, 23.],
 [datetime.date(2002,1,1), 10, 13.],
 [datetime.date(2003,1,1), -2, 1.],
 ]

r1 = np.rec.fromrecords(rows, names='a,b,c')

print r1.dtype

dtype = [('a', '|O4'), ('b', 'i4'), ('c', 'f8')]

r2 = np.rec.fromrecords(rows, dtype=dtype)

I filed a ticket on the tracker:

  http://projects.scipy.org/numpy/ticket/1544

Thanks!
JDH
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy, matplotlib and masked arrays

2010-07-09 Thread John Hunter
On Fri, Jul 9, 2010 at 8:03 PM, Peter Isaac peter.is...@monash.edu wrote:
 Note that EPD-6.2-2 works fine with this script on WinXP.

 Any suggestions welcome

then just use epd-6.2.2 on winxp.

your-mpl-developer-channeling-steve-jobs,
JDH

<wink>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] PSF GSoC 2010 (Py3K focus)

2010-03-09 Thread John Hunter
On Mon, Mar 8, 2010 at 11:39 PM, Charles R Harris
charlesr.har...@gmail.com wrote:

 - port matplotlib to Py3K

We'd be happy to mentor a project here.  To my knowledge, nothing has
been done, other than upgrade to CXX6 (our C++ extension lib).  Most,
but not all, of our extension code is exposed through CXX, which as of
v6 is python3 compliant so that should help.  But I suspect there is
enough work to justify a GSOC project on the mpl side.

JDH
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Emulate left outer join?

2010-02-10 Thread John Hunter
On Tue, Feb 9, 2010 at 7:53 PM, Pierre GM pgmdevl...@gmail.com wrote:
 On Feb 9, 2010, at 8:16 PM, John Hunter wrote:

 and have totxt, tocsv, etc... from rec2txt, rec2csv, etc...   I
 think the functionality of mlab.rec_summarize and rec_groupby is very
 useful, but the interface is a bit clunky and could be made easier for
 the common use cases.

 Are you going to work on it or should I step in (in a few weeks...).

I don't think I'll have time to do it -- I'm already behind on an mpl
release --  but I'll propose it to Sameer who has done a lot of work
on the  matplotlib.mlab.rec_* methods and see if he has some time for
it.

Thanks,
JDH
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Emulate left outer join?

2010-02-10 Thread John Hunter
On Wed, Feb 10, 2010 at 8:54 AM, John Hunter jdh2...@gmail.com wrote:
 On Tue, Feb 9, 2010 at 7:53 PM, Pierre GM pgmdevl...@gmail.com wrote:
 On Feb 9, 2010, at 8:16 PM, John Hunter wrote:

 and have totxt, tocsv, etc... from rec2txt, rec2csv, etc...   I
 think the functionality of mlab.rec_summarize and rec_groupby is very
 useful, but the interface is a bit clunky and could be made easier for
 the common use cases.

 Are you going to work on it or should I step in (in a few weeks...).

 I don't think I'll have time to do it -- I'm already behind on an mpl
 release --  but I'll propose it to Sameer who has done a lot of work
 on the  matplotlib.mlab.rec_* methods and see if he has some time for
 it.

Sameer is interested in helping with this, but will also not be able
to get to it for a couple of weeks.

JDH
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Emulate left outer join?

2010-02-09 Thread John Hunter
On Tue, Feb 9, 2010 at 4:43 PM, Fernando Perez fperez@gmail.com wrote:
 On Tue, Feb 9, 2010 at 5:02 PM, Robert Kern robert.k...@gmail.com wrote:

 numpy.lib.recfunctions.join_by(key, r1, r2, jointype='leftouter')


 And if that isn't sufficient, John has in matplotlib.mlab a few other
 similar utilities that allow for more complex cases:

The numpy.lib.recfunctions were ported from matplotlib.mlab so most of
the functionality is overlapping, but we have added some stuff since
the port, eg matplotlib.mlab.recs_join for a multiway join, and some
stuff was never ported (rec_summarize, rec_groupby) so it may be worth
looking in mlab too.  Some of the stuff for mpl is only in svn but
most of it is released.

Examples are at

  http://matplotlib.sourceforge.net/examples/misc/rec_join_demo.html
  http://matplotlib.sourceforge.net/examples/misc/rec_groupby_demo.html

JDH
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Emulate left outer join?

2010-02-09 Thread John Hunter
On Tue, Feb 9, 2010 at 7:02 PM, Pierre GM pgmdevl...@gmail.com wrote:
 On Feb 9, 2010, at 7:54 PM, Pauli Virtanen wrote:

 But, should we make these functions available under some less
 internal-ish namespace? There's numpy.rec at the least -- it could be
 made a real module to pull in things from core and lib.

 I still think these functions are more generic than the rec_ prefix let 
 think, and I'd still prefer a decision being made about what should go in the 
 module before thinking too hard about how to advertise it.

I would love to see many of these as methods of record/structured
arrays, so we could say

  r = r1.join('date', r2)

or

  rs = r.groupby( ('year', 'month'), stats)

and have totxt, tocsv, etc... from rec2txt, rec2csv, etc...   I
think the functionality of mlab.rec_summarize and rec_groupby is very
useful, but the interface is a bit clunky and could be made easier for
the common use cases.

These methods could call the proper functions from np.lib.recfunctions
or wherever, and they would get a lot more visibility to people using
introspection.
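
For comparison, a minimal sketch of the functional spelling of that
join using the join_by Robert pointed at (the toy data is mine)::

    import numpy as np
    from numpy.lib import recfunctions as rfn

    r1 = np.array([(1, 10.), (2, 20.)], dtype=[('date', 'i4'), ('a', 'f8')])
    r2 = np.array([(1, 5.), (3, 7.)], dtype=[('date', 'i4'), ('b', 'f8')])

    # keep every row of r1; rows with no match in r2 get masked 'b' values
    joined = rfn.join_by('date', r1, r2, jointype='leftouter')
    print joined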

JDH
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] ANN: job opening at Tradelink

2009-12-14 Thread John Hunter
We are looking to hire a quantitative researcher to help research and
develop trading ideas, and to develop and support infrastructure to
put these trading strategies into production.  We are looking for
someone who is bright and curious with a quantitative background and a
strong interest in writing good code and building systems that work.
Experience with probability, statistics and time series is required,
and experience working with real world data is a definite plus.  We do
not require a financial background, but are looking for someone with
an enthusiasm to dive into this industry and learn a lot.  We do most
of our data modeling and production software in python and R.  We have
a lot of ideas to test and hopefully put into production, and you'll
be working with a fast paced and friendly small team of traders,
programmers and quantitative researchers.


Applying:

  Please submit a resume and cover letter to qsj...@trdlnk.com.  In
  your cover letter, please address how your background, experience
  and skills will fit into the position described above.  We are
  looking for a full-time, on-site candidate only.

About Us:


  TradeLink Holdings LLC is a diversified alternative investment,
  trading and software firm. Headquartered in Chicago, TradeLink
  Holdings LLC includes a number of closely related entities. Since
  its organization in 1979, TradeLink has been actively engaged in the
  securities, futures, options, and commodities trading
  industries. Engaged in the option arbitrage business since 1983,
  TradeLink has a floor trading and/or electronic trading interface in
  commodity options, financial futures and options, and currency
  futures and options at all major U.S. exchanges. TradeLink is
  involved in various market-making programs in many different
  exchanges around the world, including over-the-counter derivatives
  markets. http://www.tradelinkllc.com
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in rec.fromarrays ; plus one other possible bug

2009-11-25 Thread John Hunter
On Wed, Nov 25, 2009 at 8:48 AM, Dan Yamins dyam...@gmail.com wrote:

 Am I just not supposed to be working with length-0 string columns, period?

But why would you want to?  Array dtypes are immutable, so you are
saying: "I want this field to be only empty strings, now and forever."
So you can't initialize them to be empty and then fill them later.  If
by construction it is always empty, why have it at all?
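
A small illustration of the fixed-width point, with a nonzero width so
the truncation is visible (toy data, mine)::

    import numpy as np

    a = np.zeros(2, dtype=[('name', 'S5'), ('x', 'i4')])
    a['name'] = 'hello world'   # silently truncated to the declared width
    print a['name']             # ['hello' 'hello']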


Then again, numpy allows me to create empty arrays x = np.array([])



JDH
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] matplotlib is breaking numpy

2009-11-19 Thread John Hunter





On Nov 19, 2009, at 12:35 PM, Mathew Yeates mat.yea...@gmail.com  
wrote:



Yeah, I tried that.

Here's what I'm doing. I have an application which displays
different datasets which a user selects from a drop-down list. I want
to overwrite the existing plot with a new one. I've tried deleting
just about everything to get matplotlib to let go of my data!



What is "everything"?  Are you using pyplot or are you embedding mpl in
a GUI?  If the latter, are you deleting the FigureCanvas?  You will
also need to call gc.collect after deleting the mpl objects because we
use a lot of circular references. Pyplot close does this
automatically, but this does not apply to embedding.


How are you running your app?  From the shell or IPython?
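
For the embedding case, a rough sketch of the cleanup sequence (the
GTK container and widget calls here are illustrative, not taken from
your app)::

    import gc

    def replace_plot(vbox, canvas, fig):
        fig.clf()            # drop the artists holding references to the data
        vbox.remove(canvas)  # detach the FigureCanvas from the GUI
        canvas.destroy()
        gc.collect()         # break mpl's circular references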





Mathew

On Thu, Nov 19, 2009 at 10:30 AM, John Hunter jdh2...@gmail.com  
wrote:





On Nov 19, 2009, at 11:57 AM, Robert Kern robert.k...@gmail.com  
wrote:


 On Thu, Nov 19, 2009 at 11:52, Mathew Yeates mat.yea...@gmail.com
 wrote:
 There is definitely something wrong with matplotlib/numpy. Consider
 the
 following
 from numpy import *
 mydata=memmap('map.dat',dtype=float64,mode='w+',shape=56566500)
 del mydata

 I can now remove the file map.dat with (from the command line) $rm
 map.dat

 However
 If I plot  mydata before the line
 del mydata


 I can't get rid of the file until I exit python!!
 Does matplotlib keep a reference to the data?

 Almost certainly.

 How can I remove this
 reference?

 Probably by deleting the plot objects that were created and close  
all
 matplotlib windows referencing the data. If you are using IPython,  
you
 should know that many of the returned objects are kept in Out, so  
you

 will need to clear that. There might be some more places internal to
 matplotlib, I don't know.


Closing the figure window containing the data *should* be enough. In
pylab/pyplot, this also triggers a call to gc.collect.




 With some care, you can use gc.get_referrers() to find the objects
 that are holding direct references to your memmap.

 --
 Robert Kern

 I have come to believe that the whole world is an enigma, a  
harmless
 enigma that is made terrible by our own mad attempt to interpret  
it as

 though it had an underlying truth.
  -- Umberto Eco
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] matplotlib is breaking numpy

2009-11-19 Thread John Hunter





On Nov 19, 2009, at 12:53 PM, Mathew Yeates mat.yea...@gmail.com  
wrote:


I am running my gtk app from python. I am deleting the canvas and  
running gc.collect(). I still seem to have a reference to my  
memmapped data.


Any other hints?


Gtk app from the standard python shell?

Are you using the mpl toolbar?  It keeps a ref to the canvas. If you  
can create a small freestanding example, that would help.






-Mathew

On Thu, Nov 19, 2009 at 10:42 AM, John Hunter jdh2...@gmail.com  
wrote:





On Nov 19, 2009, at 12:35 PM, Mathew Yeates mat.yea...@gmail.com  
wrote:



Yeah, I tried that.

Here's what I'm doing. I have an application which displays
different datasets which a user selects from a drop-down list. I
want to overwrite the existing plot with a new one. I've tried
deleting just about everything to get matplotlib to let go of my data!



What is "everything"?  Are you using pyplot or are you embedding mpl
in a GUI?  If the latter, are you deleting the FigureCanvas?  You
will also need to call gc.collect after deleting the mpl objects
because we use a lot of circular references. Pyplot close does this
automatically, but this does not apply to embedding.


How are you running your app?  From the shell or IPython?





Mathew

On Thu, Nov 19, 2009 at 10:30 AM, John Hunter jdh2...@gmail.com  
wrote:





On Nov 19, 2009, at 11:57 AM, Robert Kern robert.k...@gmail.com  
wrote:


 On Thu, Nov 19, 2009 at 11:52, Mathew Yeates mat.yea...@gmail.com
 wrote:
 There is definitely something wrong with matplotlib/numpy.  
Consider

 the
 following
 from numpy import *
 mydata=memmap('map.dat',dtype=float64,mode='w+',shape=56566500)
 del mydata

 I can now remove the file map.dat with (from the command line) $rm
 map.dat

 However
 If I plot  mydata before the line
 del mydata


 I can't get rid of the file until I exit python!!
 Does matplotlib keep a reference to the data?

 Almost certainly.

 How can I remove this
 reference?

 Probably by deleting the plot objects that were created and close  
all
 matplotlib windows referencing the data. If you are using  
IPython, you
 should know that many of the returned objects are kept in Out, so  
you
 will need to clear that. There might be some more places internal  
to

 matplotlib, I don't know.


Closing the figure window containing the data *should* be enough. In
pylab/pyplot, this also triggers a call to gc.collect.




 With some care, you can use gc.get_referrers() to find the objects
 that are holding direct references to your memmap.

 --
 Robert Kern

 I have come to believe that the whole world is an enigma, a  
harmless
 enigma that is made terrible by our own mad attempt to interpret  
it as

 though it had an underlying truth.
  -- Umberto Eco
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] masked index surprise

2009-08-14 Thread John Hunter
I just tracked down a subtle bug in my code, which is equivalent to


In [64]: x, y = np.random.rand(2, n)

In [65]: z = np.zeros_like(x)

In [66]: mask = x > 0.5

In [67]: z[mask] = x/y



I meant to write

  z[mask] = x[mask]/y[mask]

so I can fix my code, but why is line 67 allowed

  In [68]: z[mask].shape
  Out[68]: (54,)

  In [69]: (x/y).shape
  Out[69]: (100,)

it seems like broadcasting would fail


In [70]: np.__version__
Out[70]: '1.4.0.dev7153'

In [71]:
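
A self-contained version of the corrected assignment (a sketch; n is
arbitrary)::

    import numpy as np

    n = 100
    x, y = np.random.rand(2, n)
    z = np.zeros_like(x)
    mask = x > 0.5
    z[mask] = x[mask] / y[mask]   # RHS has exactly mask.sum() elements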
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] adaptive sampling of an interval or plane

2009-08-13 Thread John Hunter
On Wed, Aug 12, 2009 at 6:28 AM, John Hunter jdh2...@gmail.com wrote:
 We would like to add function plotting to mpl, but to do this right we
 need to be able to adaptively sample a function evaluated over an
 interval so that some tolerance condition is satisfied, perhaps with
 both a relative and absolute error tolerance condition.  I am a bit
 out of my area of competency here, eg I do not know exactly how the
 tolerance condition should be specified, but I suspect some of you
 here may be experts on this.  Does anyone have some code compatible
 with the BSD license, preferably based on numpy but we would consider
 an extension code or scipy solution, for doing this?

 The functionality we have in mind is provided in matlab with fplot

  http://www.mathworks.com/access/helpdesk/help/techdoc/index.html?/access/helpdesk/help/techdoc/ref/fplot.html

 We would like 1D and 2D versions of this ideally.  If anyone has some
 suggestions, let me know.

Denis Bzowy has replied to me off list with some adaptive spline
approximation code he is working on.  He has documentation and code
for the 1D case, and is preparing for the 2D case, and is seeking
feedback.  He's having trouble posting to the list, and asked me to
forward this, so please make sure his email, included in this post, is
in any replies

http://drop.io/denis_adaspline1
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] adaptive sampling of an interval or plane

2009-08-12 Thread John Hunter
We would like to add function plotting to mpl, but to do this right we
need to be able to adaptively sample a function evaluated over an
interval so that some tolerance condition is satisfied, perhaps with
both a relative and absolute error tolerance condition.  I am a bit
out of my area of competency here, eg I do not know exactly how the
tolerance condition should be specified, but I suspect some of you
here may be experts on this.  Does anyone have some code compatible
with the BSD license, preferably based on numpy but we would consider
an extension code or scipy solution, for doing this?

The functionality we have in mind is provided in matlab with fplot

  
http://www.mathworks.com/access/helpdesk/help/techdoc/index.html?/access/helpdesk/help/techdoc/ref/fplot.html

We would like 1D and 2D versions of this ideally.  If anyone has some
suggestions, let me know.
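
For concreteness, a minimal sketch of the 1D case by recursive
bisection (the tolerance handling is naive -- exactly the part we need
expertise on)::

    import numpy as np

    def adaptive_sample(f, a, b, tol=1e-3, depth=12):
        # refine [a, b] wherever the midpoint deviates from the linear
        # interpolant by more than the absolute tolerance tol
        m = 0.5 * (a + b)
        fa, fm, fb = f(a), f(m), f(b)
        if depth <= 0 or abs(fm - 0.5 * (fa + fb)) < tol:
            return [a, m], [fa, fm]
        xs1, ys1 = adaptive_sample(f, a, m, tol, depth - 1)
        xs2, ys2 = adaptive_sample(f, m, b, tol, depth - 1)
        return xs1 + xs2, ys1 + ys2

    xs, ys = adaptive_sample(np.sin, 0.0, 2 * np.pi)
    xs.append(2 * np.pi)          # close the right endpoint
    ys.append(np.sin(2 * np.pi))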

Thanks,
JDH
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] yubnub and numpy examples

2009-08-05 Thread John Hunter
yubnub is pretty cool -- it's a command line interface for the web.
You can enable it in firefox by typing about:config in the URL bar,
scrolling down to keyword.URL, right click on the line and choose
modify, and set the value to be

http://www.yubnub.org/parser/parse?default=g2&command=

Then, you can type yubnub commands in the URL bar, eg, to see all
commands related to python, type ls python in the URL bar.

It's easy to create new commands; I just created a new command to load
the docs for a numpy function; just type in the URL bar:

  npfunc convolve

which takes you directly to
http://docs.scipy.org/doc/numpy/reference/generated/numpy.convolve.html

I was hoping to create a similar command for the numpy examples, but
the URL links in http://www.scipy.org/Numpy_Example_List_With_Doc are
some md5 gobbledy-gook.  Is it possible to have nice URLs on this
page, so they can be more readily yubnub-ized?

JDH
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] binary builds against older numpys

2009-05-20 Thread John Hunter
We are trying to build and test mpl installers for python2.4, 2.5 and
2.6.  What we are finding is that if we build mpl against a more
recent numpy than the installed numpy on a test machine, the import of
mpl extension modules which depend on numpy triggers a segfault.

Eg, on python2.5 and python2.6, we build the mpl installers against
the numpy-1.3.0-win32.superpack installation, and if I test the
installer on a python2.5 machine with numpy-1.2.1-win32.superpack
installed, I get the segfault.  If I install
numpy-1.3.0-win32.superpack on the test machine, then the mpl binaries
work fine.

Is there a known binary incompatibility between 1.2.1 and 1.3.0?  One
solution we may consider is building our 2.5 binaries against 1.2.1
and seeing if they work with both 1.2.1 and 1.3.0 installations, but
wanted to check in here to see if there were known issues or solutions
we should be considering.

Our test installers are at http://drop.io/rlck8ph if you are interested.

Thanks,
JDH
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] binary builds against older numpys

2009-05-20 Thread John Hunter
2009/5/20 Stéfan van der Walt ste...@sun.ac.za:

 David Cournapeau also put a check in place so that the NumPy build
 will break if we forget to update the API version again.

 So, while we can't change the releases of NumPy out there already, we
 can at least ensure that this won't happen again.

OK, great -- thanks for the info.  From reading David's comments in the
earlier thread:


  David - Backward compatibility means that you can build something against
  David  numpy version M, later update numpy to version N > M, and it
still works.
  David  numpy 1.3.0 is backward compatible with 1.2.1

it looks like our best bet will be to build our python2.4 and
python2.5 binaries against 1.2.1 and our python2.6 binaries against
1.3.0 (since there are no older python2.6 numpy builds on the sf site
anyhow).  I'll post on the mpl list and site that anyone using the new
mpl installers needs to be on numpy 1.2.1 or later.

JDH
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] numpy save files from C

2009-02-26 Thread John Hunter
A colleague of mine has a bunch of numpy arrays saved with np.save and
he now wants to access them directly in C, with or w/o the numpy C API
doesn't matter.  Does anyone have any sample code lying around which
he can borrow from?  The array is a structured array with an otherwise
plain vanilla dtype (ints and floats).

I've referred him to the npy-format NEP document, as well as the
format.py implementation, so he can roll his own if need be, but if
someone has a head start code example that would be great.
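
As a starting point, the header his C code must parse can be inspected
with numpy's own format module (a sketch; the file name is made up)::

    import numpy as np
    from numpy.lib import format as npy_format

    np.save('data.npy', np.zeros(3, dtype=[('a', 'i4'), ('b', 'f8')]))
    fh = open('data.npy', 'rb')
    version = npy_format.read_magic(fh)   # (major, minor)
    shape, fortran, dtype = npy_format.read_array_header_1_0(fh)
    print version, shape, fortran, dtype  # raw records start at fh.tell()
    fh.close()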

Thanks,
JDH
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] [Newbie] Fast plotting

2009-01-07 Thread John Hunter
On Wed, Jan 7, 2009 at 6:37 AM, Franck Pommereau
pommer...@univ-paris12.fr wrote:

 def f4 (x, y) :
Jean-Baptiste Rudant boogalo...@yahoo.fr

test 1 CPU times: 111.21s
test 2 CPU times: 13.48s

As Jean-Baptiste noticed, this solution is not very efficient (but
works almost of-the-shelf).

recXY = numpy.rec.fromarrays((x, x), names='x, y')
return matplotlib.mlab.rec_groupby(recXY, ('x',),
   (('y', numpy.mean, 'y_avg'),))

This probably will have no impact on your tests, but this looks like a
bug.  You probably mean:

  recXY = numpy.rec.fromarrays((x, y), names='x, y')

Could you post the code you use to generate your inputs (ie what is x?)

I will look into trying some of the suggestions here to improve the
performance on rec_groupby.  One thing that slows it down is that it
supports an arbitrary number of keys -- eg groupby ('year', 'month')
-- whereas the examples above are using a single value lookup.

JDH
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] new incremental statistics project

2008-12-19 Thread John Hunter
On Thu, Dec 18, 2008 at 8:27 PM, Bradford Cross
bradford.n.cr...@gmail.com wrote:
 This is a new project I just released.

 I know it is C#, but some of the design and idioms would be nice in
 numpy/scipy for working with discrete event simulators, time series, and
 event stream processing.

 http://code.google.com/p/incremental-statistics/

I think an incremental stats module would be a boon to numpy or scipy.
 Eric Firing has a nice module written in C with a pyrex wrapper
(ringbuf) that does trailing incremental mean, median, std, min, max,
and percentile.  It maintains a sorted queue to do the last three
efficiently, and handles NaN inputs.  I would like to see this
extended to include exponential or other weightings to do things like
incremental trailing exponential moving averages and variances.  I
don't know what the licensing terms are of this module, but it might
be a good starting point for an incremental numpy stats module, at
least if you were thinking about supporting a finite lookback window.
We have a copy of this in the py4science examples dir if you want to
take a look:

svn co 
https://matplotlib.svn.sourceforge.net/svnroot/matplotlib/trunk/py4science/examples/pyrex/trailstats
cd trailstats/
   make
   python movavg_ringbuf.py

Other things that would be very useful are incremental covariance and
regression.
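
As a flavor of the exponential weighting mentioned above, a minimal
pure-python sketch (the class name and API are mine, not ringbuf's)::

    class IncrementalEMA:
        """Incremental exponentially weighted moving average."""
        def __init__(self, alpha):
            self.alpha = alpha   # weight given to the newest observation
            self.value = None

        def update(self, x):
            if self.value is None:
                self.value = x
            else:
                self.value = self.alpha * x + (1 - self.alpha) * self.value
            return self.value

    ema = IncrementalEMA(alpha=0.1)
    for x in (1.0, 2.0, 3.0):
        print ema.update(x)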

JDH
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] new incremental statistics project

2008-12-19 Thread John Hunter
On Fri, Dec 19, 2008 at 12:59 PM, Eric Firing efir...@hawaii.edu wrote:

 Licensing is no problem; I have never bothered with it, but I can tack on a
 BSD-type license if that would help.

Great -- if you are the copyright holder, would you commit a BSD
license file to the py4science trailstats dir?  I just committed the
small bug fix we discussed yesterday there.

Thanks!
JDH
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] np.loadtxt : yet a new implementation...

2008-12-01 Thread John Hunter
On Mon, Dec 1, 2008 at 12:21 PM, Pierre GM [EMAIL PROTECTED] wrote:
 Well, looks like the attachment is too big, so here's the implementation.
 The tests will come in another message.


It looks like I am doing something wrong -- trying to parse a CSV file
with dates formatted like '2008-10-14', with::

import datetime, sys
import dateutil.parser
# StringConverter and loadtxt are from Pierre's attached genload_proposal.py
from genload_proposal import loadtxt, StringConverter
StringConverter.upgrade_mapper(dateutil.parser.parse,
                               default=datetime.date(1900,1,1))
r = loadtxt(sys.argv[1], delimiter=',', names=True)
print r.dtype

I get the following::

Traceback (most recent call last):
  File genload_proposal.py, line 734, in ?
r = loadtxt(sys.argv[1], delimiter=',', names=True)
  File genload_proposal.py, line 711, in loadtxt
(output, _) = genloadtxt(fname, **kwargs)
  File genload_proposal.py, line 646, in genloadtxt
rows[i] = tuple([conv(val) for (conv, val) in zip(converters, vals)])
  File genload_proposal.py, line 385, in __call__
    raise ValueError("Cannot convert string '%s'" % value)
ValueError: Cannot convert string '2008-10-14'

In debug mode, I see the following where the error occurs

ipdb> vals
('2008-10-14', '116.26', '116.40', '103.14', '104.08', '70749800', '104.08')
ipdb> converters
[<__main__.StringConverter instance at 0xa35fa6c>,
 <__main__.StringConverter instance at 0xa35ff2c>,
 <__main__.StringConverter instance at 0xa35ff8c>,
 <__main__.StringConverter instance at 0xa35ffec>,
 <__main__.StringConverter instance at 0xa15406c>,
 <__main__.StringConverter instance at 0xa1540cc>,
 <__main__.StringConverter instance at 0xa15412c>]

It looks like my registration of a custom converter isn't working.  Here
is what the _mapper looks like::

In [23]: StringConverter._mapper
Out[23]:
[(<type 'numpy.bool_'>, <function str2bool at 0xa2b8bc4>, None),
 (<type 'numpy.integer'>, <type 'int'>, -1),
 (<type 'numpy.floating'>, <type 'float'>, -NaN),
 (<type 'complex'>, <type 'complex'>, (-NaN+0j)),
 (<type 'numpy.object_'>,
  <function parse at 0x8cf1534>,
  datetime.date(1900, 1, 1)),
 (<type 'numpy.string_'>, <type 'str'>, '???')]
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Fwd: np.loadtxt : yet a new implementation...

2008-12-01 Thread John Hunter
On Mon, Dec 1, 2008 at 1:14 PM, Pierre GM [EMAIL PROTECTED] wrote:

 The problem you have is that the default dtype is 'float' (for
 backwards compatibility w/ the original np.loadtxt). What you want
 is to automatically change the dtype according to the content of
 your file: you should use dtype=None

 r = loadtxt(sys.argv[1], delimiter=',', names=True, dtype=None)

 As you'll want a recarray, we could make a np.records.loadtxt
 function where dtype=None would be the default...

OK, that worked great.  I do think a default implementation in np.rec
which returned a recarray would be nice.  It might also be nice to
have a function like np.rec.fromcsv which defaults to delimiter=',',
names=True and dtype=None.  Since csv is one of the most common data
interchange formats in the world, it would be nice to have some
obvious function that works with it with little or no customization
required.
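
(I.e., something like the thin wrapper below -- the name fromcsv is
hypothetical, and loadtxt here is Pierre's proposed implementation
from genload_proposal.py, not the current np.loadtxt:)

import numpy as np
from genload_proposal import loadtxt

def fromcsv(fname):
    'hypothetical convenience wrapper for CSV files with a header row'
    r = loadtxt(fname, delimiter=',', names=True, dtype=None)
    return r.view(np.recarray)   # attribute access: r.date, r.close, ...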

Fernando and I have taught a scientific computing course on a number
of occasions, and on the last round we taught it to undergrads.  Most
of these students have little or no programming experience; for many,
the concept of an array is something they struggle with and dtypes are
a difficult concept, but we found that they responded very well to our
csv2rec example, because with no syntactic cruft they were able to
load a file and do some stats on the columns, and I would like to see
that ease of use preserved.

JDH
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] More loadtxt() changes

2008-11-26 Thread John Hunter
On Tue, Nov 25, 2008 at 11:23 PM, Ryan May [EMAIL PROTECTED] wrote:

 Updated patch attached.  This includes:
  * Updated docstring
  * New tests
  * Fixes for previous issues
  * Fixes to make new tests actually work

 I appreciate any and all feedback.

I'm having trouble applying your patch, so I haven't tested yet, but
do you (and do you want to) handle a case like this::

from StringIO import StringIO
import matplotlib.mlab as mlab
f1 = StringIO("""\
name   age  weight
John   23   145.
Harry  43   180.""")

for line in f1:
    print line.split(' ')


I.e., space delimited but using an irregular number of spaces?  One
place this comes up a lot is when the output files are actually
fixed-width, using spaces to line up the columns.  One could count the
columns to figure out the fixed widths and work with that, but it is
much easier to simply assume space delimiting and handle the irregular
number of spaces, assuming one or more spaces is the delimiter.  In
csv2rec, we write a custom file object to handle this case; a sketch
is below.
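
(A minimal sketch of that kind of wrapper -- collapse runs of
whitespace to a single delimiter before the parser sees each line;
the class name is made up:)

class WhitespaceNormalizer:
    'iterate over a file, collapsing runs of whitespace to single spaces'
    def __init__(self, fh):
        self.fh = fh

    def __iter__(self):
        for line in self.fh:
            # split() with no argument treats any whitespace run as one
            # delimiter, unlike the split(' ') call above
            yield ' '.join(line.split())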

Apologies if you are already handling this and I missed it...

JDH
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] More loadtxt() changes

2008-11-25 Thread John Hunter
On Tue, Nov 25, 2008 at 12:16 PM, Pierre GM [EMAIL PROTECTED] wrote:

 A la mlab.csv2rec ? It could work with a bit more tweaking, basically
 following John Hunter's et al. path. What happens when the column names are
 unknown (read from the header) or wrong ?

 Actually, I'd like John to comment on that, hence the CC. More generally,
 wouldn't be useful to push the recarray manipulating functions from
 matplotlib.mlab to numpy ?

Yes, I've said on a number of occasions I'd like to see these
functions in numpy, since a number of them make more sense as numpy
methods than as stand alone functions.

 What happens when the column names are unknown (read from the header) or 
 wrong ?

I'm not quite sure what you are looking for here.  Either the user
will have to know the correct column name or the column number or you
should raise an error.  I think supporting column names everywhere
they make sense is critical since this is how most people think about
these CSV-like files with column headers.

One other thing that is essential for me is that date support is
included.  Virtually every CSV file I work with has date data in it,
in a variety of formats, and I depend on csv2rec (via
dateutil.parser.parse which mpl ships) to be able to handle it w/o any
extra cognitive overhead, albeit at the expense of some performance
overhead, but my files aren't too big.  I'm not sure how numpy would
handle the date parsing aspect, but this came up in the date datatype
PEP discussion I think.  For me, having to manually specify a date
converter with the proper format string every time I load a CSV file
is probably not viable.

Another feature that is critical to me is to be able to get a
np.recarray back instead of a plain structured ndarray.  I use
recarrays all day long, and the convenience of r.date over r['date']
is too much for me to give up.

Feel free to ignore these suggestions if they are too burdensome or
not appropriate for numpy -- I'm just letting you know some of the
things I need to see before I personally would stop using mlab.csv2rec
 and use numpy.loadtxt instead.

One last thing, I consider the masked array support in csv2rec
somewhat broken because when using a masked array you cannot get at
the data (eg datetime methods or string methods) directly using the
same interface that regular recarrays use.  Pierre, last I brought
this up you asked for some example code and indicated a willingness to
work on it but I fell behind and never posted it.  The code
illustrating the problem is below.  I'm really not sure what the right
solution is, but the current implementation -- sometimes returning a
plain-vanilla rec array, sometimes returning a masked record array --
with different interfaces is not good.

Perhaps the best solution is to force the user to ask for masked
support, and then always return a masked array whether any of the data
is masked or not.  csv2rec conditionally returns a masked array only
if some of the data are masked, which makes it difficult to use.

JDH

Here is the problem I referred to above -- in f1 none of the rows are
masked and so I can access the object attributes from the rows
directly.  In the 2nd example, row 3 has some missing data so I get an
mrecords recarray back, which does not allow me to directly access the
valid data methods.

from StringIO import StringIO
import matplotlib.mlab as mlab
f1 = StringIO("""\
date,name,age,weight
2008-10-12,'Bill',22,125.
2008-10-13,'Tom',23,135.
2008-10-14,'Sally',23,145.
""")

r1 = mlab.csv2rec(f1)
row0 = r1[0]
print row0.date.year, row0.name.upper()

f2 = StringIO("""\
date,name,age,weight
2008-10-12,'Bill',22,125.
2008-10-13,'Tom',23,135.
2008-10-14,'',,145.
""")

r2 = mlab.csv2rec(f2)
row0 = r2[0]
print row0.date.year, row0.name.upper()
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] More loadtxt() changes

2008-11-25 Thread John Hunter
On Tue, Nov 25, 2008 at 2:01 PM, Pierre GM [EMAIL PROTECTED] wrote:

 On Nov 25, 2008, at 2:26 PM, John Hunter wrote:

 Yes, I've said on a number of occasions I'd like to see these
 functions in numpy, since a number of them make more sense as numpy
 methods than as stand alone functions.

 Great. Could we think about getting that on for 1.3x, would you have
 time ? Or should we wait till early jan. ?

I wasn't volunteering to do it, just that I support the migration if
someone else wants to do it.

I'm fully committed with mpl already...

JDH
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] contiguous regions

2008-11-20 Thread John Hunter
I frequently want to break a 1D array into regions above and below
some threshold, identifying all such subslices where the contiguous
elements are above the threshold.  I have two related implementations
below to illustrate what I am after.  The first crossings is rather
naive in that it doesn't handle the case where an element is equal to
the threshold (assuming zero for the threshold in the examples below).
 The second is correct (I think) but is pure python.  Has anyone got a
nifty numpy solution for this?

import numpy as np
import matplotlib.pyplot as plt
t = np.arange(0.0123, 2, 0.05)
s = np.sin(2*np.pi*t)

def crossings(x):
    """
    return a list of (above, ind0, ind1).  ind0 and ind1 are regions
    such that the slice x[ind0:ind1]>0 when above is True and
    x[ind0:ind1]<0 when above is False
    """
    N = len(x)
    crossings = x[:-1]*x[1:]<0
    ind = np.nonzero(crossings)[0]+1
    lastind = 0
    data = []
    for i in range(len(ind)):
        above = x[lastind]>0
        thisind = ind[i]
        data.append((above, lastind, thisind))
        lastind = thisind

    # put the one past the end index if not already in
    if len(data) and data[-1]!=N-1:
        data.append((not data[-1][0], thisind, N))
    return data

def contiguous_regions(mask):
    """
    return a list of (ind0, ind1) such that mask[ind0:ind1].all() is
    True and we cover all such regions
    """
    in_region = None
    boundaries = []
    for i, val in enumerate(mask):
        if in_region is None and val:
            in_region = i
        elif in_region is not None and not val:
            boundaries.append((in_region, i))
            in_region = None

    if in_region is not None:
        boundaries.append((in_region, i+1))
    return boundaries



fig = plt.figure()
ax = fig.add_subplot(111)
ax.set_title('using crossings')

ax.plot(t, s, 'o')
ax.axhline(0)


for above, ind0, ind1 in crossings(s):
    if above: color='green'
    else: color = 'red'
    tslice = t[ind0:ind1]
    ax.axvspan(tslice[0], tslice[-1], facecolor=color, alpha=0.5)

fig = plt.figure()
ax = fig.add_subplot(111)
ax.set_title('using contiguous regions')
ax.plot(t, s, 'o')
ax.axhline(0)

for ind0, ind1 in contiguous_regions(s>0):
    tslice = t[ind0:ind1]
    ax.axvspan(tslice[0], tslice[-1], facecolor='green', alpha=0.5)

for ind0, ind1 in contiguous_regions(s<0):
    tslice = t[ind0:ind1]
    ax.axvspan(tslice[0], tslice[-1], facecolor='red', alpha=0.5)


plt.show()
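
(For reference, the flavor of vectorized solution I am hoping someone
can improve on -- a lightly tested sketch using np.diff on the
boolean mask:)

import numpy as np

def contiguous_regions_np(mask):
    'return (ind0, ind1) pairs such that mask[ind0:ind1].all() is True'
    # pad with False on both ends so regions touching the edges are caught
    d = np.diff(np.concatenate(([False], mask, [False])).astype(int))
    starts = np.nonzero(d == 1)[0]   # False -> True transitions
    ends = np.nonzero(d == -1)[0]    # True -> False transitions
    return zip(starts, ends)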
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] how to tell if a point is inside a polygon

2008-10-17 Thread John Hunter
On Thu, Oct 16, 2008 at 2:28 PM, Rob Hetland [EMAIL PROTECTED] wrote:

 I did not know that very useful thing.  But now I do.  This is solid
 proof that lurking on the mailing lists makes you smarter.

and that our documentation effort still has a long way to go !

FAQ added at
http://matplotlib.sourceforge.net/faq/howto_faq.html?#how-do-i-test-whether-a-point-is-inside-a-polygon

though I am having trouble getting the module functions pnpoly and
points_inside_poly to show up in the sphinx automodule documentation
for nxutils.  These functions are defined in extension code and I have
a post in to the sphinx mailing list

  http://groups.google.com/group/sphinx-dev/t/7ad1631d3117e4eb

but if anyone on this list has seen problems with automodule and
extension code functions, and knows how to fix them, let me know.
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] how to tell if a point is inside a polygon

2008-10-16 Thread John Hunter
On Thu, Oct 16, 2008 at 1:25 PM, Rob Hetland [EMAIL PROTECTED] wrote:
 This question gets asked about once a month on the mailing list.
 Perhaps pnpoly could find a permanent home in scipy?  (or somewhere?)
 Obviously, many would find it useful.

It is already in matplotlib

In [1]: import matplotlib.nxutils as nx

In [2]: nx.pnpoly
Out[2]: <built-in function pnpoly>

In [3]: nx.points_inside_poly
Out[3]: <built-in function points_inside_poly>

but one of us should add it to the FAQ!
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Need **working** code example of 2-D arrays

2008-10-13 Thread John Hunter
On Mon, Oct 13, 2008 at 2:29 PM, Alan G Isaac [EMAIL PROTECTED] wrote:

 The problem is, you did not just ask
 for technical information.  You also
 accused people of being condescending
 and demeaning.  But nobody was
 condescending or demeaning.  As several
 people **politely** explained to you,
 you are wrong about that.

Here is a simple example of loading some 2D data into an array and
manipulating the contents

import numpy as np
# load a 2D array of integers
X = np.loadtxt('somefile.txt').astype(int)
print X.shape  # X is a 2D array

# display the contents of X as a string
print '\n'.join([''.join([chr(c) for c in row]) for row in X])

The input file somefile.txt is attached

JDH
32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 
32 32 95 95 95 95 95 95 95 95 95 95 95 95 95 95 95 95 95 95 95 95 95 95 95 95 
95 95 95 32
32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 47 124 32 32 47 124 32 
32 124 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 
32 32 124 32
32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 124 124 95 95 124 124 
32 32 124 32 32 32 32 32 32 32 80 108 101 97 115 101 32 100 111 110 39 116 32 
32 32 32 32 32 32 124 32
32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 47 32 32 32 79 32 79 92 
95 95 32 32 32 32 32 32 32 32 32 32 32 102 101 101 100 32 32 32 32 32 32 32 32 
32 32 32 124 32
32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 47 32 32 32 32 32 32 32 32 
32 32 92 32 32 32 32 32 32 32 116 104 101 32 116 114 111 108 108 115 32 32 32 
32 32 32 32 32 124 32
32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 47 32 32 32 32 32 32 92 32 32 
32 32 32 92 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 
32 32 124 32
32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 47 32 32 32 95 32 32 32 32 92 32 
32 32 32 32 92 32 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 
45 45 32 32
32 32 32 32 32 32 32 32 32 32 32 32 32 32 47 32 32 32 32 124 92 95 95 95 95 92 
32 32 32 32 32 92 32 32 32 32 32 124 124 32 32 32 32 32 32 32 32 32 32 32 32 32 
32 32 32 32
32 32 32 32 32 32 32 32 32 32 32 32 32 47 32 32 32 32 32 124 32 124 32 124 32 
124 92 95 95 95 95 47 32 32 32 32 32 124 124 32 32 32 32 32 32 32 32 32 32 32 
32 32 32 32 32 32
32 32 32 32 32 32 32 32 32 32 32 32 47 32 32 32 32 32 32 32 92 124 95 124 95 
124 47 32 32 32 124 32 32 32 32 95 95 124 124 32 32 32 32 32 32 32 32 32 32 32 
32 32 32 32 32 32
32 32 32 32 32 32 32 32 32 32 32 47 32 32 47 32 32 92 32 32 32 32 32 32 32 32 
32 32 32 32 124 95 95 95 95 124 32 124 124 32 32 32 32 32 32 32 32 32 32 32 32 
32 32 32 32 32
32 32 32 32 32 32 32 32 32 32 47 32 32 32 124 32 32 32 124 32 47 124 32 32 32 
32 32 32 32 32 124 32 32 32 32 32 32 45 45 124 32 32 32 32 32 32 32 32 32 32 32 
32 32 32 32 32
32 32 32 32 32 32 32 32 32 32 124 32 32 32 124 32 32 32 124 47 47 32 32 32 32 
32 32 32 32 32 124 95 95 95 95 32 32 45 45 124 32 32 32 32 32 32 32 32 32 32 32 
32 32 32 32 32
32 32 32 42 32 95 32 32 32 32 124 32 32 124 95 124 95 124 95 124 32 32 32 32 32 
32 32 32 32 32 124 32 32 32 32 32 92 45 47 32 32 32 32 32 32 32 32 32 32 32 32 
32 32 32 32 32
42 45 45 32 95 45 45 92 32 95 32 92 32 32 32 32 32 47 47 32 32 32 32 32 32 32 
32 32 32 32 124 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 
32 32 32 32
32 32 47 32 32 95 32 32 32 32 32 92 32 95 32 47 47 32 32 32 124 32 32 32 32 32 
32 32 32 47 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 
32 32 32 32
42 32 32 47 32 32 32 92 95 32 47 45 32 124 32 45 32 32 32 32 32 124 32 32 32 32 
32 32 32 124 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 
32 32 32 32
32 32 42 32 32 32 32 32 32 95 95 95 32 99 95 99 95 99 95 67 47 32 92 67 95 99 
95 99 95 99 95 95 95 95 95 95 95 95 95 95 95 95 32 32 32 32 32 32 32 32 32 32 
32 32 32 32
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] efficient way to do this?

2008-09-22 Thread John Hunter
I have an array of indices into a larger array where some condition
is satisfied.  I want to create a larger set of indices which *mark*
all the indices following the condition over some Nmark-length
window.  In code:

import numpy as np

N = 1000
Nmark = 20
ind = np.nonzero(np.random.rand(N)<0.01)[0]


marked = np.zeros(N, bool)
for i in ind:
    marked[i:i+Nmark] = True

I am going to have to do this over many arrays, and so I want to do it
efficiently.  Is there a way to do the above more efficiently, e.g.,
w/o the loop?

In the real use case, there will be significant auto-correlation among
the places where the condition is satisfied.  Eg, if it is satisfied
at some index, it is likely that it will be satisfied for many of its
neighbors.  Eg, the real case looks more like

y = np.sin(2*np.pi*np.linspace(0, 2, N))

ind = np.nonzero(y>0.95)[0]
marked2 = np.zeros(N, bool)
for i in ind:
    marked2[i:i+Nmark] = True

Thanks in advance for any hints,
JDH
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] efficient way to do this?

2008-09-22 Thread John Hunter
On Mon, Sep 22, 2008 at 10:13 AM, Robert Kern [EMAIL PROTECTED] wrote:

 marked[ind + np.arange(Nmark)] = True

That triggers a broadcasting error:

Traceback (most recent call last):
  File /home/titan/johnh/test.py, line 13, in ?
marked3[ind + np.arange(Nmark)] = True
ValueError: shape mismatch: objects cannot be broadcast to a single shape

I am hoping there is some clever way to do this with broadcasting, I
am just not that clever...

JDH
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] efficient way to do this?

2008-09-22 Thread John Hunter
On Mon, Sep 22, 2008 at 10:23 AM, Robert Kern [EMAIL PROTECTED] wrote:
 On Mon, Sep 22, 2008 at 10:22, Robert Kern [EMAIL PROTECTED] wrote:

 ind2mark = np.asarray((ind[:,np.newaxis] + np.arange(Nmark).flat).clip(0, 
 N-1)
 marked[ind2mark] = True

 Missing parenthesis:

 ind2mark = np.asarray((ind[:,np.newaxis] + np.arange(Nmark)).flat).clip(0, 
 N-1)

Excellent, thanks!  Note to self, must become a newaxis guru...

JDH
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] strange seterr persistence between sessions

2008-07-28 Thread John Hunter
In trying to track down a bug in matplotlib, I have come across some
very strange numpy behavior.  Basically, whether or not I call
np.seterr('raise') or not in a matplotlib demo affects the behavior of
seterr in another (pure numpy) script, run in a separate process.
Something about the numpy state is persisting between python sessions.
 This appears to be platform specific, because I have only been able
to verify it on 1 platform (quad code xeon 64 bit running fedora) but
not on another (solaris x86).

Here are the gory details.  Below is a cut-and-paste from a single
xterm session, with some comments sprinkled in.

Some version info::

  ~ uname -a
  Linux bic128.bic.berkeley.edu 2.6.25.10-47.fc8 #1 SMP Mon Jul 7
18:31:41 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux
  ~ python -V
  Python 2.5.1
  ~ python -c 'import numpy; print numpy.__version__'
  1.2.0.dev5564
  ~ python -c 'import matplotlib; print matplotlib.__version__'
  0.98.3rc1

With mpl svn, head over to the examples directory and grab the data
file needed to show this bug::

  ~ cd mpl/examples/pylab_examples/
  pylab_examples wget http://matplotlib.sourceforge.net/tmp/alpha.npy
  --11:22:08--  http://matplotlib.sourceforge.net/tmp/alpha.npy
 = `alpha.npy'
  Resolving matplotlib.sourceforge.net... 66.35.250.209
  Connecting to matplotlib.sourceforge.net|66.35.250.209|:80... connected.
  HTTP request sent, awaiting response... 200 OK
  Length: 688 [text/plain]

  100%[===]
688   --.--K/s

  11:22:08 (111.19 MB/s) - `alpha.npy' saved [688/688]


Run the geo_demo.py example.  This has np.seterr set to raise.  It
will issue a
floating point error::

  pylab_examples head -5 geo_demo.py
  import numpy as np
  np.seterr("raise")

  from pylab import *

  pylab_examples python geo_demo.py
  Traceback (most recent call last):
File 
/home/jdhunter/dev/lib64/python2.5/site-packages/matplotlib/backends/backend_gtk.py,
line 333, in expose_event
  self._render_figure(self._pixmap, w, h)
File 
/home/jdhunter/dev/lib64/python2.5/site-packages/matplotlib/backends/backend_gtkagg.py,
line 75, in _render_figure
  FigureCanvasAgg.draw(self)
File 
/home/jdhunter/dev/lib64/python2.5/site-packages/matplotlib/backends/backend_agg.py,
line 261, in draw
    self.figure.draw(self.renderer)
File 
/home/jdhunter/dev/lib64/python2.5/site-packages/matplotlib/figure.py,
line 759, in draw
  for a in self.axes: a.draw(renderer)
File /home/jdhunter/dev/lib64/python2.5/site-packages/matplotlib/axes.py,
line 1523, in draw
  a.draw(renderer)
File /home/jdhunter/dev/lib64/python2.5/site-packages/matplotlib/axis.py,
line 718, in draw
  tick.draw(renderer)
File /home/jdhunter/dev/lib64/python2.5/site-packages/matplotlib/axis.py,
line 186, in draw
  self.gridline.draw(renderer)
File /home/jdhunter/dev/lib64/python2.5/site-packages/matplotlib/lines.py,
line 423, in draw
  tpath, affine = self._transformed_path.get_transformed_path_and_affine()
File 
/home/jdhunter/dev/lib64/python2.5/site-packages/matplotlib/transforms.py,
line 2089, in get_transformed_path_and_affine
  self._transform.transform_path_non_affine(self._path)
File 
/home/jdhunter/dev/lib64/python2.5/site-packages/matplotlib/transforms.py,
line 1828, in transform_path_non_affine
  self._a.transform_path(path))
File 
/home/jdhunter/dev/lib64/python2.5/site-packages/matplotlib/transforms.py,
line 1828, in transform_path_non_affine
  self._a.transform_path(path))
File 
/home/jdhunter/dev/lib64/python2.5/site-packages/matplotlib/transforms.py,
line 1816, in transform_path
  self._a.transform_path(path))
File 
/home/jdhunter/dev/lib64/python2.5/site-packages/matplotlib/projections/geo.py,
line 264, in transform_path
  return Path(self.transform(ipath.vertices), ipath.codes)
File 
/home/jdhunter/dev/lib64/python2.5/site-packages/matplotlib/projections/geo.py,
line 249, in transform
    sinc_alpha = ma.sin(alpha) / alpha
File /home/jdhunter/dev/lib64/python2.5/site-packages/numpy/ma/core.py,
line 1887, in __div__
  return divide(self, other)
File /home/jdhunter/dev/lib64/python2.5/site-packages/numpy/ma/core.py,
line 638, in __call__
  t = narray(self.domain(d1, d2), copy=False)
File /home/jdhunter/dev/lib64/python2.5/site-packages/numpy/ma/core.py,
line 413, in __call__
  return umath.absolute(a) * self.tolerance >= umath.absolute(b)
  FloatingPointError: underflow encountered in multiply

OK, now run the pure numpy test script in a separate python process.
It also has np.seterr set to raise, and
it raises the same error.  Nothing too strange (yet)::

  pylab_examples cat test.py
  import numpy as np
  np.seterr("raise")
  import numpy.ma as ma

  alpha = np.load('alpha.npy')
  alpham = ma.MaskedArray(alpha)
  sinc_alpha_ma = ma.sin(alpham) / alpham

  pylab_examples python test.py
  Traceback (most recent call last):
File 

Re: [Numpy-discussion] strange seterr persistence between sessions

2008-07-28 Thread John Hunter
On Mon, Jul 28, 2008 at 2:02 PM, Robert Kern [EMAIL PROTECTED] wrote:
 On Mon, Jul 28, 2008 at 13:56, John Hunter [EMAIL PROTECTED] wrote:
 In trying to track down a bug in matplotlib, I have come across some
 very strange numpy behavior.  Basically, whether or not I call
 np.seterr('raise') or not in a matplotlib demo affects the behavior of
 seterr in another (pure numpy) script, run in a separate process.
 Something about the numpy state is persisting between python sessions.
  This appears to be platform specific, because I have only been able
 to verify it on 1 platform (quad code xeon 64 bit running fedora) but
 not on another (solaris x86).

 Can you make a new, smaller self-contained example? I suspect stale .pyc 
 files.

I'm not sure exactly what you mean by self-contained (since the
behavior requires at least two files).  Do you mean trying to come up
with two numpy-only example files, or one that does away with the npy
file?  Or both?  As for the stale files, I'm not sure what you are
thinking, but these are clean builds and installs of numpy and mpl.

So if you'll give me a little more guidance in terms of what you are
looking for in a self-contained example, I'll be happy to try and put
it together.  But I am not sure what it is about the loading of the
geo_demo that is triggering the behavior (numpy extension code, large
memory footprint, ??).  I tried running a python snippet that would
fill a lot of memory to see if that would clear the persistence (I
was wondering if there is some kernel memory caching and some empty
numpy memory that is not getting initialized properly and thus is
picking up some memory from a prior session), but it did not.

JDH
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] strange seterr persistence between sessions

2008-07-28 Thread John Hunter
On Mon, Jul 28, 2008 at 2:35 PM, Robert Kern [EMAIL PROTECTED] wrote:

 Both, if the behavior exhibits itself without the npy file. If it only
 exhibits itself with an npy involved, then we have some more
 information about where the problem might be.

OK, I'll see what I can come up with.  In the meantime, as I was
trying to strip out the npy component and put the data directly into
the file, I find it strange that I am getting a floating point error
on this operation:

  import numpy as np
  np.seterr("raise")
  import numpy.ma as ma

  x = 1.50375883
  m = ma.MaskedArray([x])
  sinc_alpha_ma = ma.sin(m) / m

---
FloatingPointError                        Traceback (most recent call last)

/home/jdhunter/<ipython console> in <module>()

/home/jdhunter/dev/lib64/python2.5/site-packages/numpy/ma/core.pyc in
__div__(self, other)
   1885 def __div__(self, other):
   1886     "Divide other into self, and return a new masked array."
-> 1887     return divide(self, other)
   1888 #
   1889 def __truediv__(self, other):

/home/jdhunter/dev/lib64/python2.5/site-packages/numpy/ma/core.pyc in
__call__(self, a, b)
    636 d1 = getdata(a)
    637 d2 = get_data(b)
--> 638 t = narray(self.domain(d1, d2), copy=False)
    639 if t.any(None):
    640     mb = mask_or(mb, t)

/home/jdhunter/dev/lib64/python2.5/site-packages/numpy/ma/core.pyc in
__call__(self, a, b)
    411 if self.tolerance is None:
    412     self.tolerance = np.finfo(float).tiny
--> 413 return umath.absolute(a) * self.tolerance >= umath.absolute(b)
    414 #
    415 class _DomainGreater:

FloatingPointError: underflow encountered in multiply

I am no floating point expert, but I don't see why a numerator of
0.99775383 and a denominator of 1.50375883 should be triggering an
underflow error.  It looks more like a bug in the ma core logic, since
umath.absolute(a) * self.tolerance is more or less guaranteed to fail
if np.seterr("raise") is set.
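
(The failure is easy to reproduce in isolation -- tolerance is
np.finfo(float).tiny, the smallest normal double, so scaling it by
anything with absolute value below 1 lands in the subnormal range:)

import numpy as np
np.seterr(under='raise')
tiny = np.finfo(float).tiny      # ~2.2e-308
np.array([0.99775383]) * tiny    # FloatingPointError: underflow in multiply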

JDH
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] RFC: A (second) proposal for implementing some date/time types in NumPy

2008-07-25 Thread John Hunter
On Fri, Jul 25, 2008 at 8:22 PM, Matt Knox [EMAIL PROTECTED] wrote:

 The automatic string parsing has been mentioned before, but it is a feature
 I am personally very fond of. I use it all the time, and I suspect a lot of
 people would like it very much if they used it. It's not suited for high
 performance code, but is fantastic for interactive and ad-hoc work. This is
 supported right in the constructor of the current Date class, along with
 conversion from datetime objects. I'd love to see such support built into the
 new date type, although I guess it could be added on easily enough with a
 factory function.

There is a module dateutil.parser which is released under the PSF
license if there is interest in including something like this.  Not
sure if it is appropriate for numpy because of the speed implications,
but it's out there.  mpl ships dateutil, so it is already available
with all mpl installs.
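
(E.g., the kind of free-form parsing it handles:)

In [1]: import dateutil.parser

In [2]: dateutil.parser.parse('25 Jul 2008')
Out[2]: datetime.datetime(2008, 7, 25, 0, 0)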

JDH
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] permissions on tests in numpy and scipy

2008-07-15 Thread John Hunter
On Mon, Jul 14, 2008 at 12:34 PM, Robert Kern [EMAIL PROTECTED] wrote:

 We're not doing anything special, here. When I install using sudo
 python install.py on OS X, all of the permissions are 644. I think
 the problem may be in your pipeline.

With a little more testing, what I am finding is that when I do a
fresh svn checkout at work (solaris x86), a lot of files (e.g.,
setup.py or the test*.py files) come down with permissions 600 or 700.
If I do the same checkout on a recent linux box, they come down as 644
or 755.  I checked my umask, and it is the same on both boxes.  So I
am a bit stumped; it is clearly not a numpy problem, but I wanted to
mention it here in case any unix guru has an idea (both of these are
from clean svn checkouts).

Solaris box (funky permissions):
[EMAIL PROTECTED]:~ svn --version
svn, version 1.4.3 (r23084)
   compiled Jun  6 2007, 16:45:15
[EMAIL PROTECTED]:~ uname -a
SunOS flag 5.10 Generic_118855-15 i86pc i386 i86pc
[EMAIL PROTECTED]:~ umask
0002
[EMAIL PROTECTED]:~ cd /export/home/johnh/tmp/numpy/
[EMAIL PROTECTED]:numpy ls -l setup.py
-rwx------   1 johnh    research     3370 Jul 14 14:20 setup.py

##

Linux box (expected permissions):
[EMAIL PROTECTED]:~ svn --version
svn, version 1.4.4 (r25188)
   compiled Sep  2 2007, 14:25:40
[EMAIL PROTECTED]:~ uname -a
Linux bic128.bic.berkeley.edu 2.6.25.9-40.fc8 #1 SMP Fri Jun 27
16:05:49 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux
[EMAIL PROTECTED]:numpy umask
0002
[EMAIL PROTECTED]:~ cd /home/jdhunter/tmp/numpy/
[EMAIL PROTECTED]:numpy ls -l setup.py
-rwxrwxr-x 1 jdhunter jdhunter 3370 Jul 14 12:19 setup.py
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] permissions on tests in numpy and scipy

2008-07-14 Thread John Hunter
I have a rather unconventional install pipeline at work, and owner-only
read permissions on a number of the tests are causing me minor
problems.  It appears the permissions on the tests are set rather
inconsistently in numpy and scipy -- is there any reason not to make
these all 644?

[EMAIL PROTECTED]:site-packages find numpy -name "test_*.py" | xargs ls -l
-rw-------   1 johnh    research     2769 Jul 14 12:01
numpy/core/tests/test_defchararray.py
-rw-------   1 johnh    research     8283 Jul 14 12:01
numpy/core/tests/test_defmatrix.py
-rw-------   1 johnh    research     1769 Jun 25 10:00
numpy/core/tests/test_errstate.py
-rw-------   1 johnh    research     1508 Jun 25 10:00
numpy/core/tests/test_memmap.py
-rw-------   1 johnh    research        4 Jul 14 12:01
numpy/core/tests/test_multiarray.py
-rw-------   1 johnh    research    26695 Jul 14 12:01
numpy/core/tests/test_numeric.py
-rw-------   1 johnh    research    13781 Jul 14 12:01
numpy/core/tests/test_numerictypes.py
-rw-------   1 johnh    research     1113 Jul 14 12:01
numpy/core/tests/test_print.py
-rw-------   1 johnh    research     4290 Jul 14 12:01
numpy/core/tests/test_records.py
-rw-------   1 johnh    research    39370 Jul 14 12:01
numpy/core/tests/test_regression.py
-rw-------   1 johnh    research     4097 Jul 14 12:01
numpy/core/tests/test_scalarmath.py
-rw-------   1 johnh    research     9330 Jun 25 10:00
numpy/core/tests/test_ufunc.py
-rw-------   1 johnh    research     7583 Jul 14 12:01
numpy/core/tests/test_umath.py
-rw-------   1 johnh    research    11264 Jul 14 12:01
numpy/core/tests/test_unicode.py
-rw-------   1 johnh    research      222 Jul 14 12:00
numpy/distutils/tests/f2py_ext/tests/test_fib2.py
-rw-------   1 johnh    research      220 Jul 14 12:00
numpy/distutils/tests/f2py_f90_ext/tests/test_foo.py
-rw-------   1 johnh    research      220 Jul 14 12:00
numpy/distutils/tests/gen_ext/tests/test_fib3.py
-rw-------   1 johnh    research      277 Jul 14 12:00
numpy/distutils/tests/pyrex_ext/tests/test_primes.py
-rw-------   1 johnh    research      387 Jul 14 12:00
numpy/distutils/tests/swig_ext/tests/test_example.py
-rw-------   1 johnh    research      276 Jul 14 12:00
numpy/distutils/tests/swig_ext/tests/test_example2.py
-rw-------   1 johnh    research     1800 Jul 14 12:00
numpy/distutils/tests/test_fcompiler_gnu.py
-rw-------   1 johnh    research     2421 Jun 25 09:59
numpy/distutils/tests/test_misc_util.py
-rw-r--r--   1 johnh    research    64338 Jun 25 09:59
numpy/f2py/lib/parser/test_Fortran2003.py
-rw-r--r--   1 johnh    research    24785 Jun 25 09:59
numpy/f2py/lib/parser/test_parser.py
-rw-------   1 johnh    research      574 Jul 14 12:00
numpy/fft/tests/test_fftpack.py
-rw-------   1 johnh    research     1256 Jul 14 12:00
numpy/fft/tests/test_helper.py
-rw-------   1 johnh    research     9948 Jul 14 12:01
numpy/lib/tests/test__datasource.py
-rw-------   1 johnh    research     4088 Jul 14 12:01
numpy/lib/tests/test_arraysetops.py
-rw-------   1 johnh    research      809 Jul 14 12:01
numpy/lib/tests/test_financial.py
-rw-------   1 johnh    research    23794 Jul 14 12:01
numpy/lib/tests/test_format.py
-rw-------   1 johnh    research    30772 Jul 14 12:01
numpy/lib/tests/test_function_base.py
-rw-------   1 johnh    research     1587 Jun 25 10:00
numpy/lib/tests/test_getlimits.py
-rw-------   1 johnh    research     2111 Jul 14 12:01
numpy/lib/tests/test_index_tricks.py
-rw-------   1 johnh    research     7691 Jul 14 12:01 numpy/lib/tests/test_io.py
-rw-------   1 johnh    research      999 Jun 25 10:00
numpy/lib/tests/test_machar.py
-rw-------   1 johnh    research     2431 Jun 25 10:00
numpy/lib/tests/test_polynomial.py
-rw-------   1 johnh    research     1409 Jul 14 12:01
numpy/lib/tests/test_regression.py
-rw-------   1 johnh    research    14353 Jul 14 12:01
numpy/lib/tests/test_shape_base.py
-rw-------   1 johnh    research     6958 Jul 14 12:01
numpy/lib/tests/test_stride_tricks.py
-rw-------   1 johnh    research     6735 Jul 14 12:01
numpy/lib/tests/test_twodim_base.py
-rw-------   1 johnh    research     9579 Jul 14 12:01
numpy/lib/tests/test_type_check.py
-rw-------   1 johnh    research     1771 Jun 25 10:00
numpy/lib/tests/test_ufunclike.py
-rw-------   1 johnh    research     8279 Jul 14 12:01
numpy/linalg/tests/test_linalg.py
-rw-------   1 johnh    research     1739 Jul 14 12:01
numpy/linalg/tests/test_regression.py
-rw-------   1 johnh    research    78819 Jun 25 10:00
numpy/ma/tests/test_core.py
-rw-------   1 johnh    research    15744 Jul 14 12:01
numpy/ma/tests/test_extras.py
-rw-------   1 johnh    research    17759 Jun 25 10:00
numpy/ma/tests/test_mrecords.py
-rw-------   1 johnh    research    33009 Jun 25 10:00
numpy/ma/tests/test_old_ma.py
-rw-------   1 johnh    research     5956 Jun 25 10:00
numpy/ma/tests/test_subclassing.py
-rw-------   1 johnh    research     3173 Jul 14 12:01
numpy/oldnumeric/tests/test_oldnumeric.py
-rw-------   1 johnh    research     2043 Jun 25 09:59
numpy/random/tests/test_random.py
-rw-------   1 

Re: [Numpy-discussion] RFC: A proposal for implementing some date/time types in NumPy

2008-07-12 Thread John Hunter
On Fri, Jul 11, 2008 at 1:14 PM, Francesc Alted [EMAIL PROTECTED] wrote:

 So, it seems that setters/getters for matplotlib datetime could be
 supported, maybe at the risk of loosing precision.  We should study
 this more carefully, but I suppose that if there is interest enough
 that could be implemented, yes.

You don't need to worry about mpl -- we will support whatever datetime
handling numpy implements (I think your proposal would be a great
addition).  We have been moving away from the date2num floats in mpl
(as you note using floats was not a great idea because of the
precision issue).  We now support native python datetime handling but
a numpy datetime would be ideal.  The infrastructure we use to handle
python datetime's can easily support other datetime objects.

JDH
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] sampling based on running sums

2008-06-27 Thread John Hunter
I would like to find the sample points where the running sum of some
vector exceeds some threshold -- at those points I want to collect all
the data in the vector since the last time the criteria was reached
and compute some stats on it.  For example, in python

tot = 0.
xs = []
ys = []

samples1 = []
for thisx, thisy in zip(x, y):
    tot += thisx
    xs.append(thisx)
    ys.append(thisy)
    if tot>=threshold:
        samples1.append(func(xs,ys))
        tot = 0.
        xs = []
        ys = []


The following is close in numpy

sx = np.cumsum(x)
n = (sx/threshold).astype(int)
ind = np.nonzero(np.diff(n)>0)[0]+1

lasti = 0
samples2 = []
for i in ind:
    xs = x[lasti:i+1]
    ys = y[lasti:i+1]
    samples2.append(func(xs, ys))
    lasti = i

But the sample points in ind do not guarantee that at least threshold
worth of x accumulates between successive sample points, due to
truncation error.

What is a good numpy way to do this?

Thanks,
JDH
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Record arrays

2008-06-26 Thread John Hunter
On Thu, Jun 26, 2008 at 11:38 AM, Travis E. Oliphant
[EMAIL PROTECTED] wrote:
 Stéfan van der Walt wrote:
 Hi all,

 I am documenting `recarray`, and have a question:

 Is its use still recommended, or has it been superseded by fancy data-types?

 I rarely recommend it's use (but some people do like attribute access to
 the fields).It is wrong, however, to say that recarray has been
 superseded by fancy data types because fancy data types have existed for
 as long as recarrays.

I personally think they are the best thing since sliced bread, and
everyone here who uses them becomes immediately addicted to them.  I
would like to see better support for them, especially making the attrs
exposed to dir so tab completion would work.

People in the financial/business world work with spreadsheet data a
lot, and record arrays are the natural data structure to represent
tabular, heterogeneous data.  If you work with this data all day,
you save a lot of ugly keystrokes doing r.date rather than r['date'],
and the code is prettier in my opinion.
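
(E.g., with made-up data:)

import datetime
import numpy as np

r = np.rec.fromrecords([(datetime.date(2008, 6, 26), 541.2)],
                       names='date,close')
print r.date[0].year, r.close.mean()  # vs r['date'][0].year, r['close'].mean()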

JDH
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] New documentation web application

2008-05-31 Thread John Hunter
On Sat, May 31, 2008 at 4:05 AM, R. Bastian [EMAIL PROTECTED] wrote:

 Neat!  I really like the layout.  The red format warnings are a nice touch:
 http://sd-2116.dedibox.fr/pydocweb/doc/numpy.core.umath.exp/

Hi, I was just reading through this example when I noticed this usage:

  from matplotlib import pyplot as plt

Although this is of course perfectly valid python, I have been
encouraging people when importing modules from packages to use the
syntax:

import somepackage.somemodule as somemod

rather than

from somepackage import somemodule as somemod

The reason is that in the first usage it is unambiguous that
somemodule is a module and not a function or constant.  E.g., both of
these are valid python:

In [7]: from numpy import arange

In [8]: from numpy import fft

but only the module import is valid here:

In [3]: import numpy.fft as fft

In [4]: import numpy.arange as arange
ImportError   Traceback (most recent call last)
ImportError: No module named arange

I taught a class on scientific computing in python to undergraduates,
and the students  were frequently confused about what was a module and
what was a function.  If you are coming from matlab, you are likely to
think fft is a function when you see this:

  from numpy import fft

By being consistent in importing modules using the 'import numpy.fft
as fft', it can make it more clear that we are importing a module.

I already recommend this usage in the matplotlib coding guide, and
numpy may want to adopt it as well.

JDH
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] logical masking, wrong length mask

2008-05-28 Thread John Hunter
I just spent a while tracking down a bug in my code, and found that
numpy was letting me get away with using a logical mask of smaller
size than the array it was masking.

  In [19]: x = np.random.rand(10)

  In [20]: x
  Out[20]:
  array([ 0.72253623,  0.8412243 ,  0.12835194,  0.01595052,  0.62208366,
  0.57229259,  0.46099861,  0.44114786,  0.23687212,  0.89507604])

  In [21]: y = np.random.rand(11)

  In [22]: mask = x>.5

  In [23]: x[mask]
  Out[23]: array([ 0.72253623,  0.8412243 ,  0.62208366,  0.57229259,
0.89507604])

  In [24]: y[mask]
  Out[24]: array([ 0.13440315,  0.83610533,  0.75390136,  0.79046615,
0.34776165])

  In [25]: mask
  Out[25]: array([ True,  True, False, False,  True,  True, False,
False, False,  True], dtype=bool)

I initially thought line 24 above should raise an error, or coerce
True to 1 and False to 0 and give me either y[0] or y[1] accordingly,
but neither appears to be happening.  Instead, I appear to be getting
y[:len(mask)][mask].

  In [27]: y[:10][mask]
  Out[27]: array([ 0.13440315,  0.83610533,  0.75390136,  0.79046615,
0.34776165])

  In [28]: y[mask]
  Out[28]: array([ 0.13440315,  0.83610533,  0.75390136,  0.79046615,
0.34776165])

  In [29]: len(y)
  Out[29]: 11

  In [30]: len(mask)
  Out[30]: 10

  In [31]: y[:len(mask)][mask]
  Out[31]: array([ 0.13440315,  0.83610533,  0.75390136,  0.79046615,
0.34776165])

  In [32]: np.__version__
  Out[32]: '1.2.0.dev5243'


Bug or feature?
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] numpy.save bug on solaris x86 w/ nans and objects

2008-05-20 Thread John Hunter
I have a record array w/ dates (O4) and floats.  If some of these
floats are NaN, np.save crashes (on my solaris platform but not on a
linux machine I tested on).  Here is the code that produces the bug:

In [1]: pwd
Out[1]: '/home/titan/johnh/python/svn/matplotlib/matplotlib/examples/data'

In [2]: import matplotlib.mlab as mlab

In [3]: import numpy as np

In [4]: r = mlab.csv2rec('aapl.csv')

In [5]: r.dtype
Out[5]: dtype([('date', '|O4'), ('open', 'f8'), ('high', 'f8'),
('low', 'f8'), ('close', 'f8'), ('volume', 'i4'), ('adj_close',
'f8')])

In [6]: r.close[100:] = np.nan

In [7]: r.close
Out[7]: array([ 124.63,  127.46,  129.4 , ..., NaN, NaN, NaN])

In [8]: np.save('mydata.npy', r)

Traceback (most recent call last):
 File "<ipython console>", line 1, in ?
 File /home/titan/johnh/dev/lib/python2.4/site-packages/numpy/lib/io.py,
line 158, in save
   format.write_array(fid, arr)
 File /home/titan/johnh/dev/lib/python2.4/site-packages/numpy/lib/format.py,
line 272, in write_array
   cPickle.dump(array, fp, protocol=2)
SystemError: frexp() result out of range


In [9]: np.__version__
Out[9]: '1.2.0.dev5136'

In [10]: !uname -a
SunOS flag 5.10 Generic_118855-15 i86pc i386 i86pc
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy.save bug on solaris x86 w/ nans and objects

2008-05-20 Thread John Hunter
On Tue, May 20, 2008 at 12:13 PM, Charles R Harris
[EMAIL PROTECTED] wrote:

 Looks like we need to add a test for this before release. But I'm off to
 work.

Here's a simpler example in case you want to wrap it in a test harness:

import datetime
import numpy as np

r = np.rec.fromarrays([
    [datetime.date(2007,1,1), datetime.date(2007,1,2), datetime.date(2007,1,2)],
    [.1, .2, np.nan],
    ], names='date,value')


np.save('mytest.npy', r)
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] bug in numpy.histogram?

2008-02-20 Thread John Hunter
We recently deprecated matplotlib.mlab.hist, and I am now hitting a
bug in numpy's histogram, which appears to be caused by the use of
'any', which does not exist in that namespace.  A small patch is
attached.  The example below exposes the bug:

Python 2.4.2 (#1, Feb 23 2006, 12:48:31)
Type "copyright", "credits" or "license" for more information.

IPython 0.8.3.svn.r2876 -- An enhanced Interactive Python.
? - Introduction and overview of IPython's features.
%quickref - Quick reference.
help  - Python's own help system.
object?   - Details about 'object'. ?object also works, ?? prints more.

In [1]: import numpy as np

In [2]: np.__file__
Out[2]: '/home/titan/johnh/dev/lib/python2.4/site-packages/numpy/__init__.pyc'

In [3]: np.__version__
Out[3]: '1.0.5.dev4814'

In [4]: x = np.random.randn(100)

In [5]: bins = np.linspace(x.min(), x.max(), 40)

In [6]: y = np.histogram(x, bins=bins)

Traceback (most recent call last):
  File "<ipython console>", line 1, in ?
  File 
/home/titan/johnh/dev/lib/python2.4/site-packages/numpy/lib/function_base.py,
line 155, in histogram
    if(any(bins[1:]-bins[:-1] < 0)):
NameError: global name 'any' is not defined
Index: numpy/lib/function_base.py
===================================================================
--- numpy/lib/function_base.py	(revision 4814)
+++ numpy/lib/function_base.py	(working copy)
@@ -152,7 +152,7 @@
         bins = linspace(mn, mx, bins, endpoint=False)
     else:
         bins = asarray(bins)
-        if(any(bins[1:]-bins[:-1] < 0)):
+        if((bins[1:]-bins[:-1] < 0).any()):
             raise AttributeError, 'bins must increase monotonically.'
 
     # best block size probably depends on processor cache size
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] major changes in matplotlib svn

2008-01-08 Thread John Hunter
Apologies for the off-topic post to the numpy list, but we have just
committed some potentially code-breaking changes to the matplotlib svn
repository, and we want to give as wide a notification to people as
possible.  Please do not reply on the numpy list, but rather on a
matplotlib mailing list.

Migrating to the new matplotlib codebase


Michael Droettboom has spent the last several months working on the
transforms branch of matplotlib, in which he rewrote from the ground
up the transformation infrastructure in matplotlib, which many found
unintuitive and hard to extend.  In addition to a cleaner code base,
the reorganization allows you to define your own transformations and
projections (e.g. map projections) within matplotlib.  He has merged his
work into the HEAD of the svn trunk, and this will be the basis for
future matplotlib releases.

If you are a svn user, we encourage you to continue using the trunk as
before, but with the understanding that you are now truly on the
bleeding edge.  Michael has made sure all the examples still pass with
the new code base, so for the vast majority of you, I expect to see
few problems.  But we need to get as many people as possible using the
new code base so we can find and fix the remaining problems.  We have
taken the svn code used in the last stable release in the 0.91 series,
and made it a maintenance branch so we can still fix bugs and support
people who are not ready to migrate to the new transformation
infrastructure but nonetheless need access to svn bug fixes.

Using the new code
==================

To check out the trunk with the latest transforms changes:

 svn co 
https://matplotlib.svn.sourceforge.net/svnroot/matplotlib/trunk/matplotlib

If you already have a working copy of the trunk, your next svn up will
include the latest transforms changes.

Before installing, make sure you completely remove the old matplotlib
build and install directories, eg:

 cd matplotlib
 sudo rm -rf build
 sudo rm -rf /usr/local/lib/python2.5/site-packages/matplotlib
 sudo python setup.py install

Using the old svn code
======================

To check out the maintenance branch, in order to commit bugfixes to 0.91.x:

 svn co https://matplotlib.svn.sourceforge.net/svnroot/matplotlib/branches/v0_91_maint matplotlib_0_91_maint

Any applicable bugfixes on the 0.91.x maintenance branch should be
merged into the trunk so they are fixed there as well.  Svnmerge.py
makes this process rather straightforward, but you may also manually
merge if you prefer.

Merging bugfixes on the maint branch to the trunk using svnmerge.py
-------------------------------------------------------------------

Download svnmerge.py from here:

  http://www.orcaware.com/svn/wiki/Svnmerge.py

From the working copy of the *trunk* (svnmerge.py always pulls *to*
the current working copy), run

svnmerge.py merge

to pull in changes from the maintenance branch.  Manually resolve any
conflicts, if necessary, test them, and then commit with

svn commit -F svnmerge-commit-message.txt

(Note the above will stop working when the maintenance branch is
abandoned.)

API CHANGES in the new transformation infrastructure


While Michael worked hard to keep the API mostly unchanged while
performing what has been called open heart surgery on matplotlib,
there have been some changes, as discussed below.

The primary goal of these changes was to make it easier to
extend matplotlib to support new kinds of projections.  This is
primarily an internal improvement, and the possible user-visible
changes it allows are yet to come.

These changes are detailed in the API_CHANGES document.
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] boolean masks lists

2007-11-06 Thread John Hunter
On Nov 6, 2007 8:22 AM, Lisandro Dalcin [EMAIL PROTECTED] wrote:
 Mmm...
 It looks as if 'mask' is being internally converted from
 [True, False, False, False, True]
 to
 [1, 0, 0, 0, 1]

Yep, clearly.  The question is: is this the desired behavior?  It
leads to a silent failure for people who are expecting sequences of
booleans to behave like arrays of booleans.

JDH
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] boolean masks lists

2007-11-05 Thread John Hunter
A colleague of mine just asked for help with a pesky bug that turned
out to be caused by his use of a list of booleans rather than an array
of booleans as his logical indexing mask.  I assume this is a feature
and not a bug, but it certainly surprised him:

In [58]: mask = [True, False, False, False, True]

In [59]: maska = n.array(mask, n.bool)

In [60]: x = arange(5)

In [61]: x[mask]
Out[61]: array([1, 0, 0, 0, 1])

In [62]: x[maska]
Out[62]: array([0, 4])
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] convolution and wiener khinchin

2007-10-25 Thread John Hunter
I am working on an example to illustrate convolution in the temporal
and spectral domains, using the property that a convolution in the
time domain is a multiplication in the fourier domain.  I am using
numpy.fft and numpy.convolve to compute the solution two ways, and
comparing them.  I am getting an error for small times in my Fourier
solution.  At first I thought this was because of edge effects, but I
see it even when I apply a windowing function.

Can anyone advise me about what I am doing wrong here?


In signal processing, the output of a linear system to an arbitrary
input is given by the convolution of the impulse response function (the
system response to a Dirac-delta impulse) and the input signal.

Mathematically:

  y(t) = \int_0^t x(\tau) r(t-\tau) d\tau


where x(t) is the input signal at time t, y(t) is the output, and r(t)
is the impulse response function.

In this exercise, we will investigate the convolution of a
white noise process with a double exponential impulse response
function, and compute the results 3 ways:

  * using numpy.convolve

  * in Fourier space using the property that a convolution in the
temporal domain is a multiplication in the Fourier domain


import numpy as npy
import matplotlib.mlab as mlab
from pylab import figure, show

# build the time, input, output and response arrays
dt = 0.01
t = npy.arange(0.0, 10.0, dt)   # the time vector from 0..10
Nt = len(t)

def impulse_response(t):
    'double exponential response function'
    return (npy.exp(-t) - npy.exp(-5*t))*dt

win = npy.hanning(Nt)
x = npy.random.randn(Nt)    # gaussian white noise
x = win*x
r = impulse_response(t)*dt  # evaluate the impulse function
r = win*r
y = npy.convolve(x, r, mode='full')  # convolution of x with r
y = y[:Nt]

# plot t vs x, t vs y and t vs r in three subplots
fig = figure()
ax1 = fig.add_subplot(311)
ax1.plot(t, x)
ax1.set_ylabel('input x')

ax2 = fig.add_subplot(312)
ax2.plot(t, y, label='convolve')
ax2.set_ylabel('output y')


ax3 = fig.add_subplot(313)
ax3.plot(t, r)
ax3.set_ylabel('input response')
ax3.set_xlabel('time (s)')


# compute y via numerical integration of the convolution equation
F = npy.fft.fft(r)
X = npy.fft.fft(x)
Y = F*X
yi = npy.fft.ifft(Y).real
ax2.plot(t, yi, label='fft')
ax2.legend(loc='best')

show()
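
(One explanation I have considered -- an assumption on my part, not
verified -- is that multiplying the FFTs implements a *circular*
convolution, so the tail wraps around into small times; zero-padding
both transforms to the full linear-convolution length should make the
comparison fair.  Appended to the script above:)

n = 2*Nt - 1                              # linear convolution length
Yp = npy.fft.fft(r, n)*npy.fft.fft(x, n)  # zero-padded transforms
yp = npy.fft.ifft(Yp).real[:Nt]
ax2.plot(t, yp, label='fft, zero-padded')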

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] adding field to rec array

2007-10-05 Thread John Hunter
On 9/26/07, Robert Kern [EMAIL PROTECTED] wrote:

 Here is the straightforward way:

 In [15]: import numpy as np

 In [16]: dt = np.dtype([('foo', int), ('bar', float)])

 In [17]: r = np.zeros((3,3), dtype=dt)

Here is a (hopefully) simple question.  If I create an array like
this, how can I efficiently convert it to a record array which lets me
do r.attr in addition to r['attr']?  I'm pretty addicted to the former
syntax.

In [114]: dt = np.dtype([('foo', int), ('bar', float)])

In [115]: r = np.zeros((3,3), dtype=dt)

In [116]: r.dtype
Out[116]: dtype([('foo', 'i4'), ('bar', 'f8')])

In [117]: r['foo']
Out[117]:
array([[0, 0, 0],
   [0, 0, 0],
   [0, 0, 0]])

In [118]: r.foo

Traceback (most recent call last):
  File "<ipython console>", line 1, in ?
AttributeError: 'numpy.ndarray' object has no attribute 'foo'


Thanks,
JDH
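
(For reference, one approach that should work, using the array defined
above -- a sketch, untested in this exact session:)

r2 = r.view(np.recarray)   # reinterpret the same data, no copy
print r2.foo               # attribute access now works
print r2['foo']            # item access still works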
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] corrcoef

2007-09-06 Thread John Hunter
Is it desirable that numpy.corrcoef for two arrays returns a 2x2 array
rather than a scalar?

In [10]: npy.corrcoef(npy.random.rand(10), npy.random.rand(10))
Out[10]:
array([[ 1., -0.16088728],
   [-0.16088728,  1.]])


I always end up extracting the 0,1 element anyway.  What is the
advantage, aside from backwards compatibility, of returning a 2x2?
JDH
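
(For reference, pulling out the scalar is a one-liner -- a sketch with
throwaway data:)

import numpy as npy
a = npy.random.rand(10)
b = npy.random.rand(10)
rho = npy.corrcoef(a, b)[0, 1]   # the off-diagonal entry is the correlation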
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Maskedarray implementations

2007-08-25 Thread John Hunter
On 8/24/07, Travis Oliphant [EMAIL PROTECTED] wrote:
 I like the direction of this work.  For me, the biggest issue is whether
 or not matplotlib (and other code depending on numpy.ma) works with it.
 I'm pretty sure this can be handled and so, I'd personally like to see it.

mpl already supports it (both ma and maskedarray, via a config
setting) and we would be very happy to move to just maskedarray so we
don't have to support both.  Eric Firing added support for this a
couple of months back.  Things like support for masked record arrays
are a big incentive for me to use maskedarray.

JDH
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] convert csv file into recarray without pre-specifying dtypes and variable names

2007-07-06 Thread John Hunter
On 7/6/07, Vincent Nijs [EMAIL PROTECTED] wrote:
 I wrote the attached (small) program to read in a text/csv file with
 different data types and convert it into a recarray without having to
 pre-specify the dtypes or variable names. I am just too lazy to type in
 stuff like that :) The supported types are int, float, dates, and strings.

 It works pretty well, but it is not (yet) as fast as I would like, so I was
 wondering if any of the numpy experts on this list might have some suggestions
 on how to speed it up. I need to read 500MB-1GB files, so speed is important
 for me.

In matplotlib.mlab svn, there is a function csv2rec that does the
same.  You may want to compare implementations in case we can
fruitfully cross-pollinate them.  In the examples directory, there is
an example script, examples/loadrec.py.
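
A minimal usage sketch (assuming mpl svn; 'mydata.csv' is a
hypothetical file with a header row naming the columns):

import matplotlib.mlab as mlab
r = mlab.csv2rec('mydata.csv')   # column names and dtypes are inferred
print r.dtype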
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] masked arrays and record arrays

2007-06-14 Thread John Hunter
On 6/13/07, Pierre GM [EMAIL PROTECTED] wrote:

 Have you tried mrecords, in the alternative maskedarray package available on
 the scipy SVN?  It should support masked fields (as opposed to masked
 records in numpy.core.ma).  If not, would you mind giving it a test and
 letting me know your suggestions?
 Thanks a lot in advance for any inputs.

I would be happy to try this out -- do you happen to have an example
that shows how to set the masks on the individual fields?

JDH
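
(A guess at the sort of thing, sketched against the mrecords API that
later landed in numpy.ma -- the scipy-SVN package should look similar;
untested:)

import numpy as np
import numpy.ma as ma
from numpy.ma.mrecords import fromarrays

mr = fromarrays([np.arange(3), np.array([1.0, 2.0, 3.0])],
                names='foo,bar')
mr.foo[1] = ma.masked    # mask one element of one field only
print mr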
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] attribute names on record arrays

2007-06-13 Thread John Hunter
I find myself using record arrays more and more, and a feature that is missing
is the ability to do tab completion on attribute names in ipython,
presumably because you are using a dict under the hood and __getattr__
to resolve

o.key

where o is a record array and key is a field name.

How hard would it be to populate __dict__ with the attribute names so
we could tab complete on them?

Thanks,
JDH
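
(One possible hook, as a sketch: on Pythons that honor the __dir__
protocol, a small recarray subclass -- TabRecArray is a hypothetical
name -- could advertise its field names without populating __dict__:)

import numpy as np

class TabRecArray(np.recarray):
    def __dir__(self):
        # expose the field names alongside the normal attributes
        return sorted(set(dir(type(self)) + list(self.dtype.names or ())))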
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] build advice

2007-05-31 Thread John Hunter
A colleague of mine is trying to update our production environment
with the latest releases of numpy, scipy, mpl and ipython, and is
worried about the lag time when there is a new numpy and old scipy,
etc... as the build progresses.  This is the scheme he is considering,
which looks fine to me, but I thought I would bounce it off the list
here in case anyone has confronted or thought about this problem
before.

Alternatively, an install to a tmp dir and then a bulk cp -r should work, no?

JDH

 We're planning to put out a bugfix
 matplotlib point release to 0.90.1 -- can you hold off on the mpl
 install for a day or so?

Sure.  While I have your attention, do you think this install scheme
would work?  It's the body of an email I just sent to c.l.py.


At work I need to upgrade numpy, scipy, ipython and matplotlib.  They need
to be done all at once.  All have distutils setups but the new versions and
the old versions are incompatible with one another as a group because
numpy's APIs changed.  Ideally, I could just do something like

cd ~/src
cd numpy
python setup.py install
cd ../scipy
python setup.py install
cd ../matplotlib
python setup.py install
cd ../ipython
python setup.py install

however, even if nothing goes awry it leaves me with a fair chunk of time
where the state of the collective system is inconsistent (new numpy, old
everything else).  I'm wondering...  Can I stage the installs to a different
directory tree like so:

# distutils appends lib/python2.4/site-packages under the prefix,
# so point --prefix at the tree root, not at site-packages itself
export PREFIX=$HOME/local
export PYTHONPATH=$PREFIX/lib/python2.4/site-packages
cd ~/src
cd numpy
python setup.py install --prefix=$PREFIX
cd ../scipy
python setup.py install --prefix=$PREFIX
cd ../matplotlib
python setup.py install --prefix=$PREFIX
cd ../ipython
python setup.py install --prefix=$PREFIX

That would get them all built as a cohesive set.  Then I'd repeat the
installs without PYTHONPATH:

unset PYTHONPATH
cd ~/src
cd numpy
python setup.py install
cd ../scipy
python setup.py install
cd ../matplotlib
python setup.py install
cd ../ipython
python setup.py install

Presumably the compilation (the time-consuming part) is all
location-independent, so the second time the build_ext part should be fast.

Can anyone comment on the feasibility of this approach?  I guess what I'm
wondering is what dependencies there are on the installation directory.
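
(One cheap sanity check between stages, as a sketch -- ask each
package where it was imported from and which version won:)

import numpy, scipy, matplotlib, IPython
for mod in (numpy, scipy, matplotlib, IPython):
    print mod.__name__, mod.__version__, mod.__file__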
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] build advice

2007-05-31 Thread John Hunter
On 5/31/07, Matthew Brett [EMAIL PROTECTED] wrote:
 Hi,

  That would get them all built as a cohesive set.  Then I'd repeat the
  installs without PYTHONPATH:

 Is that any different from:
 cd ~/src
 cd numpy
 python setup.py build
 cd ../scipy
 python setup.py build

Well, the scipy and mpl builds need to see the new numpy build, I
think that is the issue.
JDH
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] pretty printing record array element - datetime

2007-05-15 Thread John Hunter
I have a numpy record array and I want to pretty print a single
element.  I was trying to loop over the names in the element dtype and
use getattr to access the field value, but I got fouled up because
getattr is trying to access the dtype attribute of one of the python
objects (datetime.date) that I am storing in the record array.  The
problem field in the example below is ('fiscaldate', '|O4') which is a
python datetime.date object.

I subsequently discovered that I can simply use tr[name] rather than
getattr(tr, name), but wanted to post this in case it is a bug.

In other news, would it be possible to add a pprint method to record
array elements that did something like the pprint_element below?  It
would be nice to be able to do tr.pprint() and get nice output which
shows key/value pairs.

Thanks,
JDH

===
# the buggy code

In [33]: type(thisr)
Out[33]: <class 'numpy.core.records.recarray'>

In [34]: tr = thisr[14]

In [35]: tr
Out[35]: ('CDE', 'Y2002', datetime.date(2002, 12, 31),
datetime.date(2002, 3, 18), datetime.date(2003, 3, 18),
datetime.date(2004, 3, 18), datetime.date(2005, 3, 18), 'CDE', 'A',
767.0, 126.95, 85.9440003, -81.2079998, -8.484,
4.9586, -28.071, 33.031,
415.964997, 184.334)

In [36]: tr.dtype
Out[36]: dtype([('id', '|S10'), ('period', '|S5'), ('fiscaldate',
'|O4'), ('datepy1', '|O4'), ('reportdate', '|O4'), ('dateny1', '|O4'),
('dateny2', '|O4'), ('ticker', '|S6'), ('exchange', '|S1'),
('employees', 'f8'), ('marketcap', 'f8'), ('sales', 'f8'),
('income', 'f8'), ('cashflowops', 'f8'), ('returnunhedged', 'f8'),
('returnq', 'f8'), ('returnhedgedprior', 'f8'),
('returnhedgednext1', 'f8'), ('returnhedgednext2', 'f8')])

In [37]: for name in tr.dtype.names:
   ....:     print name, getattr(tr, name)
   ....:
id CDE
period Y2002
fiscaldate
Traceback (most recent call last):
  File "<ipython console>", line 2, in ?
  File "/home/titan/johnh/dev/lib/python2.4/site-packages/numpy/core/records.py",
line 138, in __getattribute__
    if obj.dtype.fields:
AttributeError: 'datetime.date' object has no attribute 'dtype'


In [39]: numpy.__version__
Out[39]: '1.0.3.dev3728'


==

# the good code (the colons are aligned in the output in an ASCII terminal)

def pprint_element(tr):
    names = tr.dtype.names
    maxlen = max([len(name) for name in names])
    rows = []
    fmt = '%% %ds: %%s' % maxlen
    for name in names:
        rows.append(fmt % (name, tr[name]))
    return '\n'.join(rows)



In [49]: print pprint_element(tr)
               id: CDE
           period: Y2002
       fiscaldate: 2002-12-31
          datepy1: 2002-03-18
       reportdate: 2003-03-18
          dateny1: 2004-03-18
          dateny2: 2005-03-18
           ticker: CDE
         exchange: A
        employees: 767.0
        marketcap: 126.95
            sales: 85.944
           income: -81.208
      cashflowops: -8.484
   returnunhedged: 4.959
          returnq: -28.072
returnhedgedprior: 33.03
returnhedgednext1: 415.965
returnhedgednext2: 184.334
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] pretty print array

2007-03-03 Thread John Hunter
I have a numpy array of floats, and would like an easy way of
specifying the format string when printing the array, eg

print x.pprint('%1.3f')

would do the normal repr of the array but using my format string for
the individual elements.  Is there an easy way to get something like
this currently?

Thanks,
JDH
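
(In the meantime, the closest built-in knob is the global print
options -- a sketch; note set_printoptions affects all subsequent
printing, not a single call:)

import numpy as npy
npy.set_printoptions(precision=3, suppress=True)
print npy.random.rand(4)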
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Profiling numpy ? (parts written in C)

2006-12-20 Thread John Hunter
>>>>> "David" == David Cournapeau [EMAIL PROTECTED] writes:

David Of this 300 ms spent in Colormap functor, 200 ms are taken
David by the take function: this is the function which I think
David can be sped up considerably.

Sorry I had missed this in the previous conversations.  It is
impressive that take is taking such a big chunk of the __call__ time,
because there is a lot of other stuff going on in that function!

You might want to run this against numarray for comparison -- in a few
instances Travis has been able to find some big wins by borrowing from
numarray.


JDH
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Profiling numpy ? (parts written in C)

2006-12-19 Thread John Hunter
>>>>> "David" == David Cournapeau [EMAIL PROTECTED] writes:
David At the end, in the original context (speeding the drawing
David of spectrograms), this is the problem. Even if multiple
David backends/toolkits obviously have an impact on performance,
David I really don't see why a numpy function to convert an array
David to an RGB representation should be 10-20 times slower than
David matlab on the same machine.

This isn't exactly right.  When matplotlib converts a 2D grayscale
array to rgba, a lot goes on under the hood.  It's all numpy, but it's
far from a single function, and it involves many passes through the
data.  In principle, this could be done with one or two passes through
the data.  In practice, our normalization and colormapping abstractions
are so abstract that it is difficult (though not impossible) to
special case and optimize.

The top-level routine is

def to_rgba(self, x, alpha=1.0):
    '''Return a normalized rgba array corresponding to x.
    If x is already an rgb or rgba array, return it unchanged.
    '''
    if hasattr(x, 'shape') and len(x.shape) > 2: return x
    x = ma.asarray(x)
    x = self.norm(x)
    x = self.cmap(x, alpha)
    return x

which implies at a minimum two passes through the data, one for norm
and one for cmap.

In 99% of the use cases, cmap is a LinearSegmentedColormap though
users can define their own as long as it is callable.  My guess is
that the expensive part is Colormap.__call__, the base class for
LinearSegmentedColormap.  We could probably write some extension code
that does the following routine in one pass through the data.  But it
would be hairy.  In a quick look and rough count, I see about 10
passes through the data in the function below.

If you are interested in optimizing colormapping in mpl, I'd start
here.  I suspect there may be some low hanging fruit.

def __call__(self, X, alpha=1.0):
    """
    X is either a scalar or an array (of any dimension).
    If scalar, a tuple of rgba values is returned, otherwise
    an array with the new shape = oldshape+(4,). If the X-values
    are integers, then they are used as indices into the array.
    If they are floating point, then they must be in the
    interval (0.0, 1.0).
    Alpha must be a scalar.
    """
    if not self._isinit: self._init()
    alpha = min(alpha, 1.0) # alpha must be between 0 and 1
    alpha = max(alpha, 0.0)
    self._lut[:-3, -1] = alpha
    mask_bad = None
    if not iterable(X):
        vtype = 'scalar'
        xa = array([X])
    else:
        vtype = 'array'
        xma = ma.asarray(X)
        xa = xma.filled(0)
        mask_bad = ma.getmask(xma)
    if typecode(xa) in typecodes['Float']:
        putmask(xa, xa==1.0, 0.999) # Treat 1.0 as slightly less than 1.
        xa = (xa * self.N).astype(Int)
    # Set the over-range indices before the under-range;
    # otherwise the under-range values get converted to over-range.
    putmask(xa, xa > self.N-1, self._i_over)
    putmask(xa, xa < 0, self._i_under)
    if mask_bad is not None and mask_bad.shape == xa.shape:
        putmask(xa, mask_bad, self._i_bad)
    rgba = take(self._lut, xa)
    if vtype == 'scalar':
        rgba = tuple(rgba[0,:])
    return rgba



 

David I will take into account all those helpful messages, and
David hopefully come up with something by the end of the week :),

David cheers

David David

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion