Re: [Numpy-discussion] Bytes vs. Unicode in Python3

2009-12-06 Thread Francesc Alted
On Saturday 05 December 2009 11:16:55, Dag Sverre Seljebotn wrote:
  Mmh, the only case of dtype *mutability* that I'm aware of is changing
  the names of compound types:
 
  In [19]: t = np.dtype('i4,f4')
 
  In [20]: t
  Out[20]: dtype([('f0', 'i4'), ('f1', 'f4')])
 
  In [21]: hash(t)
  Out[21]: -9041335829180134223
 
  In [22]: t.names = ('one', 'other')
 
  In [23]: t
  Out[23]: dtype([('one', 'i4'), ('other', 'f4')])
 
  In [24]: hash(t)
  Out[24]: 8637734220020415106
 
  Perhaps this should be marked as a bug?  I'm not sure about that, because
  the above seems quite useful.
 
 Well, I for one don't like this, but that's just an opinion. I think it
 is unwise to leave an object which supports hash() mutable, because it
 makes it too easy to create hard-to-find bugs (sticking a dtype into a
 dict as a key is rather useful in many situations). There's a certain
 tradition in Python of leaving types immutable where possible, and
 dtype certainly feels like one of them.

Yes, I think you are right: forcing dtype to be immutable would be best.
As a bonus, an immutable dtype would render this ticket:

http://projects.scipy.org/numpy/ticket/1127

without effect.
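
To make the hazard concrete, here is a minimal sketch (assuming only the
behaviour shown in the session above): once a dtype has been used as a
dict key, renaming its fields changes its hash and strands the entry.

import numpy as np

t = np.dtype('i4,f4')
cache = {t: 'compiled kernel for i4,f4'}   # dtype used as a dict key

t.names = ('one', 'other')   # in-place mutation changes hash(t)

print(t in cache)   # False -- the cached entry is now unreachable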

-- 
Francesc Alted
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Chararray deprecated?

2009-12-06 Thread Ralf Gommers
On Sun, Dec 6, 2009 at 12:13 PM, Gael Varoquaux 
gael.varoqu...@normalesup.org wrote:

 http://docs.scipy.org/doc/numpy/reference/generated/numpy.chararray.html
 says that chararray is deprecated. I think I saw a discussion on the
 mailing list that hinted otherwise. Which one is true? Should I correct
 the docs?

You're right, after Mike's fixes that note should have been changed. I
thought Mike had also proposed an alternative text, but I can't find it
right now. So feel free to change it.

Also keep in mind the following (from another of Mike's emails):
All vectorized string operations are now available as regular functions in
the numpy.char namespace.  Usage of the chararray view class is only
recommended for numarray backward compatibility.
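
For example, a small sketch of the two spellings (np.char.upper and
np.char.startswith are among the vectorized operations):

import numpy as np

a = np.array(['numpy', 'char', 'demo'])

# New style: free functions in numpy.char work on ordinary arrays.
print(np.char.upper(a))            # ['NUMPY' 'CHAR' 'DEMO']
print(np.char.startswith(a, 'c'))  # [False  True False]

# Old style: the chararray view, kept for numarray compatibility only.
print(a.view(np.chararray).upper())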

Cheers,
Ralf
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] histogram for discrete data

2009-12-06 Thread Ernest Adrogué
Hi,

A few weeks ago there was a discussion about a
histogram_discrete() function --sorry for starting a new
thread but I have lost the mails.

Somebody pointed out that bincount() already can be used
to histogram discrete data (except that it doesn't work
with negative values).

I have just discovered a function in scipy.stats called
itemfreq() that does handle negative values.

In [17]: scipy.stats.itemfreq([-1,-1,0,5])
Out[17]: 
array([[-1.,  2.],
       [ 0.,  1.],
       [ 5.,  1.]])

Bye.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] histogram for discrete data

2009-12-06 Thread josef . pktd
2009/12/6 Ernest Adrogué eadro...@gmx.net:
 Hi,

 A few weeks ago there was a discussion about a
 histogram_discrete() function --sorry for starting a new
 thread but I have lost the mails.

 Somebody pointed out that bincount() already can be used
 to histogram discrete data (except that it doesn't work
 with negative values).

 I have just discovered a function in scipy.stats called
 itemfreq() that does handle negative values.

 In [17]: scipy.stats.itemfreq([-1,-1,0,5])
 Out[17]:
 array([[-1.,  2.],
        [ 0.,  1.],
        [ 5.,  1.]])

bincount is a fast C function; stats.itemfreq uses a slow Python loop,
so the latter will be very slow for large arrays.

stats.itemfreq also works on floats, but not on strings (which is probably a bug).
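
A sketch of the fast path for negative integers: shift the data to meet
bincount's non-negativity requirement, count in C, then shift the bin
labels back (int_itemfreq is just an illustrative name):

import numpy as np

def int_itemfreq(x):
    # itemfreq-style (value, count) pairs for integer data,
    # with the counting loop in C via bincount.
    x = np.asarray(x)
    offset = x.min()
    counts = np.bincount(x - offset)
    values = np.arange(len(counts)) + offset
    keep = counts > 0
    return np.column_stack((values[keep], counts[keep]))

print(int_itemfreq([-1, -1, 0, 5]))
# [[-1  2]
#  [ 0  1]
#  [ 5  1]]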

Josef


 Bye.
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] np.equal

2009-12-06 Thread josef . pktd
What's the difference in the implementation between np.equal and == ?
np.equal returns NotImplemented for strings, while == works.

>>> aa
array(['a', 'b', 'a', 'aa', 'a'],
      dtype='|S2')

>>> aa == 'a'
array([ True, False,  True, False,  True], dtype=bool)
>>> np.equal(aa,'a')
NotImplemented


>>> np.equal(np.arange(5),1)
array([False,  True, False, False, False], dtype=bool)
>>> np.equal(np.arange(5),'a')
NotImplemented
>>> np.arange(5) == 'a'
False

Josef
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Py3 merge

2009-12-06 Thread Darren Dale
On Sat, Dec 5, 2009 at 10:54 PM, David Cournapeau courn...@gmail.com wrote:
 On Sun, Dec 6, 2009 at 9:41 AM, Pauli Virtanen p...@iki.fi wrote:
 Hi,

 I'd like to commit my Py3 Numpy branch to SVN trunk soon:

        http://github.com/pv/numpy-work/commits/py3k

 Awesome - I think we should merge this ASAP. In particular, I would
 like to start fixing platform-specific issues.

 Concerning nose, will there be any version which works on both py2 and py3?

There is a development branch for python-3 here:

svn checkout http://python-nose.googlecode.com/svn/branches/py3k

Darren
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Py3 merge

2009-12-06 Thread Pauli Virtanen
Sun, 2009-12-06 at 12:54 +0900, David Cournapeau wrote:
 On Sun, Dec 6, 2009 at 9:41 AM, Pauli Virtanen p...@iki.fi wrote:
  Hi,
 
  I'd like to commit my Py3 Numpy branch to SVN trunk soon:
 
 http://github.com/pv/numpy-work/commits/py3k
 
 Awesome - I think we should merge this ASAP. In particular, I would
 like to start fixing platform-specific issues.

Ok, the whole shebang is now in.

 Concerning nose, will there be any version which works on both py2 and py3?

No idea. (Though there's a separate Py3 branch.)

Pauli



___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Chararray deprecated?

2009-12-06 Thread Gael Varoquaux
On Sun, Dec 06, 2009 at 01:11:24PM +0100, Ralf Gommers wrote:

http://docs.scipy.org/doc/numpy/reference/generated/numpy.chararray.html
 says that chararray is deprecated. I think I saw a discussion on the
 mailing list that hinted otherwise. Which one is true? Should I correct
 the docs?

You're right, after Mike's fixes that note should have been changed. I
thought Mike had also proposed an alternative text, but I can't find it
right now. So feel free to change it.

Also keep in mind the following (from another of Mike's emails):
All vectorized string operations are now available as regular functions
in the numpy.char namespace.

Excellent. I tweaked the text a bit to make it clearer:
http://docs.scipy.org/numpy/docs/numpy.core.defchararray.chararray/

I made it clear that the replacements are only present starting with numpy 1.4.
I'd love a review and an 'OK to apply'.

By the way, this switch is a clear improvement over chararrays, IMHO.
Congratulations to the pack and Mike for identifying use cases and
giving a good implementation to answer them.

Gaël
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Some incompatibilities in numpy trunk

2009-12-06 Thread Gael Varoquaux
I have a lot of code that has stopped working with my latest SVN pull to
numpy.

* Some compiled code yields an error looking like (from memory):

incorrect type 'numpy.ndarray'

Rebuilding it is sufficient.

* I had some code doing:

hashlib.md5(x).hexdigest()

where x is a numpy array. I had to replace it by:

hashlib.md5(np.getbuffer(x)).hexdigest()

* Finally, I had the following failure:

  /home/varoquau/dev/enthought/ets/Mayavi_3.1.0/enthought/tvtk/array_handler.pyc
  in array2vtk(num_array, vtk_array)
  --> 298 result_array.SetVoidArray(z_flat, len(z_flat), 1)

  TypeError: argument 1 must be string or read-only buffer, not
  numpy.ndarray

  I can solve the problem using:

result_array.SetVoidArray(numpy.getbuffer(z_flat), len(z_flat), 1)

However, I am wondering: is this some incompatibility that has been
introduced by mistake? I find it a bit strange that a '.x' release
induces so much breakage, and I am afraid that it won't be popular
amongst our users.
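
For completeness, a sketch of both hashing workarounds (tostring()
copies the data; np.getbuffer avoids the copy but is Python-2-only):

import hashlib
import numpy as np

x = np.arange(10)

# Copying workaround, works everywhere:
print(hashlib.md5(x.tostring()).hexdigest())

# Zero-copy variant from the report above (NumPy on Python 2 only):
# print(hashlib.md5(np.getbuffer(x)).hexdigest())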

Cheers,

Gaël
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Some incompatibilities in numpy trunk

2009-12-06 Thread Gael Varoquaux
On Sun, Dec 06, 2009 at 04:07:16PM +0200, Pauli Virtanen wrote:
 Sun, 2009-12-06 at 14:53 +0100, Gael Varoquaux wrote:
  I have a lot of code that has stopped working with my latest SVN pull to
  numpy.

 Which SVN revision? Before or after the Py3K commits?
 Note that the trunk is currently aiming at 1.5.x, code for 1.4.x is in a
 branch.

Trunk. I had indeed forgotten that we were in 1.5.x now.


  I find it a bit strange that a '.x' release induces so much breakage,
  and I am afraid that it won't be popular amongst our users.

 Well, there's still a lot of time to fix these issues before 1.5.0 is
 out. Just file bug tickets for each one :)

OK, cool. Glad to see that we are on the same page. I will file tickets.

Gaël
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Some incompatibilities in numpy trunk

2009-12-06 Thread Pauli Virtanen
Sun, 2009-12-06 at 15:11 +0100, Gael Varoquaux wrote:
 On Sun, Dec 06, 2009 at 04:07:16PM +0200, Pauli Virtanen wrote:
[clip]
   I find it a bit strange that a '.x' release induces so much breakage,
   and I am afraid that it won't be popular amongst our users.
 
  Well, there's still a lot of time to fix these issues before 1.5.0 is
  out. Just file bug tickets for each one :)
 
 OK, cool. Glad to see that we are on the same page. I will file tickets.

Great, thanks!

The bugs you see, btw, point out holes in our test suite...

Pauli



___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Some incompatibilities in numpy trunk

2009-12-06 Thread Gael Varoquaux
On Sun, Dec 06, 2009 at 04:14:55PM +0200, Pauli Virtanen wrote:
 Sun, 2009-12-06 at 15:11 +0100, Gael Varoquaux wrote:
  On Sun, Dec 06, 2009 at 04:07:16PM +0200, Pauli Virtanen wrote:
 [clip]
I find it a bit strange that a '.x' release induces so much breakage,
and I am afraid that it won't be popular amongst our users.

   Well, there's still a lot of time to fix these issues before 1.5.0 is
   out. Just file bug tickets for each one :)

  OK, cool. Glad to see that we are on the same page. I will file tickets.

http://projects.scipy.org/numpy/ticket/1312

 The bugs you see, btw, point out holes in our test suite...

Well, the hashlib one is easy to add as a test. The other one is harder.

Cheers,

Gaël
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Chararray deprecated?

2009-12-06 Thread Ralf Gommers
On Sun, Dec 6, 2009 at 2:10 PM, Gael Varoquaux 
gael.varoqu...@normalesup.org wrote:


 Excellent. I tweaked the text a bit to make it clearer:
 http://docs.scipy.org/numpy/docs/numpy.core.defchararray.chararray/

 I made it clear that the replacements are only present starting with numpy 1.4.
 I'd love a review and an 'OK to apply'.

 Looks good, I copied the changes to the notes in defchararray and
arrays.classes.rst, and toggled OK to apply.

Cheers,
Ralf
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Help to port numpy to python3?

2009-12-06 Thread Xavier Gnata
Hi,

Is there a way to help to port numpy to python3?
I don't think I have time to rewrite some code, but I can test whatever 
has to be tested.
Is there an official web page showing the status of this port? Same 
question for scipy?
It is already nice to see that the last numpy version is compatible with 
python2.6 :)

Xavier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Help to port numpy to python3?

2009-12-06 Thread Pauli Virtanen
Sun, 2009-12-06 at 15:37 +0100, Xavier Gnata wrote:
 Is there a way to help to port numpy to python3?

If you want to write some code, check
http://projects.scipy.org/numpy/browser/trunk/doc/Py3K.txt

 I don't think I have time to rewrite some code, but I can test whatever 
 has to be tested. Is there an official web page showing the status of
 this port?

Otherwise, you can help by:

1) Build Numpy SVN on Python 2.6

   Run all kinds of software that use Numpy, and see if there are new
   bugs as compared to Numpy 1.4.0 or 1.3.0.

   The Py3 transition involves a large amount of changes in the C code,
   and it's easy to miss out some subtle issues.

2) Figure out how to test the PEP 3118 buffer interface on Python 2.6
   and Python 3.1

   Write unit tests for it.
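
   A minimal sketch of such a test, assuming the PEP 3118 exporter is
   hooked up (memoryview exists on both Python 2.6 and 3.1):

   import numpy as np

   def test_pep3118_roundtrip():
       x = np.arange(5, dtype=np.int32)
       m = memoryview(x)                    # exercises the new exporter
       assert m.itemsize == x.itemsize
       assert m.ndim == 1 and m.shape == (5,)
       y = np.frombuffer(m, dtype=x.dtype)  # re-import via the buffer
       assert (x == y).all()

   test_pep3118_roundtrip()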

The Py3K.txt is pretty much the status report as we have now.

 Same question for scipy?

Work on Scipy can begin only after most of Numpy works on Py3.

-- 
Pauli Virtanen



___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] non-standard standard deviation

2009-12-06 Thread Colin J. Williams


On 04-Dec-09 10:54 AM, Bruce Southey wrote:
 On 12/04/2009 06:18 AM, yogesh karpate wrote:
 @ Pauli and @ Colin:
 Sorry for the late reply. I was busy with some other assignments.
 # As far as normalization by (n) is concerned, it is a common
 assumption that the population is normally distributed and the
 population size is large enough to fit the normal distribution. But
 this standard deviation, when applied to a small population, tends to
 be too low; therefore it is called biased.
 # The correction known as Bessel's correction exists for small-sample
 std. deviation, i.e. normalization by (n-1).
 # In Electrical and Electronic Measurements and Instrumentation by
 A.K. Sawhney, in the first chapter, Fundamentals of Measurements, it
 is shown that for N=16 the std. deviation normalization is (n-1)=15.
 # While I was learning statistics, my course instructor would advise
 taking n=20 for normalization by (n-1).
 # Probability and Statistics from the Schaum's Outline Series is good reading.
 Regards
 ~ymk


 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

 Hi,
 Basically, all that I see with these arbitrary values is that you are 
 relying on the 'central limit theorem' 
 (http://en.wikipedia.org/wiki/Central_limit_theorem).  Really the 
 issue in using these values is how much statistical bias you will 
 tolerate, especially in its impact on the usage of that estimate, 
 because uses of the variance (such as in statistical tests) tend to 
 be more influenced by bias than the estimate of the variance itself. 
 (Of course, many features rely on asymptotic properties, so bias 
 concerns are less apparent in large sample sizes.)

 Obviously the default relies on the developer's background and 
 requirements. There are multiple valid variance estimators in 
 statistics with different denominators, like N (maximum likelihood 
 estimator), N-1 (restricted maximum likelihood estimator and certain 
 Bayesian estimators) and Stein's 
 (http://en.wikipedia.org/wiki/James%E2%80%93Stein_estimator). So the 
 current default behavior is valid and documented. Consequently you 
 cannot just have one option or different functions (like certain 
 programs do), and Numpy's implementation actually allows you to do all 
 of these in a single function. So I also see no reason to change, even 
 if I have to add the ddof=1 argument; after all, 'Explicit is better 
 than implicit' :-).

 Bruce
Bruce,

I suggest that the Central Limit Theorem is tied in with the Law of 
Large Numbers.

When one has a smallish sample size, what gives the best estimate of the 
variance?  The Bessel correction provides a rationale, based on 
expectations: (http://en.wikipedia.org/wiki/Bessel%27s_correction).
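
The expectation argument is easy to check numerically; a sketch using
np.var's ddof argument (ddof=0 divides by n, ddof=1 by n-1):

import numpy as np

rng = np.random.RandomState(0)
n, trials = 5, 100000                 # small samples, true variance 1.0
samples = rng.standard_normal((trials, n))

print(np.var(samples, axis=1, ddof=0).mean())  # ~0.8 = (n-1)/n, biased low
print(np.var(samples, axis=1, ddof=1).mean())  # ~1.0, unbiased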

It is difficult to understand the proof of Stein: 
http://en.wikipedia.org/wiki/Proof_of_Stein%27s_example

The symbols used are not clearly stated.  He seems interested in a 
decision rule for the calculation of the mean of a sample and claims 
that his approach is better than the traditional Least Squares approach.

In most cases, the interest is likely to be in the variance, with a view 
to establishing a confidence interval.

In the widely used Analysis of Variance (ANOVA), the degrees of freedom 
are reduced for each mean estimated, see:
http://www.mnstate.edu/wasson/ed602lesson13.htm for the example below:

*Analysis of Variance Table*

Source of Variation    Sum of Squares    Degrees of Freedom    Mean Square    F Ratio    p
Between Groups         25.20             2                     12.60          5.178      .05
Within Groups          29.20             12                    2.43
Total                  54.40             14




There is a sample of 15 observations, which is divided into three 
groups, depending on the number of hours of therapy.
Thus, the Total degrees of freedom are 15-1 = 14,  the Between Groups 
3-1 = 2 and the Residual is 14 - 2 = 12.
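
The table's arithmetic can be verified in a few lines (values as given
above; the F ratio is the ratio of the mean squares):

ss_between, df_between = 25.20, 3 - 1    # 3 groups
ss_within,  df_within  = 29.20, 14 - 2   # total df 14, between df 2
ms_between = ss_between / df_between     # 12.60
ms_within  = ss_within / df_within       # ~2.433
print(ms_between / ms_within)            # ~5.178, the F ratio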

Colin W.

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] non-standard standard deviation

2009-12-06 Thread josef . pktd
On Sun, Dec 6, 2009 at 11:01 AM, Colin J. Williams c...@ncf.ca wrote:


 On 04-Dec-09 10:54 AM, Bruce Southey wrote:
 On 12/04/2009 06:18 AM, yogesh karpate wrote:
 @ Pauli and @ Colin:
 Sorry for the late reply. I was busy with some other assignments.
 # As far as normalization by (n) is concerned, it is a common
 assumption that the population is normally distributed and the
 population size is large enough to fit the normal distribution. But
 this standard deviation, when applied to a small population, tends to
 be too low; therefore it is called biased.
 # The correction known as Bessel's correction exists for small-sample
 std. deviation, i.e. normalization by (n-1).
 # In Electrical and Electronic Measurements and Instrumentation by
 A.K. Sawhney, in the first chapter, Fundamentals of Measurements, it
 is shown that for N=16 the std. deviation normalization is (n-1)=15.
 # While I was learning statistics, my course instructor would advise
 taking n=20 for normalization by (n-1).
 # Probability and Statistics from the Schaum's Outline Series is good reading.
 Regards
 ~ymk


 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

 Hi,
 Basically, all that I see with these arbitrary values is that you are
 relying on the 'central limit theorem'
 (http://en.wikipedia.org/wiki/Central_limit_theorem).  Really the
 issue in using these values is how much statistical bias you will
 tolerate, especially in its impact on the usage of that estimate,
 because uses of the variance (such as in statistical tests) tend to
 be more influenced by bias than the estimate of the variance itself.
 (Of course, many features rely on asymptotic properties, so bias
 concerns are less apparent in large sample sizes.)

 Obviously the default relies on the developer's background and
 requirements. There are multiple valid variance estimators in
 statistics with different denominators, like N (maximum likelihood
 estimator), N-1 (restricted maximum likelihood estimator and certain
 Bayesian estimators) and Stein's
 (http://en.wikipedia.org/wiki/James%E2%80%93Stein_estimator). So the
 current default behavior is valid and documented. Consequently you
 cannot just have one option or different functions (like certain
 programs do), and Numpy's implementation actually allows you to do
 all of these in a single function. So I also see no reason to change,
 even if I have to add the ddof=1 argument; after all, 'Explicit is
 better than implicit' :-).

 Bruce
 Bruce,

 I suggest that the Central Limit Theorem is tied in with the Law of
 Large Numbers.

 When one has a smallish sample size, what gives the best estimate of the
 variance?  The Bessel Correction provides a rationale, based on
 expectations: (http://en.wikipedia.org/wiki/Bessel%27s_correction).

 It is difficult to understand the proof of Stein:
 http://en.wikipedia.org/wiki/Proof_of_Stein%27s_example

 The symbols used are not clearly stated.  He seems interested in a
 decision rule for the calculation of the mean of a sample and claims
 that his approach is better than the traditional Least Squares approach.

 In most cases, the interest is likely to be in the variance, with a view
 to establishing a confidence interval.

What's the best estimate? That's the main question.

Estimators differ in their (sample or posterior) distribution,
especially in bias and variance.
The Stein estimator dominates OLS in mean squared error: although it is
biased, its variance is enough smaller than that of OLS that the MSE
(squared bias plus variance) is also smaller for the Stein estimator
than for OLS.
Depending on the application there could be many possible loss functions,
including asymmetric ones, e.g. if it is more costly to over- than to
underestimate.
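
The decomposition used here is MSE = variance + bias^2. A sketch of the
dominance claim by simulation, in the classic problem of estimating a
d-dimensional normal mean (d >= 3) from one observation, with the usual
shrinkage factor 1 - (d-2)/||x||^2:

import numpy as np

rng = np.random.RandomState(42)
d, trials = 10, 20000
theta = np.ones(d)                            # arbitrary true mean
x = theta + rng.standard_normal((trials, d))  # one observation per trial

shrink = 1.0 - (d - 2) / (x ** 2).sum(axis=1)
js = shrink[:, np.newaxis] * x                # James-Stein estimate
mle = x                                       # ordinary (OLS/ML) estimate

print(((mle - theta) ** 2).sum(axis=1).mean())  # risk ~ d = 10
print(((js - theta) ** 2).sum(axis=1).mean())   # strictly smaller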

The following was a good book for this, that I read a long time ago:
Statistical Decision Theory and Bayesian Analysis by James O. Berger

http://books.google.ca/books?id=oY_x7dE15_AC&pg=PP1&lpg=PP1&dq=berger+decision&source=bl&ots=wzL3ocu5_9&sig=lGm5VevPtnFW570mgeqJklASalU&hl=en&ei=P9cbS5CSCIqllAf-0f3xCQ&sa=X&oi=book_result&ct=result&resnum=4&ved=0CBcQ6AEwAw#v=onepage&q=&f=false



 In the widely used Analysis of Variance (ANOVA), the degrees of freedom
 are reduced for each mean estimated, see:
 http://www.mnstate.edu/wasson/ed602lesson13.htm for the example below:

 *Analysis of Variance Table*

 Source of Variation    Sum of Squares    Degrees of Freedom    Mean Square    F Ratio    p
 Between Groups         25.20             2                     12.60          5.178      .05
 Within Groups          29.20             12                    2.43
 Total                  54.40             14


 There is a sample of 15 observations, which is divided into three
 groups, depending on the number of hours of therapy.
 Thus, the Total degrees of freedom are 15-1 = 14,  the Between Groups
 3-1 = 2 and the Residual is 14 - 2 = 12.

Statistical tests are the only area where I really pay attention to the
degrees of freedom, since the 

Re: [Numpy-discussion] non-standard standard deviation

2009-12-06 Thread Charles R Harris
On Sun, Dec 6, 2009 at 9:21 AM, josef.p...@gmail.com wrote:

 On Sun, Dec 6, 2009 at 11:01 AM, Colin J. Williams c...@ncf.ca wrote:
 


snip


 What's the best estimate? That's the main question.

 Estimators differ in their (sample or posterior) distribution,
 especially in bias and variance.
 The Stein estimator dominates OLS in mean squared error: although it
 is biased, its variance is enough smaller than that of OLS that the
 MSE (squared bias plus variance) is also smaller for the Stein
 estimator than for OLS.
 Depending on the application there could be many possible loss
 functions, including asymmetric ones, e.g. if it is more costly to
 over- than to underestimate.

 The following was a good book for this, that I read a long time ago:
 Statistical Decision Theory and Bayesian Analysis by James O. Berger


 http://books.google.ca/books?id=oY_x7dE15_AC&pg=PP1&lpg=PP1&dq=berger+decision&source=bl&ots=wzL3ocu5_9&sig=lGm5VevPtnFW570mgeqJklASalU&hl=en&ei=P9cbS5CSCIqllAf-0f3xCQ&sa=X&oi=book_result&ct=result&resnum=4&ved=0CBcQ6AEwAw#v=onepage&q=&f=false


At last, an explanation I can understand. Thanks Josef.

Chuck
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] np.equal

2009-12-06 Thread Keith Goodman
On Sun, Dec 6, 2009 at 4:57 AM,  josef.p...@gmail.com wrote:
 What's the difference in the implementation between np.equal and == ?
 np.equal returns NotImplemented for strings, while == works.

 >>> aa
 array(['a', 'b', 'a', 'aa', 'a'],
       dtype='|S2')

 >>> aa == 'a'
 array([ True, False,  True, False,  True], dtype=bool)
 >>> np.equal(aa,'a')
 NotImplemented


 >>> np.equal(np.arange(5),1)
 array([False,  True, False, False, False], dtype=bool)
 >>> np.equal(np.arange(5),'a')
 NotImplemented
 >>> np.arange(5) == 'a'
 False

Seems like none of the ufuncs can handle strings:

>>> np.log('a')
NotImplemented
>>> np.exp('a')
NotImplemented
>>> np.add('a', 'b')
NotImplemented
>>> np.negative('a')
NotImplemented
>>> np.sin('a')
NotImplemented
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] non-standard standard deviation

2009-12-06 Thread Sturla Molden
Colin J. Williams wrote:
 When one has a smallish sample size, what gives the best estimate of the 
 variance? 
What do you mean by best estimate?

Unbiased? Smallest standard error?


 In the widely used Analysis of Variance (ANOVA), the degrees of freedom 
 are reduced for each mean estimated, 
That is for statistical tests, not for computing estimators.





___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Chararray deprecated?

2009-12-06 Thread David Goldsmith
On Sun, Dec 6, 2009 at 6:36 AM, Ralf Gommers ralf.gomm...@googlemail.com wrote:



 On Sun, Dec 6, 2009 at 2:10 PM, Gael Varoquaux 
 gael.varoqu...@normalesup.org wrote:


  Excellent. I tweaked the text a bit to make it clearer:
 http://docs.scipy.org/numpy/docs/numpy.core.defchararray.chararray/

  I made it clear that the replacements are only present starting with numpy 1.4.
 I'd love a review and an 'OK to apply'.

 Looks good, I copied the changes to the notes in defchararray and
 arrays.classes.rst, and toggled OK to apply.

 Cheers,
 Ralf



 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion


Thanks, guys!

DG
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Some incompatibilities in numpy trunk

2009-12-06 Thread Robert Kern
On Sun, Dec 6, 2009 at 07:53, Gael Varoquaux
gael.varoqu...@normalesup.org wrote:
 I have a lot of code that has stopped working with my latest SVN pull to
 numpy.

 * Some compiled code yields an error looking like (from memory):

    incorrect type 'numpy.ndarray'

 Rebuilding it is sufficient.

Is this Cython or Pyrex code? Unfortunately Pyrex checks the size of
types exactly such that even if you extend the type in a backwards
compatible way, it will raise that exception. This behavior has been
inherited by Cython. I have asked for this feature to be removed, or
at least turned into a >= check, but it got no traction.

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth.
  -- Umberto Eco
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Some incompatibilities in numpy trunk

2009-12-06 Thread Gael Varoquaux
On Sun, Dec 06, 2009 at 01:12:52PM -0600, Robert Kern wrote:
 Is this Cython or Pyrex code? 

It is.

 Unfortunately Pyrex checks the size of types exactly such that even if
 you extend the type in a backwards compatible way, it will raise that
 exception. 

OK, that makes sense. Thanks for the explanation.

 This behavior has been inherited by Cython. I have asked for
 this feature to be removed, or at least turned into a = check, but it
 got no traction.

Well, maybe when all the Cython deployments break because of the numpy
change, it will get more traction.

Gaël
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Zero Division not handled correctly?

2009-12-06 Thread Skipper Seabold
I believe this is known, but I am surprised that division by integer
zero results in the following.

In [1]: import numpy as np

In [2]: np.__version__
Out[2]: '1.4.0.dev7539'

In [3]: 0**-1 # or 0**-1/-1
---
ZeroDivisionError Traceback (most recent call last)

/home/skipper/school/Data/ascii/numpy/<ipython console> in <module>()

ZeroDivisionError: 0.0 cannot be raised to a negative power

In [4]: np.array([0.])**-1
Out[4]: array([ Inf])

In [5]: np.array([0.])**-1/-1
Out[5]: array([-Inf])

In [6]: np.array([0])**-1.
Out[6]: array([ Inf])

In [7]: np.array([0])**-1./-1
Out[7]: array([-Inf])

In [8]: np.array([0])**-1
Out[8]: array([-9223372036854775808])

In [9]: np.array([0])**-1/-1
Floating point exception

This last command crashes the interpreter.

There have been some threads about similar issues over the years, but
I'm wondering if this is still intended/known or if this should raise
an exception or return inf or -inf.  I expected a -inf, though maybe
this is incorrect on my part.

Skipper
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Zero Division not handled correctly?

2009-12-06 Thread David Goldsmith
On Sun, Dec 6, 2009 at 1:16 PM, Skipper Seabold jsseab...@gmail.com wrote:

 In [9]: np.array([0])**-1/-1
 Floating point exception

 This last command crashes the interpreter.


It crashes mine also, and IMO, anything that crashes the interpreter should
be considered a bug - can you file a bug report, please?  Thanks!

DG


 There have been some threads about similar issues over the years, but
 I'm wondering if this is still intended/known or if this should raise
 an exception or return inf or -inf.  I expected a -inf, though maybe
 this is incorrect on my part.

 Skipper
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] another numpy/ATLAS problem

2009-12-06 Thread David Warde-Farley
(we hashed this out on IRC, but replying here for the sake of  
recording it)

On 5-Dec-09, at 9:04 PM, Pauli Virtanen wrote:

 Can you try to change linalg/setup.py so that it *only* includes
 lapack_litemodule.c in the build?

Yup; it turns out it wasn't NumPy's lapack_lite calling dlamc3_ but  
rather other routines in LAPACK. It was accepted as a bug and it  
should be fixed in future 3.9.x's of ATLAS.

David
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] non-standard standard deviation

2009-12-06 Thread Bruce Southey
On Sun, Dec 6, 2009 at 11:36 AM, Sturla Molden stu...@molden.no wrote:
 Colin J. Williams wrote:
  When one has a smallish sample size, what gives the best estimate of the
 variance?
 What do you mean by best estimate?

 Unbiased? Smallest standard error?


 In the widely used Analysis of Variance (ANOVA), the degrees of freedom
 are reduced for each mean estimated,
  That is for statistical tests, not for computing estimators.





 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion


Ignoring the estimation method, there is no correct answer unless you
impose various conditions like minimum-variance unbiased estimator
(http://en.wikipedia.org/wiki/Minimum_variance_unbiased) where usually
N-1 wins.

Anyhow, this is way off topic since it is totally in the realm of math stats.

Law of large numbers
(http://en.wikipedia.org/wiki/Law_of_large_numbers) just addresses the
average, not the variance, so it is not directly applicable.

Bruce
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Zero Division not handled correctly?

2009-12-06 Thread David Cournapeau
Skipper Seabold wrote:
 I believe this is known, but I am surprised that division by integer
 zero results in the following.

 In [1]: import numpy as np

 In [2]: np.__version__
 Out[2]: '1.4.0.dev7539'

 In [3]: 0**-1 # or 0**-1/-1
 ---
 ZeroDivisionError Traceback (most recent call last)

 /home/skipper/school/Data/ascii/numpy/<ipython console> in <module>()

 ZeroDivisionError: 0.0 cannot be raised to a negative power

 In [4]: np.array([0.])**-1
 Out[4]: array([ Inf])

 In [5]: np.array([0.])**-1/-1
 Out[5]: array([-Inf])

 In [6]: np.array([0])**-1.
 Out[6]: array([ Inf])

 In [7]: np.array([0])**-1./-1
 Out[7]: array([-Inf])

 In [8]: np.array([0])**-1
 Out[8]: array([-9223372036854775808])

 In [9]: np.array([0])**-1/-1
 Floating point exception

This last one is sort of interesting - np.array([0])**-1 returns the
smallest long, and on two's-complement machines this means that its
opposite is not representable. IOW, it is not a divide by zero, but a
division overflow, which also generates a SIGFPE on x86. I think the
crash is the same as the one in this simple C program:

#include <stdio.h>

int main(void)
{
    long a = -2147483648;
    long b = -1;

    printf("%ld\n", a);
    a /= b;
    printf("%ld\n", a);

    return 0;
}

I am not sure about how to fix this: one simple way would be to detect
this case in the LONG_divide ufunc (and the other affected signed integer
types). Another, maybe better, solution is to handle the signal, but
that may be much more work (I still don't know well how signals
interact with the Python interpreter).
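
The unrepresentability is visible directly from the integer limits; a
small sketch:

import numpy as np

info = np.iinfo(np.int64)
print(info.min)              # -9223372036854775808
print(info.max)              #  9223372036854775807
print(-info.min > info.max)  # True: negating the minimum overflows,
                             # which is exactly what LONG_MIN / -1 asks for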

cheers,

David

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] np.equal

2009-12-06 Thread Fernando Perez
2009/12/6 josef.pktd josef.p...@gmail.com:
 >>> np.equal(np.arange(5),'a')
 NotImplemented

Why is NotImplemented a *return* value?  Normally NotImplementedError
is a raised exception, but if it's not implemented, it shouldn't be
returned as a value.

For one thing, it leads to absurdities like the following being possible:

In [6]: if np.equal(np.random.rand(5),'a'):
   ...:     print("Array equal to 'a'")
   ...:
   ...:
Array equal to 'a'

In [7]: if np.equal(np.random.rand(5),'a'):
   ...:     print("Array equal to 'a'")
   ...:
   ...:
Array equal to 'a'

In practice, it's as if np.equal() always returns True for the
unimplemented cases (since bool(NotImplemented) == True).
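
Until that changes, a sketch of a defensive wrapper (strict_equal is
just an illustrative name); NotImplemented is a singleton, so an
identity test catches it reliably:

import numpy as np

print(bool(NotImplemented))   # True -- why the 'if' above misfires

def strict_equal(a, b):
    result = np.equal(a, b)
    if result is NotImplemented:   # hypothetical guard, not a numpy API
        raise TypeError('comparison not implemented for these inputs')
    return result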

Cheers,

f
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion