Re: [Numpy-discussion] Standard Deviation (std): Suggested change for ddof default value

2014-04-03 Thread Bago
 Sturla

 P.S. Personally I am not convinced unbiased is ever a valid argument, as
 the biased estimator has the smaller error. This is from experience in
 marksmanship: I'd rather shoot a tight series with a small systematic error
 than scatter my bullets wildly but without bias on the target. It is the
 total error that counts. The series with the smallest total error gets the
 best score. It is better to shoot two series and calibrate the sight in
 between than to use a calibration-free sight that doesn't allow us to aim.
 That's why I think classical statistics got this one wrong. Unbiasedness is
 never a virtue in itself, but the smallest error is. Thus, if we are to
 repeat an experiment, we should calibrate our estimator just like a marksman
 calibrates his sight. But the aim should always be calibrated to give the
 smallest error, not an unbiased scatter. No one in their right mind would
 claim a shotgun is more precise than a rifle because it has smaller bias.
 But that is what applying the Bessel correction implies.


I agree with the point, and what makes it even worse is that ddof=1 does
not even produce an unbiased standard deviation estimate. It produces an
unbiased variance estimate, but the square root of that estimate is a
biased estimate of the standard deviation:
http://en.wikipedia.org/wiki/Unbiased_estimation_of_standard_deviation.
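
As a quick illustration (a toy simulation I just made up, with arbitrary
parameters), a small Monte Carlo run shows both effects: the ddof=1 square
root is still biased, and the "biased" ddof=0 estimator typically ends up
with the smaller mean squared error at small n:

    import numpy as np

    np.random.seed(0)
    sigma, n = 1.0, 5                      # small n makes the bias visible
    samples = np.random.normal(0.0, sigma, size=(100000, n))

    std0 = samples.std(axis=1, ddof=0)     # "biased" estimator
    std1 = samples.std(axis=1, ddof=1)     # sqrt of the unbiased variance

    for name, est in (("ddof=0", std0), ("ddof=1", std1)):
        print("%s: bias=%+.4f, MSE=%.4f"
              % (name, est.mean() - sigma, ((est - sigma) ** 2).mean()))
    # Both estimators are biased for sigma; ddof=1 is less biased, but
    # ddof=0 has the (slightly) smaller mean squared error here.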

Bago


Re: [Numpy-discussion] [help needed] associativity and precedence of '@'

2014-03-17 Thread Bago


 I'm now convinced of the usefulness of @ and @@ too, but I also think that
 you must consider uses other than just numpy. In other words, numpy is
 a good argument for these new operators, but they could also open new
 perspectives for other uses.


Speaking of `@@`, would the relative precedence of @ vs * be the same as @@
vs **?
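
(For reference: when @ eventually landed in Python 3.5 via PEP 465, it was
given the same precedence as * with left associativity, and @@ was dropped.)
A quick sketch of what sharing a precedence level means for mixed
expressions:

    import numpy as np

    a = np.array([[1., 2.], [3., 4.]])
    b = np.array([[0., 1.], [1., 0.]])
    c = np.array([[2., 0.], [0., 3.]])

    # @ and * share one precedence level and group left to right,
    # so a @ b * c parses as (a @ b) * c, not a @ (b * c).
    assert np.allclose(a @ b * c, (a @ b) * c)
    assert not np.allclose(a @ b * c, a @ (b * c))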


Re: [Numpy-discussion] vectorizing recursive sequences

2013-10-26 Thread Bago
This behavior seems to depend on the order in which elements of the arrays
are processed. That seems like a dangerous thing to rely on; the main
reason I can think of that someone would want to change the loop order is
to implement parallel ufuncs.
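
For a linear recurrence like the f(a, x) example quoted below, one
order-independent alternative (a sketch, assuming scipy is available) is
scipy.signal.lfilter, which computes exactly y[n] = a[n] + x * y[n-1]:

    import numpy as np
    from scipy.signal import lfilter

    def f_lfilter(a, x):
        # y[n] = a[n] + x * y[n-1] is an IIR filter with numerator
        # coefficients [1] and denominator coefficients [1, -x].
        return lfilter([1.0], [1.0, -x], np.asarray(a, dtype=np.double))

    print(f_lfilter(np.arange(10), 0.1))
    # [ 0.  1.  2.1  3.21  4.321 ...  9.87654321]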

Bago



On Fri, Oct 25, 2013 at 12:32 PM, Jaime Fernández del Río 
jaime.f...@gmail.com wrote:

 I recently came up with a way of vectorizing some recursive sequence
 calculations. While it works, I am afraid it is relying on implementation
 details potentially subject to change. The basic idea is illustrated by
 this function, calculating the first n items of the Fibonacci sequence:

 import numpy as np

 def fibonacci(n):
     fib = np.empty(n, dtype=np.intp)
     fib[:2] = 1
     np.add(fib[:-2], fib[1:-1], out=fib[2:])
     return fib

 >>> fibonacci(10)
 array([ 1,  1,  2,  3,  5,  8, 13, 21, 34, 55], dtype=int64)


 I believe that the biggest issue that could break this is if the ufunc
 decided to buffer the arrays, as this is relying on the inputs and outputs
 of np.add sharing the same memory.


 You can use the same idea to do more complicated things, for instance to
 calculate the items of the sequence:

 f[0] = a[0]
 f[n] = a[n] + x * f[n-1]


 import numpy as np
 from numpy.lib.stride_tricks import as_strided
 from numpy.core.umath_tests import inner1d

 def f(a, x):
     out = np.array(a, copy=True, dtype=np.double)
     n = len(a)
     out_view = as_strided(out, shape=(n-1, 2), strides=out.strides*2)
     inner1d(out_view, [x, 1], out=out[1:])
     return out

 >>> f(np.arange(10), 0.1)
 array([ 0.        ,  1.        ,  2.1       ,  3.21      ,  4.321     ,
         5.4321    ,  6.54321   ,  7.654321  ,  8.7654321 ,  9.87654321])

 Again, I think buffering is the clearest danger of doing something like
 this, as the first input and the output must share the same memory for
 this to work. That this is a real concern is easy to see, since `inner1d`
 only has loops registered for long ints and double floats:

 >>> inner1d.types
 ['ll->l', 'dd->d']

 so the above function `f` doesn't work if the `out` array is created,
 e.g., as np.float32, since buffering kicks in because of the required
 type casting.
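
 To make this concrete, here is a quick sketch (with made-up values, and
 assuming the default ufunc casting rules) of the failure mode with a
 float32 output:

     out32 = np.arange(10, dtype=np.float32)
     view32 = as_strided(out32, shape=(9, 2), strides=out32.strides*2)
     # There is no 'ff->f' loop, so the inputs get cast and buffered to
     # double; the buffered copies no longer see the updated output, and
     # the result is (roughly) a[n] + 0.1*a[n-1] instead of the recurrence.
     inner1d(view32, [0.1, 1], out=out32[1:])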

 So I have two questions:

 1. Is there some other reason, aside from buffering, that this could go
 wrong, or change in a future release?
 2. As far as buffering is concerned, I thought of calling
 `np.setbufsize(1)` before any of these functions, but it complains and
 requests that the value be a multiple of 16. Is there any other way to
 ensure that the data is fetched from the updated array in every internal
 iteration?

 Thanks!

 Jaime

 --
 (\__/)
 ( O.o)
 (  ) This is Conejo. Copy Conejo into your signature and help him with
 his plans for world domination.





Re: [Numpy-discussion] another indexing question

2013-05-20 Thread Bago
You could also try using bincount:

(np.bincount(x, y.real) + 1j * np.bincount(x, y.imag)) / np.bincount(x)
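
A runnable sketch of the same trick, with made-up sample data for the x,
y, and M names from the original question:

    import numpy as np

    M = 4
    x = np.array([0, 1, 2, 1, 3, 0])                      # transmitted symbols
    y = np.array([1+1j, 1-1j, -1+1j, 1-1j, -1-1j, 1+1j])  # received values

    # Sum the real and imaginary parts per symbol, then divide by the
    # per-symbol counts to get the complex means.
    counts = np.bincount(x, minlength=M)
    means = (np.bincount(x, weights=y.real, minlength=M)
             + 1j * np.bincount(x, weights=y.imag, minlength=M)) / counts
    print(means)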

Bago


On Mon, May 20, 2013 at 9:03 AM, Robert Kern robert.k...@gmail.com wrote:

 On Mon, May 20, 2013 at 5:00 PM, Neal Becker ndbeck...@gmail.com wrote:
  I have a system that transmits signals from an alphabet of M symbols
  over an additive Gaussian noise channel.  The receiver has a
  1-d array of complex received values.  I'd like to find the means
  of the received values according to the symbol that was transmitted.
 
  So transmit symbol indexes might be:
 
  x = [0, 1, 2, 1, 3, ...]
 
  and the received output might be:
 
  y = [(1+1j), (1-1j), ...]
 
  Suppose the alphabet was M=4.  Then I'd like to get an array of means
  m[0...3], where m[i] is the mean of the y values whose corresponding
  x value is i.
 
  I can't think of a better way than manually using loops.  Any tricks
 here?

 All you need is a single loop over the alphabet, which is usually not
 problematic.


 means = np.empty(M, dtype=complex)   # y is complex, so the means are too
 for i in range(M):
     means[i] = y[x == i].mean()

 --
 Robert Kern



[Numpy-discussion] searchsorted descending arrays

2013-05-06 Thread Bago
I submitted a patch a little while ago,
https://github.com/numpy/numpy/pull/3107, which gave the searchsorted
function the ability to search arrays sorted in descending order. At the
time my approach was to detect the sort order of the array by comparing
the first and last elements. This works pretty well in most cases, but
fails in one notable case. After giving it some thought, I think the best
way to add searching of descending arrays to numpy would be to add a
keyword to the searchsorted function. I wanted to know what you guys
think of this before updating the PR.

I would like to add something like the following to numpy:

A = [10, 9, 2, 1]
np.searchsorted(A, 5, sortorder='descending')

The other option would be to auto-detect the order, but then this case
might surprise some users:

A = [0, 0, 0]
A = np.sort(A)[::-1]
print np.searchsorted(A, [1, -1])
# [3, 0]

All the elements here are equal, so the order is detected as ascending,
which would surprise a user who expects to be searching a descending
array.
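
In the meantime, a workaround for numeric data (a sketch, not part of the
patch) is to negate both the array and the search values, which turns a
descending search into an ascending one:

    import numpy as np

    A = np.array([10, 9, 2, 1])    # sorted descending
    idx = np.searchsorted(-A, -5)  # equivalent ascending-order search
    print(idx)                     # 2: inserting 5 at index 2 keeps A descending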

Bago