Re: [Numpy-discussion] python array

2014-03-13 Thread Brett Olsen
The difference appears to be that the boolean selection pulls out all data
values <= 0.5 whether or not they are masked, and then carries over the
appropriate masks to the new array.  So r2010 and bt contain identical
unmasked values but different numbers of masked values.  Because the
initial fill value for your masked values was a large negative number, in
r2010 those masked values are carried over.  In bt, you've taken the
absolute value of the data array, so those fill values are now positive and
they are no longer carried over into the indexed array.

Because the final arrays are still masked, you are observing no difference
in the statistical properties of the arrays, only their sizes, because one
contains many more masked values than the other.  I don't think this should
be a problem for your computations. If you're concerned, you could always
explicitly demask them before your computations.  See the example problem
below.

~Brett

In [61]: import numpy as np

In [62]: import numpy.ma as ma

In [65]: a = np.arange(-8, 8).reshape((4, 4))

In [66]: a
Out[66]:
array([[-8, -7, -6, -5],
   [-4, -3, -2, -1],
   [ 0,  1,  2,  3],
   [ 4,  5,  6,  7]])

In [68]: b = ma.masked_array(a, mask=a < 0)

In [69]: b
Out[69]:
masked_array(data =
 [[-- -- -- --]
 [-- -- -- --]
 [0 1 2 3]
 [4 5 6 7]],
 mask =
 [[ True  True  True  True]
 [ True  True  True  True]
 [False False False False]
 [False False False False]],
   fill_value = 99)

In [70]: b.data
Out[70]:
array([[-8, -7, -6, -5],
   [-4, -3, -2, -1],
   [ 0,  1,  2,  3],
   [ 4,  5,  6,  7]])

In [71]: c = abs(b)

In [72]: c[c <= 4].shape
Out[72]: (9L,)

In [73]: b[b <= 4].shape
Out[73]: (13L,)

In [74]: b[b <= 4]
Out[74]:
masked_array(data = [-- -- -- -- -- -- -- -- 0 1 2 3 4],
 mask = [ True  True  True  True  True  True  True  True False False False False False],
   fill_value = 99)


In [75]: c[c <= 4]
Out[75]:
masked_array(data = [-- -- -- -- 0 1 2 3 4],
 mask = [ True  True  True  True False False False False False],
   fill_value = 99)
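
If you do want to demask explicitly before computing anything, one possible sketch is below. It rebuilds the same a, b and c as the session above; the names keep_b, keep_c, vals_b and vals_c are only for illustration. The idea is to treat masked positions as "not selected" with filled(False), so the hidden data values never influence the selection, and then to drop whatever is still masked with compressed().

```python
import numpy as np
import numpy.ma as ma

a = np.arange(-8, 8).reshape((4, 4))
b = ma.masked_array(a, mask=a < 0)   # negative entries are masked
c = abs(b)                           # the mask is carried over

# Masked positions become False, so they are never selected
keep_b = (b <= 4).filled(False)
keep_c = (c <= 4).filled(False)

# compressed() returns only the unmasked values as a plain ndarray
vals_b = b[keep_b].compressed()
vals_c = c[keep_c].compressed()
```

With this, both selections should have the same size, because only unmasked values are ever picked up.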


On Thu, Mar 13, 2014 at 8:14 PM, Sudheer Joseph sudheer.jos...@yahoo.com wrote:

 Sorry,
 the solution below, which I thought was working, was not working; it was
 just giving the array size.

 
 On Fri, 14/3/14, Sudheer Joseph sudheer.jos...@yahoo.com wrote:

  Subject: Re: [Numpy-discussion] python array
  To: Discussion of Numerical Python numpy-discussion@scipy.org
  Date: Friday, 14 March, 2014, 1:09 AM

  Thank you very much Nicolas and Chris,

  The hint was helpful, and from it I tried the steps below (a crude
  way, I would say) and am now getting the same result.

  I have been using the abs available by default, and it is the
  same as numpy.absolute (I checked).

  nr = ((r2010 > r2010.min()) & (r2010 < r2010.max()))
  nr[nr < .5].shape
  Out[25]: (33868,)
  anr = numpy.absolute(nr)
  anr[anr < .5].shape
  Out[27]: (33868,)

  The approach I used may have problems when the masked values
  can affect the min/max operation.

  So I would like to know if there is a standard (python/numpy)
  way to handle masked arrays when they need to be subjected to
  boolean operations.

  with best regards,
  Sudheer



  
  On Thu, 13/3/14, Chris Barker - NOAA Federal chris.bar...@noaa.gov
  wrote:

   Subject: Re: [Numpy-discussion] python array
   To: Discussion of Numerical Python numpy-discussion@scipy.org
   Date: Thursday, 13 March, 2014, 11:53 PM

   On Mar 13, 2014, at 9:39 AM, Nicolas
   Rougier nicolas.roug...@inria.fr
   wrote:

   
Seems to be related to the masked values:

   Good hint -- a masked array keeps the junk values in the
   main array.

   What abs are you using -- it may not be mask-aware. (You want a
   numpy abs anyway.)

   Also -- I'm not sure I know what happens with Boolean operators on
   masked arrays when you use them to index. I'd investigate that.
   (Sorry, not at a machine I can play with now.)

   Chris


print r2010[:3,:3]
[[-- -- --]
[-- -- --]
[-- -- --]]
   
print abs(r2010)[:3,:3]
[[-- -- --]
[-- -- --]
[-- -- --]]
   
   
print r2010[ r2010[:3,:3] < 0 ]
[-- -- -- -- -- -- -- -- --]

print r2010[ abs(r2010)[:3,:3] < 0 ]
[]
   
Nicolas
   
   
   
On 13 Mar 2014, at 16:52, Sudheer Joseph sudheer.jos...@yahoo.com
   wrote:
  

Re: [Numpy-discussion] Robust Sorting of Points

2013-10-28 Thread Brett Olsen
Here's some code implementing the "replace similar values with an
arbitrarily chosen one" approach (in this case the smallest of the similar values).
 I didn't see any way to do this cleverly with strides, so I just did a
simple loop.  It's about 100 times slower in pure Python, or a bit under 10
times slower if you're willing to use a bit of Cython.  Not sure if this is
good enough for your purposes.  I imagine you could go a bit faster if you
were willing to do the lexical integration by hand (since you've already
done the separate sorting of each subarray for value replacement purposes)
instead of passing that off to np.lexsort.

Note that this approach will only work if your points are not only
well-separated in space but also either well-separated or identical in each
dimension as well.  It's OK to have points with the same, say, x value, but
if you have points that have close x values before the noise is added, then
the noise can move intermediate points around in the sort order.  It works
well with the gridded data I used as a sample, but if you're, say,
generating random points, this could be a problem:

point 1 is (1, 0, 1e-12)
point 2 is (0, 1, 0)

These are well separated.  The algorithm will pool those z values and
report 1 as coming before 2.  Unless you get jitter like this:

point 1: (1, 0, 1.5e-12)
point 2: (0, 1, -0.5e-12)

Now they won't be pooled any more and we'll get 2 as coming before 1.

Anyway, here's the code:

In [1]:

import numpy as np
In [2]:

def gen_grid(n, d):
    #Generate a bunch of grid points, n in each dimension of spacing d
    vals = np.linspace(0, (n-1)*d, n)
    x, y, z = np.meshgrid(vals, vals, vals)
    grid = np.empty((n**3, 3))
    grid[:,0] = x.flatten()
    grid[:,1] = y.flatten()
    grid[:,2] = z.flatten()
    return grid

def jitter(array, epsilon=1e-12):
    #Add random jitter from a uniform distribution of width epsilon
    return array + np.random.random(array.shape) * epsilon - epsilon / 2
In [3]:

grid = gen_grid(4, 0.1)
print np.lexsort(grid.T)
print np.lexsort(jitter(grid.T))
[ 0  4  8 12 16 20 24 28 32 36 40 44 48 52 56 60  1  5  9 13 17 21 25 29 33
 37 41 45 49 53 57 61  2  6 10 14 18 22 26 30 34 38 42 46 50 54 58 62  3  7
 11 15 19 23 27 31 35 39 43 47 51 55 59 63]
[60  4 48 32 40 12 36 28 44 56 16  8 24  0 52 20 45 25 49  1 53 29  9 33  5
 61 41 37 17 13 21 57 22 50 18 10  2 62 58 54  6 34 26 42 38 46 14 30  3 11
 55 63 27 15 35 43 31 39  7 59 47 23 51 19]
In [4]:

def pool_values(A, epsilon=1e-12):
    idx = np.argsort(A)
    for i in range(1, len(A)):
        if A[idx[i]] - A[idx[i-1]] < epsilon:
            A[idx[i]] = A[idx[i-1]]
    return A

def stable_sort(grid):
    return np.lexsort((pool_values(grid[:,0]),
                       pool_values(grid[:,1]),
                       pool_values(grid[:,2])))
In [5]:

print stable_sort(grid)
print stable_sort(jitter(grid))
[ 0  4  8 12 16 20 24 28 32 36 40 44 48 52 56 60  1  5  9 13 17 21 25 29 33
 37 41 45 49 53 57 61  2  6 10 14 18 22 26 30 34 38 42 46 50 54 58 62  3  7
 11 15 19 23 27 31 35 39 43 47 51 55 59 63]
[ 0  4  8 12 16 20 24 28 32 36 40 44 48 52 56 60  1  5  9 13 17 21 25 29 33
 37 41 45 49 53 57 61  2  6 10 14 18 22 26 30 34 38 42 46 50 54 58 62  3  7
 11 15 19 23 27 31 35 39 43 47 51 55 59 63]
In [6]:

%timeit np.lexsort(jitter(grid.T))
10 loops, best of 3: 10.4 µs per loop
In [7]:

%timeit stable_sort(jitter(grid))
1000 loops, best of 3: 1.39 ms per loop
In [8]:

%load_ext cythonmagic
In [12]:

%%cython
import numpy as np
cimport numpy as np

cdef fast_pool_values(double[:] A, double epsilon=1e-12):
    cdef long[:] idx = np.argsort(A)
    cdef int i
    for i in range(1, len(A)):
        if A[idx[i]] - A[idx[i-1]] < epsilon:
            A[idx[i]] = A[idx[i-1]]
    return A

def fast_stable_sort(grid):
    return np.lexsort((fast_pool_values(grid[:,0]),
                       fast_pool_values(grid[:,1]),
                       fast_pool_values(grid[:,2])))
In [10]:

%timeit np.lexsort(jitter(grid.T))
1 loops, best of 3: 38.5 µs per loop
In [13]:

%timeit fast_stable_sort(jitter(grid))
1000 loops, best of 3: 309 µs per loop
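
For what it's worth, the pooling step can probably also be done without an explicit loop. The sketch below is not the code timed above, just one possible variant, and the name pool_values_vectorized is made up. It differs from the loop in two ways: it compares each sorted value with its immediate neighbour rather than with the already-pooled representative, and it returns a new array instead of modifying the input in place.

```python
import numpy as np

def pool_values_vectorized(A, epsilon=1e-12):
    A = np.asarray(A, dtype=float)
    idx = np.argsort(A, kind="mergesort")   # stable sort
    sorted_vals = A[idx]

    # A new group starts wherever the gap to the previous sorted value
    # is at least epsilon (the first element always starts a group)
    new_group = np.empty(len(A), dtype=bool)
    new_group[:1] = True
    new_group[1:] = np.diff(sorted_vals) >= epsilon

    # For every sorted element, look up the first (smallest) value of its group
    starts = np.flatnonzero(new_group)
    group_id = np.cumsum(new_group) - 1

    pooled = np.empty_like(A)
    pooled[idx] = sorted_vals[starts[group_id]]
    return pooled

vals = np.array([0.0, 0.1, 0.1 + 3e-13, 2e-13, 0.2])
pooled = pool_values_vectorized(vals)
```

Each column of the grid could then be passed through this function before handing the result to np.lexsort, in the same way as stable_sort does.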


On Sun, Oct 27, 2013 at 5:41 PM, Freddie Witherden fred...@witherden.org wrote:

 On 27/10/13 21:05, Jonathan March wrote:
  If an "almost always works" solution is good enough, then sort on the
  distance to some fixed random point that is in the vicinity of your N
  points.

 I had considered this.  Unfortunately I need a solution which really
 does always work.

 The only pure-Python solution I can envision -- at the moment anyway --
 is to do some cleverness with the output of np.unique to identify
 similar values and replace them with an arbitrarily chosen one.  This
 should permit the output to be passed to np.lexsort without issue.

 Regards, Freddie.





 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion



Re: [Numpy-discussion] Stick (line segments) percolation algorithm - graph theory?

2013-08-26 Thread Brett Olsen
I can see a couple opportunities for improvements in your algorithm.
 Running your code on a single experiment, I get about 2.9 seconds to run.
I get this down to about 1.0 seconds by (1) exploiting the symmetry of the
M matrix and (2) avoiding the costly inner loop over k in favor of array
operations:

def check_segments(j, others, data):
    x1, y1, x2, y2 = data

    x_A1B1 = x2[j]-x1[j]
    y_A1B1 = y2[j]-y1[j]

    x_A1A2 = x1[others]-x1[j]
    y_A1A2 = y1[others]-y1[j]

    x_A2A1 = -1*x_A1A2
    y_A2A1 = -1*y_A1A2

    x_A2B2 = x2[others]-x1[others]
    y_A2B2 = y2[others]-y1[others]

    x_A1B2 = x2[others]-x1[j]
    y_A1B2 = y2[others]-y1[j]

    x_A2B1 = x2[j]-x1[others]
    y_A2B1 = y2[j]-y1[others]

    p1 = x_A1B1*y_A1A2 - y_A1B1*x_A1A2
    p2 = x_A1B1*y_A1B2 - y_A1B1*x_A1B2
    p3 = x_A2B2*y_A2B1 - y_A2B2*x_A2B1
    p4 = x_A2B2*y_A2A1 - y_A2B2*x_A2A1

    condition_1=p1*p2
    condition_2=p3*p4

    return (p1 * p2 <= 0) & (p3 * p4 <= 0)

for j in xrange(1, N):
    valid = check_segments(j, range(j), (x1, y1, x2, y2))
    M[j,0:j] = valid
    M[0:j,j] = valid

I don't see any other particularly simple ways to improve this.  You could
probably add an interval check to ensure that the x and y intervals for the
segments of interest overlap before doing the full check, but how much that
would help would depend on the implementations.
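
A rough sketch of what such an interval check could look like is below. The function name bbox_overlap and the toy coordinates are made up for illustration; x1, y1, x2, y2 are assumed to be NumPy arrays of endpoint coordinates as in the code above. Two segments can only intersect if their axis-aligned bounding boxes overlap, so this can be used to thin out the candidates before the full cross-product test.

```python
import numpy as np

def bbox_overlap(j, others, x1, y1, x2, y2):
    """True where the bounding box of segment j overlaps that of each segment in others."""
    others = np.asarray(others)
    # Bounding box of segment j (scalars)
    xlo_j, xhi_j = min(x1[j], x2[j]), max(x1[j], x2[j])
    ylo_j, yhi_j = min(y1[j], y2[j]), max(y1[j], y2[j])
    # Bounding boxes of the other segments (arrays)
    xlo_o = np.minimum(x1[others], x2[others])
    xhi_o = np.maximum(x1[others], x2[others])
    ylo_o = np.minimum(y1[others], y2[others])
    yhi_o = np.maximum(y1[others], y2[others])
    return (xlo_o <= xhi_j) & (xhi_o >= xlo_j) & (ylo_o <= yhi_j) & (yhi_o >= ylo_j)

# Three toy segments: 0 and 1 cross each other, 2 is far away from both
x1 = np.array([0.0, 0.0, 5.0])
y1 = np.array([0.0, 1.0, 5.0])
x2 = np.array([1.0, 1.0, 6.0])
y2 = np.array([1.0, 0.0, 6.0])

near_1 = bbox_overlap(1, np.arange(1), x1, y1, x2, y2)
near_2 = bbox_overlap(2, np.arange(2), x1, y1, x2, y2)
```

The full check would then only need to run on the indices where the result is True. Whether that pays off depends on how sparse the sticks are.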

~Brett

On Fri, Aug 23, 2013 at 5:09 PM, Josè Luis Mietta 
joseluismie...@yahoo.com.ar wrote:

 I wrote an algorithm to study stick percolation (i.e., networks of
 line segments that intersect one another). In my algorithm N sticks (line
 segments) are created inside a rectangular box of sides 'b' and 'h' and
 then, one by one, the algorithm explores the intersection between all line
 segments. This is a Monte Carlo simulation, so the 'experiment' is executed
 many times (no less than 100 times). Written like that, very much RAM is
 consumed:  Here, the element Mij=1 if stick i intersects stick j and Mij=0
 if not.
 How can I optimize my algorithm? Is graph theory useful in this case?



Re: [Numpy-discussion] Optimize removing nan-values of dataset

2013-08-14 Thread Brett Olsen
The example data/method you've provided doesn't do what you describe.
 E.g., in your example data you have several 2x2 blocks of NaNs.  According
to your description, these should not be replaced (as they all have a
neighbor that is also a NaN).  Your example method, however, replaces them
- in fact, replaces any NaN values that are not in the first or last row or
contiguous with NaNs in the first or last row.

Here's a replacement method that does do what you've described:
def nan_to_mean(data):
    data[1:-1][np.isnan(data[1:-1])] = ((data[:-2] + data[2:]) /
                                        2)[np.isnan(data[1:-1])]
    return data
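
As a quick sanity check, here is the same idea spelled out step by step on a small made-up array (the sample values and the intermediate names are only for illustration). An isolated NaN gets the mean of its neighbours in the rows above and below; a NaN with a NaN neighbour is left alone, because the mean of anything with NaN is still NaN.

```python
import numpy as np

def nan_to_mean(data):
    interior = data[1:-1]                        # view, so assignment writes into data
    nan_here = np.isnan(interior)
    neighbour_mean = (data[:-2] + data[2:]) / 2  # NaN wherever a neighbour is NaN
    interior[nan_here] = neighbour_mean[nan_here]
    return data

sample = np.array([[1.0, 10.0],
                   [np.nan, 20.0],
                   [3.0, np.nan],
                   [4.0, np.nan],
                   [5.0, 50.0]])
result = nan_to_mean(sample.copy())
```

In column 0 the single NaN becomes 2.0; in column 1 the two adjacent NaNs stay NaN.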

~Brett


On Tue, Aug 13, 2013 at 1:50 AM, Thomas Goebel 
thomas.goe...@th-nuernberg.de wrote:

 Hi,

 i am trying to remove nan-values from an array of shape(40, 6).
 These nan-values at point data[x] should be replaced by the mean
 of data[x-1] and data[x+1] if both values at x-1 and x+1 are not
 nan. The function nan_to_mean (see below) is working but i wonder
 if i could optimize the code.

 I thought about something like
   1. Find all nan values in array:
  nans = np.isnan(dataarray)
   2. Check if values before, after nan indice are not nan
   3. Calculate mean

 While using this script for my original dataset of
 shape(63856, 6) it takes 139.343 seconds to run it. And some
 datasets are even bigger. I attached the example_dataset.txt and
 the example.py script.

 Thanks for any help,
 Tom

 def nan_to_mean(arr):
     for cnt, value in enumerate(arr):
         # Check if first value is nan, if so continue
         if cnt == 0 and np.isnan(value):
             continue
         # Check if last value is nan:
         # If x-1 value is nan dont do anything!
         # If x-1 is float, last value will be value of x-1
         elif cnt == (len(arr)-1):
             if np.isnan(value) and not np.isnan(arr[cnt-1]):
                 arr[cnt] = arr[cnt-1]
         # If the first values of file are nan ignore them all
         elif np.isnan(value) and np.isnan(arr[cnt-1]):
             continue
         # Found nan value and x-1 value is of type float
         elif np.isnan(value) and not np.isnan(arr[cnt-1]):
             # Check if x+1 value is not nan
             if not np.isnan(arr[cnt+1]):
                 arr[cnt] = '%.1f' % np.mean((
                     arr[cnt-1],arr[cnt+1]))
             # If x+1 value is nan, go to next value
             else:
                 for N in xrange(2, 30):
                     if cnt+N == (len(arr)):
                         break
                     elif not np.isnan(arr[cnt+N]):
                         arr[cnt] = '%.1f' % np.mean(
                             (arr[cnt-1], arr[cnt+N]))
     return arr



Re: [Numpy-discussion] Smart way to do this?

2013-02-22 Thread Brett Olsen
a = np.ones(30)
idx = np.array([2, 3, 2])
a += 2 * np.bincount(idx, minlength=len(a))
 a
array([ 1.,  1.,  5.,  3.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
1.,  1.,  1.,  1.])

As for speed:

def loop(a, idx):
    for i in idx:
        a[i] += 2

def count(a, idx):
    a += 2 * np.bincount(idx, minlength=len(a))

%timeit loop(np.ones(30), np.array([2, 3, 2]))
1 loops, best of 3: 19.9 us per loop

%timeit count(np.ones(30), np.array([2, 3, 2]))
10 loops, best of 3: 19.2 us per loop

So no big difference here.  But go to larger systems and you'll see a huge
difference:

%timeit loop(np.ones(1), np.random.randint(1, size=10))
1 loops, best of 3: 260 ms per loop

%timeit count(np.ones(1), np.random.randint(1, size=10))
100 loops, best of 3: 3.03 ms per loop.
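
Another option, assuming a NumPy version that has ufunc.at (1.8 or later, I believe), is np.add.at, which performs unbuffered in-place addition so repeated indices accumulate. This is only a sketch next to the bincount approach; I have not timed it here.

```python
import numpy as np

a = np.ones(30)
idx = np.array([2, 3, 2])    # index 2 appears twice

# Unbuffered in-place add: every occurrence of an index is applied
np.add.at(a, idx, 2)

# The bincount form from above, for comparison
b = np.ones(30)
b += 2 * np.bincount(idx, minlength=len(b))
```

Both give 5 at position 2 and 3 at position 3, unlike a[idx] += 2, which applies the duplicate index only once.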

~Brett


On Fri, Feb 22, 2013 at 8:38 PM, santhu kumar mesan...@gmail.com wrote:

 Sorry typo :

 a = np.ones(30)
 idx = np.array([2,3,2]) # there is a duplicate index of 2
 a[idx] += 2

 On Fri, Feb 22, 2013 at 8:35 PM, santhu kumar mesan...@gmail.com wrote:

 Hi all,

 I dont want to run a loop for this but it should be possible using numpy
 smart ways.

 a = np.ones(30)
 idx = np.array([2,3,2]) # there is a duplicate index of 2
 a += 2

 a
 array([ 1.,  1.,  3.,  3.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
 1.,  1.,  1.,  1.])


 But if we do this :
 for i in range(idx.shape[0]):
     a[idx[i]] += 2

  a
 array([ 1.,  1.,  5.,  3.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
 1.,  1.,  1.,  1.])

 How to achieve the second result without looping??
 Thanks
 Santhosh





Re: [Numpy-discussion] Is there a more efficient way to do this?

2012-08-08 Thread Brett Olsen
On Wed, Aug 8, 2012 at 9:19 AM, Laszlo Nagy gand...@shopzeus.com wrote:
 Is there a more efficient way to calculate the slices array below?

 I do not want to make copies of DATA, because it can be huge. The
 argsort is fast enough. I just need to create slices for different
 dimensions. The above code works, but it does a linear time search,
 implemented in pure Python code. For every iteration, Python code is
 executed. For 1 million rows, this is very slow. Is there a way to
 produce slices with numpy code? I could write C code for this, but I
 would prefer to do it with mass numpy operations.

 Thanks,

Laszlo

#Code
import numpy as np

#rows between 100 to 1M
rows = 1000
data = np.random.random_integers(0, 100, rows)

def get_slices_slow(data):
    o = np.argsort(data)

    slices = []
    prev_val = None
    sidx = -1

    for oidx, rowidx in enumerate(o):
        val = data[rowidx]
        if not val == prev_val:
            if prev_val is None:
                prev_val = val
                sidx = oidx
            else:
                slices.append((prev_val, sidx, oidx))
                sidx = oidx
                prev_val = val

    if (sidx >= 0) and (sidx < rows):
        slices.append((val, sidx, rows))
    slices = np.array(slices, dtype=np.int64)
    return slices

def get_slices_fast(data):
    nums = np.unique(data)
    slices = np.zeros((len(nums), 3), dtype=np.int64)
    slices[:,0] = nums
    count = 0
    for i, num in enumerate(nums):
        count += (data == num).sum()
        slices[i,2] = count
    slices[1:,1] = slices[:-1,2]
    return slices

def get_slices_faster(data):
    nums = np.unique(data)
    slices = np.zeros((len(nums), 3), dtype=np.int64)
    slices[:,0] = nums
    count = np.bincount(data)
    slices[:,2] = count.cumsum()
    slices[1:,1] = slices[:-1,2]
    return slices

#Testing in ipython
In [2]: (get_slices_slow(data) == get_slices_fast(data)).all()
Out[2]: True

In [3]: (get_slices_slow(data) == get_slices_faster(data)).all()
Out[3]: True

In [4]: timeit get_slices_slow(data)
100 loops, best of 3: 3.51 ms per loop

In [5]: timeit get_slices_fast(data)
1000 loops, best of 3: 1.76 ms per loop

In [6]: timeit get_slices_faster(data)
1 loops, best of 3: 116 us per loop

So using the fast bincount and array indexing methods gets you about a
factor of 30 improvement.  Even just doing the counting in a loop with
good indexing will get you a factor of 2.
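
One caveat with the bincount version: np.bincount returns a count for every integer from 0 up to the maximum, so it only lines up with np.unique when every value in that range actually occurs in the data. A possible variant that avoids this, assuming a NumPy recent enough to have return_counts in np.unique (1.9 or later, I believe), is sketched below; get_slices_unique is a made-up name and the sample data is only for illustration.

```python
import numpy as np

def get_slices_unique(data):
    # Distinct values (sorted) and how often each occurs
    nums, counts = np.unique(data, return_counts=True)
    ends = counts.cumsum()      # one past the last sorted position of each value
    starts = ends - counts      # first sorted position of each value
    return np.column_stack((nums, starts, ends)).astype(np.int64)

sample = np.array([5, 3, 5, 9, 3, 5])
slices = get_slices_unique(sample)
```

Each row is (value, start, stop) into the argsorted data, the same layout as the functions above.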

~Brett


Re: [Numpy-discussion] numpy array in networkx graph?

2012-06-12 Thread Brett Olsen
This seems to work:

import networkx as nx
import pylab
import numpy as N

M = N.random.random((10, 10))
G = nx.Graph(M)
node_colors = []
for i in xrange(len(M)):
    if M[i,0] < 0.5:
        node_colors.append('white')
    else:
        node_colors.append('blue')
nx.draw(G, node_color=node_colors)
pylab.show()

~Brett

On Tue, Jun 12, 2012 at 1:49 PM, bob tnur bobtnu...@gmail.com wrote:
 can anyone give me a hint on the following code?

 import network as nx
 import pylab as plt

 G=nx.Graph(M)  # M is numpy matrix ,i.e:type(M)=numpy.ndarray
 for i in xrange(len(M)):
   tt=P[i,:].sum()
   if tt==1:
   G.add_node(i,color='blue')
   elif tt==2:
   G.add_node(i,color='red')
   elif tt==3:
   G.add_node(i,color='white')
   else:
   tt==4
   G.add_node(i,color='green')
 G.nodes(data=True)
 T=nx.draw(G)
 plt.axis('off')
 plt.savefig(test.png)

 I didn't get color change, still the defualt color is used.Did I miss
 something?

 my aim is to obtain:
 something like:
 find total number of w-red-red-z path
    number of w-red-red-red-z path
    number of w-red-red-red-red-z path
 where w (left side of some cyclic polygon(can also be conjugated ring))
 and z(right-side of it)are any of the colors except red.

 any comment is appreciated?
 Thanks
 Bob



Re: [Numpy-discussion] all elements equal

2012-03-05 Thread Brett Olsen
 Another issue to watch out for is if the array is empty.  Technically
 speaking, that should be True, but some of the solutions offered so far
 would fail in this case.

Similarly, NaNs or Infs could cause problems:  they should signal as
False, but several of the solutions would return True.
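
To make those edge cases concrete, here is one possible sketch that follows the conventions described above: an empty array counts as all-equal, and arrays containing NaN or Inf signal False. The function name all_equal and the require_finite switch are my own inventions, not something proposed in the thread.

```python
import numpy as np

def all_equal(arr, require_finite=True):
    flat = np.asarray(arr).ravel()
    # An empty array is treated as all-equal
    if flat.size == 0:
        return True
    # Optionally reject NaN/Inf outright for float and complex data
    if require_finite and flat.dtype.kind in "fc" and not np.isfinite(flat).all():
        return False
    # NaN never compares equal to itself, so NaNs give False here too
    return bool(np.all(flat == flat[0]))
```

Note that with require_finite=False an array of identical infinities compares equal, since inf == inf is True, while NaNs still give False.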

~Brett


Re: [Numpy-discussion] Forbidden charcter in the names argument of genfromtxt?

2012-02-20 Thread Brett Olsen
On Sat, Feb 18, 2012 at 8:12 PM, Adam Hughes hugad...@gwmail.gwu.edu wrote:
 Hey everyone,

 I have timeseries data in which the column label is simply a filename from
 which the original data was taken.  Here's some sample data:

 name1.txt  name2.txt  name3.txt
 32  34    953
 32  03    402

 I've noticed that the standard genfromtxt() method works great; however, the
 names aren't written correctly.  That is, if I use the command:

 print data['name1.txt']

 Nothing happens.

 However, when I remove the file extension, Eg:

 name1  name2  name3
 32  34    953
 32  03    402

 Then print data['name1'] return (32, 32) as expected.  It seems that the
 period in the name isn't compatible with the genfromtxt() names attribute.
 Is there a workaround, or do I need to restructure my program to get the
 extension removed?  I'd rather not do this if possible for reasons that
 aren't important for the discussion at hand.

It looks like the period is just getting stripped out of the names:

In [1]: import numpy as N

In [2]: N.genfromtxt('sample.txt', names=True)
Out[2]:
array([(32.0, 34.0, 954.0), (32.0, 3.0, 402.0)],
  dtype=[('name1txt', 'f8'), ('name2txt', 'f8'), ('name3txt', 'f8')])

Interestingly, this still happens if you supply the names manually:

In [17]: def reader(filename):
   ....:     infile = open(filename, 'r')
   ....:     names = infile.readline().split()
   ....:     data = N.genfromtxt(infile, names=names)
   ....:     infile.close()
   ....:     return data
   ....:

In [20]: data = reader('sample.txt')

In [21]: data
Out[21]:
array([(32.0, 34.0, 954.0), (32.0, 3.0, 402.0)],
  dtype=[('name1txt', 'f8'), ('name2txt', 'f8'), ('name3txt', 'f8')])

What you can do is reset the names after genfromtxt is through with it, though:

In [34]: def reader(filename):
   ....:     infile = open(filename, 'r')
   ....:     names = infile.readline().split()
   ....:     infile.close()
   ....:     data = N.genfromtxt(filename, names=True)
   ....:     data.dtype.names = names
   ....:     return data
   ....:

In [35]: data = reader('sample.txt')

In [36]: data
Out[36]:
array([(32.0, 34.0, 954.0), (32.0, 3.0, 402.0)],
  dtype=[('name1.txt', 'f8'), ('name2.txt', 'f8'), ('name3.txt', 'f8')])

Be warned, I don't know why the period is getting stripped; there may
be a good reason, and adding it in might cause problems.
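
For anyone who wants to try this without a file on disk, here is a self-contained sketch of the same rename-afterwards trick. It uses io.StringIO as a stand-in for sample.txt, with the sample rows from the question; the in-memory text and the reader signature taking a file object are assumptions for illustration only.

```python
import io
import numpy as np

text = "name1.txt name2.txt name3.txt\n32 34 953\n32 03 402\n"

def reader(fileobj):
    # Remember the header exactly as written, periods included
    names = fileobj.readline().split()
    fileobj.seek(0)
    data = np.genfromtxt(fileobj, names=True)
    # Put the original names back after genfromtxt has sanitised them
    data.dtype.names = tuple(names)
    return data

data = reader(io.StringIO(text))
col = data['name1.txt']
```

The same caveat applies: the names are restored by hand, so anything downstream that expects sanitised field names may not like the periods.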

~Brett


Re: [Numpy-discussion] (no subject)

2012-02-06 Thread Brett Olsen
The namespace is different.  If you want to use numpy.sin(), for
example, you would use:

import numpy as np
np.sin(angle)

or

from numpy import *
sin(angle)

I generally prefer the first option because then I don't need to worry
about multiple imports writing on top of each other (i.e., having test
functions in several modules, and then accidentally using the wrong
one).
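
A small illustration of the kind of clash I mean, using sum as the example (my choice, not from the question): after "from numpy import *", the name sum refers to numpy's version, which treats its second argument as an axis rather than a start value. The sketch below calls both versions explicitly so the difference is visible without doing the star import.

```python
import builtins
import numpy as np

values = [0, 1, 2, 3, 4]

# Built-in sum: the second argument is the start value, so 10 + (-1)
builtin_result = builtins.sum(values, -1)

# numpy.sum: the second argument is the axis, so the plain total
numpy_result = np.sum(values, -1)
```

The same call gives 9 with the built-in and 10 with numpy, which is easy to miss when the names have been silently replaced.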

~Brett

On Mon, Feb 6, 2012 at 1:21 PM, Debashish Saha silid...@gmail.com wrote:
 basic difference between the commands:
 import numpy as np
 from numpy import *


Re: [Numpy-discussion] Addressing arrays

2012-01-30 Thread Brett Olsen
On Mon, Jan 30, 2012 at 10:57 AM, Ted To rainexpec...@theo.to wrote:
 Sure thing.  To keep it simple suppose I have just a two dimensional
 array (time,output):
 [(1,2),(2,3),(3,4)]
 I would like to look at all values of output for which, for example time==2.

 My actual application has a six dimensional array and I'd like to look
 at the contents using one or more of the first three dimensions.

 Many thanks,
 Ted

Couldn't you just do something like this with boolean indexing:

In [1]: import numpy as np

In [2]: a = np.array([(1,2),(2,3),(3,4)])

In [3]: a
Out[3]:
array([[1, 2],
   [2, 3],
   [3, 4]])

In [4]: mask = a[:,0] == 2

In [5]: mask
Out[5]: array([False,  True, False], dtype=bool)

In [6]: a[mask,1]
Out[6]: array([3])

~Brett


Re: [Numpy-discussion] Addressing arrays

2012-01-30 Thread Brett Olsen
On Mon, Jan 30, 2012 at 11:31 AM, Ted To rainexpec...@theo.to wrote:
 Thanks!  That works great if I only want to search over one index but I
 can't quite figure out what to do with more than a single index.  So
 suppose I have a labeled, multidimensional array with labels 'month',
 'year' and 'quantity'.  a[['month','year']] gives me an array of indices
 but a[['month','year']]==(1,1960) produces False.  I'm sure I simply
 don't know the proper syntax and I apologize for that -- I'm kind of new
 to numpy.

 Ted

You'd want to update your mask appropriately to get everything you
want to select, one criterion at a time, e.g.:
mask = a[:,0] == 1
mask &= a[:,1] == 1960

Alternatively:
mask = (a[:,0] == 1) & (a[:,1] == 1960)
but be careful with the parens: & and | are normally high-priority
bitwise operators, and if you leave the parens out, it will try to
bitwise-and 1 and a[:,1] and throw an error.

If you've got a ton of parameters, you can combine these more
aesthetically with:
mask = (a[:,[0,1]] == [1, 1960]).all(axis=1)
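
Here is a small runnable version of the three forms on a made-up array, assuming plain integer columns for month, year and quantity (the data values are invented for illustration):

```python
import numpy as np

# Hypothetical columns: month, year, quantity
a = np.array([[1, 1960, 10],
              [1, 1961, 20],
              [2, 1960, 30],
              [1, 1960, 40]])

# Incremental form: start with one criterion, then narrow it down in place
mask = a[:, 0] == 1
mask &= a[:, 1] == 1960

# One-shot forms
mask_and = (a[:, 0] == 1) & (a[:, 1] == 1960)
mask_all = (a[:, [0, 1]] == [1, 1960]).all(axis=1)

quantities = a[mask, 2]
```

All three masks select the same rows, so which one to use is mostly a matter of how many criteria there are.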

~Brett


Re: [Numpy-discussion] How to output array with indexes to a text file?

2011-08-26 Thread Brett Olsen
On Thu, Aug 25, 2011 at 2:10 PM, Paul Menzel
paulepan...@users.sourceforge.net wrote:
 is there an easy way to also save the indexes of an array (columns, rows
 or both) when outputting it to a text file. For saving an array to a
 file I only found `savetxt()` [1] which does not seem to have such an
 option. Adding indexes manually is doable but I would like to avoid
 that.
 Is there a way to accomplish that task without reserving the 0th row or
 column to store the indexes?

 I want to process these text files to produce graphs and MetaPost’s [2]
 graph package needs these indexes. (I know about Matplotlib [3], but I
 would like to use MetaPost.)


 Thanks,

 Paul

Why don't you just write a wrapper for numpy.savetxt that adds the
indices?  E.g.:

In [1]: import numpy as N

In [2]: a = N.arange(6,12).reshape((2,3))

In [3]: a
Out[3]:
array([[ 6,  7,  8],
   [ 9, 10, 11]])

In [4]: def save_with_indices(filename, output):
   ...:     (rows, cols) = output.shape
   ...:     tmp = N.hstack((N.arange(1,rows+1).reshape((rows,1)), output))
   ...:     tmp = N.vstack((N.arange(cols+1).reshape((1,cols+1)), tmp))
   ...:     N.savetxt(filename, tmp, fmt='%8i')
   ...:

In [5]: N.savetxt('noidx.txt', a, fmt='%8i')

In [6]: save_with_indices('idx.txt', a)

'noidx.txt' looks like:
       6        7        8
       9       10       11
'idx.txt' looks like:
       0        1        2        3
       1        6        7        8
       2        9       10       11

~Brett


Re: [Numpy-discussion] Finding many ways to incorrectly create a numpy array. Please advice

2011-08-02 Thread Brett Olsen
On Tue, Aug 2, 2011 at 9:44 AM, Jeremy Conlin jlcon...@gmail.com wrote:
 I am trying to create a numpy array from some text I'm reading from a
 file. Ideally, I'd like to create a structured array with the first
 element as an int and the remaining as floats. I'm currently
 unsuccessful in my attempts. I've copied a simple script below that
 shows what I've done and the wrong output. Can someone please show me
 what is happening?

 I'm using numpy version 1.5.1 under Python 2.7.1 on a Mac running Snow 
 Leopard.

 Thanks,
 Jeremy

I'd use numpy.loadtxt:

In [1]: import numpy, StringIO

In [2]: l = '  32000  7.89131E-01  8.05999E-03  3.88222E+03'

In [3]: tfc_dtype = numpy.dtype([('nps', 'u8'), ('t', 'f8'), ('e',
'f8'), ('fom', 'f8')])

In [4]: input = StringIO.StringIO(l)

In [5]: numpy.loadtxt(input, dtype=tfc_dtype)
Out[5]:
array((32000L, 0.789131003, 0.00805998995, 3882.21998),
  dtype=[('nps', 'u8'), ('t', 'f8'), ('e', 'f8'), ('fom', 'f8')])

In [6]: input.close()

In [7]: input = StringIO.StringIO(l)

In [8]: numpy.loadtxt(input)
Out[8]:
array([  3.20000000e+04,   7.89131000e-01,   8.05999000e-03,
         3.88222000e+03])

In [9]: input.close()

If you're reading from a file you can replace the StringIO objects
with file objects.
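
On Python 3 the StringIO module no longer exists, so a roughly equivalent sketch would use io.StringIO instead. The names line, stream and record are just for illustration; the dtype is the one from the session above.

```python
import io
import numpy as np

line = '  32000  7.89131E-01  8.05999E-03  3.88222E+03'
tfc_dtype = np.dtype([('nps', 'u8'), ('t', 'f8'), ('e', 'f8'), ('fom', 'f8')])

# The context manager closes the in-memory stream when loadtxt is done
with io.StringIO(line) as stream:
    record = np.loadtxt(stream, dtype=tfc_dtype)

nps = int(record['nps'])
t = float(record['t'])
```

Fields are then accessed by name, e.g. record['fom'].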

~Brett


Re: [Numpy-discussion] Fill a particular value in the place of number satisfying certain condition by another number in an array.

2011-08-01 Thread Brett Olsen
This method is probably simpler:

In [1]: import numpy as N

In [2]: A = N.random.random_integers(-10, 10, 25).reshape((5, 5))

In [3]: A
Out[3]:
array([[ -5,   9,   1,   9,  -2],
   [ -8,   0,   9,   7, -10],
   [  2,  -3,  -1,   5,  -7],
   [  0,  -2,  -2,   9,   1],
   [ -7,  -9,  -4,  -1,   6]])

In [4]: A[A < 0] = 0

In [5]: A
Out[5]:
array([[0, 9, 1, 9, 0],
   [0, 0, 9, 7, 0],
   [2, 0, 0, 5, 0],
   [0, 0, 0, 9, 1],
   [0, 0, 0, 0, 6]])
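
If you would rather not modify A in place, or want to avoid building the boolean mask, a few alternatives are sketched below on a small made-up float array. All three functions are standard NumPy; the variable names and values are only for illustration.

```python
import numpy as np

A = np.array([[-5.0, 9.0, 1.0],
              [2.5, -0.1, 0.0]])

zeroed_where = np.where(A < 0, 0.0, A)   # new array, A untouched
zeroed_clip = np.clip(A, 0, None)        # new array, lower bound only

B = A.copy()
np.maximum(B, 0, out=B)                  # in place, no boolean mask needed
```

These work the same way on arrays of any dimension, including the 3-D float array in the question.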

~Brett

On Mon, Aug 1, 2011 at 4:31 AM, dileep kunjaai dileepkunj...@gmail.com wrote:
 Dear sir,
    How can we replace the numbers in an array that satisfy a certain
 condition with another number?


 Example:
  A=[[[  9.42233087e-42  - 4.71116544e-42   0.e+00 ...,
 1.48303127e+01
  1.31524124e+01   1.14745111e+01]
   [  3.91788793e+00   1.95894396e+00   0.e+00 ...,   1.78252487e+01
  1.28667984e+01   7.90834856e+00]
   [  7.83592510e+00   -3.91796255e+00   0.e+00 ...,   2.08202991e+01
  1.25811749e+01   4.34205008e+00]
   ...,
   [  -8.51249974e-03   7.00901222e+00   -1.40095119e+01 ...,
 0.e+00
  0.e+00   0.e+00]
   [  4.26390441e-03   3.51080871e+00   -7.01735353e+00 ...,   0.e+00
  0.e+00   0.e+00]
   [  0.e+00   0.e+00   0.e+00 ...,   0.e+00
  0.e+00   0.e+00]]

  [[  9.42233087e-42   -4.71116544e-42   0.e+00 ...,   8.48242474e+00
  7.97146845e+00   7.46051216e+00]
   [  5.16325808e+00   2.58162904e+00   0.e+00 ...,   8.47719383e+00
  8.28024673e+00   8.08330059e+00]
   [  1.03267126e+01   5.16335630e+00   0.e+00 ...,   8.47196198e+00
  8.58903694e+00   8.70611191e+00]
   ...,
   [  0.e+00   2.74500012e-01   5.4925e-01 ...,   0.e+00
  0.e+00   0.e+00]
   [  0.e+00   1.37496844e-01   -2.74993688e-01 ...,   0.e+00
  0.e+00   0.e+00]
   [  0.e+00   0.e+00   0.e+00 ...,   0.e+00
  0.e+00   0.e+00]]

  [[  9.42233087e-42   4.71116544e-42   0.e+00 ...,   1.18437748e+01
  9.72778034e+00   7.61178637e+00]
   [  2.96431869e-01   1.48215935e-01   0.e+00 ...,   1.64031239e+01
  1.32768812e+01   1.01506386e+01]
   [  5.92875004e-01   2.96437502e-01   0.e+00 ...,   2.09626484e+01
  1.68261185e+01   1.26895866e+01]
   ...,
   [  1.78188753e+00   -8.90943766e-01   0.e+00 ...,   0.e+00
  1.2755e-03   2.5509e-03]
   [  9.34620261e-01   -4.67310131e-01   0.e+00 ...,   0.e+00
  6.38646539e-04   1.27729308e-03]
   [  8.4339e-02   4.21500020e-02   0.e+00 ...,   0.e+00
  0.e+00   0.e+00]]]
    A contains some negative values; I want to change the negative numbers to
 '0'.
 I used the 'masked_where' command, but I failed.



 Please help me

 --
 DILEEPKUMAR. R
 J R F, IIT DELHI






Re: [Numpy-discussion] Alternative to boolean array

2011-07-20 Thread Brett Olsen
On Tue, Jul 19, 2011 at 11:08 AM, Robert Kern robert.k...@gmail.com wrote:
 On Tue, Jul 19, 2011 at 07:38, Andrea Cimatoribus
 g.plantagen...@gmail.com wrote:
 Dear all,
 I would like to avoid the use of a boolean array (mask) in the following
 statement:

 mask = (A != 0.)
 B   = A[mask]

 in order to be able to move this bit of code in a cython script (boolean
 arrays are not yet implemented there, and they slow down execution a lot as
 they can't be defined explicitly).
 Any idea of an efficient alternative?

 You will have to count the number of True values, create the B array
 with the right size, then run a simple loop to assign into it where A
 != 0. This makes you do the comparisons twice.

 Or you can allocate a B array the same size as A, run your loop to
 assign into it when A != 0 and incrementing the index into B, then
 slice out or memcpy out the portion that you assigned.

According to my calculations, the last method is the fastest, though
the savings aren't considerable.

In cython, defining some test mask functions (saved as cython_mask.pyx):
import numpy as N
cimport numpy as N

def mask1(N.ndarray[N.int32_t, ndim=1] A):
    # Baseline: let numpy do the boolean indexing.
    cdef N.ndarray[N.int32_t, ndim=1] B
    B = A[A != 0]
    return B

def mask2(N.ndarray[N.int32_t, ndim=1] A):
    # Two passes: count the nonzero values, then fill an exactly sized B.
    cdef int i
    cdef int count = 0
    for i in range(len(A)):
        if A[i] == 0: continue
        count += 1

    # dtype must match the int32_t buffer type declared for B.
    cdef N.ndarray[N.int32_t, ndim=1] B = N.empty(count, dtype=N.int32)
    count = 0
    for i in range(len(A)):
        if A[i] == 0: continue
        B[count] = A[i]
        count += 1
    return B

def mask3(N.ndarray[N.int32_t, ndim=1] A):
    # One pass: fill a B as large as A, then slice off the part used.
    cdef N.ndarray[N.int32_t, ndim=1] B = N.empty(len(A), dtype=N.int32)
    cdef int i
    cdef int count = 0
    for i in range(len(A)):
        if A[i] == 0: continue
        B[count] = A[i]
        count += 1
    return B[:count]

In [1]: import numpy as N

In [2]: import timeit

In [3]: from cython_mask import *

In [4]: A = N.random.randint(0, 2, 1)

In [5]: def mask4(A):
   ...:     return A[A != 0]
   ...:

In [6]: %timeit mask1(A)
1 loops, best of 3: 195 us per loop

In [7]: %timeit mask2(A)
1 loops, best of 3: 136 us per loop

In [8]: %timeit mask3(A)
1 loops, best of 3: 117 us per loop

In [9]: %timeit mask4(A)
1 loops, best of 3: 193 us per loop
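
For readers without a Cython build at hand, here is a plain-Python sketch of the two loop strategies. The function names are made up for illustration, and the code only shows that both produce the same result as boolean indexing; it says nothing about speed, since the loops are only fast once compiled.

```python
import numpy as np

def mask_two_pass(A):
    """Count first, then fill an exactly sized output (the mask2 idea)."""
    count = 0
    for x in A:
        if x != 0:
            count += 1
    B = np.empty(count, dtype=A.dtype)
    j = 0
    for x in A:
        if x != 0:
            B[j] = x
            j += 1
    return B

def mask_overallocate(A):
    """Fill a full-size buffer once, then slice off the used part (mask3)."""
    B = np.empty(len(A), dtype=A.dtype)
    j = 0
    for x in A:
        if x != 0:
            B[j] = x
            j += 1
    return B[:j]

A = np.array([0, 3, 0, 0, 7, 1, 0, 5], dtype=np.int32)
expected = A[A != 0]
```

One thing to keep in mind with the over-allocating version: the returned slice is a view, so the full-size buffer stays alive as long as the result does. Call .copy() on it if that matters.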

~Brett


Re: [Numpy-discussion] Beginner's question

2011-04-20 Thread Brett Olsen
On Sat, Apr 16, 2011 at 2:08 PM, Laszlo Nagy gand...@shopzeus.com wrote:
 import numpy as np
 import numpy.random as rnd

 def dim_weight(X):
     weights = X[0]
     volumes = X[1]*X[2]*X[3]
     res = np.empty(len(volumes), dtype=np.double)
     for i,v in enumerate(volumes):
          if v > 5184:
             res[i] = v/194.0
         else:
             res[i] = weights[i]
     return res
 N = 10
 X = rnd.randint( 1,25, (4,N))
 print dim_weight(X)
    Laszlo

This works:

def dim_weight2(X):
    w = X[0]
    v = X[1]*X[2]*X[3]
    res = np.empty(len(v), dtype=np.double)
    res[:] = w[:]
    res[v > 5184] = v[v > 5184]/194.0
    return res
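
The same selection can probably be written in one step with np.where. This is a sketch, not a drop-in replacement: dim_weight_where and dim_weight_loop are names made up here, the loop version is only a reference for comparison, and the random input uses an arbitrary seed.

```python
import numpy as np

def dim_weight_loop(X):
    """Reference: element-by-element version of the rule."""
    weights = X[0]
    volumes = X[1] * X[2] * X[3]
    res = np.empty(len(volumes), dtype=np.double)
    for i, v in enumerate(volumes):
        res[i] = v / 194.0 if v > 5184 else weights[i]
    return res

def dim_weight_where(X):
    """Vectorized: choose v/194 where the volume exceeds 5184, else the weight."""
    w = X[0]
    v = X[1] * X[2] * X[3]
    return np.where(v > 5184, v / 194.0, w)

rng = np.random.default_rng(1)          # arbitrary seed
X = rng.integers(1, 25, size=(4, 10))   # same value range as the original test
```

np.where evaluates both branches for every element, which is harmless here because neither branch can fail.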

~Brett


Re: [Numpy-discussion] slicing / indexing question

2010-09-21 Thread Brett Olsen
On Tue, Sep 21, 2010 at 6:20 PM, Timothy W. Hilton hil...@meteo.psu.edu wrote:
 Hello,

 I have an indexing problem which I suspect has a simple solution, but
 I've not been able to piece together various threads I've read on this
 list to solve.

 I have an 80x1200x1200 nd.array of floats this_par.  I have a
 1200x1200 boolean array idx, and an 80-element float array pars.  For
 each element of idx that is True, I wish to replace the corresponding
 80x1x1 slice of this_par with the elements of pars.

 I've tried lots of variations on the theme of
this_par[idx[np.newaxis, ...]] = pars[:, np.newaxis, np.newaxis]
 but so far, no dice.

 Any help greatly appreciated!

 Thanks,
 Tim

This works, although I imagine it could be streamlined.

In [1]: this_par = N.ones((2,4,4))
In [2]: idx = N.random.random((4,4)) > 0.5
In [3]: pars = N.arange(2) - 10
In [4]: this_par[:,idx] = N.tile(pars, (idx.sum(), 1)).transpose()
In [5]: idx
Out[5]:
array([[ True, False,  True, False],
       [False, False,  True,  True],
       [False, False, False, False],
       [False, False, False, False]], dtype=bool)
In [6]: this_par
Out[6]:
array([[[-10.,   1., -10.,   1.],
        [  1.,   1., -10., -10.],
        [  1.,   1.,   1.,   1.],
        [  1.,   1.,   1.,   1.]],
       [[ -9.,   1.,  -9.,   1.],
        [  1.,   1.,  -9.,  -9.],
        [  1.,   1.,   1.,   1.],
        [  1.,   1.,   1.,   1.]]])
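
One possible streamlining, offered as a sketch: this_par[:, idx] has shape (len(pars), idx.sum()), so a column-shaped pars should broadcast across it without any tiling. The idx below is hard-coded to the mask printed above so the two approaches can be compared directly.

```python
import numpy as np

this_par = np.ones((2, 4, 4))
idx = np.array([[True, False, True, False],
                [False, False, True, True],
                [False, False, False, False],
                [False, False, False, False]])
pars = np.arange(2) - 10

# Tiling approach from the session above.
tiled = this_par.copy()
tiled[:, idx] = np.tile(pars, (idx.sum(), 1)).transpose()

# Broadcasting approach: a (2, 1) column stretches over the selected cells.
broadcast = this_par.copy()
broadcast[:, idx] = pars[:, np.newaxis]
```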

Brett


Re: [Numpy-discussion] Two questions on indexing

2010-09-15 Thread Brett Olsen
On Wed, Sep 15, 2010 at 4:38 PM, Mark Fenner mfen...@gmail.com wrote:
 A separate question.  Suppose I have a slice for indexing that looks like:

 [:, :, 2, :, 5]

 How can I get an indexing slice for all OTHER dimension values besides
 those specified.  Conceptually, something like:

 [:, :, all but 2, :, all but 5]

 Incidentally, the goal is to construct a new array with all those
 other spots filled in with zero and the specified spots with their
 original values.  Would it be easier to construct a 0-1 indicator
 array with 1s in the [:,:,2,:,5] positions and multiply it out?  Humm,
 I may have just answered my own question.

 For argument sake, how would you do it with indexing/slicing?  I
 suppose one develops some intuition as one gains experience with numpy
 with regards to when to (1) use clever matrix ops and when to (2) use
 clever slicing and when to (3) use a combination of both.

This works, although I'm not sure how efficient it is compared to other methods:

In [19]: a = N.arange(16).reshape(4,4)
In [20]: a
Out[20]:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])

In [24]: a[:,N.arange(4) != 2]
Out[24]:
array([[ 0,  1,  3],
       [ 4,  5,  7],
       [ 8,  9, 11],
       [12, 13, 15]])
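
For the five-dimensional case in the question, here is a sketch of both readings of the problem. The array shape is made up for illustration. The first part builds the indicator array mentioned in the question and zeroes everything outside [:, :, 2, :, 5]; the second drops index 2 and index 5 along their axes with np.delete.

```python
import numpy as np

a = np.arange(2 * 3 * 4 * 2 * 6).reshape(2, 3, 4, 2, 6)  # made-up shape

# Keep the values at [:, :, 2, :, 5] and zero everything else.
keep = np.zeros(a.shape, dtype=bool)
keep[:, :, 2, :, 5] = True
zeroed = np.where(keep, a, 0)

# "All but 2" on axis 2 and "all but 5" on axis 4, as a smaller array.
others = np.delete(np.delete(a, 2, axis=2), 5, axis=4)
```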

Brett


Re: [Numpy-discussion] scan array to extract min-max values (with if condition)

2010-09-11 Thread Brett Olsen
On Sat, Sep 11, 2010 at 7:45 AM, Massimo Di Stefano
massimodisa...@gmail.com wrote:
 Hello All,

 i need to extract data from an array, that are inside a
 rectangle area defined as :

 N, S, E, W = 234560.94503118, 234482.56929822, 921336.53116178, 921185.3779625

 the data are in a csv (comma delimited text file, with 3 columns X,Y,Z)

 #X,Y,Z
 3020081.5500,76.3100,0.0300
 3020086.2000,769991.6500,0.4600
 3020099.6600,769996.2700,0.9000
 ...
 ...

 i read it using  numpy.loadtxt 

 data :

 http://www.geofemengineering.it/data/csv.txt     5,3 mb (158735 rows)

 to extract data that are inside the bounding-box area (N, S, E, W) i'm using a
 loop inside a function like :

 import numpy as np

 def getMinMaxBB(data, N, S, E, W):
        mydata = data * 0.3048006096012
        for i in range(len(mydata)):
                 if mydata[i,0] < E or mydata[i,0] > W or mydata[i,1] < N or mydata[i,1] > S :
                        if i == 0:
                                newdata = 
 np.array((mydata[i,0],mydata[i,1],mydata[i,2]), float)
                        else :
                                newdata = np.vstack((newdata,(mydata[i,0], 
 mydata[i,1], mydata[i,2])))
        results = {}
        results['Max_Z'] = newdata.max(0)[2]
        results['Min_Z'] = newdata.min(0)[2]
        results['Num_P'] = len(newdata)
        return results


 N, S, E, W = 234560.94503118, 234482.56929822, 921336.53116178, 921185.3779625
 data = '/Users/sasha/csv.txt'
 mydata = np.loadtxt(data, comments='#', delimiter=',')
 out = getMinMaxBB(mydata, N, S, E, W)

 print out

Use boolean arrays to index the parts of your array that you want to look at:

def newGetMinMax(data, N, S, E, W):
    mydata = data * 0.3048006096012
    # Start with nothing selected, then OR in each condition.
    mask = np.zeros(mydata.shape[0], dtype=bool)
    mask |= mydata[:,0] < E
    mask |= mydata[:,0] > W
    mask |= mydata[:,1] < N
    mask |= mydata[:,1] > S
    results = {}
    results['Max_Z'] = mydata[mask,2].max()
    results['Min_Z'] = mydata[mask,2].min()
    results['Num_P'] = mask.sum()
    return results

This runs about 5000 times faster on my machine.

Brett


Re: [Numpy-discussion] scan array to extract min-max values (with if condition)

2010-09-11 Thread Brett Olsen
On Sat, Sep 11, 2010 at 4:46 PM, Massimo Di Stefano
massimodisa...@gmail.com wrote:
 Thanks Pierre,

 i tried it and all works fine and fast.

 my apologies :-(

 i used the wrong if statement to represent my needs

 if mydata[i,0] < E or mydata[i,0] > W or mydata[i,1] < N or mydata[i,1] > S :

 ^^ totally wrong for my needs ^^


 this if instead :

 if W < mydata[i,0] < E and S < mydata[i,1] < N:

 should reflect your example :

 yselect = (data[:,1] <= N) & (data[:,1] >= S)
 xselect = (data[:,0] <= E) & (data[:,0] >= W)
 selected_data = data[xselect & yselect]


 a question, how to code a masked array,
 as in Brett's code, to reflect the new (right) if statement ?

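Just replace the lines

  mask |= mydata[:,0] < E
  mask |= mydata[:,0] > W
  mask |= mydata[:,1] < N
  mask |= mydata[:,1] > S

with

  mask &= mydata[:,0] < E
  mask &= mydata[:,0] > W
  mask &= mydata[:,1] < N
  mask &= mydata[:,1] > S

and start the mask as all True (np.ones instead of np.zeros), since
and-ing conditions into an all-False mask leaves it all False.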

Sorry, I wasn't paying attention to what you were actually trying to
do and just duplicated the function of the code you supplied.

There's a good primer on how to index with boolean arrays at
http://www.scipy.org/Tentative_NumPy_Tutorial#head-d55e594d46b4f347c20efe1b4c65c92779f06268
that will explain why this works.
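
A small self-contained sketch of the and-ed version, using made-up bounds and points so the result can be checked by eye. The points array and its values are purely illustrative, not taken from the csv file.

```python
import numpy as np

N, S, E, W = 10.0, 0.0, 20.0, 5.0   # made-up bounds: S < y < N, W < x < E
pts = np.array([[ 6.0,  1.0, 0.5],  # inside
                [25.0,  2.0, 0.9],  # x too large
                [ 7.0, 11.0, 0.3],  # y too large
                [19.0,  9.0, 0.1],  # inside
                [ 4.0,  5.0, 0.7]]) # x too small

# Start with everything selected and AND in each condition.
inside = np.ones(len(pts), dtype=bool)
inside &= pts[:, 0] < E
inside &= pts[:, 0] > W
inside &= pts[:, 1] < N
inside &= pts[:, 1] > S

# The same mask as a single expression; the parentheses are required
# because & binds more tightly than the comparisons.
one_shot = (W < pts[:, 0]) & (pts[:, 0] < E) & (S < pts[:, 1]) & (pts[:, 1] < N)

z = pts[inside, 2]   # Z values of the points inside the box
```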

Brett


[Numpy-discussion] Boolean arrays

2010-08-27 Thread Brett Olsen
Hello,

I have an array of non-numeric data, and I want to create a boolean
array denoting whether each element in this array is a valid value
or not.  This is straightforward if there's only one possible valid
value:
>>> import numpy as N
>>> ar = N.array(("a", "b", "c", "b", "b", "a", "d", "c", "a"))
>>> ar == "a"
array([ True, False, False, False, False,  True, False, False,  True],
      dtype=bool)

If there are multiple possible valid values, I've come up with a couple
of possible methods, but they all seem to be inefficient or kludges:
>>> valid = N.array(("a", "c"))
>>> (ar == valid[0]) | (ar == valid[1])
array([ True, False,  True, False, False,  True, False,  True,  True],
      dtype=bool)
>>> N.array(map(lambda x: x in valid, ar))
array([ True, False,  True, False, False,  True, False,  True,  True],
      dtype=bool)

Is there a numpy-appropriate way to do this?

Thanks,
Brett Olsen
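
One candidate answer, sketched under the assumption that the installed NumPy provides np.isin (older releases offer np.in1d for the 1-D case): it tests every element of the first array for membership in the second and returns a boolean array of the same shape.

```python
import numpy as np

ar = np.array(["a", "b", "c", "b", "b", "a", "d", "c", "a"])
valid = np.array(["a", "c"])

# Membership test, element by element.
is_valid = np.isin(ar, valid)

# The hand-written version from the message above, for comparison.
by_hand = (ar == valid[0]) | (ar == valid[1])
```

Unlike the chained comparisons, this does not need to be rewritten when the number of valid values changes.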