Re: [Numpy-discussion] Numpy interpolate: cut through 2D matrix

2012-03-01 Thread Vincent Schut
On 03/01/2012 12:35 PM, Pierre Barthelemy wrote:
 Hello,

 for a data analysis tool I am programming, I need to plot a cut through
 a 2D graph. I have a 2D array, and the indices
 start=(start_x,start_y) and stop=(stop_x,stop_y) that give the positions
 of the starting point and the stop point of the cut. The code I programmed
 is placed at the bottom.

 This code returns only values existing in the original array: if the cut
 should pass between column index i and column index i+1, it nevertheless
 returns the value at column index i.
 Is it possible to use the numpy/scipy interpolation routines so that,
 when passing between i and i+1, the function returns an interpolation of
 the graph between the points [row,column] and [row,column+1]?

Pierre,

if you have scipy next to numpy, have a look at 
scipy.ndimage.map_coordinates. In short, you define the coordinates you 
want your array to be interpolated at, and it will give you the 
interpolated values, using spline interpolation.
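
For example, a minimal sketch (untested; the function and variable names 
below are mine, not from Pierre's code) of sampling 'data' along the line 
from (start_y, start_x) to (stop_y, stop_x):

import numpy as np
from scipy import ndimage

def cut(data, start, stop, num=100, order=1):
    # interpolated profile between two (row, col) points
    rows = np.linspace(start[0], stop[0], num)
    cols = np.linspace(start[1], stop[1], num)
    # map_coordinates expects a (ndim, npoints) coordinate array
    return ndimage.map_coordinates(data, np.vstack([rows, cols]), order=order)

data = np.random.rand(50, 60)
profile = cut(data, (10, 5), (40, 55))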

Best,
Vincent.

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] python geospatial package?

2012-02-23 Thread Vincent Schut
On 02/22/2012 10:45 PM, Chao YUE wrote:
 Hi all,

 Is anyone using some python geospatial package that can do jobs like
 intersection, etc.? The job is, for example, to automatically extract a
 region from a global map.

 thanks and cheers,

 Chao

Chao,

shapely would do this, though I found it had a bit of a steep learning 
curve. Or you could go the gdal/ogr way, which uses the geos library 
under the hood (if present) to do geometrical operations like 
intersections etc.
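
A minimal sketch of the shapely route (coordinates below are made up):

from shapely.geometry import Polygon, box

region = Polygon([(0, 0), (10, 0), (10, 10), (0, 10)])
window = box(5, 5, 20, 20)             # lon/lat bounding box
clipped = region.intersection(window)
print(clipped.area)                    # 25.0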

cheers,
Vincent.


 --
 ***
 Chao YUE
 Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL)
 UMR 1572 CEA-CNRS-UVSQ
 Batiment 712 - Pe 119
 91191 GIF Sur YVETTE Cedex
 Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16
 



 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] find location of maximum values

2012-01-06 Thread Vincent Schut
On Wed, 04 Jan 2012 12:29:36 +0100, Derek Homeier wrote:

 On 04.01.2012, at 5:10AM, questions anon wrote:
 
 Thanks for your responses but I am still having difficulties with this
 problem. Using argmax gives me one very large value and I am not sure
 what it is.

it is the index in the flattened array. To translate this into a 
multidimensional index, use numpy.unravel_index(i, original_shape).
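
A small illustration:

import numpy as np

a = np.random.rand(4, 5, 6)
i = a.argmax()                         # index into the flattened array
idx = np.unravel_index(i, a.shape)     # tuple of per-axis indices
assert a[idx] == a.max()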

Cheers,
Vincent.

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] local variable referenced before assignment

2011-05-31 Thread Vincent Schut
On 05/31/2011 11:04 AM, Edoardo Pasca wrote:
 Dear all,

 sometimes I encounter the problem that, when calling a function many
 times, some local variables end up not defined and the procedure
 crashes.

 For example I have a function defined as

 def procedure(tt, Ctissue, WeinmannFit, bodyWeight):
  for maxFastSampling in range(1,len(tt)):
   if tt[maxFastSampling] - tt[maxFastSampling-1] > 10:
  break

  if numpy.equal(Ctissue[:maxFastSampling] , 0 ).all():
  # do something

 Now the inputs are all numpy.array except bodyWeight that is a number.

 The program calls the function procedure many times but at some point
 fails at the evaluation of numpy.equal with

 UnboundLocalError: local variable 'maxFastSampling' referenced before 
 assignment

 I wonder whether you see any way that the maxFastSampling variable
 might not be set.

Edo,

When len(tt) is less than 2, range(1, len(tt)) gives an empty list, the for 
loop will never execute and maxFastSampling will not be set.
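
One way to make that failure explicit (a sketch, not the original code; the 
comparison threshold is assumed):

def procedure(tt, Ctissue, WeinmannFit, bodyWeight):
    maxFastSampling = None
    for maxFastSampling in range(1, len(tt)):
        if tt[maxFastSampling] - tt[maxFastSampling - 1] > 10:
            break
    if maxFastSampling is None:
        raise ValueError("tt needs at least 2 samples to determine maxFastSampling")
    # ... rest of the function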

VS.

 Python 2.6.4
 numpy 1.5.0

 Thanks

 Edo


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] need a better way to fill a grid

2011-01-24 Thread Vincent Schut
On 01/24/2011 02:53 PM, John wrote:
 Hello,

 I'm trying to cycle over some vectors (lat, lon, emissions) of
 irregularly spaced lat/lon spots and values. I need to sum the values
 contributing to each cell of a regular lat/lon grid.

 This is what I have presently, but it is too slow. Is there a more
 efficient way to do this? I would prefer not to create an external
 module (f2py, cython) unless there is really no way to make this more
 efficient... it's the looping through the grid I guess that takes so
 long.

Use np.histogram2d with weights=emissions, and lat and lon as your x and 
y to histogram. Choose the bins to match your grid, and it will 
effectively sum the emission values for each grid cell.
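
A minimal, self-contained sketch of that suggestion (the grid and data below 
are made up):

import numpy as np

lon = np.random.uniform(-10, 10, 1000)      # irregular points
lat = np.random.uniform(40, 50, 1000)
emissions = np.random.rand(1000)

lon_edges = np.linspace(-10, 10, 21)        # 20 cells of 1 degree
lat_edges = np.linspace(40, 50, 11)         # 10 cells of 1 degree
Z, _, _ = np.histogram2d(lon, lat, bins=(lon_edges, lat_edges),
                         weights=emissions)
# Z[i, j] is the summed emissions for the cell with lon bin i and lat bin j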

Vincent.


 Thanks,
 john



  def grid_emissions(lon, lat, emissions, grid):
      """sample the emissions into a grid to fold into model output"""

      dx = grid.dxout
      dy = grid.dyout

      # Generate a regular grid to fill with the sum of emissions
      xi = np.linspace(grid.outlon0,
                       grid.outlon0 + (grid.nxmax * dx), grid.nxmax)
      yi = np.linspace(grid.outlat0,
                       grid.outlat0 + (grid.nymax * dy), grid.nymax)

      X, Y = np.meshgrid(yi, xi)
      Z = np.zeros(X.shape)

      for i, x in enumerate(xi):
          for j, y in enumerate(yi):
              Z[i, j] = np.sum(emissions[
                  np.where(((lat > y - dy) & (lat < y + dy)) &
                           ((lon > x - dx) & (lon < x + dx)))[0]])

      return Z

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] large float32 array issue

2010-11-04 Thread Vincent Schut
On 11/03/2010 03:04 PM, Bruce Southey wrote:
 On 11/03/2010 06:52 AM, Pauli Virtanen wrote:
 Wed, 03 Nov 2010 12:39:08 +0100, Vincent Schut wrote:
 [clip]
 Btw, should I file a bug on this?
 One can argue that mean() and sum() should use a numerically stabler
 algorithm, so yes, a bug can be filed if there is not yet one already.

 This is a 'user bug' not a numpy bug because it is a well known
 numerical problem. I recall that we have had this type of discussion
 before that has resulted in these functions being left as they are. The
 numerical problem is mentioned better in the np.mean docstring than the
 np.sum docstring.

 My understanding was that any new algorithm has to be better than the
 current algorithm especially in speed and accuracy across 'typical'
 numpy problems across the different Python and OS versions not just for
 numerically challenged cases. For example, I would not want to sacrifice
 speed if I achieve the same accuracy without losing as much speed as
 just changing the dtype to float128 (as I use x86_64 Linux).

 Also in Warren's mean example, this is simply a 32-bit error because it
 disappears when using 64-bit (numpy's default) - well, until we reach
 the extreme 64-bit values.

 >>> np.ones((11334,16002)).mean()
 1.0
 >>> np.ones((11334,16002),np.float32).mean()
 0.092504406598019437
 >>> np.ones((11334,16002),np.float32).mean().dtype
 dtype('float64')

 Note that there is probably a bug in np.mean because a 64-bit dtype is
 returned for integers and 32-bit or lower precision floats. So upcast is
 not apparently being done on the accumulator.


 Bruce

Thanks for the info, all. I agree that this is a 'user bug'; however, 
mentioning this corner case someplace a user would look when running into 
errors like these might be an idea, as I have the feeling it will otherwise 
keep turning up time and again on this list. Maybe start 
a webpage 'numpy and some often encountered floating point issues'?

For now, I've just ordered more memory. The cheapest and simplest 
solution, if you ask me :-)

Vincent.

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] large float32 array issue

2010-11-03 Thread Vincent Schut
Hi, I'm running into this strange issue when using some pretty large 
float32 arrays. In the following code I create a large array filled with 
ones, and calculate mean and sum, first with a float64 version, then 
with a float32 version. Note the difference between the two. NB the 
float64 version is obviously right :-)



In [2]: areaGrid = numpy.ones((11334, 16002))
In [3]: print(areaGrid.dtype)
float64
In [4]: print(areaGrid.shape, areaGrid.min(), areaGrid.max(), 
areaGrid.mean(), areaGrid.sum())
((11334, 16002), 1.0, 1.0, 1.0, 181366668.0)


In [5]: areaGrid = numpy.ones((11334, 16002), numpy.float32)
In [6]: print(areaGrid.dtype)
float32
In [7]: print(areaGrid.shape, areaGrid.min(), areaGrid.max(), 
areaGrid.mean(), areaGrid.sum())
((11334, 16002), 1.0, 1.0, 0.092504406598019437, 16777216.0)


Can anybody confirm this? And better: explain it? Am I running into an 
IEEE float 'feature' that till now was hidden from me? Or is it a bug somewhere?

Btw I'd like to use float32 arrays, as precision is not really an issue 
in this case, but memory usage is...


This is using python 2.7, numpy from git (yesterday's checkout), on arch 
linux 64bit.

Best,
Vincent.

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] large float32 array issue

2010-11-03 Thread Vincent Schut


On 11/03/2010 12:31 PM, Warren Weckesser wrote:


 On Wed, Nov 3, 2010 at 5:59 AM, Warren Weckesser
 warren.weckes...@enthought.com mailto:warren.weckes...@enthought.com
 wrote:



 On Wed, Nov 3, 2010 at 3:54 AM, Vincent Schut sc...@sarvision.nl
 mailto:sc...@sarvision.nl wrote:

 Hi, I'm running in this strange issue when using some pretty large
 float32 arrays. In the following code I create a large array
 filled with
 ones, and calculate mean and sum, first with a float64 version, then
 with a float32 version. Note the difference between the two. NB the
 float64 version is obviously right :-)



 In [2]: areaGrid = numpy.ones((11334, 16002))
 In [3]: print(areaGrid.dtype)
 float64
 In [4]: print(areaGrid.shape, areaGrid.min(), areaGrid.max(),
 areaGrid.mean(), areaGrid.sum())
  ((11334, 16002), 1.0, 1.0, 1.0, 181366668.0)


 In [5]: areaGrid = numpy.ones((11334, 16002), numpy.float32)
 In [6]: print(areaGrid.dtype)
 float32
 In [7]: print(areaGrid.shape, areaGrid.min(), areaGrid.max(),
 areaGrid.mean(), areaGrid.sum())
 ((11334, 16002), 1.0, 1.0, 0.092504406598019437, 16777216.0)


 Can anybody confirm this? And better: explain it? Am I running
 into a
 for me till now hidden ieee float 'feature'? Or is it a bug
 somewhere?

 Btw I'd like to use float32 arrays, as precision is not really
 an issue
 in this case, but memory usage is...


 This is using python 2.7, numpy from git (yesterday's checkout),
 on arch
 linux 64bit.



 The problem kicks in with an array of ones of size 2**24.  Note that
 np.float32(2**24) + np.float32(1.0) equals np.float32(2**24):


 In [41]: b = np.ones(2**24, np.float32)

 In [42]: b.size, b.sum()
 Out[42]: (16777216, 16777216.0)

 In [43]: b = np.ones(2**24+1, np.float32)

 In [44]: b.size, b.sum()
 Out[44]: (16777217, 16777216.0)

 In [45]: np.spacing(np.float32(2**24))
 Out[45]: 2.0

 In [46]: np.float32(2**24) + np.float32(1)
 Out[46]: 16777216.0




 By the way, you can override the dtype of the accumulator of the mean()
 function:

 In [61]: a = np.ones((11334,16002),np.float32)

 In [62]: a.mean()  # Not correct
 Out[62]: 0.092504406598019437

 In [63]: a.mean(dtype=np.float64)
 Out[63]: 1.0

Thanks for this. That at least gives me a temporary solution (I actually 
need sum() instead of mean(), but the trick works for sum too).
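
For reference, the same accumulator trick applied to sum() (sketch):

import numpy as np

a = np.ones((11334, 16002), np.float32)
s32 = a.sum()                   # float32 accumulator; saturates as shown above
s64 = a.sum(dtype=np.float64)   # float64 accumulator; exact for this input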

Btw, should I file a bug on this?

Vincent.



 Warren



 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] summarizing blocks of an array using a moving window

2010-07-22 Thread Vincent Schut
On 07/22/2010 06:47 AM, Robin Kraft wrote:
 Hello all,

 The short version: For a given NxN array, is there an efficient way to use a 
 moving window to collect a summary statistic on a chunk of the array, and 
 insert it into another array?

Hi Robin,

been wrestling with similar stuff myself, though I have no obvious 
answer... A lot depends on your data and the stats you want.

Some thoughts:

- though you want non-overlapping stats, you could take a look at 
scipy.ndimage. Lots of this is coded in C, and though it does 
overlapping moving window stats, its speed might outweigh the stuff you 
have to do otherwise to 'tile' your array. You would then just slice the 
result of the ndimage operation (e.g. [::2, ::2] or similar).
- another option would be some smart reshaping, which finally gives you 
a [y//2, x//2, 2, 2] array, which you could then reduce to calculate 
stats (mean, std, etc) on the last two axes.  I *think* you'd have to 
first reshape both x and y axes, and then reposition the axes. An example 
(a runnable sketch follows after this list):
a = gdal_array.BandReadAsArray(blabla)
y,x = a.shape # y and x need to be divisible by 2!
b = a.reshape(y/2, 2, x/2, 2).transpose(0,2,1,3).reshape(y/2, x/2, 4)
bMean = b.mean(axis=-1)
bMax = ...etc.
- a third option would be to create an index array, which has a unique 
value per 2x2 square, and then use histogram2d. This would, if you use 
its 'weights' functionality, at least enable you to get efficient counts 
and sums/means. Other stats might be hard, though.
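
A runnable sketch of the reshape trick above, applied to 2x2 blocks of the 
4x4 example quoted below (assumes both dimensions are even):

import numpy as np

img = np.array([[0, 0, 0, 1],
                [0, 0, 1, 1],
                [1, 1, 0, 0],
                [0, 0, 0, 0]])
y, x = img.shape
blocks = img.reshape(y//2, 2, x//2, 2).transpose(0, 2, 1, 3).reshape(y//2, x//2, 4)
print(blocks.sum(axis=-1))   # [[0 3]
                             #  [2 0]]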

Good luck!
Vincent Schut.

 The long version: I am trying to resample an image loaded with GDAL into an 
 NxN array. Note that this is for statistical purposes, so image quality 
 doesn't matter. For the curious, the image is derived from satellite imagery 
 and displays a map of hotspots of tropical deforestation at 500m resolution. 
 I need to assign a count of these deforestation 'hits' to each pixel in an 
 existing array of 1000m pixels.


 Let's say the image looks like this: np.random.randint(0,2, 16).reshape(4,4)

 array([[0, 0, 0, 1],
 [0, 0, 1, 1],
 [1, 1, 0, 0],
 [0, 0, 0, 0]])

 I want to use a square, non-overlapping moving window for resampling, so that 
 I get a count of all the 1's in each 2x2 window.

  0, 0,   0, 1
  0, 0,   1, 1        0  3
                  =>
  1, 1,   0, 0        2  0
  0, 0,   0, 0

 In another situation with similar data I'll need the average, or the maximum 
 value, etc..

  My brute-force method is to loop through the rows and columns to get little 
  chunks of data to process. But there must be a more elegant way to do this - it's 
  probably obvious too ... (inelegant way further down).

  Another way to do it would be to use np.tile to create an array called arr 
  filled with blocks of [[0,1],[2,3]]. I could then use something like this to 
 get the stats I want:

 d0 = img[np.where(arr==0)]
 d1 = img[np.where(arr==1)]
 d2 = img[np.where(arr==2)]
 d3 = img[np.where(arr==3)]

 img_out = d0 + d1 + d2 + d3

 This would be quite snappy if I could create arr efficiently. Unfortunately 
 np.tile does something akin to np.hstack to create this array, so it isn't 
 square and can't be reshaped appropriately 
 (np.tile(np.arange(2**2).reshape(2, 2), 4)):

 array([[0, 1, 0, 1, 0, 1, 0, 1],
 [2, 3, 2, 3, 2, 3, 2, 3]])

 Inefficient sample code below. Advice greatly appreciated!

 -Robin


 import numpy as np
 from math import sqrt
 from time import time

 def rs(img_size=16, window_size=2):
  w = window_size

  # making sure the numbers work

   if img_size % sqrt(img_size) > 0:
   print "Square root of image size must be an integer."
   print "Sqrt =", sqrt(img_size)

   return

   elif (img_size / sqrt(img_size)) % w > 0:
   print "Square root of image size must be evenly divisible by", w
   print "Sqrt =", sqrt(img_size)
   print sqrt(img_size), "/", w, "=", sqrt(img_size)/w

  return

  else:

  rows = int(sqrt(img_size))
  cols = int(sqrt(img_size))

  # create fake data: ones and zeros
  img = np.random.randint(0, 2, img_size).reshape(rows, cols)

  # create empty array for resampled data
  img_out = np.zeros((rows/w, cols/w), dtype=np.int)

  t = time()

   # retrieve blocks of pixels in img
  # insert summary into img_out

  for r in xrange(0, rows, w): # skip rows based on window size
  for c in xrange(0, cols, w): # skip columns based on window size

  # calculate the sum of the values in moving window,
  #insert value into corresponding pixel in img_out

  img_out[r/w, c/w] = np.int(np.sum(img[r:r+w, c:c+w]))

  t1= time()
  elapsed = t1-t
   print "img shape:", img.shape
   print img, "\n"
   print "img_out shape:", img_out.shape
   print img_out

   print "%.7f seconds" % elapsed

 rs(img_size=16, window_size=2)
 rs(81, 3)
 rs

Re: [Numpy-discussion] Crosstabulation

2010-07-20 Thread Vincent Schut


On 07/19/2010 10:14 PM, Friedrich Romstedt wrote:
 2010/7/19 sandric ionutsandricio...@yahoo.com:
 For land-use a class would be for example forest, other would be orchard
 etc. For Slope gradient I would have values which are < 3 and between 3 and 7
 etc. So, I will have 2 raster data with, let's say, 3 classes each: forest,
 orchards and built-up area and for slope gradient: 0-3, 3-15, 15-35. The
 cross-tabulation analysis should give me a table like:

          forest  orchards  built-up
  0-3         10        20        15
  3-15         5        10        20
  15-35        5        15        15

 where the numbers represents all the common cells, for example: 10 cells
 with forest correspond to 10 cells with 0-3 slope gradient interval and so
 on
 (by cells I mean the pixel from a raster data)

  Okay everything is clear now.  I would suggest looping over the
  cells of the table.  E.g. when:

 landuse_catmap
 slope_map

 are the respective maps, you can categorise the slope_map with

  slope_catmap = 0 * (0 <= slope_map) * (slope_map < 3) + 1 * (3 <=
  slope_map) * (slope_map < 15) + 2 * (15 <= slope_map)

 to get categories 0, 1, and 2, where the zero case is included here
 only for illustration, since it's just zero it can be omitted.

 Then you are maybe supposed to do:

 table = numpy.zeros((N_slope_cats, N_landuse_cats))

 and finally the looping over its elements, where the looping over the
 map cells is done entirely by numpy:

 for slope_cat in xrange(0, N_slope_cats):
   for landuse_cat in xrange(0, N_landuse_cats):
   table[slope_cat, landuse_cat] = \
   ((slope_catmap == slope_cat) * \
   (landuse_catmap == landuse_cat)).sum()

  I believe there is some slightly faster but more obscure way of doing
  this table creation in one single large step by putting the maps into
  a hyperdimensional array, but I also believe this is beyond our scope
  here if this is already fast enough.

That 'slightly faster but more obscure way' would be histogram2d :-) 
Especially when you have many landuse and/or slope classes over which to 
loop, histogram2d will be more than just slightly faster. About obscure, 
well, I leave that to the user. Probably depends on your background. 
Personally I use it all the time. Loops are evil :-)

your 'more obscure' code would become something like this:

slope_bin_edges = [0, 3, 15, 35]
landuse_bin_edges = [0, 1, 2, 3]
crosstab, _, _ = numpy.histogram2d(landuse.ravel(), slope.ravel(),
bins=(landuse_bin_edges, slope_bin_edges))


VS.

 Friedrich

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Crosstabulation

2010-07-19 Thread Vincent Schut


On 07/19/2010 09:55 AM, sandric ionut wrote:

 Hi Friedrich:

 For land-use a class would be for example forest, other would be orchard
 etc. For Slope gradient I would have values which are < 3 and between 3 and 7
 etc. So, I will have 2 raster data with, let's say, 3 classes each:
 forest, orchards and built-up area and for slope gradient: 0-3, 3-15,
 15-35. The cross-tabulation analysis should give me a table like:

         forest  orchards  built-up
 0-3         10        20        15
 3-15         5        10        20
 15-35        5        15        15

 where the numbers represents all the common cells, for example: 10 cells
 with forest correspond to 10 cells with 0-3 slope gradient interval and
 so on
 (by cells I mean the pixel from a raster data)

 The analysis is better illustrated here:
 http://webhelp.esri.com/arcgisdesktop/9.2/index.cfm?TopicName=tabulate_area

 Ionut

Ha, we're messing up lingo's here :-) Need to switch to GIS (geographic 
information systems) dialect.
- DEM = digital elevation map, usually a 2d array ('raster') with 
elevation values (for a certain area on earth)
- slope gradient = the slope (literally, not as in math speak) of the 
surface depicted by the elevation map. Mostly defined as the maximum 
slope within a certain moving window; several competing methods to 
estimate/calculate slope exist.
- land use/cover class: raster (array) where each cell ('pixel') has an 
integer value, which maps to some well defined land use at that location 
(e.g. 0 means sea, 1 means forest, 2 means agriculture, etc)
- crosstabulation usually means some kind of 2d histogram, where you count the 
total number of raster cells with a certain value (e.g. depicting 'land 
use class') 'within' a range of values of another raster with the same 
shape (and matching locations). Like: how many cells of forest lie 
within a slope range of 0-10 degrees?

Right. On to the answers. I think you should look into 
numpy.histogram2d, where you can do exactly what you want. Your land use 
array is x, your slope gradient array = y, then you define the bins as 
your class numbers (for x) and your slope gradient ranges (for y), and 
you will get a pixel count for each bin combination.

see: 
http://docs.scipy.org/doc/numpy/reference/generated/numpy.histogram2d.html
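
A minimal, self-contained sketch (synthetic rasters; the class and slope 
values below are made up):

import numpy as np

landuse = np.random.randint(0, 3, (100, 100))   # classes 0, 1, 2
slope = np.random.uniform(0, 35, (100, 100))    # degrees

table, _, _ = np.histogram2d(landuse.ravel(), slope.ravel(),
                             bins=([0, 1, 2, 3], [0, 3, 15, 35]))
# table[i, j] = number of cells with land-use class i and slope in bin j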

Regards,
Vincent Schut.


 
 *From:* Friedrich Romstedt friedrichromst...@gmail.com
 *To:* Discussion of Numerical Python numpy-discussion@scipy.org
 *Sent:* Sun, July 18, 2010 12:09:04 AM
 *Subject:* Re: [Numpy-discussion] Crosstabulation

  2010/7/17 Robert Kern robert.k...@gmail.com:
    On Sat, Jul 17, 2010 at 13:11, Friedrich Romstedt
    friedrichromst...@gmail.com wrote:
    2010/7/14 Ionut Sandric sandricio...@yahoo.com:
   I'm afraid also Zach does not understand what you are talking about
   ... So my first question (please bear with me) would be: What's a dem?
  
   Digital Elevation Map.
  
   (n/a in my dictionary) And sorry for the cross-talk on the other
   first post by you ...
  
   And by slope gradient you mean second derivative?
  
   No, the first derivative. Slope gradient is a reasonably common,
   albeit somewhat redundant, idiom meaning the gradient of an elevation
   map.

 Thanks Robert, that clarifies a lot.

 But still I don't understand how the crosstabulation shall work. What
 are the classes?

 Friedrich
 ___
 NumPy-Discussion mailing list
  NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion



 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy for Python 3?

2010-07-19 Thread Vincent Schut


On 07/19/2010 02:56 PM, Richard D. Moores wrote:
 On Mon, Jul 19, 2010 at 05:28, Vincent Schutsc...@sarvision.nl  wrote:

 Well, you might want to read up on some beginners guide for python? It's
 up to you, of course, but usually before starting with numpy (which
 extends python), it is advised to have at least some basic python
 understanding... Googling will give you plenty of good resources, if
 you'd like to.

 I'm not a python beginner. Why did you assume I was?

I apologize, then. This, however, made me think you were:
'Now what? Try simple commands? Like Lemme outta here!?'
Now english is not my native language, so some subtle humour might have 
escaped me and I may have understood that entirely wrong... :-)


 Then, for the sake of helping you further anyway:
 you'll have to mind the significance of paths (=directories or folders
 in windows speak I think). The folder you're currently in, will restrict
 what you find when typing commands. If you need to reference something
 from a different folder, you'll need to explicitly specify that.

 Yes, I have that understanding.
Good.
The fact that you were starting python from the Python31 folder, and 
then typed 'now what?' gave me the idea you did not...

 To build numpy, you'll need to be in the numpy source folder (the numpy
 you extracted from svn). But if you're there, simply typing 'python' or
 'python.exe' will probably not work because 'python.exe' is in a
 different folder (c:\Python31). You could go into that folder, but then
 you would not be able to find numpy's setup.py script. Best way to solve
 that: make sure you're in the numpy folder, and type something like:
 'c:\Python31\python.exe setup.py build'. That should get you started at
 least.

 However, if I'm allowed to give you some unasked-for advice: this might
 become lots easier if you make sure you're at least a bit comfortable
 with 1) the windows command prompt, 2) python, and 3) building python
 stuff from svn source checkouts. No offence meant. But you sound as if you
 feel a lot more comfortable with pre-built packages compared to building
 yourself from source on windows...

 No, I fail your number 3.

Well, then you've come along quite far already, and are on the right 
list :-)
Then, please post the output of your 'python setup.py build' command, 
which will give us some clues about *why* you fail... (or, if the output 
is long, try to find the relevant lines indicating where and what goes 
wrong)


 Good luck anyway!
 Vincent Schut.

 Thanks, Vincent. And I am more comfortable with pre-built packages.
Most of us are ;-) But sometimes you just need to bite the bullet...

 Dick

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy for Python 3?

2010-07-19 Thread Vincent Schut


On 07/19/2010 03:34 PM, Richard D. Moores wrote:
 On Mon, Jul 19, 2010 at 06:15, Vincent Schutsc...@sarvision.nl  wrote:


 On 07/19/2010 02:56 PM, Richard D. Moores wrote:
 On Mon, Jul 19, 2010 at 05:28, Vincent Schutsc...@sarvision.nlwrote:

 Well, you might want to read up on some beginners guide for python? It's
 up to you, of course, but usually before starting with numpy (which
 extends python), it is advised to have at least some basic python
 understanding... Googling will give you plenty of good resources, if
 you'd like to.

 I'm not a python beginner. Why did you assume I was?

 I appologize, then. This, however, made me think you were:
 'Now what? Try simple commands? Like Lemme outta here!?'
 Now english is not my native language, so some subtle humour might have
 escaped me and I may have understood that entirely wrong... :-)

 Oh, that's OK. But it's otherwise hard to believe English is not your
 native language.
Ha, thanks :-) Dutch it is; however, realizing that we're just a tiny 
speck on the world's stage, we get taught English from our 10th...


 Then, for the sake of helping you further anyway:
 you'll have to mind the significance of paths (=directories or folders
 in windows speak I think). The folder you're currently in, will restrict
 what you find when typing commands. If you need to reference something
 from a different folder, you'll need to explicitly specify that.

 Yes, I have that understanding.
 Good.
 The fact that you were starting python from the Python31 folder, and
 then typed 'now what?' gave me the idea you did not...

 To build numpy, you'll need to be in the numpy source folder (the numpy
 you extracted from svn). But if you're there, simply typing 'python' or
 'python.exe' will probably not work because 'python.exe' is in a
 different folder (c:\Python31). You could go into that folder, but then
 you would not be able to find numpy's setup.py script. Best way to solve
 that: make sure you're in the numpy folder, and type something like:
 'c:\Python31\python.exe setup.py build'. That should get you started at
 least.

 However, if I'm allowed to give you some unaskedfor advice: this might
 become lots easier if you make sure you're at least a bit comfortable
 with 1) the windows command prompt, 2) python, and 3) building python
 stuff from svn source checkouts. No offence meant. But you sound as you
 feel a lot more comfortable with pre-built packages compared to building
 yourself from source on windows...

 No, I fail your number 3.

 Well, than you've come along quite far already, and are on the right
 list :-)
 Then, please post the output of your 'python setup.py build' command,
 which will give us some clues about *why* you fail... (or, if the output
 is long, try to find the relevant lines indicating where and what goes
 wrong)

 I posted the output as an attached text file in my reply to Dave. Not
 sure that got to the list as I'm not familiar with the list's rules
 about attachments.

Ah, my fault then, I didn't see that. Seen & read it now, though. I'm not 
a windows user, and am afraid that I can't help you any further 
with this, as your problems seem to be pretty much windows/msvc 
related... Others on this list will know more, probably.


 Thanks, Vincent. And I am more comfortable with pre-built packages.
 Most of us are ;-) But sometimes you just need to bite the bullet...

 I'm biting, I'm biting (that's another kind of U.S. joking). Actually,
 several years ago I was using Ulipad, an IDE for Python. It was under
 active development and frequently updated via svn. So I had and used
 TortoiseSVN then, but on an old computer.  So I'm starting over
 getting the details of how to use it back.
Ha, I've used ulipad too. Mostly because it's lean 'n' mean and doesn't 
force you to create an entire 'project' to just create a new python 
script (like eclipse/pydev et al. do). Have newer hardware now, so the 
lean 'n' mean argument is of less significance, but I still use it 
sometimes for some quick hacking...

 Dick
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] 2D binning

2010-06-02 Thread Vincent Schut
On 06/02/2010 04:52 AM, josef.p...@gmail.com wrote:
 On Tue, Jun 1, 2010 at 9:57 PM, Zachary Pincuszachary.pin...@yale.edu  
 wrote:
 I guess it's as fast as I'm going to get. I don't really see any
 other way. BTW, the lat/lons are integers)

 You could (in c or cython) try a brain-dead hashtable with no
 collision detection:

 for lat, long, data in dataset:
bin = (lat ^ long) % num_bins
hashtable[bin] = update_incremental_mean(hashtable[bin], data)

 you'll of course want to do some experiments to see if your data are
 sufficiently sparse and/or you can afford a large enough hashtable
 array that you won't get spurious hash collisions. Adding error-
 checking to ensure that there are no collisions would be pretty
 trivial (just keep a table of the lat/long for each hash value, which
 you'll need anyway, and check that different lat/long pairs don't get
 assigned the same bin).

 Zach



 -Mathew

 On Tue, Jun 1, 2010 at 1:49 PM, Zachary Pincuszachary.pin...@yale.edu
 wrote:
 Hi
 Can anyone think of a clever (non-lopping) solution to the
 following?

 A have a list of latitudes, a list of longitudes, and list of data
 values. All lists are the same length.

 I want to compute an average  of data values for each lat/lon pair.
 e.g. if lat[1001],lon[1001] == lat[2001],lon[2001] then
 data[1001] = (data[1001] + data[2001])/2

 Looping is going to take way too long.

 As a start, are the equal lat/lon pairs exactly equal (i.e. either
 not floating-point, or floats that will always compare equal, that is,
 the floating-point bit-patterns will be guaranteed to be identical) or
 approximately equal to float tolerance?

 If you're in the approx-equal case, then look at the KD-tree in scipy
 for doing near-neighbors queries.

 If you're in the exact-equal case, you could consider hashing the lat/
 lon pairs or something. At least then the looping is O(N) and not
 O(N^2):

 import collections
 grouped = collections.defaultdict(list)
 for lt, ln, da in zip(lat, lon, data):
grouped[(lt, ln)].append(da)

 averaged = dict((ltln, numpy.mean(da)) for ltln, da in
 grouped.items())

 Is that fast enough?

 If the lat lon can be converted to a 1d label as Wes suggested, then
 in a similar timing exercise ndimage was the fastest.
 http://mail.scipy.org/pipermail/scipy-user/2009-February/019850.html

And as you said your lats and lons are integers, you could simply do

ll = lat*1000 + lon

to get unique 'hashes' or '1d labels' for you latlon pairs, as a lat or 
lon will never exceed 360 (degrees).

After that, either use the ndimage approach, or you could use 
histogramming with weighting by data values and divide by a histogram 
without weighting, or just loop.
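
A sketch of the label + bincount variant (the data below is made up; assumes 
integer lat/lon):

import numpy as np

lat = np.random.randint(0, 360, 10000)
lon = np.random.randint(0, 360, 10000)
data = np.random.rand(10000)

labels = lat * 1000 + lon               # unique label per lat/lon pair
sums = np.bincount(labels, weights=data)
counts = np.bincount(labels)
with np.errstate(invalid='ignore', divide='ignore'):
    means = sums / counts               # NaN where a pair never occurs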

Vincent


 (this was for python 2.4, also later I found np.bincount which
 requires that the labels are consecutive integers, but is as fast as
 ndimage)

 I don't know how it would compare to the new suggestions.

 Josef




 Zach
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Extracting values from one array corresponding to argmax elements in another array

2010-04-07 Thread Vincent Schut
On 04/06/2010 03:22 PM, Ken Basye wrote:
 From: Vincent Schut sc...@sarvision.nl
 On 04/05/2010 06:06 PM, Keith Goodman wrote:

 On Mon, Apr 5, 2010 at 8:44 AM, Ken Basyekbas...@jhu.edu   wrote:

snip
 b[a.argmax(axis=0), range(3)]

  array([0, 4, 5])


 Which does not work anymore when your arrays become more-dimensional
 (like in my case: 4 or more) and the axis you want to select on is not
 the first/last one. If I recall correctly, I needed to construct the
 full index arrays for the other dimensions too (like with ogrid I
 think). So: create the ogrid, replace the one for the dimensions you
 want the argmax selection to take place on with the argmax parameter,
 and use those index arrays to index your b array.
 I'd need to look up my source code to be more sure/precise. If anyone
 would like me to, please let me know. If anyone knows a less elaborate
 way, also please let us know! :-)

 Hi Vincent,
 I'd like to see more about your solution. For my present purposes,
 Keith's solution was sufficient, but I'm still very interested in a
 solution that's independent of dimension and axis.
 Thanks (and thanks, Keith),
 Ken


I've tracked it down. I have created the following quick'n dirty function:

def selectFromDimension(data, selection, dimension):
    ogridStatement = "numpy.ogrid["
    slices = []
    for d in data.shape:
        slices.append("0:" + str(d))
    ogridStatement += ",".join(slices)
    ogridStatement += "]"
    grid = eval(ogridStatement)
    grid[dimension] = selection
    result = data[grid]
    return result

which you call with the array to select from, the result of 
argmin/max(axis=...), and the axis to select from (usually the same as 
the axis argument to the argmin/max)

It uses eval to be able to create the ogrid independent of 
dimensionality. There's probably a much cleaner way, but this worked for 
me :-)

Now I think about it, probably another way could be to transpose your 
axes such that the axis to select on becomes the first axis. Then just 
index with the result of argmin/max, and transpose back if necessary. 
However, I have not tried this and I might miss something obvious here...
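
For what it's worth, an eval-free variant of the same ogrid idea could look 
like this (my own sketch, not the code referred to above):

import numpy as np

def select_from_dimension(data, selection, axis):
    # open index grids spanning selection's shape, one per remaining axis
    idx = list(np.ogrid[tuple(slice(0, s) for s in selection.shape)])
    idx.insert(axis, selection)   # put the argmax/argmin indices at 'axis'
    return data[tuple(idx)]

a = np.random.rand(3, 4, 5)
b = np.random.rand(3, 4, 5)
b_best = select_from_dimension(b, a.argmax(axis=1), axis=1)   # shape (3, 5)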

Vincent.

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Extracting values from one array corresponding to argmax elements in another array

2010-04-06 Thread Vincent Schut
On 04/05/2010 06:06 PM, Keith Goodman wrote:
 On Mon, Apr 5, 2010 at 8:44 AM, Ken Basyekbas...@jhu.edu  wrote:
 Hi Folks,
   I have two arrays, A and B, with the same shape.  I want to find the
 highest values in A along some axis, then extract the corresponding
 values from B.  I can get the highest values in A with A.max(axis=0) and
 the indices of these highest values with A.argmax(axis=0).  I'm trying
 to figure out a loop-free way to extract the corresponding elements from
 B using these indices.  Here's code with a loop that will do what I want
 for two-dimensional arrays:

 >>> a
 array([[ 100.,    0.,    0.],
        [   0.,  100.,  100.],
        [   0.,    0.,    0.]])

 >>> a.max(axis=0)
 array([ 100.,  100.,  100.])

 >>> sel = a.argmax(axis=0)
 >>> sel
 array([0, 1, 1])

 >>> b = np.arange(9).reshape((3,3))
 >>> b
 array([[0, 1, 2],
        [3, 4, 5],
        [6, 7, 8]])

 >>> b_best = np.empty(3)
 >>> for i in xrange(3):
 ...     b_best[i] = b[sel[i], i]
 ...
 >>> b_best
 array([ 0.,  4.,  5.])

 Here's one way:

 >>> b[a.argmax(axis=0), range(3)]
 array([0, 4, 5])

Which does not work anymore when your arrays become more-dimensional 
(like in my case: 4 or more) and the axis you want to select on is not 
the first/last one. If I recall correctly, I needed to construct the 
full index arrays for the other dimensions too (like with ogrid I 
think). So: create the ogrid, replace the one for the dimensions you 
want the argmax selection to take place on with the argmax parameter, 
and use those index arrays to index your b array.
I'd need to look up my source code to be more sure/precise. If anyone 
would like me to, please let me know. If anyone knows a less elaborate 
way, also please let us know! :-)

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Is this a bug in numpy.ma.reduce?

2010-03-05 Thread Vincent Schut
On 03/05/2010 11:51 AM, Pierre GM wrote:
 On Mar 5, 2010, at 4:38 AM, David Goldsmith wrote:
 Hi!  Sorry for the cross-post, but my own investigation has led me to 
 suspect that mine is actually a numpy problem, not a matplotlib problem.  
 I'm getting the following traceback from a call to matplotlib.imshow:
 ...
 Based on examination of the code, the last self is an instance of 
 ma._extrema_operation (or one of its subclasses) - is there a reason why 
 this class is unable to deal with a zero-size array to ufunc.reduce without 
 identity, (i.e., was it thought that it would - or should - never get one) 
 or was this merely an oversight?  Either way, there's other instances on the 
 lists of this error cropping up, so this circumstance should probably be 
 handled more robustly.  In the meantime, workaround?


 'm'fraid no. I gonna have to investigate that. Please open a ticket with a 
 self-contained example that reproduces the issue.
 Thx in advance...
 P.

This might be completely wrong, but I seem to remember a similar issue, 
which I then traced down to having a masked array with a mask that was 
set to True or False, instead of being a full fledged bool mask array. I 
was in a hurry then and completely forgot about it later, so filed no 
bug report whatsoever, for which I apologize.

VS.

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Producing a Histogram When Bins Are Known

2009-11-27 Thread Vincent Schut
Wayne Watson wrote:
 I have a list that already has the frequencies from 0 to 255. However, 
 I'd like to make a histogram  that has say 32 bins whose ranges are 0-7, 
 8-15, ... 248-255. Is it possible?
 
Wayne,

you might find the 'numpy example list with doc' webpage quite 
informative... http://www.scipy.org/Numpy_Example_List_With_Doc (give it 
some time to load, it's pretty large...)
For new users (I was one once...) it usually takes some time to find the 
usual suspects in numpy/scipy help and docs... This one page has really 
become invaluable for me.

It gives you the docstrings for numpy functions, often including some 
example code.

If you check out the histogram() function, you'll see it takes a 'bins=' 
argument:

bins : int or sequence of scalars, optional
 If `bins` is an int, it defines the number of equal-width
 bins in the given range (10, by default). If `bins` is a sequence,
 it defines the bin edges, including the rightmost edge, allowing
 for non-uniform bin widths.

So, if your bins are known, you can pass them to numpy.histogram, either 
as a number of bins (if equal width), if necessary combined with the 
'range=' parameter to specify the range to divide into equal bins, or as 
bin edges (e.g. in your case: (0, 8, 16, ... 256), or 
numpy.linspace(0,256,33), which will give you this range nicely).

If you don't specify the 'range=' parameter, it will check the min and 
max from your input data and use that as lower and upper bounds.
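
A short sketch of those arguments with synthetic data:

import numpy as np

data = np.random.randint(0, 256, 10000)
counts, edges = np.histogram(data, bins=np.linspace(0, 256, 33))
# 32 equal-width bins: 0-7, 8-15, ..., 248-255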

Good luck learning numpy! :)

Vincent.

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Help making better use of numpy array functions

2009-11-26 Thread Vincent Schut
Hi mdekauwe,

as long as your data size is small enough to fit an 8x array in memory 
and use it, why not just skip the running total and average 8 data 
arrays each 8-day period? And skip the x and y loops; these are real 
performance killers. As a bonus, your code gets a lot smaller and more 
readable... :)
Please correct me if I'm wrong: the ultimate goal is to have an average 
of the valid pixels over 8 days of data, for each 8 days, correct?

I'd do it this way (more or less pythonic pseudocode, untested, but your 
should get the idea):

# read filenames
filenames = glob.glob.
filenames.sort() # to be sure they are in proper order
filenamesChunked = [filenames[n:n+8] for n in range(0, len(filenames), 8)]
# this will produce a list of lists, with each sublist containing the
# next 8 filenames

nodatavalue =  -999

for fchunk in filenamesChunked:
data8days = numpy.ones((8, ncols, nrows)) * -999 # fill with nodata 
values, in case there are less than 8 days
for di in range(len(fchunk)):
f = fchunk[di]
data8days[di] = read f
weights = data8days > nodatavalue
# this way all nodata values get a weight of 0
avg8days = numpy.average(data8days, axis=0, weights=weights)
save avg8days

uhm... well, that's it, really :)

best regards,
Vincent.

mdekauwe wrote:
 Hi I have written some code and I would appreciate any suggestions to make
 better use of the numpy array functions to make it a bit more efficient and
 less of a port from C. Any tricks or thoughts would be much appreciated.
 
 The code reads in a series of images, collects a running total if the value
 is valid and averages everything every 8 images.
 
 Code below...
 
 Thanks in advance...
 
 #!/usr/bin/env python
 
 
 """Average the daily LST values to make an 8 day product...

 Martin De Kauwe
 18th November 2009
 """
 import sys, os, glob
 import numpy as np
 import matplotlib.pyplot as plt
 
 
 def averageEightDays(files, lst, count, numrows, numcols):
   
 day_count = 0
 # starting day - tag for output file
 doy = 122
 
 # loop over all the images and average all the information
 # every 8 days...
 for fname in glob.glob(os.path.join(path, files)):
 current_file = fname.split('/')[8]
 year = current_file.split('_')[2][0:4]
   
 try:
 f = open(fname, 'rb')
 except IOError:
  print "Cannot open outfile for read", fname
 sys.exit(1)
   
 data = np.fromfile(f, dtype=np.float32).reshape(numrows, numcols)
 f.close()
   
 # if it is day 1 then we need to fill up the holding array...
 # copy the first file into the array...
 if day_count == 0:
 lst = data
 for i in xrange(numrows):
  for j in xrange(numcols):
   # increase the pixel count if we got valid data
   # (undefined = -999.0)
   if lst[i,j] > -900.:
   count[i,j] = 1
  day_count += 1
   
 # keep a running total of valid lst values in an 8 day sequence
  elif day_count > 0 and day_count <= 7:
 for i in xrange(numrows):
 for j in xrange(numcols):
 # lst valid pixel?
  if data[i,j] > -900.:
  # was the existing pixel valid?
  if lst[i,j] < -900.:
 lst[i,j] = data[i,j]
 count[i,j] = 1
 else:
 lst[i,j] += data[i,j]
 count[i,j] += 1   
 day_count += 1
   
 # need to average previous 8 days and write to a file...
 if day_count == 8:
 for i in xrange(numrows):
 for j in xrange(numcols):
  if count[i,j] > 0:
 lst[i,j] = lst[i,j] / count[i,j]
 else:
 lst[i,j] = -999.0
   
 day_count = 0
 doy += 8
   
 lst, count = write_outputs(lst, count, year, doy)
   
 # it is possible we didn't have enough slices to do the last 8day avg...
 # but if we have more than one day we shall use it
 # need to average previous 8 days and write to a file...
  if day_count > 1:
 for i in xrange(numrows):
 for j in xrange(numcols):   
  if count[i,j] > 0:
 lst[i,j] = lst[i,j] / count[i,j]
 else:
 lst[i,j] = -999.0
   
 day_count = 0
 doy += 8
   
 lst, count = write_outputs(lst, count, year, doy)
   
 def write_outputs(lst, count, year, doy):
 

Re: [Numpy-discussion] Help making better use of numpy array functions

2009-11-26 Thread Vincent Schut
mdekauwe wrote:
 Thanks...I have adopted that and as you said it is a lot neater! Though I
 need to keep the pixel count for a weighting in another piece of code so
 have amended your logic slightly.

Alternatively, you could simply take the sum over axis=0 of the weight 
array to get the pixel count (e.g. pixelcount=weight.sum(axis=0)). I 
don't know if performance is an issue, but I'd say that skipping these 
numpy.where's and increments of count and running total, and use 
numpy.average instead, should give you a speedup. Though the size of 
your arrays might be too small for it to become very noticeable.
However, in your current version, it makes no sense to allocate an 
(8, nx, ny) array to read the data, as you anyway calculate a running 
total. The (8, nx, ny) array only makes sense to postpone calculations till 
you've read all 8 days, and only then calculate the average (and pixel 
count). Currently you preserve the previous days of data, but you don't 
use them anywhere anymore.
Either way will work, though, I guess. Choose what feels most naturally 
for you, as that will help you maintain and understand your code later on.

Oh, and minor issue: creating an array of zeros and then multiplying with 
-999 still makes an array of zeros... I'd incorporated an array of 
*ones* multiplied with -999, because for the last chunk of days you 
could end up with an 8-day array only partly filled with real data. E.g. 
if you'd have only 3 days of data left in your last chunk, 8dayData[0:3] 
would be data, and to prevent the rest ([3:8]) from being incorporated into 
the average calculation, these need to be -999. Currently these pixels 
will be 0, which is > nodatavalue, and thus influencing your average (the 
pixelcount will be incremented for these 0 values).


Vincent.
 
 #!/usr/bin/env python
 
 
 import sys, os, glob
 import numpy as np
 
 def averageEightDays(filenamesList, numrows, numcols, year):
   doy = 122
   nodatavalue =  -999.0
   
   for fchunk in filenamesList:
   # fill with nodata values, in case there are less than 8 days
   data8days = np.zeros((8, numrows, numcols), dtype=np.float32) * 
 -999.0 
   pixCount = np.zeros((numrows, numcols), dtype=np.int)
   avg8days = np.zeros((numrows, numcols), dtype=np.float32)
   for day in xrange(len(fchunk)):
   f = fchunk[day]
   data8days[day] = np.fromfile(f, 
 dtype=np.float32).reshape(numrows,
 numcols)
   avg8days = np.where(data8days[day] > nodatavalue,
 avg8days+data8days[day], avg8days)
   pixCount = np.where(data8days[day] > nodatavalue, 
 pixCount+1, pixCount)
   
   avg8days = np.where(pixCount > 0, avg8days / 
 np.float32(pixCount), -999.0)
   doy += 8
   print year,':',doy
   
   outfile = "lst_8day1030am_" + str(year) + str(doy) + ".gra"
   write_outputs(outfile, avg8days)
   outfile = "pixelcount_8day1030am_" + str(year) + str(doy) + ".gra"
   write_outputs(outfile, pixCount)
 
 def write_outputs(outfile, data):
   opath = "/users/eow/mgdk/research/HOFF_plots/LST/tmp"
   
   try:
   of = open(os.path.join(opath, outfile), 'wb')
   except IOError:
   print "Cannot open outfile for write", outfile
   sys.exit(1)
   
   # empty stuff
   data.tofile(of)
   of.close()
   
   
  if __name__ == "__main__":
 
   numrows = 332
   numcols = 667
   path = "/users/eow/mgdk/research/HOFF_plots/LST/gridded_03/"
   
   files = 'lst_scr_2006*.gra'
   filenames = glob.glob(os.path.join(path, files))
   filenames.sort()
   filenamesList = [filenames[n:n+8] for n in xrange(0, len(filenames), 8)]
   averageEightDays(filenamesList, numrows, numcols, year=2006)
   
   files = 'lst_scr_2007*.gra'
   filenames = glob.glob(os.path.join(path, files))
   filenames.sort()
   filenamesList = [filenames[n:n+8] for n in xrange(0, len(filenames), 8)]
   averageEightDays(filenamesList, numrows, numcols, year=2007)
 

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Help making better use of numpy array functions

2009-11-26 Thread Vincent Schut
mdekauwe wrote:
 
 Vincent Schut-2 wrote:
 Oh, and minor issue: creating a array of zeros and then multiplying with 
 -999 still makes an array of zeros... I'd incorporated an array of 
 *ones* multiplied with -999, because for the last chunk of days you 
 could end up with a 8day array only partly filled with real data. E.g. 
 if you'd have only 3 days of data left in your last chunk, 8dayData[0:3] 
 would be data, and to prevent the rest ([3:8]) to be incorporated into 
 the average calculation, these need to be -999. Currently these pixels 
 will be 0, which is  nodatavalue, and thus infuencing your average (the 
 pixelcount will be incremented for these 0 values).

 
 Ok I hadn't thought about it in that way but you are of course right! I have
 amended it.
 
 
 Vincent Schut-2 wrote:
 Alternatively, you could simply take the sum over axis=0 of the weight 
 array to get the pixel count (e.g. pixelcount=weight.sum(axis=0)).

 
 Ok I see your point here as well. So I tried implementing your suggestion,
 as I understand it
 
  weights = data8days > nodatavalue
 
  will make an (8, nrows, ncols) array containing true and false.
 
 as you said I can get the pixel count I was after by using
 weights.sum(axis=0).
 
 However when I try to do the averaging step:
 
 avg8days = np.average(data8days, axis=0, weights=weights)
 
  I get the error msg:  in average, raise ZeroDivisionError, "Weights sum to
  zero, can't be normalized"
  ZeroDivisionError: Weights sum to zero, can't be normalized
 
  Which I guess (but don't know) comes from trying to weight by a pixel
  count of zero. So not sure what has happened here?
 
 Thanks 
 

Ah... apparently numpy.average can't handle the situation when all 
weights are 0... didn't know that.
Hmm... what would you want to have in your average array when, for a 
certain pixel, there are only nodata values? If you'd like to have -999 
in your average result then, the solution is simple: in the weight 
array, for those pixels where the weight is 0 on every day, set one day's 
weight to 1. This way you'll get a nice -999 in your average result for those pixels.
e.g.:

weights[0] |= (weights.sum(axis=0)==0)

will set (boolean OR-assign) the first day's weight to True (==1) for all 
pixels whose weight is 0 on every day.

Hope this helps.
Vincent.

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] histogram: sum up values in each bin

2009-08-27 Thread Vincent Schut
Tim Michelsen wrote:
 Hello,
 I need some advice on histograms.
 If I interpret the documentation [1, 2] for numpy.histogram correctly, the
 result of the function is a count of the occurences sorted into each bin.
 
 (n, bins) = numpy.histogram(v, bins=50, normed=1)
 
 But how can I apply another function on these values stacked in each bin?
 Like summing them up or building averages?

Hi Tim,

If you just want to sum and/or average (= sum / count), you can shortcut 
this using the weights parameter of numpy.histogram, e.g. something like:

data = numpy.random.random((100,))
countsPerBin, binEdges = numpy.histogram(data)
sumsPerBin, _ = numpy.histogram(data, bins=binEdges, weights=data)
averagePerBin = sumsPerBin / countsPerBin

Regards,
Vincent.

 
 Thanks,
 Timmie
 
 [1] http://docs.scipy.org/doc/numpy/reference/generated/numpy.histogram.html
 [2]
 http://www.scipy.org/Tentative_NumPy_Tutorial#head-aa75ec76530ff51a2e98071adb7224a4b793519e

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Removing scipy.stsci was [Re: [SciPy-dev] Deprecate chararray [was Plea for help]]

2009-08-24 Thread Vincent Schut
Stéfan van der Walt wrote:
 Hi Vincent
 
 2009/8/21 Vincent Schut sc...@sarvision.nl:
 I know it probably will be a pretty involved task, as ndimage comes from
 numarray and seems to be largely implemented in C. But I really wanted
 to raise the issue now the image processing subject turns up once again,
 and hope some folks with more/better programming skills than me might
 like the idea...
 
 What would you like the behaviour to be?  For example, how should
 ndimage.zoom handle these missing values?

Good question :-)
I see 2 possibilities, both of them can be usefull in their own 
situations. Note that I am really not into splines mathematically, so my 
suggestions and terminology might not apply at all...

1. for any output cell that depends on a missing value in the input, 
return a missing/masked/NaN value, but (and I think this differs from 
the current implementation), for any output cell which could be 
calculated, return a proper output. Currently any array that contains 
one or more NaNs will give an output array full of NaNs (except for 
order=0, which shows this exact behaviour already). But maybe that's 
inherent to splines interpolation?
This would at least allow input arrays with missing values (or masked 
arrays) to be used; this behaviour could be extended to many of the 
ndimage functions, like the kernel based stuff. FFT based convolutions 
could be another story altogether...

2. In case of zoom & co: only use the non-missing values to calculate the 
splines, thus effectively inter/extrapolating missing/masked values in 
the process. This probably raises a lot of new questions about the 
implementation. It would however be highly useful for me... I don't 
know if an irregular-grid-based spline interpolation implementation exists?

What I currently do in a case like this (zooming an array with missing 
values) is first fill the missing values by using ndimage.generic_filter 
with a kernel function that averages the non-missing values in the 
moving window. This works as long as there are not too many missing 
values next to each other, however it is very slow...
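
A sketch of that workaround (my own wording of it, with made-up data):

import numpy as np
from scipy import ndimage

data = np.random.rand(5, 5)
data[2, 2] = np.nan                     # a missing value

def fill_mean(values):
    valid = values[~np.isnan(values)]
    return valid.mean() if valid.size else np.nan

filled = ndimage.generic_filter(data, fill_mean, size=3, mode='nearest')
filled = np.where(np.isnan(data), filled, data)   # keep original valid values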

I think that, if an effort like this is to be made, a thorough 
discussion on the possible behaviours of ndimage functions with missing 
values should take place on one of the numpy/scipy related mailing 
lists. I'm sure I'm not the only one with ideas and/or use cases for 
this, and I'm certainly not someone with a lot of theoretical knowledge 
in this area.

Regards,
Vincent.
 
 Regards
 Stéfan

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Removing scipy.stsci was [Re: [SciPy-dev] Deprecate chararray [was Plea for help]]

2009-08-21 Thread Vincent Schut
Christopher Hanley wrote:
 Hi Stefan,
 
 Never mind.  I just found the Sprint website and read the  
 description.  I'm sorry I hadn't found this sooner.  I would have made  
 plans to stay and help.  My apologizes.
 
Hi list,

I just saw this too and would like to misuse this thread to suggest 
another enhancement related to image processing you might consider: make 
ndimage maskedarray aware. Unfortunately we don't have any money to 
spend, otherwise I'd love to support this financially too. But it's a 
thing I'm running into (ndimage not working with missing data, that is) 
regularly, and for which it often is pretty hard to work out a 
workaround. E.g. any of the resampling (ndimage.zoom) or kernel 
filtering routines choke on array's with NaN's, and don't recognize 
masked arrays. Alas virtually all of the image data I process (satellite 
imagery) contains missing/bad data...

I know it probably will be a pretty involved task, as ndimage comes from 
numarray and seems to be largely implemented in C. But I really wanted 
to raise the issue now the image processing subject turns up once again, 
and hope some folks with more/better programming skills than me might 
like the idea...

Oh and I know of course ndimage is scipy, and this list is numpy. But as 
the image processing subject emerged here, well...

Cheers,
Vincent Schut.

 Sorry,
 Chris
 
 

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] add axis to results of reduction (mean, min, ...)

2009-08-18 Thread Vincent Schut
Keith Goodman wrote:
 On Thu, Aug 6, 2009 at 9:58 AM, Charles R
 Harrischarlesr.har...@gmail.com wrote:

 On Thu, Aug 6, 2009 at 9:55 AM, josef.p...@gmail.com wrote:
 What's the best way of getting back the correct shape to be able to
 broadcast, mean, min,.. to the original array, that works for
 arbitrary dimension and axis?

 I thought I have seen some helper functions, but I don't find them
 anymore?
 Adding a keyword to retain the number of dimensions has been mooted. It
 shouldn't be too difficult to implement and would allow things like:

 scaled = a/a.max(1, reduce=0)
 I could do that for 1.4 if folks are interested.
 
 I'd use that. It's better than what I usually do:
 
 scaled = a / a.max(1).reshape(-1,1)

To chime in after returning from holidays: I'd use that keyword a great 
deal. It would be more than welcome. I currently have loads of code 
numpy.newaxis-ing the results of min/max/mean operations...
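For example, this is the kind of thing I mean (just a sketch):

import numpy as np

a = np.random.rand(5, 4)
scaled = a / a.max(axis=1)[:, np.newaxis]   # re-adding the reduced axis by hand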

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] bitwise view on numpy array

2009-05-06 Thread Vincent Schut
Hi,

I'm gonna have large (e.g. 2400x2400) arrays of 16- and 32-bit bitfields. 
I've been searching in vain for an efficient and convenient way to 
represent these arrays' individual bits (or, even better, configurable 
bitfields of 1-4 bits each).
Of course I know I can 'split' the array into its separate bitfields using 
bitwise operators and shifts, but this will greatly increase the memory 
usage because it'll create one byte array for each bitfield.
So I was looking for a way to create a bitwise view on the original 
array's data. I've been looking at recarrays, but the smallest element 
these can use is a byte, correct? I've been looking at ctypes arrays of 
Structure subclasses, which can define bitfields. However, these will 
give me an object array of elements of the Structure subclass, 
and only allow me to access the bits per array element instead of for 
the entire array (or a subset), e.g. data[:].bit17-19 or something like that.
After searching the net in vain for some hours, the list is my last 
resort :-) Does anyone have ideas on how to get both memory-efficient and 
convenient access to single bits of a numpy array?
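To be explicit about the bitwise-operator approach I mentioned (a sketch; 
the field layout is just an example):

import numpy as np

flags = np.zeros((2400, 2400), dtype=np.uint16)   # example array of bitfields
field = (flags >> 2) & 0x7   # bits 2..4 as a separate array: convenient,
                             # but every extracted field is a full new array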

On a slightly related note, during my search I found some comments 
saying that numpy.bool arrays use an entire byte for each element. Could 
someone confirm (or, better, refute) that?

Thanks,
Vincent.

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] is it a bug?

2009-03-13 Thread Vincent Schut
Travis E. Oliphant wrote:
 shuwj5...@163.com wrote:
snipsnip

Travis, thanks for the excellent explanation! It clears up something which 
I think is related to this, and which I've been wanting to ask on the list 
for some time.
Now here's the case.
I often have 4d arrays that are actually related sets of satellite 
imagery, in the form [date, band, y, x]. These can get pretty 
large, so I like to avoid too much broadcasting or reshape-copying 
when indexing, to save some memory. However, I regularly have to apply 
a boolean index of [date, y, x] to each of the bands (think 
of the bool mask as a threshold based on just one of the bands).
Currently I usually loop over all band indices, e.g.

for b in range(data.shape[1]):
    data[:, b, :, :][mask] = 0

Would there be a way to do this in a more numpy-like fashion that is 
also memory-efficient?

Thanks,
Vincent.

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] fancy view question

2009-02-18 Thread Vincent Schut
Gael Varoquaux wrote:
 On Tue, Feb 17, 2009 at 10:18:11AM -0600, Robert Kern wrote:
 On Tue, Feb 17, 2009 at 10:16, Gael Varoquaux
 gael.varoqu...@normalesup.org wrote:
 On Tue, Feb 17, 2009 at 09:09:38AM -0600, Robert Kern wrote:
 np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)
 
 stride_tricks are fun, but this is already a solved problem in numpy.
 
 Wow. I still have a lot to learn! How about adding a see-also in
 as_strided.
 
 They're not really related.
 
 Yes, I now understand that there where no views involved, on the contrary
 to what I first thought, so indeed there are not related.
 
 Gaël

Wow, did I start something! :-)
Thanks to all who shared their knowledge and ideas. I learned a lot 
besides getting answers to my questions.

Vincent.

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] fancy view question

2009-02-17 Thread Vincent Schut
Hi list,

Would it be possible to create a view on an array, such that this view 
is twice as large (in some dimensions) and in fact does a nearest 
neighbour 'zoom' on the original array? E.g. using some fancy 
slicing/striding tricks?

an example:

a = [[1, 2],
  [3, 4]]

then I'd like a view on a such that this view is:

[[1, 1, 2, 2],
  [1, 1, 2, 2],
  [3, 3, 4, 4],
  [3, 3, 4, 4]]

I know several ways to create this as a copy, but as memory does play a 
role in this case (yes, I know it's cheap :-))...

If not, fancy ways to create this as a copy are appreciated. I currently 
either create an empty array and then loop-fill it (e.g. this would be 
2x2 loops), or, if I'm lazy, I use ndimage.zoom, which actually is a bit 
overkill.
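For completeness, one copy-based way (a sketch):

import numpy as np

a = np.array([[1, 2],
              [3, 4]])
zoomed = np.repeat(np.repeat(a, 2, axis=0), 2, axis=1)   # nearest-neighbour 'zoom', as a copy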

Regards,
Vincent.

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] memory usage

2008-10-15 Thread Vincent Schut
Huang-Wen Chen wrote:
 Robert Kern wrote:
 from numpy import *
 for i in range(1000):
   a = random.randn(512**2)
   b = a.argsort(kind='quick')
 Can you try upgrading to numpy 1.2.0? On my machine with numpy 1.2.0
 on OS X, the memory usage is stable.
   
 I tried the code fragment on two platforms and the memory usage is also 
 normal.
 
 1. numpy 1.1.1, python 2.5.1 on Vista 32bit
 2. numpy 1.2.0, python 2.6 on RedHat 64bit

If I recall correctly, there were some major improvements in python's 
memory management/garbage collection from version 2.4 to 2.5. If you 
could try to upgrade your python to 2.5 (and possibly also your numpy to 
1.2.0), you'd probably see some better behaviour.

Regards,
Vincent.

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] xml-rpc with numpy arrays

2008-10-01 Thread Vincent Schut
Lisandro Dalcin wrote:
 
 I believe xmlrpclib is currently the simpler approach. Some day I'll
 have the time to implement something similar using MPI communication
 with mpi4py. However, I believe it can be done even better: local,
 client-side proxies should automatically provide access to all
 members/methods of remote, server-side instances. The registering step
 needed within xmlrpclib is a bit ugly ;-)
 
 
Try pyro: http://pyro.sourceforge.net/ or rpyc: 
http://rpyc.wikispaces.com/, both of which, if I recall correctly, do 
implement this.

VS.

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] maskedarray: how to force mask to expand

2008-09-26 Thread Vincent Schut
Pierre,

Thanks for your explanations. It still seems a little (too) complicated, 
but from a backwards-compatibility pov combined with your 'nomask is not 
False' implementation detail, I can mostly understand it :-) I think the 
idea that when a.mask returns False, that actually means nomask instead 
of the False I'm used to, is what caused a major part of my confusion.
It might actually be nice to give you some context of why I asked this: 
during my (satellite image) processing, I use maskedarrays by default 
for each step in the processing chain, and I need to save the result of 
each step to disk. That means saving the array and its mask. I save both 
of them as tiff files (because these can include all other info that is 
nice for satellite imagery, like coordinates and projection). When 
saving the mask, I'm creating a tiff file and pushing the .mask array 
into it. Therefore, I obviously need the .mask not to be nomask, but to 
be a full shaped array. That's the context you need to see my confusion in.
Because speed and memory don't matter that much to me (well, they do 
matter, but I'm processing huge amounts of data anyway, and using 
parallel/clustered processing anyway, so I can take those masks...), I 
thought it might be easiest to make sure my data always has a 
full-shaped mask. But, of course, performance-wise it would be best to be able 
to work with nomasks, and only expand these to full-shaped masks when 
writing to disk. That's why I asked for a possible method on an ma to 
force expanding a mask, or e.g. an ma.mask.as_full_shaped_mask() method 
that returns either the mask, or (if nomask) a new array of Falses. I 
just supposed it existed and I could not find it, but now I understand 
it does not exist. But I could easily write something that checks for 
nomask and always returns an expanded mask.
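Something like this would already do, I guess (a sketch; if I'm not 
mistaken, numpy.ma.getmaskarray does more or less exactly this):

import numpy as np

def full_mask(a):
    # return the mask as a full-shaped bool array, also when it is nomask
    if a.mask is np.ma.nomask:
        return np.zeros(a.shape, dtype=bool)
    return a.mask

# or simply:
# mask = np.ma.getmaskarray(a)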

The 'trick' of creating ma's with the mask=False keyword is neat; I had 
not thought of that.
Same applies for masking values using ma[idx] = ma.masked.
Just for completeness (in case someone else is reading this and 
wondering how to *unmask* values): just setting ma[idx] to some valid 
number will unset the mask for that index. No need to do ma[idx] = 
ma.unmask or whatever, just ma[idx] = v.
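Summarizing those idioms in code (a sketch):

import numpy as np

a = np.ma.array([1, 2, 3, 4], mask=False)   # mask=False gives a full-shaped mask
a[1] = np.ma.masked                         # mask one element
a[1] = 2                                    # assigning a value unmasks it again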

OK, top posting is bad :) Further comments inline.

Pierre GM wrote:
 Vincent,
 
 The argument of speed (having a default mask of nomask) applies only to the 
 computations inside MaskedArray. Of course, it is still far faster not to use 
 masks but only ndarrays.
 
 Just for clarity, to rephrase my question: how do I force ma to give me
 (always/by default/by some method on a maskedarray) a full shaped mask
 instead of 'False' or nomask? Because I am sure from the beginning that
 I'll need this mask in full shape, I want it, and I want to be able to
 treat it like any normal bool array :-)
 
 Easy:
 a = ma.array([1,2,3,4], mask=False)
 masked_array(data = [1 2 3 4],
   mask = [False False False False],
   fill_value=99)
 
 Puzzling ? Not so much. See, by default, the value of the mask parameter is 
 `nomask`. nomask is in fact a 0-size boolean ndarray with value 0 (False). At 
 the creation of the masked array, we check whether a value was given to the 
 mask parameter. If no value is given, we default to `nomask`, and we end up 
 with `a.mask is nomask`. If you force the mask parameter to the boolean 
 False, you're not using `nomask`: in that case, the full mask is created.

I understand that now.

 
 That won't work with ma.zeros or ma.ones. I could add an extra keyword to 
 deal 
 witht that, but is it really needed when you can simply do a.mask=False ?

No real need for that, then. It would just be some convenient sugar on 
(especially my) cake.
 
 Note: 
 a=ma.array([1,2,3,])
 a.mask is False
 False
 a.mask is ma.nomask
 True
 a.mask == False
 True
 
Btw, in future versions, would it be an idea to separate 'nomask' and 
'False' a little more? I always assumed (correctly?) that in python, 
True and False are singletons (as far as that is possible in python), 
just like None. 'False is False' should always compare to True, then. In 
this case (a.mask is False) it at least *seems* to break that 'rule'...

 And is there also a complement, like ma.unmasked? I could not find it
 (very quick search, I admit)... Or can I use !ma.masked?

I found out you can just assign values to elements to unmask them. No need 
for ma.unmasked.
 
 No, there isn't and no you can't.  ma.masked is actually a constant defined 
 module-wide and independent of any array. That way, you can test whether an 
 element is masked with `a[..] is masked`. 
 
 Ah, now the magic starts... (normal user cap on head, beware):

 In [9]: am.mask
 Out[9]: False

 In [10]: am.mask = False

 In [11]: am.mask
 Out[11]:
 array([[False, False],
 [False, False]], dtype=bool)

 while (with the same am as before [9], with am.mask == False):

 In [15]: am.mask = am.mask

 In [16]: am.mask
 Out[16]: False

 Do you see (and agree with 

Re: [Numpy-discussion] maskedarray: how to force mask to expand

2008-09-25 Thread Vincent Schut
 to force the mask at 
creation to full shape, and there is no method on a maskedarray to 
change the mask to full shape.
However, one can apply some magic and use a.mask = False directly 
after creation to force the mask to full shape. This of course only 
works when the mask already *was* False, otherwise you'll effectively be 
changing your mask. So we presume ma never by default returns a mask of 
'True', and then this works. The obvious trick to work around this remote 
possibility of a mask of 'True' would be a.mask = a.mask, but that does 
not work.

Hey, sorry about starting a discussion about this, while I meant to ask 
just a simple question (and really assumed I had overlooked something, 
it seemed so simple...). Again, no offence meant, and your work on ma is 
really appreciated. I hope this discussion will result in more 
intuitiveness in a future (C?) implementation of ma.

Regards,
Vincent.
 
 
 On Wednesday 24 September 2008 06:25:57 Vincent Schut wrote:
 Probably I'm just overlooking something obvious, but I'm having problems
 with maskedarrays (numpy.ma from svn: '1.3.0.dev5861'), the mask by
 default being a single bool value ('False') instead of a properly sized
 bool array. If I then try to mask one value by assigning values to
 certain mask positions (a.mask[0,0]=True) I get an error, logically. I
 know I can use mask_where, but I like the mask[...] idiom. And I have to
 expand the mask anyway, as I'm gonna write it to a file at the end.

 1) Is there a way to have ma always use properly expanded masks (bool
 arrays instead of single bool values)? I tried the shrink=False keyword,
 but that does not do what I want, and is not available for
 numpy.ma.zeros, which I conveniently use a lot.

 2) Is there a method/function to request the mask, be it a single bool
 value or an array, as a properly sized array? I found shrink_mask but no
 opposite method, and shrink_mask seems to do something subtly different
 even.

 Regards,
 Vincent.

 ___
 Numpy-discussion mailing list
 Numpy-discussion@scipy.org
 http://projects.scipy.org/mailman/listinfo/numpy-discussion

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] maskedarray: how to force mask to expand

2008-09-24 Thread Vincent Schut
Probably I'm just overlooking something obvious, but I'm having problems 
with maskedarrays (numpy.ma from svn: '1.3.0.dev5861'), the mask by 
default being a single bool value ('False') instead of a properly sized 
bool array. If I then try to mask one value by assigning values to 
certain mask positions (a.mask[0,0]=True) I get an error, logically. I 
know I can use mask_where, but I like the mask[...] idiom. And I have to 
expand the mask anyway, as I'm gonna write it to a file at the end.

1) Is there a way to have ma always use properly expanded masks (bool 
arrays instead of single bool values)? I tried the shrink=False keyword, 
but that does not do what I want, and is not available for 
numpy.ma.zeros, which I conveniently use a lot.

2) Is there a method/function to request the mask, be it a single bool 
value or an array, as a properly sized array? I found shrink_mask but no 
opposite method, and shrink_mask seems to do something subtly different 
even.

Regards,
Vincent.

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] 1.2 tasks

2008-08-05 Thread Vincent Schut
David Huard wrote:
 
 
 On Mon, Aug 4, 2008 at 1:45 PM, Jarrod Millman [EMAIL PROTECTED] 
 mailto:[EMAIL PROTECTED] wrote:

 snip

 Question: Should histogram raise a warning by default (new=True) to warn 
 users that the behaviour has changed ? Or warn only if new=False to 
 remind that
 the old behaviour will be deprecated in 1.3 ?  I think that users will 
 prefer being annoyed at warnings than surprised by an unexpected change, 
 but repeated warnings
 can become a nuisance.
 
 To minimize annoyance, we could also offer three possibilities:
   
 new=None (default) : Equivalent to True, print warning about change.
 new=True : Don't print warning.
 new=False : Print warning about future deprecation.
 
 So those who have already set new=True don't get warnings, and all 
 others are warned. Feedback ?

As a regular user of histogram I say: please warn! Your proposal above 
seems OK to me. I do have histogram in a lot of somewhat old (and 
sometimes long-running) code of mine, and I certainly would prefer to be 
warned.

Vincent.

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] first recarray steps

2008-05-22 Thread Vincent Schut
Anne Archibald wrote:
 2008/5/21 Vincent Schut [EMAIL PROTECTED]:
 Christopher Barker wrote:
 Also, if you image data is rgb, usually, that's a (width, height, 3)
 array: rgbrgbrgbrgb... in memory. If you have a (3, width, height)
 array, then that's rrr... Some image libs
 may give you that, I'm not sure.
 My data is. In fact, this is a simplification of my situation; I'm
 processing satellite data, which usually has more (and other) bands than
 just rgb. But the data is definitely in shape (bands, y, x).
 
 You may find your life becomes easier if you transpose the data in
 memory. This can make a big difference to efficiency. Years ago I was
 working with enormous (by the standards of the day) MATLAB files on
 disk, storing complex data. The way (that version of) MATLAB
 represented complex data was the way you describe: matrix of real
 parts, matrix of imaginary parts. This meant that to draw a single
 pixel, the disk needed to seek twice... depending on what sort of
 operations you're doing, transposing your data so that each pixel is
 all in one place may improve cache coherency as well as making the use
 of record arrays possible.
 
 Anne

Anne, thanks for the thoughts. In most cases, you'll probably be right. 
In this case, however, it won't give me much (if any) speedup, maybe 
even slowdown. Satellite images often are stored on disk in a band 
sequential manner. The library I use for IO is GDAL, which is a highly 
optimized C library for reading/writing almost any kind of satellite 
data type. It also features an internal caching mechanism. And it gives 
me my data as (y, x, bands).
I'm not reading single pixels anyway. The amounts of data I have to 
process (enormous, even by the standards of today ;-)) require me to do 
this in chunks, in parallel, even on different cores/cpu's/computers. 
Every chunk usually is (chunkYSize, chunkXSize, allBands) with xsize and 
ysize being not so small (think from 64^2 to 1024^2) so that pretty much 
eliminates any performance issues regarding the data on disk. 
Furthermore, having to process on multiple computers forces me to have 
my data on networked storage. The latency and transfer rate of the 
network will probably eliminate any small speedup gained from my drive 
having to do fewer seeks...
Now for the recarray part, that would indeed ease my life a bit :) 
However, having to transpose the data in memory on every read and write 
does not sound very attractive. It will waste cycles and memory, and be 
asking for bugs. I can live without recarrays, for sure. I only hoped 
they might make my life a bit easier and my code a bit more readable, 
without too much effort. Well, they won't, apparently... I'll just go on 
like I did before this little exercise.

Thanks all for the inputs.

Cheers,
Vincent.

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] distance_matrix: how to speed up?

2008-05-22 Thread Vincent Schut
Emanuele Olivetti wrote:
snip
 
 This solution is super-fast, stable and use little memory.
 It is based on the fact that:
 (x-y)^2*w = x*x*w - 2*x*y*w + y*y*w
 
 For size1=size2=dimensions=1000 requires ~0.6sec. to compute
 on my dual core duo. It is 2 order of magnitude faster than my
 previous solution, but 1-2 order of magnitude slower than using
 C with weave.inline.
 
 Definitely good enough for me.
 
 
 Emanuele

Reading this thread, I remembered having tried scipy's sandbox.rbf 
(radial basis function) to interpolate a pretty large, multidimensional 
dataset, to fill in the missing data points. This, however, soon failed 
with out-of-memory errors, which, if I remember correctly, came from the 
pretty straightforward distance calculation between the different data 
points that is used in this package. Being no math wonder, I assumed 
that there simply was no simple way to calculate distances without using 
much memory, and ended my rbf experiments.

To make a story short: correct me if I am wrong, but might it be an idea 
to use the above solution in scipy.sandbox.rbf?
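For my own future reference, the trick from above without the weights, as 
I understand it (a sketch):

import numpy as np

def sq_distance_matrix(x, y):
    # x: (n1, dim), y: (n2, dim) -> (n1, n2) squared euclidean distances,
    # using (x-y)^2 = x^2 - 2*x*y + y^2, so no (n1, n2, dim) array is needed
    return ((x * x).sum(axis=1)[:, np.newaxis]
            - 2 * np.dot(x, y.T)
            + (y * y).sum(axis=1)[np.newaxis, :])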

Vincent.

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] distance_matrix: how to speed up?

2008-05-22 Thread Vincent Schut
Rob Hetland wrote:
 On May 22, 2008, at 9:45 AM, Vincent Schut wrote:
 
snip
 
 Really, though, the rbf toolbox will not be limited by the memory of  
 the distance matrix.  Later on, you need to do a large linear algebra  
 'solve', like this:
 
 
 r = norm(x, x) # The distances between all of the ND points to each  
 other.
 A = psi(r)  # where psi is some divergent function, often the  
 multiquadratic function : sqrt((self.epsilon*r)**2 + 1)
 coefs = linalg.solve(A, data)  # where data is the length of x, one  
 data point for each spatial point.
 
 # to find the interpolated data points at xi
 ri = norm(xi, x)
 Ai = psi(ri)
 di = dot(Ai, coefs)
 
 
 All in all, it is the 'linalg.solve' that kills you.

Ah, indeed, my memory was faulty, I'm afraid. It was in this phase that 
it halted, not in the distance calculations.

Vincent.

 
 -Rob
 
 
 Rob Hetland, Associate Professor
 Dept. of Oceanography, Texas AM University
 http://pong.tamu.edu/~rob
 phone: 979-458-0096, fax: 979-845-6331

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] first recarray steps

2008-05-21 Thread Vincent Schut
Christopher Barker wrote:
 
 Vincent Schut wrote:
 Lets say I have a rgb image of arbitrary size, as a normal ndarray 
 (that's what my image reading lib gives me). Thus shape is 
 (3,ysize,xsize), dtype = int8. How would I convert/view this as a 
 recarray of shape (ysize, xsize) with the first dimension split up into 
 'r', 'g', 'b' fields? No need for 'x' and 'y' fields.
 
 Take a look in this list for a thread entitled recarray fun about a 
 month ago -- you'll find some more discussion of approaches.

Well, actually that thread was my inspiration to take a closer look into 
recarrays...
 
 Also, if you image data is rgb, usually, that's a (width, height, 3) 
 array: rgbrgbrgbrgb... in memory. If you have a (3, width, height) 
 array, then that's rrr... Some image libs 
 may give you that, I'm not sure.

My data is. In fact, this is a simplification of my situation; I'm 
processing satellite data, which usually has more (and other) bands than 
just rgb. But the data is definitely in shape (bands, y, x).
 
 Also, you probably want a uint8 dtype, giving you 0-255 for each byte.

Same story. In fact, in this case it's int16, but can actually be any 
data type, even floats, even complex.
But thanks for the thoughts :-)
 
 -Chris
 
 
 

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] first recarray steps

2008-05-21 Thread Vincent Schut
Robert Kern wrote:
 On Wed, May 21, 2008 at 1:48 AM, Vincent Schut [EMAIL PROTECTED] wrote:
 Christopher Barker wrote:
 
 Also, if you image data is rgb, usually, that's a (width, height, 3)
 array: rgbrgbrgbrgb... in memory. If you have a (3, width, height)
 array, then that's rrr... Some image libs
 may give you that, I'm not sure.
 My data is. In fact, this is a simplification of my situation; I'm
 processing satellite data, which usually has more (and other) bands than
 just rgb. But the data is definitely in shape (bands, y, x).
 
 I don't think record arrays will help you much, then. Individual
 records need to be contiguous (bar padding). You can't interleave
 them.
 
Hmm, that was just what I was wondering about, when reading Stefan's 
reply. So in fact, recarrays aren't just another way to view some data, 
no matter in what shape it is.

So his solution: 
x.T.reshape((-1,x.shape[0])).view(dt).reshape(x.shape[1:]).T won't work, 
then. Or, at least, it won't give me a view on my original data, but would 
give me a recarray with a copy of my data.
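A quick way to check that, I suppose (a sketch):

import numpy as np

x = np.zeros((3, 4, 5), dtype=np.int16)    # (bands, y, x)
step1 = x.T.reshape((-1, x.shape[0]))      # reshaping the transpose already copies...
print(np.may_share_memory(x, step1))       # ...so this prints False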

I guess I was misled by this text on the recarray wiki page:

We would like to represent a small colour image. The image is two 
pixels high and two pixels wide. Each pixel has a red, green and blue 
colour component, which is represented by a 32-bit floating point number 
between 0 and 1.

Intuitively, we could represent the image as a 3x2x2 array, where the 
first dimension represents the color, and the last two the pixel 
positions, i.e. 

Note the 3x2x2, which suggested imho that this would work with an 
image with (bands,y,x) shape, not with (x,y,bands) shape. But I 
understand that it's not shape, but internal representation in memory 
(contiguous or not, C/Fortran, etc) that matters?

I know I can change the wiki text, but I'm afraid I still don't feel 
confident on this matter...

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] first recarray steps

2008-05-21 Thread Vincent Schut
Robert Kern wrote:
 On Wed, May 21, 2008 at 2:03 AM, Vincent Schut [EMAIL PROTECTED] wrote:
 Robert Kern wrote:
 On Wed, May 21, 2008 at 1:48 AM, Vincent Schut [EMAIL PROTECTED] wrote:
 Christopher Barker wrote:
 Also, if you image data is rgb, usually, that's a (width, height, 3)
 array: rgbrgbrgbrgb... in memory. If you have a (3, width, height)
 array, then that's rrr... Some image libs
 may give you that, I'm not sure.
 My data is. In fact, this is a simplification of my situation; I'm
 processing satellite data, which usually has more (and other) bands than
 just rgb. But the data is definitely in shape (bands, y, x).
 I don't think record arrays will help you much, then. Individual
 records need to be contiguous (bar padding). You can't interleave
 them.

 Hmm, that was just what I was wondering about, when reading Stefan's
 reply. So in fact, recarrays aren't just another way to view some data,
 no matter in what shape it is.

 So his solution:
 x.T.reshape((-1,x.shape[0])).view(dt).reshape(x.shape[1:]).T won't work,
 than. Or, at least, won't give me a view on my original dat, but would
 give me a recarray with a copy of my data.
 
 Right.
 
 I guess I was misled by this text on the recarray wiki page:

 We would like to represent a small colour image. The image is two
 pixels high and two pixels wide. Each pixel has a red, green and blue
 colour component, which is represented by a 32-bit floating point number
 between 0 and 1.

 Intuitively, we could represent the image as a 3x2x2 array, where the
 first dimension represents the color, and the last two the pixel
 positions, i.e. 

 Note the 3x2x2, which suggested imho that this would work with an
 image with (bands,y,x) shape, not with (x,y,bands) shape.
 
 Yes, the tutorial goes on to use record arrays as a view onto an
 (x,y,bands) array and also make a (bands,x,y) view from that, too.
 That is, in fact, quite a confusing presentation of the subject.
 
 Now, there is a way to use record arrays here; it's a bit ugly but can
 be quite useful when parsing data formats. Each item in the record can
 also be an array. So let's pretend we have a (3,nx,ny) RGB array.
 
 nbands, nx, ny = a.shape
 dtype = numpy.dtype([
   ('r', a.dtype, [nx, ny]),
   ('g', a.dtype, [nx, ny]),
   ('b', a.dtype, [nx, ny]),
 ])
 
 # The flatten() is necessary to pre-empt numpy from
 # trying to do too much interpretation of a's shape.
 rec = a.flatten().view(dtype)
 print rec['r']
 print rec['g']
 print rec['b']
 

Ah, now that is clarifying! Thanks a lot. I'll do some experiments to 
see whether this way of viewing my data is useful to me (in the sense 
that making my code more readable is already very useful).

Cheers,
Vincent.

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] first recarray steps

2008-05-20 Thread Vincent Schut
Hi, I'm trying to get into recarrays. Unfortunately documentation is a 
bit on the short side...

Let's say I have an rgb image of arbitrary size, as a normal ndarray 
(that's what my image reading lib gives me). Thus shape is 
(3,ysize,xsize), dtype = int8. How would I convert/view this as a 
recarray of shape (ysize, xsize) with the first dimension split up into 
'r', 'g', 'b' fields? No need for 'x' and 'y' fields.

I tried creating a numpy dtype {names: ('r','g','b'), formats: 
(numpy.int8,)*3}, but when I try to raw_img.view(rgb_dtype) I get:
ValueError: new type not compatible with array.

Now this probably should not be too difficult, but I just don't see it...
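In runnable form, what I tried is roughly this (the sizes are made up):

import numpy as np

raw_img = np.zeros((3, 4, 5), dtype=np.int8)      # (bands, ysize, xsize)
rgb_dtype = np.dtype({'names': ('r', 'g', 'b'),
                      'formats': (np.int8,) * 3})
rgb = raw_img.view(rgb_dtype)   # -> the ValueError mentioned above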

Thanks,
Vincent.

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] moving window in numpy

2007-11-26 Thread Vincent Schut
Matthew Perry wrote:
 Hi all,
 
  I'm not sure if my terminology is familiar but I'm trying to do a
 moving window analysis (ie a spatial filter or kernel) on a 2-D
 array representing elevation. For example, a 3x3 window centered on
 each cell is used to calculate the derivate slope of that cell.
 
  Can this easily be implemented using numpy?
 

If I understand you correctly, scipy.ndimage does exactly what you ask
for. See http://www.scipy.org/SciPyPackages/Ndimage.
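Something along these lines, for example (a sketch; the window function is 
just a stand-in, not a real slope formula):

import numpy as np
from scipy import ndimage

elevation = np.random.rand(100, 100)

def local_range(window):        # the 3x3 window arrives as a flat array
    return window.max() - window.min()

result = ndimage.generic_filter(elevation, local_range, size=3)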

Cheers,
VS.

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy : your experiences?

2007-11-20 Thread Vincent Schut
Rahul Garg wrote:
 a) Can you guys tell me briefly about the kind of problems you are
 tackling with numpy and scipy?

Mainly processing of time series of Remote Sensing data ('satellite 
images'). No really fancy math, but huge (sometimes multiple 
gigabytes) multidimensional (date, bands, y, x; order of magnitude: 
[hundreds, a few, tens of thousands or more, idem]) datasets. Some tasks are 
long-running, not because of the complex math involved, but because of
the amount of data.

 b) Have you ever felt that numpy/scipy was slow and had to switch to
 C/C++/Fortran?

Yes, but usually the overhead of programming the whole thing in C is not
worth the speedup of processing, especially with modern fast computers
and easy ways to parallelize stuff (ipython and/or parallelpython).

 c) Do you use any form of parallel processing? Multicores? SMPs?
 Clusters? If yes how did u utilize them?

See above. Just started to use parallelpython here, currently not yet
for production work, but in testing/prototyping phase. Using both smp,
multicore and multiple machines ('cluster'). PP doesn't distinguish
between them.
How? Just cut the data into suitable pieces and throw them as pp jobs to
the cluster :-) It's really that simple nowadays. And most of our
processing is very parallel in nature.

Cheers,
Vincent Schut.

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] ndim argsort question

2007-11-20 Thread Vincent Schut
Hi,

I'm getting a bit lost in numpy's fancy indexing, using argsort on a
slice of a 4d array, and then using the resulting indices to sort other
slices of that same array, preferably in one go.

I have a data[d, b, y, x] array (for the curious: date, band, y, x:
satellite image).
Then I create a sorting index from band 0:
idx = data[:, 0, :, :].argsort(axis=0)
Now I want to sort all 7 bands according to this index, without looping,
or at least avoiding looping as far as possible. Something like sortedData
= data[idx], but then for each 'b'. However, I can't get multidim
indexing using the argsort result to work. Any hints?

Hard to explain. I hope someone understands it, otherwise please let me
know and I'll try to rephrase :)

Cheers,
Vincent Schut.

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] strange problem compiling against ndarrayobject.h

2007-11-01 Thread Vincent Schut
Hmm, it seems my original message did not come through? Not in gmane, at
least... Well, here it is again:

Hi numpy and/or gdal guru's,

I'm suddenly getting into trouble compiling gdal's python extension,
when it includes ndarrayobject.h from numpy. First it complains about my
python not being unicode, though I am sure it is unicode, and then it
bails out with all kinds of errors (see snippet below).
Numpy people, please read on even though this might seem a gdal problem,
because it clearly is numpy related, see the snippet...

I've tried everything I could think of, but to no avail. I hope one of
you can give me a hint, because I'm kind of clueless, and even don't
know in what direction I should search...

Here is a paste from the output from building (python setup.py build) in
the gdal/swig/python dir.

make[2]: Entering directory `/usr/local/src/gdal/swig/python'
python setup.py build
numpy include /usr/lib64/python2.5/site-packages/numpy/core/include
running build
running build_py
creating build
creating build/lib.linux-x86_64-2.5
copying gdal.py - build/lib.linux-x86_64-2.5
copying ogr.py - build/lib.linux-x86_64-2.5
copying osr.py - build/lib.linux-x86_64-2.5
copying gdalconst.py - build/lib.linux-x86_64-2.5
creating build/lib.linux-x86_64-2.5/osgeo
copying osgeo/gdalnumeric.py - build/lib.linux-x86_64-2.5/osgeo
copying osgeo/gdalconst.py - build/lib.linux-x86_64-2.5/osgeo
copying osgeo/__init__.py - build/lib.linux-x86_64-2.5/osgeo
copying osgeo/osr.py - build/lib.linux-x86_64-2.5/osgeo
copying osgeo/gdal_array.py - build/lib.linux-x86_64-2.5/osgeo
copying osgeo/gdal.py - build/lib.linux-x86_64-2.5/osgeo
copying osgeo/ogr.py - build/lib.linux-x86_64-2.5/osgeo
running build_ext
building 'osgeo._gdal' extension
C compiler: x86_64-pc-linux-gnu-gcc -pthread -fno-strict-aliasing
-DNDEBUG -fPIC

creating build/temp.linux-x86_64-2.5
creating build/temp.linux-x86_64-2.5/extensions
compile options: '-I../../port -I../../gcore -I../../alg -I../../ogr
-I/usr/lib64/python2.5/site-packages/numpy/core/include
-I/usr/include/python2.5 -c'
x86_64-pc-linux-gnu-gcc: extensions/gdal_wrap.cpp
extensions/gdal_wrap.cpp: In function ‘PyObject*
_wrap_MajorObject_SetMetadata__SWIG_0(PyObject*, PyObject*)’:
extensions/gdal_wrap.cpp:4570: warning: deprecated conversion from
string constant to ‘char*’
x86_64-pc-linux-gnu-g++ -pthread -shared
build/temp.linux-x86_64-2.5/extensions/gdal_wrap.o -L../../.libs -L../..
-L/usr/lib64 -lgdal -lpython2.5 -o build/lib.linux-x86_64-2.5/osgeo/_gdal.so
building 'osgeo._gdalconst' extension
C compiler: x86_64-pc-linux-gnu-gcc -pthread -fno-strict-aliasing
-DNDEBUG -fPIC

compile options: '-I../../port -I../../gcore -I../../alg -I../../ogr
-I/usr/lib64/python2.5/site-packages/numpy/core/include
-I/usr/include/python2.5 -c'
x86_64-pc-linux-gnu-gcc: extensions/gdalconst_wrap.c
x86_64-pc-linux-gnu-gcc -pthread -shared
build/temp.linux-x86_64-2.5/extensions/gdalconst_wrap.o -L../../.libs
-L../.. -L/usr/lib64 -lgdal -lpython2.5 -o
build/lib.linux-x86_64-2.5/osgeo/_gdalconst.so
building 'osgeo._osr' extension
C compiler: x86_64-pc-linux-gnu-gcc -pthread -fno-strict-aliasing
-DNDEBUG -fPIC

compile options: '-I../../port -I../../gcore -I../../alg -I../../ogr
-I/usr/lib64/python2.5/site-packages/numpy/core/include
-I/usr/include/python2.5 -c'
x86_64-pc-linux-gnu-gcc: extensions/osr_wrap.cpp
x86_64-pc-linux-gnu-g++ -pthread -shared
build/temp.linux-x86_64-2.5/extensions/osr_wrap.o -L../../.libs -L../..
-L/usr/lib64 -lgdal -lpython2.5 -o build/lib.linux-x86_64-2.5/osgeo/_osr.so
building 'osgeo._ogr' extension
C compiler: x86_64-pc-linux-gnu-gcc -pthread -fno-strict-aliasing
-DNDEBUG -fPIC

compile options: '-I../../port -I../../gcore -I../../alg -I../../ogr
-I/usr/lib64/python2.5/site-packages/numpy/core/include
-I/usr/include/python2.5 -c'
x86_64-pc-linux-gnu-gcc: extensions/ogr_wrap.cpp
x86_64-pc-linux-gnu-g++ -pthread -shared
build/temp.linux-x86_64-2.5/extensions/ogr_wrap.o -L../../.libs -L../..
-L/usr/lib64 -lgdal -lpython2.5 -o build/lib.linux-x86_64-2.5/osgeo/_ogr.so
building 'osgeo._gdal_array' extension
C compiler: x86_64-pc-linux-gnu-gcc -pthread -fno-strict-aliasing
-DNDEBUG -fPIC

compile options: '-I../../port -I../../gcore -I../../alg -I../../ogr
-I/usr/lib64/python2.5/site-packages/numpy/core/include
-I/usr/include/python2.5 -c'
x86_64-pc-linux-gnu-gcc: extensions/_gdal_array.cpp
In file included from
/usr/lib64/python2.5/site-packages/numpy/core/include/numpy/arrayobject.h:14,
 from extensions/gdal_array.h:10,
 from extensions/_gdal_array.cpp:33:
/usr/lib64/python2.5/site-packages/numpy/core/include/numpy/ndarrayobject.h:92:2:
error: #error Must use Python with unicode enabled.
In file included from /usr/include/python2.5/Python.h:62,
 from extensions/gdal_array.h:16,
 from extensions/_gdal_array.cpp:33:
/usr/include/python2.5/pyport.h:105:1: warning: PY_SSIZE_T_MAX redefined
In file included from

Re: [Numpy-discussion] strange problem compiling against ndarrayobject.h

2007-11-01 Thread Vincent Schut
Hi all,

It appeared to be a gdal issue after all: the arrayobject header file
was being included before the python headers...
Glad it wasn't something like me having borked my numpy build :)

Cheers,
Vincent.


Vincent Schut wrote:
 Hmm, it seems my original message did not come through? Not in gmane, at
 least... Well, here's again:

 Hi numpy and/or gdal guru's,

 I'm suddenly getting into trouble compiling gdal's python extension,
 when it includes ndarrayobject.h from numpy. First it complains about my
 python not being unicode, though I am sure it is unicode, and then it
 bails out with all kinds of errors (see snippet below).
 Numpy people, please read on even though this might seem a gdal problem,
 because it clearly is numpy related, see the snippet...

 I've tried everything I could think of, but to no avail. I hope one of
 you can give me a hint, because I'm kind of clueless, and even don't
 know in what direction I should search...

 Here is a paste from the output from building (python setup.py build) in
 the gdal/swig/python dir.

 [quoted build output snipped -- identical to the log in the original message above]