Thanks Andre I had a go at following your advice but it didn't seem to work (it kept focusing on the last loop and not combining them all together) so I have posted a note on scipy user group instead (code below). Also thanks for the advice regarding averaging!
from netCDF4 import Dataset import matplotlib.pyplot as plt import numpy as N from mpl_toolkits.basemap import Basemap import os MainFolder=r"E:/temp_samples/" for (path, dirs, files) in os.walk(MainFolder): for dir in dirs: print dir for ncfile in files: if ncfile[-3:]=='.nc': ncfile=os.path.join(path,ncfile) ncfile=Dataset(ncfile, 'r+', 'NETCDF4') TSFC=ncfile.variables['T_SFC'][4::24] LAT=ncfile.variables['latitude'][:] LON=ncfile.variables['longitude'][:] TIME=ncfile.variables['time'][:] fillvalue=ncfile.variables['T_SFC']._FillValue ncfile.close() #calculate summary stats big_array=[] for i in TSFC: big_array.append(i) big_array=N.array(big_array) Mean=N.mean(big_array, axis=0) #plot output summary stats map = Basemap(projection='merc',llcrnrlat=-40,urcrnrlat=-33, llcrnrlon=139.0,urcrnrlon=151.0,lat_ts=0,resolution='i') map.drawcoastlines() map.drawstates() x,y=map(*N.meshgrid(LON,LAT)) plt.title('Total Mean at 3pm') ticks=[-5,0,5,10,15,20,25,30,35,40,45,50] CS = map.contourf(x,y,Mean,ticks, cmap=plt.cm.jet) l,b,w,h =0.1,0.1,0.8,0.8 cax = plt.axes([l+w+0.025, b, 0.025, h]) plt.colorbar(CS,cax=cax, drawedges=True) plt.show() On Thu, Aug 11, 2011 at 4:48 PM, Andre' Walker-Loud <walksl...@gmail.com>wrote: > Hello Anonymous Questioner, > > First - you will probably be told this isn't the correct email list for > this question - this is a general python tutorial list, while your question > is numpy specific, so if this doesn't help, you should probably look to > another email list. > > There are a couple general comments to make regarding your question. > Basically, you want to load all the data into one big numpy array. > Currently, it seems your data array has 3 indices, from this line > > > TSFC=ncfile.variables['T_SFC'][4::24,:,:] > > you can make a larger array with one more index, where the new index runs > over files. So instead of looping over the files as you do, you can use the > loop to build the larger array. > > I am sure there is a more elegant way to do this, but here is something > that is along the lines of what you want > > > big_array = [] > > for ncfile in files: > > big_array.append(ncfile.variables['T_SFC'][4::24,:,:]) > > if all of your data you are taking from each file is the same size for all > the files, then you can declare the whole thing an array > > > big_array = N.array(big_array) > > and then you can take averages very easily by specifying which index, or > indices you want to average over. If life is not so kind, and your files > are of different sizes, then you have to be more careful. But to give a > precise answer to your problem would require knowledge of what your data > files actually look like. > > > Also, a bit of caution with averaging (in case you are unfamiliar). Say > you have 5 data sets, and each one has a different amount of data in it. If > you first take the average of each set, you can not simply take the average > of those sets to produce the global average. You have to weight each set by > the amount of data you averaged. Say the length of the sets are > [N0,N1,N2,N3,N4] with average values [a0,a1,a2,a3,a4], then to produce the > global average, you need to take > > avg = (1 / (N0+N1+N2+N3+N4) ) * (N0*a0 + N1*a1 + N2*a2 + N3*a3 + N4*a4) > > a few lines of algebra can demonstrate this produces the average you would > get by combining all the data in all the sets and taking one big average. > > > Regards, > > Andre > > > > On Aug 10, 2011, at 6:57 PM, questions anon wrote: > > > I have many ncfiles each containing one month of hourly temperature data. > > I have worked out how to loop through a number of ncfiles and calculate > the mean for each file at a particular time and even plot this and output as > a *.png. > > What I am unsure of is how to calculate the mean at a particular time for > all files. > > Do I somehow output the mean for each folder and then calculate the mean > of all the means? Or is there some way to loop through and calculate it > directly? > > Below is the code I have working that calculates the mean for each file. > Any help will be greatly appreciated!! > > > > from netCDF4 import Dataset > > import numpy as N > > > > MainFolder=r"D:/temp_samples/" > > print MainFolder > > for (path, dirs, files) in os.walk(MainFolder): > > for dir in dirs: > > print dir > > path=path+'/' > > > > for ncfile in files: > > if ncfile[-3:]=='.nc': > > print "dealing with ncfiles:", ncfile > > ncfile=os.path.join(path,ncfile) > > ncfile=Dataset(ncfile, 'r+', 'NETCDF4') > > TSFC=ncfile.variables['T_SFC'][4::24,:,:] #this chooses a > particular time for each day for the whole month > > LAT=ncfile.variables['latitude'][:] > > LON=ncfile.variables['longitude'][:] > > TIME=ncfile.variables['time'][:] > > fillvalue=ncfile.variables['T_SFC']._FillValue > > ncfile.close() > > > > #calculate summary stats > > TSFCmean=N.mean(TSFC, axis=0) > > print "The TSFCmean array is:", TSFCmean > > > > # there is a lot more code that follows this to plot the mean for each > ncfile but I have cut that out as I think it is irrelevant to my question > > > > _______________________________________________ > > Tutor maillist - Tutor@python.org > > To unsubscribe or change subscription options: > > http://mail.python.org/mailman/listinfo/tutor > >
_______________________________________________ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor