You might prefer to try Nujan instead of mixing Python and netCDF, although
variables are limited to 2 GB.
http://www.ral.ucar.edu/~steves/nujan.html

On Fri, Nov 25, 2011 at 11:01 AM, Ute Brönner <[email protected]> wrote:
> Hi folks,
>
> I kind of lost track of our latest discussions and had the feeling that this
> was partly outside the mailing group; so I will try to sum up what we were
> discussing.
> My latest try was to produce NetCDF for particle trajectories by writing
> out the concentration grid, which resulted in an 11 GB netCDF3 file :-(
>
> So we have different motivations for discussing particle trajectories and
> netCDF4.
>
> First question:
> Does anybody know whether, and if so when, writing netCDF4 will be
> incorporated into the NetCDF Java library? Or will we use Python with the
> help of Jython etc. (http://www.slideshare.net/onyame/mixing-python-and-java)
> to write netCDF4?
>
> Second question:
> Is there a de facto standard / proposal for writing particle trajectory data
> which could be CF:featureType: <whatever we agree on>? The suggestion below
> is not suitable because:
> 1) we don't track a particle the whole time; it may disappear and show up
>    again later, and if I have 1000 particles in time step 1 and 1000 in time
>    step 2 we cannot be sure these 1000 are the same as before.
> 2) I cannot know the number of time steps in advance.
>
> I would like something like
> dimensions:
>     particle = UNLIMITED; // because it may change each time step
>     time = UNLIMITED; // because I don't know
>
> then every variable is like
>     latitude (particle, time)
>     longitude (particle, time)
>
> and I might have
>     int number_particles_per_timestep(time);
>         :units = "1";
>         :long_name = "number particles per current timestep";
>         :CF:ragged_row_count = "particle";
>
> That some of you need to know which spill a particle came from may be solved
> with a third dimension, spill:
> dimensions:
>     spill = 3; // or how many one has
>     particle = UNLIMITED; // because it may change each time step
>     time = UNLIMITED; // because I don't know
>
>     particle (spill, time)
>
> then every variable is like
>     latitude (particle)
>     longitude (particle)
>
> How would one write this? With coordinates or as a hierarchical data
> structure? At least we need the ability to use several unlimited dimensions
> and the ragged-array feature.
>
> Third question:
> How can we compress big netCDF3 files? Or is it smarter to go for netCDF4
> directly with hierarchical data? As in my example above, I would need to
> write out an 11 GB file and then deflate it as described here
> http://www.unidata.ucar.edu/mailing_lists/archives/netcdf-java/2010/msg00095.html
> or with Rich's script; but is that really necessary?
>
>
> Hoping to get the discussion going again and that we agree on a standard
> quite soon!
> Have a nice weekend!
>
> Best,
> Ute
>
> -------- Original Message --------
> Subject: [CF-metadata] Particle Track Feature Type (was: Re: point
> observation data in CF 1.4)
> Date: Fri, 19 Nov 2010 04:15:35 +0100
> From: John Caron <[email protected]>
> To: [email protected] <[email protected]>
>
> I'm thinking that we need a new feature type for this. I'm calling it
> "particleTrack" but there's probably a better name.
>
> My reasoning is that the nested table representation of trajectories is:
>
> Table {
>   traj_id;
>   Table {
>     time;
>     lat, lon, z;
>     data;
>   }
> }
>
> but this case has the inner and outer table inverted:
>
> Table {
>   time;
>   Table {
>     particle_id;
>     lat, lon, z;
>     data;
>     data2;
>   }
> }
>
> So, following that line of thought, the possibilities in CDL are:
>
> 1) If avg number of particles ~ max number of particles at any time step,
> then one could use multidimensional arrays:
>
> dimensions:
>   maxParticles = 1000 ;
>   time = 7777 ; // may be UNLIMITED
>
> variables:
>
>   double time(time) ;
>
>   int particle_id(time, maxParticles) ;
>   float lon(time, maxParticles) ;
>   float lat(time, maxParticles) ;
>   float z(time, maxParticles) ;
>   float data(time, maxParticles) ;
>
> attributes:
>   :featureType = "particleTrack";
>
> Note that maxParticles is the max number of particles at any one time step,
> not the total number of particle tracks. The particle trajectories have to
> be found by examining the values of particle_id(time, maxParticles).
>
> 2) The CDL of the ragged case would look like:
>
> dimensions:
>   obs = 500000; // UNLIMITED
>   time = 7777 ;
>
> variables:
>   int time(time) ;
>   int rowSize(time) ;
>
>   int particle_id(obs) ;
>   float lon(obs) ;
>   float lat(obs) ;
>   float z(obs) ;
>   float data(obs) ;
>
> attributes:
>   :featureType = "particleTrack";
>
> In this case, you don't have to know the max number of particles at any one
> time step, but you do need to know the number of time steps beforehand. The
> particle trajectories have to be found by examining the values of
> particle_id(obs). The particles at time step i are contained in the obs
> variables between start(i) and start(i) + rowSize(i).
>
> These layouts are optimized for processing all particles at a given time, and
> for sequentially processing time steps. If one wanted to process particle
> trajectories, that would be much slower. If you needed to do it a lot, you
> might want to rewrite the file. A more sophisticated application, possibly a
> server, could write an index to speed it up.
>
>
> _______________________________________________
> CF-metadata mailing list
> [email protected]
> http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
>
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Rich Signell
> Sent: Thursday, 18 August 2011 19:04
> To: Christopher Barker
> Cc: Ute Brönner; Ben Hetland; Mark Reed; Nils Rune Bodsberg; CJ
> Beegle-Krause; Caitlin O'Connor; Alex Hadjilambris; Rob Hetland
> Subject: Re: netcdf for particle trajectories
>
> Chris,
>
>>>> so I'll make part of my homework to deliver you a Python script
>>>> using Whitaker's NetCDF4 that writes a sample file.
>>
>> How did this go, Rich?
>
> Yes, I took Rob Hetland's Python short course, and yes, I wrote a small
> example showing how to take NetCDF3 particle tracking output and create a
> compressed NetCDF4 file with chunking. I just forgot to send it. ;-)
>
> Note: You can get an OPeNDAP-enabled NetCDF4 Python module for both 32- and
> 64-bit Windows from:
> http://www.lfd.uci.edu/~gohlke/pythonlibs/
>
> -Rich
>>
>> We're getting closer to a prototype file (i.e. we've got GNOME writing
>> something, but it still needs some tweaking). I'll send out an example
>> when I think we're close.
>>
>> One new issue:
>>
>> In GNOME, we have the concept of any number of "spills" -- each spill
>> is a set of particles that usually share some properties.
>>
>> So we're trying to figure out how to capture that. Two ideas:
>>
>> 1) each spill is a unique set of data -- but I think that it would only
>> be possible to do this by using a convention on the variable names:
>>
>> data_1
>> particle_count_1
>> longitude_1
>> latitude_1
>> ...
>>
>> data_2
>> particle_count_2
>> longitude_2
>> latitude_2
>> ...
>>
>> That seems pretty ugly. Could netcdf4's "hierarchical data" help us here?
>> Maybe this provides the motivation to use it.
>>
>> Option two:
>>
>> put all the particles in one big array, but identify the different "spills"
>> by particle ID:
>>
>> ID_range_1 = 0-1000
>> ID_range_2 = 1000-2000
>> ...
>>
>> then they could get split up by the client software, if desired, or
>> the separate spills could be ignored, and it could all be treated as one.
>>
>> -- thoughts?
>>
>>
>> --
>> Christopher Barker, Ph.D.
>> Oceanographer
>>
>> Emergency Response Division
>> NOAA/NOS/OR&R            (206) 526-6959   voice
>> 7600 Sand Point Way NE   (206) 526-6329   fax
>> Seattle, WA  98115       (206) 526-6317   main reception
>>
>> [email protected]
>>
>
>
>
> --
> Dr. Richard P. Signell   (508) 457-2229
> USGS, 384 Woods Hole Rd.
> Woods Hole, MA 02543-1598
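
P.S. For the ragged layout quoted above (option 2 in John Caron's message),
start(i) is not a variable in the CDL; I am assuming it is simply the sum of
rowSize over all earlier time steps. Here is a rough pure-Python sketch of
that indexing, and of regrouping the observations into per-particle tracks.
The function names and the sample numbers are made up for illustration, and
plain lists stand in for the netCDF variables.

```python
from collections import defaultdict


def row_starts(row_size):
    """Return start(i) for every time step (assumed: running sum of rowSize)."""
    starts = []
    total = 0
    for count in row_size:
        starts.append(total)
        total += count
    return starts


def particles_at(step, row_size, *obs_vars):
    """Slice each obs-dimensioned variable to the particles of one time step."""
    start = row_starts(row_size)[step]
    stop = start + row_size[step]
    return tuple(var[start:stop] for var in obs_vars)


def trajectories(times, row_size, particle_id, lon, lat):
    """Group observations by particle_id into (time, lon, lat) tracks."""
    tracks = defaultdict(list)
    for step, start in enumerate(row_starts(row_size)):
        for k in range(start, start + row_size[step]):
            tracks[particle_id[k]].append((times[step], lon[k], lat[k]))
    return dict(tracks)


# Made-up sample data: 3 time steps holding 2, 3 and 1 particles.
times = [0, 600, 1200]
row_size = [2, 3, 1]
particle_id = [7, 8, 7, 8, 9, 9]
lon = [5.0, 5.1, 5.2, 5.3, 5.4, 5.5]
lat = [60.0, 60.1, 60.2, 60.3, 60.4, 60.5]

ids, lons = particles_at(1, row_size, particle_id, lon)
tracks = trajectories(times, row_size, particle_id, lon, lat)
```

Reading one time step is a single contiguous slice, whereas building the
tracks touches every observation, which matches the remark that processing
trajectories is the slow direction for this layout.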

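
P.P.S. Chris Barker's "option two" (one big particle array, with spills
identified by particle ID range) could be handled on the client side roughly
as below. This is only a sketch: the ranges in the message are written
"0-1000" and "1000-2000", which overlap at 1000, so I am assuming half-open
ranges (0..999 and 1000..1999). The names SPILL_LOWER_BOUNDS,
SPILL_UPPER_BOUND, spill_of and split_by_spill are hypothetical and not part
of GNOME or any proposed convention.

```python
from bisect import bisect_right

# Assumed half-open ID ranges [lower, upper): spill 1 holds IDs 0..999,
# spill 2 holds IDs 1000..1999. Only the lower bounds are needed for lookup.
SPILL_LOWER_BOUNDS = [0, 1000]
SPILL_UPPER_BOUND = 2000


def spill_of(particle_id):
    """Return the 1-based spill number whose ID range contains particle_id."""
    if not 0 <= particle_id < SPILL_UPPER_BOUND:
        raise ValueError(f"particle ID {particle_id} is outside all spill ranges")
    return bisect_right(SPILL_LOWER_BOUNDS, particle_id)


def split_by_spill(particle_ids, values):
    """Group (particle_id, value) pairs by spill number."""
    groups = {}
    for pid, value in zip(particle_ids, values):
        groups.setdefault(spill_of(pid), []).append((pid, value))
    return groups


# Made-up sample data for illustration only.
sample_ids = [3, 1000, 999, 1500]
sample_lon = [5.0, 5.1, 5.2, 5.3]
groups = split_by_spill(sample_ids, sample_lon)
```

A client that does not care about spills can skip the split and treat the
arrays as one set, as the message suggests.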