John,

Thanks for putting this together. This is pretty close to what I was thinking.

I'm thinking that we need a new feature type for this. I'm calling it
"particleTrack" but there's probably a better name.

Something that captures the sense of a collection of particles would be better:

particleCollectionTrack ?

Not that I like that, either...

but this case has the inner and outer table inverted:
yup.

1) If avg number of particles ~ max number of particles at any time
step, then one could use multidimensional arrays:

2) The CDL of the ragged case would look like:

Are you proposing that both options should be supported, or that we need to choose one? If we need to choose, I'd say the ragged representation is the way to go -- it's more flexible. If you do have a model that adds and removes particles, then you'd need to parse out the ID to follow a particular particle anyway.

Which makes me think -- if we support the 2-d array approach, I think it should be used only for data where the particle corresponding to a given index along the particles axis does not change. If it does, then you might as well use the ragged array approach.

variables:
int time(time) ;
int rowSize(time) ;

How about:

int time_step_index(time);

That would give you an index into the nth time step easily -- so you could very quickly grab a given time step, without having to add up all the rowSize values up to that time step.

rowSize and time_step_index hold redundant data, which is not a good idea, as they could get out of sync. If I were to choose one, it would be time_step_index, since you can compute rowSize simply by subtracting adjacent values of time_step_index.
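
To make the redundancy concrete, here is a rough Python sketch of converting between the two. The function names are made up for illustration, plain lists stand in for the netCDF variables, and I'm assuming time_step_index holds the offset of the first particle of each time step. Note that recovering the last rowSize also needs the total length of the obs dimension.

```python
def start_index_from_row_size(row_size):
    """Running offsets: where each time step begins in the flat obs arrays."""
    starts = []
    total = 0
    for n in row_size:
        starts.append(total)
        total += n
    return starts


def row_size_from_start_index(start_index, n_obs):
    """Difference of adjacent offsets; n_obs closes off the last time step."""
    bounds = list(start_index) + [n_obs]
    return [bounds[i + 1] - bounds[i] for i in range(len(start_index))]


# Three time steps holding 3, 2 and 4 particles (made-up numbers).
row_size = [3, 2, 4]
time_step_index = start_index_from_row_size(row_size)
recovered = row_size_from_start_index(time_step_index, sum(row_size))
```

Either variable can be rebuilt from the other, so storing only one of them avoids the out-of-sync problem.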

The particles at time step i are contained
in the obs variables from start(i) to start(i) + rowSize(i).

Where do you get "start"? Did you intend to include that? (I assume it's like my time_step_index.)
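
If "start" is that kind of offset array, then grabbing one time step is a single slice. A minimal sketch, assuming the upper bound is exclusive and using a made-up particle_id list in place of a real obs variable:

```python
def particles_at_step(obs, start, row_size, i):
    """Return the part of a flat obs array that belongs to time step i."""
    first = start[i]
    return obs[first:first + row_size[i]]


# Hypothetical data: three time steps with 3, 2 and 4 particles.
row_size = [3, 2, 4]
start = [0, 3, 5]
particle_id = [10, 11, 12, 10, 12, 10, 12, 13, 14]

step_1 = particles_at_step(particle_id, start, row_size, 1)
```

Here step_1 holds the IDs present at the second time step, read without summing any earlier rowSize values.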

These layouts are optimized for processing all particles at a given
time, and for sequentially processing time steps.

Which are the most common use-cases.

If one wanted to
process particle trajectories, that would be much slower. If you needed
to do it a lot, you might want to rewrite the file. A more sophisticated
application, possibly a server, could write an index to speed it up.

yup.
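
For what it's worth, such an index could be as simple as a map from particle ID to the positions where that particle shows up. This is only a sketch under my own assumptions: the function name is invented, the IDs live in a flat particle_id list, and start holds the offset of each time step as discussed above.

```python
from bisect import bisect_right
from collections import defaultdict


def build_trajectory_index(particle_id, start):
    """Map each particle ID to a list of (time step, obs position) pairs."""
    index = defaultdict(list)
    for pos, pid in enumerate(particle_id):
        # The time step owning pos is the last one whose offset is <= pos.
        step = bisect_right(start, pos) - 1
        index[pid].append((step, pos))
    return dict(index)


# Hypothetical data: particle 11 disappears after step 0, 13 and 14 appear at step 2.
start = [0, 3, 5]
particle_id = [10, 11, 12, 10, 12, 10, 12, 13, 14]
index = build_trajectory_index(particle_id, start)
track_12 = index[12]
```

One pass over the file builds the index; after that, following a single particle is a lookup rather than a scan of every time step.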

One thing I'm not clear on:

Do the netCDF libs (the C lib in particular) have any built-in support for ragged arrays, or does the client code have to handle that?

-Chris



--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

[email protected]
_______________________________________________
CF-metadata mailing list
[email protected]
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
