A couple of quick comments: I think we're close here, so that's good. I'm not that clear on where there are decisions left to be made, but I'll highlight two:
...

> Your aim is to describe the network alone.
> ...
> a collection of timeseries is stored as a data variable with a single
> dimension of time and a single dimension of space.

I don't see a conflict here -- if you can describe the network (geometry), then you can associate data with it (UGRID uses indexes into cells, nodes, etc.; this should be equally applicable).

> You would like to have SOMETHING alone in the file, just to describe the
> network itself. CF doesn't do this at present (domain without data),

Doesn't a set of coordinate variables essentially do that? i.e. you can define a rectangular grid -- even if there is no data on it. And you can certainly do that with UGRID, which is another standard, but I don't think it conflicts with CF.

> Taking your previous comments into account (I'll come back to them below),
> as a modified version of what I suggested before, here's a possible way to
> handle this case, for a small number (3) of linestrings:

That looks good to me, I think...

> data:
> SOMETHING=2, 4, 3;
> lon=0, 1, 0, -1, -2, -3, 2, 3, 4;
> lat=51, 52, 51, 50, 50, 49, 55, 55, 56;

I'm confused about what this is.

> These simple geometries can be regarded as a more complex alternative to
> cell bounds - each timeseries has a complicated geometry of nodes and
> lines, but logically it's still a single "cell".

Yup.

> For the sake of applications which can read CF but don't understand simple
> geometries, it might be a good idea in addition to provide a
> "representative" location for each timeseries, as
> representative_lat(station) and representative_lon(station), which could
> for instance be the mean of the node coordinates for each geometry.

We do that in UGRID, too -- I think it's even required (and called coordinates, actually). It may make little sense with complex geometries, but it can be handy.

> > You propose the index variable in order for the convention to be like ugrid.
> > However this still seems to me to be an unnecessary complexity and use of
> > space if you aren't going to have many shared nodes.
>
> To be frank, I'm not convinced by either argument. Regarding the first, in
> your example you don't reuse any points at all. Can you give an example
> where there is a lot of reuse?

The stream network example would be a good one. Also things like political boundaries -- they tend to be complex polygons with shared vertices.

> Regarding the second, I agree that it is a nuisance and unreliable to have
> to make comparisons with tolerance between floating-point numbers to
> determine equality. However, when you write a file, I suppose you can and
> would write exactly the same numbers for the coordinates of a node if it
> appears several times, wouldn't you? Thus the coincidence of nodes can be
> tested by *exact* equality of coordinates - no tolerance needed.

You still don't know for sure if the vertices are the SAME or if they just happen to be the same.

This is a tough one -- the "normal" GIS data model does not have shared nodes (that I know of), so perhaps we should follow that. But this lack of shared nodes is actually a substantial pain for GIS systems and uses -- there is a lot of complex "snapping" that needs to be done.

So I'm on the fence about this -- I'm pretty convinced shared nodes are a better model, but if we want to interact seamlessly with other GIS formats, we may be better off matching that data model.

> In my example above, I assumed the polygons have no holes in them, so I've
> omitted the inside/outside information. If needed, this information could
> also be an attribute e.g. SOMETHING:inout="OIIIOOOOIOO", with as many
> elements as there are polygons in total. Thinking again about it, I wonder
> whether this information is really needed. If you draw all the polygons,
> isn't it apparent which ones are inside anyway? When would you use this
> information?

It's not always clear.
If there is a hole in a polygon, you can figure it out, but if there is a lake in a land polygon, and an island in the lake, then it gets pretty tricky. I think shapefiles use clockwise vs anti-clockwise to indicate inside-outside, but IIUC, they are pretty limited with nested polygons, too.

> My scheme avoids the use of break values, which you're not very keen on
> yourselves, it sounds like.

I don't like break values either.

> You wrote
> - It is more difficult to extract a single geometry using this approach.
> It's not hard, though, and the same comment would apply to the CF
> contiguous ragged array representation.

Yes -- you can represent a ragged array by either specifying the start-index of each "row", or by specifying the size of each row. CF specifies the size of each row. I think that's a worse way to do it -- it's similar if you are looping through from the start, but much harder to get an arbitrary row in the middle -- but I've gone with the CF way for other stuff [1] because it's better not to have two ways to do the same thing. So we might as well stick with it here, too.

-CHB

[1] a netcdf format for particle tracking model output:
https://github.com/NOAA-ORR-ERD/nc_particles

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

[email protected]
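
For what it's worth, the ragged-array point above can be sketched in a few lines of plain Python. This is only an illustration, not anything from the CF or UGRID specs: it reuses the numbers from the quoted three-linestring example and assumes that SOMETHING holds the number of nodes in each linestring (the values do sum to the length of lon and lat). The names `counts`, `starts`, `geometry` and `representative_point` are made up for the example. It shows that with count-per-row storage you need a running sum before you can pull out an arbitrary row, and it computes the "representative" location as the mean of the node coordinates, as suggested in the quoted text.

```python
from itertools import accumulate

# Values copied from the quoted example. Reading SOMETHING as
# "nodes per linestring" is an assumption made for this sketch.
counts = [2, 4, 3]
lon = [0, 1, 0, -1, -2, -3, 2, 3, 4]
lat = [51, 52, 51, 50, 50, 49, 55, 55, 56]

# Count-per-row storage (the CF way) does not say where row i begins,
# so derive the start index of every row with a running sum.
starts = [0, *accumulate(counts)][:-1]


def geometry(i):
    """Return the (lon, lat) nodes of linestring i."""
    begin = starts[i]
    end = begin + counts[i]
    return list(zip(lon[begin:end], lat[begin:end]))


def representative_point(i):
    """Mean of the node coordinates of linestring i, as (lon, lat)."""
    nodes = geometry(i)
    n = len(nodes)
    mean_lon = sum(x for x, _ in nodes) / n
    mean_lat = sum(y for _, y in nodes) / n
    return (mean_lon, mean_lat)


if __name__ == "__main__":
    print(starts)
    print(geometry(1))
    print(representative_point(1))
```

If the file stored `starts` directly, `geometry(i)` would need no running sum, which is the trade-off described above; computing it once on read costs very little, so sticking with the CF counts is not a big burden.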
_______________________________________________
CF-metadata mailing list
[email protected]
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
