On Wed, Feb 22, 2017 at 7:23 AM, David Blodgett <[email protected]> wrote:
> We will meet on google hangouts at 8am CT on March 7th. If you’d like to be
> added to the calendar invite, please let me know.
>

Please invite me -- though it's pretty early for me -- aren't timezones fun!
Darn round earth....

And you do mean 8am UTC-6 (USA Central Standard Time), yes?

-Chris

> *NetCDF - Simple Geometries Discussion*
> Scheduled: Mar 7, 2017, 8:00 AM to 9:00 AM CT
> Hopefully google hangouts will work. Please use this url:
> https://plus.google.com/hangouts/_/calendar/ZGJsb2RnZXR0Lmgyb0BnbWFpbC5jb20.c192p3bskticchduvubks6ug44
>
> - Dave
>
> On Feb 18, 2017, at 4:17 PM, Blodgett, David <[email protected]> wrote:
>
> Let's try this again. Sorry to spam everyone.
>
> http://doodle.com/poll/yaherucx2w3cd9y6
>
>> On Feb 6, 2017, at 11:29 AM, David Blodgett <[email protected]> wrote:
>>
>> Dear CF,
>>
>> I want to follow up on the conversation here with an alternative approach
>> suggested off list primarily between Jonathan and me. For this, I’m going to
>> focus on use cases satisfied and simplification of the proposal allowed by
>> not supporting those use cases. The changes below are largely driven by a
>> desire to better align this proposal with the technical details of the
>> prior art that is CF.
>>
>> If we:
>>
>> 1) don’t support node sharing, we can remove the complication of node -
>> coordinate indexing / indirection, simplifying the proposal pretty
>> significantly.
>>
>> 2) don’t use “break values” to indicate the separation between multi-part
>> geometries and polygon holes, we end up with a data model with an extra
>> dimension, but the NetCDF dimensions align with the natural dimensions of
>> the data.
>>
>> 3) use “count” instead of a “start pointer” approach, we are better
>> aligned with the existing DSG contiguous ragged array approach.
>>
>> Coming back to the three directions we could take this proposal from my
>> cover letter on February 2nd.
>>
>> 1. Direct use of Well-Known Text (WKT). In this approach, well known
>> text strings would be encoded using character arrays following a contiguous
>> ragged array approach to index the character array by geometry (or instance
>> in DSG parlance).
>>
>> 2. Implement the WKT approach using a NetCDF binary array. In this
>> approach, well known text separators (brackets, commas and spaces) for
>> multipoint, multiline, multipolygon, and polygon holes, would be encoded as
>> break type separator values like -1 for multiparts and -2 for holes.
>>
>> 3. Implement the fundamental dimensions of geometry data in NetCDF.
>> In this approach, additional dimensions and variables along those
>> dimensions would be introduced to represent geometries, geometry parts,
>> geometry nodes, and unique (potentially shared) coordinate locations for
>> nodes to reference.
>>
>> The alternative I’m outlining here moves in the direction of 3. We had
>> originally discounted it because it becomes very verbose and seems overly
>> complicated if support for coordinate sharing is a requirement. If the
>> three simplifications described above are used, then the third approach
>> seems more tenable.
>>
>> Jonathan has also suggested that: (these are in reaction to the CDL in my
>> letter from February 2nd)
>>
>> 1) Rename geom_coordinates as node_coordinates, for consistency with
>> UGRID.
>>
>> 2) Omit node_dimension. This is redundant, since the dimension can be
>> found by examining the node coordinate variables.
>>
>> 3) Prescribe numerous “codes” and assumptions in the specification
>> instead of letting them be described with attribute values.
>>
>> 4) It would be more consistent with CF and UGRID to use a single
>> container variable to hang all the topology/geometry information from.
>>
>> Which I, personally, am happy to accept if others don’t object.
>>
>> A couple other suggestions from Jonathan I want to discuss a bit more:
>>
>> 1) Rename geometry as topology and geom_type as topology_type.
>>
>> While I’d be open to something other than geom, topology is
>> odd. If this is really “node_collection_topology_type” I guess I could
>> be convinced, but would be curious how people react to this. (Especially in
>> relation to UGRID)
>>
>> 2) This extension is more appropriate as an extension to the concept of
>> cell bounds than the addition of a complex time-invariant type of discrete
>> sampling geometry.
>>
>> Having just re-read the cell bounds chapter, I think it
>> would over complicate the cell bounds to include this material. My basic
>> issue here is that these geometries do not necessarily have a reference
>> location. They are, rather, first order entities that need to be treated as
>> such. That said, it makes sense that these geometries are not necessarily a
>> good fit for the original intent of Discrete Sampling Geometries. Jonathan
>> suggested they may belong in their own chapter, which may be a good
>> alternative? My suggested CDL below might lead us in the direction of this
>> being a special type of auxiliary coordinate variable.
>>
>> This alternative starts to look like the CDL pasted below.
>>
>> Note that the issue of coordinates is sticking out like a sore thumb.
>> Below, I’ve attempted to reconcile Jonathan’s ideas regarding coordinates
>> with my thoughts about how these geometries are “first order entities” that
>> don’t have a single representative x and y. The spatial coordinates can be
>> said to reside in the system of geometries described in the “sf” container
>> variable? I realize this goes against the idea of coordinates a bit, but I
>> think it is holding with the spirit of the attribute?
>>
>> Finally, I’m glad to continue answering questions and debating things via
>> the list to a point, but I think it would be in our interest to arrange a
>> telecon to discuss this stuff further with a list of interested parties.
>> Feel free to follow up on list, but for decision making, let’s not let this
>> rabbit hole go too deep. I’ll plan on letting this and the other recent
>> action on this proposal settle with people for a week or two then start to
>> bring together a conference call (or calls depending on time zones). Please
>> respond to me off list if you are interested in being part of a call to
>> discuss.
>>
>> Regards,
>>
>> - Dave
>>
>> netcdf multipolygon_example {
>> dimensions:
>>   node = 47 ;
>>   part = 9 ;
>>   instance = 3 ;
>>   time = 5 ;
>>   strlen = 5 ;
>> variables:
>>   char instance_name(instance, strlen) ;
>>     instance_name:cf_role = "timeseries_id" ;
>>   double someVariable(instance) ;
>>     someVariable:long_name = "a variable describing a single-valued attribute of a polygon" ;
>>     someVariable:coordinates = "sf" ; // or "instance_name"?
>>   int time(time) ;
>>     time:units = "days since 2000-01-01" ;
>>   double someData(instance, time) ;
>>     someData:coordinates = "time sf" ; // or "time instance_name"?
>>     someData:featureType = "timeSeries" ;
>>     someData:geometry = "sf" ;
>>   int sf ; // containing variable -- datatype irrelevant because no data
>>     sf:geom_type = "multipolygon" ; // could be node_topology_type?
>>     sf:node_count_variable = "node_count" ;
>>     sf:node_coordinates = "x y" ;
>>     sf:part_count = "part_node_count" ;
>>     sf:part_type = "part_type" ; // Not required unless polygons with holes present.
>>     sf:outer_ring_order = "anticlockwise" ; // not required if written in spec?
>>     sf:closure_convention = "last_node_equals_first" ; // not required if written in spec?
>>     sf:outer_type_code = 0 ; // not required if written in spec?
>>     sf:inner_type_code = 1 ; // not required if written in spec?
>>   int node_count(instance) ;
>>     node_count:long_name = "count of coordinates in each instance geometry" ;
>>   int part_node_count(part) ;
>>     part_node_count:long_name = "count of coordinates in each geometry part" ;
>>   int part_type(part) ;
>>     part_type:long_name = "type of each geometry part" ;
>>   double x(node) ;
>>     x:units = "degrees_east" ;
>>     x:standard_name = "longitude" ; // or projection_x_coordinate
>>     x:cf_role = "geometry_x_node" ;
>>   double y(node) ;
>>     y:units = "degrees_north" ;
>>     y:standard_name = "latitude" ; // or projection_y_coordinate
>>     y:cf_role = "geometry_y_node" ;
>>
>> // global attributes:
>>     :Conventions = "CF-1.8" ;
>>
>> data:
>>
>>   instance_name =
>>     "flash",
>>     "bang",
>>     "pow" ;
>>
>>   someVariable = 1, 2, 3 ;
>>
>>   time = 1, 2, 3, 4, 5 ;
>>
>>   someData =
>>     1, 2, 3, 4, 5,
>>     1, 2, 3, 4, 5,
>>     1, 2, 3, 4, 5 ;
>>
>>   node_count = 25, 15, 7 ;
>>
>>   part_node_count = 5, 4, 4, 4, 4, 8, 6, 8, 4 ;
>>
>>   part_type = 0, 1, 1, 1, 0, 0, 0, 1, 0 ;
>>
>>   x = 0, 20, 20, 0, 0, 1, 10, 19, 1, 5, 7, 9, 5, 11, 13, 15, 11, 5, 9, 7,
>>     5, 11, 15, 13, 11, -40, -20, -45, -40, -20, -10, -10, -30, -45, -20,
>>     -30, -20, -20, -30, 30,
>>     45, 10, 30, 25, 50, 30, 25 ;
>>
>>   y = 0, 0, 20, 20, 0, 1, 5, 1, 1, 15, 19, 15, 15, 15, 19, 15, 15, 25, 25, 29,
>>     25, 25, 25, 29, 25, -40, -45, -30, -40, -35, -30, -10, -5, -20, -35,
>>     -20, -15, -25, -20, 20,
>>     40, 40, 20, 5, 10, 15, 5 ;
>> }
>>
>> On Feb 4, 2017, at 8:07 AM, David Blodgett <[email protected]> wrote:
>>
>> Dear Chris,
>>
>> Thanks for your thorough treatment of these issues. We have gone through
>> a similar thought process to arrive at the proposal we came up with. I’ll
>> answer as briefly as I can.
>>
>> 1) how would you translate between netcdf geometries and, say geo JSON?
>>
>> The thinking is that node coordinate sharing is optional. If the writer
>> wants to check or already knows that nodes share coordinates, then it’s
>> possible. Otherwise, it doesn’t have to be used. I’ve always felt that this
>> was important, but maybe not critical for a core NetCDF-CF data model. Some
>> offline conversation has led to an example that does not use it that may be
>> a good alternative, more on that later.
>>
>> 2) Break Values
>>
>> You really do have to hold your nose on the break values. The issue is
>> that you have to store that information somehow and it is almost worse to
>> create new variables to store the multi-part and hole/not hole information.
>> The alternative approach that’s forming up as mentioned above does break
>> the information out into additional variables but simplifies things
>> otherwise. In that case it doesn’t feel overly complex to me… so stay tuned
>> for more on this front.
>>
>> 3) Ragged Indexing
>>
>> Your thought process follows ours exactly. The key is that you either
>> have to create the “pointer” array as a first order of business or loop
>> over the counts ad nauseam. I’m actually leaning toward the counts for two
>> reasons. First, the counts approach is already in CF so is a natural fit
>> and will be familiar to developers in this space. Second, the issue of 0 vs
>> 1 indexing is annoying. In our proposal, we settled on 0 indexing because
>> it aligns with the idea of an offset, but it is still annoying and some
>> applications would always have to adjust that pointer array as a first
>> order of business.
>>
>> On to Bob’s comments.
>>
>> Regarding aligning with other data models / encodings, I guess this needs
>> to be unpacked a bit.
>>
>> 1) In this setting, simple features is a data model, not an encoding.
>> An encoding can implement part or all of a data model as is needed by the
>> use case(s) at hand. There is no problem with partial implementations; you
>> still get interoperability for the intended use cases.
>>
>> 2) Attempting to align with other encoding standards (UGRID and NetCDF-CF
>> are the primary ones here) is simply to keep the implementation patterns
>> similar and familiar. This may be a fool's errand, but is presumably good
>> for adoptability and consistency.
>>
>> So, I don’t see a problem with implementing important simple features
>> types in a way that aligns with the way the existing community standards
>> work.
>>
>> I don’t see this as ignoring existing standards at all. There is no open
>> community standard for binary encoding of geometries and related data that
>> passes the CF requirements of human readability and self-description. We
>> are adopting the appropriate data model and suggesting a new encoding that
>> will solve a lot of problems in the environmental modeling space.
>>
>> As we’ve discussed before, your "different approach” sounds great, but
>> seems like an exercise for a future effort that doesn’t attempt to align
>> with CF 1.7. Maybe what you suggest is a path forward for variable length
>> arrays in the CF 2.0 “vision in the mist”, but I don’t see it as a tenable
>> solution for CF 1.*.
>>
>> Best Regards,
>>
>> - Dave
>>
>> On Feb 3, 2017, at 3:31 PM, Chris Barker <[email protected]> wrote:
>>
>> a few thoughts. First, I think there are three core "issues" that need to
>> be resolved:
>>
>> 1) Coordinate indexing (indirection)
>>
>> the question of whether you have an array of "vertices" that the geometry
>> types index into to get their data:
>>
>> Advantages:
>>
>> - if a number of geometries share a lot of vertices, it can be more
>> efficient
>>
>> - the relationship between geometries that share vertices (i.e.
>> polygons that share a boundary) etc. is well defined. You don't need to
>> check for closeness, and maybe have a tolerance, etc.
>>
>> These were absolutely critical for UGRID for example -- a UGRID mesh is a
>> "single thing", NOT a collection of polygons that happen to share some
>> vertices.
>>
>> Disadvantages:
>>
>> - if the geometries do not share many vertices, it is less efficient.
>>
>> - there are additional code complications in "getting" the vertices of
>> the given geometry
>>
>> - it does not match the OGC data model.
>>
>> My 0.02 -- given my use cases, I tend to want the advantages -- but I
>> don't know that that's a typical use case. And I think it's a really good
>> idea to keep with the OGC data model where possible -- i.e. be able to
>> translate from netcdf to, say, geoJSON as losslessly as possible. Given
>> that I think it's probably a better idea not to have the indirection.
>>
>> However (to equivocate) perhaps the types of information people are
>> likely to want to store in netcdf are a subset of what the OGC
>> standards are designed for -- and for those use-cases, maybe shared
>> vertices are critical.
>>
>> One way to think about it -- how would you translate between netcdf
>> geometries and, say geo JSON:
>>
>> - nc => geojson would lose the shared index info.
>>
>> - geojson => nc -- would you try to reconstruct the shared vertices??
>> I'm thinking that would be a bit dangerous in the general case, because you
>> are adding information that you don't know is true -- are these a shared
>> vertex or two that just happen to be at the same location?
>>
>> > > Break values
>>
>> I don't really like break values as an approach, but with netcdf any
>> option will be ugly one way or another. So keeping with the WKT approach
>> makes sense to me. Either way you'll need custom code to unpack it. (BTW --
>> what does WellKnownBinary do?)
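
To make the "custom code to unpack it" point concrete, here is a minimal Python sketch of break-value unpacking. It is an illustration under stated assumptions, not part of the proposal: it assumes one geometry arrives as a flat list of non-negative node indices, with -1 starting a new part and -2 starting a hole (the "-1 for multiparts and -2 for holes" wording used elsewhere in this thread). The names `split_on_breaks`, `MULTIPART_BREAK` and `HOLE_BREAK` are hypothetical.

```python
MULTIPART_BREAK = -1  # assumed marker: the next values start a new part
HOLE_BREAK = -2       # assumed marker: the next values start a hole (interior ring)


def split_on_breaks(node_indices):
    """Split one geometry's flat index list into (is_hole, indices) parts.

    Assumes real node indices are >= 0, so the negative break values can
    never be confused with data.
    """
    parts = [(False, [])]  # the first part is always an exterior
    for value in node_indices:
        if value == MULTIPART_BREAK:
            parts.append((False, []))
        elif value == HOLE_BREAK:
            parts.append((True, []))
        else:
            parts[-1][1].append(value)
    return parts


# Made-up example: an outer ring, one hole in it, then a second polygon part.
flat = [0, 1, 2, 3, 0, -2, 4, 5, 6, 4, -1, 7, 8, 9, 7]
for is_hole, indices in split_on_breaks(flat):
    print("hole" if is_hole else "part", indices)
```

The sketch also shows why break values only work cleanly on an index array: placed directly in a coordinate array, -1 and -2 would collide with legitimate coordinate values.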
>>
>> > > Ragged indexing
>>
>> There are two "natural" ways to represent a ragged array:
>>
>> (a) store the length of each "row"
>>
>> (b) store the index to the beginning (or end) of each "row"
>>
>> CF already uses (a). However, working with it, I'm pretty convinced that
>> it's the "wrong" choice:
>>
>> If you want to know how long a given row is, that is really easy with
>> (a), and almost as easy with (b) (involves two indexes and a subtraction)
>>
>> However, if you want to extract a particular row: (b) makes this really
>> easy -- you simply access the slice of the array you want. With (a) you
>> need to loop through the entire "length_of_rows" array (up to the row of
>> interest) and add up the values to find the slice you need. Not a huge
>> issue, but it is an issue. In fact, in my code to read ragged arrays in
>> netcdf, the first thing I do is pre-compute the index-to-each-row, so I can
>> then use that to access individual rows for future access -- if you are
>> accessing via OpenDAP -- that's particularly helpful.
>>
>> So -- (b) is clearly (to me) the "best" way to do it -- but is it worth
>> introducing a second way to handle ragged arrays in CF? I would think yes,
>> but that would be offset if:
>>
>> - There is a bunch of existing library code that transparently handles
>> ragged arrays in netcdf (does netcdfJava have something? I'm pretty sure
>> Python doesn't -- certainly not in netCDF4)
>>
>> - That existing lib code would be advantageous to leverage for code
>> reading features: I suspect that there will have to be enough custom code
>> that the ragged array bits are going to be the least of it.
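
The trade-off between (a) and (b) can be sketched in a few lines of Python. This is only an illustration with made-up data and hypothetical helper names (`counts_to_starts`, `starts_to_counts`, `get_row`); it is not code from any netCDF library. It shows the pre-computation step described above: one pass over the counts yields 0-based start offsets, after which any row is a plain slice.

```python
def counts_to_starts(counts):
    """Option (a) -> option (b): running sum of row lengths, 0-based."""
    starts, total = [], 0
    for n in counts:
        starts.append(total)  # row i starts where rows 0..i-1 end
        total += n
    return starts


def starts_to_counts(starts, total_length):
    """Option (b) -> option (a): two indexes and a subtraction per row."""
    ends = starts[1:] + [total_length]
    return [end - start for start, end in zip(starts, ends)]


def get_row(values, starts, i):
    """Extract row i from the flat array using the start offsets."""
    end = starts[i + 1] if i + 1 < len(starts) else len(values)
    return values[starts[i]:end]


values = [10, 11, 12, 20, 21, 30, 31, 32, 33]  # made-up flat data
counts = [3, 2, 4]                             # option (a): row lengths
starts = counts_to_starts(counts)              # option (b): start offsets
print(starts, get_row(values, starts, 1))
```

Either layout can be derived from the other in a single pass, so the choice mostly affects what a reader has to compute before its first random access.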
>>
>> So I'm for the "new" way of representing ragged arrays
>>
>> -CHB
>>
>> On Fri, Feb 3, 2017 at 11:41 AM, Bob Simons - NOAA Federal <[email protected]> wrote:
>>
>> Then, isn't this proposal just the first step in the creation of a new
>> model and a new encoding of Simple Features, one that is "align[ed] ...
>> with as many other encoding standards in this space as is practical"? In
>> other words, yet another standard for Simple Features?
>>
>> If so, it seems risky to me to take just the first (easy?) step "to
>> support the use cases that have a compelling need today" and not solve the
>> entire problem. I know the CF way is to just solve real, current needs, but
>> in this case it seems to risk a head slap moment in the future when we
>> realize that, in order to deal with some new simple feature variant, we
>> should have done things differently from the beginning?
>>
>> And it seems odd to reject existing standards that have been so
>> painstakingly hammered out, in favor of starting the process all over
>> again. We follow existing standards for other things (e.g., IEEE-754 for
>> representing floating point numbers in binary files), why can't we follow
>> an existing Simple Features standard?
>>
>> ---
>>
>> Rather than just be a naysayer, let me suggest a very different
>> alternative:
>>
>> There are several projects in the CF realm (e.g., this Simple Features
>> project, Discrete Sampling Geometry (DSG), true variable-length Strings,
>> ugrid(?)) which share a common underlying problem: how to deal with
>> variable-length multidimensional arrays: a[b][c], where the length of the c
>> dimension may be different for different b indices.
>>
>> DSG solved this (5 different ways!), but only for DSG.
>>
>> The Simple Features proposal seeks to solve the problem for Simple
>> Features.
>>
>> We still have no support for Unicode variable-length Strings.
>>
>> Instead of continuing to solve the variable-length problem a different
>> way every time we confront it, shouldn't we solve it once, with one small
>> addition to the standard, and then use that solution repeatedly?
>>
>> The solution could be a simple variant of one of the DSG solutions, but
>> generalized so that it could be used in different situations.
>>
>> An encoding standard and built-in support for variable-length data arrays
>> in netcdf-java/c would solve a lot of problems, now and in the future.
>>
>> Some work on this is already done: I think the netcdf-java API already
>> supports variable-length arrays when reading netcdf-4 files.
>>
>> For Simple Features, the problem would reduce to: store the feature
>> (using some specified existing standard like WKT or WKB) in a
>> variable-length array.
>>
>> On Fri, Feb 3, 2017 at 9:07 AM, <[email protected]> wrote:
>>
>> Date: Fri, 3 Feb 2017 11:07:00 -0600
>> From: David Blodgett <[email protected]>
>> To: Bob Simons - NOAA Federal <[email protected]>
>> Cc: CF Metadata <[email protected]>
>> Subject: Re: [CF-metadata] Extension of Discrete Sampling Geometries
>> for Simple Features
>> Message-ID: <[email protected]>
>> Content-Type: text/plain; charset="utf-8"
>>
>> Dear Bob,
>>
>> I'll just take these in line.
>>
>> 1) noted. We have been trying to figure out what to do with the point
>> featureType and I think leaving it more or less alone is a viable path
>> forward.
>>
>> 2) This is not an exact replica of WKT, but rather a similar approach to
>> WKT. As I stated, we have followed the ISO simple features data model and
>> well known text feature types in concept, but have not used the same
>> standardization formalisms. We aren't advocating for supporting "all of"
>> any standard but are rather attempting to support the use cases that have a
>> compelling need today while aligning this with as many other encoding
>> standards in this space as is practical.
>> Hopefully that answers your question, sorry if it's vague.
>>
>> 3) The google doc linked in my response contains the encoding we are
>> proposing as a starting point for conversation: <http://goo.gl/Kq9ASq>
>> I want to stress, as a starting point for discussion. I expect that this
>> proposal will change drastically before we're done.
>>
>> 4) Absolutely envision tools doing what you say, convert to/from standard
>> spatial formats and NetCDF-CF geometries. We intend to introduce an R and a
>> Python implementation that does exactly as you say along with whatever form
>> this standard takes in the end. R and Python were chosen as the team that
>> brought this together are familiar with those two languages, additional
>> implementations would be more than welcome.
>>
>> 5) We do include a "geometry" featureType similar to the "point"
>> featureType. Thus our difficulty with what to do with the "point"
>> featureType. You are correct, there are lots of non timeSeries applications
>> to be solved and this proposal does intend to support them (within the
>> existing DSG constructs).
>>
>> Thanks for your questions, hopefully my answers close some gaps for you.
>>
>> - Dave
>>
>> > On Feb 3, 2017, at 10:47 AM, Bob Simons - NOAA Federal <[email protected]> wrote:
>> >
>> > 1) There is a vague comment in the proposal about possibly changing the
>> > point featureType. Please don't, unless the changes don't affect current
>> > uses of Point. There are already 1000's of files that use it. If this new
>> > system offers an alternative, then fine, it's an alternative. One of the
>> > most important and useful features of a good standard is backwards
>> > compatibility.
>> >
>> > 2) You advocate "Implement the WKT approach using a NetCDF binary
>> > array." Is this system then an exact encoding of WKT, neither a subset nor
>> > a superset? "Simple Features" are often not simple.
>> > If it is WKT (or something else), what is the standard you are
>> > following to describe the Simple Features (e.g., ISO/IEC 13249-3:2016 and
>> > ISO 19162:2015)?
>> > Does your proposal deviate in any way from the standard's capabilities?
>> > Do you advocate following the entire WKT standard, e.g., supporting all
>> > the feature types that WKT supports?
>> >
>> > 3) Since you are not using the WKT encoding, but creating your own,
>> > where is the definition of the encoding system you are using?
>> >
>> > 4) This is a little out of CF scope, but:
>> > Do you envision tools, notably, netcdf-c/java, having a writer function
>> > that takes in WKT and encodes the information in a file, and having a
>> > reader function that reads the file and returns WKT? Or is it your plan
>> > that the encoding/decoding is left to the user?
>> >
>> > 5) This proposal is for "Simple Features plus Time Series" (my phrase
>> > not yours). But aren't there lots of other uses of Simple Features? Will
>> > there be other proposals in the future for "Simple Features plus X" and
>> > "Simple Features plus Y"? If so, will CF eventually become a massive
>> > document where Simple Features are defined over and over again, but in
>> > different contexts? If so, wouldn't a better solution be to deal with
>> > Simple Features separately (as Postgres does by making a geometric data
>> > type?), and then add "Simple Features plus Time Series" as the first use of
>> > it?
>> >
>> > Thanks for answering these questions.
>> > Please forgive me if I missed parts of your proposal that answer these
>> > questions.
>> >
>> > On Thu, Feb 2, 2017 at 5:57 AM, <[email protected]> wrote:
>> > Date: Thu, 2 Feb 2017 07:57:36 -0600
>> > From: David Blodgett <[email protected]>
>> > To: <[email protected]>
>> > Subject: [CF-metadata] Extension of Discrete Sampling Geometries for
>> > Simple Features
>> > Message-ID: <[email protected]>
>> > Content-Type: text/plain; charset="utf-8"
>> >
>> > Dear CF Community,
>> >
>> > We are pleased to submit this proposal for your consideration and
>> > review. The cover letter we've prepared below provides some background and
>> > explanation for the proposed approach. The google doc here
>> > <http://goo.gl/Kq9ASq> is an excerpt of the CF
>> > specification with track changes turned on. Permissions for the document
>> > allow any google user to comment, so feel free to comment and ask questions
>> > in line.
>> >
>> > Note that I'm sharing this with you with one issue unresolved. What to
>> > do with the point featureType? Our draft suggests that it is part of a new
>> > geometry featureType, but it could be that we leave it alone and introduce
>> > a geometry featureType. This may be a minor point of discussion, but we
>> > need to be clear that this is an issue that still needs to be resolved in
>> > the proposal.
>> >
>> > Thank you for your time and consideration.
>> >
>> > Best Regards,
>> >
>> > David Blodgett, Tim Whiteaker, and Ben Koziol
>> >
>> > Proposed Extension to NetCDF-CF for Simple Geometries
>> >
>> > Preface
>> >
>> > The proposed addition to NetCDF-CF introduced below is inspired by a
>> > pre-existing data model governed by OGC and ISO as ISO 19125-1. More
>> > information on Simple Features may be found here.
>> > <https://en.wikipedia.org/wiki/Simple_Features> To the knowledge of the
>> > authors, it is consistent with ISO 19125-1 but has not been specified using
>> > the formalisms of OGC or ISO. Language used attempts to hold true to
>> > NetCDF-CF semantics while not conflicting with the existing standards
>> > baseline. While this proposal does not support the entire scope of the
>> > simple features ecosystem, it does support the core data types in most
>> > common use around the community.
>> >
>> > The other existing standard to mention is the UGRID convention
>> > <http://ugrid-conventions.github.io/ugrid-conventions/>. The authors
>> > have experience reading and writing UGRID and have designed the proposed
>> > structure in a way that is inspired by and consistent with it.
>> >
>> > Terms and Definitions
>> >
>> > (Taken from OGC 06-103r4 OpenGIS Implementation Specification for
>> > Geographic information - Simple feature access - Part 1: Common
>> > architecture <http://www.opengeospatial.org/standards/sfa>.)
>> >
>> > Feature: Abstraction of real world phenomena - typically a geospatial
>> > abstraction with associated descriptive attributes.
>> > Simple Feature: A feature with all geometric attributes described
>> > piecewise by straight line or planar interpolation between point sets.
>> > Geometry (geometric complex): A set of disjoint geometric primitives -
>> > one or more points, lines, or polygons that form the spatial representation
>> > of a feature.
>> >
>> > Introduction
>> >
>> > Discrete Sampling Geometries (DSGs) handle data from one (or a
>> > collection of) timeSeries (point), Trajectory, Profile, TrajectoryProfile
>> > or timeSeriesProfile geometries. Measurements are from a point (timeSeries
>> > and Profile) or points along a trajectory.
>> > In this proposal, we reuse the core DSG timeSeries type, which provides
>> > support for basic time series use cases, e.g., a timeSeries which is
>> > measured (or modeled) at a given point.
>> >
>> > Changes to Existing CF Specification
>> >
>> > In NetCDF-CF 1.7, Discrete Sampling Geometries separate dimensions and
>> > variables into two types: instance and element
>> > <http://cfconventions.org/cf-conventions/cf-conventions.html#_collections_instances_and_elements>.
>> > Instance refers to individual points, trajectories, profiles, etc. These
>> > would sometimes be referred to as features given that they are identified
>> > entities that can have associated attributes and be related to other
>> > entities. Element dimensions describe temporal or other dimensions to
>> > describe data on a per-instance basis. This proposal extends the DSG
>> > timeSeries featureType
>> > <http://cfconventions.org/cf-conventions/cf-conventions.html#_features_and_feature_types>
>> > such that the geospatial coordinates of
>> > the instances can be point, multi-point, line, multi-line, polygon, or
>> > multi-polygon geometries. Rather than overload the DSG contiguous ragged
>> > array encoding, designed with timeseries in mind, a geometry ragged array
>> > encoding is introduced in a new section 9.3.5. See this google doc for
>> > specific proposed changes. <http://goo.gl/Kq9ASq>
>> >
>> > Motivation
>> >
>> > DSGs have no system to define a geometry (polyline, polygon, etc.,
>> > other than point) and an association with a time series that applies over
>> > that entire geometry, e.g., the expected rainfall in this watershed polygon
>> > for some period of time is 10 mm.
>> > As suggested in the last paragraph of
>> > section 9.1, current practice is to assign a representative point or just
>> > use an ID and forgo spatial information within a NetCDF-CF file. In order
>> > to satisfy a number of environmental modeling use cases, we need a way to
>> > encode a geometry (point, line, polygon, multi-point, multi-line, or
>> > multi-polygon) that is the static spatial feature representation to which
>> > one or more timeSeries can be associated. In this proposal, we provide an
>> > encoding to define collections of simple feature geometries. It interfaces
>> > cleanly with the existing DSG specification, enabling DSGs and Simple
>> > Geometries to be used concurrently.
>> >
>> > Looking Forward
>> >
>> > This proposal is a compromise solution that attempts to stay consistent
>> > with CF ideals and fit within the structure of the existing specification
>> > with minimal disruption. Line and polygon data types often require variable
>> > length arrays. Development of this proposal has brought to light the need
>> > for a general abstraction for variable length arrays in NetCDF-CF. Such a
>> > general abstraction would necessarily be reusable for character arrays,
>> > ragged arrays of time series, and ragged arrays of geometry nodes, as well
>> > as any other ragged data structures that may come up in the future. This
>> > proposal does not introduce such a general ragged array abstraction but
>> > does not preclude such a development in the future.
>> >
>> > Three Alternative Approaches
>> >
>> > Respecting the human readability ideal of NetCDF-CF, the development of
>> > this proposal started from a human readable format for geometries known as
>> > Well Known Text <https://en.wikipedia.org/wiki/Well-known_text>. We
>> > considered three high level design approaches while developing this
>> > proposal.
>> >
>> > 1. Direct use of Well-Known Text (WKT). In this approach, well known text
>> > strings would be encoded using character arrays following a contiguous
>> > ragged array approach to index the character array by geometry (or instance
>> > in DSG parlance).
>> > 2. Implement the WKT approach using a NetCDF binary array. In this
>> > approach, well known text separators (brackets, commas and spaces) for
>> > multipoint, multiline, multipolygon, and polygon holes, would be encoded as
>> > break type separator values like -1 for multiparts and -2 for holes.
>> > 3. Implement the fundamental dimensions of geometry data in NetCDF. In
>> > this approach, additional dimensions and variables along those dimensions
>> > would be introduced to represent geometries, geometry parts, geometry
>> > nodes, and unique (potentially shared) coordinate locations for nodes to
>> > reference.
>> >
>> > Selected Approach
>> >
>> > The first approach was seen as too opaque to stay true to the CF ideal
>> > of complete self-description. The third approach seemed needlessly verbose
>> > and difficult to implement. The second approach was selected for the
>> > following reasons:
>> >
>> > - The second approach is just as or more human-readable than the third.
>> > - Use of break values keeps geometries relatively atomic.
>> > - Will be familiar to developers who are familiar with the WKT geometry
>> >   format.
>> > - Character arrays, which are needed for options one and three, are
>> >   cumbersome to use in some programming languages in common use with NetCDF.
>> > - Break values replace the need for extraneous variables related to
>> >   multi-part and polygon holes (interiors). Multi-part geometries are
>> >   generally an exception and excessive instrumentation to support them
>> >   should be discounted.
>> >
>> > Example: Representation of WKT-Style Polygons in a NetCDF-3
>> > timeSeries featureType
>> >
>> > Below is sample CDL demonstrating how polygons are encoded in NetCDF-3
>> > using a contiguous ragged array-like encoding. There are three details to
>> > note in the example below.
>> >
>> > - The attribute contiguous_ragged_dimension with value of a dimension in the file.
>> > - The geom_coordinates attribute with a value containing a space separated string of variable names.
>> > - The cf_role geometry_x_node and geometry_y_node.
>> >
>> > These three attributes form a system to fully describe collections of multi-polygon feature geometries. Any variable that has the contiguous_ragged_dimension attribute contains integers that indicate the 0-indexed starting position of each geometry along the instance dimension. Any variable that uses the dimension referenced in the contiguous_ragged_dimension attribute can be interpreted using the values in the variable containing the contiguous_ragged_dimension attribute. The variables referenced in the geom_coordinates attribute describe spatial coordinates of geometries. These variables can also be identified by the cf_roles geometry_x_node and geometry_y_node. Note that the example below also includes a mechanism to handle multi-polygon features that also contain holes.
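For readers who would rather see the indexing scheme as code, here is a minimal Python sketch of how the start positions and break values might be decoded. It is only an illustration under stated assumptions, not part of the proposal: the helper names split_on and parse_geometries are hypothetical, the arrays are copied by hand from the CDL example in this message, and reading them from an actual NetCDF file is not shown.

```python
# Break values as declared on coordinate_index in the CDL example.
MULTIPART_BREAK = -1  # separates the polygons of a multi-polygon
HOLE_BREAK = -2       # separates a polygon's rings (exterior first, then holes)

# Values copied by hand from the CDL example in this message.
coordinate_index = [
    0, 1, 2, 3, 4, -2, 5, 6, 7, 8, -2, 9, 10, 11, 12, -2, 13, 14, 15, 16,
    -1, 17, 18, 19, 20, -1, 21, 22, 23, 24, 25, 26, 27, 28, -1, 29, 30,
    31, 32, 33, 34, -2, 35, 36, 37, 38, 39, 40, 41, 42, -1, 43, 44, 45, 46,
]
coordinate_index_start = [0, 30, 46]


def split_on(values, sentinel):
    """Split a flat list into sub-lists wherever the sentinel value occurs."""
    groups = [[]]
    for value in values:
        if value == sentinel:
            groups.append([])
        else:
            groups[-1].append(value)
    return groups


def parse_geometries(index, starts):
    """Return one entry per instance: a list of polygons, each a list of rings.

    Each ring is a list of node indices into the x and y variables.
    (Hypothetical helper, written only to illustrate the encoding.)
    """
    # Each instance runs from its start position to the next start (or the end).
    stops = list(starts[1:]) + [len(index)]
    geometries = []
    for start, stop in zip(starts, stops):
        chunk = index[start:stop]
        polygons = [
            split_on(part, HOLE_BREAK)
            for part in split_on(chunk, MULTIPART_BREAK)
        ]
        geometries.append(polygons)
    return geometries


geometries = parse_geometries(coordinate_index, coordinate_index_start)
```

With this layout, geometries[i][j][0] is the exterior ring of polygon j in instance i, and any further rings in geometries[i][j] are its holes. The node indices can then be used to look up values in x and y.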
>> >
>> > netcdf multipolygon_example {
>> > dimensions:
>> >   node = 47 ;
>> >   indices = 55 ;
>> >   instance = 3 ;
>> >   time = 5 ;
>> >   strlen = 5 ;
>> > variables:
>> >   char instance_name(instance, strlen) ;
>> >     instance_name:cf_role = "timeseries_id" ;
>> >   int coordinate_index(indices) ;
>> >     coordinate_index:geom_type = "multipolygon" ;
>> >     coordinate_index:geom_coordinates = "x y" ;
>> >     coordinate_index:multipart_break_value = -1 ;
>> >     coordinate_index:hole_break_value = -2 ;
>> >     coordinate_index:outer_ring_order = "anticlockwise" ;
>> >     coordinate_index:closure_convention = "last_node_equals_first" ;
>> >   int coordinate_index_start(instance) ;
>> >     coordinate_index_start:long_name = "index of first coordinate in each instance geometry" ;
>> >     coordinate_index_start:contiguous_ragged_dimension = "indices" ;
>> >   double x(node) ;
>> >     x:units = "degrees_east" ;
>> >     x:standard_name = "longitude" ; // or projection_x_coordinate
>> >     x:cf_role = "geometry_x_node" ;
>> >   double y(node) ;
>> >     y:units = "degrees_north" ;
>> >     y:standard_name = "latitude" ; // or projection_y_coordinate
>> >     y:cf_role = "geometry_y_node" ;
>> >   double someVariable(instance) ;
>> >     someVariable:long_name = "a variable describing a single-valued attribute of a polygon" ;
>> >   int time(time) ;
>> >     time:units = "days since 2000-01-01" ;
>> >   double someData(instance, time) ;
>> >     someData:coordinates = "time x y" ;
>> >     someData:featureType = "timeSeries" ;
>> > // global attributes:
>> >   :Conventions = "CF-1.8" ;
>> >
>> > data:
>> >
>> >  instance_name =
>> >   "flash",
>> >   "bang",
>> >   "pow" ;
>> >
>> >  coordinate_index = 0, 1, 2, 3, 4, -2, 5, 6, 7, 8, -2, 9, 10, 11, 12, -2, 13, 14, 15, 16,
>> >   -1, 17, 18, 19, 20, -1, 21, 22, 23, 24, 25, 26, 27, 28, -1, 29, 30, 31, 32, 33,
>> >   34, -2, 35, 36, 37, 38, 39, 40, 41, 42, -1, 43, 44, 45, 46 ;
>> >
>> >  coordinate_index_start = 0, 30, 46 ;
>> >
>> >  x = 0, 20, 20, 0, 0, 1, 10, 19, 1, 5, 7, 9, 5, 11, 13, 15, 11, 5, 9, 7,
>> >   5, 11, 15, 13, 11, -40, -20, -45, -40, -20, -10, -10, -30, -45, -20, -30, -20, -20, -30, 30,
>> >   45, 10, 30, 25, 50, 30, 25 ;
>> >
>> >  y = 0, 0, 20, 20, 0, 1, 5, 1, 1, 15, 19, 15, 15, 15, 19, 15, 15, 25, 25, 29,
>> >   25, 25, 25, 29, 25, -40, -45, -30, -40, -35, -30, -10, -5, -20, -35, -20, -15, -25, -20, 20,
>> >   40, 40, 20, 5, 10, 15, 5 ;
>> >
>> >  someVariable = 1, 2, 3 ;
>> >
>> >  time = 1, 2, 3, 4, 5 ;
>> >
>> >  someData =
>> >   1, 2, 3, 4, 5,
>> >   1, 2, 3, 4, 5,
>> >   1, 2, 3, 4, 5 ;
>> > }
>> >
>> > How To Interpret
>> >
>> > Starting from the timeSeries variables:
>> >
>> > 1. See CF-1.8 conventions.
>> > 2. See the timeSeries featureType.
>> > 3. Find the timeseries_id cf_role.
>> > 4. Find the coordinates attribute of data variables.
>> > 5. See that the variables indicated by the coordinates attribute have a cf_role geometry_x_node and geometry_y_node to determine that these are geometries according to this new specification.
>> > 6. Find the coordinate index variable with geom_coordinates that point to the nodes.
>> > 7. Find the variable with contiguous_ragged_dimension pointing to the dimension of the coordinate index variable to determine how to index into the coordinate index.
>> > 8. Iterate over polygons, parsing out geometries using the contiguous ragged start variable and coordinate index variable to interpret the coordinate data variables.
>> >
>> > Or, without reference to timeSeries:
>> >
>> > 1. See CF-1.8 conventions.
>> > 2. See the geom_type of multipolygon.
>> > 3. Find the variable with a contiguous_ragged_dimension matching the coordinate index variable's dimension.
>> > 4. See the geom_coordinates of x y.
>> > 5. Using the contiguous ragged start variable found in 3 and the coordinate index variable found in 2, geometries can be parsed out of the coordinate index variable and parsed using the hole and break values in it.
>> >
>> > ------------------------------
>> >
>> > Subject: Digest Footer
>> >
>> > _______________________________________________
>> > CF-metadata mailing list
>> > [email protected]
>> > http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
>> >
>> > ------------------------------
>> >
>> > End of CF-metadata Digest, Vol 166, Issue 3
>> > *******************************************
>> >
>> > --
>> > Sincerely,
>> >
>> > Bob Simons
>> > IT Specialist
>> > Environmental Research Division
>> > NOAA Southwest Fisheries Science Center
>> > 99 Pacific St., Suite 255A (New!)
>> > Monterey, CA 93940 (New!)
>> > Phone: (831)333-9878 (New!)
>> > Fax: (831)648-8440
>> > Email: [email protected]
>> >
>> > The contents of this message are mine personally and
>> > do not necessarily reflect any position of the
>> > Government or the National Oceanic and Atmospheric Administration.
>> > <>< <>< <>< <>< <>< <>< <>< <>< <><

--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

[email protected]
