My apologies, I forgot to turn on time zone support in the poll below. Please use this one instead: http://doodle.com/poll/eikarnt35tdm7igd
> On Feb 17, 2017, at 1:22 PM, David Blodgett <[email protected]> wrote: > > All, > > I haven’t heard much follow up, but here’s a doodle to coordinate a phone > conversation about this. I think we have west-coast US participants and EU > participants, so I chose times mid to late morning for me (midwest US). > > http://doodle.com/poll/eikarnt35tdm7igd > <http://doodle.com/poll/eikarnt35tdm7igd> > > Will make a call once a few people have expressed interest and we have a > clear day/time. > > Regards, > > - Dave > >> On Feb 6, 2017, at 11:29 AM, David Blodgett <[email protected] >> <mailto:[email protected]>> wrote: >> >> Dear CF, >> >> I want to follow up on the conversation here with an alternative approach >> suggested off list primarily between Jonathan and I. For this, I’m going to >> focus on use cases satisfied and simplification of the proposal allowed by >> not supporting those use cases. The changes below are largely driven by a >> desire to better align this proposal with the technical details of the prior >> art that is CF. >> >> If we: >> 1) don’t support node sharing, we can remove the complication of node - >> coordinate indexing / indirection, simplifying the proposal pretty >> significantly. >> 2) don’t use “break values” to indicate the separation between multi-part >> geometries and polygon holes, we end up with a data model with an extra >> dimension, but the NetCDF dimensions align with the natural dimensions of >> the data. >> 3) use “count” instead of a “start pointer” approach, we are better aligned >> with the existing DSG contiguous ragged array approach. >> >> Coming back to the three directions we could take this proposal from my >> cover letter on February 2nd. >>> Direct use of Well-Known Text (WKT). In this approach, well known text >>> strings would be encoded using character arrays following a contiguous >>> ragged array approach to index the character array by geometry (or instance >>> in DSG parlance). 
>>> Implement the WKT approach using a NetCDF binary array. In this approach, >>> well known text separators (brackets, commas and spaces) for multipoint, >>> multiline, multipolygon, and polygon holes, would be encoded as break type >>> separator values like -1 for multiparts and -2 for holes. >>> Implement the fundamental dimensions of geometry data in NetCDF. In this >>> approach, additional dimensions and variables along those dimensions would >>> be introduced to represent geometries, geometry parts, geometry nodes, and >>> unique (potentially shared) coordinate locations for nodes to reference. >> The alternative I’m outlining here moves in the direction of 3. We had >> originally discounted it because it becomes very verbose and seems overly >> complicated if support for coordinate sharing is a requirement. If the three >> simplifications described above are used, then the third approach seems more >> tenable. >> >> Jonathan has also suggested that: (these are in reaction to the CDL in my >> letter from February 2nd) >> 1) Rename geom_coordinates as node_coordinates, for consistency with UGRID. >> 2) Omit node_dimension. This is redundant, since the dimension can be found >> by >> examining the node coordinate variables. >> 3) Prescribe numerous “codes” and assumptions in the specification instead >> of letting them be described with attribute values. >> 4) It would be more consistent with CF and UGRID to use a single container >> variable to hang all the topology/geometry information from. >> >> Which I, personally, am happy to accept if others don’t object. >> >> A couple other suggestions from Jonathan I want to discuss a bit more: >> 1) Rename geometry as topology and geom_type as topology_type. >> While I’d be open to something other than geom, topology is odd. If >> this is really “node_collection_topology_type” I guess I could be convinced, >> but would be curious how people react to this. 
(Especially in relation to >> UGRID) >> 2) This extension is more appropriate as an extension to the concept of cell >> bounds than the addition of a complex time-invariate type of discrete >> sampling geometry. >> Having just re-read the cell bounds chapter, I think it would over >> complicate the cell bounds to include this material. My basic issue here is >> that these geometries do not necessarily have a reference location. They >> are, rather, first order entities that need to be treated as such. That >> said, it makes sense that these geometries are not necessarily a good fit >> for the original intent of Discrete Sampling Geometries. Jonathan suggested >> they may belong in their own chapter, which may be a good alternative? MY >> suggested CDL below might lead us in the direction of this being a special >> type of auxiliary coordinate variable. >> >> This alternative starts to look like the CDL pasted below. >> >> Note that the issue of coordinates is sticking out like a sore thumb. Below, >> I’ve attempted to reconcile Jonathan’s ideas regarding coordinates with my >> thoughts about how these geometries are “first order entities” that don’t >> have a single representative x and y. The spatial coordinates can be said to >> reside in the system of geometries described in the “sf” container variable? >> I realize this goes against the idea of coordinates a bit, but I think it is >> holding with the spirit of the attribute? >> >> Finally, I’m glad to continue answering questions and debating things via >> the list to a point, but I think it would be in our interest to arrange a >> telecom to discuss this stuff further with a list of interested parties. >> Feel free to follow up on list, but for decision making, let’s not let this >> rabbit hole go too deep. I’ll plan on letting this and the other recent >> action on this proposal settle with people for a week or two then start to >> bring together a conference call (or calls depending on time zones). 
Please
>> respond to me off list if you are interested in being part of a call to
>> discuss.
>>
>> Regards,
>>
>> - Dave
>>
>> netcdf multipolygon_example {
>> dimensions:
>>   node = 47 ;
>>   part = 9 ;
>>   instance = 3 ;
>>   time = 5 ;
>>   strlen = 5 ;
>> variables:
>>   char instance_name(instance, strlen) ;
>>     instance_name:cf_role = "timeseries_id" ;
>>   double someVariable(instance) ;
>>     someVariable:long_name = "a variable describing a single-valued attribute of a polygon" ;
>>     someVariable:coordinates = "sf" ; // or "instance_name"?
>>   int time(time) ;
>>     time:units = "days since 2000-01-01" ;
>>   double someData(instance, time) ;
>>     someData:coordinates = "time sf" ; // or "time instance_name"?
>>     someData:featureType = "timeSeries" ;
>>     someData:geometry = "sf" ;
>>   int sf ; // containing variable -- datatype irrelevant because no data
>>     sf:geom_type = "multipolygon" ; // could be node_topology_type?
>>     sf:node_count_variable = "node_count" ;
>>     sf:node_coordinates = "x y" ;
>>     sf:part_count = "part_node_count" ;
>>     sf:part_type = "part_type" ; // Not required unless polygons with holes present.
>>     sf:outer_ring_order = "anticlockwise" ; // not required if written in spec?
>>     sf:closure_convention = "last_node_equals_first" ; // not required if written in spec?
>>     sf:outer_type_code = 0 ; // not required if written in spec?
>>     sf:inner_type_code = 1 ; // not required if written in spec?
>>   int node_count(instance) ;
>>     node_count:long_name = "count of coordinates in each instance geometry" ;
>>   int part_node_count(part) ;
>>     part_node_count:long_name = "count of coordinates in each geometry part" ;
>>   int part_type(part) ;
>>     part_type:long_name = "type of each geometry part" ;
>>   double x(node) ;
>>     x:units = "degrees_east" ;
>>     x:standard_name = "longitude" ; // or projection_x_coordinate
>>     x:cf_role = "geometry_x_node" ;
>>   double y(node) ;
>>     y:units = "degrees_north" ;
>>     y:standard_name = "latitude" ; // or projection_y_coordinate
>>     y:cf_role = "geometry_y_node" ;
>> // global attributes:
>>   :Conventions = "CF-1.8" ;
>>
>> data:
>>
>>  instance_name =
>>   "flash",
>>   "bang",
>>   "pow" ;
>>
>>  someVariable = 1, 2, 3 ;
>>
>>  time = 1, 2, 3, 4, 5 ;
>>
>>  someData =
>>   1, 2, 3, 4, 5,
>>   1, 2, 3, 4, 5,
>>   1, 2, 3, 4, 5 ;
>>
>>  node_count = 25, 15, 7 ;
>>
>>  part_node_count = 5, 4, 4, 4, 4, 8, 6, 8, 4 ;
>>
>>  part_type = 0, 1, 1, 1, 0, 0, 0, 1, 0 ;
>>
>>  x = 0, 20, 20, 0, 0, 1, 10, 19, 1, 5, 7, 9, 5, 11, 13, 15, 11, 5, 9, 7,
>>   5, 11, 15, 13, 11, -40, -20, -45, -40, -20, -10, -10, -30, -45, -20,
>>   -30, -20, -20, -30, 30,
>>   45, 10, 30, 25, 50, 30, 25 ;
>>
>>  y = 0, 0, 20, 20, 0, 1, 5, 1, 1, 15, 19, 15, 15, 15, 19, 15, 15, 25, 25,
>>   29,
>>   25, 25, 25, 29, 25, -40, -45, -30, -40, -35, -30, -10, -5, -20, -35,
>>   -20, -15, -25, -20, 20,
>>   40, 40, 20, 5, 10, 15, 5 ;
>> }
>>
>>
>>> On Feb 4, 2017, at 8:07 AM, David Blodgett <[email protected]> wrote:
>>>
>>> Dear Chris,
>>>
>>> Thanks for your thorough treatment of these issues. We have gone through a
>>> similar thought process to arrive at the proposal we came up with. I’ll
>>> answer as briefly as I can.
>>>
>>> 1) how would you translate between netcdf geometries and, say, GeoJSON?
>>>
>>> The thinking is that node coordinate sharing is optional. If the writer
>>> wants to check or already knows that nodes share coordinates, then it’s
>>> possible.
Otherwise, it doesn’t have to be used. I’ve always felt that this >>> was important, but maybe not critical for a core NetCDF-CF data model. Some >>> offline conversation has led to an example that does not use it that may be >>> a good alternative, more on that later. >>> >>> 2) Break Values >>> >>> You really do have to hold your nose on the break values. The issue is that >>> you have to store that information somehow and it is almost worse to create >>> new variables to store the multi-part and hole/not hole information. The >>> alternative approach that’s forming up as mentioned above does break the >>> information out into additional variables but simplifies things otherwise. >>> In that case it doesn’t feel overly complex to me… so stay tuned for more >>> on this front. >>> >>> 3) Ragged Indexing >>> >>> Your thought process follows ours exactly. The key is that you either have >>> to create the “pointer” array as a first order of business or loop over the >>> counts ad nauseam. I’m actually leaning toward the counts for two reasons. >>> First, the counts approach is already in CF so is a natural fit and will be >>> familiar to developers in this space. Second, the issue of 0 vs 1 indexing >>> is annoying. In our proposal, we settled on 0 indexing because it aligns >>> with the idea of an offset, but it is still annoying and some applications >>> would always have to adjust that pointer array as a first order of >>> business. >>> >>> On to Bob’s comments. >>> >>> Regarding aligning with other data models / encodings, I guess this needs >>> to be unpacked a bit. >>> >>> 1) In this setting, simple features is a data model, not an encoding. An >>> encoding can implement part or all of a data model as is needed by the use >>> case(s) at hand. There is no problem with partial implementations you still >>> get interoperability for the intended use cases. 
>>> 2) Attempting to align with other encoding standards (UGRID and NetCDF-CF
>>> are the primary ones here) is simply to keep the implementation patterns
>>> similar and familiar. This may be a fool's errand, but is presumably good
>>> for adoptability and consistency.
>>> So, I don’t see a problem with implementing important simple features types
>>> in a way that aligns with the way the existing community standards work.
>>>
>>> I don’t see this as ignoring existing standards at all. There is no open
>>> community standard for binary encoding of geometries and related data that
>>> passes the CF requirements of human readability and self-description. We
>>> are adopting the appropriate data model and suggesting a new encoding that
>>> will solve a lot of problems in the environmental modeling space.
>>>
>>> As we’ve discussed before, your “different approach” sounds great, but
>>> seems like an exercise for a future effort that doesn’t attempt to align
>>> with CF 1.7. Maybe what you suggest is a path forward for variable length
>>> arrays in the CF 2.0 “vision in the mist”, but I don’t see it as a tenable
>>> solution for CF 1.*.
>>>
>>> Best Regards,
>>>
>>> - Dave
>>>
>>>
>>>> On Feb 3, 2017, at 3:31 PM, Chris Barker <[email protected]> wrote:
>>>>
>>>> a few thoughts. First, I think there are three core "issues" that need to
>>>> be resolved:
>>>>
>>>> 1) Coordinate indexing (indirection)
>>>>
>>>> the question of whether you have an array of "vertices" that the geometry
>>>> types index into to get their data:
>>>>
>>>> Advantages:
>>>> - if a number of geometries share a lot of vertices, it can be more
>>>> efficient
>>>> - the relationship between geometries that share vertices (i.e. polygons
>>>> that share a boundary) etc. is well defined. you don't need to check for
>>>> closeness, and maybe have a tolerance, etc.
>>>>
>>>> These were absolutely critical for UGRID for example -- a UGRID mesh is a
>>>> "single thing", NOT a collection of polygons that happen to share some
>>>> vertices.
>>>>
>>>> Disadvantages:
>>>> - if the geometries do not share many vertices, it is less efficient.
>>>> - there are additional code complications in "getting" the vertices of
>>>> the given geometry
>>>> - it does not match the OGC data model.
>>>>
>>>> My 0.02 -- given my use cases, I tend to want the advantages -- but I
>>>> don't know that that's a typical use case. And I think it's a really good
>>>> idea to keep with the OGC data model where possible -- i.e. be able to
>>>> translate from netcdf to, say, GeoJSON as losslessly as possible. Given
>>>> that I think it's probably a better idea not to have the indirection.
>>>>
>>>> However (to equivocate) perhaps the types of information people are likely
>>>> to want to store in netcdf are a subset of what the OGC standards are
>>>> designed for -- and for those use-cases, maybe shared vertices are
>>>> critical.
>>>>
>>>> One way to think about it -- how would you translate between netcdf
>>>> geometries and, say, GeoJSON:
>>>> - nc => geojson would lose the shared index info.
>>>> - geojson => nc -- would you try to reconstruct the shared vertices??
>>>> I'm thinking that would be a bit dangerous in the general case, because
>>>> you are adding information that you don't know is true -- are these a
>>>> shared vertex or two that just happen to be at the same location?
>>>>
>>>> > > Break values
>>>>
>>>> I don't really like break values as an approach, but with netcdf any
>>>> option will be ugly one way or another. So keeping with the WKT approach
>>>> makes sense to me. Either way you'll need custom code to unpack it. (BTW
>>>> -- what does WellKnownBinary do?)
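To make the "custom code to unpack it" point concrete, here is a minimal Python sketch of break-value unpacking. It assumes a flat integer array in which -1 separates the parts of a multi-part geometry and -2 introduces a polygon hole, as described in the cover letter; the exact array layout in the proposal may differ, and the function name `split_on_breaks` and the sample data are hypothetical.

```python
def split_on_breaks(values, part_break=-1, hole_break=-2):
    """Split a flat sequence into (is_hole, nodes) rings at break values.

    Assumption: part_break starts a new exterior part and hole_break starts
    an interior ring (hole) belonging to the preceding part.
    """
    rings = []
    current = []
    is_hole = False
    for v in values:
        if v == part_break or v == hole_break:
            # Close the ring collected so far and start a new one.
            rings.append((is_hole, current))
            current = []
            is_hole = (v == hole_break)
        else:
            current.append(v)
    rings.append((is_hole, current))
    return rings


# Hypothetical node indices: an exterior ring, one hole, then a second part.
encoded = [0, 1, 2, 3, 0, -2, 4, 5, 6, 4, -1, 7, 8, 9, 7]
rings = split_on_breaks(encoded)
for is_hole, nodes in rings:
    print("hole" if is_hole else "part", nodes)
```

A reader has to scan the whole array to find the breaks, which is part of why the thread treats break values as compact but awkward to index.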
>>>> >>>> > > Ragged indexing >>>> >>>> There are two "natural" ways to represent a ragged array: >>>> >>>> (a) store the length of each "row" >>>> (b) store the index to the beginning (or end) or each "row" >>>> >>>> CF already uses (a). However, working with it, I'm pretty convinced that >>>> it's the "wrong" choice: >>>> >>>> If you want to know how long a given row is, that is really easy with (a), >>>> and almost as easy with (b) (involves two indexes and a subtraction) >>>> >>>> However, if you want to extract a particular row: (b) makes this really >>>> easy -- you simply access the slice of the array you want. with (a) you >>>> need to loop through the entire "length_of_rows" array (up to the row of >>>> interest) and add up the values to find the slice you need. not a huge >>>> issue, but it is an issue. In fact, in my code to read ragged arrays in >>>> netcdf, the first thing I do is pre-compute the index-to-each-row, so I >>>> can then use that to access individual rows for future access -- if you >>>> are accessing via OpenDAP -- that's particular helpful. >>>> >>>> So -- (b) is clearly (to me) the "best" way to do it -- but is it worth >>>> introducing a second way to handle ragged arrays in CF? I would think yes, >>>> but that would be offset if: >>>> >>>> - There is a bunch of existing library code that transparently handles >>>> ragged arrays in netcdf (does netcdfJava have something? I'm pretty sure >>>> Python doesn't -- certainly not in netCDF4) >>>> >>>> - That that existing lib code would be advantageous to leverage for code >>>> reading features: I suspect that there will have to be enough custom code >>>> that the ragged array bits are going to be the least of it. 
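The trade-off between (a) and (b) can be sketched in a few lines of Python. This is only an illustration of the pre-computation step described above, not code from any CF library; the names `counts_to_starts` and `get_row` and the sample arrays are hypothetical.

```python
from itertools import accumulate


def counts_to_starts(counts):
    """Convert per-row counts (a) into 0-based start offsets (b).

    The returned list has one extra trailing entry (the total length),
    so row i is always flat[starts[i]:starts[i + 1]].
    """
    return [0] + list(accumulate(counts))


def get_row(flat, starts, i):
    """Extract row i with a single slice once the offsets are known."""
    return flat[starts[i]:starts[i + 1]]


counts = [3, 1, 2]                  # representation (a): length of each row
flat = [10, 11, 12, 20, 30, 31]     # contiguous ragged data
starts = counts_to_starts(counts)   # representation (b), computed once

print(starts)
print(get_row(flat, starts, 2))
```

The conversion is a single cumulative sum, so a reader pays the cost once per file; the row lengths remain recoverable from the offsets by subtracting adjacent entries.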
>>>> >>>> So I'm for the "new" way of representing ragged arrays >>>> >>>> -CHB >>>> >>>> >>>> On Fri, Feb 3, 2017 at 11:41 AM, Bob Simons - NOAA Federal >>>> <[email protected] <mailto:[email protected]>> wrote: >>>> Then, isn't this proposal just the first step in the creation of a new >>>> model and a new encoding of Simple Features, one that is "align[ed] ... >>>> with as many other encoding standards in this space as is practical"? In >>>> other words, yet another standard for Simple Features? >>>> >>>> If so, it seems risky to me to take just the first (easy?) step "to >>>> support the use cases that have a compelling need today" and not solve the >>>> entire problem. I know the CF way is to just solve real, current needs, >>>> but in this case it seems to risk a head slap moment in the future when we >>>> realize that, in order to deal with some new simple feature variant, we >>>> should have done things differently from the beginning? >>>> >>>> And it seems odd to reject existing standards that have been so >>>> painstakingly hammered out, in favor of starting the process all over >>>> again. We follow existing standards for other things (e.g., IEEE-754 for >>>> representing floating point numbers in binary files), why can't we follow >>>> an existing Simple Features standard? >>>> >>>> --- >>>> Rather than just be a naysayer, let me suggest a very different >>>> alternative: >>>> >>>> There are several projects in the CF realm (e.g., this Simple Features >>>> project, Discrete Sampling Geometry (DSG), true variable-length Strings, >>>> ugrid(?)) which share a common underlying problem: how to deal with >>>> variable-length multidimensional arrays: a[b][c], where the length of the >>>> c dimension may be different for different b indices. >>>> DSG solved this (5 different ways!), but only for DSG. >>>> The Simple Features proposal seeks to solve the problem for Simple >>>> Features. >>>> We still have no support for Unicode variable-length Strings. 
>>>>
>>>> Instead of continuing to solve the variable-length problem a different way
>>>> every time we confront it, shouldn't we solve it once, with one small
>>>> addition to the standard, and then use that solution repeatedly?
>>>> The solution could be a simple variant of one of the DSG solutions, but
>>>> generalized so that it could be used in different situations.
>>>> An encoding standard and built-in support for variable-length data arrays
>>>> in netcdf-java/c would solve a lot of problems, now and in the future.
>>>> Some work on this is already done: I think the netcdf-java API already
>>>> supports variable-length arrays when reading netcdf-4 files.
>>>> For Simple Features, the problem would reduce to: store the feature (using
>>>> some specified existing standard like WKT or WKB) in a variable-length
>>>> array.
>>>>
>>>>
>>>> On Fri, Feb 3, 2017 at 9:07 AM, <[email protected]> wrote:
>>>> Date: Fri, 3 Feb 2017 11:07:00 -0600
>>>> From: David Blodgett <[email protected]>
>>>> To: Bob Simons - NOAA Federal <[email protected]>
>>>> Cc: CF Metadata <[email protected]>
>>>> Subject: Re: [CF-metadata] Extension of Discrete Sampling Geometries
>>>> for Simple Features
>>>> Message-ID: <[email protected]>
>>>> Content-Type: text/plain; charset="utf-8"
>>>>
>>>> Dear Bob,
>>>>
>>>> I’ll just take these in line.
>>>>
>>>> 1) noted. We have been trying to figure out what to do with the point
>>>> featureType and I think leaving it more or less alone is a viable path
>>>> forward.
>>>>
>>>> 2) This is not an exact replica of WKT, but rather a similar approach to
>>>> WKT. As I stated, we have followed the ISO simple features data model and
>>>> well known text feature types in concept, but have not used the same
>>>> standardization formalisms. We aren’t advocating for supporting “all of”
>>>> any standard but are rather attempting to support the use cases that have
>>>> a compelling need today while aligning this with as many other encoding
>>>> standards in this space as is practical. Hopefully that answers your
>>>> question, sorry if it’s vague.
>>>>
>>>> 3) The google doc linked in my response contains the encoding we are
>>>> proposing as a starting point for conversation: http://goo.gl/Kq9ASq I
>>>> want to stress, as a starting point for discussion. I expect that this
>>>> proposal will change drastically before we’re done.
>>>>
>>>> 4) Absolutely envision tools doing what you say, convert to/from standard
>>>> spatial formats and NetCDF-CF geometries. We intend to introduce an R and
>>>> a Python implementation that does exactly as you say along with whatever
>>>> form this standard takes in the end. R and Python were chosen as the team
>>>> that brought this together are familiar with those two languages;
>>>> additional implementations would be more than welcome.
>>>>
>>>> 5) We do include a “geometry” featureType similar to the “point”
>>>> featureType. Thus our difficulty with what to do with the “point”
>>>> featureType. You are correct, there are lots of non timeSeries
>>>> applications to be solved and this proposal does intend to support them
>>>> (within the existing DSG constructs).
>>>>
>>>> Thanks for your questions, hopefully my answers close some gaps for you.
>>>>
>>>> - Dave
>>>>
>>>> > On Feb 3, 2017, at 10:47 AM, Bob Simons - NOAA Federal
>>>> > <[email protected]> wrote:
>>>> >
>>>> > 1) There is a vague comment in the proposal about possibly changing the
>>>> > point featureType. Please don't, unless the changes don't affect current
>>>> > uses of Point. There are already 1000's of files that use it. If this
>>>> > new system offers an alternative, then fine, it's an alternative.
One of >>>> > the most important and useful features of a good standard is backwards >>>> > compatibility. >>>> > >>>> > 2) You advocate "Implement the WKT approach using a NetCDF binary >>>> > array." Is this system then an exact encoding of WKT, neither a subset >>>> > nor a superset? "Simple Features" are often not simple. >>>> > If it is WKT (or something else), what is the standard you are following >>>> > to describe the Simple Features (e.g., ISO/IEC 13249-3:2016 and ISO >>>> > 19162:2015)? >>>> > Does your proposal deviate in any way from the standard's capabilities? >>>> > Do you advocate following the entire WKT standard, e.g., supporting all >>>> > the feature types that WKT supports? >>>> > >>>> > 3) Since you are not using the WKT encoding, but creating your own, >>>> > where is the definition of the encoding system you are using? >>>> > >>>> > 4) This is a little out of CF scope, but: >>>> > Do you envision tools, notably, netcdf-c/java, having a writer function >>>> > that takes in WKT and encodes the information in a file, and having a >>>> > reader function that reads the file and returns WKT? Or is it your plan >>>> > that the encoding/ decoding is left to the user? >>>> > >>>> > 5) This proposal is for "Simple Features plus Time Series" (my phrase >>>> > not yours). But aren't there lots of other uses of Simple Features? Will >>>> > there be other proposals in the future for "Simple Features plus X" and >>>> > "Simple Features plus Y"? If so, will CF eventually become a massive >>>> > document where Simple Features are defined over and over again, but in >>>> > different contexts? If so, wouldn't a better solution be to deal with >>>> > Simple Features separately (as Postgres does by making a geometric data >>>> > type?), and then add "Simple Features plus Time Series" as the first use >>>> > of it? >>>> > >>>> > Thanks for answering these questions. >>>> > Please forgive me if I missed parts of your proposal that answer these >>>> > questions. 
>>>> >
>>>> >
>>>> > On Thu, Feb 2, 2017 at 5:57 AM, <[email protected]> wrote:
>>>> > Date: Thu, 2 Feb 2017 07:57:36 -0600
>>>> > From: David Blodgett <[email protected]>
>>>> > To: <[email protected]>
>>>> > Subject: [CF-metadata] Extension of Discrete Sampling Geometries for
>>>> > Simple Features
>>>> > Message-ID: <[email protected]>
>>>> > Content-Type: text/plain; charset="utf-8"
>>>> >
>>>> > Dear CF Community,
>>>> >
>>>> > We are pleased to submit this proposal for your consideration and
>>>> > review. The cover letter we've prepared below provides some background
>>>> > and explanation for the proposed approach. The google doc here
>>>> > <http://goo.gl/Kq9ASq> is an excerpt of the CF specification with
>>>> > track changes turned on. Permissions for the document allow any google
>>>> > user to comment, so feel free to comment and ask questions in line.
>>>> >
>>>> > Note that I’m sharing this with you with one issue unresolved. What to
>>>> > do with the point featureType? Our draft suggests that it is part of a
>>>> > new geometry featureType, but it could be that we leave it alone and
>>>> > introduce a geometry featureType. This may be a minor point of
>>>> > discussion, but we need to be clear that this is an issue that still
>>>> > needs to be resolved in the proposal.
>>>> >
>>>> > Thank you for your time and consideration.
>>>> >
>>>> > Best Regards,
>>>> >
>>>> > David Blodgett, Tim Whiteaker, and Ben Koziol
>>>> >
>>>> > Proposed Extension to NetCDF-CF for Simple Geometries
>>>> >
>>>> > Preface
>>>> >
>>>> > The proposed addition to NetCDF-CF introduced below is inspired by a
>>>> > pre-existing data model governed by OGC and ISO as ISO 19125-1. More
>>>> > information on Simple Features may be found here.
>>>> > <https://en.wikipedia.org/wiki/Simple_Features> To the knowledge of
>>>> > the authors, it is consistent with ISO 19125-1 but has not been
>>>> > specified using the formalisms of OGC or ISO. Language used attempts to
>>>> > hold true to NetCDF-CF semantics while not conflicting with the existing
>>>> > standards baseline. While this proposal does not support the entire
>>>> > scope of the simple features ecosystem, it does support the core
>>>> > data types in most common use around the community.
>>>> >
>>>> > The other existing standard to mention is the UGRID convention
>>>> > <http://ugrid-conventions.github.io/ugrid-conventions/>. The authors
>>>> > have experience reading and writing UGRID and have designed the proposed
>>>> > structure in a way that is inspired by and consistent with it.
>>>> >
>>>> > Terms and Definitions
>>>> >
>>>> > (Taken from OGC 06-103r4 OpenGIS Implementation Specification for
>>>> > Geographic information - Simple feature access - Part 1: Common
>>>> > architecture <http://www.opengeospatial.org/standards/sfa>.)
>>>> >
>>>> > Feature: Abstraction of real world phenomena - typically a geospatial
>>>> > abstraction with associated descriptive attributes.
>>>> > Simple Feature: A feature with all geometric attributes described
>>>> > piecewise by straight line or planar interpolation between point sets.
>>>> > Geometry (geometric complex): A set of disjoint geometric primitives -
>>>> > one or more points, lines, or polygons that form the spatial
>>>> > representation of a feature.
>>>> >
>>>> > Introduction
>>>> >
>>>> > Discrete Sampling Geometries (DSGs) handle data from one (or a
>>>> > collection of) timeSeries (point), Trajectory, Profile,
>>>> > TrajectoryProfile or timeSeriesProfile geometries. Measurements are from
>>>> > a point (timeSeries and Profile) or points along a trajectory. In this
>>>> > proposal, we reuse the core DSG timeSeries type which provides support
>>>> > for basic time series use cases e.g., a timeSeries which is measured (or
>>>> > modeled) at a given point.
>>>> >
>>>> > Changes to Existing CF Specification
>>>> >
>>>> > In NetCDF-CF 1.7, Discrete Sampling Geometries separate dimensions and
>>>> > variables into two types: instance and element
>>>> > <http://cfconventions.org/cf-conventions/cf-conventions.html#_collections_instances_and_elements>.
>>>> > Instance refers to individual points, trajectories, profiles, etc.
>>>> > These would sometimes be referred to as features given that they are
>>>> > identified entities that can have associated attributes and be related
>>>> > to other entities. Element dimensions describe temporal or other
>>>> > dimensions to describe data on a per-instance basis. This proposal
>>>> > extends the DSG timeSeries featureType
>>>> > <http://cfconventions.org/cf-conventions/cf-conventions.html#_features_and_feature_types>
>>>> > such that the geospatial coordinates of the instances can be point,
>>>> > multi-point, line, multi-line, polygon, or multi-polygon geometries.
>>>> > Rather than overload the DSG contiguous ragged array encoding, designed
>>>> > with timeseries in mind, a geometry ragged array encoding is introduced
>>>> > in a new section 9.3.5. See this google doc for specific proposed
>>>> > changes. <http://goo.gl/Kq9ASq>
>>>> >
>>>> > Motivation
>>>> >
>>>> > DSGs have no system to define a geometry (polyline, polygon, etc., other
>>>> > than point) and an association with a time series that applies over that
>>>> > entire geometry e.g., The expected rainfall in this watershed polygon
>>>> > for some period of time is 10 mm. As suggested in the last paragraph of
>>>> > section 9.1, current practice is to assign a representative point or
>>>> > just use an ID and forgo spatial information within a NetCDF-CF file. In
>>>> > order to satisfy a number of environmental modeling use cases, we need a
>>>> > way to encode a geometry (point, line, polygon, multi-point, multi-line,
>>>> > or multi-polygon) that is the static spatial feature representation to
>>>> > which one or more timeSeries can be associated. In this proposal, we
>>>> > provide an encoding to define collections of simple feature geometries.
>>>> > It interfaces cleanly with the existing DSG specification, enabling DSGs
>>>> > and Simple Geometries to be used concurrently.
>>>> >
>>>> > Looking Forward
>>>> >
>>>> > This proposal is a compromise solution that attempts to stay consistent
>>>> > with CF ideals and fit within the structure of the existing
>>>> > specification with minimal disruption. Line and polygon data types often
>>>> > require variable length arrays. Development of this proposal has brought
>>>> > to light the need for a general abstraction for variable length arrays
>>>> > in NetCDF-CF. Such a general abstraction would necessarily be reusable
>>>> > for character arrays, ragged arrays of time series, and ragged arrays of
>>>> > geometry nodes, as well as any other ragged data structures that may
>>>> > come up in the future. This proposal does not introduce such a general
>>>> > ragged array abstraction but does not preclude such a development in the
>>>> > future.
>>>> >
>>>> > Three Alternative Approaches
>>>> >
>>>> > Respecting the human readability ideal of NetCDF-CF, the development of
>>>> > this proposal started from a human readable format for geometries known
>>>> > as Well Known Text <https://en.wikipedia.org/wiki/Well-known_text>. We
>>>> > considered three high level design approaches while developing this
>>>> > proposal.
>>>> >
>>>> > 1. Direct use of Well-Known Text (WKT). In this approach, well known
>>>> >    text strings would be encoded using character arrays following a
>>>> >    contiguous ragged array approach to index the character array by
>>>> >    geometry (or instance in DSG parlance).
>>>> > 2. Implement the WKT approach using a NetCDF binary array. In this
>>>> >    approach, well known text separators (brackets, commas and spaces)
>>>> >    for multipoint, multiline, multipolygon, and polygon holes, would be
>>>> >    encoded as break type separator values like -1 for multiparts and -2
>>>> >    for holes.
>>>> > 3. Implement the fundamental dimensions of geometry data in NetCDF. In
>>>> >    this approach, additional dimensions and variables along those
>>>> >    dimensions would be introduced to represent geometries, geometry
>>>> >    parts, geometry nodes, and unique (potentially shared) coordinate
>>>> >    locations for nodes to reference.
>>>> >
>>>> > Selected Approach
>>>> >
>>>> > The first approach was seen as too opaque to stay true to the CF ideal
>>>> > of complete self-description. The third approach seemed needlessly
>>>> > verbose and difficult to implement. The second approach was selected for
>>>> > the following reasons:
>>>> >
>>>> > - The second approach is at least as human-readable as the third.
>>>> > - Use of break values keeps geometries relatively atomic.
>>>> > - It will be familiar to developers who know the WKT geometry format.
>>>> > - Character arrays, which are needed for options one and three, are
>>>> >   cumbersome to use in some programming languages in common use with
>>>> >   NetCDF.
>>>> > - Break values replace the need for extraneous variables related to
>>>> >   multi-part and polygon holes (interiors). Multi-part geometries are
>>>> >   generally an exception and excessive instrumentation to support them
>>>> >   should be discounted.
>>>> >
>>>> > Example: Representation of WKT-Style Polygons in a NetCDF-3
>>>> > timeSeries featureType
>>>> >
>>>> > Below is sample CDL demonstrating how polygons are encoded in NetCDF-3
>>>> > using a contiguous ragged array-like encoding. There are three details
>>>> > to note in the example below.
>>>> >
>>>> > 1. The attribute contiguous_ragged_dimension with value of a dimension
>>>> >    in the file.
>>>> > 2. The geom_coordinates attribute with a value containing a space
>>>> >    separated string of variable names.
>>>> > 3. The cf_role geometry_x_node and geometry_y_node.
>>>> >
>>>> > These three attributes form a system to fully describe collections of
>>>> > multi-polygon feature geometries.
>>>> > Any variable that has the contiguous_ragged_dimension attribute
>>>> > contains one integer per instance: the 0-indexed starting position of
>>>> > that instance's geometry along the dimension named by the attribute.
>>>> > Any variable that uses the dimension referenced in the
>>>> > contiguous_ragged_dimension attribute can be interpreted using the
>>>> > values in the variable containing the contiguous_ragged_dimension
>>>> > attribute. The variables referenced in the geom_coordinates attribute
>>>> > describe spatial coordinates of geometries. These variables can also be
>>>> > identified by the cf_roles geometry_x_node and geometry_y_node. Note
>>>> > that the example below also includes a mechanism to handle multi-polygon
>>>> > features that also contain holes.
>>>> >
>>>> > netcdf multipolygon_example {
>>>> > dimensions:
>>>> >   node = 47 ;
>>>> >   indices = 55 ;
>>>> >   instance = 3 ;
>>>> >   time = 5 ;
>>>> >   strlen = 5 ;
>>>> > variables:
>>>> >   char instance_name(instance, strlen) ;
>>>> >     instance_name:cf_role = "timeseries_id" ;
>>>> >   int coordinate_index(indices) ;
>>>> >     coordinate_index:geom_type = "multipolygon" ;
>>>> >     coordinate_index:geom_coordinates = "x y" ;
>>>> >     coordinate_index:multipart_break_value = -1 ;
>>>> >     coordinate_index:hole_break_value = -2 ;
>>>> >     coordinate_index:outer_ring_order = "anticlockwise" ;
>>>> >     coordinate_index:closure_convention = "last_node_equals_first" ;
>>>> >   int coordinate_index_start(instance) ;
>>>> >     coordinate_index_start:long_name = "index of first coordinate in each instance geometry" ;
>>>> >     coordinate_index_start:contiguous_ragged_dimension = "indices" ;
>>>> >   double x(node) ;
>>>> >     x:units = "degrees_east" ;
>>>> >     x:standard_name = "longitude" ; // or projection_x_coordinate
>>>> >     x:cf_role = "geometry_x_node" ;
>>>> >   double y(node) ;
>>>> >     y:units = "degrees_north" ;
>>>> >     y:standard_name = "latitude" ; // or projection_y_coordinate
>>>> >     y:cf_role = "geometry_y_node" ;
>>>> >   double someVariable(instance) ;
>>>> >     someVariable:long_name = "a variable describing a single-valued attribute of a polygon" ;
>>>> >   int time(time) ;
>>>> >     time:units = "days since 2000-01-01" ;
>>>> >   double someData(instance, time) ;
>>>> >     someData:coordinates = "time x y" ;
>>>> >     someData:featureType = "timeSeries" ;
>>>> > // global attributes:
>>>> >   :Conventions = "CF-1.8" ;
>>>> >
>>>> > data:
>>>> >
>>>> >  instance_name =
>>>> >   "flash",
>>>> >   "bang",
>>>> >   "pow" ;
>>>> >
>>>> >  coordinate_index = 0, 1, 2, 3, 4, -2, 5, 6, 7, 8, -2, 9, 10, 11, 12,
>>>> >   -2, 13, 14, 15, 16,
>>>> >   -1, 17, 18, 19, 20, -1, 21, 22, 23, 24, 25, 26, 27, 28, -1, 29, 30,
>>>> >   31, 32, 33,
>>>> >   34, -2, 35, 36, 37, 38, 39, 40, 41, 42, -1, 43, 44, 45, 46 ;
>>>> >
>>>> >  coordinate_index_start = 0, 30, 46 ;
>>>> >
>>>> >  x = 0, 20, 20, 0, 0, 1, 10, 19, 1, 5, 7, 9, 5, 11, 13, 15, 11, 5, 9, 7,
>>>> >   5, 11, 15, 13, 11, -40, -20, -45, -40, -20, -10, -10, -30, -45, -20,
>>>> >   -30, -20, -20, -30, 30,
>>>> >   45, 10, 30, 25, 50, 30, 25 ;
>>>> >
>>>> >  y = 0, 0, 20, 20, 0, 1, 5, 1, 1, 15, 19, 15, 15, 15, 19, 15, 15, 25,
>>>> >   25, 29,
>>>> >   25, 25, 25, 29, 25, -40, -45, -30, -40, -35, -30, -10, -5, -20, -35,
>>>> >   -20, -15, -25, -20, 20,
>>>> >   40, 40, 20, 5, 10, 15, 5 ;
>>>> >
>>>> >  someVariable = 1, 2, 3 ;
>>>> >
>>>> >  time = 1, 2, 3, 4, 5 ;
>>>> >
>>>> >  someData =
>>>> >   1, 2, 3, 4, 5,
>>>> >   1, 2, 3, 4, 5,
>>>> >   1, 2, 3, 4, 5 ;
>>>> > }
>>>> >
>>>> > How To Interpret
>>>> >
>>>> > Starting from the timeSeries variables:
>>>> >
>>>> > 1. See CF-1.8 conventions.
>>>> > 2. See the timeSeries featureType.
>>>> > 3. Find the timeseries_id cf_role.
>>>> > 4. Find the coordinates attribute of data variables.
>>>> > 5. See that the variables indicated by the coordinates attribute have a
>>>> >    cf_role geometry_x_node and geometry_y_node to determine that these
>>>> >    are geometries according to this new specification.
>>>> > 6. Find the coordinate index variable with geom_coordinates that point
>>>> >    to the nodes.
>>>> > 7. Find the variable with contiguous_ragged_dimension pointing to the
>>>> >    dimension of the coordinate index variable to determine how to index
>>>> >    into the coordinate index.
>>>> > 8. Iterate over polygons, parsing out geometries using the contiguous
>>>> >    ragged start variable and coordinate index variable to interpret the
>>>> >    coordinate data variables.
>>>> >
>>>> > Or, without reference to timeSeries:
>>>> >
>>>> > 1. See CF-1.8 conventions.
>>>> > 2. See the geom_type of multipolygon.
>>>> > 3. Find the variable with a contiguous_ragged_dimension matching the
>>>> >    coordinate index variable's dimension.
>>>> > 4. See the geom_coordinates of x y.
>>>> > 5. Using the contiguous ragged start variable found in 3 and the
>>>> >    coordinate index variable found in 2, geometries can be parsed out of
>>>> >    the coordinate index variable and split into parts and holes using
>>>> >    the multipart and hole break values in it.
>>>> >
>>>> > -------------- next part --------------
>>>> > An HTML attachment was scrubbed...
>>>> > URL:
>>>> > <http://mailman.cgd.ucar.edu/pipermail/cf-metadata/attachments/20170202/4ce5b42f/attachment.html>
>>>> >
>>>> > ------------------------------
>>>> >
>>>> > Subject: Digest Footer
>>>> >
>>>> > _______________________________________________
>>>> > CF-metadata mailing list
>>>> > [email protected]
>>>> > http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
>>>> >
>>>> > ------------------------------
>>>> >
>>>> > End of CF-metadata Digest, Vol 166, Issue 3
>>>> > *******************************************
>>>> >
>>>> > --
>>>> > Sincerely,
>>>> >
>>>> > Bob Simons
>>>> > IT Specialist
>>>> > Environmental Research Division
>>>> > NOAA Southwest Fisheries Science Center
>>>> > 99 Pacific St., Suite 255A (New!)
>>>> > Monterey, CA 93940 (New!)
>>>> > Phone: (831)333-9878 (New!)
>>>> > Fax: (831)648-8440
>>>> > Email: [email protected]
>>>> >
>>>> > The contents of this message are mine personally and
>>>> > do not necessarily reflect any position of the
>>>> > Government or the National Oceanic and Atmospheric Administration.
>>>> --
>>>>
>>>> Christopher Barker, Ph.D.
>>>> Oceanographer
>>>>
>>>> Emergency Response Division
>>>> NOAA/NOS/OR&R            (206) 526-6959   voice
>>>> 7600 Sand Point Way NE   (206) 526-6329   fax
>>>> Seattle, WA  98115       (206) 526-6317   main reception
>>>>
>>>> [email protected]
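For anyone who wants to try the quoted "How To Interpret" steps against the example data, here is a minimal Python sketch of the break-value decoding. It is only an illustration under a few assumptions: the function name decode_geometries and its argument names are made up here and are not part of the proposal; the start values are taken to be 0-based positions in coordinate_index; and each instance is assumed to run up to the next start value (or the end of the array). The arrays are copied from the CDL in the quoted proposal.

```python
# Hypothetical helper, not part of the proposal: decode the break-value
# encoding from the quoted CDL example into nested Python lists.

def decode_geometries(index, starts, x, y, part_break=-1, hole_break=-2):
    """Return one entry per instance.

    Each instance is a list of polygons, each polygon is a list of rings
    (first ring = exterior, remaining rings = holes), and each ring is a
    list of (x, y) tuples.
    """
    geometries = []
    for i, start in enumerate(starts):
        # Assumption: an instance ends where the next one starts.
        stop = starts[i + 1] if i + 1 < len(starts) else len(index)
        polygons = [[[]]]  # one polygon holding one empty exterior ring
        for value in index[start:stop]:
            if value == part_break:
                polygons.append([[]])      # new polygon part
            elif value == hole_break:
                polygons[-1].append([])    # new hole in current polygon
            else:
                polygons[-1][-1].append((x[value], y[value]))
        geometries.append(polygons)
    return geometries


# Data copied from the CDL example (coordinate_index, start, x, y).
coordinate_index = [
    0, 1, 2, 3, 4, -2, 5, 6, 7, 8, -2, 9, 10, 11, 12, -2, 13, 14, 15, 16,
    -1, 17, 18, 19, 20, -1, 21, 22, 23, 24, 25, 26, 27, 28, -1, 29, 30,
    31, 32, 33, 34, -2, 35, 36, 37, 38, 39, 40, 41, 42, -1, 43, 44, 45, 46,
]
coordinate_index_start = [0, 30, 46]
x = [
    0, 20, 20, 0, 0, 1, 10, 19, 1, 5, 7, 9, 5, 11, 13, 15, 11, 5, 9, 7,
    5, 11, 15, 13, 11, -40, -20, -45, -40, -20, -10, -10, -30, -45, -20,
    -30, -20, -20, -30, 30, 45, 10, 30, 25, 50, 30, 25,
]
y = [
    0, 0, 20, 20, 0, 1, 5, 1, 1, 15, 19, 15, 15, 15, 19, 15, 15, 25, 25, 29,
    25, 25, 25, 29, 25, -40, -45, -30, -40, -35, -30, -10, -5, -20, -35,
    -20, -15, -25, -20, 20, 40, 40, 20, 5, 10, 15, 5,
]

geometries = decode_geometries(coordinate_index, coordinate_index_start, x, y)

# Number of rings in each polygon part, per instance.
print([[len(polygon) for polygon in geom] for geom in geometries])
# prints [[4, 1, 1], [1, 2], [1, 1]]
```

Read this way, "flash" is a three-part multipolygon whose first part has three holes, "bang" has two parts with one hole in the second, and "pow" has two parts with no holes. Every decoded ring ends on its first node, which matches the closure_convention attribute in the example.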
_______________________________________________ CF-metadata mailing list [email protected] http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
