All, I haven’t heard much follow up, but here’s a doodle to coordinate a phone conversation about this. I think we have west-coast US participants and EU participants, so I chose times mid to late morning for me (midwest US).
http://doodle.com/poll/eikarnt35tdm7igd Will make a call once a few people have expressed interest and we have a clear day/time. Regards, - Dave > On Feb 6, 2017, at 11:29 AM, David Blodgett <[email protected]> wrote: > > Dear CF, > > I want to follow up on the conversation here with an alternative approach > suggested off list primarily between Jonathan and me. For this, I’m going to > focus on the use cases satisfied and the simplification of the proposal allowed by > not supporting those use cases. The changes below are largely driven by a > desire to better align this proposal with the technical details of the prior > art that is CF. > > If we: > 1) don’t support node sharing, we can remove the complication of > node-coordinate indexing / indirection, simplifying the proposal pretty > significantly. > 2) don’t use “break values” to indicate the separation between multi-part > geometries and polygon holes, we end up with a data model with an extra > dimension, but the NetCDF dimensions align with the natural dimensions of the > data. > 3) use “count” instead of a “start pointer” approach, we are better aligned > with the existing DSG contiguous ragged array approach. > > Coming back to the three directions we could take this proposal from my cover > letter of February 2nd: >> Direct use of Well-Known Text (WKT). In this approach, well-known text >> strings would be encoded using character arrays following a contiguous >> ragged array approach to index the character array by geometry (or instance >> in DSG parlance). >> Implement the WKT approach using a NetCDF binary array. In this approach, >> well-known text separators (brackets, commas, and spaces) for multipoint, >> multiline, multipolygon, and polygon holes would be encoded as break-type >> separator values like -1 for multiparts and -2 for holes. >> Implement the fundamental dimensions of geometry data in NetCDF.
In this >> approach, additional dimensions and variables along those dimensions would >> be introduced to represent geometries, geometry parts, geometry nodes, and >> unique (potentially shared) coordinate locations for nodes to reference. > The alternative I’m outlining here moves in the direction of 3. We had > originally discounted it because it becomes very verbose and seems overly > complicated if support for coordinate sharing is a requirement. If the three > simplifications described above are used, then the third approach seems more > tenable. > > Jonathan has also suggested that: (these are in reaction to the CDL in my > letter from February 2nd) > 1) Rename geom_coordinates as node_coordinates, for consistency with UGRID. > 2) Omit node_dimension. This is redundant, since the dimension can be found by > examining the node coordinate variables. > 3) Prescribe numerous “codes” and assumptions in the specification instead of > letting them be described with attribute values. > 4) It would be more consistent with CF and UGRID to use a single container > variable to hang all the topology/geometry information from. > > All of which I, personally, am happy to accept if others don’t object. > > A couple of other suggestions from Jonathan I want to discuss a bit more: > 1) Rename geometry as topology and geom_type as topology_type. > While I’d be open to something other than geom, topology is odd. If > this is really “node_collection_topology_type” I guess I could be convinced, > but would be curious how people react to this. (Especially in relation to > UGRID.) > 2) This extension is more appropriate as an extension to the concept of cell > bounds than the addition of a complex time-invariant type of discrete > sampling geometry. > Having just re-read the cell bounds chapter, I think it would > overcomplicate the cell bounds to include this material. My basic issue here is > that these geometries do not necessarily have a reference location.
They are, > rather, first-order entities that need to be treated as such. That said, it > makes sense that these geometries are not necessarily a good fit for the > original intent of Discrete Sampling Geometries. Jonathan suggested they may > belong in their own chapter, which may be a good alternative? My suggested > CDL below might lead us in the direction of this being a special type of > auxiliary coordinate variable. > > This alternative starts to look like the CDL pasted below. > > Note that the issue of coordinates is sticking out like a sore thumb. Below, > I’ve attempted to reconcile Jonathan’s ideas regarding coordinates with my > thoughts about how these geometries are “first-order entities” that don’t > have a single representative x and y. The spatial coordinates can be said to > reside in the system of geometries described in the “sf” container variable? > I realize this goes against the idea of coordinates a bit, but I think it is > holding with the spirit of the attribute? > > Finally, I’m glad to continue answering questions and debating things via the > list to a point, but I think it would be in our interest to arrange a telecon > to discuss this stuff further with a list of interested parties. Feel free to > follow up on list, but for decision making, let’s not let this rabbit hole go > too deep. I’ll plan on letting this and the other recent action on this > proposal settle with people for a week or two, then start to bring together a > conference call (or calls, depending on time zones). Please respond to me off > list if you are interested in being part of a call to discuss.
> > Regards, > > - Dave > > netcdf multipolygon_example { > dimensions: > node = 47 ; > part = 9 ; > instance = 3 ; > time = 5 ; > strlen = 5 ; > variables: > char instance_name(instance, strlen) ; > instance_name:cf_role = "timeseries_id" ; > double someVariable(instance) ; > someVariable:long_name = "a variable describing a single-valued attribute > of a polygon" ; > someVariable:coordinates = "sf" ; // or "instance_name"? > int time(time) ; > time:units = "days since 2000-01-01" ; > double someData(instance, time) ; > someData:coordinates = "time sf" ; // or "time instance_name"? > someData:featureType = "timeSeries" ; > someData:geometry = "sf" ; > int sf ; // container variable -- datatype irrelevant because no data > sf:geom_type = "multipolygon" ; // could be node_topology_type? > sf:node_count_variable = "node_count" ; > sf:node_coordinates = "x y" ; > sf:part_count = "part_node_count" ; > sf:part_type = "part_type" ; // Not required unless polygons with holes are > present. > sf:outer_ring_order = "anticlockwise" ; // not required if written in spec? > sf:closure_convention = "last_node_equals_first" ; // not required if > written in spec? > sf:outer_type_code = 0 ; // not required if written in spec? > sf:inner_type_code = 1 ; // not required if written in spec?
> int node_count(instance) ; > node_count:long_name = "count of coordinates in each instance geometry" ; > int part_node_count(part) ; > part_node_count:long_name = "count of coordinates in each geometry part" ; > int part_type(part) ; > part_type:long_name = "type of each geometry part" ; > double x(node) ; > x:units = "degrees_east" ; > x:standard_name = "longitude" ; // or projection_x_coordinate > x:cf_role = "geometry_x_node" ; > double y(node) ; > y:units = "degrees_north" ; > y:standard_name = "latitude" ; // or projection_y_coordinate > y:cf_role = "geometry_y_node" ; > // global attributes: > :Conventions = "CF-1.8" ; > > data: > > instance_name = > "flash", > "bang", > "pow" ; > > someVariable = 1, 2, 3 ; > > time = 1, 2, 3, 4, 5 ; > > someData = > 1, 2, 3, 4, 5, > 1, 2, 3, 4, 5, > 1, 2, 3, 4, 5 ; > > node_count = 25, 15, 7 ; > > part_node_count = 5, 4, 4, 4, 4, 8, 6, 8, 4 ; > > part_type = 0, 1, 1, 1, 0, 0, 0, 1, 0 ; > > x = 0, 20, 20, 0, 0, 1, 10, 19, 1, 5, 7, 9, 5, 11, 13, 15, 11, 5, 9, 7, > 5, 11, 15, 13, 11, -40, -20, -45, -40, -20, -10, -10, -30, -45, -20, -30, > -20, -20, -30, 30, > 45, 10, 30, 25, 50, 30, 25 ; > > y = 0, 0, 20, 20, 0, 1, 5, 1, 1, 15, 19, 15, 15, 15, 19, 15, 15, 25, 25, 29, > 25, 25, 25, 29, 25, -40, -45, -30, -40, -35, -30, -10, -5, -20, -35, -20, > -15, -25, -20, 20, > 40, 40, 20, 5, 10, 15, 5 ; > } > > > >> On Feb 4, 2017, at 8:07 AM, David Blodgett <[email protected]> wrote: >> >> Dear Chris, >> >> Thanks for your thorough treatment of these issues. We have gone through a >> similar thought process to arrive at the proposal we came up with. I’ll >> answer as briefly as I can. >> >> 1) How would you translate between netcdf geometries and, say, geoJSON? >> >> The thinking is that node coordinate sharing is optional. If the writer >> wants to check or already knows that nodes share coordinates, then it’s >> possible. Otherwise, it doesn’t have to be used.
I’ve always felt that this >> was important, but maybe not critical for a core NetCDF-CF data model. Some >> offline conversation has led to an example that does not use it that may be >> a good alternative; more on that later. >> >> 2) Break Values >> >> You really do have to hold your nose on the break values. The issue is that >> you have to store that information somehow, and it is almost worse to create >> new variables to store the multi-part and hole/not-hole information. The >> alternative approach that’s forming up, as mentioned above, does break the >> information out into additional variables but simplifies things otherwise. >> In that case it doesn’t feel overly complex to me… so stay tuned for more on >> this front. >> >> 3) Ragged Indexing >> >> Your thought process follows ours exactly. The key is that you either have >> to create the “pointer” array as a first order of business or loop over the >> counts ad nauseam. I’m actually leaning toward the counts for two reasons. >> First, the counts approach is already in CF, so it is a natural fit and will be >> familiar to developers in this space. Second, the issue of 0 vs 1 indexing >> is annoying. In our proposal, we settled on 0 indexing because it aligns >> with the idea of an offset, but it is still annoying, and some applications >> would always have to adjust that pointer array as a first order of business. >> >> On to Bob’s comments. >> >> Regarding aligning with other data models / encodings, I guess this needs to >> be unpacked a bit. >> >> 1) In this setting, simple features is a data model, not an encoding. An >> encoding can implement part or all of a data model as is needed by the use >> case(s) at hand. There is no problem with partial implementations; you still >> get interoperability for the intended use cases. >> 2) Attempting to align with other encoding standards (UGRID and NetCDF-CF are >> the primary ones here) is simply to keep the implementation patterns similar >> and familiar.
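To make the counts-versus-pointers trade-off discussed above concrete, here is a minimal Python sketch. It is my own illustration, not part of the proposal; the variable name `node_count` mirrors the example CDL, and `starts`/`node_slice` are hypothetical helper names. Deriving the 0-indexed "pointer" array from a count variable is a cumulative sum, which is exactly the "first order of business" step mentioned above.

```python
# Minimal sketch (illustrative, not part of the proposal): derive the
# 0-indexed start-offset ("pointer") array from a DSG-style count variable.
from itertools import accumulate

node_count = [25, 15, 7]  # nodes per instance geometry, as in the CDL above

# Cumulative sum shifted right by one gives each instance's start offset.
starts = [0] + list(accumulate(node_count))[:-1]  # [0, 25, 40]

def node_slice(i):
    """Slice of the node dimension holding instance i's coordinates."""
    return slice(starts[i], starts[i] + node_count[i])

# e.g. node_slice(2) covers nodes 40..46, the third instance's 7 nodes
```

With the counts approach a reader rebuilds `starts` once up front; with a start-pointer approach the file would store `starts` directly and the counts would be the derived quantity.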
This may be a fool’s errand, but is presumably good for >> adoptability and consistency. >> So, I don’t see a problem with implementing important simple features types >> in a way that aligns with the way the existing community standards work. >> >> I don’t see this as ignoring existing standards at all. There is no open >> community standard for binary encoding of geometries and related data that >> passes the CF requirements of human readability and self-description. We are >> adopting the appropriate data model and suggesting a new encoding that will >> solve a lot of problems in the environmental modeling space. >> >> As we’ve discussed before, your “different approach” sounds great, but seems >> like an exercise for a future effort that doesn’t attempt to align with CF >> 1.7. Maybe what you suggest is a path forward for variable-length arrays in >> the CF 2.0 “vision in the mist”, but I don’t see it as a tenable solution >> for CF 1.*. >> >> Best Regards, >> >> - Dave >> >> >>> On Feb 3, 2017, at 3:31 PM, Chris Barker <[email protected]> wrote: >>> >>> A few thoughts. First, I think there are three core "issues" that need to >>> be resolved: >>> >>> 1) Coordinate indexing (indirection) >>> >>> The question of whether you have an array of "vertices" that the geometry >>> types index into to get their data: >>> >>> Advantages: >>> - if a number of geometries share a lot of vertices, it can be more >>> efficient >>> - the relationship between geometries that share vertices (i.e. polygons >>> that share a boundary, etc.) is well defined. You don't need to check for >>> closeness, and maybe have a tolerance, etc. >>> >>> These were absolutely critical for UGRID, for example -- a UGRID mesh is a >>> single "thing", NOT a collection of polygons that happen to share some >>> vertices. >>> >>> Disadvantages: >>> - if the geometries do not share many vertices, it is less efficient.
>>> - there are additional code complications in "getting" the vertices of >>> the given geometry >>> - it does not match the OGC data model. >>> >>> My 0.02 -- given my use cases, I tend to want the advantages -- but I don't >>> know that that's a typical use case. And I think it's a really good idea to >>> keep with the OGC data model where possible -- i.e. be able to translate >>> from netcdf to, say, geoJSON as losslessly as possible. Given that, I think >>> it's probably a better idea not to have the indirection. >>> >>> However (to equivocate), perhaps the types of information people are likely >>> to want to store in netcdf are a subset of what the OGC standards are >>> designed for -- and for those use-cases, maybe shared vertices are critical. >>> >>> One way to think about it -- how would you translate between netcdf >>> geometries and, say, geoJSON: >>> - nc => geojson would lose the shared index info. >>> - geojson => nc -- would you try to reconstruct the shared vertices? I'm >>> thinking that would be a bit dangerous in the general case, because you are >>> adding information that you don't know is true -- are these a shared vertex >>> or two that just happen to be at the same location? >>> >>> 2) Break values >>> >>> I don't really like break values as an approach, but with netcdf any option >>> will be ugly one way or another. So keeping with the WKT approach makes >>> sense to me. Either way you'll need custom code to unpack it. (BTW -- what >>> does Well-Known Binary do?) >>> >>> 3) Ragged indexing >>> >>> There are two "natural" ways to represent a ragged array: >>> >>> (a) store the length of each "row" >>> (b) store the index to the beginning (or end) of each "row" >>> >>> CF already uses (a).
However, working with it, I'm pretty convinced that >>> it's the "wrong" choice: >>> >>> If you want to know how long a given row is, that is really easy with (a), >>> and almost as easy with (b) (it involves two indexes and a subtraction). >>> >>> However, if you want to extract a particular row, (b) makes this really >>> easy -- you simply access the slice of the array you want. With (a) you >>> need to loop through the entire "length_of_rows" array (up to the row of >>> interest) and add up the values to find the slice you need. Not a huge >>> issue, but it is an issue. In fact, in my code to read ragged arrays in >>> netcdf, the first thing I do is pre-compute the index-to-each-row, so I can >>> then use that to access individual rows for future access -- if you are >>> accessing via OPeNDAP, that's particularly helpful. >>> >>> So -- (b) is clearly (to me) the "best" way to do it -- but is it worth >>> introducing a second way to handle ragged arrays in CF? I would think yes, >>> but that would be offset if: >>> >>> - There is a bunch of existing library code that transparently handles >>> ragged arrays in netcdf (does netcdfJava have something? I'm pretty sure >>> Python doesn't -- certainly not in netCDF4) >>> >>> - That existing lib code would be advantageous to leverage for code >>> reading features: I suspect that there will have to be enough custom code >>> that the ragged array bits are going to be the least of it. >>> >>> So I'm for the "new" way of representing ragged arrays. >>> >>> -CHB >>> >>> >>> On Fri, Feb 3, 2017 at 11:41 AM, Bob Simons - NOAA Federal >>> <[email protected]> wrote: >>> Then, isn't this proposal just the first step in the creation of a new >>> model and a new encoding of Simple Features, one that is "align[ed] ... >>> with as many other encoding standards in this space as is practical"? In >>> other words, yet another standard for Simple Features?
>>> >>> If so, it seems risky to me to take just the first (easy?) step "to support >>> the use cases that have a compelling need today" and not solve the entire >>> problem. I know the CF way is to just solve real, current needs, but in >>> this case it seems to risk a head slap moment in the future when we realize >>> that, in order to deal with some new simple feature variant, we should have >>> done things differently from the beginning? >>> >>> And it seems odd to reject existing standards that have been so >>> painstakingly hammered out, in favor of starting the process all over >>> again. We follow existing standards for other things (e.g., IEEE-754 for >>> representing floating point numbers in binary files), why can't we follow >>> an existing Simple Features standard? >>> >>> --- >>> Rather than just be a naysayer, let me suggest a very different alternative: >>> >>> There are several projects in the CF realm (e.g., this Simple Features >>> project, Discrete Sampling Geometry (DSG), true variable-length Strings, >>> ugrid(?)) which share a common underlying problem: how to deal with >>> variable-length multidimensional arrays: a[b][c], where the length of the c >>> dimension may be different for different b indices. >>> DSG solved this (5 different ways!), but only for DSG. >>> The Simple Features proposal seeks to solve the problem for Simple Features. >>> We still have no support for Unicode variable-length Strings. >>> >>> Instead of continuing to solve the variable-length problem a different way >>> every time we confront it, shouldn't we solve it once, with one small >>> addition to the standard, and then use that solution repeatedly? >>> The solution could be a simple variant of one of the DSG solutions, but >>> generalized so that it could be used in different situations. >>> An encoding standard and built-in support for variable-length data arrays >>> in netcdf-java/c would solve a lot of problems, now and in the future. 
>>> Some work on this is already done: I think the netcdf-java API already >>> supports variable-length arrays when reading netcdf-4 files. >>> For Simple Features, the problem would reduce to: store the feature (using >>> some specified existing standard like WKT or WKB) in a variable-length >>> array. >>> >>> >>> >>> >>> >>> On Fri, Feb 3, 2017 at 9:07 AM, <[email protected]> wrote: >>> Date: Fri, 3 Feb 2017 11:07:00 -0600 >>> From: David Blodgett <[email protected]> >>> To: Bob Simons - NOAA Federal <[email protected]> >>> Cc: CF Metadata <[email protected]> >>> Subject: Re: [CF-metadata] Extension of Discrete Sampling Geometries >>> for Simple Features >>> Message-ID: <[email protected]> >>> Content-Type: text/plain; charset="utf-8" >>> >>> Dear Bob, >>> >>> I'll just take these in line. >>> >>> 1) Noted. We have been trying to figure out what to do with the point >>> featureType, and I think leaving it more or less alone is a viable path >>> forward. >>> >>> 2) This is not an exact replica of WKT, but rather a similar approach to >>> WKT. As I stated, we have followed the ISO simple features data model and >>> well known text feature types in concept, but have not used the same >>> standardization formalisms. We aren't advocating for supporting "all of" >>> any standard but are rather attempting to support the use cases that have a >>> compelling need today while aligning this with as many other encoding >>> standards in this space as is practical. Hopefully that answers your >>> question; sorry if it's vague. >>> >>> 3) The google doc linked in my response contains the encoding we are >>> proposing as a starting point for conversation: http://goo.gl/Kq9ASq I want >>> to stress, as a starting point for discussion.
I expect that this proposal >>> will change drastically before we're done. >>> >>> 4) We absolutely envision tools doing what you say: converting to/from standard >>> spatial formats and NetCDF-CF geometries. We intend to introduce an R and a >>> Python implementation that do exactly as you say, along with whatever form >>> this standard takes in the end. R and Python were chosen because the team that >>> brought this together is familiar with those two languages; additional >>> implementations would be more than welcome. >>> >>> 5) We do include a "geometry" featureType similar to the "point" >>> featureType. Thus our difficulty with what to do with the "point" >>> featureType. You are correct, there are lots of non-timeSeries applications >>> to be solved, and this proposal does intend to support them (within the >>> existing DSG constructs). >>> >>> Thanks for your questions; hopefully my answers close some gaps for you. >>> >>> - Dave >>> >>> > On Feb 3, 2017, at 10:47 AM, Bob Simons - NOAA Federal >>> > <[email protected]> wrote: >>> > >>> > 1) There is a vague comment in the proposal about possibly changing the >>> > point featureType. Please don't, unless the changes don't affect current >>> > uses of Point. There are already 1000's of files that use it. If this new >>> > system offers an alternative, then fine, it's an alternative. One of the >>> > most important and useful features of a good standard is backwards >>> > compatibility. >>> > >>> > 2) You advocate "Implement the WKT approach using a NetCDF binary array." >>> > Is this system then an exact encoding of WKT, neither a subset nor a >>> > superset? "Simple Features" are often not simple. >>> > If it is WKT (or something else), what is the standard you are following >>> > to describe the Simple Features (e.g., ISO/IEC 13249-3:2016 and ISO >>> > 19162:2015)? >>> > Does your proposal deviate in any way from the standard's capabilities?
>>> > Do you advocate following the entire WKT standard, e.g., supporting all >>> > the feature types that WKT supports? >>> > >>> > 3) Since you are not using the WKT encoding, but creating your own, where >>> > is the definition of the encoding system you are using? >>> > >>> > 4) This is a little out of CF scope, but: >>> > Do you envision tools, notably, netcdf-c/java, having a writer function >>> > that takes in WKT and encodes the information in a file, and having a >>> > reader function that reads the file and returns WKT? Or is it your plan >>> > that the encoding/ decoding is left to the user? >>> > >>> > 5) This proposal is for "Simple Features plus Time Series" (my phrase not >>> > yours). But aren't there lots of other uses of Simple Features? Will >>> > there be other proposals in the future for "Simple Features plus X" and >>> > "Simple Features plus Y"? If so, will CF eventually become a massive >>> > document where Simple Features are defined over and over again, but in >>> > different contexts? If so, wouldn't a better solution be to deal with >>> > Simple Features separately (as Postgres does by making a geometric data >>> > type?), and then add "Simple Features plus Time Series" as the first use >>> > of it? >>> > >>> > Thanks for answering these questions. >>> > Please forgive me if I missed parts of your proposal that answer these >>> > questions. 
>>> > >>> > On Thu, Feb 2, 2017 at 5:57 AM, <[email protected]> wrote: >>> > Date: Thu, 2 Feb 2017 07:57:36 -0600 >>> > From: David Blodgett <[email protected]> >>> > To: <[email protected]> >>> > Subject: [CF-metadata] Extension of Discrete Sampling Geometries for >>> > Simple Features >>> > Message-ID: <[email protected]> >>> > Content-Type: text/plain; charset="utf-8" >>> > >>> > Dear CF Community, >>> > >>> > We are pleased to submit this proposal for your consideration and review. >>> > The cover letter we've prepared below provides some background and >>> > explanation for the proposed approach. The google doc here >>> > <http://goo.gl/Kq9ASq> is an excerpt of the CF specification with track >>> > changes turned on. Permissions for the document allow any google user to >>> > comment, so feel free to comment and ask questions in line. >>> > >>> > Note that I'm sharing this with you with one issue unresolved. What to do >>> > with the point featureType? Our draft suggests that it is part of a new >>> > geometry featureType, but it could be that we leave it alone and >>> > introduce a geometry featureType. This may be a minor point of >>> > discussion, but we need to be clear that this is an issue that still >>> > needs to be resolved in the proposal. >>> > >>> > Thank you for your time and consideration.
>>> > >>> > Best Regards, >>> > >>> > David Blodgett, Tim Whiteaker, and Ben Koziol >>> > >>> > Proposed Extension to NetCDF-CF for Simple Geometries >>> > >>> > Preface >>> > >>> > The proposed addition to NetCDF-CF introduced below is inspired by a >>> > pre-existing data model governed by OGC and ISO as ISO 19125-1. More >>> > information on Simple Features may be found here: >>> > <https://en.wikipedia.org/wiki/Simple_Features>. To the knowledge of the >>> > authors, it is consistent with ISO 19125-1 but has not been specified >>> > using the formalisms of OGC or ISO. The language used attempts to hold true >>> > to NetCDF-CF semantics while not conflicting with the existing standards >>> > baseline. While this proposal does not support the entire scope of >>> > the simple features ecosystem, it does support the core data types in >>> > most common use around the community. >>> > >>> > The other existing standard to mention is the UGRID convention >>> > <http://ugrid-conventions.github.io/ugrid-conventions/>. The authors >>> > have experience reading and writing UGRID and have designed the proposed >>> > structure in a way that is inspired by and consistent with it. >>> > >>> > Terms and Definitions >>> > >>> > (Taken from OGC 06-103r4, OpenGIS Implementation Specification for >>> > Geographic information - Simple feature access - Part 1: Common >>> > architecture <http://www.opengeospatial.org/standards/sfa>.)
>>> > >>> > Feature: Abstraction of real world phenomena - typically a geospatial >>> > abstraction with associated descriptive attributes. >>> > Simple Feature: A feature with all geometric attributes described >>> > piecewise by straight line or planar interpolation between point sets. >>> > Geometry (geometric complex): A set of disjoint geometric primitives - >>> > one or more points, lines, or polygons that form the spatial >>> > representation of a feature. >>> > >>> > Introduction >>> > >>> > Discrete Sampling Geometries (DSGs) handle data from one (or a collection >>> > of) timeSeries (point), trajectory, profile, trajectoryProfile, or >>> > timeSeriesProfile geometries. Measurements are from a point (timeSeries >>> > and profile) or points along a trajectory. In this proposal, we reuse the >>> > core DSG timeSeries type, which provides support for basic time series use >>> > cases, e.g., a timeSeries which is measured (or modeled) at a given point. >>> > >>> > Changes to Existing CF Specification >>> > >>> > In NetCDF-CF 1.7, Discrete Sampling Geometries separate dimensions and >>> > variables into two types - instance and element >>> > <http://cfconventions.org/cf-conventions/cf-conventions.html#_collections_instances_and_elements>. >>> > Instance refers to individual points, trajectories, profiles, etc. These >>> > would sometimes be referred to as features, given that they are identified >>> > entities that can have associated attributes and be related to other >>> > entities. Element dimensions describe temporal or other dimensions to >>> > describe data on a per-instance basis.
This proposal extends the DSG >>> > timeSeries featureType >>> > <http://cfconventions.org/cf-conventions/cf-conventions.html#_features_and_feature_types> >>> > such that the geospatial coordinates of the instances can be point, >>> > multi-point, line, multi-line, polygon, or multi-polygon geometries. Rather than overload the DSG contiguous ragged array >>> encoding, designed with time series in mind, a geometry ragged array >>> encoding is introduced in a new section 9.3.5. See this google doc for specific proposed changes: <http://goo.gl/Kq9ASq> >>> > >>> > Motivation >>> > >>> > DSGs have no system to define a geometry (polyline, polygon, etc., other >>> > than point) and an association with a time series that applies over that >>> > entire geometry, e.g., the expected rainfall in this watershed polygon for >>> > some period of time is 10 mm. As suggested in the last paragraph of >>> > section 9.1, current practice is to assign a representative point or just >>> > use an ID and forgo spatial information within a NetCDF-CF file. In order >>> > to satisfy a number of environmental modeling use cases, we need a way to >>> > encode a geometry (point, line, polygon, multi-point, multi-line, or >>> > multi-polygon) that is the static spatial feature representation to which >>> > one or more timeSeries can be associated. In this proposal, we provide an >>> > encoding to define collections of simple feature geometries. It >>> > interfaces cleanly with the existing DSG specification, enabling DSGs and >>> > Simple Geometries to be used concurrently.
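As a rough illustration of the decoding that a count-based geometry encoding implies, here is a hypothetical Python sketch. The data values are invented for the example, and the variable names (`node_count`, `part_node_count`, `part_type`) only mirror the style of the count-based CDL earlier in the thread; none of this is a finalized convention.

```python
# Hypothetical sketch: group a flat node dimension into geometry parts
# using count variables, flagging interior rings (holes) via a part-type
# code. Data and names are illustrative only.
node_count = [9, 8]          # nodes per instance geometry
part_node_count = [5, 4, 8]  # nodes per geometry part
part_type = [0, 1, 0]        # 0 = outer ring, 1 = hole (interior ring)

geometries = []
part = node = 0
for n in node_count:
    parts, used = [], 0
    while used < n:                    # consume parts until this
        size = part_node_count[part]   # instance's node count is met
        parts.append({"is_hole": part_type[part] == 1,
                      "node_range": (node, node + size)})
        node += size
        used += size
        part += 1
    geometries.append(parts)

print(len(geometries))              # 2
print(geometries[0][1]["is_hole"])  # True
```

The first instance decodes to an outer ring plus one hole; the second is a simple polygon. Everything a reader needs follows from the counts, with no break values in the node arrays.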
>>> > >>> > Looking Forward >>> > >>> > This proposal is a compromise solution that attempts to stay consistent with >>> > CF ideals and fit within the structure of the existing specification with >>> > minimal disruption. Line and polygon data types often require variable-length >>> > arrays. Development of this proposal has brought to light the need >>> > for a general abstraction for variable-length arrays in NetCDF-CF. Such a >>> > general abstraction would necessarily be reusable for character arrays, >>> > ragged arrays of time series, and ragged arrays of geometry nodes, as >>> > well as any other ragged data structures that may come up in the future. >>> > This proposal does not introduce such a general ragged array abstraction, >>> > but it does not preclude such a development in the future. >>> > >>> > Three Alternative Approaches >>> > >>> > Respecting the human readability ideal of NetCDF-CF, the development of >>> > this proposal started from a human-readable format for geometries known >>> > as Well-Known Text <https://en.wikipedia.org/wiki/Well-known_text>. We considered three >>> > high-level design approaches while developing this proposal. >>> > >>> > Direct use of Well-Known Text (WKT). In this approach, well-known text >>> > strings would be encoded using character arrays following a contiguous >>> > ragged array approach to index the character array by geometry (or >>> > instance in DSG parlance). >>> > Implement the WKT approach using a NetCDF binary array. In this approach, >>> > well-known text separators (brackets, commas, and spaces) for multipoint, >>> > multiline, multipolygon, and polygon holes would be encoded as break-type >>> > separator values like -1 for multiparts and -2 for holes. >>> > Implement the fundamental dimensions of geometry data in NetCDF.
In this approach, additional dimensions and variables along those dimensions would be introduced to represent geometries, geometry parts, geometry nodes, and unique (potentially shared) coordinate locations for nodes to reference.

Selected Approach

The first approach was seen as too opaque to stay true to the CF ideal of complete self-description. The third approach seemed needlessly verbose and difficult to implement. The second approach was selected for the following reasons:

- The second approach is just as or more human-readable than the third.
- Use of break values keeps geometries relatively atomic.
- It will be familiar to developers who already know the WKT geometry format.
- Character arrays, which are needed for options one and three, are cumbersome to use in some programming languages in common use with NetCDF.
- Break values replace the need for extraneous variables related to multi-part geometries and polygon holes (interiors). Multi-part geometries are generally an exception, and excessive instrumentation to support them should be discounted.

Example: Representation of WKT-Style Polygons in a NetCDF-3 timeSeries featureType

Below is sample CDL demonstrating how polygons are encoded in NetCDF-3 using a contiguous ragged array-like encoding. There are three details to note in the example below.

- The contiguous_ragged_dimension attribute, whose value is the name of a dimension in the file.
- The geom_coordinates attribute, whose value is a space-separated string of variable names.
- The cf_role values geometry_x_node and geometry_y_node.

These three attributes form a system to fully describe collections of multi-polygon feature geometries.
Any variable that has the contiguous_ragged_dimension attribute contains integers that indicate the 0-indexed starting position of each instance's geometry along the dimension it references. Any variable that uses the dimension referenced in the contiguous_ragged_dimension attribute can be interpreted using the values in the variable carrying that attribute. The variables referenced in the geom_coordinates attribute describe the spatial coordinates of geometries. These variables can also be identified by the cf_roles geometry_x_node and geometry_y_node. Note that the example below also includes a mechanism to handle multi-polygon features that contain holes.

netcdf multipolygon_example {
dimensions:
  node = 47 ;
  indices = 55 ;
  instance = 3 ;
  time = 5 ;
  strlen = 5 ;
variables:
  char instance_name(instance, strlen) ;
    instance_name:cf_role = "timeseries_id" ;
  int coordinate_index(indices) ;
    coordinate_index:geom_type = "multipolygon" ;
    coordinate_index:geom_coordinates = "x y" ;
    coordinate_index:multipart_break_value = -1 ;
    coordinate_index:hole_break_value = -2 ;
    coordinate_index:outer_ring_order = "anticlockwise" ;
    coordinate_index:closure_convention = "last_node_equals_first" ;
  int coordinate_index_start(instance) ;
    coordinate_index_start:long_name = "index of first coordinate in each instance geometry" ;
    coordinate_index_start:contiguous_ragged_dimension = "indices" ;
  double x(node) ;
    x:units = "degrees_east" ;
    x:standard_name = "longitude" ; // or projection_x_coordinate
    x:cf_role = "geometry_x_node" ;
  double y(node) ;
    y:units = "degrees_north" ;
    y:standard_name = "latitude"
 ; // or projection_y_coordinate
    y:cf_role = "geometry_y_node" ;
  double someVariable(instance) ;
    someVariable:long_name = "a variable describing a single-valued attribute of a polygon" ;
  int time(time) ;
    time:units = "days since 2000-01-01" ;
  double someData(instance, time) ;
    someData:coordinates = "time x y" ;
    someData:featureType = "timeSeries" ;

// global attributes:
  :Conventions = "CF-1.8" ;

data:

 instance_name =
  "flash",
  "bang",
  "pow" ;

 coordinate_index = 0, 1, 2, 3, 4, -2, 5, 6, 7, 8, -2, 9, 10, 11, 12, -2, 13, 14, 15, 16,
    -1, 17, 18, 19, 20, -1, 21, 22, 23, 24, 25, 26, 27, 28, -1, 29, 30, 31, 32, 33,
    34, -2, 35, 36, 37, 38, 39, 40, 41, 42, -1, 43, 44, 45, 46 ;

 coordinate_index_start = 0, 30, 46 ;

 x = 0, 20, 20, 0, 0, 1, 10, 19, 1, 5, 7, 9, 5, 11, 13, 15, 11, 5, 9, 7,
    5, 11, 15, 13, 11, -40, -20, -45, -40, -20, -10, -10, -30, -45, -20,
    -30, -20, -20, -30, 30, 45, 10, 30, 25, 50, 30, 25 ;

 y = 0, 0, 20, 20, 0, 1, 5, 1, 1, 15, 19, 15, 15, 15, 19, 15, 15, 25, 25, 29,
    25, 25, 25, 29, 25, -40, -45, -30, -40, -35, -30, -10, -5, -20, -35,
    -20, -15, -25, -20, 20, 40, 40, 20, 5, 10, 15, 5 ;

 someVariable = 1, 2, 3 ;

 time = 1, 2, 3, 4, 5 ;

 someData =
  1, 2, 3, 4, 5,
  1, 2, 3, 4, 5,
  1, 2, 3, 4, 5 ;
}

How To Interpret

Starting from the timeSeries variables:

- See CF-1.8 conventions.
- See the timeSeries featureType.
- Find the timeseries_id cf_role.
- Find the coordinates attribute of data variables.
- See that the variables indicated by the coordinates attribute have the cf_roles geometry_x_node and geometry_y_node to determine that these are geometries according to this new specification.
- Find the coordinate index variable whose geom_coordinates attribute points to the nodes.
- Find the variable with a contiguous_ragged_dimension pointing to the dimension of the coordinate index variable to determine how to index into the coordinate index.
- Iterate over polygons, parsing out geometries using the contiguous ragged start variable and the coordinate index variable to interpret the coordinate data variables.

Or, without reference to timeSeries:

1. See CF-1.8 conventions.
2. See the geom_type of multipolygon.
3. Find the variable with a contiguous_ragged_dimension matching the coordinate index variable's dimension.
4. See the geom_coordinates of x y.
5. Using the contiguous ragged start variable found in 3 and the coordinate index variable found in 2, geometries can be parsed out of the coordinate index variable using the hole and multipart break values in it.
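The parsing described in the steps above can be sketched in code. The following is a minimal illustration, not part of the proposal itself: the function names are invented, and it operates on a small synthetic coordinate index (one square with one hole, plus a second part) rather than reading a NetCDF file. It assumes only the break-value convention described above (-1 separates parts, -2 separates an outer ring from the holes that follow it) and a 0-indexed start variable into the coordinate index.

```python
# Break values per the proposal's convention (illustrative constants).
MULTIPART_BREAK = -1  # separates parts of a multi-part geometry
HOLE_BREAK = -2       # separates an outer ring from its holes

def split_instances(coordinate_index, coordinate_index_start):
    """Slice the flat coordinate index into one sub-list per instance."""
    starts = list(coordinate_index_start)
    ends = starts[1:] + [len(coordinate_index)]
    return [coordinate_index[s:e] for s, e in zip(starts, ends)]

def parse_geometry(index_slice):
    """Return a list of parts; each part is (outer_ring, list_of_holes)."""
    parts, ring, outer, holes = [], [], None, []

    def close_ring():
        # The first ring closed within a part is the outer ring;
        # any subsequent rings (after a HOLE_BREAK) are holes.
        nonlocal ring, outer, holes
        if ring:
            if outer is None:
                outer = ring
            else:
                holes.append(ring)
            ring = []

    for value in index_slice:
        if value == MULTIPART_BREAK:
            close_ring()
            parts.append((outer, holes))
            outer, holes = None, []
        elif value == HOLE_BREAK:
            close_ring()
        else:
            ring.append(value)
    close_ring()
    parts.append((outer, holes))
    return parts

# Synthetic single-instance example: outer ring 0-3 with hole 4-6,
# then a second part with ring 7-9.
coordinate_index = [0, 1, 2, 3, -2, 4, 5, 6, -1, 7, 8, 9]
coordinate_index_start = [0]

instances = split_instances(coordinate_index, coordinate_index_start)
parts = parse_geometry(instances[0])
# parts[0] -> ([0, 1, 2, 3], [[4, 5, 6]])
# parts[1] -> ([7, 8, 9], [])
```

The returned node indices would then be used to look up coordinates in the x and y node variables.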
_______________________________________________
CF-metadata mailing list
[email protected]
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
