Re: [CF-metadata] the need to store lat/lon coordinates in a CF metadata compliant netCDF file

Upendra Dadi Thu, 28 Jul 2011 08:34:22 -0700

I have heard this argument before: a datum need not be provided for datasets which are too imprecise. The usual assumption: "Choosing a wrong datum doesn't change anything. So, let's not specify the datum even though I have it. Let's leave it to the user to choose the datum, even though I know the datum." What is often ignored is that the user is made to do extra work here. She/he has to choose a datum when she/he shouldn't have to. Even if the datum does not matter (which I don't agree with), the user's time and effort do matter. Software usually needs this information to function. Using flags and warnings in the software is all good if the datum is not there. But why should the user have to read them when he doesn't have to (if the datum is specified)? Data providers have an additional responsibility here.


Upendra



On 7/27/2011 3:17 PM, V. Balaji wrote:

This seems to have become a discussion specifically about the datum.

This issue was discussed at length at the GO-ESSP meeting in Asheville.
It seems reasonable to conclude on the basis of those discussions, and
the current ones about GCMs, that some datasets are simply too
imprecise for the datum to matter, while for others it might be
critical.

Perhaps the best approach still is for the datum to remain optional,
and it be left to the end user or application to decide what to do when
they are operating on two datasets where one has a datum specified
and the other doesn't; or when both do but they are different. A
quick implementation might be to emit a warning and keep going:-).
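
A minimal sketch of such a warn-and-continue check, in Python. The function name check_datums, the DatumWarning class, and the use of plain strings (or None) for datum names are assumptions made for illustration; none of them come from CF or from any existing library.

```python
import warnings
from typing import Optional


class DatumWarning(UserWarning):
    """Warning category for missing or mismatched datum metadata (hypothetical)."""


def check_datums(datum_a: Optional[str], datum_b: Optional[str]) -> bool:
    """Return True only if both datasets name the same datum.

    In every other case a DatumWarning is emitted and processing is
    allowed to continue; the caller decides what to do with the result.
    """
    if datum_a is None and datum_b is None:
        warnings.warn("neither dataset specifies a datum", DatumWarning)
        return False
    if datum_a is None or datum_b is None:
        known = datum_a if datum_a is not None else datum_b
        warnings.warn(
            f"only one dataset specifies a datum ({known}); "
            "the other is being treated as if it matched",
            DatumWarning,
        )
        return False
    if datum_a != datum_b:
        warnings.warn(
            f"datasets use different datums: {datum_a} vs {datum_b}",
            DatumWarning,
        )
        return False
    return True
```

Calling check_datums("WGS84", None) would emit one warning and return False, leaving it to the application whether to proceed.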

This may go some way to meeting David's requirement that end users "know
they are making assumptions", but I don't see an alternative to allowing
datasets without a datum to treat coordinates as having whatever
reference datum the end user would like it to have. It would make no
sense to require GCM output to mention a datum at all. We have been
known to run coupled models where different component models use
different values for the radius of a spherical Earth:-).

David Blodgett writes:
Jonathan and Karl,
I don't disagree with the GCM examples or lack of model resolution being talked about here, I just don't think that is the only argument.
There are cases, like downscaled climate projections, reanalysis products, or radar indicated rainfall, where a datum is critical to interpreting the data accurately. My concern is that the justified disregard for datums at coarse resolution (and lack of requirement for it in the spec) fosters a lack of awareness that datums become critically important at finer scales.
Jonathan, you have a point that an analyst should be free to make their own assumptions. However, I would prefer that they know they are making assumptions. It would be helpful if the CF specification paid some tribute to this issue rather than treating lat/lon coordinates as base-equivalent coordinates.
Are there any arguments against CF recommending a standard datum assumption when intersecting data without a datum specified with data that does have a datum specified?
Cheers,

Dave

On Jul 27, 2011, at 12:14 PM, Karl Taylor wrote:
Hi all,

(I think the horse still shows a few signs of life)
what I'm arguing is that if the results the scientists are using come from a GCM, then although the two scientists got differences, those differences (no matter how large) should not be considered significant in terms of their reliability (as opposed to in some statistical sense). Just as one wouldn't rely on a climate model to predict global mean temperature to within 0.001 K (and surely it would be silly and perhaps misleading to report temperatures to this precision), one shouldn't expect to pin down the *location* of the grid cell temperature reported by a GCM to a point given with higher precision than the spacing of the grid cells (at least when comparing with observations).
I will grant you that at some time far in the future, it is possible our models' resolution and accuracy will have improved to the point that we might have to alter the precision with which we report the locations of their output values, but we're not there yet.
Best regards,
Karl



On 7/27/11 9:23 AM, David Blodgett wrote:
Not to beat a dead horse, but this issue has been a huge stumbling block in our work to integrate data and software across the climate and geographic communities.
The argument here is: since CF data is usually so coarse and low precision, complete geolocation metadata should not be required.
An example of why this matters: Two scientists take the same downscaled climate data that doesn't have a datum specification and import it into their applications. One application assumes one datum, the other assumes another datum. Scientist 1's results differ from scientist 2's results. In situations where there are steep gradients in downscaled data, these differences may be substantial.
One solution would be to adopt a default datum for data lacking a datum definition. So, given a file that uses lat/lon and claims to follow the CF spec, a scientist could follow the spec's guidance on what datum to assume. Without this type of guidance or a requirement to include the information, lat/lon without a datum amounts to providing any other value without units.
Dave

On Jul 27, 2011, at 10:59 AM, Karl Taylor wrote:
Dear all,

another view:
Can't remember *all* the issues here, but certainly reporting the latitude and longitude points for GCM grids without further precision (e.g., information on the figure of the Earth) is sufficient for any comparison with observations. Only certain (usually prescribed) conditions at the earth's surface (e.g., surface height) coming from a GCM should be trusted at the individual grid point scale, and no sub-grid scale information is directly available from the GCM (normally). So, even if station data is near the boundary of a GCM's grid cell, it should hardly matter which of the grid cells it straddles you compare it to. The GCM sort of gives you a grid cell average value that applies to some region in the vicinity of the cell. So, it doesn't matter where you think it is precisely located.
Down-scaled output from the GCM will be at higher resolution, but again, since the original data doesn't apply at a point but to a general region (usually quite a bit larger than 12 km, and even if it weren't we wouldn't believe stuff going on at that scale), where the cell is exactly located again doesn't matter.
best regards,
Karl


On 7/27/11 4:38 AM, David Blodgett wrote:
Without the grid_mapping, the lat and lon still make sense in the common case (and original CF case) of GCM data, and in many other cases, the intended usage of the data does not require precision about the figure of the Earth. Although this metadata could be valuable if it can be defined, I think it would be too onerous to require it.
I hope to present on this very issue at AGU. The problem we see with ambiguous definition of datums is a cascade of non-recognition of datums through processing algorithms and in the output of some processes that generate very detailed data.
The prime example is downscaled climate data. Because the climate modelers involved generally consider lat/lon to be a lowest common denominator, the datum used to geolocate historical data (like rain gages) is neglected. What results is, in our case, a 1/8 deg (12 km) grid with no datum. This is unacceptable, because at this resolution a wrong assumption of datum for the grid can cause very substantial (a full grid cell or more) geolocation errors.
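
One way to get a feel for the size of such an error is sketched below, under a specific assumption that the thread itself does not state: that the "wrong datum" means treating geodetic latitudes on the WGS84 ellipsoid as if they were geocentric (spherical-Earth) latitudes. The function names and the 6371 km mean radius used to convert the angle to a distance are illustrative choices, and other datum mismatches (for example between two ellipsoidal datums) would give much smaller offsets.

```python
import math

# WGS84 flattening (defining constant of the ellipsoid).
WGS84_F = 1 / 298.257223563
# Approximate mean Earth radius, used only to turn an angle into a distance.
MEAN_EARTH_RADIUS_KM = 6371.0


def geocentric_latitude_deg(geodetic_lat_deg: float, f: float = WGS84_F) -> float:
    """Geocentric latitude of a point on the ellipsoid surface.

    Uses tan(psi) = (1 - e^2) * tan(phi), with e^2 = f * (2 - f).
    """
    e2 = f * (2 - f)
    phi = math.radians(geodetic_lat_deg)
    return math.degrees(math.atan((1 - e2) * math.tan(phi)))


def latitude_offset_km(geodetic_lat_deg: float) -> float:
    """North-south offset caused by confusing the two latitude definitions."""
    diff_deg = geodetic_lat_deg - geocentric_latitude_deg(geodetic_lat_deg)
    return math.radians(diff_deg) * MEAN_EARTH_RADIUS_KM


offset_km = latitude_offset_km(45.0)
print(f"offset at 45N: {offset_km:.1f} km")
```

At mid-latitudes this offset is on the order of 20 km, which is larger than a 1/8 deg grid cell; at the equator and the poles it vanishes.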
If the CF community intends to consume any ground based data, then datums must be preserved from ingest of ground based forcing throughout data storage and processing. This is fundamental information that is required for ALL data comparison operations.
I would argue that CF compliance should require this information. This puts the requirement to make metadata assumptions on data publishers/producers rather than data consumers. It is unacceptable to have different data consumers making different assumptions of geolocation on the same data.
Off soapbox.

Dave Blodgett
Center for Integrated Data Analytics (CIDA)
USGS WI Water Science Center
8505 Research Way Middleton WI 53562
608-821-3899 | 608-628-5855 (cell)
http://cida.usgs.gov


On Jul 26, 2011, at 5:24 AM, Jonathan Gregory wrote:
Dear all
For datasets which are intended for analysis by end-users I think it would be undesirable to remove the requirement of providing explicit lat and lon coords even if a grid_mapping is provided. I think it is unrealistic to expect all software which someone might use to analyse netCDF files to be able to recognise and act upon all possible values of the CF grid_mapping attribute, and without the lat and lon information the user would have a problem. If the issue is storage space in the file I think the much better choice is to store the explicit coordinates in another file, by extending the CF convention to allow datasets to be distributed over several linked files, as gridspec does for example.
Steve appears to suggest that grid_mapping is required in some circumstances, but I don't think it is at present. However, the text Steve quotes may not be quite right:

  "/When the coordinate variables for a horizontal grid are not
longitude and latitude, *_it is required that the true latitude and longitude coordinates be supplied_* via the coordinates attribute/."
The text should make it clear that this requirement applies when the data has a geolocated horizontal grid. It doesn't necessarily apply to idealised cases.
We could clarify this with a defect ticket.

Cheers

Jonathan
_______________________________________________
CF-metadata mailing list
[email protected]
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
