Jonathan and Karl,

I don't disagree with the GCM examples or the point about limited model 
resolution being made here; I just don't think that is the only argument.

There are cases, like downscaled climate projections, reanalysis products, or 
radar-indicated rainfall, where a datum is critical to interpreting the data 
accurately. My concern is that the justified disregard for datums at coarse 
resolution (and the lack of a requirement for them in the spec) fosters a lack 
of awareness that datums become critically important at finer scales. 
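
To put a rough number on this, here is a minimal sketch (not taken from any of 
the datasets discussed) of one such mismatch: one application reads a latitude 
as geodetic on the WGS84 ellipsoid, while another reads the same number as 
geocentric, i.e. on a sphere. The WGS84 flattening is the standard value; the 
6371 km mean Earth radius used to turn the angle into a distance, and the 
function names, are assumptions for illustration only.

```python
import math

# WGS84 flattening and first eccentricity squared (standard values).
WGS84_F = 1 / 298.257223563
WGS84_E2 = WGS84_F * (2 - WGS84_F)

# Assumption: a mean Earth radius, used only to convert an angle to kilometres.
MEAN_EARTH_RADIUS_KM = 6371.0


def geocentric_from_geodetic(lat_deg):
    """Geocentric latitude (degrees) of a surface point with geodetic latitude lat_deg."""
    lat = math.radians(lat_deg)
    return math.degrees(math.atan((1 - WGS84_E2) * math.tan(lat)))


def latitude_offset_km(lat_deg):
    """Approximate north-south shift (km) if a geodetic latitude is read as geocentric."""
    diff_deg = lat_deg - geocentric_from_geodetic(lat_deg)
    return math.radians(diff_deg) * MEAN_EARTH_RADIUS_KM


if __name__ == "__main__":
    for lat in (0.0, 15.0, 30.0, 45.0, 60.0):
        print(f"{lat:5.1f} deg -> {latitude_offset_km(lat):6.2f} km")
```

At mid-latitudes the offset comes out at roughly 21 km, which is larger than a 
1/8 degree (about 12 km) cell. Shifts between two ellipsoidal datums are 
typically much smaller than this sphere-versus-ellipsoid case.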

Jonathan, you have a point that an analyst should be free to make their own 
assumptions. However, I would prefer that they know they are making 
assumptions. It would be helpful if the CF specification gave some 
acknowledgement to this issue rather than treating lat/lon coordinates as 
base-equivalent coordinates.

Are there any arguments against CF recommending a standard datum assumption 
when intersecting data without a datum specified with data that does have a 
datum specified?
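
As a minimal sketch of what such a recommendation could look like in client 
code, the function below returns the datum named by a data variable's 
grid_mapping attribute and, if there is none, falls back to a default while 
warning the user that an assumption is being made. Variable attributes are 
represented as plain dictionaries; the function name, the choice of the WGS84 
ellipsoid as the fallback, and the warning text are all hypothetical, since CF 
does not currently define a default datum.

```python
import warnings

# Hypothetical fallback: CF itself does not define a default datum.
ASSUMED_DEFAULT_DATUM = {
    "grid_mapping_name": "latitude_longitude",
    "semi_major_axis": 6378137.0,
    "inverse_flattening": 298.257223563,
}


def resolve_datum(data_attrs, grid_mapping_vars):
    """Return (datum_attrs, was_assumed) for one data variable.

    data_attrs        -- attributes of the data variable, as a dict
    grid_mapping_vars -- dict mapping variable names to their attribute dicts
    """
    name = data_attrs.get("grid_mapping")
    if name is not None and name in grid_mapping_vars:
        # The producer stated the datum; no assumption is needed.
        return grid_mapping_vars[name], False

    # No usable grid_mapping: fall back, but make the assumption visible.
    warnings.warn(
        "No grid_mapping found; assuming the default datum (WGS84 ellipsoid).",
        UserWarning,
    )
    return dict(ASSUMED_DEFAULT_DATUM), True
```

The point of returning the was_assumed flag is that downstream code can carry 
it along with the data, so the assumption is not silently lost.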

Cheers,

Dave 

On Jul 27, 2011, at 12:14 PM, Karl Taylor wrote:

> Hi all,
> 
> (I think the horse still shows a few signs of life)
> 
> what I'm arguing is that if the results the scientists are using come from a 
> GCM, then although the two scientists got differences, those differences (no 
> matter how large) should not be considered significant in terms of their 
> reliability (as opposed to in some statistical sense).  Just as one wouldn't 
> rely on a climate model to predict global mean temperature to within 0.001 K 
> (and surely it would be silly and perhaps misleading to report temperatures 
> to this precision) one shouldn't expect to pin down the *location* of the 
> grid cell temperature reported by a GCM to a point given with higher 
> precision than the spacing of the grid cells (at least when comparing with 
> observations).
> 
> I will grant you that at some time far in the future, it is possible our 
> models' resolution and accuracy will have improved to the point that we might 
> have to alter the precision with which we report the locations of their 
> output values, but we're not there yet.
> 
> Best regards,
> Karl
> 
> 
> 
> On 7/27/11 9:23 AM, David Blodgett wrote:
>> 
>> Not to beat a dead horse, but this issue has been a huge stumbling block in 
>> our work to integrate data and software across the climate and geographic 
>> communities.
>> 
>> The argument here is: Since CF data is usually so coarse and low precision, 
>> complete geolocation metadata should not be required. 
>> 
>> An example of why this matters: Two scientists take the same downscaled 
>> climate data that doesn't have a datum specification and import it into 
>> their application. One application assumes one datum, the other assumes 
>> another datum. Scientist 1's results differ from scientist 2's results. In 
>> situations where there are steep gradients in downscaled data, these 
>> differences may be substantial. 
>> 
>> One solution would be to adopt a default datum for data lacking datum 
>> definition. So, given a file that uses lat/lon and claims to follow CF spec, 
>> a scientist could follow the spec's guidance on what datum to assume. Without 
>> this type of guidance or a requirement to include the information, lat/lon 
>> without a datum amounts to providing any other value without units.
>> 
>> Dave
>> 
>> On Jul 27, 2011, at 10:59 AM, Karl Taylor wrote:
>> 
>>> Dear all,
>>> 
>>> another view:
>>> 
>>> Can't remember *all* the issues here, but certainly reporting the latitude 
>>> and longitude points for GCM grids without further precision (e.g., 
>>> information on the figure of the Earth) is sufficient for any comparison 
>>> with observations.  Only certain (usually prescribed) conditions at the 
>>> earth's surface (e.g., surface height) coming from a GCM should be trusted 
>>> at the individual grid point scale, and no sub-grid scale information is 
>>> directly available from the GCM (normally).  So, even if a station is 
>>> near the boundary of a GCM's grid cell, it should hardly matter which of 
>>> the neighboring grid cells you compare it to.  The GCM sort of gives you a 
>>> grid cell average value that applies to some region in the vicinity of the 
>>> cell.  So, it doesn't matter where you think it is precisely located.
>>> 
>>> Down-scaled output from the GCM will be at higher resolution, but again 
>>> the original data doesn't apply at a point but to a general region 
>>> (usually quite a bit larger than 12 km, and even if it weren't we wouldn't 
>>> believe stuff going on at that scale), so where the cell is exactly located 
>>> again doesn't matter.
>>> 
>>> best regards,
>>> Karl
>>> 
>>> 
>>> On 7/27/11 4:38 AM, David Blodgett wrote:
>>>> 
>>>>> Without the grid_mapping, the lat and lon still make sense in the common
>>>>> case (and original CF case) of GCM data, and in many other cases, the
>>>>> intended usage of the data does not require precision about the figure of
>>>>> the Earth. Although this metadata could be valuable if it can be defined,
>>>>> I think it would be too onerous to require it.
>>>> 
>>>> I hope to present on this very issue at AGU. The problem we see with 
>>>> ambiguous definition of datums is a cascade of non-recognition of datums 
>>>> through processing algorithms and in the output of some processes that 
>>>> generate very detailed data. 
>>>> 
>>>> The prime example is downscaled climate data. Because the climate modelers 
>>>> involved generally consider lat/lon to be a lowest common denominator, the 
>>>> datum used to geolocate historical data (like rain gages) is neglected. 
>>>> What results is, in our case, a 1/8deg (12km) grid with no datum. This is 
>>>> unacceptable because, at this resolution, a wrong assumption of datum for 
>>>> the grid can cause very substantial (a full grid cell or more) 
>>>> geolocation errors.
>>>> 
>>>> If the CF community intends to consume any ground based data, then datums 
>>>> must be preserved from ingest of ground based forcing throughout data 
>>>> storage and processing. This is fundamental information that is required 
>>>> for ALL data comparison operations.
>>>> 
>>>> I would argue that CF compliance should require this information. This 
>>>> puts the requirement to make metadata assumptions on data 
>>>> publishers/producers rather than data consumers. It is unacceptable to 
>>>> have different data consumers making different assumptions of geolocation 
>>>> on the same data. 
>>>> 
>>>> Off soapbox.
>>>> 
>>>> Dave Blodgett
>>>> Center for Integrated Data Analytics (CIDA)
>>>> USGS WI Water Science Center
>>>> 8505 Research Way Middleton WI 53562
>>>> 608-821-3899 | 608-628-5855 (cell)
>>>> http://cida.usgs.gov
>>>> 
>>>> 
>>>> On Jul 26, 2011, at 5:24 AM, Jonathan Gregory wrote:
>>>> 
>>>>> Dear all
>>>>> 
>>>>> For datasets which are intended for analysis by end-users I think it would
>>>>> be undesirable to remove the requirement of providing explicit lat and lon
>>>>> coords even if a grid_mapping is provided. I think it is unrealistic to
>>>>> expect all software which someone might use to analyse netCDF files to be
>>>>> able to recognise and act upon all possible values of the CF grid_mapping
>>>>> attribute, and without the lat and lon information the user would have a
>>>>> problem. If the issue is storage space in the file I think the much better
>>>>> choice is to store the explicit coordinates in another file, by extending
>>>>> the CF convention to allow datasets to be distributed over several linked
>>>>> files, as gridspec does for example.
>>>>> 
>>>>> Steve appears to suggest that grid_mapping is required in some
>>>>> circumstances, but I don't think it is at present. However, the text Steve
>>>>> quotes may not be quite right:
>>>>> 
>>>>>   "/When the coordinate variables for a horizontal grid are not
>>>>>   longitude and latitude,*_it is required that the true latitude and
>>>>>   longitude coordinates be supplied_* via the coordinates attribute/."
>>>>> 
>>>>> The text should make it clear that this requirement applies when the data
>>>>> has a geolocated horizontal grid. It doesn't necessarily apply to idealised
>>>>> cases. We could clarify this with a defect ticket.
>>>>> 
>>>>> Cheers
>>>>> 
>>>>> Jonathan
>>>>> _______________________________________________
>>>>> CF-metadata mailing list
>>>>> [email protected]
>>>>> http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
>>>> 
>> 
