Tools and CF compliance Was: CF Conventions 1.2 (Jon Blower)

Hi,

   concerning compliance and the CF compliance checker: In my view there are 
two very different ways in which this can/should be interpreted:
(1) check whether (or to what extent) a given file adheres to the convention. 
This assumes that the file does in theory adhere to the convention and can be 
quite useful to detect errors.
(2) support the development of scripts, CMOR tables, etc. which are meant to 
generate CF compliant files. Here, it cannot be expected a priori that the file 
adheres to CF at all (it may even lack the "Conventions" attribute), so the tool 
should produce hints to the developer as to what changes he/she should make. 
While of course most CF attributes are "optional", individuals and projects 
should nevertheless strive to implement a good part of them. Thus, a good 
"compliance test" would go beyond criticizing what is there but wrong and notify 
the user of what is not there but should perhaps be added.

   At present the CF checker operates under principle number 1, but in order to 
promote wider use of CF I would suggest considering some way of moving towards 
variant 2. I believe that most of the testing that is necessary for this is 
embedded in the CF checker anyway, so it would probably be mostly a matter of 
some program logic and generation of verbose messages. Perhaps this can be 
realized on the web interface with a simple check box:
    [ ] suggest further improvements to the file's attribute structure
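
To illustrate the two modes, here is a minimal sketch in Python. It assumes the 
file's attributes have already been read into plain dictionaries. The function 
name check, the suggest flag and the two attribute lists are hypothetical and 
only a small illustrative subset; they are not the CF checker's actual 
interface or rule set.

```python
# Hypothetical, illustrative subset of attributes; not the CF checker's rules.
RECOMMENDED_GLOBAL = ("Conventions", "title", "institution", "source", "history")
RECOMMENDED_VARIABLE = ("units", "standard_name", "long_name")


def check(global_attrs, variables, suggest=False):
    """Return (errors, hints) for a file described by plain dictionaries.

    global_attrs: dict of global attribute name -> value
    variables:    dict of variable name -> dict of attribute name -> value
    suggest:      if True, also report recommended attributes that are absent
    """
    errors, hints = [], []

    # Principle 1: criticize what is there but wrong.
    conventions = global_attrs.get("Conventions")
    if conventions is not None and not str(conventions).startswith("CF-"):
        errors.append(f"Conventions = {conventions!r} does not name a CF version")

    # Principle 2: notify the user of what is not there but could be added.
    if suggest:
        for name in RECOMMENDED_GLOBAL:
            if name not in global_attrs:
                hints.append(f"global attribute {name!r} is missing")
        for var, attrs in variables.items():
            for name in RECOMMENDED_VARIABLE:
                if name not in attrs:
                    hints.append(f"variable {var!r}: attribute {name!r} is missing")
    return errors, hints
```

With suggest=False the function behaves like variant 1 and stays silent about 
absent attributes; ticking the check box would correspond to suggest=True.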

Best regards,

Martin

< Dr. Martin G. Schultz, ICG-2, Forschungszentrum Jülich  >
< D-52425 Jülich, Germany                                 >
< ph: +49 (0)2461 61 2831, fax: +49 (0)2461 61 8131       >
< email: [EMAIL PROTECTED]                          >
< web: http://www.fz-juelich.de/icg/icg-2/m_schultz       >


----------------------------------------------------------------------

Message: 1
Date: Thu, 08 May 2008 11:53:44 +0100
From: Philip Bentley <[EMAIL PROTECTED]>
Subject: Re: [CF-metadata] CF Conventions 1.2
To: [email protected]
Message-ID:
        <[EMAIL PROTECTED]>
Content-Type: text/plain; charset="us-ascii"

Hi Jonathan, Ethan,


> Dear Ethan
> 
> I think that the current rules are a good compromise between the needs 
> of people who write and analyse data, and the needs of developers of 
> analysis and other software. The former group of people would like CF 
> to be modified fairly rapidly, when they are about to start producing 
> data from a project, and they want that data to have proper metadata. 
> As you will have seen from previous discussions, our discussions are 
> too slow as it is sometimes. Hence we decided the rules so that changes could 
> be made, but marked as provisional.

Indeed. I think the 4-plus years between CF 1.0 and CF 1.1 - according to the 
date stamps on the documents - says it all. Perhaps the recent flurry of CF 
proposal activity in part reflects a general desire to 'play catch-up'.

> 
> For provisional changes to become permanent depends on at least two 
> applications supporting them. That requires some development effort to 
> be invested. CF doesn't have staff resources of its own to commit to 
> it. I think the most likely applications to make changes first are the 
> cf-checker and libcf. It will be interesting to see how long it takes 
> for the changes so far agreed to be implemented in these or other 
> applications.
> 
> I fear that if we followed this approach:
> 
> > 2) Don't add changes to the upcoming version of the specification 
> > document "until at least two applications have successfully 
> > interpreted the test data".
> 
> development of CF would effectively be halted altogether. It would be 
> impossible for writers of data to agree changes to the CF standard on 
> a short enough timescale. Consequently they would bypass CF, and write 
> and analyse data with their own metadata conventions, and the 
> usefulness of CF in providing a common standard would be undermined.

I agree 100% with this. If, as a community, we set the barrier to progress [of 
the CF conventions] too high then people will necessarily devise local, 
incompatible solutions - not out of willfulness, but simply to meet project 
deadlines.

> 
> Applications don't have to keep entirely up to date, do they? I think 
> the value of the Conventions attribute should be that it is easy to be 
> clear about what conventions are being implemented in data and metadata.
> 
> I agree about the test data. We should construct a file which contains 
> some test data for the changes of CF 1.2. (The changes of CF 1.1 did 
> not introduce any new attribute.) We'll need a place to deposit such files.
> As moderator of that ticket, I'll discuss it with Phil and Velimir.

I can produce some simple test files for the changes at CF 1.2. But the 
question of what constitutes application conformance is, I suggest, not easily 
defined. For instance, I could create a noddy netcdf file with two new grid 
mapping attributes, as follows:

float temperature(t,z,lat,lon);
    :grid_mapping = "crs";
char crs;
    :grid_mapping_name = "latitude_longitude";
    :semi_major_axis = "92389234"; // new at CF 1.2
    :semi_minor_axis = "78682347"; // new at CF 1.2

And I could read this file today using, say, ncdump and ncview. Which clearly 
doesn't tell us much. Yet a proposer of a given CF change cannot force the 
hands of software developers to produce compliant software within a particular 
time frame, if at all. In some (many?) circumstances I think we have to take it 
as an act of faith that a particular update to the CF convention will be 
advantageous. Plus I believe that the robustness of the CF peer review and 
challenge mechanisms is sufficient to ensure that those updates will be 
advantageous.

Regards,
Phil

------------------------------

Message: 2
Date: Thu, 8 May 2008 12:04:14 +0100
From: Jonathan Gregory <[EMAIL PROTECTED]>
Subject: [CF-metadata]  CF Conventions 1.2
To: [email protected]
Message-ID: <[EMAIL PROTECTED]>
Content-Type: text/plain; charset=us-ascii

Dear Phil

>    Perhaps the recent
>    flurry of CF proposal activity in part reflects a general desire to
>    'play catch-up'.

Yes, I think that is the case. It certainly is the case for the two proposals I 
have made, on the axis and cell_methods attributes. These were discussed on the 
email list and in abeyance for a long time because we had no way to adopt them 
formally until we agreed the new rules.

>    I can produce some simple test files for the changes at CF 1.2. But
>    the question of what constitutes application conformance is, I
>    suggest, not easily defined. For instance, I could create a noddy
>    netcdf file with two new grid mapping attributes, as follows:

Yes, I think such a file would be useful, because it does at least provide 
input data that the cf-checker can check for conformance, and other 
applications could likewise check that they can read in and interpret, if they 
are interested in these features. I agree with you that what "compliance"
actually means for an application is ill-defined. This is an issue which has 
come up before, of course. Since most of CF is optional, in one sense (but not 
a very useful sense) an application is compliant even if it ignores all that 
optional metadata. On the other hand I am sure no application currently exists 
which interprets all the metadata. But I don't think that means the metadata is 
not useful. It can still be read by humans, it describes the data properly, and 
we only add features when people have a need for them (usually people who 
intend to produce data).

Best wishes

Jonathan


------------------------------

Message: 3
Date: Thu, 8 May 2008 13:23:02 +0100
From: "Jon Blower" <[EMAIL PROTECTED]>
Subject: [CF-metadata] Tools and CF compliance Was: CF Conventions 1.2
To: "Philip Bentley" <[EMAIL PROTECTED]>
Cc: [email protected]
Message-ID:
        <[EMAIL PROTECTED]>
Content-Type: text/plain; charset=ISO-8859-1

Hi Philip and list,

(I've started a new thread as this is probably a new topic for discussion.)

>  And I could read this file today using, say, ncdump and ncview. Which 
> clearly doesn't tell us much.

This is a really important point.  It would be very difficult, in the general 
case, to ascertain whether a certain piece of software actually interprets a 
certain CF attribute correctly.  Conversely it is perhaps unreasonable to 
expect a piece of software to implement correctly every feature of a certain CF 
version.

What a tool user really wants to know (I think) is, for a given NetCDF file, 
which attributes in the file are correctly interpreted by the tool.  I can't 
think of a neat way to do this - perhaps tool developers could publish a list 
of attributes that they claim to be able to interpret for each version of the 
tool they produce?  A given tool might then implement 100% of CF1.0 but 50% of 
CF1.2 for example.
Then the CF community could maintain a list of tools that users could go to to 
find out which tools might be most suited to their purpose.

An add-on to the CF compliance checker could be created that, having scanned a 
file for CF attributes, produces a list that says "Tool X understands all of 
the attributes in this file, but Tool Y only understands 7 out of 9".
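
A minimal sketch of such a report, in Python, might look as follows. It assumes 
the attributes found in the file and the per-tool lists of supported attributes 
are already available as sets; the function name coverage_report and the tool 
names are hypothetical, and no such published lists exist yet.

```python
def coverage_report(file_attrs, tool_support):
    """Compare the CF attributes found in a file with each tool's claims.

    file_attrs:   set of attribute names found in the file
    tool_support: dict of tool name -> set of attribute names the tool's
                  developers claim it interprets (hypothetical published lists)
    """
    lines = []
    total = len(file_attrs)
    for tool, supported in sorted(tool_support.items()):
        understood = file_attrs & supported
        if len(understood) == total:
            lines.append(f"{tool} understands all of the attributes in this file")
        else:
            missing = ", ".join(sorted(file_attrs - supported))
            lines.append(
                f"{tool} only understands {len(understood)} out of {total} "
                f"(not interpreted: {missing})"
            )
    return lines
```

Note that this only compares claims: it says nothing about whether a tool 
interprets an attribute correctly, which remains the hard part.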

All this requires effort of course, but I think it's useful to consider what we 
really mean when we call for "CF compliance".  How can we help users to judge 
which tools they should use and how can we help data providers to ensure that 
their data can be interpreted by a wide community?

Jon




--
--------------------------------------------------------------
Dr Jon Blower               Tel: +44 118 378 5213 (direct line)
Technical Director          Tel: +44 118 378 8741 (ESSC)
Reading e-Science Centre    Fax: +44 118 378 6413
ESSC                        Email: [EMAIL PROTECTED]
University of Reading
3 Earley Gate
Reading RG6 6AL, UK
--------------------------------------------------------------


------------------------------

Message: 4
Date: Thu, 08 May 2008 09:23:10 -0600
From: John Caron <[EMAIL PROTECTED]>
Subject: Re: [CF-metadata] CF Conventions 1.2
Cc: [email protected]
Message-ID: <[EMAIL PROTECTED]>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed



Jonathan Gregory wrote:
> Dear Phil
> 
>>    Perhaps the recent
>>    flurry of CF proposal activity in part reflects a general desire to
>>    'play catch-up'.
> 
> Yes, I think that is the case. It certainly is the case for the two proposals
> I have made, on the axis and cell_methods attributes. These were discussed on
> the email list and in abeyance for a long time because we had no way to adopt
> them formally until we agreed the new rules.
> 
>>    I can produce some simple test files for the changes at CF 1.2. But
>>    the question of what constitutes application conformance is, I
>>    suggest, not easily defined. For instance, I could create a noddy
>>    netcdf file with two new grid mapping attributes, as follows:
> 
> Yes, I think such a file would be useful, because it does at least provide
> input data that the cf-checker can check for conformance, and other
> applications could likewise check that they can read in and interpret, if they
> are interested in these features. I agree with you that what "compliance"
> actually means for an application is ill-defined. This is an issue which has
> come up before, of course. Since most of CF is optional, in one sense (but not
> a very useful sense) an application is compliant even if it ignores all that
> optional metadata. On the other hand I am sure no application currently exists
> which interprets all the metadata. But I don't think that means the metadata is
> not useful. It can still be read by humans, it describes the data properly,
> and we only add features when people have a need for them (usually people who
> intend to produce data).

As a tool developer, a real netcdf file that has the new feature(s) in it is 
extremely useful. In fact I don't even try to implement a feature until I have 
a real example of it.

So on a practical level, requiring an example netcdf file before the final 
acceptance of a feature seems to me to be reasonable. Proving that software 
"correctly conforms" is difficult in the general case.

We have a repository at Unidata of sample CF files, but they don't document 
which features they use. It would be very useful to start that documentation, 
and tie it back to CF section numbers or anchors.

I propose we start a repository of sample files, ideally on the CF site, 
documented as to what CF features they use. It would be good if that 
documentation is a wiki (or equivalent), so that the initial person can make a 
start, then others can augment and comment on it.
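
As a rough sketch of what such documentation could look like in 
machine-readable form, here is a small Python example. The file names, the 
SAMPLE_INDEX structure and both function names are hypothetical placeholders; 
the feature labels could equally be CF section numbers or anchors.

```python
# Hypothetical index: each sample file lists the CF features it exercises.
SAMPLE_INDEX = [
    {"file": "sample_a.nc", "features": {"grid_mapping"}},
    {"file": "sample_b.nc", "features": {"cell_methods"}},
    {"file": "sample_c.nc", "features": {"grid_mapping", "cell_methods"}},
]


def files_using(index, feature):
    """Return the sample files whose documented features include `feature`."""
    return sorted(entry["file"] for entry in index if feature in entry["features"])


def features_without_samples(index, features):
    """Return the features from `features` that no sample file exercises yet."""
    covered = set().union(*(entry["features"] for entry in index))
    return sorted(set(features) - covered)
```

A tool developer could then ask for every file that uses a given feature, and 
the list could be checked for accepted features that still lack an example 
file.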


------------------------------

Message: 5
Date: Thu, 08 May 2008 10:07:23 -0600
From: Ethan Davis <[EMAIL PROTECTED]>
Subject: Re: [CF-metadata] CF Conventions 1.2
To: John Caron <[EMAIL PROTECTED]>
Cc: [email protected]
Message-ID: <[EMAIL PROTECTED]>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

John Caron wrote:
> We have a repository at Unidata of sample CF files, but they don't document 
> which features they use. It would be very useful to start that documentation, 
> and tie it back to CF section numbers or anchors.
>
> I propose we start a repository of sample files, ideally on the CF site, 
> documented as to what CF features they use. It would be good if that 
> documentation is a wiki (or equivalent), so that the initial person can make a 
> start, then others can augment and comment on it.

I think it would be useful for each approved change to have a document 
detailing the changes to be made to the CF spec. This document could be 
referenced by the test data documents. As it stands, we have the trac 
ticket discussions, which can be very voluminous and, even after approval, 
aren't always clear about exactly what changes are to be made to the 
specification.

Perhaps the closing comment to the trac ticket should be a detailed 
change request. Though I think a separate (wiki?) document would be more 
useful (both during the trac ticket discussion and afterwards for 
documentation purposes).

Ethan

-- 
Ethan R. Davis                                Telephone: (303) 497-8155
Software Engineer                             Fax:       (303) 497-8690
UCAR Unidata Program Center                   E-mail:    [EMAIL PROTECTED]
P.O. Box 3000
Boulder, CO  80307-3000                       http://www.unidata.ucar.edu/
---------------------------------------------------------------------------




------------------------------

_______________________________________________
CF-metadata mailing list
[email protected]
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata


End of CF-metadata Digest, Vol 62, Issue 2
******************************************




