Re: [CF-metadata] [cf-convention/cf-conventions] Proposal: Add QARTOD quality flag names to standard name list (#216)

Jessica Austin Mon, 03 Feb 2020 18:35:32 -0800

Thank you Nan and Alison for your comments and suggestions. 

I think we're all still in agreement on the standard name list, including 
Alison's suggestions for additional text to add to the non-aggregate flag 
names. The wording and scope of the aggregate flag still need more 
clarification, though.


**After some internal discussion, here's our proposed revision for the 
`aggregate_quality_flag` definition:**

Original: This flag is a summary of all quality tests run for another data 
variable, and is set to the highest-level (worst case) flag found. The linkage 
between the data variable and this variable is achieved using the 
ancillary_variables attribute

**Proposed (differences highlighted):** This flag is a summary of all 
*relevant* quality tests run for *the related ancillary parent data variable*, 
and is set to the highest-level (worst case) flag found. The linkage between 
the data variable and this variable is achieved using the ancillary_variables 
attribute.  *The aggregate quality flag represents the summary of all quality 
tests performed on the data variable, whether automated or manual, and whether 
present in the dataset as independent ancillary variables to the parent data 
variable or not.*
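
As a rough illustration of the "highest-level (worst case)" rule, here is a minimal Python sketch. It assumes the QARTOD-style numeric values used in the example further down (1, 2, 3, 4, 9) and assumes that "worst case" simply means the numeric maximum at each data point. The function name `aggregate_flags` and the sample `spike` flags are hypothetical and not part of the proposed definition.

```python
def aggregate_flags(*test_flags):
    """Return the element-wise maximum of several equal-length flag sequences.

    Assumption: a numerically higher flag value is a "worse" result.
    """
    if not test_flags:
        raise ValueError("at least one flag sequence is required")
    length = len(test_flags[0])
    if any(len(flags) != length for flags in test_flags):
        raise ValueError("all flag sequences must have the same length")
    # zip(*test_flags) groups the flags of every test for one data point
    return [max(point) for point in zip(*test_flags)]


gross_range = [1, 1, 3, 4, 1]  # hypothetical per-point results of one test
spike = [1, 3, 1, 1, 2]        # hypothetical per-point results of another
print(aggregate_flags(gross_range, spike))  # [1, 3, 3, 4, 2]
```

An operator who considers some test results not relevant could simply leave those sequences out of the call.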

#### Our justification:

Through this process, we have specifically decided to *not* use the word QARTOD 
in our standard names, which means we are leaving these flags open to describe 
any kind of QC process: QARTOD, some other well-known testing scheme, 
human-in-the-loop testing, etc -- or even some combination of all these. These 
processes are complex and therefore difficult to describe completely using 
variable attributes. 

We are trying to strike a balance between keeping things simple and generic, 
while still being descriptive enough to be useful. Introducing a way to 
describe a single roll-up/aggregate/summary QC flag for a variable is extremely 
powerful! It is especially useful when it comes to writing scripts that use 
this data, which is the main motivation for this proposal. So we want to keep 
`aggregate_quality_flag` broad enough to encompass all kinds of testing that 
might happen on a data variable. We also want to make it easy enough to 
understand and use, so that it is widely adopted. 

In that same vein, we are purposefully leaving out any specifics on how to 
define *exactly* what tests contributed to the aggregate flag. An external 
script would treat `aggregate_quality_flag` the same no matter how it was 
constructed. A human might want to know what tests made up the aggregate flag, 
but they could look to links or text elsewhere in the dataset metadata that 
describes the QC process (`long_name`, `comments`, `references`, `history`, 
etc). Again, this is to strike a balance between being descriptive and being 
easy to use.
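
To make the "a script treats it the same" point concrete, here is a small sketch of how a script might locate the aggregate flag for a data variable. It assumes the dataset's variable attributes have already been read into plain dictionaries (the `variables` structure, the short variable names, and the function name `find_aggregate_flag` are hypothetical); a real script would read them with whatever netCDF library it uses.

```python
def find_aggregate_flag(variables, data_var_name):
    """Return the name of the ancillary variable whose standard_name is
    aggregate_quality_flag, or None if the data variable has none."""
    attrs = variables[data_var_name]
    # ancillary_variables is a blank-separated list of variable names
    for name in attrs.get("ancillary_variables", "").split():
        candidate = variables.get(name, {})
        if candidate.get("standard_name") == "aggregate_quality_flag":
            return name
    return None


# Hypothetical attribute dictionaries standing in for a dataset
variables = {
    "salinity": {
        "standard_name": "sea_water_practical_salinity",
        "ancillary_variables": "salinity_qc_agg salinity_qc_manual",
    },
    "salinity_qc_agg": {"standard_name": "aggregate_quality_flag"},
    "salinity_qc_manual": {"standard_name": "quality_flag"},
}
print(find_aggregate_flag(variables, "salinity"))  # salinity_qc_agg
```

The lookup does not depend on which tests fed the aggregate, or on whether the individual test variables are present in the file at all.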

#### An example:

Consider a scenario where you have QARTOD running in real-time as your data 
comes in, and you also periodically do human-in-the-loop testing. The QARTOD 
tests are very well-defined, but the HIL testing is not -- maybe you have a 
combination of MATLAB scripts and manual flagging based on plots, for example. 
So in terms of the dataset, you could have something like:

```
float sea_water_practical_salinity(time, z);
    sea_water_practical_salinity:units = "1";
    sea_water_practical_salinity:long_name = "Salinity";
    sea_water_practical_salinity:standard_name = "sea_water_practical_salinity";
    sea_water_practical_salinity:ancillary_variables = "sea_water_practical_salinity_qc_agg sea_water_practical_salinity_qc_gross_range_test sea_water_practical_salinity_qc_manual";

int sea_water_practical_salinity_qc_agg(time, z);
    sea_water_practical_salinity_qc_agg:long_name = "Salinity Summary QC Flag";
    sea_water_practical_salinity_qc_agg:standard_name = "aggregate_quality_flag";
    sea_water_practical_salinity_qc_agg:missing_value = 2;
    sea_water_practical_salinity_qc_agg:flag_meanings = "PASS NOT_EVALUATED SUSPECT FAIL MISSING";
    sea_water_practical_salinity_qc_agg:flag_values = 1, 2, 3, 4, 9;

int sea_water_practical_salinity_qc_gross_range_test(time, z);
    sea_water_practical_salinity_qc_gross_range_test:long_name = "Salinity Gross Range QC Test Flag";
    sea_water_practical_salinity_qc_gross_range_test:standard_name = "gross_range_test_quality_flag";
    sea_water_practical_salinity_qc_gross_range_test:missing_value = 2;
    sea_water_practical_salinity_qc_gross_range_test:flag_meanings = "PASS NOT_EVALUATED SUSPECT FAIL MISSING";
    sea_water_practical_salinity_qc_gross_range_test:flag_values = 1, 2, 3, 4, 9;

int sea_water_practical_salinity_qc_manual(time, z);
    sea_water_practical_salinity_qc_manual:long_name = "Salinity Manual Review QC Tests Flag";
    sea_water_practical_salinity_qc_manual:standard_name = "quality_flag";
    sea_water_practical_salinity_qc_manual:missing_value = 2;
    sea_water_practical_salinity_qc_manual:flag_meanings = "PASS FAIL_BIOFOUL FAIL_INSTR FAIL_TELEM";
    sea_water_practical_salinity_qc_manual:flag_values = 0, 1, 2, 3;
```

(And hopefully you describe or link to your overall QC process, including HIL 
testing, somewhere in the global metadata!)
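
For readers less familiar with the `flag_values` / `flag_meanings` pairing, here is a minimal sketch of how a script could turn the two attributes into a lookup table. It assumes the attributes have already been read into a Python list and a blank-separated string; the function name `flag_lookup` is hypothetical.

```python
def flag_lookup(flag_values, flag_meanings):
    """Pair each flag value with its meaning, in order."""
    meanings = flag_meanings.split()
    if len(meanings) != len(flag_values):
        raise ValueError("flag_values and flag_meanings differ in length")
    return dict(zip(flag_values, meanings))


lookup = flag_lookup([1, 2, 3, 4, 9], "PASS NOT_EVALUATED SUSPECT FAIL MISSING")
print(lookup[4])  # FAIL
```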

The manual tests and results could be very specific to your group. In this 
example, the manual review marks a data point either as PASS or as FAIL for a 
specific reason: bio-fouling, instrument issues, or telemetry problems.

While the QARTOD gross range test and the manual HIL tests involve different 
processes and flagging schemes, at the end of the day they should somehow be 
combined into a single QC result per data point. How you do that combination can 
be as arbitrary and complex as the QC process itself. This is where the 
`aggregate_quality_flag` is useful: the proposed definition gives you plenty of 
flexibility to include or exclude specific test results in your aggregate flag, 
depending on whether or not they are *relevant* at any given time.
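
One possible combination rule, sketched in Python, is shown here. It is only an illustration: the mapping from the manual scheme (0 to 3) onto the QARTOD-style values and the "take the worse of the two" rule are assumptions made for this sketch, and the names `MANUAL_TO_QARTOD` and `combine` are hypothetical. An operator is free to choose a different rule.

```python
# Assumed mapping: manual PASS (0) becomes 1 (PASS), and every manual
# FAIL_* value (1, 2, 3) becomes 4 (FAIL).
MANUAL_TO_QARTOD = {0: 1, 1: 4, 2: 4, 3: 4}


def combine(gross_range_flags, manual_flags):
    """Return, per data point, the worse of the gross range flag and the
    manual flag after translating the manual flag to the 1/4 scheme."""
    return [
        max(qartod, MANUAL_TO_QARTOD[manual])
        for qartod, manual in zip(gross_range_flags, manual_flags)
    ]


print(combine([1, 3, 1, 4], [0, 0, 2, 0]))  # [1, 3, 4, 4]
```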

Furthermore, we have come across scenarios where a group has a single QC flag 
per variable that they update during periodic data reviews. In that case, the 
QARTOD and HIL variables would not be present -- just the "aggregate" flag. 
Hence the wording at the end: "whether present in the dataset as independent 
ancillary variables to the parent data variable or not".

#### Answers to specific questions from Alison and Nan

From Alison:

* If we were to add standard names for more QARTOD defined quality tests, as 
has already been suggested, would the results of those tests then also form 
part of the aggregate?
   * Yes, if they were *relevant*. It's up to the operator to decide which 
variables are used for the aggregate flag.
   * A script would treat the `aggregate_quality_flag` variable the same in 
either case. A human could look for documentation if they needed to know 
specifically what tests were used to create the aggregate flag.
* Would the aggregate flag include the results of quality tests that were 
defined by some standard other than QARTOD if they happened to appear in the 
same data file?
   * Same answer as above

From Nan:

* There are cases when a test is run and the results are not considered 
important for some reason - failing a spike test because the data is determined 
to represent an actual measurement, or ... failing a gap test because of poor 
telemetry. The point of the long drawn out discussion was in part to make sure 
these standard names could be used by other QC systems; letting those systems 
document their own method of setting the value of the aggregate is important.
  * Yes, we agree with this sentiment
  * Nan, do you agree that the proposed new wording is flexible enough to allow 
for this scenario?
* I'd like to have some text describing the best way to convey the name of the 
QC system being used - maybe a recommended attribute on each of these quality 
variables? That would solve the problem of people mixing QARTOD and other 
tests. Can we do that in the standard name guidance, or does QARTOD need to 
describe that themselves?
   * If a test has a name `flat_line_test_quality_flag`, does it really matter 
what QC system was used to create it? Isn't that the whole point of creating a 
generic `standard_name` for the test?
   * If the QC system used is relevant to a user, we think that providing an 
overall description of the QC process (or a link to it) elsewhere in the 
dataset metadata is a simple yet effective way to convey this information.
* Also, could we recommend that the aggregate have a way to list its component 
tests? These could be given as ancillary variables.
  * We believe this adds too much complexity without enough benefit. If the 
information is relevant to a user, then it could be listed in a global or 
variable attribute. But since this information is for humans and not scripts, 
we don't think it makes sense to create a strict definition here.



-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/cf-convention/cf-conventions/issues/216#issuecomment-581714109

