Thank you Nan and Alison for your comments and suggestions.
I think we're all still in agreement on the standard name list, including
Alison's suggestions for additional text to add to the non-aggregate flag
names. The wording and scope of the aggregate flag still need more
clarification, though.
**After some internal discussion, here's our proposed revision for the
`aggregate_quality_flag` definition:**
Original: This flag is a summary of all quality tests run for another data
variable, and is set to the highest-level (worst case) flag found. The linkage
between the data variable and this variable is achieved using the
ancillary_variables attribute.
**Proposed (differences highlighted):** This flag is a summary of all
*relevant* quality tests run for *the related ancillary parent data variable*,
and is set to the highest-level (worst case) flag found. The linkage between
the data variable and this variable is achieved using the ancillary_variables
attribute. *The aggregate quality flag represents the summary of all quality
tests performed on the data variable, whether automated or manual, and whether
present in the dataset as independent ancillary variables to the parent data
variable or not.*
#### Our justification:
Through this process, we have specifically decided to *not* use the word QARTOD
in our standard names, which means we are leaving these flags open to describe
any kind of QC process: QARTOD, some other well-known testing scheme,
human-in-the-loop (HIL) testing, etc., or even some combination of these. These
processes are complex and therefore difficult to describe completely using
variable attributes.
We are trying to strike a balance between keeping things simple and generic,
while still being descriptive enough to be useful. Introducing a way to
describe a single roll-up/aggregate/summary QC flag for a variable is extremely
powerful! It is especially useful when it comes to writing scripts that use
this data, which is the main motivation for this proposal. So we want to keep
`aggregate_quality_flag` broad enough to encompass all kinds of testing that
might happen on a data variable. We also want to make it easy enough to
understand and use, so that it is widely adopted.
In that same vein, we are purposefully leaving out any specifics on how to
define *exactly* what tests contributed to the aggregate flag. An external
script would treat `aggregate_quality_flag` the same no matter how it was
constructed. A human might want to know what tests made up the aggregate flag,
but they could look to links or text elsewhere in the dataset metadata that
describes the QC process (`long_name`, `comments`, `references`, `history`,
etc). Again, this is to strike a balance between being descriptive and being
easy to use.
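
To illustrate why a script can treat `aggregate_quality_flag` the same way regardless of how it was built, here is a minimal Python sketch. It assumes the dataset has already been read into plain dictionaries; the `variables` layout, the helper names `find_aggregate_flag` and `good_values`, and the sample values are all hypothetical and not part of CF or any library. It also assumes that a flag value of 1 means PASS.

```python
def find_aggregate_flag(variables, data_name):
    """Return the name of the ancillary variable whose standard_name is
    aggregate_quality_flag, or None if the data variable has none."""
    attrs = variables[data_name]["attrs"]
    for name in attrs.get("ancillary_variables", "").split():
        anc = variables.get(name)
        if anc and anc["attrs"].get("standard_name") == "aggregate_quality_flag":
            return name
    return None


def good_values(variables, data_name, pass_value=1):
    """Keep only the data points whose aggregate flag equals pass_value."""
    flag_name = find_aggregate_flag(variables, data_name)
    data = variables[data_name]["data"]
    if flag_name is None:
        return list(data)  # no aggregate flag: nothing to filter on
    flags = variables[flag_name]["data"]
    return [v for v, f in zip(data, flags) if f == pass_value]


# Hypothetical in-memory stand-in for a dataset (values are made up).
variables = {
    "salinity": {
        "attrs": {"ancillary_variables": "salinity_qc_agg salinity_qc_manual"},
        "data": [35.1, 35.2, 99.9, 35.0],
    },
    "salinity_qc_agg": {
        "attrs": {"standard_name": "aggregate_quality_flag"},
        "data": [1, 1, 4, 3],
    },
    "salinity_qc_manual": {
        "attrs": {"standard_name": "quality_flag"},
        "data": [0, 0, 2, 0],
    },
}

print(good_values(variables, "salinity"))
```

The script never inspects the individual test variables; it only follows `ancillary_variables` to the one variable carrying the `aggregate_quality_flag` standard name.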
#### An example:
Consider a scenario where you have QARTOD running in real-time as your data
comes in, and you also periodically do human-in-the-loop testing. The QARTOD
tests are very well-defined, but the HIL testing is not -- maybe you have a
combination of MATLAB scripts and manual flagging based on plots, for example.
So in terms of the dataset, you could have something like:
```
float sea_water_practical_salinity(time, z);
    sea_water_practical_salinity:units = "1";
    sea_water_practical_salinity:long_name = "Salinity";
    sea_water_practical_salinity:standard_name = "sea_water_practical_salinity";
    sea_water_practical_salinity:ancillary_variables = "sea_water_practical_salinity_qc_agg sea_water_practical_salinity_qc_gross_range_test sea_water_practical_salinity_qc_manual";

int sea_water_practical_salinity_qc_agg(time, z);
    sea_water_practical_salinity_qc_agg:long_name = "Salinity Summary QC Flag";
    sea_water_practical_salinity_qc_agg:standard_name = "aggregate_quality_flag";
    sea_water_practical_salinity_qc_agg:missing_value = 2;
    sea_water_practical_salinity_qc_agg:flag_meanings = "PASS NOT_EVALUATED SUSPECT FAIL MISSING";
    sea_water_practical_salinity_qc_agg:flag_values = 1, 2, 3, 4, 9;

int sea_water_practical_salinity_qc_gross_range_test(time, z);
    sea_water_practical_salinity_qc_gross_range_test:long_name = "Salinity Gross Range QC Test Flag";
    sea_water_practical_salinity_qc_gross_range_test:standard_name = "gross_range_test_quality_flag";
    sea_water_practical_salinity_qc_gross_range_test:missing_value = 2;
    sea_water_practical_salinity_qc_gross_range_test:flag_meanings = "PASS NOT_EVALUATED SUSPECT FAIL MISSING";
    sea_water_practical_salinity_qc_gross_range_test:flag_values = 1, 2, 3, 4, 9;

int sea_water_practical_salinity_qc_manual(time, z);
    sea_water_practical_salinity_qc_manual:long_name = "Salinity Manual Review QC Tests Flag";
    sea_water_practical_salinity_qc_manual:standard_name = "quality_flag";
    sea_water_practical_salinity_qc_manual:missing_value = 2;
    sea_water_practical_salinity_qc_manual:flag_meanings = "PASS FAIL_BIOFOUL FAIL_INSTR FAIL_TELEM";
    sea_water_practical_salinity_qc_manual:flag_values = 0, 1, 2, 3;
```
(And hopefully you describe or link to your overall QC process, including HIL
testing, somewhere in the global metadata!)
The manual tests and results could be very specific to your group -- in this
example for the manual tests the data point could either be PASS, or could FAIL
due to instrument issues, bio-fouling, etc.
While the QARTOD gross range test and the manual HIL tests involve different
processes and flagging schemes, at the end of the day they should somehow be
combined into a single QC result per data point. How you do that combination can
be as arbitrary and complex as the QC process itself. This is where the
`aggregate_quality_flag` is useful: the proposed definition gives you plenty of
flexibility to include or exclude specific test results in your aggregate flag,
depending on whether or not they are *relevant* at any given time.
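
As an illustration only, here is one way such a combination could be sketched in Python. Nothing here is prescribed by the proposal. The sketch assumes that "worst case" means the highest numeric flag value in the 1/2/3/4/9 scheme, that every manual failure maps to FAIL (4), and that the operator supplies a per-point relevance mask. The names `MANUAL_TO_QARTOD` and `aggregate_flags`, and the sample flag values, are hypothetical.

```python
# Hypothetical mapping from the group-specific manual scheme (0..3) into the
# PASS/NOT_EVALUATED/SUSPECT/FAIL/MISSING scheme (1, 2, 3, 4, 9).
MANUAL_TO_QARTOD = {0: 1, 1: 4, 2: 4, 3: 4}
NOT_EVALUATED = 2


def aggregate_flags(tests, relevant=None):
    """Combine per-test flag sequences into one flag per data point.

    tests:    dict of test name -> list of flags (already in the common scheme)
    relevant: optional dict of test name -> list of bools; a False entry
              excludes that test's result at that point (operator's choice)
    """
    n = len(next(iter(tests.values())))
    result = []
    for i in range(n):
        candidates = [
            flags[i]
            for name, flags in tests.items()
            if relevant is None or relevant.get(name, [True] * n)[i]
        ]
        # Assumption: the highest numeric value is the worst case.
        result.append(max(candidates) if candidates else NOT_EVALUATED)
    return result


gross_range = [1, 1, 4, 1]
manual = [MANUAL_TO_QARTOD[f] for f in [0, 2, 0, 0]]
tests = {"gross_range": gross_range, "manual": manual}

print(aggregate_flags(tests))

# Treat the gross range failure at index 2 as not relevant (for example, a
# reviewer judged it to be a real measurement).
relevant = {"gross_range": [True, True, False, True]}
print(aggregate_flags(tests, relevant))
```

The relevance mask is where the operator's judgment enters; a consumer of the resulting flag does not need to know about it.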
Furthermore, we have come across scenarios where a group has a single QC flag
per variable that they update during periodic data reviews. In that case, the
QARTOD and HIL variables would not be present -- just the "aggregate" flag.
Hence the wording at the end: "whether present in the dataset as independent
ancillary variables to the parent data variable or not".
#### Answers to specific questions from Alison and Nan
From Alison:
* If we were to add standard names for more QARTOD defined quality tests, as
  has already been suggested, would the results of those tests then also form
  part of the aggregate?
  * Yes, if they were *relevant*. It's up to the operator to decide which
    variables are used for the aggregate flag.
  * A script would treat the `aggregate_quality_flag` variable the same in
    either case. A human could look for documentation if they needed to know
    specifically what tests were used to create the aggregate flag.
* Would the aggregate flag include the results of quality tests that were
  defined by some standard other than QARTOD if they happened to appear in the
  same data file?
  * Same answer as above.
From Nan:
* There are cases when a test is run and the results are not considered
  important for some reason - failing a spike test because the data is determined
  to represent an actual measurement, or ... failing a gap test because of poor
  telemetry. The point of the long drawn out discussion was in part to make sure
  these standard names could be used by other QC systems; letting those systems
  document their own method of setting the value of the aggregate is important.
  * Yes, we agree with this sentiment.
  * Nan, do you agree that the proposed new wording is flexible enough to allow
    for this scenario?
* I'd like to have some text describing the best way to convey the name of the
  QC system being used - maybe a recommended attribute on each of these quality
  variables? That would solve the problem of people mixing QARTOD and other
  tests. Can we do that in the standard name guidance, or does QARTOD need to
  describe that themselves?
  * If a test has a name `flat_line_test_quality_flag`, does it really matter
    what QC system was used to create it? Isn't that the whole point of creating a
    generic `standard_name` for the test?
  * If the QC system used is relevant to a user, we think that providing an
    overall description of the QC process (or a link to it) elsewhere in the
    dataset metadata is a simple yet effective way to convey this information.
* Also, could we recommend that the aggregate have a way to list its component
  tests? These could be given as ancillary variables.
  * We believe this adds too much complexity without enough benefit. If the
    information is relevant to a user, then it could be listed in a global or
    variable attribute. But since this information is for humans and not scripts,
    we don't think it makes sense to create a strict definition here.
Reply to this email directly or view it on GitHub:
https://github.com/cf-convention/cf-conventions/issues/216#issuecomment-581714109