hi Thomas

> is it your feeling that we need to have a better model of accuracy, i.e.
> more like the confidence interval idea? Or are we ok with what we have?
well. A measured quantity is a group of data, with some or all of the following things known:

- what was measured
- how it was measured (+ who & when & where & environmental conditions!)
- units for the values
- a possible range of values

What was measured usually does need to be known. In terms of modelling this as a data type, I note that what was measured is usually considered to be outside the data type. I wonder whether that's actually right, but we're not discussing it right now.

In real life, we generally don't count how it was measured as part of the value. The idea is that you go and read the "methods section" (whatever that means) if you care. Except that in clinical medicine there are a few things where how something is measured matters; there, we generally say that something else was measured at that point. A classic example is Total Calcium and Ionized Calcium (and it's not wrong to call them different things - my point is that the split is arbitrary). Anyhow, I've never heard anyone argue that the method should be part of the datatype. I don't think there's much point in differentiating between a measured and a non-measured quantity - that's for philosophy.

So, back to the possible range of values. This is a complex concept. Generally, the possible range of values is a bell-shaped probability distribution (or a log bell curve), but it's rarely properly known whether it actually is - it's generally just assumed to be a bell curve. You can *approximate* the concept of a probability distribution by reporting a central value with a +/-, or an interval that expresses the 95th percentile. I know we were taught at uni to track uncertainties (and sometimes even to quantitate the distribution curve) and to carry them through our equations (and conclusions!), but out in the real world it's rarely done in published papers (shame, really), and I've never seen it done in clinical work (even in clinical research).
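To make the grouping above concrete, here's a minimal sketch of a quantity type where the possible range of values is optionally carried as either a +/- or a 95% interval. The names are mine for illustration only - they're not from any actual archetype or spec - and "what was measured" deliberately stays outside the type, as discussed:

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Quantity:
    """A measured value; 'what was measured' stays outside the type."""
    value: float
    units: str
    # two optional approximations of the probability distribution:
    plus_minus: Optional[float] = None              # central value +/- x
    interval_95: Optional[Tuple[float, float]] = None  # (low, high) 95% interval

    def as_interval(self) -> Optional[Tuple[float, float]]:
        """Collapse either representation to a (low, high) pair, if present."""
        if self.interval_95 is not None:
            return self.interval_95
        if self.plus_minus is not None:
            return (self.value - self.plus_minus, self.value + self.plus_minus)
        return None

# a calcium of 2.45 mmol/L with +/- 0.05 methodological uncertainty
ca = Quantity(2.45, "mmol/L", plus_minus=0.05)
```

Note that most results would leave both optional fields empty - which matches the clinical practice described below of reporting a bare single value.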
In clinical medicine, the only behaviour I've seen is to report a single value - what was actually measured - and not say anything at all about the uncertainty. No, I'm wrong: I once used to perform an assay where the methodological uncertainty in the number was clinically significant. We used to report a range rather than a point value, so the doctors couldn't be mistaken about its meaning.

Reporting <X or >X for a value is something you have to do if you aren't normally reporting a range of values. You said you didn't want to model that as an interval, but I was less than convinced - if you always reported an interval, it would be consistent. But even if you were consistent in this way, the methodological basis for the "interval" <5 or >5000 is not the same as the methodological basis for 100-110. These concepts overlap. If you added a confidence interval - as an optional item - then you get an interesting situation. If I say that this value is <50 (ci=100), what am I saying? (And don't laugh - this is a common clinical result value to report.)

In clinical medicine, also, the things that may corrupt the result due to interference from drugs, unusual medical conditions, etc - these don't contribute to the distribution range, so it's not usually significant.

This is starting to ramble. As I said, in clinical medicine we only report a single value and let the interpreter figure out the distribution themselves. If they're not sure, they should contact the number on the report (in all legal jurisdictions I know of, there must be one). I think that for the rare cases where the distribution range needs to be conveyed/stored outside the generating system, the archetype should store it. The archetype already includes some of the other stuff in my original data grouping, so I don't see it as inappropriate to solve it this way.

So, leave it as it is.

Grahame
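The overlap between the two kinds of "interval" can be sketched as follows: a <X result is modelled with a comparator rather than as a measurement interval, and a simple sanity check exposes the "<50 (ci=100)" situation, where the optional confidence interval swallows the very bound the comparator asserts. Again, the field names are illustrative, not from any spec:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ReportedValue:
    """A single reported value with an optional comparator, e.g. '<50'."""
    value: float
    units: str
    comparator: Optional[str] = None  # one of '<', '<=', '>', '>=' or None
    ci: Optional[float] = None        # optional +/- confidence interval

    def check(self) -> None:
        # '<50 (ci=100)' is incoherent: the uncertainty band (-50..150)
        # extends past the bound the comparator claims to assert.
        if self.comparator and self.ci is not None and self.ci >= self.value:
            raise ValueError("confidence interval contradicts comparator bound")

    def __str__(self) -> str:
        return f"{self.comparator or ''}{self.value} {self.units}"

print(ReportedValue(50, "IU/L", comparator="<"))  # prints "<50 IU/L"
```

This is only one way to cut it; the point is that the comparator and the confidence interval come from different methodological bases, so combining them needs a rule.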

