Perhaps more important than being able to store a count of the number of stars in the galaxy is how we treat these numbers once we have captured them, at whatever precision that might be.
Currently I think our notion of precision is a bit weak. In the datamart service for indicators, for example, we seem to have a fixed notion of precision based on decimal places - from my reading of the code, it seems we store values accurate to 1 decimal place. What we probably should be doing is maintaining some confidence level of significant figures. This becomes quite obvious if we start inputting and storing values in scientific notation - those of us old enough to have used a slide rule will be familiar with this :-).

So if I have a numerator (eg malaria cases) of 5436 and a denominator (eg population) of 155000, what can I say about the indicator value? If I calculate on my calculator I get 0.035070968, but obviously I am not confident in all those digits. If my numerator is accurate to 4 significant figures and my denominator is accurate to 3, then I can be confident to 2 significant figures in my result; ie I can report the value as 0.035.

I am not sure what the best strategy for managing precision in DHIS should be, but it does strike me that, for a system concerned with aggregation, we should attack the problem a bit more rigorously than we do. What this probably requires, at the point of capture, is to capture the precision of the number, particularly where we know we are capturing an estimate, eg as a result of rounding. This is done implicitly when using scientific notation. The problem is more visible when we capture a string like "155000". How precise is that? We don't actually know. Intuitively we suspect it's not accurate to 6 significant figures, and that it's accurate to at least 3. But it could be 4 (eg 1.550E5).

Maybe it's just me that worries a bit about these things. Does anyone else have a sense that it is important to be able to indicate the precision of calculated indicator values?

Bob

PS.
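For illustration only - not a proposal for the actual DHIS2 implementation - Java's BigDecimal (which comes up later in this thread) already behaves along these lines: it remembers the significant figures of the string it was parsed from, and a MathContext can round a quotient to a chosen number of significant figures. A minimal sketch using the numbers from the example above:

```java
import java.math.BigDecimal;
import java.math.MathContext;
import java.math.RoundingMode;

public class SignificantFigures {
    public static void main(String[] args) {
        // precision() reports the significant figures implied by the input string.
        // A plain "155000" claims 6 figures, whether we trust them or not;
        // scientific notation makes the intended precision explicit:
        System.out.println(new BigDecimal("155000").precision());  // 6
        System.out.println(new BigDecimal("1.550E5").precision()); // 4

        // Dividing under a MathContext rounds the quotient to a chosen
        // number of significant figures (2 here, as in the example above):
        BigDecimal cases = new BigDecimal("5436");        // 4 significant figures
        BigDecimal population = new BigDecimal("1.55E5"); // 3 significant figures
        BigDecimal indicator = cases.divide(population,
                new MathContext(2, RoundingMode.HALF_UP));
        System.out.println(indicator); // 0.035
    }
}
```

So a string-to-BigDecimal capture path would at least preserve the precision the user typed, rather than flattening everything to a fixed number of decimal places.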
Storing natural-number 'counts' as floating point introduces some untidiness here, but one that can be dealt with, as we "know" the numberType of the data element value.

PPS. This is a very similar issue to an earlier discussion re rounding of coordinates during GML import. The number of decimal places should always be an outcome, rather than a target, of specifying precision.

On 19 September 2011 09:16, Morten Olav Hansen <morte...@gmail.com> wrote:
>> Yes, this is my point. I am sure (without knowing the details) that
>> there are restrictions on what would be a valid exponent and fraction
>> for a decimal representation of a real number. If a number with 255
>> digits is stored as text, and whether the values are handled as a
>> double (I think that all values are treated as doubles regardless of
>> whether they are integers or not), this places different restrictions
>> on the number length which we should allow. So, if someone types in an
>> exponent with 200 numbers and 55 decimal points (which we could store
>> as text), would it be a valid double value?
>
> The range of double should be -1.79769313486231570E+308 to
> 1.79769313486231570E+308 (if using 64 bit Java, I assume).
>
> There is also BigInteger / BigDecimal that could be used, which support
> even bigger numbers.
>
> That said, this is just what Java has to offer; what DHIS2 supports I
> do not know.
>
> --
> Morten
>
> _______________________________________________
> Mailing list: https://launchpad.net/~dhis2-devs
> Post to     : dhis2-devs@lists.launchpad.net
> Unsubscribe : https://launchpad.net/~dhis2-devs
> More help   : https://help.launchpad.net/ListHelp