Hi Eike,
Eike Rathke wrote:
[...]
As I mentioned in the previous post: raising an error is one option (the
best), but it depends on some pre-requisites:
- ability to easily find string-as-number cells
- easy conversion to number
[I know that Cor's extension does exist,
BUT Calc should do it natively]
No. Why should it? If an extension can do all work there is no need to
reimplement it in the core again.
My notion of *doing it natively* covers: everything that is bundled with
the original application. If it is an extension, fine, but is should be
installed *by default* with Calc. This is the requirement that must be
met before setting the default to raise an error. So every user that
witnesses the error can immediately correct the error.
Now, there is also an issue with the current Novell-implementation - as
I read from IssueZila. Empty cells do raise errors, too. This should
definitely be avoided. An empty cell is just empty, it is not a string
and not a number. It should be skipped by default.
I am going back to R (and the S+ language). R has a special data type
called NA (Not Available). Functions in R are also able to interpret
differently this, and the user has control over the interpretation of
NA. To quote from the R-help:
"na.action: is a function which indicates what should happen when the
data contain NAs. The default is set by the na.action setting of
options, and is na.fail if that is unset. The ‘factory-fresh’ default
is na.omit. Another possible value is NULL, no action. Value
na.exclude can be useful."
Therefore, one can instruct R to ignore NAs, to remove NAs, or to fail
if a NA is encountered. This option can be passed individually to every
function.
This would be a good way to handle missing values in Calc, too, but most
importantly, this concept can be extended to string-to-number
interpretation/ conversion:
- the default should be *string.fail*
- while other sensible options might be
*string.omit* (Calc does it currently) and
*string.convert* (Excels' way, but more consistently than in Excel).
The user can then explicitly specify how any function/cell will behave.
This is portable, this is reliable. This is reproducible, as every cell
has it individually specified (i.e. if the user needs some non-default
option, he is free to pass this option in the formula he uses without
affecting any other part of the spreadsheet).
Of course, this is not quite the way spreadsheet formulas get
interpreted today, so I still have to think how to pass these options at
formula/cell level. But it is the far superior way to handle the issue.
The reason why interpreting strings as numeric values on the fly is
harmful and we do not want to implement it.
BUT you do it. You interpret it *always* as 0. This is the worst
possible scenario.
It is at least consistent for =A1+A2 and =SUM(A1:A2).
Eike
It might be consistent, but not accurate and very wrong what the outcome
is. Excel can be said is only half the time wrong, while Calc is always.
I hate Excel's handling And I hope Calc will do it better in the future.
To take an example from numerical analysis (as we had a related thread
recently ;) ), one may compute something with an incredible precision,
over 100 bits, but IF all bits are wrong, the accuracy is still a meager
zero.
Now some remarks about causes of string-numbers (was mentioned in the
other e-mail):
1. it is true that this happens when pasting from the clipboard in the
wrong locale (again a problem of the spreadsheet application). As I
said, I worked extensively in a multi-locale environment and the only
sensible way to deal with this is that everyone uses the en-locale. This
is usually not implementable and will remain a pain.
BUT: there are not many decimal separators (most probably only "." and
"," will be encountered reliable often in practice), so sensing a
strange paste should trigger a notification that pasted value is a
string (non-obtrusively like a small icon in the cell). I still do not
have an optimal concept.
2. most instances of numbers-to-strings that I encountered (by a very
large margin) were user-typed strings. The user knowingly entered the
string because of a formatting error (most commonly x.y as date, but
sometimes something else). Fixing the date would probably remove a lot
of these instances.
3. csv-input: this was often problematic and could result in text
instead of numbers. I hope that modern spreadsheets behave better.
3.') exporting a spreadsheet with a proprietary software: this was
frequent in the distant past (csv was more frequent), but shouldn't pose
any major problem in the age of ODF; csv's might still be used, so
fixing any problems with csv should be pursued.
I hope that Calc becomes the best spreadsheet program around. However,
the competition is tough. Excel 2007, Mac's iNumbers and various
dedicated programs are eager to step in.
Sincerely,
Leonard
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]