On Mon, 18 Apr 2005, Peter Dalgaard wrote:

Martin Maechler <[EMAIL PROTECTED]> writes:

    BDR> We could do better by insisting that "." was the decimal
    BDR> point in all internal conversions _to_ numeric. Then the
    BDR> effect of setting LC_NUMERIC would primarily be on
    BDR> conversions _from_ numeric, especially printing and
    BDR> graphical output. (One issue would be what to do with
    BDR> scan(), which has a `dec' argument but is implemented
    BDR> assuming LC_NUMERIC=C. I would hope to continue to have
    BDR> `dec' but perhaps with a locale-dependent default.) The
    BDR> resulting asymmetry (R would not be able to parse its own
    BDR> output) would be unhappy, but seems inevitable. (This could
    BDR> be implemented easily by having a `dec' arg to EncodeReal
    BDR> and EncodeComplex, and using LC_NUMERIC to control that
    BDR> rather than actually setting the locale category. For
    BDR> example, deparsing needs to be done in LC_NUMERIC=C.)

Yes, I like this quite a bit:

 -  Only allow "." as decimal point in conversions to numeric.

 -  Allowing "," (or other locale settings if there are) for
    conversions _from_ numeric will be very attractive to some
    (not to me) and will make the use of R's ``reporting
    facility'' much more natural to them.

  The asymmetry is a bit unhappy -- and that will be a good reason
  to advocate (to the user community) that using "," for decimal
  point may be a bad idea in general.
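
A minimal sketch of the asymmetry described above, using calls that exist in
current R rather than the EncodeReal change under discussion. It assumes only
base R; the variable names are illustrative. The output side takes an explicit
decimal mark, while as.numeric() accepts only "." and so cannot read the
result back.

```r
# Output side: choose the decimal mark explicitly when formatting.
x <- 3.14
printed <- format(x, decimal.mark = ",")

# Input side: as.numeric() only recognises "." as the decimal point,
# so the comma-formatted string is not converted (it yields NA).
reparsed <- suppressWarnings(as.numeric(printed))

# To round-trip, the mark has to be translated back to "." first.
restored <- as.numeric(sub(",", ".", printed, fixed = TRUE))
```

Here `printed` is the string "3,14", `reparsed` is NA, and `restored` is the
original number again.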

Could I suggest that we tread very carefully here? This issue has caused several trip-ups historically:

- The locale-dependent "comma-separated variables" format, in some
 cases not separated by commas. And it seems that you can still get
 Excel files that use comma both for separation and as decimal point
 (I thought that problem disappeared with early versions of Paradox,
 but apparently not, according to a recent query on r-help).

- Exports from SAS as a text file cannot be read by SPSS and vice
 versa.

etc. Quite possibly, the "computer world" missed the opportunity to
agree on an international standard (what's the big deal with using
commas anyway?). As it is we probably have to adjust to it, but we
have to distinguish very carefully between reports, code, and data,
and choose appropriate conventions for each case.
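
For the data case, one hedged illustration of keeping the convention explicit
rather than locale-dependent: the `sep` and `dec` arguments of read.table() are
stated at the call site. The input text below is made up for the example, and
this is only a sketch of current base R behaviour, not a proposal from the
thread.

```r
# Hypothetical semicolon-separated data using "," as the decimal mark.
txt <- "x;y\n1,5;2,25\n3,0;4,75"

# Field separator and decimal mark are given explicitly, so the result
# does not depend on the LC_NUMERIC setting of the session.
d <- read.table(text = txt, header = TRUE, sep = ";", dec = ",")

# read.csv2() is a wrapper with these defaults (sep = ";", dec = ",").
d2 <- read.csv2(text = txt)
```

Both calls give numeric columns `x` and `y` with the commas interpreted as
decimal points.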

I was treading _very_ carefully. Nowhere did I suggest altering any of write.table and friends. I did not even suggest altering read.table. I tentatively suggested the default in scan() might be locale-specific, but was otherwise leaving import/export completely alone.

The aim is to allow people to have commas in printed output and graph labels if they want. Note, nothing would be done unless people explicitly did something like Sys.setlocale("LC_NUMERIC", "fr_FR") so this would not affect naive users in any way.

Brian

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-devel@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
