Maris Nartiss wrote:

> The offending line is a reference in the comment section:
> http://trac.osgeo.org/grass/browser/grass/trunk/imagery/i.atcorr/computations.cpp#L1365
> 
> I browsed SUBMITTING file and didn't find any rules about source
> encoding. As a supporter of Unicode everywhere, I would suggest to add
> a requirement for source files to be in UTF-8. Upside - most of files
> already are in UTF-8. Thus only files with symbols outside of latin1
> would be affected.

Most files are ASCII. Those which aren't are almost evenly split
between ISO-8859-1 and UTF-8:

Files using ISO-8859-1:

raster/r.sunmask/g_solposition.c        U+00B0  DEGREE SIGN
imagery/i.topo.corr/main.c              U+00F1  LATIN SMALL LETTER N WITH TILDE
imagery/i.landsat.toar/landsat.h        U+00B5  MICRO SIGN
imagery/i.evapo.pm/functions.c          U+00B0  DEGREE SIGN
imagery/i.atcorr/computations.cpp       U+00E9  LATIN SMALL LETTER E WITH ACUTE
lib/raster/color_look.c                 U+00AD  SOFT HYPHEN
lib/raster/color_set.c                  U+00AD  SOFT HYPHEN

Files using UTF-8:

raster/r.sunmask/main.c                 U+00B0  DEGREE SIGN
raster/r.watershed/ram/do_flatarea.c    U+2013  EN DASH
vector/v.net.salesman/main.c            U+2013  EN DASH
gui/wxpython/lmgr/frame.py              U+00F6  LATIN SMALL LETTER O WITH DIAERESIS
                                        U+2019  RIGHT SINGLE QUOTATION MARK
lib/python/pygrass/functions.py         U+00B0  DEGREE SIGN
lib/arraystats/class.c                  U+00E9  LATIN SMALL LETTER E WITH ACUTE
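
A survey like the one above can be approximated with a short script. The following is only a sketch, not the method actually used to produce the lists: it assumes that any file which is neither pure ASCII nor valid UTF-8 is ISO-8859-1 (which cannot be verified from the bytes alone), and the function names are hypothetical.

```python
import unicodedata


def classify_encoding(data: bytes) -> str:
    """Guess the encoding of a source file from its raw bytes."""
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Assumption: anything else is ISO-8859-1. Every byte sequence
        # decodes as ISO-8859-1, so this is a fallback, not a proof.
        return "iso-8859-1"


def non_ascii_chars(data: bytes):
    """Return sorted (code point, Unicode name) pairs for non-ASCII characters."""
    text = data.decode(classify_encoding(data))
    return sorted(
        (f"U+{ord(c):04X}", unicodedata.name(c, "<unnamed>"))
        for c in set(text)
        if ord(c) > 0x7F
    )
```

Reading each file in binary mode (`open(path, "rb").read()`) and passing the bytes to both functions gives one row per file, in the same shape as the lists above.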

Many of these are gratuitous, e.g. the use of a soft hyphen or
en dash where an ASCII "-" (U+002D HYPHEN-MINUS) would suffice.

Some are due to comments written in languages other than English
(i.topo.corr = Spanish, lib/arraystats = French); these should be
translated.

All but one are in comments: the pygrass one is a string literal,
which should really use escape notation (assuming that the
is_clean_name() function is actually correct, and not a half-baked
attempt at re-implementing G_legal_filename()).
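
To illustrate what escape notation means for a string literal, here is a minimal sketch. It is not the pygrass code; the constant and function names are hypothetical. The point is only that the source file stays pure ASCII while the string still contains U+00B0.

```python
# The literal below is written with ASCII characters only, but the
# resulting string holds the single character U+00B0 DEGREE SIGN.
DEGREE_SIGN = "\u00b0"


def format_degrees(value: float) -> str:
    """Format a number followed by a degree sign (hypothetical helper)."""
    return f"{value:g}{DEGREE_SIGN}"
```

The same character can also be written as "\N{DEGREE SIGN}", which is longer but self-documenting.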

So, if those are fixed, it boils down to whether we actually want to
have to deal with source-code encoding issues for the sake of comments
which include:

a) °C for degrees Celsius,
b) µm for micrometres (microns), and
c) proper names using the Latin script with accents (names using any
other script will invariably be romanised).

Personally, I would prefer it if source code was 7-bit clean.
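
If a 7-bit-clean rule were adopted, it would be easy to check mechanically. A rough sketch of such a check, with a hypothetical function name, could look like this:

```python
def find_non_ascii_lines(data: bytes):
    """Return the 1-based numbers of lines containing any byte above 0x7F."""
    hits = []
    for lineno, line in enumerate(data.splitlines(), start=1):
        # Iterating over bytes yields integers in the range 0..255.
        if any(byte > 0x7F for byte in line):
            hits.append(lineno)
    return hits
```

An empty result means the file is 7-bit clean; otherwise the line numbers point at what needs to be replaced or escaped.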

-- 
Glynn Clements <[email protected]>
_______________________________________________
grass-dev mailing list
[email protected]
http://lists.osgeo.org/mailman/listinfo/grass-dev
