Rene Engelhard wrote:
> On Wed, Aug 21, 2019 at 03:44:36PM +1000, Trent W. Buck wrote:
> > I still advocate solving only MY problem, with a simple change:
> > 
> >     https://bugs.debian.org/cgi-bin/bugreport.cgi?att=2;bug=929923;filename=929923.patch;msg=22
> 
> And I still say that it at least for en_GB is wrong.
> As said: color vs. colour.
> You say that Australia is used to both, OK, I believe so - but I don't think
> so for en_GB.

As I hinted before, mythes-en-us already contains "colour",
though admittedly not in all cases:

    bash5$ grep -Fc color /usr/share/mythes/th_en_US_v2.dat
    960
    bash5$ grep -Fc colour /usr/share/mythes/th_en_US_v2.dat
    661

A quick analysis of Debian 10's mythes-en-us [1] shows:

  * About 4.3% of the words are valid British-only words ((276K - 208K) ÷ 1.6M).
  * About 3.8% of the words are valid American-only words ((269K - 208K) ÷ 1.6M).

So according to hunspell (the same spell-checker LibreOffice uses),
th_en_US_v2.dat is actually more British than American :-)
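
As a sanity check on the arithmetic, here is a minimal sketch that
recomputes the two shares from the rounded counts in [1].  The helper
name "pct" is made up for illustration, and because the inputs are
rounded to the nearest thousand, the results can differ slightly from
figures computed on the exact counts.

```shell
# pct MISSPELT BOTH TOTAL  (hypothetical helper; all arguments in thousands)
# Prints the percentage of TOTAL that one dictionary rejects
# but that is not rejected by both dictionaries.
pct() {
    LC_ALL=C awk -v m="$1" -v b="$2" -v t="$3" \
        'BEGIN { printf "%.2f\n", 100 * (m - b) / t }'
}

pct 276 208 1600   # rejected by en_US only, i.e. British-only
pct 269 208 1600   # rejected by en_GB only, i.e. American-only
```

With these rounded inputs it prints 4.25 and 3.81, in line with the
roughly 4.3% and 3.8% quoted above.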


[1]

    bash5$ dpkg-query -W mythes-en-us hunspell hunspell-en-us hunspell-en-gb
    hunspell    1.7.0-2
    hunspell-en-gb      1:6.2.0-1
    hunspell-en-us      1:2018.04.16-1
    mythes-en-us        1:6.2.0-1

    bash5$ wc -w /usr/share/mythes/th_en_US_v2.dat | numfmt --to si   # how many words in total?
    1.6M /usr/share/mythes/th_en_US_v2.dat
    bash5$ hunspell -l -d en_US,en_GB /usr/share/mythes/th_en_US_v2.dat | wc -l | numfmt --to si  # how many words misspelt in "both" English varieties (i.e. false positives)?
    208K
    bash5$ hunspell -l -d en_US /usr/share/mythes/th_en_US_v2.dat | wc -l | numfmt --to si  # how many words misspelt in en_US?
    276K
    bash5$ hunspell -l -d en_GB /usr/share/mythes/th_en_US_v2.dat | wc -l | numfmt --to si  # how many words misspelt in en_GB?
    269K
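
Subtracting the "both" count relies on every word rejected by both
dictionaries also being rejected by each one alone.  To list the
variety-specific words rather than only count them, one option is a
set difference with comm.  This is only a sketch: "only_in_first" and
the file names in the comments are hypothetical, and it assumes each
input file holds one word per line, as saved from "hunspell -l".

```shell
# only_in_first A B  (hypothetical helper)
# Prints the distinct lines that appear in file A but not in file B.
only_in_first() {
    a=$(mktemp) b=$(mktemp)
    # comm needs sorted input; sort -u also drops repeated words.
    LC_ALL=C sort -u "$1" > "$a"
    LC_ALL=C sort -u "$2" > "$b"
    # -2 hides lines unique to B, -3 hides lines common to both.
    LC_ALL=C comm -23 "$a" "$b"
    rm -f "$a" "$b"
}

# Possible usage (file names are placeholders):
#   hunspell -l -d en_US /usr/share/mythes/th_en_US_v2.dat > bad_us.txt
#   hunspell -l -d en_GB /usr/share/mythes/th_en_US_v2.dat > bad_gb.txt
#   only_in_first bad_us.txt bad_gb.txt   # rejected by en_US only
```

Because this counts distinct words, its totals need not match the
"wc -l" figures above if hunspell lists a word more than once.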



PS: Out of curiosity, I looked up some references re "colour" specifically.

    The OED is different enough from en-GB to have its own locale
    (en-GB-oxendict), but AFAIK it is nevertheless the primary reference
    for en-GB spelling.
    I don't have a dead-tree version; its online version appears to live here:

        https://www.lexico.com/en/definition/color
        https://www.lexico.com/en/definition/colour
        https://www.lexico.com/en/definition/-our

    which simply has rather dogmatic labels "US" and "British",
    though it notes that "-our" is merely a "variant spelling".

    Fowler's (1e) definition of "colo(u)r" (p. 83) directs me to
    "See -OR & -OUR", which says

       It is not worth while either to resist such a gradual change or
       to fly in the face of national sentiment by trying to hurry it.

       The American abolition of -our [...] has probably retarded
       rather than quickened English progress in the same direction.

    For en-AU, the AGPS Style Manual (5e) in §3.1 through §3.18
    (pp. 39-42) simply advises doing whatever Macquarie says.
    I don't have a copy of Macquarie handy, and
    the online version is paywalled.
