Bug#1019235: lintian: 'licence' is not a misspelling

2022-09-18 Thread Thorsten Glaser
Package: lintian
Version: 2.115.3
Followup-For: Bug #1019235
X-Debbugs-Cc: t...@mirbsd.de

Spotted this too. Please fix it: 'licence' is proper English spelling,
not some overseas barbarian dialect.


-- System Information:
Debian Release: bookworm/sid
  APT prefers unstable-debug
  APT policy: (500, 'unstable-debug'), (500, 'buildd-unstable'), (500, 'unstable'), (1, 'experimental')
merged-usr: no
Architecture: amd64 (x86_64)

Kernel: Linux 5.10.0-10-amd64 (SMP w/4 CPU threads)
Kernel taint flags: TAINT_FIRMWARE_WORKAROUND
Locale: LANG=C, LC_CTYPE=C.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /bin/lksh
Init: sysvinit (via /sbin/init)

Versions of packages lintian depends on:
ii  binutils                        2.38.90.20220713-2
ii  bzip2                           1.0.8-5
ii  diffstat                        1.64-1
ii  dpkg                            1.21.9
ii  dpkg-dev                        1.21.9
ii  file                            1:5.41-4
ii  gettext                         0.21-9
ii  gpg                             2.2.39-1
ii  intltool-debian                 0.35.0+20060710.5
ii  iso-codes                       4.11.0-1
ii  libapt-pkg-perl                 0.1.40+b1
ii  libarchive-zip-perl             1.68-1
ii  libberkeleydb-perl              0.64-1+b2
ii  libcapture-tiny-perl            0.48-1
ii  libclass-xsaccessor-perl        1.19-4
ii  libclone-perl                   0.45-1+b2
ii  libconfig-tiny-perl             2.28-1
ii  libconst-fast-perl              0.014-2
ii  libcpanel-json-xs-perl          4.32-1
ii  libdata-dpath-perl              0.58-1
ii  libdata-validate-domain-perl    0.10-1.1
ii  libdata-validate-uri-perl       0.07-2
ii  libdevel-size-perl              0.83-2
pn  libdigest-sha-perl
ii  libdpkg-perl                    1.21.9
ii  libemail-address-xs-perl        1.05-1
ii  libencode-perl                  3.19-1
ii  libfile-basedir-perl            0.09-1
ii  libfile-find-rule-perl          0.34-2
ii  libfont-ttf-perl                1.06-2
ii  libhtml-html5-entities-perl     0.004-2
ii  libhtml-tokeparser-simple-perl  3.16-4
ii  libio-interactive-perl          1.023-1
ii  libipc-run3-perl                0.048-2
ii  libjson-maybexs-perl            1.004003-1
ii  liblist-compare-perl            0.55-1
ii  liblist-someutils-perl          0.58-1
ii  liblist-utilsby-perl            0.12-1
ii  libmldbm-perl                   2.05-3
ii  libmoo-perl                     2.005004-3
ii  libmoox-aliases-perl            0.001006-2
ii  libnamespace-clean-perl         0.27-2
ii  libpath-tiny-perl               0.122-1
ii  libperlio-gzip-perl             0.20-1
ii  libperlio-utf8-strict-perl      0.009-1+b1
ii  libproc-processtable-perl       0.634-1+b1
ii  libregexp-wildcards-perl        1.05-3
ii  libsereal-decoder-perl          5.001+ds-1
ii  libsereal-encoder-perl          5.001+ds-1
ii  libsort-versions-perl           1.62-2
ii  libsyntax-keyword-try-perl      0.27-1
ii  libterm-readkey-perl            2.38-2
ii  libtext-levenshteinxs-perl      0.03-5
ii  libtext-markdown-discount-perl  0.13-1+b1
ii  libtext-xslate-perl             3.5.9-1+b1
ii  libtime-duration-perl           1.21-1
ii  libtime-moment-perl             0.44-2
ii  libtimedate-perl                2.3300-2
ii  libunicode-utf8-perl            0.62-1+b3
ii  liburi-perl                     5.12-1
ii  libwww-mechanize-perl           2.15-1
ii  libwww-perl                     6.67-1
ii  libxml-libxml-perl              2.0207+dfsg+really+2.0134-1
ii  libyaml-libyaml-perl            0.84+ds-1
ii  lzip [lzip-decompressor]        1.23-4
ii  lzop                            1.04-2
ii  man-db                          2.10.2-3
ii  patchutils                      0.4.2-1
ii  perl [libencode-perl]           5.34.0-5
ii  t1utils                         1.41-4
ii  unzip                           6.0-27
ii  xz-utils                        5.2.5-2.1

lintian recommends no packages.

Versions of packages lintian suggests:
pn  binutils-multiarch 
pn  libtext-template-perl  

-- no debconf information



Bug#1019235: lintian: 'licence' is not a misspelling

2022-09-11 Thread Axel Beckert
Control: clone -1 -2
Control: tag -1 + confirmed pending
Control: retitle -2 lintian: New spelling corrections should be automatically checked against an american and a british english dictionary
Control: severity -2 wishlist

Hi Andreas,

Andreas Beckmann wrote:
> 'licence' is a valid (mostly British) variant of license

Yep, noticed this as well before I saw your bug report. Already fixed in
https://salsa.debian.org/lintian/lintian/-/commit/7d801b2c9c88683051afe0937b46f065cb8873a2

> Perhaps (new) spelling corrections should be automatically checked
> against an American and a British English dictionary and carefully
> reconsidered if they are found?

Good idea! Cloning the bug report for that accordingly as this is a
separate thing.

Still don't have an idea how to actually do that, but I guess it will
be part of the test suite, not a commit hook.

> I am not suggesting that we delete all the matches (I haven't heard of most
> of the matching words and would need to look up their meaning...):
> 
> $ grep -v ^# /usr/share/lintian/data/spelling/corrections | cut -d '|' -f 1 |
>     while read word ; do grep "^$word\$" /usr/share/dict/american-english \
>       /usr/share/dict/british-english ; done

Thanks for figuring out this nice little command! I will, though, try to
optimize it so that it does not call grep for each word, using something like:

  grep -Fw -f <(grep -v '^#' /usr/share/lintian/data/spelling/corrections | cut -d '|' -f 1) \
    /usr/share/dict/american-english /usr/share/dict/british-english

I now wonder if we should use wamerican/wbritish or
wamerican-insane/wbritish-insane for that. Maybe wamerican/wbritish is
a good start, and if we still get too many false positives, we can
extend it to use wamerican-insane/wbritish-insane. (The latter will
probably also take longer. But then again, with my optimized query
above it also takes less than a second on a 7-year-old laptop.
And it yields about 350 hits.)
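
For illustration only, here is a minimal sketch of the single-pass idea on
made-up stand-in files (the sample words, the `||` separator and the temp
file names are assumptions, not lintian's real data). It also shows that
`-w` and `-x` can differ: `-x` matches whole lines, like the anchored
`^word$` regex in the per-word loop, while `-w` also hits lines such as
"wont's".

```shell
#!/bin/sh
# Sketch: one grep pass instead of one grep per word.
# All files below are small hypothetical samples, not lintian's data.
set -eu
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

cat > "$tmp/corrections" <<'EOF'
# comment lines start with '#'
licence||license
bellow||below
teh||the
wont||won't
EOF

# Stand-in for /usr/share/dict/american-english and friends.
printf '%s\n' below bellow licence license the "wont's" > "$tmp/dict"

# First field of every non-comment line: the word that would be flagged.
grep -v '^#' "$tmp/corrections" | cut -d '|' -f 1 > "$tmp/words"

# -F: fixed strings, -x: whole-line match, -f: read patterns from a file.
whole_line_hits=$(grep -Fx -f "$tmp/words" "$tmp/dict")
# -w: whole-word match; a line like "wont's" also counts as a hit.
word_hits=$(grep -Fw -f "$tmp/words" "$tmp/dict")

printf 'whole-line hits:\n%s\n' "$whole_line_hits"
printf 'whole-word hits:\n%s\n' "$word_hits"
```

Writing the word list to a file instead of using `<(...)` keeps the sketch
runnable in a plain POSIX shell; the process substitution form needs bash.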

Some comments about some of those you found:

> /usr/share/dict/american-english:bellow
> /usr/share/dict/british-english:bellow
> /usr/share/dict/american-english:singed
> /usr/share/dict/british-english:singed

Would keep these. The chances that it is a misspelling of "below" or
"signed" are IMHO much higher than the chance that it is used in
Debian in its actual meaning.

So in case we write a test for this, we should probably list
exceptions we want to keep in that test.
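
A rough sketch of what such a test with an exception list could look like,
assuming a hypothetical `keep` file that lists the corrections we
deliberately retain (all file names and sample words here are invented for
illustration; nothing like this exists in lintian's test suite yet):

```shell
#!/bin/sh
# Sketch: flag corrections that are dictionary words, minus an allowlist.
# The corrections, dictionary and keep list are hypothetical samples.
set -eu
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

printf '%s\n' 'bellow||below' 'singed||signed' 'licence||license' 'teh||the' \
    > "$tmp/corrections"
printf '%s\n' bellow licence license signed singed the > "$tmp/dict"
# Exceptions: dictionary words we still want to treat as misspellings.
printf '%s\n' bellow singed > "$tmp/keep"

grep -v '^#' "$tmp/corrections" | cut -d '|' -f 1 > "$tmp/words"

# Flagged words that are also whole-line dictionary entries.
grep -Fx -f "$tmp/words" "$tmp/dict" > "$tmp/hits" || true
# Remove the allowlisted exceptions; whatever remains needs a second look.
unexpected=$(grep -Fvx -f "$tmp/keep" "$tmp/hits" || true)

if [ -n "$unexpected" ]; then
    printf 'corrections that are dictionary words:\n%s\n' "$unexpected"
    status=1
else
    status=0
fi
```

A real test would end with `exit "$status"` so that any dictionary word not
on the keep list fails the run.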

> /usr/share/dict/american-english:convertor
> /usr/share/dict/british-english:convertor
> /usr/share/dict/american-english:dependance
> /usr/share/dict/american-english:dependant
> /usr/share/dict/british-english:dependant
> /usr/share/dict/american-english:extravert
> /usr/share/dict/british-english:extravert
> /usr/share/dict/american-english:extraverts
> /usr/share/dict/british-english:extraverts
> /usr/share/dict/american-english:licence
> /usr/share/dict/british-english:licence
> /usr/share/dict/american-english:miniscule
> /usr/share/dict/british-english:miniscule
> /usr/share/dict/american-english:venders
> /usr/share/dict/american-english:vender
> /usr/share/dict/american-english:want's
> /usr/share/dict/british-english:want's

These should probably be removed. They all look like alternative
spellings, either historic or local.

Not sure about the remaining ones.

Regards, Axel
-- 
 ,''`.  |  Axel Beckert , https://people.debian.org/~abe/
: :' :  |  Debian Developer, ftp.ch.debian.org Admin
`. `'   |  4096R: 2517 B724 C5F6 CA99 5329  6E61 2FF9 CD59 6126 16B5
  `-|  1024D: F067 EA27 26B9 C3FC 1486  202E C09E 1D89 9593 0EDE



Bug#1019235: lintian: 'licence' is not a misspelling

2022-09-05 Thread Andreas Beckmann
Package: lintian
Version: 2.115.3
Severity: normal

'licence' is a valid (mostly British) variant of license
https://www.merriam-webster.com/dictionary/licence

Seen here:
https://www.pcre.org/licence.txt
(and quoted in src:nvidia-cuda-toolkit EULA.txt, i.e. d/license)

Perhaps (new) spelling corrections should be automatically checked
against an American and a British English dictionary and carefully
reconsidered if they are found?

I am not suggesting that we delete all the matches (I haven't heard of most
of the matching words and would need to look up their meaning...):

$ grep -v ^# /usr/share/lintian/data/spelling/corrections | cut -d '|' -f 1 |
    while read word ; do grep "^$word\$" /usr/share/dict/american-english \
      /usr/share/dict/british-english ; done
/usr/share/dict/american-english:arrant
/usr/share/dict/british-english:arrant
/usr/share/dict/american-english:bellow
/usr/share/dict/british-english:bellow
/usr/share/dict/american-english:buss
/usr/share/dict/british-english:buss
/usr/share/dict/american-english:convertor
/usr/share/dict/british-english:convertor
/usr/share/dict/american-english:dependance
/usr/share/dict/american-english:dependant
/usr/share/dict/british-english:dependant
/usr/share/dict/american-english:extravert
/usr/share/dict/british-english:extravert
/usr/share/dict/american-english:extraverts
/usr/share/dict/british-english:extraverts
/usr/share/dict/american-english:fount
/usr/share/dict/british-english:fount
/usr/share/dict/american-english:futz
/usr/share/dict/british-english:futz
/usr/share/dict/american-english:licence
/usr/share/dict/british-english:licence
/usr/share/dict/american-english:miniscule
/usr/share/dict/british-english:miniscule
/usr/share/dict/american-english:programers
/usr/share/dict/american-english:programing
/usr/share/dict/american-english:singed
/usr/share/dict/british-english:singed
/usr/share/dict/american-english:venders
/usr/share/dict/american-english:vender
/usr/share/dict/american-english:want's
/usr/share/dict/british-english:want's
/usr/share/dict/american-english:wont
/usr/share/dict/british-english:wont

Andreas