Bug#993813: warn about known invalid fields in debian/upstream/metadata

2021-09-06 Thread Felix Lechner
Hi,

On Mon, Sep 6, 2021 at 3:42 PM Jelmer Vernooij wrote:
>
> something needs to tell the maintainer their package is wrong.

I struggle with telling anyone that "their package is wrong," when a
maintainer, possibly overwhelmed by zealous metadata collection
(hello, DuckDuckGo), added a defensible value. The sources I saw that
listed "https://github.com/join" for Registration pointed to upstreams
offering issue resolution via Github. I believe the site requires
users to be registered. Moreover, the URL appeared to be correct.

> If they're a part of the debian/ packaging, then surely they're in the
> realm of what lintian checks for?

Lintian can do it, but you presented neither a good set of prohibited
values nor a good set of allowed values. In fact, neither side is
well-defined, which makes for poor tags.

> Should we create separate linters
> for certain files under debian/ like debian/upstream/metadata ?

I would certainly welcome such a tool.

Kind regards
Felix Lechner



Bug#993813: warn about known invalid fields in debian/upstream/metadata

2021-09-06 Thread Jelmer Vernooij
On Mon, Sep 06, 2021 at 03:10:05PM -0700, Felix Lechner wrote:
> On Mon, Sep 6, 2021 at 2:26 PM Jelmer Vernooij wrote:
> > It won't tell maintainers of packages that use
> > invalid settings that they are doing so. Isn't that the purpose of lintian?
> I am not sure. Is it perhaps a gray zone the Janitor could fill?
I don't see how the janitor is related here. It's not a linting tool
and it can't report issues to maintainers. lintian-brush can fix a
subset of issues reported by lintian (where it can edit the canonical
source that matches the output scanned by lintian), but in the cases
where it can't we need the maintainer to fix the issue - and something
needs to tell the maintainer their package is wrong.

> There are a few open questions: Why for example does the Github signup
> page occur so often in the archive? [1] Do we actually need the field?
> [2] I am not even sure the reference is incorrect. What if an upstream
> manages bug reports via Github's issue tracker, like gocryptfs? [3]
> (Please don't worry—I did not set the Registration field there. [4])
> 
> To be sure, I am not opposed to your suggestion in principle, but
> people do a lot of weird stuff. Is the obscure (and often ignored)
> upstream metadata really worth our attention?

Whether these fields are useful enough to be included in
debian/upstream/metadata is a great question, and I'm very happy to
receive pushback in that regard. That should probably be a part of the
wider discussion around the finalization of DEP-12.

> > Or, looking at a counter-example - there is e.g. a pypi-homepage
> > tag; not just a homepage classification.
> 
> I think there is a difference. A project's home page is often the
> first point of contact, especially in search of documentation. When do
> people look at the Registration field in the upstream metadata,
> please?

I think we should either kill these fields if they're not useful,
/or/ make sure that they have correct values in them. Leaving them with
often incorrect data makes them even less useful and just adds extra
noise and work.

If they're a part of the debian/ packaging, then surely they're in the
realm of what lintian checks for? Should we create separate linters
for certain files under debian/ like debian/upstream/metadata ?

Jelmer

-- 
Jelmer Vernooij 
PGP Key: https://www.jelmer.uk/D729A457.asc




Bug#993813: warn about known invalid fields in debian/upstream/metadata

2021-09-06 Thread Felix Lechner
Hi,

On Mon, Sep 6, 2021 at 2:26 PM Jelmer Vernooij wrote:
>
> It won't tell maintainers of packages that use
> invalid settings that they are doing so. Isn't that the purpose of lintian?

I am not sure. Is it perhaps a gray zone the Janitor could fill?

There are a few open questions: Why for example does the Github signup
page occur so often in the archive? [1] Do we actually need the field?
[2] I am not even sure the reference is incorrect. What if an upstream
manages bug reports via Github's issue tracker, like gocryptfs? [3]
(Please don't worry—I did not set the Registration field there. [4])

To be sure, I am not opposed to your suggestion in principle, but
people do a lot of weird stuff. Is the obscure (and often ignored)
upstream metadata really worth our attention?

> Or, looking at a counter-example - there is e.g. a pypi-homepage
> tag; not just a homepage classification.

I think there is a difference. A project's home page is often the
first point of contact, especially in search of documentation. When do
people look at the Registration field in the upstream metadata,
please?

Kind regards
Felix Lechner

[1] https://codesearch.debian.net/search?q=https%3A%2F%2Fgithub.com%2Fjoin&literal=1&perpkg=1
[2] https://wiki.debian.org/UpstreamMetadata
[3] https://github.com/rfjakob/gocryptfs
[4] https://sources.debian.org/src/gocryptfs/1.8.0-1/debian/upstream/metadata/



Bug#993813: warn about known invalid fields in debian/upstream/metadata

2021-09-06 Thread Jelmer Vernooij
On Mon, Sep 06, 2021 at 01:44:31PM -0700, Felix Lechner wrote:
> On Mon, Sep 6, 2021 at 12:56 PM Jelmer Vernooij wrote:
> > it would simply be a list of
> > known bad values
> 
> I am not sure I agree with the hardcoding of those values unless they
> create legal issues like license violations or the risk of criminal
> prosecution. How about we repurpose the classification tag
> 'upstream-metadata-field-present' to also provide the field contents,
> which is what you are after?
> 
> You could then use Lintian's query interface to examine the values to
> your liking.

That will mean maintainers need to decide which values are valid and
which ones aren't. It won't tell maintainers of packages that use
invalid settings that they are doing so. Isn't that the purpose of lintian?

Or, looking at a counter-example - there is e.g. a pypi-homepage
tag; not just a homepage classification.

Cheers,

Jelmer

-- 
Jelmer Vernooij 
PGP Key: https://www.jelmer.uk/D729A457.asc




Bug#993813: warn about known invalid fields in debian/upstream/metadata

2021-09-06 Thread Felix Lechner
Control: retitle -1 lintian: export upstream metadata in classification tag

Hi,

On Mon, Sep 6, 2021 at 12:56 PM Jelmer Vernooij wrote:
>
> it would simply be a list of
> known bad values

I am not sure I agree with the hardcoding of those values unless they
create legal issues like license violations or the risk of criminal
prosecution. How about we repurpose the classification tag
'upstream-metadata-field-present' to also provide the field contents,
which is what you are after?

You could then use Lintian's query interface to examine the values to
your liking.
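
A rough sketch of what examining exported field contents could look like,
assuming the metadata files contain only flat top-level "Field: value" lines
(real files are YAML and may nest). The function names and the sample archive
are hypothetical and are not part of Lintian or its query interface.

```python
from collections import Counter


def export_fields(package, metadata_text):
    """Yield (package, field, value) for each top-level 'Field: value' line."""
    for line in metadata_text.splitlines():
        # Skip blanks, comments, YAML markers/list items, and nested lines.
        if not line.strip() or line.startswith(("#", " ", "\t", "-")):
            continue
        name, sep, value = line.partition(":")
        if sep:
            yield package, name.strip(), value.strip()


def count_values(records, field):
    """Count how often each value appears for one field across packages."""
    return Counter(value for _, name, value in records if name == field)


# Made-up package names and contents, for illustration only.
archive = {
    "pkg-a": "Registration: https://github.com/join\n",
    "pkg-b": "Registration: https://github.com/join\n",
    "pkg-c": "Registration: https://example.org/signup\n",
}
records = [r for pkg, text in archive.items() for r in export_fields(pkg, text)]
for value, count in count_values(records, "Registration").most_common():
    print(count, value)
```

With the contents exported this way, deciding which values are acceptable is
left to whoever runs the query rather than being hardcoded in the tool.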

Kind regards
Felix Lechner



Bug#993813: warn about known invalid fields in debian/upstream/metadata

2021-09-06 Thread Jelmer Vernooij
On Mon, Sep 06, 2021 at 12:51:43PM -0700, Felix Lechner wrote:
> On Mon, Sep 6, 2021 at 12:24 PM Jelmer Vernooij wrote:
> >
> > Registration: https://github.com/join
> 
> As a tool without network access, Lintian may not be well-suited to
> synchronize upstream metadata.

This wouldn't require network access - it would simply be a list of
known bad values, in the same way that we have those in other places
(e.g. known bad hosting sites for Homepage).
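
A minimal sketch of such an offline check, assuming a hypothetical table of
known-bad values and metadata made of flat top-level "Field: value" lines
(real files are YAML and may nest). Only the Registration value comes from
this report; the table layout, function names, and sample input are
illustrative and do not reflect how Lintian is implemented.

```python
# Hypothetical table of known-bad values, keyed by field name.
KNOWN_BAD_VALUES = {
    "Registration": {"https://github.com/join"},
}


def parse_flat_metadata(text):
    """Parse top-level 'Field: value' lines; nested YAML is ignored."""
    fields = {}
    for line in text.splitlines():
        # Skip blanks, comments, YAML markers/list items, and nested lines.
        if not line.strip() or line.startswith(("#", " ", "\t", "-")):
            continue
        name, sep, value = line.partition(":")
        if sep:
            fields[name.strip()] = value.strip()
    return fields


def find_known_bad(fields):
    """Return sorted (field, value) pairs whose value is on the known-bad list."""
    return sorted(
        (name, value)
        for name, value in fields.items()
        if value.rstrip("/") in KNOWN_BAD_VALUES.get(name, ())
    )


# Made-up sample input, for illustration only.
sample = """\
Bug-Database: https://example.org/project/issues
Registration: https://github.com/join
"""
for name, value in find_known_bad(parse_flat_metadata(sample)):
    print(f"known-bad value in field {name}: {value}")
```

The lookup is a plain set membership test against a static table, so no
network access is involved.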

Jelmer

-- 
Jelmer Vernooij 
PGP Key: https://www.jelmer.uk/D729A457.asc




Bug#993813: warn about known invalid fields in debian/upstream/metadata

2021-09-06 Thread Felix Lechner
Hi,

On Mon, Sep 6, 2021 at 12:24 PM Jelmer Vernooij wrote:
>
> Registration: https://github.com/join

As a tool without network access, Lintian may not be well-suited to
synchronize upstream metadata.

Kind regards
Felix Lechner



Bug#993813: warn about known invalid fields in debian/upstream/metadata

2021-09-06 Thread Jelmer Vernooij
Package: lintian
Version: 2.104.0
Severity: wishlist

Some packages have known incorrect values in debian/upstream/metadata.
For example:

Registration: https://github.com/join

is certain to be incorrect - as we don't package GitHub.

It would be great if lintian could warn about these.

-- System Information:
Debian Release: 11.0
  APT prefers unstable
  APT policy: (500, 'unstable'), (500, 'testing')
Architecture: amd64 (x86_64)

Kernel: Linux 5.10.0-8-amd64 (SMP w/2 CPU threads)
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages lintian depends on:
ii  binutils                        2.37-4
ii  bzip2                           1.0.8-4
ii  diffstat                        1.64-1
ii  dpkg                            1.20.9
ii  dpkg-dev                        1.20.9
ii  file                            1:5.39-3
ii  gettext                         0.21-4
ii  gpg                             2.2.27-2
ii  intltool-debian                 0.35.0+20060710.5
ii  libapt-pkg-perl                 0.1.40
ii  libarchive-zip-perl             1.68-1
ii  libcapture-tiny-perl            0.48-1
ii  libclass-xsaccessor-perl        1.19-3+b7
ii  libclone-perl                   0.45-1+b1
ii  libconfig-tiny-perl             2.26-1
ii  libcpanel-json-xs-perl          4.25-1+b1
ii  libdata-dpath-perl              0.58-1
ii  libdata-validate-domain-perl    0.10-1.1
ii  libdevel-size-perl              0.83-1+b2
ii  libdpkg-perl                    1.20.9
ii  libemail-address-xs-perl        1.04-1+b3
ii  libfile-basedir-perl            0.08-1
ii  libfile-find-rule-perl          0.34-1
ii  libfont-ttf-perl                1.06-1.1
ii  libhtml-html5-entities-perl     0.004-1.1
ii  libipc-run3-perl                0.048-2
ii  libjson-maybexs-perl            1.004003-1
ii  liblist-compare-perl            0.55-1
ii  liblist-moreutils-perl          0.430-2
ii  liblist-utilsby-perl            0.11-1
ii  libmoo-perl                     2.004004-1
ii  libmoox-aliases-perl            0.001006-1.1
ii  libnamespace-clean-perl         0.27-1
ii  libpath-tiny-perl               0.118-1
ii  libperlio-gzip-perl             0.19-1+b7
ii  libproc-processtable-perl       0.59-2+b1
ii  libsereal-decoder-perl          4.018+ds-1+b1
ii  libsereal-encoder-perl          4.018+ds-1+b1
ii  libtext-glob-perl               0.11-1
ii  libtext-levenshteinxs-perl      0.03-4+b8
ii  libtext-markdown-discount-perl  0.12-1+b1
ii  libtext-xslate-perl             3.5.8-1+b1
ii  libtime-duration-perl           1.21-1
ii  libtime-moment-perl             0.44-1+b3
ii  libtimedate-perl                2.3300-2
ii  libtry-tiny-perl                0.30-1
ii  libtype-tiny-perl               1.012002-1
ii  libunicode-utf8-perl            0.62-1+b2
ii  liburi-perl                     5.08-1
ii  libxml-libxml-perl              2.0134+dfsg-2+b1
ii  libyaml-libyaml-perl            0.82+repack-1+b1
ii  lzip                            1.22-3
ii  lzop                            1.04-2
ii  man-db                          2.9.4-2
ii  patchutils                      0.4.2-1
ii  perl [libdigest-sha-perl]       5.32.1-5
ii  t1utils                         1.41-4
ii  unzip                           6.0-26
ii  xz-utils                        5.2.5-2

lintian recommends no packages.

Versions of packages lintian suggests:
pn  binutils-multiarch 
ii  libtext-template-perl  1.59-1

-- no debconf information