Control: tags -1 + patch

Dear Orestis, python-debian maintainers & debsources folks,

> Debian standard [1] suggests that the license field in the files
> paragraph is required whereas when you parse it is only optional.
>
> This is sometimes causing a trouble when using the package since
> the user has to verify that the license object in the files
> paragraph is not None and thus raising AttributeError when
> accessing the synopsis for example.
>
> I guess the solution must not be to consider the d/copyright file
> as non machine readable but you might want to omit that specific
> paragraph and log this error.

I think this is a good suggestion. Looking at the use of
`debian.copyright.Copyright` in debsources, I can see that they have had
to do a similar dance to the one you describe:

https://salsa.debian.org/qa/debsources/blob/master/lib/debsources/license_helper.py#L105

I've taken a first cut at making the `Copyright` reader stricter about
required fields, but I would like some feedback from users of the
`Copyright` class before merging it:

https://salsa.debian.org/python-debian-team/python-debian/merge_requests/1

In particular:

 * It introduces a `MachineReadableFormatError`, which is used for format
   errors. I think it's worth distinguishing between an error in the
   format and the copyright file not being in the format at all
   (`NotMachineReadableError`). `MachineReadableFormatError` is derived
   from `ValueError`, which I think makes sense.

 * I've changed the other uses of `ValueError` within `Copyright` to use
   `MachineReadableFormatError` for consistency. Since the new exception
   is a subclass of `ValueError`, that should also be backwards
   compatible.

 * As suggested in comments already in the code, I've allowed a
   `strict=False` mode, which continues to use Python warnings rather
   than raising exceptions.

 * The comments in the code also talked about treating the http and https
   versions of the copyright spec as being the same. The spec has since
   been changed to say explicitly that both are OK but that the https URL
   is preferred, so the code will silently upgrade from http to https
   too.

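To make the points above concrete, here is a rough, self-contained sketch
of the behaviour being proposed. It does not import python-debian and is
not the code from the merge request: the two exception classes are local
stand-ins that only reuse the names, and `check_files_paragraph`,
`_complain` and `normalise_format_url` are hypothetical helpers invented
for illustration. The base class of `NotMachineReadableError` is also an
assumption here. The required fields (Files, Copyright, License) are the
ones the copyright-format spec lists for a Files paragraph.

```python
import warnings


class NotMachineReadableError(Exception):
    """Stand-in: the file is not in the machine-readable format at all."""


class MachineReadableFormatError(ValueError):
    """Stand-in: the file is in the format but breaks one of its rules."""


def _complain(msg, strict):
    # Hypothetical helper: raise in strict mode, otherwise only warn.
    if strict:
        raise MachineReadableFormatError(msg)
    warnings.warn(msg)


def check_files_paragraph(fields, strict=True):
    """Check a Files paragraph given as a plain dict of field -> value."""
    for name in ("Files", "Copyright", "License"):
        if name not in fields:
            _complain(
                "Files paragraph is missing required field %s" % name,
                strict,
            )
    return fields


def normalise_format_url(url):
    # Treat the http and https spellings of a spec URL as equivalent,
    # silently preferring https.
    if url.startswith("http://"):
        return "https://" + url[len("http://"):]
    return url
```

With these stand-ins, a paragraph without a License field raises
`MachineReadableFormatError` by default and only emits a warning when
called with `strict=False`. Because the stand-in derives from
`ValueError`, a caller's existing `except ValueError:` handler would still
catch it.
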
By throwing an exception as soon as an error is found, this becomes a bit
of an all-or-nothing approach. Would a better approach be incremental
validation, where as much as possible is read and each stanza carries a
`valid` attribute that propagates to the whole `Copyright` instance?
Users would then check that the file is valid after reading it in, rather
than using exception handling.

Comments, please! (Either to this bug or to the MR on salsa.)

thanks
Stuart

-- 
Stuart Prescott    http://www.nanonanonano.net/   stu...@nanonanonano.net
Debian Developer   http://www.debian.org/         stu...@debian.org
GPG fingerprint    90E2 D2C1 AD14 6A1B 7EBB 891D BBC1 7EBB 1396 F2F7