Package: python3-dateutil
Followup-For: Bug #1003044
X-Debbugs-Cc: debian-le...@lists.debian.org

(adding debian-legal on cc for any sanity-checks available)

To recap: we have a bug, #1003044, that is rated 'grave', and so it is
considered release-critical for Debian bookworm, although no written
justification for that severity has been given so far.  The bug relates to
timezone data in the 'python-dateutil' source package.

The request, in short, is: can we repackage a specific tarball from the
python-dateutil package's source, given that it is derived from public domain
data and is distributed in an Apache-2.0 repository?


Context follows.


Note: this message uses selective -- chronological, and where possible,
referentially sequential -- quotations; I'm trying to present a coherent
thread of the discussion so far related solely to the usage of the relevant
'dateutil-zoneinfo.tar.gz' file as contained in tagged, published releases of
the upstream python-dateutil[1] library.


Facts as I understand them:

  * In its original distributed form, the IANA tz database is public domain
    (and therefore is DFSG-compatible).

  * A file, dateutil-zoneinfo.tar.gz, was built by upstream and included in
    their own software releases. It was built from tzdata2021a.tar.gz according
    to the metadata within the dateutil-zoneinfo.tar.gz file, and the metadata
    includes integrity hashes.

  * For the relevant distributed versions, the upstream library is Apache-2.0
    licensed.

  * Debian requires[2] that users can rebuild distributed (aka 'binary')
    packages from source.

  * A bug[3] was reported about the inclusion of dateutil-zoneinfo.tar.gz in
    Debian's packaging, and subsequently that file was removed.  This remains
    the status quo at the time of writing.

  * IANA tz database releases do not remove old timezone names but instead
    add a backwards-compatible link from the previous name to the current.

    * I am not an expert about the tz database, but I believe that this is
      relevant because python-dateutil's code may attempt to access _both_ the
      system timezone database (likely to be more recent) _and_ subsequently
      the bundled dateutil-zoneinfo.tar.gz (likely to be older), the latter as
      a fallback, under some circumstances.

    * "If a name is changed, put its old spelling in the 'backward' file as a
      link to the new spelling. This means old spellings will continue to
      work. Ordinarily a name change should occur only in the rare case when
      a location's consensus English-language spelling changes; for example,
      in 2008 Asia/Calcutta was renamed to Asia/Kolkata due to long-time
      widespread use of the new city name instead of the old."
      - https://data.iana.org/time-zones/theory.html
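
To make that fallback order concrete, here is a minimal sketch using only the
Python standard library.  It is not dateutil's actual code: find_zone,
system_dirs and bundled_tarball are names I have made up for illustration, and
the function only reports where a zone name would be found rather than parsing
any timezone data.

```python
import os
import tarfile


def find_zone(name, system_dirs, bundled_tarball):
    """Report where a zone name would be found (hypothetical helper).

    Returns ("system", path), ("bundled", member_name), or None.
    """
    # 1. Prefer the system database, which is likely to be more recent.
    for base in system_dirs:
        candidate = os.path.join(base, name)
        if os.path.isfile(candidate):
            return ("system", candidate)

    # 2. Fall back to the bundled tarball, which is likely to be older.
    if os.path.isfile(bundled_tarball):
        with tarfile.open(bundled_tarball) as tf:
            if name in tf.getnames():
                return ("bundled", name)

    # 3. The name is unknown to both sources.
    return None
```

Under this ordering, the bundled data only matters for names that the system
database does not provide.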


If anyone feels like I'm misrepresenting (or under-representing) their
viewpoints, please say so.


On Tue, 21 Feb 2023 22:27:53 +0100, Felix wrote:
> I'm inclined to just ship the bundled timezone database with the package:

On Wed, 22 Feb 2023 11:52:25 +0000, James wrote:
> That may not be an option for us (at least without more work to find and
> package the sources of the relevant zoneinfo database): tz data content was
> removed from src:python-dateutil (the source of this package) to resolve
> previous bug #665894, relating to dfsg-compatibility.

On Fri, 3 Mar 2023 15:05:03 -0500, morph wrote:
> even if these APIs are deprecated upstream, i think breaking them on
> purpose (by removing the bundled timezone file) is uncalled for.

> Either we reintroduce the timezone file (that may not be a good idea)
> or translate these deprecated APIs into the recommended one, or we do
> something else entirely, it's up for debate.

On Sun, 05 Mar 2023 15:37:43 +0100, Armour wrote:
> I can't really comment on that. Other distros don't seem to remove it 
...
> One thing we could do is to regenerate the bundled database based on actual 
> zoneinfo. But then the package should be rebuilt every time zoneinfo is 
> updated...
...
> In my view, no actual user is asking for the possibility of using the bundled 
> database, or anything nebulous like using the system database even if the 
> bundled one is requested explicitly. They're simply asking for an irrelevant 
> warning to be removed.

On Sun, 5 Mar 2023 23:07:44 +0100, Felix wrote:
> That's probably true but there are direct users of the dateutil.zoneinfo API
> which intrinsically uses the bundled database.
...
> Therefore shipping the bundled zoneinfo tarball seems like the better 
> solution to me.
> The timezone database is clearly DFSG-free. We would have to repackage the
> upstream tarball to include the timezone database source though.
> Thankfully upstream ships the script to (re-)generate the zoneinfo tarball.


Although various options have been discussed, I currently agree with Felix's
recommendation (begin shipping the tarball again), as long as we can confirm
that that's OK to do.
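
If we do go the repackaging route, one check we could run is that the tzdata
source tarball we add matches the integrity hash recorded in the bundled
file's metadata.  A minimal sketch follows, with loud assumptions: I am
assuming the metadata is a JSON member named 'METADATA' inside the tarball and
that the hash lives under a 'tzdata_file_sha512' key.  Both are passed as
parameters so they can be corrected if my reading of the file layout is wrong,
and the function names are my own.

```python
import hashlib
import json
import tarfile


def read_bundle_metadata(bundle_path, member="METADATA"):
    """Load the JSON metadata member from the bundled tarball.

    The member name is an assumption about the tarball layout.
    """
    with tarfile.open(bundle_path) as tf:
        fobj = tf.extractfile(member)  # raises KeyError if the member is absent
        if fobj is None:
            raise ValueError("%r is not a regular file in the tarball" % member)
        return json.loads(fobj.read().decode("utf-8"))


def sha512_of(path, chunk_size=1 << 20):
    """Compute the SHA-512 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def source_matches_bundle(bundle_path, tzdata_path, key="tzdata_file_sha512"):
    """Return True if the tzdata tarball's hash equals the recorded one.

    The metadata key name is an assumption, not confirmed against upstream.
    """
    metadata = read_bundle_metadata(bundle_path)
    return sha512_of(tzdata_path) == metadata[key].lower()
```

A match would only show that the source we ship is the input named in the
metadata; it says nothing by itself about DFSG-compatibility.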

Armour's concern may be correct that few - if any - people intentionally want
to use the outdated/static bundled data.  However, because the tz database
keeps old names working through backward-compatible links, lookups against the
system database should usually succeed, and I don't think many people _will_
end up using the bundled data even if we ship it.  I think that mitigates the
concern.

We would become consistent with other operating systems' distributions and with
the 'pip install' (non-operating-system distribution) of python-dateutil.  That
seems almost entirely positive, with one caveat: had we been consistent from
the start, it may be that nobody would have discovered the issue raised in
#1003044 in the first place.


It could be the case that someone on debian-legal can confirm that the tarball
is trivially DFSG-compatible; I don't feel confident enough to say that for
sure and would like another opinion.


[1] - https://github.com/dateutil/dateutil/

[2] - https://www.debian.org/social_contract

[3] - https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=665894
