On Tue, Feb 21, 2023 at 02:23:34PM +0200, Adrian Bunk wrote:
> Looking at #1028371, should generated dependencies on python3-protobuf be
>   python3-protobuf (>= 3.21), python3-protobuf (<< 3.22)
> to ensure that the binary package is used with the same version
> as the protobuf-compiler used during the build?

I'm not the maintainer, but a drive-by contributor. I looked a bit into
this, given its RC severity. 

With my still somewhat limited understanding, a strict version alignment
between protobuf-compiler and python3-protobuf would probably resolve
this particular symptom, but the issues here seem to run deeper.

Specifically:
  * The protobuf project provides three different implementations of its
    Python bindings: pure Python, C++, and libupb-based[1]. These are
    selectable using the PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION
    environment variable.

  * Debian's python3-protobuf, from src:protobuf, ships the pure Python
    implementation, as well as the C++ bindings. The default
    implementation in Debian is "cpp".

  * The upb implementation is not included in src:protobuf, but in the
    upb upstream source[2], i.e. what is src:upb in Debian, even though
    the snapshot we have in Debian does not contain the sources for the
    Python bindings.

  * Upstream has switched the default implementation to "upb", and
    deprecated the "cpp" implementation. There is, in fact, no way for
    one to fetch the "cpp" version from PyPI. This is documented
    extensively in their May 2022 release notes[3]. However, Debian
    still ships, and defaults to, cpp, a major departure from upstream. 

  * Relatedly, when they made that switch, they also made changes to
    their versioning scheme, disconnecting the Python library's version
    from the source version. As a result, the Python API (both upb, as
    well as pure Python), is now versioned at "4.21", rather than
    "3.21". The Debian binary package python3-protobuf is versioned as
    "3.21.12-1", which is not a version that exists, or will ever exist,
    upstream. That binary package is, in fact, shipping egg metadata named
    protobuf-4.21.12.egg-info. (This is all also well documented in their
    release notes[3].)

  * In the same release notes document[3], they also state:
    "Python upb requires generated code that has been generated from
    protoc 3.19.0 or newer."

    Indeed, if one fetches protobuf 4.21 from PyPI, and runs:
      python3 -c 'import bernhard'
        or
      PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=upb python3 -c 'import bernhard'

    ...a traceback is emitted, but with a much more informative message:
    > TypeError: Descriptors cannot not be created directly.
    >
    > If this call came from a _pb2.py file, your generated code is out of
    > date and must be regenerated with protoc >= 3.19.0.
    >
    > If you cannot immediately regenerate your protos, some other possible
    > workarounds are:
    > 1. Downgrade the protobuf package to 3.20.x or lower.
    > 2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will
    > use pure-Python parsing and will be much slower).
    >
    > More information:
    > https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates
  
  * The release notes specifically mention "upb" requiring protoc
    (protobuf-compiler) >= 3.19, but not "cpp". However, as established
    above, "cpp" is deprecated and not used by anyone (but Debian), so
    they either meant "any non-pure-Python implementation" there, or
    did not pay as much attention to forward and backward compatibility,
    or to informative error messages, for their deprecated backend. It's
    likely, but not entirely clear, that the protoc requirement for
    "cpp" is >= 3.19 as well.

  * Finally, note that the pure Python implementation in 3.21.12-1+b2
    still works with python3-bernhard
    (Built-Using: protobuf-compiler (= 3.12.4-1+b3)):
      PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python python3 -c 'import bernhard'
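
As an aside, a quick way to see which library version and backend a
given installation actually ends up with is something like the
following. It is only a sketch: google.protobuf.internal.api_implementation
is an internal module rather than a stable API, and describe_protobuf is
a hypothetical helper name used here for illustration.

```python
import importlib


def describe_protobuf():
    """Return the runtime version and active backend, or None if protobuf is missing."""
    try:
        protobuf = importlib.import_module("google.protobuf")
        # Internal module, not a stable API; Type() reports the backend in use
        # ("python", "cpp" or "upb").
        api_impl = importlib.import_module(
            "google.protobuf.internal.api_implementation"
        )
    except ImportError:
        return None
    return {"version": protobuf.__version__, "backend": api_impl.Type()}


if __name__ == "__main__":
    print(describe_protobuf())
```

The PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION variable has to be set in the
environment before the interpreter imports google.protobuf for it to
have any effect on the reported backend.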

All in all: it's almost certainly necessary to make the dependency
tighter, to something like >= 3.19, if not pinned to = 3.21.
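
To make the two candidate rules concrete (a >= 3.19 floor on the protoc
used at build time, or requiring the same 3.x series as the runtime),
here is a rough sketch. The helper names are hypothetical, and the
version parsing is deliberately naive: it only looks at the leading
upstream x.y.z part, does not handle epochs, and is in no way a
reimplementation of dpkg's version comparison.

```python
import re


def upstream_version(debian_version):
    """Extract (major, minor, patch) from a version such as '3.12.4-1+b3'.

    Simplification: only the leading dotted numbers are considered; the
    Debian revision and binNMU suffix are ignored, epochs are rejected.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", debian_version)
    if match is None:
        raise ValueError(f"unrecognised version: {debian_version!r}")
    return tuple(int(part or 0) for part in match.groups())


def meets_minimum(protoc_version, minimum=(3, 19, 0)):
    """True if the build-time protoc is at least the assumed 3.19.0 floor."""
    return upstream_version(protoc_version) >= minimum


def same_series(protoc_version, runtime_version):
    """True if build-time protoc and runtime share the same major.minor."""
    return upstream_version(protoc_version)[:2] == upstream_version(runtime_version)[:2]


if __name__ == "__main__":
    built_with = "3.12.4-1+b3"   # Built-Using of python3-bernhard
    runtime = "3.21.12-1+b2"     # python3-protobuf in the archive
    print(meets_minimum(built_with), same_series(built_with, runtime))
```

For the python3-bernhard case above (built with 3.12.4-1+b3, running
against 3.21.12-1+b2), both checks come out false, so either rule would
have flagged the package as needing a rebuild.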

I still feel uneasy about Debian shipping a version of python3-protobuf
that includes, and defaults to, an implementation that is deprecated
upstream (and on top of it, is misversioned). I'm not sure what to make
of this so late in the release cycle, though.

For trixie, the path forward is probably something along the lines of
updating src:upb to a newer upstream, building the upb-based extension
as python3-protobuf-upb, and then changing src:protobuf to stop building
the cpp extension, make python3-protobuf Arch: all, and Recommend (or
Depend on) python3-protobuf-upb as the native/fast implementation.

Faidon

1: https://github.com/protocolbuffers/protobuf/tree/main/python
2: https://github.com/protocolbuffers/upb/tree/main/python
3: https://protobuf.dev/news/2022-05-06/
