I think the community has already reached consensus to freeze dependencies in minor releases.
SPARK-54633 - SPIP: Accelerating Apache Spark Release Cadence [1]

> Clear rules for changes allowed in minor vs. major releases:
> - Dependencies are frozen and behavioral changes are minimized in minor
>   releases.

I would interpret the proposed dependency policy as applying to both Java/Scala and Python dependency management for Spark. If so, that means PySpark will always use pinned dependency versions starting with 4.3.0. But if the intention is to apply such a dependency policy only to Java/Scala, then it creates a very strange situation: an extremely conservative dependency management strategy for Java/Scala, and an extremely liberal one for Python.

To Tian Gao,

> Pinning versions is a double-edged sword, it doesn't always make us more
> secure - that's my major point.

A product must be usable first; then come security, performance, etc. If it claims to require `foo>=2.0.0`, how do you ensure it is compatible with foo `2.3.4`, `3.x.x`, `4.x.x`? Such incompatibility failures have in fact occurred many times, e.g. [2]. On the contrary, if it claims to require `foo==2.0.0`, that means it was thoroughly tested with `foo==2.0.0`, and users take their own risk if they use it with other `foo` versions. For example, if `foo` strictly follows semantic versioning, it should work with `foo<3.0.0`, but that is not Spark's responsibility; users should assess and assume the risk of incompatibility themselves.
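The contrast above can be sketched with a toy specifier check. This is illustrative only, not Spark code and not full PEP 440 semantics; `satisfies` is a hypothetical helper for this email:

```python
def satisfies(version, spec):
    """Toy evaluator for '==X.Y.Z' and '>=X.Y.Z' specifiers.

    Not a real PEP 440 implementation -- just enough to show the
    difference between a floor and a pin.
    """
    v = tuple(int(p) for p in version.split("."))
    if spec.startswith(">="):
        return v >= tuple(int(p) for p in spec[2:].split("."))
    if spec.startswith("=="):
        return v == tuple(int(p) for p in spec[2:].split("."))
    raise ValueError(f"unsupported specifier: {spec}")

# A ">=" floor accepts future major versions the release was never
# tested against:
for candidate in ["2.0.0", "2.3.4", "3.0.0", "4.1.0"]:
    print(candidate, satisfies(candidate, ">=2.0.0"))  # all True

# A "==" pin accepts only the version the release was tested with;
# anything else is the user's deliberate choice:
print(satisfies("2.3.4", "==2.0.0"))  # False
```

With the pin, running against `foo` 2.3.4 becomes an explicit user decision rather than something the resolver silently picks.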
[1] https://issues.apache.org/jira/browse/SPARK-54633
[2] https://github.com/apache/spark/pull/52633

Thanks,
Cheng Pan

> On Mar 28, 2026, at 06:59, Holden Karau <[email protected]> wrote:
>
> Response inline
>
> Twitter: https://twitter.com/holdenkarau
> Fight Health Insurance: https://www.fighthealthinsurance.com/
> Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9
> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
> Pronouns: she/her
>
> On Fri, Mar 27, 2026 at 1:01 PM Nicholas Chammas <[email protected]> wrote:
>>
>>> On Mar 27, 2026, at 12:31 PM, Holden Karau <[email protected]> wrote:
>>>
>>> One possibility would be to make the pinned version optional (e.g.
>>> pyspark[pinned]) or publish a separate constraints file for people to
>>> optionally use with -c?
>>
>> Perhaps I am misunderstanding your proposal, Holden, but this is possible
>> today for people using modern Python packaging workflows that use lock
>> files. In fact, it happens automatically; all transitive dependencies are
>> pinned in the lock file, and this is by design.
>
> So for someone installing a fresh venv with uv/pip/conda, where does this
> come from?
>
> The idea here is that we provide the versions we used during the release
> stage, so if folks want a "known safe" initial starting point for a new env,
> they've got one.
>>
>> Furthermore, it is straightforward to add additional restrictions to your
>> project spec (i.e. pyproject.toml) so that when the packaging tool builds
>> the lock file, it does so with whatever restrictions you want that are
>> specific to your project. That could include specific versions or version
>> ranges of libraries to exclude, for example.
> Yes, but as it stands we leave it to the end user to start from scratch
> picking these versions; we can make their lives simpler by providing the
> versions we tested against in a lock file they can choose to use, ignore,
> or update to their desired versions.
>
> Also, for interactive workloads I more often see a bare requirements file
> or even pip installs in notebook cells (but this could be sample bias).
>>
>> I had to do this, for example, on a personal project that used PySpark
>> Connect but which was pulling in a version of grpc that was generating a
>> lot of log noise
>> <https://github.com/grpc/grpc/issues/38336#issuecomment-2588422915>. I
>> pinned the version of grpc in my project file and let the packaging tool
>> resolve all the requirements across PySpark Connect and my custom
>> restrictions.
>>
>> Nick
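If Spark did ship the release-time versions as a constraints file, checking a fresh env against it could look something like the sketch below. The file name, the `name==version` format, and the pinned versions shown are all assumptions for illustration; PySpark does not ship such a file today:

```python
# Sketch: compare an active environment against a hypothetical
# release-time constraints file of "name==version" lines.
from importlib.metadata import PackageNotFoundError, version


def parse_constraints(text):
    """Parse 'name==version' lines into a dict, skipping blanks/comments."""
    pins = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        name, _, ver = line.partition("==")
        pins[name.strip()] = ver.strip()
    return pins


def check_environment(pins):
    """Return {name: (pinned, installed)} for packages that deviate.

    installed is None when the package is missing entirely.
    """
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches


# Hypothetical constraints a release could publish (versions invented):
example = """
# constraints.txt shipped alongside a PySpark release
grpcio==1.62.0
pandas==2.2.1
"""
pins = parse_constraints(example)
print(check_environment(pins))
```

Because it is a plain constraints file, users who disagree with a pin can still edit it or skip `-c` entirely, which keeps the pinning opt-in as Holden suggests.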
