Re: [discuss] Pinning PySpark dependencies?

Holden Karau Mon, 18 May 2026 16:56:04 -0700

Single source of truth does sound desirable, let me take a look at
narrowing that down a bit too.


On Mon, May 18, 2026 at 4:30 PM Tian Gao via dev <[email protected]>
wrote:

> We can do either a list of packages from `pip freeze` on our website, or a
> `pyspark[pinned]` that has `==`. I'm okay with either (or both).
>
> If we want to do that, we probably want to pin our package versions on our
> stable spark versions. We only partially pin our dependencies for our CI
> for maintenance branches, so we do not even have the list now (we may have
> it for a certain date, but the list could change any time in the future).
>
> I think we should come up with a more official CI system so we always test
> the released versions (4.0, 4.1 ...) with pinned versions of packages
> (which are the "known working dependencies"), and be more relaxed for dev
> branches (4.x, master) because we need to test against new releases for our
> dependencies.
>
> More importantly, it would be really nice to have a single source of
> truth. We have too many places to pin the Python dependency versions.
>
> Tian
>
> On Sun, May 17, 2026 at 9:52 AM Holden Karau <[email protected]>
> wrote:
>
>> I am at PyCon USA today and the PyPI head just did a call-out to audit
>> and pin dependencies because supply chain attacks are increasing
>> hockey-stick style.
>>
>> I think we don’t need to pin just yet but let’s add publishing the
>> package versions we built with during CI.
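A minimal sketch of what publishing the build-time package versions could look like, assuming the CI job runs it inside the test venv. It approximates `pip freeze` using only the standard library's `importlib.metadata`; the function name `frozen_requirements` is hypothetical, and unlike `pip freeze` it does not special-case editable or VCS installs.

```python
from importlib.metadata import distributions


def frozen_requirements() -> list[str]:
    """Return sorted 'name==version' lines for every installed distribution."""
    pins = set()
    for dist in distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        if name and version:  # skip broken installs with incomplete metadata
            pins.add(f"{name}=={version}")
    return sorted(pins, key=str.lower)


if __name__ == "__main__":
    # In CI this output could be redirected to a file next to the release artifacts.
    print("\n".join(frozen_requirements()))
```

The resulting file is only a record of what was installed during testing. A user who wants the same versions could choose to pass it to pip with `-c` as a constraints file, or ignore it.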
>>
>>
>> Twitter: https://twitter.com/holdenkarau
>> Fight Health Insurance: https://www.fighthealthinsurance.com/
>> <https://www.fighthealthinsurance.com/?q=hk_email>
>> Books (Learning Spark, High Performance Spark, etc.):
>> https://amzn.to/2MaRAG9  <https://amzn.to/2MaRAG9>
>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>> Pronouns: she/her
>>
>> On Wed, Apr 1, 2026 at 7:48 AM Devin Petersohn via dev <
>> [email protected]> wrote:
>>
>>> I think we should do something in response to the growing supply chain
>>> attacks rather than just leaving the problem to users. One alternative we
>>> could consider for Python specifically is an install target with upper
>>> bounded dependencies: `pip install "pyspark[deps-upper-bounded]"`. This
>>> wouldn't impact regular use, and seems like it would solve the other
>>> problems with publishing lock files, etc. As others have mentioned, this
>>> wouldn't *guarantee* security, but it would provide meaningful protection
>>> against the worst offenders we've recently seen.
>>>
>>> On Wed, Apr 1, 2026 at 9:37 AM Cheng Pan <[email protected]> wrote:
>>>
>>>> > How about as a compromise, we publish (but don’t lock to) the pip
>>>> freeze outputs of the venvs we use for testing?
>>>>
>>>> > Where do you propose to publish? Spark website? Maybe in our github
>>>> repo somewhere?
>>>>
>>>> > I was thinking just in the publisher artifacts directory we already
>>>> do.
>>>>
>>>> +1, I'm fine with any approach, as long as it provides sufficient info
>>>> to let users know exactly which versions of dependencies were used for
>>>> testing.
>>>>
>>>> For Java/Scala, we have a dependency list generated by a script[1],
>>>> kept in the code repo at [2]
>>>>
>>>> [1]
>>>> https://github.com/apache/spark/blob/branch-4.1/dev/test-dependencies.sh
>>>> [2]
>>>> https://github.com/apache/spark/blob/branch-4.1/dev/deps/spark-deps-hadoop-3-hive-2.3
>>>>
>>>> Thanks,
>>>> Cheng Pan
>>>>
>>>>
>>>>
>>>> On Mar 31, 2026, at 03:12, Holden Karau <[email protected]> wrote:
>>>>
>>>> I was thinking just in the publisher artifacts directory we already do.
>>>>
>>>>
>>>>
>>>> On Mon, Mar 30, 2026 at 10:26 AM Tian Gao <[email protected]>
>>>> wrote:
>>>>
>>>>> Where do you propose to publish? Spark website? Maybe in our github
>>>>> repo somewhere? For Python packages, users rarely look for artifacts (and
>>>>> they're difficult to find).
>>>>>
>>>>> Tian
>>>>>
>>>>> On Mon, Mar 30, 2026 at 10:04 AM Holden Karau <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> I hear that. How about as a compromise, we publish (but don’t lock
>>>>>> to) the pip freeze outputs of the venvs we use for testing?
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Mar 30, 2026 at 8:04 AM Nicholas Chammas <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> I think supply chain attacks are a problem, but I don’t think we
>>>>>>> want to be on the hook for a solution here, even if it’s meant just for
>>>>>>> our project.
>>>>>>>
>>>>>>> There are “good enough” approaches available today for Python that
>>>>>>> mitigate most of the risk by excluding recent releases when resolving
>>>>>>> what package versions to install.
>>>>>>>
>>>>>>> uv offers exclude-newer
>>>>>>> <https://docs.astral.sh/uv/reference/settings/#exclude-newer>. pip
>>>>>>> offers uploaded-prior-to
>>>>>>> <https://pip.pypa.io/en/stable/cli/pip_index/#cmdoption-uploaded-prior-to>.
>>>>>>> Poetry has an issue open
>>>>>>> <https://github.com/python-poetry/poetry/issues/10646> for a
>>>>>>> similar feature, plus at least one open PR to close it.
>>>>>>>
>>>>>>> Users concerned about supply chain attacks would probably get better
>>>>>>> results from using these options as compared to installing pinned
>>>>>>> dependencies provided by the projects they use.
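The idea behind these cutoff options can be sketched in a few lines. This is an illustration of the concept only, not how uv or pip implement it: the package `foo`, its upload dates, and the helper `newest_before` are all made up, and the exact boundary semantics (strictly before the cutoff) are an assumption.

```python
from datetime import datetime, timezone

# Hypothetical upload times for a made-up package "foo"; not real PyPI data.
RELEASES = {
    "2.0.0": datetime(2026, 1, 10, tzinfo=timezone.utc),
    "2.0.1": datetime(2026, 2, 20, tzinfo=timezone.utc),
    "2.1.0": datetime(2026, 3, 29, tzinfo=timezone.utc),
}


def newest_before(releases, cutoff):
    """Pick the highest version uploaded strictly before the cutoff, or None."""
    eligible = [v for v, uploaded in releases.items() if uploaded < cutoff]
    return max(
        eligible,
        key=lambda v: tuple(int(part) for part in v.split(".")),
        default=None,
    )


# Resolve as if it were a week before 2.1.0 was uploaded.
cutoff = datetime(2026, 3, 23, tzinfo=timezone.utc)
print(newest_before(RELEASES, cutoff))  # 2.0.1
```

A release uploaded after the cutoff is simply invisible to the resolver, so a freshly compromised upload is skipped without anyone having to maintain a list of pinned versions.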
>>>>>>>
>>>>>>> Nick
>>>>>>>
>>>>>>>
>>>>>>> On Mar 30, 2026, at 3:31 AM, Holden Karau <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>> So I think we can ship it as an optional distribution element (it's
>>>>>>> literally just another file folks can choose to download/use if they
>>>>>>> want).
>>>>>>>
>>>>>>> Asking users is an idea too, I could put together a survey if we
>>>>>>> want?
>>>>>>>
>>>>>>> On Sun, Mar 29, 2026 at 11:14 PM Tian Gao via dev <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> I believe "foo~=2.0.1" is syntactic sugar for "foo>=2.0.1,
>>>>>>>> foo==2.0.*". Similarly, "foo>=2.0.0, <3.0.0" is "foo~=2.0". This is a
>>>>>>>> nit and we don't need to focus on the syntax.
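The equivalence described here follows the compatible-release rule in PEP 440, and can be sketched as a string rewrite. The helper `expand_compatible` is hypothetical and only handles plain numeric release segments (no pre-, post-, or dev-release suffixes).

```python
def expand_compatible(spec: str) -> str:
    """Rewrite 'name~=X.Y[.Z...]' as the equivalent '>=' plus '==prefix.*' pair."""
    name, version = (piece.strip() for piece in spec.split("~="))
    parts = version.split(".")
    if len(parts) < 2:
        raise ValueError("~= needs at least two release segments, e.g. 2.0")
    prefix = ".".join(parts[:-1])  # drop the last release segment
    return f"{name}>={version},=={prefix}.*"


print(expand_compatible("foo~=2.0.1"))  # foo>=2.0.1,==2.0.*
print(expand_compatible("foo~=2.0"))    # foo>=2.0,==2.*
```

The second line shows why `foo~=2.0` allows any 2.x release at or above 2.0, while `foo~=2.0.1` only allows 2.0.x maintenance releases at or above 2.0.1.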
>>>>>>>>
>>>>>>>> I don't believe we can ship pyspark with an env lock file. That's
>>>>>>>> what users do in their own projects. It's not part of the Python package
>>>>>>>> system. What users normally do is install packages, test them out, then
>>>>>>>> lock them with either pip or uv - generate a lock file for all
>>>>>>>> dependencies and use it across their systems. It's not common for
>>>>>>>> packages to list out a "known working dependency list" for users.
>>>>>>>>
>>>>>>>> However, if we really want to try it out, we can do something like
>>>>>>>> `pip install pyspark[full-pinned]` and install every dependency pyspark
>>>>>>>> requires with a pinned version. If our users need an out-of-the-box
>>>>>>>> solution they can do that. We can also collect feedback and see the
>>>>>>>> sentiment from users.
>>>>>>>>
>>>>>>>> Tian
>>>>>>>>
>>>>>>>> On Sun, Mar 29, 2026 at 10:29 PM Cheng Pan <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> > If we consider PySpark the dominant package - meaning that if a
>>>>>>>>> user employs it, it must be the most important element in their
>>>>>>>>> project and everything else must comply with it - pinning versions
>>>>>>>>> might be viable.
>>>>>>>>>
>>>>>>>>> This is not always true, but definitely a major case.
>>>>>>>>>
>>>>>>>>> > I'm not familiar with Java dependency solutions or how users use
>>>>>>>>> spark with Java
>>>>>>>>>
>>>>>>>>> In Java/Scala, it's rare to use dynamic versions for dependency
>>>>>>>>> management. A product declares transitive dependencies with pinned
>>>>>>>>> versions, and the package manager (Maven, SBT, Gradle, etc.) picks the
>>>>>>>>> most reasonable version based on resolution rules. The rules are a
>>>>>>>>> little different in Maven, SBT and Gradle; the Maven docs[1] explain
>>>>>>>>> how they work.
>>>>>>>>>
>>>>>>>>> In short, in Java/Scala dependency management, the pinned version
>>>>>>>>> is more like a suggested version; users can easily override it.
>>>>>>>>>
>>>>>>>>> As Owen pointed out, things are completely different in the Python
>>>>>>>>> world, where neither a pinned version nor the latest version seems
>>>>>>>>> ideal. Of these options:
>>>>>>>>>
>>>>>>>>> 1. pinned version (foo==2.0.0)
>>>>>>>>> 2. allow maintenance releases (foo~=2.0.0)
>>>>>>>>> 3. allow minor feature releases (foo>=2.0.0,<3.0.0)
>>>>>>>>> 4. latest version (foo>=2.0.0, or foo)
>>>>>>>>>
>>>>>>>>> 2 or 3 might be an acceptable solution? And, I still believe we should
>>>>>>>>> add a disclaimer that this compatibility only holds under the
>>>>>>>>> assumption that 3rd-party packages strictly adhere to semantic
>>>>>>>>> versioning.
>>>>>>>>>
>>>>>>>>> > You can totally produce a sort of 'lock' file -- uv.lock,
>>>>>>>>> requirements.txt -- expressing a known good / recommended specific
>>>>>>>>> resolved environment. That is _not_ what Python dependency constraints
>>>>>>>>> are for. It's what env lock files are for.
>>>>>>>>>
>>>>>>>>> We definitely need such a dependency list in each PySpark release. It's
>>>>>>>>> really important for users who need to set up a reproducible environment
>>>>>>>>> several years after the release, and it is also a good reference for
>>>>>>>>> users who encounter bugs in 3rd-party packages, or who battle dependency
>>>>>>>>> conflicts when they install lots of packages in a single environment.
>>>>>>>>>
>>>>>>>>> [1]
>>>>>>>>> https://maven.apache.org/guides/introduction/introduction-to-dependency-mechanism.html
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Cheng Pan
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mar 30, 2026, at 11:13, Sean Owen <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>> TL;DR Tian is more correct, and == pinning versions is not
>>>>>>>>> achieving the desired outcome. There are other ways to do it; I can't
>>>>>>>>> think of any other Python package that works that way. This thread is
>>>>>>>>> conflating different things.
>>>>>>>>>
>>>>>>>>> While expressing dependence on "foo>=2.0.0" indeed can be an
>>>>>>>>> overly-broad claim -- do you really think it works with 5.x in 10
>>>>>>>>> years? -- expressing "foo==2.0.0" is very likely overly narrow. That
>>>>>>>>> says "does not work with any other version at all" which is likely more
>>>>>>>>> incorrect and more problematic for users.
>>>>>>>>>
>>>>>>>>> You can totally produce a sort of 'lock' file -- uv.lock,
>>>>>>>>> requirements.txt -- expressing a known good / recommended specific
>>>>>>>>> resolved environment. That is _not_ what Python dependency constraints
>>>>>>>>> are for. It's what env lock files are for.
>>>>>>>>>
>>>>>>>>> To be sure there is an art to figuring out the right dependency
>>>>>>>>> bounds. A reasonable compromise is to allow maintenance releases, as a
>>>>>>>>> default when there is nothing more specific known. That is, write
>>>>>>>>> "foo~=2.0.2" to mean ">=2.0.2 and < 2.1".
>>>>>>>>>
>>>>>>>>> The analogy to Scala/Java/Maven land does not quite work, partly
>>>>>>>>> because Maven resolution is just pretty different, but mostly because
>>>>>>>>> the core Spark distribution is the 'server side' and is necessarily a
>>>>>>>>> 'fat jar', a sort of statically-compiled artifact that simply has some
>>>>>>>>> specific versions in it and can never have different versions because
>>>>>>>>> of runtime resolution differences.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sun, Mar 29, 2026 at 10:02 PM Tian Gao via dev <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> I agree that a product must be usable first. Pinning the version
>>>>>>>>>> (to a specific number with `==`) will make pyspark unusable.
>>>>>>>>>>
>>>>>>>>>> First of all, I think we can agree that many users use PySpark
>>>>>>>>>> with other Python packages. If we conflict with other packages, `pip
>>>>>>>>>> install -r requirements.txt` won't work. It will complain that the
>>>>>>>>>> dependencies can't be resolved, which completely breaks our users'
>>>>>>>>>> workflow. Even if the user locks the dependency version, it won't
>>>>>>>>>> work. So the user has to install PySpark first, then the other
>>>>>>>>>> packages, to override PySpark's dependency. They can't put their
>>>>>>>>>> dependency list in a single file
>>>>>>>>>> - that is a horrible user experience.
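The conflict described above can be illustrated with a toy resolver. This is a sketch under stated assumptions, not pip's actual resolution algorithm: the shared dependency `foo`, its candidate versions, and the helpers `exact`, `between`, and `resolve` are all hypothetical.

```python
# Hypothetical candidate releases of a made-up shared dependency "foo".
CANDIDATES = [(2, 0, 0), (2, 0, 3), (2, 1, 0), (2, 4, 1)]


def exact(version):
    """Requirement in the style of foo==X.Y.Z."""
    return lambda v: v == version


def between(low, high):
    """Requirement in the style of foo>=low,<high."""
    return lambda v: low <= v < high


def resolve(*requirements):
    """Return every candidate that satisfies all requirements at once."""
    return [v for v in CANDIDATES if all(req(v) for req in requirements)]


# Another library in the same requirements.txt needs foo>=2.1.0,<3.0.0.
other_package = between((2, 1, 0), (3, 0, 0))

# If PySpark declared foo==2.0.0, no version satisfies both packages.
print(resolve(exact((2, 0, 0)), other_package))  # []

# If PySpark declared foo>=2.0.0,<3.0.0, the two ranges overlap.
print(resolve(between((2, 0, 0), (3, 0, 0)), other_package))  # [(2, 1, 0), (2, 4, 1)]
```

An empty result corresponds to the "dependencies can't be resolved" failure: with an exact pin, any second package that needs a different version leaves the installer with nothing to pick.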
>>>>>>>>>>
>>>>>>>>>> When I look at controversial topics, I always have a strong
>>>>>>>>>> belief that I can't be the only smart person in the world. If an idea
>>>>>>>>>> is good, others must already be doing it. Can we find any recognized
>>>>>>>>>> package in the market that pins its dependencies to a specific
>>>>>>>>>> version? The only case where it works is when this package is *all*
>>>>>>>>>> the user needs. That's why we pin versions for docker images, HTTP
>>>>>>>>>> services, or standalone tools - users just need something that works
>>>>>>>>>> out of the box. If we consider PySpark the dominant package - meaning
>>>>>>>>>> that if a user employs it, it must be the most important element in
>>>>>>>>>> their project and everything else must comply with it - pinning
>>>>>>>>>> versions might be viable.
>>>>>>>>>>
>>>>>>>>>> I'm not familiar with Java dependency solutions or how users use
>>>>>>>>>> Spark with Java, but I'm familiar with the Python ecosystem and
>>>>>>>>>> community. If we pin to a specific version, we will face significant
>>>>>>>>>> criticism. If we must do it, at least don't make it the default. Like
>>>>>>>>>> I said above, I don't have a strong opinion about having a
>>>>>>>>>> `pyspark[pinned]` - if users only need pyspark and no other packages
>>>>>>>>>> they could use that. But that's extra effort for maintenance, and we
>>>>>>>>>> need to think about what's pinned. We have a lot of pyspark install
>>>>>>>>>> versions.
>>>>>>>>>>
>>>>>>>>>> Tian Gao
>>>>>>>>>>
>>>>>>>>>> On Sun, Mar 29, 2026 at 7:12 PM Cheng Pan <[email protected]>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> I think the community has already reached consensus to freeze
>>>>>>>>>>> dependencies in minor releases.
>>>>>>>>>>>
>>>>>>>>>>> SPARK-54633 - SPIP: Accelerating Apache Spark Release Cadence [1]
>>>>>>>>>>>
>>>>>>>>>>> > Clear rules for changes allowed in minor vs. major releases:
>>>>>>>>>>> > - Dependencies are frozen and behavioral changes are minimized
>>>>>>>>>>> in minor releases.
>>>>>>>>>>>
>>>>>>>>>>> I would interpret the proposed dependency policy as applying to both
>>>>>>>>>>> Java/Scala and Python dependency management for Spark. If so, that
>>>>>>>>>>> means PySpark will always use pinned dependency versions from 4.3.0
>>>>>>>>>>> onward. But if the intention is to apply such a dependency policy
>>>>>>>>>>> only to Java/Scala, then it creates a very strange situation - an
>>>>>>>>>>> extremely conservative dependency management strategy for Java/Scala,
>>>>>>>>>>> and an extremely liberal one for Python.
>>>>>>>>>>>
>>>>>>>>>>> To Tian Gao,
>>>>>>>>>>>
>>>>>>>>>>> > Pinning versions is a double-edged sword, it doesn't always
>>>>>>>>>>> make us more secure - that's my major point.
>>>>>>>>>>>
>>>>>>>>>>> A product must be usable first, then come security, performance, etc.
>>>>>>>>>>> If it claims to require `foo>=2.0.0`, how do you ensure it is
>>>>>>>>>>> compatible with foo `2.3.4`, `3.x.x`, `4.x.x`? Actually, such
>>>>>>>>>>> incompatibility failures have occurred many times, e.g., [2]. On the
>>>>>>>>>>> contrary, if it claims to require `foo==2.0.0`, that means it was
>>>>>>>>>>> thoroughly tested with `foo==2.0.0`, and users take their own risk in
>>>>>>>>>>> using it with other `foo` versions. For example, if `foo` strictly
>>>>>>>>>>> follows semantic versioning, it should work with `foo<3.0.0`, but
>>>>>>>>>>> this is not Spark's responsibility; users should assess and assume
>>>>>>>>>>> the risk of incompatibility themselves.
>>>>>>>>>>>
>>>>>>>>>>> [1] https://issues.apache.org/jira/browse/SPARK-54633
>>>>>>>>>>> [2] https://github.com/apache/spark/pull/52633
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Cheng Pan
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Mar 28, 2026, at 06:59, Holden Karau <[email protected]>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Response inline
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Mar 27, 2026 at 1:01 PM Nicholas Chammas <
>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Mar 27, 2026, at 12:31 PM, Holden Karau <
>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> One possibility would be to make the pinned version optional
>>>>>>>>>>>> (eg pyspark[pinned]) or publish a separate constraints file for
>>>>>>>>>>>> people to optionally use with -c?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Perhaps I am misunderstanding your proposal, Holden, but this
>>>>>>>>>>>> is possible today for people using modern Python packaging
>>>>>>>>>>>> workflows that use lock files. In fact, it happens automatically;
>>>>>>>>>>>> all transitive dependencies are pinned in the lock file, and this
>>>>>>>>>>>> is by design.
>>>>>>>>>>>>
>>>>>>>>>>> So for someone installing a fresh venv with uv/pip/or conda
>>>>>>>>>>> where does this come from?
>>>>>>>>>>>
>>>>>>>>>>> The idea here is we provide the versions we used during the
>>>>>>>>>>> release stage so if folks want a “known safe” initial starting
>>>>>>>>>>> point for a new env they’ve got one.
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Furthermore, it is straightforward to add additional
>>>>>>>>>>>> restrictions to your project spec (i.e. pyproject.toml) so that
>>>>>>>>>>>> when the packaging tool builds the lock file, it does it with
>>>>>>>>>>>> whatever restrictions you want that are specific to your project.
>>>>>>>>>>>> That could include specific versions or version ranges of
>>>>>>>>>>>> libraries to exclude, for example.
>>>>>>>>>>>>
>>>>>>>>>>> Yes, but as it stands we leave it to the end user to start from
>>>>>>>>>>> scratch picking these versions. We can make their lives simpler by
>>>>>>>>>>> providing the versions we tested against with a lock file they can
>>>>>>>>>>> choose to use, ignore, or update to their desired versions and
>>>>>>>>>>> include.
>>>>>>>>>>>
>>>>>>>>>>> Also for interactive workloads I more often see a bare
>>>>>>>>>>> requirements file or even pip installs in nb cells (but this could
>>>>>>>>>>> be sample bias).
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I had to do this, for example, on a personal project that used
>>>>>>>>>>>> PySpark Connect but which was pulling in a version of grpc
>>>>>>>>>>>> that was generating a lot of log noise
>>>>>>>>>>>> <https://github.com/grpc/grpc/issues/38336#issuecomment-2588422915>.
>>>>>>>>>>>> I pinned the version of grpc in my project file and let the
>>>>>>>>>>>> packaging tool resolve all the requirements across PySpark Connect
>>>>>>>>>>>> and my custom restrictions.
>>>>>>>>>>>>
>>>>>>>>>>>> Nick
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>

