nchammas commented on a change in pull request #27928: [SPARK-31167][BUILD] Refactor how we track Python test/build dependencies
URL: https://github.com/apache/spark/pull/27928#discussion_r393721061
##########
File path: dev/requirements.txt
##########
@@ -1,5 +1,7 @@
-flake8==3.5.0
+pycodestyle==2.5.0
+flake8==3.7.9
 jira==1.0.3
 PyGithub==1.26.0
 Unidecode==0.04.19
-sphinx
+sphinx==2.3.1
+numpy==1.18.1

Review comment:
@HyukjinKwon - When looking at project dependencies, there is an important distinction between projects that are used as libraries and projects that are used as stand-alone applications.

If your project is a library, then you know others are importing you alongside other dependencies too. To minimize the chance of transitive dependency conflicts, you want to be flexible in how you specify your dependencies.

When your project is a stand-alone application, you don't have to worry about such things. You can pin every dependency to a specific version to get the most predictable and reliable build and runtime behavior.

In our case, the Spark build environment is more akin to a stand-alone application than a library. We don't need to worry about downstream users struggling with dependency conflicts. We can get the most stable build behavior by pinning everything, and there is no downside as far as I can tell.

I'll use [Trio](https://github.com/python-trio/trio) as an example again to illustrate my point:

* Trio is a library that others will typically import alongside many other dependencies. So in [Trio's setup.py](https://github.com/python-trio/trio/blob/4d956a4ba51241ca5d22800fb1e0e5c36ba9bb47/setup.py#L81-L92) they are very flexible in how they specify their dependencies.
* Trio's test environment, on the other hand, is only used by Trio contributors. So Trio [locks down](https://github.com/python-trio/trio/blob/4d956a4ba51241ca5d22800fb1e0e5c36ba9bb47/test-requirements.txt) every test requirement using [pip-tools](https://github.com/jazzband/pip-tools).
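
To make the "pin everything" idea concrete, here is a minimal sketch (not part of the PR) of a check that every entry in a requirements file is pinned to an exact version. The helper name `find_unpinned` and the regular expression are hypothetical, and the check assumes only plain `name==version` lines; it does not handle extras, environment markers, URLs, or pip options.

```python
import re

# Matches a simple exact pin such as "flake8==3.7.9".
# Assumption: only plain "name==version" lines count as pinned; wildcards
# like "==3.*" and range specifiers like ">=" are treated as unpinned.
PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*==[A-Za-z0-9.!+]+$")


def find_unpinned(requirements_text):
    """Return the requirement lines that are not pinned to an exact version."""
    unpinned = []
    for raw_line in requirements_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        if not PINNED.match(line):
            unpinned.append(line)
    return unpinned


# The old and new contents of dev/requirements.txt, taken from the diff above.
before = """\
flake8==3.5.0
jira==1.0.3
PyGithub==1.26.0
Unidecode==0.04.19
sphinx
"""

after = """\
pycodestyle==2.5.0
flake8==3.7.9
jira==1.0.3
PyGithub==1.26.0
Unidecode==0.04.19
sphinx==2.3.1
numpy==1.18.1
"""

print(find_unpinned(before))  # ['sphinx']
print(find_unpinned(after))   # []
```

A check like this would only make sense for an application-style environment such as a build or test setup; a library's `setup.py` would intentionally fail it, since flexible specifiers are the point there.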
