Maciej Szymkiewicz created SPARK-38082:
------------------------------------------

             Summary: Update minimum numpy version
                 Key: SPARK-38082
                 URL: https://issues.apache.org/jira/browse/SPARK-38082
             Project: Spark
          Issue Type: Improvement
          Components: ML, MLlib, PySpark
    Affects Versions: 3.3.0
            Reporter: Maciej Szymkiewicz


Currently, we set the numpy version in {{extras_require}} to {{>=1.7}}.

However, 1.7 was released almost 9 years ago, and since then some 
methods that we use have been deprecated in favor of new additions, and a new API 
({{numpy.typing}}) that is of some interest to us has been added.

We should update the minimum version requirement to one of:

- {{>=1.9.0}} ‒ this is the minimum reasonable bound, which will allow us to replace 
deprecated {{tostring}} calls with {{tobytes}}.
- {{>=1.15}} (released 2018-07-23) ‒ this is a reasonable bound to match our 
minimum supported pandas version.
- {{>=1.20.0}} (released 2021-01-30) ‒ to fully utilize numpy typing.

The last one might be somewhat controversial, but 1.15 shouldn't require much 
discussion.
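
To make the trade-off between the three candidate bounds concrete, here is a minimal sketch that reports which of the reasons listed above a given numpy release would satisfy. The names {{CANDIDATE_BOUNDS}}, {{parse_version}} and {{satisfied_bounds}} are hypothetical and not part of PySpark; the parser only handles plain {{X.Y}} or {{X.Y.Z}} release strings, which is an assumption made to keep the example short.

```python
# Hypothetical helper, not part of PySpark: maps each candidate lower bound
# from this ticket to the reason given for it.
CANDIDATE_BOUNDS = {
    (1, 9, 0): "tobytes can replace deprecated tostring calls",
    (1, 15, 0): "matches the minimum supported pandas version",
    (1, 20, 0): "numpy.typing can be fully utilized",
}


def parse_version(version):
    """Turn a plain 'X.Y' or 'X.Y.Z' release string into a 3-tuple of ints."""
    parts = [int(p) for p in version.split(".")[:3]]
    # Pad short versions such as '1.15' with zeros so tuples compare cleanly.
    return tuple(parts + [0] * (3 - len(parts)))


def satisfied_bounds(installed):
    """Return the reasons whose lower bound the installed version meets."""
    current = parse_version(installed)
    return [
        reason
        for bound, reason in sorted(CANDIDATE_BOUNDS.items())
        if bound <= current
    ]


if __name__ == "__main__":
    for release in ("1.7.0", "1.15", "1.20.0"):
        print(release, satisfied_bounds(release))
```

Under these assumptions, a user still on the current floor of 1.7 meets none of the three bounds, while anything from 1.15 up to (but not including) 1.20 meets the first two.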





--
This message was sent by Atlassian Jira
(v8.20.1#820001)
