Yikun edited a comment on pull request #34717: URL: https://github.com/apache/spark/pull/34717#issuecomment-979659317
Just to start the discussion, here are the most-downloaded pandas versions (the top 20 or so) from the PyPI download stats.

Using the SQL below, following the guide in [1], we can get the download count of each pandas version over the last 3 months:

```SQL
SELECT
  file.version AS file_version,
  COUNT(*) AS num_downloads
FROM `the-psf.pypi.file_downloads`
WHERE file.project = 'pandas'
  -- Only query the last 3 months of history
  AND DATE(timestamp)
    BETWEEN DATE_SUB(CURRENT_DATE(), INTERVAL 3 MONTH)
    AND CURRENT_DATE()
GROUP BY `file_version`
ORDER BY `num_downloads` DESC
```

| rank | version | downloads | percent |
| -- | -- | -- | -- |
| 1 | 0.25.3 | 35149221 | 14.28% |
| 2 | 1.1.5 | 28722806 | 11.67% |
| 3 | 1.3.4 | 20944236 | 8.51% |
| 4 | 1.3.3 | 16861573 | 6.85% |
| 5 | 0.24.2 | 13235233 | 5.38% |
| 6 | 1.0.5 | 9201989 | 3.74% |
| 7 | 1.3.2 | 9077326 | 3.69% |
| 8 | 1.2.5 | 7902532 | 3.21% |
| 9 | 1.2.4 | 5754284 | 2.34% |
| 10 | 1.1.4 | 5710439 | 2.32% |
| 11 | 1.1.0 | 4760847 | 1.93% |
| 12 | 1.1.2 | 4621441 | 1.88% |
| 13 | 1.2.3 | 4607043 | 1.87% |
| 14 | 1.0.3 | 4601230 | 1.87% |
| 15 | 0.23.4 | 4251044 | 1.73% |
| 16 | 0.25.0 | 3862673 | 1.57% |
| 17 | 1.2.1 | 2952346 | 1.20% |
| 18 | 1.0.1 | 2690006 | 1.09% |
| 19 | 0.22.0 | 2680710 | 1.09% |
| 20 | 1.2.0 | 2645339 | 1.07% |
| 21 | 0.24.1 | 2635411 | 1.07% |

[1] https://packaging.python.org/guides/analyzing-pypi-package-downloads/
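
The query above returns only raw counts, so the percent column in the table was presumably computed separately. As a rough sketch, and assuming "percent" means each version's share of all pandas downloads in the same 3-month window (that interpretation, the `percent` alias, and the `LIMIT` are my assumptions, not part of the original query), the share could be computed in the same statement with a window function over the grouped counts:

```SQL
SELECT
  file.version AS file_version,
  COUNT(*) AS num_downloads,
  -- Assumption: share of all pandas downloads in the window.
  -- SUM(COUNT(*)) OVER () totals the per-version counts after grouping.
  ROUND(100 * COUNT(*) / SUM(COUNT(*)) OVER (), 2) AS percent
FROM `the-psf.pypi.file_downloads`
WHERE file.project = 'pandas'
  -- Only query the last 3 months of history
  AND DATE(timestamp)
    BETWEEN DATE_SUB(CURRENT_DATE(), INTERVAL 3 MONTH)
    AND CURRENT_DATE()
GROUP BY file_version
ORDER BY num_downloads DESC
-- Assumption: keep only the 20 most-downloaded versions
LIMIT 20
```

Because the window function is evaluated before `LIMIT`, the percentages stay relative to all versions rather than to the 20 rows returned. The numbers will differ from the table above whenever the query is run on a different date.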