Yikun edited a comment on pull request #34717:
URL: https://github.com/apache/spark/pull/34717#issuecomment-979659317


   Just to start the discussion:
   
   Here are the top 20 pandas versions by download count, taken from the PyPI download statistics.
   
   Using the SQL below, adapted from [1], we can get the most-downloaded pandas versions over the last 3 months.
   ```SQL
   SELECT
     file.version AS file_version,
     COUNT(*) AS num_downloads,
   FROM `the-psf.pypi.file_downloads`
   WHERE file.project = 'pandas'
   AND 
     -- Only query the last 3 months of history
     DATE(timestamp)
       BETWEEN DATE_SUB(CURRENT_DATE(), INTERVAL 3 MONTH)
       AND CURRENT_DATE()
   GROUP BY `file_version`
   ORDER BY `num_downloads` DESC
   ```
   
   rank | version | downloads | percent
   -- | -- | -- | --
   1 | 0.25.3 | 35149221 | 14.28%
   2 | 1.1.5 | 28722806 | 11.67%
   3 | 1.3.4 | 20944236 | 8.51%
   4 | 1.3.3 | 16861573 | 6.85%
   5 | 0.24.2 | 13235233 | 5.38%
   6 | 1.0.5 | 9201989 | 3.74%
   7 | 1.3.2 | 9077326 | 3.69%
   8 | 1.2.5 | 7902532 | 3.21%
   9 | 1.2.4 | 5754284 | 2.34%
   10 | 1.1.4 | 5710439 | 2.32%
   11 | 1.1.0 | 4760847 | 1.93%
   12 | 1.1.2 | 4621441 | 1.88%
   13 | 1.2.3 | 4607043 | 1.87%
   14 | 1.0.3 | 4601230 | 1.87%
   15 | 0.23.4 | 4251044 | 1.73%
   16 | 0.25.0 | 3862673 | 1.57%
   17 | 1.2.1 | 2952346 | 1.20%
   18 | 1.0.1 | 2690006 | 1.09%
   19 | 0.22.0 | 2680710 | 1.09%
   20 | 1.2.0 | 2645339 | 1.07%
   21 | 0.24.1 | 2635411 | 1.07%
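
To make these numbers easier to weigh when discussing a minimum supported pandas version, the counts can be grouped by minor release series. This is only a rough sketch: `ROWS` is copied by hand from the rows listed above, `downloads_by_series` is a hypothetical helper name, and the totals cover only the listed versions, not every pandas download in the period.

```python
from collections import defaultdict

# (version, downloads) pairs copied from the rows listed above.
ROWS = [
    ("0.25.3", 35149221), ("1.1.5", 28722806), ("1.3.4", 20944236),
    ("1.3.3", 16861573), ("0.24.2", 13235233), ("1.0.5", 9201989),
    ("1.3.2", 9077326), ("1.2.5", 7902532), ("1.2.4", 5754284),
    ("1.1.4", 5710439), ("1.1.0", 4760847), ("1.1.2", 4621441),
    ("1.2.3", 4607043), ("1.0.3", 4601230), ("0.23.4", 4251044),
    ("0.25.0", 3862673), ("1.2.1", 2952346), ("1.0.1", 2690006),
    ("0.22.0", 2680710), ("1.2.0", 2645339), ("0.24.1", 2635411),
]


def downloads_by_series(rows):
    """Sum download counts per major.minor series, e.g. '1.3.4' -> '1.3'."""
    totals = defaultdict(int)
    for version, count in rows:
        major, minor = version.split(".")[:2]
        totals[f"{major}.{minor}"] += count
    return dict(totals)


series = downloads_by_series(ROWS)

# Print the series from most to least downloaded (listed rows only).
for name, count in sorted(series.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}.x: {count}")
```

Grouping by series rather than by exact patch version may be the more useful view here, since a minimum-version decision is usually made at the minor-release level.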
   
   [1] https://packaging.python.org/guides/analyzing-pypi-package-downloads/


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


