pingzh commented on a change in pull request #21332:
URL: https://github.com/apache/airflow/pull/21332#discussion_r801092172



##########
File path: airflow/models/serialized_dag.py
##########
@@ -20,11 +20,12 @@
 
 import hashlib
 import logging
+import zlib

Review comment:
       Although `bz2` produces smaller compressed data, we noticed that 
`zlib` is faster than `bz2` at both compressing and decompressing when the 
two use the same compression level (`zlib` at its default level 6, with 
`compresslevel` also set to 6 for `bz2`).
   
   Tested with Python 3.7.9. Raw JSON data size: `514MB`; `bz2` compressed: 
`14MB`; `zlib` compressed: `44MB`.
   
   ```
   ~$ python test_bz2.py
   bz2 took 89.23221111297607
   bz2 decompress took 15.989518880844116
   
   
   ~$ python test_zlib.py
   zlib compress took 8.829093933105469
   zlib decompress took 1.1581130027770996
   
   ```
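
For anyone who wants to reproduce the comparison, here is a rough sketch of how such a benchmark could look. It is not the `test_bz2.py` / `test_zlib.py` used above (those scripts are not included in this comment); `make_payload`, `time_codec`, and `run_benchmark` are hypothetical helper names, and the synthetic JSON is only a small stand-in for the real 514MB serialized DAG data, so absolute numbers will differ.

```python
import bz2
import json
import time
import zlib

LEVEL = 6  # same level for both codecs, as in the comparison above


def make_payload(num_records):
    """Build synthetic JSON bytes (a stand-in, not real serialized DAG data)."""
    records = [
        {"task_id": f"task_{i}", "operator": "BashOperator", "retries": i % 5}
        for i in range(num_records)
    ]
    return json.dumps(records).encode("utf-8")


def time_codec(compress, decompress, payload):
    """Time one compress/decompress round trip for a single codec."""
    start = time.perf_counter()
    packed = compress(payload)
    compress_seconds = time.perf_counter() - start

    start = time.perf_counter()
    restored = decompress(packed)
    decompress_seconds = time.perf_counter() - start

    if restored != payload:
        raise ValueError("round trip did not reproduce the original data")
    return {
        "raw_bytes": len(payload),
        "compressed_bytes": len(packed),
        "compress_seconds": compress_seconds,
        "decompress_seconds": decompress_seconds,
    }


def run_benchmark(payload):
    """Run both codecs at LEVEL and return their measurements keyed by name."""
    return {
        "bz2": time_codec(
            lambda data: bz2.compress(data, LEVEL), bz2.decompress, payload
        ),
        "zlib": time_codec(
            lambda data: zlib.compress(data, LEVEL), zlib.decompress, payload
        ),
    }


if __name__ == "__main__":
    for name, result in run_benchmark(make_payload(200_000)).items():
        print(
            f"{name}: {result['raw_bytes']} -> {result['compressed_bytes']} bytes, "
            f"compress {result['compress_seconds']:.3f}s, "
            f"decompress {result['decompress_seconds']:.3f}s"
        )
```

Timings depend heavily on the machine and on how compressible the input is, so this is only useful for checking the relative trend on your own data.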
   
   



