pingzh commented on a change in pull request #21332:
URL: https://github.com/apache/airflow/pull/21332#discussion_r801092172
##########
File path: airflow/models/serialized_dag.py
##########
@@ -20,11 +20,12 @@
import hashlib
import logging
+import zlib
Review comment:
Although `bz2` produces smaller compressed output, we noticed that `zlib`
is considerably faster than `bz2` at both compressing and decompressing
when the two use the same compression level (`zlib` at its default level 6,
with `compresslevel` also set to 6 for `bz2`).
Tested with Python 3.7.9: raw JSON data size: `514MB`, `bz2` compressed:
`14M`, `zlib` compressed: `44M`.
```
~$ python test_bz2.py
bz2 took 89.23221111297607
bz2 decompress took 15.989518880844116
~$ python test_zlib.py
zlib compress took 8.829093933105469
zlib decompress took 1.1581130027770996
```
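For anyone who wants to repeat a comparison of this kind, here is a minimal sketch. It is not the `test_bz2.py` / `test_zlib.py` scripts used above (those are not included in this comment); the `time_codec` helper and the synthetic JSON payload are assumptions made purely for illustration, so absolute numbers will differ from the ones reported.

```python
import bz2
import json
import time
import zlib


def time_codec(name, compress, decompress, raw):
    """Time one compress/decompress round trip (hypothetical helper)."""
    start = time.perf_counter()
    packed = compress(raw)
    compress_secs = time.perf_counter() - start

    start = time.perf_counter()
    restored = decompress(packed)
    decompress_secs = time.perf_counter() - start

    # Make sure the round trip is lossless before trusting the timings.
    assert restored == raw
    return {
        "name": name,
        "size": len(packed),
        "compress_secs": compress_secs,
        "decompress_secs": decompress_secs,
    }


# Synthetic stand-in for serialized DAG JSON (an assumption; real data
# will compress differently).
raw = json.dumps(
    [
        {"dag_id": f"dag_{i}", "tasks": [f"task_{j}" for j in range(20)]}
        for i in range(2000)
    ]
).encode("utf-8")

# Both codecs use level 6, matching the setup described above.
results = [
    time_codec("zlib", lambda d: zlib.compress(d, 6), zlib.decompress, raw),
    time_codec("bz2", lambda d: bz2.compress(d, 6), bz2.decompress, raw),
]

print(f"raw size: {len(raw)} bytes")
for r in results:
    print(
        f"{r['name']}: {r['size']} bytes, "
        f"compress {r['compress_secs']:.4f}s, "
        f"decompress {r['decompress_secs']:.4f}s"
    )
```

Running it against a real serialized DAG payload instead of the synthetic one would give numbers more representative of this change.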
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]