Re: [PR] [WORK-IN-PROGRESS] http: Compressed response example in Python [arrow-experiments]

via GitHub Mon, 09 Sep 2024 18:27:58 -0700


felipecrv commented on PR #35:
URL: https://github.com/apache/arrow-experiments/pull/35#issuecomment-2339431184


   Stats from running the server.py/client.py pair on the same M1 Pro MacBook:
   
   ```
   $ python client.py
   [identity]: Requesting data from http://127.0.0.1:8008 with `identity` encoding.
   [identity]: Schema received in 0.008 seconds. schema=(ticker, price, volume).
   [identity]: First batch of 6836 received and processed in 0.008 seconds
   [identity]: Processing of all batches completed in 0.209 seconds.
       [zstd]: Requesting data from http://127.0.0.1:8008 with `zstd` encoding.
       [zstd]: Schema received in 0.005 seconds. schema=(ticker, price, volume).
       [zstd]: First batch of 6836 received and processed in 0.005 seconds
       [zstd]: Processing of all batches completed in 2.418 seconds.
         [br]: Requesting data from http://127.0.0.1:8008 with `br` encoding.
         [br]: Schema received in 0.103 seconds. schema=(ticker, price, volume).
         [br]: First batch of 6836 received and processed in 0.103 seconds
         [br]: Processing of all batches completed in 7.650 seconds.
       [gzip]: Requesting data from http://127.0.0.1:8008 with `gzip` encoding.
       [gzip]: Schema received in 0.045 seconds. schema=(ticker, price, volume).
       [gzip]: First batch of 6836 received and processed in 0.045 seconds
       [gzip]: Processing of all batches completed in 48.114 seconds.
   ```
   
   The uncompressed response is almost 1 GB. I think brotli gets such a high compression ratio here because the batches of data are random slices of the same base array.
   
   ```
   output.arrows         943M 
   output.arrows.br       36M 
   output.arrows.gz      344M 
   output.arrows.zstd    336M 
   ```
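One way to probe the "random slices of the same base array" hypothesis is a small experiment with the Python standard library. This is only a sketch under stated assumptions: `zlib` and `lzma` stand in for a small-window and a large-window compressor (they are not brotli or zstd), the buffer sizes are hypothetical, and whether match-window size is what actually drives the numbers in the listing above has not been checked against the PR's data.

```python
import lzma
import random
import zlib

rng = random.Random(0)  # fixed seed so the run is reproducible

# Hypothetical sizes, chosen only for illustration.
BASE_SIZE = 256 * 1024   # stand-in for the base array (incompressible bytes)
SLICE_SIZE = 64 * 1024   # stand-in for one record batch
NUM_SLICES = 40

base = rng.getrandbits(8 * BASE_SIZE).to_bytes(BASE_SIZE, "little")

# Each "batch" is a random contiguous slice of the same base buffer.
batches = []
for _ in range(NUM_SLICES):
    start = rng.randrange(BASE_SIZE - SLICE_SIZE + 1)
    batches.append(base[start:start + SLICE_SIZE])
payload = b"".join(batches)

# DEFLATE (zlib/gzip) can only refer back to the previous 32 KiB of input,
# so repeats that are further apart than that are invisible to it.
small_window = zlib.compress(payload, 6)

# lzma's default preset uses a multi-megabyte dictionary, which covers the
# whole payload here, so repeated slices can be encoded as back-references.
large_window = lzma.compress(payload)

print(f"raw:  {len(payload):>9} bytes")
print(f"zlib: {len(small_window):>9} bytes")
print(f"lzma: {len(large_window):>9} bytes")
```

If the large-window output is several times smaller than the small-window one, that is consistent with the hypothesis, but it says nothing about the specific settings the server in this PR uses for each codec.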
   
   From one laptop to another over my home Wi-Fi, with 1/10 of the records:
   
   ```
   $ python client.py
   [identity]: Requesting data from http://192.168.68.103:8008 with `identity` encoding.
   [identity]: Schema received in 0.110 seconds. schema=(ticker, price, volume).
   [identity]: First batch of 684 received and processed in 0.133 seconds
   [identity]: Processing of all batches completed in 17.448 seconds.
       [zstd]: Requesting data from http://192.168.68.103:8008 with `zstd` encoding.
       [zstd]: Schema received in 0.023 seconds. schema=(ticker, price, volume).
       [zstd]: First batch of 684 received and processed in 0.032 seconds
       [zstd]: Processing of all batches completed in 6.133 seconds.
         [br]: Requesting data from http://192.168.68.103:8008 with `br` encoding.
         [br]: Schema received in 0.118 seconds. schema=(ticker, price, volume).
         [br]: First batch of 684 received and processed in 0.118 seconds
         [br]: Processing of all batches completed in 1.096 seconds.
       [gzip]: Requesting data from http://192.168.68.103:8008 with `gzip` encoding.
       [gzip]: Schema received in 0.203 seconds. schema=(ticker, price, volume).
       [gzip]: First batch of 684 received and processed in 0.203 seconds
       [gzip]: Processing of all batches completed in 6.294 seconds.
   ```
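To make the two runs easier to compare, the figures can be normalized against the `identity` case. The sketch below only rearranges numbers copied from the logs and the file listing in this comment; the dictionary and function names are made up for illustration. Note that the Wi-Fi run used 1/10 of the records, so times should only be compared within a run, and the compression ratios come from the full-size files.

```python
# Sizes (in "M" as printed in the file listing) and total times (seconds)
# copied from the output in this comment.
size_m = {"identity": 943, "zstd": 336, "br": 36, "gzip": 344}
local_s = {"identity": 0.209, "zstd": 2.418, "br": 7.650, "gzip": 48.114}
wifi_s = {"identity": 17.448, "zstd": 6.133, "br": 1.096, "gzip": 6.294}


def relative_to_identity(times):
    """Return each total time divided by the uncompressed (identity) time."""
    baseline = times["identity"]
    return {enc: t / baseline for enc, t in times.items()}


# Compression ratio: uncompressed size divided by compressed size.
ratio = {enc: size_m["identity"] / s for enc, s in size_m.items()}
local_rel = relative_to_identity(local_s)
wifi_rel = relative_to_identity(wifi_s)

print(f"{'encoding':<10}{'ratio':>8}{'local x':>10}{'wifi x':>10}")
for enc in size_m:
    print(
        f"{enc:<10}{ratio[enc]:>8.1f}"
        f"{local_rel[enc]:>10.1f}{wifi_rel[enc]:>10.2f}"
    )
```

Values above 1 in the "local x" and "wifi x" columns mean the encoding was slower than `identity` in that run, and values below 1 mean it was faster. On loopback every compressed encoding is slower than `identity`, while over Wi-Fi every one of them is faster.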


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

