Did you test the network speed between the client (in Asia) and the server (in the US)?
E.g., you can use iperf for a quick evaluation if you're running Linux.
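As a reference point for that comparison, here is a minimal sketch of the effective throughput implied by the figures in the message quoted below (178 MB in 10.405 s of wall-clock time). The function name is only illustrative, and it assumes 1 MB = 10^6 bytes. The wall-clock time also includes Python startup and the conversion to a DataFrame, so the rate on the wire was probably somewhat higher than this.

```python
def effective_throughput(size_mb, seconds):
    """Return (MB/s, Mbit/s) for size_mb megabytes moved in `seconds`."""
    mb_per_s = size_mb / seconds
    mbit_per_s = mb_per_s * 8  # 8 bits per byte
    return mb_per_s, mbit_per_s


# Figures taken from the quoted message: 178 MB, "real 0m10.405s".
mb_s, mbit_s = effective_throughput(178, 10.405)
print(f"{mb_s:.1f} MB/s, about {mbit_s:.0f} Mbit/s")
```

If iperf reports a rate close to this between the two hosts, the network link is the likely limit rather than Flight.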
On 8/20/21 1:39 AM, Abe Hsu 許 育銘 (abehsu) wrote:
Micron Confidential
Hi team:
I am Abe from Taiwan. This is my first time sending mail to the Apache
community, so if I do something wrong, please correct me. I am
investigating Arrow Flight as a data exchange protocol, and I am using
Python to set up a Flight server. The performance is lower than I
expected, so I would like to ask the team for suggestions. The Flight
server runs in the US, and my Python client runs in Asia
(i.e., Taiwan).
Transferring 178 MB of data (1001731 rows) from the US to Asia takes
about 10 s. I expected it to take less than 1 s.
Is there anything I am missing?
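To put the 1 s expectation in network terms, a rough sketch of the sustained link rate it would require. The helper name is hypothetical, 1 MB is taken as 10^6 bytes, and protocol overhead and round-trip latency are ignored.

```python
def required_mbit_per_s(size_mb, target_seconds):
    """Sustained rate in Mbit/s needed to move size_mb megabytes in target_seconds."""
    return size_mb * 8 / target_seconds


# 178 MB within 1 s needs a bit over 1.4 Gbit/s end to end.
print(required_mbit_per_s(178, 1.0))  # 1424.0
```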
time python client.py get -c 'get'
RangeIndex: 1001731 entries, 0 to 1001730
Data columns (total 16 columns):
 #   Column             Non-Null Count    Dtype
---  ------             --------------    -----
 0   cmte_id            1001731 non-null  object
 1   cand_id            1001731 non-null  object
 2   cand_nm            1001731 non-null  object
 3   contbr_nm          1001731 non-null  object
 4   contbr_city        1001712 non-null  object
 5   contbr_st          1001727 non-null  object
 6   contbr_zip         1001731 non-null  int64
 7   contbr_employer    988002 non-null   object
 8   contbr_occupation  993301 non-null   object
 9   contb_receipt_amt  1001731 non-null  float64
 10  contb_receipt_dt   1001731 non-null  object
 11  receipt_desc       14166 non-null    object
 12  memo_cd            92482 non-null    object
 13  memo_text          97770 non-null    object
 14  form_tp            1001731 non-null  object
 15  file_num           1001731 non-null  int64
dtypes: float64(1), int64(2), object(13)
memory usage: 122.3+ MB
real 0m10.405s
user 0m0.297s
sys 0m0.996s
I have this expectation because of the following articles.
· https://www.dremio.com/is-time-to-replace-odbc-jdbc
With an average size batch size (256K records), the performance of
Flight exceeded 20 Gb/s for a single stream running on a single core.
· https://arrow.apache.org/blog/2019/10/13/introducing-arrow-flight/
As far as absolute speed, in our C++ data throughput benchmarks, we are
seeing end-to-end TCP throughput in excess of 2-3GB/s on localhost
without TLS enabled. This benchmark shows a transfer of ~12 gigabytes of
data in about 4 seconds:
Many Thanks,
Abe