[
https://issues.apache.org/jira/browse/HUDI-818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17361972#comment-17361972
]
Rajesh Mahindra commented on HUDI-818:
--
Benchmark results across EMR nodes with both SSD and HDD storage are below.
tl;dr: I do not see any significant regressions or unexpected latency spikes
for the spillable map that would require immediate attention.
Case 1: Benchmark results with EMR m5.xlarge
4 vCore, 16 GiB memory, EBS only storage
EBS Storage: 2000 GiB with ST1 HDD storage
---
THROUGHPUT using dd:
---
[hadoop@ip-172-31-26-21 hudi]$ dd if=/dev/zero of=/mnt/test bs=512 count=1
oflag=direct
1+0 records in
1+0 records out
512 bytes (5.1 MB) copied, 35.8048 s, 143 kB/s
[hadoop@ip-172-31-26-21 ~]$ dd if=/dev/zero of=/mnt/test bs=1K count=1
oflag=direct
1+0 records in
1+0 records out
1024 bytes (10 MB) copied, 33.2558 s, 308 kB/s
[hadoop@ip-172-31-26-21 ~]$ dd if=/dev/zero of=/mnt/test bs=1M count=1
oflag=direct
1+0 records in
1+0 records out
1048576 bytes (10 GB) copied, 42.2197 s, 248 MB/s
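The throughput figures dd prints are simply total bytes written divided by elapsed time, in decimal units. A minimal sketch of that arithmetic follows. The exact byte totals are assumptions inferred from the human-readable sizes in the output above (5.1 MB, 10 MB, 10 GB); the log does not state them exactly.

```python
# Assumed byte totals, inferred from the human-readable sizes dd printed
# (5.1 MB, 10 MB, 10 GB). Elapsed times are copied from the HDD runs above.
HDD_RUNS = [
    ("bs=512", 5_120_000, 35.8048),
    ("bs=1K", 10_240_000, 33.2558),
    ("bs=1M", 10_485_760_000, 42.2197),
]


def rate_bytes_per_s(total_bytes, seconds):
    """Throughput as dd reports it: bytes written divided by elapsed seconds."""
    return total_bytes / seconds


for label, total_bytes, seconds in HDD_RUNS:
    rate = rate_bytes_per_s(total_bytes, seconds)
    # dd uses decimal units (1 kB = 1000 bytes, 1 MB = 1000 kB).
    print(f"{label}: {rate / 1e3:.0f} kB/s ({rate / 1e6:.2f} MB/s)")
```

Under those assumed totals the computed rates round to the 143 kB/s, 308 kB/s, and 248 MB/s that dd reported.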
LATENCY using IOPING:
--
FOR 512 Bytes block size
[hadoop@ip-172-31-26-21 hudi]$ sudo ~/ioping-0.8/ioping -R /dev/nvme1n1p2 -s
512 -w 120
--- /dev/nvme1n1p2 (device 1.9 TiB) ioping statistics ---
61.5 k requests completed in 2.0 min, 512 iops, 256.2 KiB/s
min/avg/max/mdev = 1 us / 2.0 ms / 34.6 ms / 2.2 ms
FOR 4K block size
[hadoop@ip-172-31-26-21 ~]$ sudo ./ioping-0.8/ioping -R /dev/nvme1n1p2 -s 4K -w
120
--- /dev/nvme1n1p2 (device 1.9 TiB) ioping statistics ---
61.7 k requests completed in 2.0 min, 515 iops, 2.0 MiB/s
min/avg/max/mdev = 176 us / 1.9 ms / 31.9 ms / 2.1 ms
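As a quick consistency check on the ioping numbers above, bandwidth should be roughly IOPS times request size, and the mean latency should be roughly the inverse of IOPS. The sketch below assumes requests are issued one at a time with no idle gap between them; the helper names are illustrative and not part of ioping.

```python
def kib_per_s(iops, block_bytes):
    """Bandwidth in KiB/s implied by an IOPS figure and a request size."""
    return iops * block_bytes / 1024


def mean_latency_ms(iops):
    """Mean per-request latency in ms, assuming one outstanding request."""
    return 1000.0 / iops


# Figures copied from the two HDD ioping runs above.
print(kib_per_s(512, 512))           # 512-byte requests at 512 iops
print(kib_per_s(515, 4096) / 1024)   # 4K requests at 515 iops, in MiB/s
print(mean_latency_ms(512), mean_latency_ms(515))
```

The small differences from the reported 256.2 KiB/s and 2.0 MiB/s are expected because ioping rounds the IOPS value it prints.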
BENCHMARKING WITH LOAD OF GET AND PUT (Code written in
org.apache.hudi.common.util.collection.TestExternalSpillableMap):
2 RUNS with 5M records of 500B each:
GET MEM: {0=860225, 1=485}
GET DISK: {128=1, 0=4033664, 65=1, 129=1, 1=105603, 99=1, 5=1, 199=1, 44=1,
77=1, 16=1, 145=1, 117=3, 118=1, 123=1, 124=3, 221=1, 125=2, 126=1, 30=1, 31=1}
PUT MEM: {0=859029, 1=423}
PUT DISK: {0=4108753, 1=31712, 130=1, 131=2, 128=4, 129=3, 3588=1, 133=2,
136=1, 139=1, 142=1, 144=1, 145=1, 20=1, 21=1, 152=1, 153=1, 157=1, 3621=1,
37=2, 44=1, 172=1, 49=1, 50=1, 54=1, 55=1, 60=1, 61=1, 68=1, 70=1, 71=1, 78=1,
209=1, 82=1, 83=1, 85=1, 89=1, 93=1, 226=1, 101=1, 108=1, 109=3, 111=1, 112=1,
113=2, 114=1, 116=2, 117=2, 118=3, 119=2, 120=1, 121=1, 122=3, 124=3, 125=3,
126=2, 127=7}
GET MEM: {0=860207, 1=668, 3=1, 5=1}
GET DISK: {0=3988026, 1=150580, 2=185, 3=104, 4=61, 5=68, 6=27, 7=19, 8=10,
9=9, 10=7, 11=4, 12=1, 204=1, 13=2, 15=2, 146=1, 18=1, 19=1, 21=1, 150=1,
155=1, 226=1, 165=1, 230=1, 169=1, 44=1, 239=1, 114=1, 179=1, 253=1, 190=1,
255=1, 191=1}
PUT MEM: {0=860348, 1=614, 9=1}
PUT DISK: {0=4084431, 1=54357, 129=1, 130=1, 2=65, 3=31, 4=23, 261=1, 5=23,
6=9, 7=9, 8=2, 265=1, 9=4, 10=1, 139=1, 11=1, 12=3, 140=1, 14=2, 270=1, 144=1,
17=1, 273=1, 145=1, 146=3, 147=3, 20=1, 21=2, 150=1, 280=1, 155=2, 156=1,
285=1, 287=1, 163=1, 169=1, 170=3, 171=2, 172=2, 173=1, 176=1, 178=1, 180=1,
181=1, 182=1, 183=2, 314=1, 187=1, 316=1, 191=1, 192=1, 4803=1, 197=1, 202=1,
75=1, 208=1, 209=1, 84=1, 213=1, 214=1, 223=2, 224=1, 225=1, 227=1, 228=1,
101=1, 232=1, 237=1, 238=1, 240=1, 242=1, 243=1, 372=1, 245=1, 247=1, 248=1,
250=1, 254=1}
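The map dumps above are hard to compare by eye, so here is a minimal sketch for summarizing one of them. It assumes each key is a latency bucket and each value is the number of operations that landed in that bucket; the comment does not state the unit, and the function names are hypothetical helpers, not part of Hudi.

```python
import re


def parse_histogram(text):
    """Parse a Java Map.toString()-style dump such as '{0=860225, 1=485}'."""
    return {int(k): int(v) for k, v in re.findall(r"(\d+)=(\d+)", text)}


def percentile(hist, pct):
    """Smallest bucket whose cumulative count reaches pct percent of samples."""
    threshold = sum(hist.values()) * pct / 100.0
    running = 0
    for bucket in sorted(hist):
        running += hist[bucket]
        if running >= threshold:
            return bucket
    return max(hist)


# First GET MEM dump from the HDD runs above.
get_mem = parse_histogram("{0=860225, 1=485}")
total = sum(get_mem.values())
print("samples:", total)
print("share in bucket 0:", get_mem[0] / total)
print("p99 bucket:", percentile(get_mem, 99))
print("max bucket:", max(get_mem))
```

Pasting a full DISK dump into `parse_histogram` works the same way, since line breaks and a leading backslash are ignored by the regular expression.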
Case 2: Benchmark results with EMR m5.xlarge
4 vCore, 16 GiB memory, EBS only storage
EBS Storage: 2000 GiB with GP2 SSD storage
---
THROUGHPUT using dd:
---
[hadoop@ip-172-31-30-32 hudi]$ dd if=/dev/zero of=/mnt/test bs=512 count=1
oflag=direct
1+0 records in
1+0 records out
512 bytes (5.1 MB) copied, 8.11925 s, 631 kB/s
[hadoop@ip-172-31-30-32 ~]$ dd if=/dev/zero of=/mnt/test bs=1K count=10
oflag=direct
10+0 records in
10+0 records out
10240 bytes (102 MB) copied, 85.7164 s, 1.2 MB/s
[hadoop@ip-172-31-30-32 mnt]$ dd if=/dev/zero of=/mnt/test bs=1M count=1
oflag=direct
1+0 records in
1+0 records out
1048576 bytes (10 GB) copied, 88.494 s, 118 MB/s
LATENCY using IOPING:
-
For 512 Bytes block size
[hadoop@ip-172-31-30-32 hudi]$ sudo ~/ioping-0.8/ioping -R /dev/nvme1n1p2 -s
512 -w 120
--- /dev/nvme1n1p2 (device 1.9 TiB) ioping statistics ---
227.7 k requests completed in 2.0 min, 1.9 k iops, 950.8 KiB/s
min/avg/max/mdev = 2 us / 525 us / 19.1 ms / 506 us
For 4K block size
[hadoop@ip-172-31-30-32 ~]$ sudo ./ioping-0.8/ioping -R /dev/nvme1n1p2 -s 4K -w
120
--- /dev/nvme1n1p2 (device 1.9 TiB) ioping statistics ---
223.4 k requests completed in 2.0 min, 2.0 k iops, 7.6 MiB/s
min/avg/max/mdev = 127 us / 511 us / 35.0 ms