zeroshade commented on pull request #10716: URL: https://github.com/apache/arrow/pull/10716#issuecomment-896879021
@emkornfield The performance difference is actually entirely dependent on the data and types (which is why I left both implementations in here). My current theory is that the custom hash table implementation (which I took from the C++ memo table implementation) simply benefits from optimized handling of smaller values, resulting in significantly fewer allocations. With go1.16.1 on my local laptop:

```
goos: windows
goarch: amd64
pkg: github.com/apache/arrow/go/parquet/internal/encoding
cpu: Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
BenchmarkMemoTableFloat64/100_unique_n_65535/go_map-12          100   12464686 ns/op      7912 B/op    15 allocs/op
BenchmarkMemoTableFloat64/100_unique_n_65535/xxh3-12            230    5325610 ns/op      7680 B/op     2 allocs/op
BenchmarkMemoTableFloat64/1000_unique_n_65535/go_map-12          85   14819479 ns/op    123337 B/op    70 allocs/op
BenchmarkMemoTableFloat64/1000_unique_n_65535/xxh3-12           186    5388082 ns/op    130560 B/op     4 allocs/op
BenchmarkMemoTableFloat64/5000_unique_n_65535/go_map-12          79   16167963 ns/op    493125 B/op   195 allocs/op
BenchmarkMemoTableFloat64/5000_unique_n_65535/xxh3-12           132    7631678 ns/op    523776 B/op     5 allocs/op
BenchmarkMemoTableInt32/100_unique_n_65535/xxh3-12              247    4648034 ns/op      5120 B/op     2 allocs/op
BenchmarkMemoTableInt32/100_unique_n_65535/go_map-12            602    1885666 ns/op      6468 B/op    17 allocs/op
BenchmarkMemoTableInt32/1000_unique_n_65535/xxh3-12            1429     895664 ns/op     87040 B/op     4 allocs/op
BenchmarkMemoTableInt32/1000_unique_n_65535/go_map-12           632    1893139 ns/op    104802 B/op    72 allocs/op
BenchmarkMemoTableInt32/5000_unique_n_65535/xxh3-12            1020    1295938 ns/op    349184 B/op     5 allocs/op
BenchmarkMemoTableInt32/5000_unique_n_65535/go_map-12           489    2457437 ns/op    420230 B/op   187 allocs/op
BenchmarkMemoTable/100_unique_len_32-32_n_65535/xxh3-12         398    2990361 ns/op     16904 B/op    23 allocs/op
BenchmarkMemoTable/100_unique_len_32-32_n_65535/go_map-12       428    2811322 ns/op     19799 B/op    28 allocs/op
BenchmarkMemoTable/100_unique_len_8-32_n_65535/xxh3-12          356    3463212 ns/op     16904 B/op    23 allocs/op
BenchmarkMemoTable/100_unique_len_8-32_n_65535/go_map-12        380    3149174 ns/op     19781 B/op    28 allocs/op
BenchmarkMemoTable/1000_unique_len_32-32_n_65535/xxh3-12        366    3188208 ns/op    176200 B/op    32 allocs/op
BenchmarkMemoTable/1000_unique_len_32-32_n_65535/go_map-12      361    9588561 ns/op    211407 B/op    67 allocs/op
BenchmarkMemoTable/1000_unique_len_8-32_n_65535/xxh3-12         336    3529636 ns/op    176200 B/op    32 allocs/op
BenchmarkMemoTable/1000_unique_len_8-32_n_65535/go_map-12       326    3676726 ns/op    211358 B/op    67 allocs/op
BenchmarkMemoTable/5000_unique_len_32-32_n_65535/xxh3-12        252    4702015 ns/op    992584 B/op    42 allocs/op
BenchmarkMemoTable/5000_unique_len_32-32_n_65535/go_map-12      255    4884533 ns/op   1131141 B/op   181 allocs/op
BenchmarkMemoTable/5000_unique_len_8-32_n_65535/xxh3-12         235    4814247 ns/op    722248 B/op    41 allocs/op
BenchmarkMemoTable/5000_unique_len_8-32_n_65535/go_map-12       244    5340692 ns/op    860673 B/op   179 allocs/op
BenchmarkMemoTableAllUnique/values_1024_len_32-32/go_map-12    4509     271040 ns/op    211432 B/op    67 allocs/op
BenchmarkMemoTableAllUnique/values_1024_len_32-32/xxh3-12      8245     150411 ns/op    176200 B/op    32 allocs/op
BenchmarkMemoTableAllUnique/values_1024_len_8-32/go_map-12     4552     255188 ns/op    211443 B/op    67 allocs/op
BenchmarkMemoTableAllUnique/values_1024_len_8-32/xxh3-12       9828     148377 ns/op    176200 B/op    32 allocs/op
BenchmarkMemoTableAllUnique/values_32767_len_32-32/go_map-12    100   11416073 ns/op   6324082 B/op  1176 allocs/op
BenchmarkMemoTableAllUnique/values_32767_len_32-32/xxh3-12      222    5578530 ns/op   3850569 B/op    49 allocs/op
BenchmarkMemoTableAllUnique/values_32767_len_8-32/go_map-12     100   11123497 ns/op   6323152 B/op  1171 allocs/op
BenchmarkMemoTableAllUnique/values_32767_len_8-32/xxh3-12       212    6094342 ns/op   3850569 B/op    49 allocs/op
BenchmarkMemoTableAllUnique/values_65535_len_32-32/go_map-12     44   25062816 ns/op  12580560 B/op  2384 allocs/op
BenchmarkMemoTableAllUnique/values_65535_len_32-32/xxh3-12       74   15704849 ns/op  10430028 B/op    53 allocs/op
BenchmarkMemoTableAllUnique/values_65535_len_8-32/go_map-12      40   26667808 ns/op  12580008 B/op  2381 allocs/op
BenchmarkMemoTableAllUnique/values_65535_len_8-32/xxh3-12        81   14417075 ns/op  10430024 B/op    53 allocs/op
```
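For context on what is being measured: the memo table hands out a stable dictionary index for every distinct value it sees, which is what dictionary encoding needs. A minimal sketch of the go-map-based variant might look roughly like the following (the type and method names here are illustrative only, not the actual identifiers in `go/parquet/internal/encoding`):

```go
package main

import "fmt"

// Float64MemoTable is an illustrative map-backed memo table: it assigns a
// stable index to each distinct value. The real go_map and xxh3 variants in
// the parquet internal packages are more general; this is only a sketch.
type Float64MemoTable struct {
	index  map[float64]int
	values []float64
}

func NewFloat64MemoTable() *Float64MemoTable {
	return &Float64MemoTable{index: make(map[float64]int)}
}

// GetOrInsert returns the dictionary index for v, inserting v if it has not
// been seen before. found reports whether v already existed in the table.
func (m *Float64MemoTable) GetOrInsert(v float64) (idx int, found bool) {
	if idx, found = m.index[v]; found {
		return idx, true
	}
	idx = len(m.values)
	m.index[v] = idx
	m.values = append(m.values, v)
	return idx, false
}

func main() {
	memo := NewFloat64MemoTable()
	for _, v := range []float64{1.5, 2.5, 1.5, 3.5} {
		idx, found := memo.GetOrInsert(v)
		fmt.Printf("value=%v index=%d found=%v\n", v, idx, found)
	}
}
```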
The ones labeled `xxh3` are my custom implementation and `go_map` is the implementation based on Go's builtin map. If you look closely at the results, in *most* cases the xxh3-based implementation is fairly significantly faster, sometimes even twice as fast, with significantly fewer allocations per loop. For example, in the binary case with all unique values you can see 2384 allocations in the go-map-based implementation vs 53 in my custom implementation, with the custom implementation giving a ~37% performance improvement over the go-map-based one.

But if we look at some other cases, for example the `100_unique_len_32-32_n_65535` and `100_unique_len_8-32_n_65535` cases, which correspond to 65535 binary strings of length 32, or of lengths between 8 and 32, with exactly 100 unique values among them, we see that the go-map-based implementation is actually slightly more performant despite a few more allocations. The same thing happens with the Int32 memo table with only 100 unique values over 65535 values, but when we increase to 1000 or 5000 unique values my custom one does better. This seems to indicate that the builtin Go map is faster in cases with lower cardinality of unique values, while my custom implementation is more performant at inserting new values. The exception is the Float64 case, where my custom implementation is faster in all cases, even with only 100 unique values, which I attribute to the faster hashing of small fixed-size values via xxh3.

**TL;DR**: All in all, in *most* cases the custom implementation is faster, but in some cases with lower cardinality of unique values (specifically int32 and binary strings) the implementation using Go's map as the basis can be more performant.
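For concreteness, the shape of these benchmarks (N unique values cycled over 65535 inserts, or all-unique inserts) can be sketched roughly as below; this assumes the illustrative `Float64MemoTable` type from the earlier sketch is available in the same package, and it is not the actual benchmark code in `go/parquet/internal/encoding`:

```go
package encoding

import (
	"math/rand"
	"testing"
)

// benchmarkMemoFloat64 cycles nunique distinct values over ntotal inserts, so
// low-cardinality cases are dominated by lookups of existing values while the
// all-unique cases are dominated by insertions. Illustrative sketch only.
func benchmarkMemoFloat64(b *testing.B, nunique, ntotal int) {
	rng := rand.New(rand.NewSource(42))
	unique := make([]float64, nunique)
	for i := range unique {
		unique[i] = rng.Float64()
	}
	data := make([]float64, ntotal)
	for i := range data {
		data[i] = unique[i%nunique]
	}

	b.ReportAllocs()
	b.ResetTimer()
	for n := 0; n < b.N; n++ {
		memo := NewFloat64MemoTable() // swap in the go_map or xxh3 variant here
		for _, v := range data {
			memo.GetOrInsert(v)
		}
	}
}

func BenchmarkMemoTableFloat64_100Unique(b *testing.B)  { benchmarkMemoFloat64(b, 100, 65535) }
func BenchmarkMemoTableFloat64_1000Unique(b *testing.B) { benchmarkMemoFloat64(b, 1000, 65535) }
```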
