joellubi commented on PR #43835:
URL: https://github.com/apache/arrow/pull/43835#issuecomment-2312895450
> LGTM
>
> How do the benchmarks look?
```
goos: darwin
goarch: arm64
pkg: github.com/apache/arrow/go/v18/parquet/pqarrow
BenchmarkWriteTableCompressed/codec=UNCOMPRESSED-14    13    101496837 ns/op    546.93 MB/s     222180261 B/op    1297249 allocs/op
BenchmarkWriteTableCompressed/codec=SNAPPY-14          10    105285342 ns/op    527.25 MB/s     190905955 B/op    1297284 allocs/op
BenchmarkWriteTableCompressed/codec=GZIP-14             7    163399286 ns/op    339.73 MB/s     196390286 B/op    1297654 allocs/op
BenchmarkWriteTableCompressed/codec=BROTLI-14           2    503086730 ns/op    110.34 MB/s     698402444 B/op    1298624 allocs/op
BenchmarkWriteTableCompressed/codec=ZSTD-14             8    139438708 ns/op    398.11 MB/s    1014671664 B/op    1298644 allocs/op
BenchmarkWriteTableCompressed/codec=LZ4_RAW-14          9    115800444 ns/op    479.38 MB/s     167152016 B/op    1297259 allocs/op
BenchmarkReadTableCompressed/codec=UNCOMPRESSED-14     42     34890312 ns/op    561.85 MB/s     235618446 B/op       2431 allocs/op
BenchmarkReadTableCompressed/codec=SNAPPY-14           38     33923055 ns/op    272.67 MB/s     210971180 B/op       2428 allocs/op
BenchmarkReadTableCompressed/codec=GZIP-14             22     55319667 ns/op     80.78 MB/s     202261534 B/op       2721 allocs/op
BenchmarkReadTableCompressed/codec=BROTLI-14           16     71177099 ns/op     28.35 MB/s     233693905 B/op       2779 allocs/op
BenchmarkReadTableCompressed/codec=ZSTD-14             21     56364889 ns/op     37.02 MB/s     196485768 B/op       2444 allocs/op
BenchmarkReadTableCompressed/codec=LZ4_RAW-14          36     33904137 ns/op    267.63 MB/s     210578330 B/op       2427 allocs/op
PASS
ok github.com/apache/arrow/go/v18/parquet/pqarrow 20.220s
```
LZ4_RAW is competitive with SNAPPY on throughput: it trails by about 9% on
writes (479.38 vs. 527.25 MB/s) and about 2% on reads (267.63 vs. 272.67
MB/s), and read time per op is essentially identical. Some research suggests
that LZ4 _should_ be faster than Snappy in the general case, but a number of
benchmarks on columnar data specifically still tend to favor Snappy for speed.
So I'd say these results are consistent with expectations.
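
For reference, the size of the LZ4_RAW vs. SNAPPY gap can be worked out directly from the MB/s columns in the output above. This is only a minimal sketch of that arithmetic; `pctBelow` is a hypothetical helper name used for illustration and is not part of Arrow or this PR.

```go
package main

import "fmt"

// pctBelow reports how far x falls below base, as a percentage of base.
// Hypothetical helper for illustration; not part of Arrow or the PR.
func pctBelow(x, base float64) float64 {
	return (base - x) / base * 100
}

func main() {
	// MB/s figures copied from the benchmark output above.
	const (
		writeSnappy = 527.25
		writeLZ4Raw = 479.38
		readSnappy  = 272.67
		readLZ4Raw  = 267.63
	)

	fmt.Printf("write: LZ4_RAW is %.1f%% below SNAPPY\n", pctBelow(writeLZ4Raw, writeSnappy))
	fmt.Printf("read:  LZ4_RAW is %.1f%% below SNAPPY\n", pctBelow(readLZ4Raw, readSnappy))
}
```

These are single-run numbers from one machine, so small differences like the read-side gap may well be within noise.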