This is an automated email from the ASF dual-hosted git repository.
zeroshade pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git
The following commit(s) were added to refs/heads/master by this push:
new 8abb941f57 ARROW-16983: [Go][Parquet] fix EstimatedDataEncodedSize of
DeltaByteArrayEncoder (#13522)
8abb941f57 is described below
commit 8abb941f57316d77b4a5eb209f5a108c275fe120
Author: Matt DePero <[email protected]>
AuthorDate: Wed Jul 6 13:30:32 2022 -0700
ARROW-16983: [Go][Parquet] fix EstimatedDataEncodedSize of
DeltaByteArrayEncoder (#13522)
`DeltaByteArrayEncoder` embeds `encoder`, which calculates
`EstimatedDataEncodedSize()` by calling `Len()` on its `sink`.
`DeltaByteArrayEncoder`, however, does not write its data to `sink`; it writes
to `prefixEncoder` and `suffixEncoder` instead. As a result,
`EstimatedDataEncodedSize()` always returned zero and `FlushCurrentPage` was
never called.
Authored-by: Matt DePero <[email protected]>
Signed-off-by: Matthew Topol <[email protected]>
---
go/parquet/internal/encoding/delta_byte_array.go | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/go/parquet/internal/encoding/delta_byte_array.go b/go/parquet/internal/encoding/delta_byte_array.go
index c1d1ee5f72..1f573de4eb 100644
--- a/go/parquet/internal/encoding/delta_byte_array.go
+++ b/go/parquet/internal/encoding/delta_byte_array.go
@@ -39,6 +39,10 @@ type DeltaByteArrayEncoder struct {
 	lastVal parquet.ByteArray
 }
 
+func (enc *DeltaByteArrayEncoder) EstimatedDataEncodedSize() int64 {
+	return enc.prefixEncoder.EstimatedDataEncodedSize() + enc.suffixEncoder.EstimatedDataEncodedSize()
+}
+
 func (enc *DeltaByteArrayEncoder) initEncoders() {
 	enc.prefixEncoder = &DeltaBitPackInt32Encoder{
 		deltaBitPackEncoder: &deltaBitPackEncoder{encoder: newEncoderBase(enc.encoding, nil, enc.mem)}}
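
The failure mode described in the commit message can be reproduced in miniature. The sketch below is not the Arrow code: `encoderBase`, `brokenDeltaEncoder`, `fixedDeltaEncoder`, and the simplified `Put` signature are hypothetical stand-ins, and a `bytes.Buffer` plays the role of the sink. It only illustrates how a size method promoted from an embedded type keeps reading the outer type's unused sink until the outer type defines its own method.

```go
package main

import (
	"bytes"
	"fmt"
)

// encoderBase is a hypothetical stand-in for a shared encoder base.
// It estimates its encoded size from the bytes written to its own sink.
type encoderBase struct {
	sink bytes.Buffer
}

func (e *encoderBase) EstimatedDataEncodedSize() int64 {
	return int64(e.sink.Len())
}

// brokenDeltaEncoder embeds encoderBase but sends all data to two
// sub-encoders. Its own sink stays empty, so the promoted
// EstimatedDataEncodedSize method always reports zero.
type brokenDeltaEncoder struct {
	encoderBase
	prefix encoderBase
	suffix encoderBase
}

// Put records one value as a prefix length plus the remaining suffix bytes.
// This signature is simplified for illustration only.
func (e *brokenDeltaEncoder) Put(prefixLen byte, suffix []byte) {
	e.prefix.sink.WriteByte(prefixLen)
	e.suffix.sink.Write(suffix)
}

// fixedDeltaEncoder defines its own method, which shadows the promoted one
// and sums the estimates of the two sub-encoders.
type fixedDeltaEncoder struct {
	brokenDeltaEncoder
}

func (e *fixedDeltaEncoder) EstimatedDataEncodedSize() int64 {
	return e.prefix.EstimatedDataEncodedSize() + e.suffix.EstimatedDataEncodedSize()
}

func main() {
	broken := &brokenDeltaEncoder{}
	broken.Put(0, []byte("apple"))

	fixed := &fixedDeltaEncoder{}
	fixed.Put(0, []byte("apple"))

	fmt.Println("broken estimate:", broken.EstimatedDataEncodedSize())
	fmt.Println("fixed estimate: ", fixed.EstimatedDataEncodedSize())
}
```

A caller that flushes a page only once the estimate crosses a threshold would never flush with the first type, which matches the symptom reported above.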