zeroshade commented on code in PR #485:
URL: https://github.com/apache/arrow-go/pull/485#discussion_r2310553142
##########
parquet/file/column_reader.go:
##########
@@ -437,16 +446,27 @@ func (c *columnChunkReader) initDataDecoder(page Page, lvlByteLen int64) error {
 		format.Encoding_DELTA_LENGTH_BYTE_ARRAY,
 		format.Encoding_DELTA_BINARY_PACKED,
 		format.Encoding_BYTE_STREAM_SPLIT:
-		c.curDecoder = c.decoderTraits.Decoder(parquet.Encoding(encoding), c.descr, false, c.mem)
-		c.decoders[encoding] = c.curDecoder
+		c.curDecoder = c.decoderTraits.Decoder(parquet.Encoding(enc), c.descr, false, c.mem)
+		c.decoders[enc] = c.curDecoder
 	case format.Encoding_RLE_DICTIONARY:
 		return errors.New("parquet: dictionary page must be before data page")
 	default:
-		return fmt.Errorf("parquet: unknown encoding type %s", encoding)
+		return fmt.Errorf("parquet: unknown encoding type %s", enc)
+	}
+	}
+
+	switch c.descr.PhysicalType() {
+	case parquet.Types.FixedLenByteArray:
+		c.curDecoder = &encoding.FixedLenByteArrayDecoderWrapper{
+			FixedLenByteArrayDecoder: c.curDecoder.(encoding.FixedLenByteArrayDecoder),
+		}
+	case parquet.Types.ByteArray:
+		c.curDecoder = &encoding.ByteArrayDecoderWrapper{
+			ByteArrayDecoder: c.curDecoder.(encoding.ByteArrayDecoder),
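The hunk above wraps the current decoder in a type-specific wrapper via a type switch and embedding. A minimal sketch of that pattern, with hypothetical stand-in types rather than the real arrow-go interfaces:

```go
package main

import "fmt"

// ByteArrayDecoder is a hypothetical stand-in for the decoder
// interface asserted in the diff above.
type ByteArrayDecoder interface {
	Decode() string
}

// plainDecoder is a trivial concrete decoder.
type plainDecoder struct{}

func (plainDecoder) Decode() string { return "plain" }

// ByteArrayDecoderWrapper embeds the underlying decoder, so it
// satisfies the same interface while overriding selected methods.
type ByteArrayDecoderWrapper struct {
	ByteArrayDecoder
}

// Decode overrides the embedded method, delegating to the wrapped decoder.
func (w ByteArrayDecoderWrapper) Decode() string {
	return "wrapped:" + w.ByteArrayDecoder.Decode()
}

func main() {
	var cur ByteArrayDecoder = plainDecoder{}
	// Wrap based on some runtime condition, like the PhysicalType
	// switch in the hunk.
	cur = &ByteArrayDecoderWrapper{ByteArrayDecoder: cur}
	fmt.Println(cur.Decode()) // prints "wrapped:plain"
}
```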
Review Comment:
   Yes, but remember that the code you're changing called `buf := memory.NewResizableBuffer(p.mem)`, which allocates a new buffer that the data is copied into before the changes you're making. The changes in this PR introduce buffer reuse that wasn't there before.
   I'm not saying that's a bad thing, just pointing out that part of the reason for the previous behavior of allocating new buffers was to prevent the problem you've identified. Now that this change reuses those buffers, we have to do the copying you're proposing.
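   To illustrate the hazard (a minimal sketch with a hypothetical shared scratch buffer, not the actual arrow-go code): a decoder that hands out views into a reused buffer will see those views clobbered on the next decode, while a decoder that copies out is safe.

```go
package main

import "fmt"

// scratch stands in for a reused page buffer (hypothetical; the real
// code uses memory.NewResizableBuffer / buffer pooling).
var scratch = make([]byte, 8)

// decodeNoCopy returns a view into the shared buffer.
func decodeNoCopy(val string) []byte {
	copy(scratch, val)
	return scratch[:len(val)]
}

// decodeWithCopy copies the bytes out, so later reuse can't clobber them.
func decodeWithCopy(val string) []byte {
	copy(scratch, val)
	out := make([]byte, len(val))
	copy(out, scratch[:len(val)])
	return out
}

func main() {
	a := decodeNoCopy("first")
	b := decodeWithCopy("first")
	decodeNoCopy("XXXXX") // buffer reused for the next page

	fmt.Println(string(a)) // prints "XXXXX": the view was clobbered
	fmt.Println(string(b)) // prints "first": the copy is unaffected
}
```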
   That's all.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]