zeroshade commented on code in PR #485: URL: https://github.com/apache/arrow-go/pull/485#discussion_r2310553142
########## parquet/file/column_reader.go: ########## @@ -437,16 +446,27 @@ func (c *columnChunkReader) initDataDecoder(page Page, lvlByteLen int64) error { format.Encoding_DELTA_LENGTH_BYTE_ARRAY, format.Encoding_DELTA_BINARY_PACKED, format.Encoding_BYTE_STREAM_SPLIT: - c.curDecoder = c.decoderTraits.Decoder(parquet.Encoding(encoding), c.descr, false, c.mem) - c.decoders[encoding] = c.curDecoder + c.curDecoder = c.decoderTraits.Decoder(parquet.Encoding(enc), c.descr, false, c.mem) + c.decoders[enc] = c.curDecoder case format.Encoding_RLE_DICTIONARY: return errors.New("parquet: dictionary page must be before data page") default: - return fmt.Errorf("parquet: unknown encoding type %s", encoding) + return fmt.Errorf("parquet: unknown encoding type %s", enc) + } + } + + switch c.descr.PhysicalType() { + case parquet.Types.FixedLenByteArray: + c.curDecoder = &encoding.FixedLenByteArrayDecoderWrapper{ + FixedLenByteArrayDecoder: c.curDecoder.(encoding.FixedLenByteArrayDecoder), + } + case parquet.Types.ByteArray: + c.curDecoder = &encoding.ByteArrayDecoderWrapper{ + ByteArrayDecoder: c.curDecoder.(encoding.ByteArrayDecoder), Review Comment: Yes, but remember that the code you're changing called ` buf := memory.NewResizableBuffer(p.mem)` which does allocate a new buffer that the data is copied into before the changes you're making. The changes in this PR are introducing buffer re-use that wasn't there before. I'm not saying it's a bad thing, just pointing out part of the reason for the previous behavior that allocated new buffers was to prevent the problem you've identified. Now that we're reusing these buffers in this change, we have to do the copying you're proposing. Thats all -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org