zeroshade commented on code in PR #34631:
URL: https://github.com/apache/arrow/pull/34631#discussion_r1157664027
##########
go/parquet/pqarrow/column_readers.go:
##########
@@ -546,6 +551,14 @@ func transferBinary(rdr file.RecordReader, dt arrow.DataType) *arrow.Chunked {
defer chunks[idx].Data().Release()
defer chunks[idx].Release()
}
+ } else if dt.ID() == arrow.EXTENSION && len(chunks) > 0 && arrow.StorageTypeEqual(chunks[0].DataType(), dt) && !arrow.TypeEqual(chunks[0].DataType(), dt) {
Review Comment:
This is what I was getting at above. Because you're passing the extension
data type down here rather than the expected underlying storage type, this
check becomes more complicated than it needs to be.
I think it might be preferable to do something like:
```go
chunks := brdr.GetBuilderChunks()
// Resolve the storage type once, so the checks below work for both
// plain and extension types.
storage := dt
if dt.ID() == arrow.EXTENSION {
	storage = dt.(arrow.ExtensionType).StorageType()
}
if storage == arrow.BinaryTypes.String || storage == arrow.BinaryTypes.LargeString {
	// if storage != dt -> do the loop with `NewExtensionArrayWithStorage`
	// otherwise the loop with `MakeFromData`
}
return arrow.NewChunked(dt, chunks)
```
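To make the branching easier to reason about, here is a rough, standalone sketch of the same control flow. It does not use the arrow packages at all: `DataType`, `ExtensionType`, `stringType`, `int64Type`, `uuidType`, `storageOf`, and `transferPlan` are all hypothetical stand-ins invented for illustration, and the returned strings only name which path the real code would presumably take.

```go
package main

import "fmt"

// DataType is a hypothetical stand-in for arrow.DataType.
type DataType interface{ Name() string }

// ExtensionType is a hypothetical stand-in for arrow.ExtensionType.
type ExtensionType interface {
	DataType
	StorageType() DataType
}

// Toy concrete types, invented for this sketch only.
type stringType struct{}

func (stringType) Name() string { return "utf8" }

type int64Type struct{}

func (int64Type) Name() string { return "int64" }

// uuidType is a toy extension type backed by string storage.
type uuidType struct{}

func (uuidType) Name() string          { return "extension<uuid>" }
func (uuidType) StorageType() DataType { return stringType{} }

// storageOf returns the storage type for an extension type,
// and dt itself for everything else.
func storageOf(dt DataType) DataType {
	if ext, ok := dt.(ExtensionType); ok {
		return ext.StorageType()
	}
	return dt
}

// transferPlan names the path the string branch would take for dt.
func transferPlan(dt DataType) string {
	storage := storageOf(dt)
	if storage != DataType(stringType{}) {
		return "passthrough" // not string storage: chunks are used as-is
	}
	if storage != dt {
		return "wrap-extension" // rebuild, then wrap in the extension type
	}
	return "make-from-data" // plain string: rebuild with the string type
}

func main() {
	for _, dt := range []DataType{uuidType{}, stringType{}, int64Type{}} {
		fmt.Printf("%s -> %s\n", dt.Name(), transferPlan(dt))
	}
}
```

The point of the sketch is that the extension check happens exactly once, up front, so the later conditions only ever compare against the storage type.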