zeroshade opened a new pull request, #851:
URL: https://github.com/apache/arrow-go/pull/851
### Rationale for this change
Closes #827.
`*scalar.Extension` implements `value()`, `equals()`, `Validate()`,
`ValidateFull()`, `CastTo()`, and `String()`, but neither `Release()` nor
`Retain()`.
`compute.ScalarDatum.Release()` only forwards `Release()` when the wrapped
scalar satisfies the unexported `releasable` interface
(`arrow/compute/datum.go`):
```go
func (d *ScalarDatum) Release() {
if v, ok := d.Value.(releasable); ok {
v.Release()
}
}
```
Because `*scalar.Extension` doesn't satisfy that interface, the type
assertion fails and the inner storage scalar is never released — even though
the storage scalar (e.g. `*scalar.FixedSizeList` backing the interval extension
types) carries buffer-bearing arrays. Any code path that produces an extension
scalar and releases it through a `ScalarDatum` leaks those buffers. Concretely
this affects the `*types.IntervalYearToMonthType` / `*types.IntervalDayType`
paths in `arrow/compute/exprs`.
`*scalar.Extension` was the lone outlier among buffer-holding scalar types:
`*scalar.Binary`, `*scalar.List`, `*scalar.Struct`, `*scalar.Dictionary`,
`*scalar.SparseUnion`, `*scalar.DenseUnion`, and `*scalar.RunEndEncoded` all
already delegate `Retain`/`Release` to their inner storage.
### What changes are included in this PR?
1. **`*scalar.Extension` now implements `Retain()` and `Release()`**,
delegating to the storage scalar when it satisfies the `Releasable` interface
(`arrow/scalar/scalar.go`). This mirrors `*scalar.Binary` and the nested scalar
types:
```go
func (s *Extension) Retain() {
if r, ok := s.Value.(Releasable); ok {
r.Retain()
}
}
func (s *Extension) Release() {
if r, ok := s.Value.(Releasable); ok {
r.Release()
}
}
```
This is purely additive:
- Storage scalars that don't implement `Releasable` (e.g. primitive
numeric storage) are no-ops, matching today's behavior.
- Storage scalars that do (e.g. `*scalar.FixedSizeList` for the interval
extension types) are now properly released.
- The `Retain()` half restores symmetry so a caller taking ownership of
an extension scalar can bump the storage refcount.
2. **`TestLiteralToDatumIntervalYearToMonth` is switched to
`memory.NewCheckedAllocator` with `defer mem.AssertSize(t, 0)`**
(`arrow/compute/exprs/exec_internal_test.go`). That test previously used
`memory.DefaultAllocator` specifically to dodge this leak and carried a comment
deferring the fix to this issue; the workaround comment is removed.
### Are these changes tested?
Yes. `TestLiteralToDatumIntervalYearToMonth` now asserts zero outstanding
allocations via a checked allocator. Verified that it **fails without** the
scalar change (`invalid memory size exp=0, got=384`, with leak traces pointing
at the extension scalar's storage) and **passes with** it. The full
`arrow/scalar/...` and `arrow/compute/exprs/...` suites pass, and `go vet` is
clean.
### Are there any user-facing changes?
No breaking changes. Adding `Retain()`/`Release()` methods to an exported
type is non-breaking in Go. The only behavioral change is the intended one:
buffers held by an extension scalar's storage are now released when the scalar
(or a `ScalarDatum` wrapping it) is released, instead of leaking.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]