zeroshade opened a new pull request, #825:
URL: https://github.com/apache/arrow-go/pull/825

   ### Rationale for this change
   
   Closes #815.
   
   `arrow/compute/exprs.literalToDatum` handled interval year/month literals 
only through the `*expr.ProtoLiteral` path. The substrait-go library also 
surfaces these as the dedicated `expr.IntervalYearToMonthLiteral` type — that 
is the value 
[`intervalYearToMonthLiteralFromProto`](https://github.com/substrait-io/substrait-go/blob/v8.1.1/expr/interval_year_to_month.go#L41-L47)
 returns when deserializing protobuf — and any caller reaching `literalToDatum` 
with this type previously fell through to `arrow.ErrNotImplemented`.
   
   ### What changes are included in this PR?
   
   1. **New type-switch cases for `expr.IntervalYearToMonthLiteral`** (value 
form, the canonical form returned by substrait-go's deserializer) **and 
`*expr.IntervalYearToMonthLiteral`** (pointer form, used in parts of 
substrait-go itself, e.g. `expr/string_test.go:86`). Both produce the same 
`FixedSizeList<int32>[2]` extension scalar that the existing 
`*expr.ProtoLiteral` path already produces, so callers reaching either form get 
a Datum that is `compute.Equals` to the one they would have gotten via the 
protobuf path.
   2. **New helper `intervalYearToMonthDatum(mem, years, months)`**. The 
existing `*types.IntervalYearToMonthType` branch inside the 
`*expr.ProtoLiteral` case now calls this helper instead of inlining the builder 
logic, so the two paths cannot drift.
   3. **Allocator threading**. The helper takes the `mem memory.Allocator` 
parameter that `literalToDatum` already receives, instead of hardcoding 
`memory.DefaultAllocator` (which the original inlined branch did). All three 
call sites forward `mem`.
   
   ### Are these changes tested?
   
   Yes. New `TestLiteralToDatumIntervalYearToMonth` in 
`arrow/compute/exprs/exec_internal_test.go` constructs a `*expr.ProtoLiteral` 
baseline and a parallel `expr.IntervalYearToMonthLiteral` (value) and 
`*expr.IntervalYearToMonthLiteral` (pointer), then asserts the resulting Datums 
are `compute.Equals`. Existing `parquet/file/...`, `parquet/pqarrow/...`, and 
`arrow/compute/exprs/...` test suites all continue to pass; full repo `go build 
./...` is clean.
   
   The test deliberately uses `memory.DefaultAllocator` rather than 
`memory.NewCheckedAllocator` with `AssertSize(t, 0)`. `*scalar.Extension` does 
not implement `Release()` (see `arrow/scalar/scalar.go:397-466`), so an 
extension scalar's underlying storage is never released even when the wrapping 
Datum is — adding a checked-allocator assertion would chase a pre-existing leak 
that is unrelated to this issue. (Filing that as a separate follow-up issue is 
straightforward and out of scope here.)
   
   ### Are there any user-facing changes?
   
   Yes, but **purely additive**:
   
   - Callers passing an `expr.IntervalYearToMonthLiteral` (or pointer to one) 
to `literalToDatum`, `ExecuteScalarExpression`, or related entry points used to 
receive `arrow.ErrNotImplemented`; they now receive a properly-formed extension 
Datum identical to what the `*expr.ProtoLiteral` path produces.
   - No public API signatures change.
   - No existing call site behavior changes (the `*expr.ProtoLiteral` path 
produces a byte-identical Datum to before).
   
   ### Out of scope (separate follow-up)
   
   `*scalar.Extension` does not implement `Release()`, so extension scalars 
(including the ones produced by this code path, the existing 
`*types.IntervalYearToMonthType` path, the `*types.IntervalDayType` path, and 
`intervalDay()` / `intervalYear()` extension types in general) leak their 
storage through `compute.ScalarDatum.Release()`. Worth a separate issue.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to