real-mj-song commented on issue #18629:
URL: https://github.com/apache/pinot/issues/18629#issuecomment-4699401511

   **Scoping update on the "null handling in `buildColumnar()`" architectural 
gap** — resolution implemented as part of the column-major work (#18638).
   
   **Fixed**
   - **Single-value null → typed-collector NPE.** 
`ColumnarSegmentPreIndexStatsContainer` now substitutes 
`fieldSpec.getDefaultNullValue()` for `null` before dispatching to the typed 
stats collector — mirroring `NullValueTransformer` on the row-major path and 
the column-major *index* path (`SegmentColumnarIndexCreator.indexColumn`), 
which already substituted the default and set the null-value vector. This 
removes the `((Integer) entry)`-style cast NPE for every SV type 
(INT/LONG/FLOAT/DOUBLE/BIG_DECIMAL/STRING/BYTES/BOOLEAN). Verified equivalent 
to the row-major path for all non-time SV columns.
   - **Multi-value whole-value null.** Both the stats and index paths now 
substitute `new Object[]{defaultNullValue}` when `isSingleValueField()` is 
false, matching `NullValueTransformerUtils.getDefaultNullValue`. Previously the 
index path threw a `ClassCastException` when casting the scalar default to 
`Object[]` on the first whole-null MV row, and the stats path mis-recorded a 
whole-null MV value as a single SV entry, giving wrong 
`maxNumberOfMultiValues` / `totalNumberOfEntries`.
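
The two substitutions above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from #18638: `FieldSpecStub` is a hypothetical stand-in that models only `isSingleValueField()` and `getDefaultNullValue()`, `substituteNull` is a made-up helper name, and `Integer.MIN_VALUE` is used purely as a placeholder default.

```java
import java.util.Arrays;

final class NullSubstitutionSketch {

  /** Hypothetical stand-in for the two FieldSpec accessors used in this sketch. */
  static final class FieldSpecStub {
    private final boolean singleValue;
    private final Object defaultNullValue;

    FieldSpecStub(boolean singleValue, Object defaultNullValue) {
      this.singleValue = singleValue;
      this.defaultNullValue = defaultNullValue;
    }

    boolean isSingleValueField() {
      return singleValue;
    }

    Object getDefaultNullValue() {
      return defaultNullValue;
    }
  }

  /**
   * Returns the value to pass on to a typed consumer (stats collector or index creator).
   * Non-null values are returned untouched; a null is replaced by the scalar default
   * for SV columns, or by a one-element array holding the default for MV columns.
   */
  static Object substituteNull(FieldSpecStub spec, Object value) {
    if (value != null) {
      return value;
    }
    Object defaultValue = spec.getDefaultNullValue();
    return spec.isSingleValueField() ? defaultValue : new Object[]{defaultValue};
  }

  public static void main(String[] args) {
    FieldSpecStub svInt = new FieldSpecStub(true, Integer.MIN_VALUE);
    FieldSpecStub mvInt = new FieldSpecStub(false, Integer.MIN_VALUE);

    // SV: unboxing is safe because the null was replaced before the cast.
    int svValue = (Integer) substituteNull(svInt, null);

    // MV: the cast to Object[] is safe because the default was wrapped in an array.
    Object[] mvValue = (Object[]) substituteNull(mvInt, null);

    System.out.println(svValue);
    System.out.println(Arrays.toString(mvValue));
  }
}
```

The point of doing the substitution before dispatch is that every downstream typed consumer sees a value of the shape it already expects, so neither the SV cast nor the MV `Object[]` cast needs its own null check.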
   
   Segment-local columnar tests (`ColumnarRowMajorEquivalenceTest`, 
`SegmentColumnarIndexCreatorTest`, `ColumnarSchemaEvolutionTest`) and the Arrow 
column-major suite pass.
   
   **Remaining (minor)**
   - **TIME column parity.** Both paths substitute the raw 
`fieldSpec.getDefaultNullValue()` rather than the row-major time-validated / 
current-time override 
(`NullValueTransformerUtils.getDefaultNullValue(fieldSpec, tableConfig, 
schema)`). Low severity — stats and index stay mutually consistent (segment 
metadata agrees with the forward index). Full row-major parity for the time 
column would route both through `NullValueTransformerUtils`.
   - **Element-level nulls inside an MV array** are passed through unchanged — 
consistent with the row-major path, whose substitution is whole-value-granular 
(it does not substitute per-element MV nulls either).
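
To make the whole-value granularity concrete, here is a small sketch under the same caveats: `substituteWholeValueNull` is a hypothetical helper, not a Pinot method, and `0` is an arbitrary placeholder default. Only a row whose entire MV value is `null` gets replaced; an array that merely contains a `null` element is returned as-is.

```java
import java.util.Arrays;

final class ElementNullPassThroughSketch {

  /**
   * Whole-value-granular substitution for an MV column: a null array is replaced by a
   * one-element array holding the default, while any non-null array is returned
   * unchanged, including arrays that contain null elements.
   */
  static Object[] substituteWholeValueNull(Object[] values, Object defaultNullValue) {
    return values != null ? values : new Object[]{defaultNullValue};
  }

  public static void main(String[] args) {
    Object[] wholeNull = null;
    Object[] partialNull = new Object[]{1, null, 3};

    // The whole-null row is replaced by a single-element array with the default.
    System.out.println(Arrays.toString(substituteWholeValueNull(wholeNull, 0)));

    // The element-level null survives: the same array instance comes back.
    System.out.println(Arrays.toString(substituteWholeValueNull(partialNull, 0)));
  }
}
```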
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

