bersprockets commented on code in PR #39117:
URL: https://github.com/apache/spark/pull/39117#discussion_r1051693056
##########
sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/UnsafeArrayWriter.java:
##########
@@ -191,4 +192,17 @@ public void write(int ordinal, Decimal input, int
precision, int scale) {
setNull(ordinal);
}
}
+
+ @Override
+ public void write(int ordinal, CalendarInterval input) {
Review Comment:
Inspired by how`InterpretedUnsafeProjection` already handles decimals, I
changed `InterpretedUnsafeProjection` to always call `unsafeWriter.write` for
calendar interval values, regardless of whether the input value is null. This
broke the case where the calendar interval value is contained in an array (or
map). In that case, the base implementation (`UnsafeWriter.write(int,
CalendarInterval)`) did not properly set null values (it stomped on the
numElements field).
To fix it, again I looked at how decimals are treated, and therefore added
an override of `write(int, CalendarInterval)` to `UnsafeArrayWriter`.
Note, the previous lack of this override was not a problem for generated
code. That's because `GenerateUnsafeProjection` handles decimals and intervals
that are contained within an array different than decimals and intervals that
are at the top level or contained in a struct.
In the _array_ case, `GenerateUnsafeProjection` generates code that
explicitly sets null when the input value is null:
```java
/* 054 */ for (int index_0 = 0; index_0 < numElements_0; index_0++) {
/* 055 */
/* 056 */ if (tmpInput_0.isNullAt(index_0)) {
/* 057 */ mutableStateArray_1[0].setNull8Bytes(index_0);
/* 058 */ } else {
/* 059 */ mutableStateArray_1[0].write(index_0,
tmpInput_0.getInterval(index_0));
/* 060 */ }
/* 061 */
/* 062 */ }
```
Compare to the top-level/struct case, where the generated code always uses
`writer.write`:
```java
/* 033 */ boolean isNull_0 = i.isNullAt(0);
/* 034 */ CalendarInterval value_0 = isNull_0 ?
/* 035 */ null : (i.getInterval(0));
/* 036 */ if (isNull_0) {
/* 037 */ mutableStateArray_0[0].write(0, (CalendarInterval) null);
/* 038 */ } else {
/* 039 */ mutableStateArray_0[0].write(0, value_0);
/* 040 */ }
```
This means that `InterpretedUnsafeProjection` and `GenerateUnsafeProjection`
have slightly different logic in the handling of null decimal or null calendar
interval values in arrays (and maps): `InterpretedUnsafeProjection` always
calls `writer.write`, and `GenerateUnsafeProjection` always calls
`writer.write` only for the top-level/struct case.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]