jerry-024 commented on code in PR #7933:
URL: https://github.com/apache/paimon/pull/7933#discussion_r3301852575
##########
paimon-flink/paimon-flink-common/src/main/java/org/apache/paimon/flink/globalindex/GenericIndexTopoBuilder.java:
##########
@@ -626,16 +693,20 @@ public void processElement(StreamRecord<ShardTask> element) throws Exception {
}
// Only write rows within this shard's range
if (currentRowId >= task.shardRange.from) {
- Object fieldData = indexFieldGetter.getFieldOrNull(row);
- if (fieldData == null) {
- LOG.info(
- "Null vector at rowId={}, stopping shard [{}, {}].",
- currentRowId,
- task.shardRange.from,
- task.shardRange.to);
- break;
+ if (multiColumn) {
+ ((GlobalIndexMultiColumnWriter) indexWriter).write(row);
Review Comment:
Bug: missing null check in multi-column path.
The single-column path below checks the indexed field for null and stops the
shard with `break`, and `ESIndexTopoBuilder.BuildESIndexOperator.processElement()`
in this same PR also checks all columns for null before writing. But this
multi-column path writes `row` directly with no null check.
Known implementations like `LuminaVectorGlobalIndexWriter.write()` throw
`IllegalArgumentException` on null input, so a multi-column index containing a
vector column will crash the Flink job if any row has a null value in the
indexed columns.
Suggested fix: add the same null-field guard that `ESIndexTopoBuilder` has:
```java
if (multiColumn) {
    // Stop the shard if any indexed column is null, as the single-column path does.
    boolean hasNull = false;
    for (InternalRow.FieldGetter getter : indexFieldGetters) {
        if (getter.getFieldOrNull(row) == null) {
            hasNull = true;
            break;
        }
    }
    if (hasNull) {
        LOG.info(
                "Null value in indexed columns at rowId={}, stopping shard [{}, {}].",
                currentRowId,
                task.shardRange.from,
                task.shardRange.to);
        break;
    }
    ((GlobalIndexMultiColumnWriter) indexWriter).write(row);
}
```
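If it helps to keep the two paths consistent, the guard could also be pulled into a small helper. The following is only a standalone sketch of that idea, not code from this PR: `NullGuardSketch` and `hasNullField` are hypothetical names, and `java.util.function.Function` stands in for `InternalRow.FieldGetter` so the example compiles without Paimon on the classpath.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

/** Hypothetical standalone sketch of the null-field guard; not part of the PR. */
public class NullGuardSketch {

    /**
     * Returns true if any getter yields null for the given row.
     * Function<R, Object> is a stand-in for InternalRow.FieldGetter (assumption).
     */
    public static <R> boolean hasNullField(R row, List<Function<R, Object>> getters) {
        for (Function<R, Object> getter : getters) {
            if (getter.apply(row) == null) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Rows are modeled as Object[] purely for illustration.
        Function<Object[], Object> first = r -> r[0];
        Function<Object[], Object> second = r -> r[1];
        List<Function<Object[], Object>> getters = Arrays.asList(first, second);

        // A row with all indexed fields present, and one with a null field.
        System.out.println(hasNullField(new Object[] {1, "vec"}, getters));
        System.out.println(hasNullField(new Object[] {1, null}, getters));
    }
}
```

With a helper of this shape, the multi-column branch would reduce to a single check before `write(row)`, with the logging and `break` staying at the call site as in the suggested fix.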