hudi-agent commented on code in PR #18877:
URL: https://github.com/apache/hudi/pull/18877#discussion_r3345317570
##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableSource.java:
##########
@@ -678,7 +678,8 @@ private MergeOnReadInputFormat mergeOnReadInputFormat(
this.limit == NO_LIMIT_CONSTANT ? Long.MAX_VALUE : this.limit, // ParquetInputFormat always uses the limit value
getParquetConf(this.conf, this.hadoopConf.unwrap()),
this.conf.get(FlinkOptions.READ_UTC_TIMEZONE),
- this.internalSchemaManager
+ this.internalSchemaManager,
+ tableSchema
);
}
Review Comment:
🤖 [Line 687] Should `inferSchemaFromDdl()` also forward
`conf.get(FlinkOptions.VECTOR_COLUMNS)` to the converter? Today it calls the
1-arg `convertToSchema(this.tableRowType)`, so when a stream reader starts
against an empty table created with `hoodie.vector.columns=...`, the inferred
schema loses the VECTOR metadata and subsequent reads of written files may fail
to decode the FIXED_LEN_BYTE_ARRAY payload.
<sub><i>- AI-generated; verify before applying. React 👍/👎 to flag
quality.</i></sub>
##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/clustering/ClusteringOperator.java:
##########
@@ -169,6 +169,7 @@ public void open() throws Exception {
this.table = writeClient.getHoodieTable();
this.schema = HoodieSchemaConverter.convertToSchema(rowType);
+ this.writeConfig.setSchema(schema.toString());
Review Comment:
🤖 This override drops VECTOR metadata from `writeConfig.schema`. `rowType`
here is ultimately built from
`HoodieSchemaConverter.convertToDataType(tableSchema)` (see
`HoodieFlinkClusteringJob#cluster`), which maps VECTOR fields back into
`ARRAY<FLOAT/DOUBLE/TINYINT>`. After `convertToSchema(rowType)` the schema
stored here is a plain ARRAY schema. Since
`HoodieRowDataCreateHandle#createNewFileWriter` now reads the writer schema
from `config.getSchema()`, clustering would then rewrite VECTOR columns as
regular LIST parquet types and drop the `hoodie.vector.columns` footer key.
Could you verify this with a clustering round-trip on a VECTOR table? One option
is to derive the writer schema from `tableSchema` (with metadata fields added)
instead of from `rowType`.
<sub><i>- AI-generated; verify before applying. React 👍/👎 to flag
quality.</i></sub>
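
To illustrate the round-trip concern in the comment above, here is a toy sketch. It uses no Hudi or Flink classes: `SchemaType`, `toEngineType`, and `fromEngineType` are hypothetical stand-ins for the table schema, `convertToDataType`, and `convertToSchema`. It only shows why a conversion through a type system without a VECTOR kind cannot restore that kind on the way back.

```java
final class VectorRoundTrip {

  enum Kind { VECTOR, ARRAY }

  // Toy schema type: a kind plus a fixed dimension (0 when the type is not a vector).
  static final class SchemaType {
    final Kind kind;
    final int dimension;

    SchemaType(Kind kind, int dimension) {
      this.kind = kind;
      this.dimension = dimension;
    }
  }

  // Schema -> engine type. Both kinds surface as the same engine type,
  // and the dimension is not carried over.
  static String toEngineType(SchemaType type) {
    return "ARRAY<FLOAT>";
  }

  // Engine type -> schema. Nothing in "ARRAY<FLOAT>" records that the
  // column used to be a VECTOR, so the result is always a plain ARRAY.
  static SchemaType fromEngineType(String engineType) {
    if (!"ARRAY<FLOAT>".equals(engineType)) {
      throw new IllegalArgumentException("Unsupported engine type: " + engineType);
    }
    return new SchemaType(Kind.ARRAY, 0);
  }
}
```

Passing a `VECTOR` of dimension 128 through both functions yields a plain `ARRAY`, which is why the comment suggests deriving the writer schema from the original table schema rather than from the converted row type.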
##########
hudi-flink-datasource/hudi-flink1.17.x/src/main/java/org/apache/hudi/table/format/cow/ParquetSplitReaderUtil.java:
##########
@@ -365,7 +365,15 @@ private static ColumnReader createColumnReader(
case VARCHAR:
case BINARY:
case VARBINARY:
- return new BytesColumnReader(descriptor, pageReader);
+ switch (descriptor.getPrimitiveType().getPrimitiveTypeName()) {
+ case BINARY:
+ return new BytesColumnReader(descriptor, pageReader);
+ case FIXED_LEN_BYTE_ARRAY:
+ return new FixedLenBytesColumnReader(
+ descriptor, pageReader);
+ default:
+ throw new AssertionError();
Review Comment:
🤖 Minor: the new `default: throw new AssertionError();` in the
CHAR/VARCHAR/BINARY/VARBINARY switch has no message. If a parquet file ever
stores one of these logical types as, say, INT32 or INT64 (uncommon, but the
prior `BytesColumnReader` would at least have surfaced a typed error), the bare
AssertionError would be hard to debug. Could you include the primitive type name
in the message, following the existing `getPrimitiveTypeCheckFailureMessage`
pattern used in `createWritableColumnVector`? The same applies to the matching
changes in 1.18.x, 1.19.x, 1.20.x, 2.0.x, and 2.1.x.
<sub><i>- AI-generated; verify before applying. React 👍/👎 to flag
quality.</i></sub>
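
A minimal sketch of what a descriptive default branch could look like. The `PhysicalType` enum and the returned reader names are hypothetical stand-ins so the snippet compiles on its own. It does not use the Parquet or Flink classes from the diff, and the message wording is only a suggestion.

```java
final class ColumnReaderChooser {

  // Stand-in for a Parquet primitive type name (assumption for this sketch).
  enum PhysicalType { BINARY, FIXED_LEN_BYTE_ARRAY, INT32, INT64 }

  // Returns the name of the reader that would be created for a bytes-like logical type.
  static String chooseBytesReader(String logicalType, PhysicalType physicalType) {
    switch (physicalType) {
      case BINARY:
        return "BytesColumnReader";
      case FIXED_LEN_BYTE_ARRAY:
        return "FixedLenBytesColumnReader";
      default:
        // Name both types so the failure can be diagnosed from the stack trace alone.
        throw new AssertionError(
            "Unexpected physical type " + physicalType + " for logical type " + logicalType
                + "; expected BINARY or FIXED_LEN_BYTE_ARRAY.");
    }
  }
}
```

With this shape, a VARCHAR column backed by INT32 fails with a message that names both types instead of an empty `AssertionError`.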
##########
hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/io/storage/row/parquet/ParquetSchemaConverter.java:
##########
@@ -367,14 +402,24 @@ private static Type convertToParquetType(
Types
.repeatedGroup()
.addField(convertToParquetType("key", keyType, Type.Repetition.REQUIRED))
- .addField(convertToParquetType("value", valueType, valueType.isNullable() ? Type.Repetition.OPTIONAL : Type.Repetition.REQUIRED))
+ .addField(convertToParquetType(
+ "value",
Review Comment:
🤖 nit: it might be worth adding an error message to this `checkArgument`,
the same way the analogous check in `ParquetRowDataWriter` does (`"Hoodie
schema should be RECORD type."`). Without a message, the resulting
`IllegalArgumentException` gives no hint about what went wrong.
<sub><i>- AI-generated; verify before applying. React 👍/👎 to flag
quality.</i></sub>
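
For illustration, a small sketch of a precondition check that carries a message. The `checkArgument` helper below is a local stand-in defined in the snippet, not the Hudi validation utility, and `requireRecord` is a hypothetical caller. The message reuses the wording quoted in the comment and appends the offending value.

```java
final class ArgumentChecks {

  // Local stand-in for a validation helper (assumption for this sketch).
  static void checkArgument(boolean condition, String message) {
    if (!condition) {
      throw new IllegalArgumentException(message);
    }
  }

  // Hypothetical caller: accepts only the RECORD schema type name.
  static String requireRecord(String schemaType) {
    checkArgument("RECORD".equals(schemaType),
        "Hoodie schema should be RECORD type, but got: " + schemaType);
    return schemaType;
  }
}
```

Calling `requireRecord("ARRAY")` then fails with an `IllegalArgumentException` whose message states both the expectation and the actual value.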
##########
hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/io/storage/row/HoodieRowDataParquetWriteSupport.java:
##########
@@ -52,6 +52,11 @@ public WriteSupport.FinalizedWriteContext finalizeWrite() {
Map<String, String> extraMetadata =
bloomFilterWriteSupportOpt.map(HoodieBloomFilterWriteSupport::finalizeMetadata)
.orElse(Collections.emptyMap());
+ String vectorColumnsMetadata = HoodieSchema.buildVectorColumnsMetadataValue(hoodieSchema);
+ if (!vectorColumnsMetadata.isEmpty()) {
+ extraMetadata = new java.util.HashMap<>(extraMetadata);
Review Comment:
🤖 nit: could you add an import for `HashMap` rather than using the
fully-qualified `java.util.HashMap<>` inline? Fully-qualified names in method
bodies are pretty unusual and stand out to readers.
<sub><i>- AI-generated; verify before applying. React 👍/👎 to flag
quality.</i></sub>
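
A sketch of the same step with a regular import, assuming nothing beyond the JDK. `FooterMetadata` and `withVectorColumns` are hypothetical names, and the sample value passed by callers is a placeholder rather than the real metadata format. The copy matters because `Collections.emptyMap()` returns an immutable map whose `put` throws `UnsupportedOperationException`.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

final class FooterMetadata {

  // Returns the base metadata unchanged when there is nothing to add; otherwise
  // returns a mutable copy with the vector entry added. The input is never modified.
  static Map<String, String> withVectorColumns(Map<String, String> base, String vectorColumns) {
    if (vectorColumns.isEmpty()) {
      return base;
    }
    // Copy first: the base map may be immutable (for example Collections.emptyMap()).
    Map<String, String> copy = new HashMap<>(base);
    copy.put("hoodie.vector.columns", vectorColumns);
    return copy;
  }
}
```

For example, `FooterMetadata.withVectorColumns(Collections.emptyMap(), "embedding:128")` returns a one-entry map, while an empty value returns the original map untouched.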
##########
hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/util/HoodieSchemaConverter.java:
##########
@@ -238,6 +280,97 @@ public static HoodieSchema convertToSchema(LogicalType logicalType, String rowNa
return nullable ? HoodieSchema.createNullable(schema) : schema;
}
+ private static Map<String, Integer> parseVectorColumns(String vectorColumns) {
+ Map<String, Integer> parsed = new LinkedHashMap<>();
+ for (String rawEntry : vectorColumns.split(",")) {
+ String entry = rawEntry.trim();
+ if (entry.isEmpty()) {
+ continue;
+ }
+ String[] parts = entry.split(":", -1);
+ if (parts.length > 2 || parts[0].trim().isEmpty()) {
+ throw new IllegalArgumentException(
+ "Invalid VECTOR column descriptor '" + entry + "'. Expected format: columnName[:dimension].");
+ }
+ String columnName = parts[0].trim();
+ String normalizedColumnName = columnName.toLowerCase(Locale.ROOT);
+ if (parsed.containsKey(normalizedColumnName)) {
+ throw new IllegalArgumentException("Duplicate VECTOR column descriptor for column: " + columnName);
+ }
+ int dimension = DEFAULT_VECTOR_DIMENSION;
+ if (parts.length == 2) {
+ String dimensionText = parts[1].trim();
+ if (dimensionText.isEmpty()) {
+ throw new IllegalArgumentException(
+ "Invalid VECTOR column descriptor '" + entry + "'. Dimension must not be empty.");
+ }
+ try {
+ dimension = Integer.parseInt(dimensionText);
+ } catch (NumberFormatException e) {
+ throw new IllegalArgumentException("Invalid VECTOR dimension for column '" + columnName + "': " + dimensionText, e);
+ }
+ }
+ if (dimension <= 0) {
+ throw new IllegalArgumentException("VECTOR dimension must be positive for column '" + columnName + "': " + dimension);
+ }
+ parsed.put(normalizedColumnName, dimension);
+ }
+ return parsed;
+ }
+
+ private static HoodieSchema convertVectorField(String fieldName, LogicalType fieldType, int dimension) {
+ if (!(fieldType instanceof ArrayType)) {
+ throw new IllegalArgumentException(
+ "VECTOR column '" + fieldName + "' must be declared as ARRAY<FLOAT>, ARRAY<DOUBLE>, or ARRAY<TINYINT>, but got: "
+ + fieldType.asSummaryString());
+ }
+ HoodieSchema.Vector.VectorElementType elementType = inferVectorElementType(fieldName, ((ArrayType) fieldType).getElementType());
+ HoodieSchema vectorSchema = HoodieSchema.createVector(dimension, elementType);
+ return fieldType.isNullable() ? HoodieSchema.createNullable(vectorSchema) : vectorSchema;
+ }
+
+ private static HoodieSchema.Vector.VectorElementType inferVectorElementType(String fieldName, LogicalType elementType) {
+ switch (elementType.getTypeRoot()) {
+ case FLOAT:
+ return HoodieSchema.Vector.VectorElementType.FLOAT;
+ case DOUBLE:
+ return HoodieSchema.Vector.VectorElementType.DOUBLE;
+ case TINYINT:
+ return HoodieSchema.Vector.VectorElementType.INT8;
+ default:
+ throw new IllegalArgumentException(
+ "VECTOR column '" + fieldName + "' must use ARRAY<FLOAT>, ARRAY<DOUBLE>, or ARRAY<TINYINT>, but got ARRAY<"
+ + elementType.asSummaryString() + ">.");
+ }
+ }
+
+ /**
+ * Validates that all configured VECTOR column descriptors resolve to top-level fields.
+ *
+ * <p>Callers that accept {@code hoodie.vector.columns} as a table option should run this
+ * validation before schema inference so an unknown column is rejected instead of being
+ * silently ignored by schema conversion.
+ *
+ * @param logicalType Flink logical type
+ * @param vectorColumnsStr comma-separated vector column descriptors, or null/empty
+ */
+ public static void validateVectorColumns(LogicalType logicalType, String vectorColumnsStr) {
+ if (vectorColumnsStr == null || vectorColumnsStr.trim().isEmpty()) {
+ return;
+ }
+ Map<String, Integer> vectorColumns = parseVectorColumns(vectorColumnsStr);
+ List<String> fieldNames = ((RowType) logicalType).getFieldNames();
+ List<String> normalizedFieldNames = fieldNames.stream()
+ .map(fieldName -> fieldName.toLowerCase(Locale.ROOT))
+ .collect(Collectors.toList());
+ vectorColumns.keySet().stream()
Review Comment:
🤖 nit: throwing inside `ifPresent` is a bit unusual — have you considered
extracting the `Optional` into a variable and using a plain `if
(opt.isPresent()) throw ...`? It reads more naturally and makes the intent
immediately clear.
<sub><i>- AI-generated; verify before applying. React 👍/👎 to flag
quality.</i></sub>
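
One way the suggested shape could read, as a self-contained sketch. `VectorColumnCheck` and its parameters are hypothetical. The method assumes the configured column names are already lower-cased, as the parsing code in the diff does, and the error wording is illustrative only.

```java
import java.util.List;
import java.util.Locale;
import java.util.Optional;
import java.util.Set;
import java.util.stream.Collectors;

final class VectorColumnCheck {

  // Rejects the first configured VECTOR column that is not a top-level field.
  static void validate(List<String> fieldNames, Set<String> vectorColumns) {
    Set<String> normalized = fieldNames.stream()
        .map(name -> name.toLowerCase(Locale.ROOT))
        .collect(Collectors.toSet());
    Optional<String> unknown = vectorColumns.stream()
        .filter(column -> !normalized.contains(column))
        .findFirst();
    // A plain if keeps the throw visible at the top level of the method.
    if (unknown.isPresent()) {
      throw new IllegalArgumentException(
          "VECTOR column '" + unknown.get() + "' is not a top-level field of: " + fieldNames);
    }
  }
}
```

Keeping the `Optional` in a local variable also makes it easy to log or inspect the offending column in a debugger before the exception is thrown.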
##########
hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/util/VectorConversionUtils.java:
##########
@@ -0,0 +1,338 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.util;
+
+import org.apache.hudi.common.schema.HoodieSchema;
+import org.apache.hudi.common.schema.HoodieSchemaType;
+import org.apache.hudi.common.util.HoodieVectorUtils;
+import org.apache.hudi.common.util.ValidationUtils;
+import org.apache.hudi.common.util.collection.ClosableIterator;
+import org.apache.hudi.common.util.collection.CloseableMappingIterator;
+import org.apache.hudi.exception.SchemaCompatibilityException;
+
+import org.apache.flink.table.api.DataTypes;
+import org.apache.flink.table.data.ArrayData;
+import org.apache.flink.table.data.GenericArrayData;
+import org.apache.flink.table.data.GenericRowData;
+import org.apache.flink.table.data.RowData;
+import org.apache.flink.table.types.DataType;
+import org.apache.flink.table.types.logical.ArrayType;
+import org.apache.flink.table.types.logical.LogicalType;
+import org.apache.flink.table.types.logical.LogicalTypeRoot;
+import org.apache.flink.table.types.logical.RowType;
+
+import java.nio.ByteBuffer;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.LinkedHashMap;
+import java.util.List;
+import java.util.Locale;
+import java.util.Map;
+import java.util.stream.Collectors;
+
+/**
+ * Utilities for reading Hoodie VECTOR columns into Flink {@link RowData}.
+ *
+ * <p>VECTOR columns are stored as fixed-length bytes in parquet/log files.
The Flink reader
+ * reads those physical bytes first and then decodes them into Flink array
data according to
+ * the vector element type declared in {@link HoodieSchema.Vector}.
+ */
+public final class VectorConversionUtils {
+
+ private VectorConversionUtils() {
+ }
+
+ /**
+ * Detects VECTOR columns in the selected read projection.
+ *
+ * <p>The returned map is keyed by the projected field ordinal, not the
ordinal in the full
+ * table schema. This matches the ordinal layout of the {@link RowData}
emitted by the reader.
+ *
+ * @param fullFieldNames field names in the full query/table row type
+ * @param selectedFields ordinals selected from {@code fullFieldNames}
+ * @param tableSchema hoodie table schema containing logical VECTOR type
metadata
+ * @return projected ordinal to vector schema information
+ */
+ public static Map<Integer, HoodieSchema.Vector> detectVectorColumns(
+ String[] fullFieldNames,
+ int[] selectedFields,
+ HoodieSchema tableSchema) {
+ Map<String, HoodieSchema.Vector> vectorFields =
getVectorFields(tableSchema);
+ Map<Integer, HoodieSchema.Vector> vectorColumnInfo = new LinkedHashMap<>();
+ for (int i = 0; i < selectedFields.length; i++) {
+ HoodieSchema.Vector vector =
vectorFields.get(fullFieldNames[selectedFields[i]].toLowerCase(Locale.ROOT));
+ if (vector != null) {
+ vectorColumnInfo.put(i, vector);
+ }
+ }
+ return vectorColumnInfo;
+ }
+
+ /**
+ * Rewrites VECTOR fields in a requested row data type to BYTES for parquet
reads.
+ *
+ * <p>Parquet stores VECTOR values as fixed-length byte arrays. This method
keeps the requested
+ * row shape unchanged, but asks the physical parquet reader to materialize
VECTOR columns as
+ * bytes. The resulting rows should be passed through
+ * {@link #wrapVectorColumnIterator(ClosableIterator, RowType, Map)} to
decode the bytes into
+ * array data.
+ *
+ * @param dataType requested row data type
+ * @param requestedSchema requested Hudi schema
+ * @param vectorColumnInfo projected VECTOR column metadata
+ * @return row data type to use for the physical parquet read
+ */
+ public static DataType getParquetReadDataType(
+ DataType dataType,
+ HoodieSchema requestedSchema,
+ Map<Integer, HoodieSchema.Vector> vectorColumnInfo) {
+ if (vectorColumnInfo.isEmpty()) {
+ return dataType;
+ }
+
+ RowType rowType = (RowType) dataType.getLogicalType();
+ List<RowType.RowField> readFields = new
ArrayList<>(rowType.getFields().size());
+ for (RowType.RowField field : rowType.getFields()) {
+ readFields.add(field.copy());
+ }
+ for (Integer requestedOrdinal : vectorColumnInfo.keySet()) {
+ String fieldName =
requestedSchema.getFields().get(requestedOrdinal).name();
+ int fieldOrdinal = rowType.getFieldIndex(fieldName);
+ if (fieldOrdinal >= 0) {
+ RowType.RowField field = readFields.get(fieldOrdinal);
+ readFields.set(fieldOrdinal, newRowField(field,
bytesType(field.getType())));
+ }
+ }
+ return DataTypes.of(new RowType(rowType.isNullable(), readFields));
+ }
+
+ /**
+ * Rewrites VECTOR field types in the full field type array to BYTES for
parquet reads.
+ *
+ * <p>This variant is used by copy-on-write input format code paths that
keep the full field
+ * type array and apply projection separately.
+ *
+ * @param fullFieldNames field names in the full query/table row type
+ * @param fullFieldTypes field types corresponding to {@code fullFieldNames}
+ * @param tableSchema hoodie table schema containing logical VECTOR type
metadata
+ * @return field types to use for the physical parquet read
+ */
+ public static DataType[] getParquetReadFieldTypes(
+ String[] fullFieldNames,
+ DataType[] fullFieldTypes,
+ HoodieSchema tableSchema) {
+ Map<String, HoodieSchema.Vector> vectorFields =
getVectorFields(tableSchema);
+ if (vectorFields.isEmpty()) {
+ return fullFieldTypes;
+ }
+ DataType[] readFieldTypes = Arrays.copyOf(fullFieldTypes,
fullFieldTypes.length);
+ for (int i = 0; i < fullFieldNames.length; i++) {
+ if
(vectorFields.containsKey(fullFieldNames[i].toLowerCase(Locale.ROOT))) {
+ readFieldTypes[i] =
DataTypes.of(bytesType(fullFieldTypes[i].getLogicalType()));
+ }
+ }
+ return readFieldTypes;
+ }
+
+ /**
+ * Wraps a row iterator and decodes VECTOR byte values into Flink array
values.
+ *
+ * <p>{@code rowType} must describe the physical rows emitted by {@code
rowDataItr}, where VECTOR
+ * columns have already been read as BYTES. Non-vector fields are copied
with type-aware Flink
+ * field getters to preserve their original representation.
+ *
+ * @param rowDataItr physical row iterator
+ * @param rowType row type of the physical rows emitted by {@code
rowDataItr}
+ * @param vectorColumnInfo projected VECTOR column metadata keyed by row
ordinal
+ * @return iterator emitting rows with VECTOR columns decoded as arrays
+ */
+ public static ClosableIterator<RowData> wrapVectorColumnIterator(
+ ClosableIterator<RowData> rowDataItr,
+ RowType rowType,
+ Map<Integer, HoodieSchema.Vector> vectorColumnInfo) {
+ RowData.FieldGetter[] fieldGetters = createFieldGetters(rowType);
+ return new CloseableMappingIterator<>(
+ rowDataItr, rowData -> convertVectorColumns(rowData, fieldGetters,
vectorColumnInfo));
+ }
+
+ /**
+ * Wraps a projected row iterator and decodes VECTOR byte values into Flink
array values.
+ *
+ * <p>The selected physical row type is derived from {@code fullFieldTypes}
and
+ * {@code selectedFields}, then delegated to {@link
#wrapVectorColumnIterator(ClosableIterator, RowType, Map)}.
+ *
+ * @param rowDataItr physical row iterator
+ * @param fullFieldTypes field types before projection
+ * @param selectedFields ordinals selected from {@code fullFieldTypes}
+ * @param vectorColumnInfo projected VECTOR column metadata keyed by row
ordinal
+ * @return iterator emitting rows with VECTOR columns decoded as arrays
+ */
+ public static ClosableIterator<RowData> wrapVectorColumnIterator(
+ ClosableIterator<RowData> rowDataItr,
+ DataType[] fullFieldTypes,
+ int[] selectedFields,
+ Map<Integer, HoodieSchema.Vector> vectorColumnInfo) {
+ return wrapVectorColumnIterator(rowDataItr, createRowType(fullFieldTypes,
selectedFields), vectorColumnInfo);
+ }
+
+ /**
+ * Returns VECTOR fields keyed by lower-cased field name.
+ */
+ private static Map<String, HoodieSchema.Vector> getVectorFields(HoodieSchema
tableSchema) {
+ return tableSchema.getFields().stream()
+ .filter(field -> field.schema().getNonNullType().getType() ==
HoodieSchemaType.VECTOR)
+ .collect(Collectors.toMap(
+ field -> field.name().toLowerCase(Locale.ROOT),
+ field -> (HoodieSchema.Vector) field.schema().getNonNullType()));
+ }
+
+ /**
+ * Creates a BYTES logical type while preserving the original nullability.
+ */
+ private static LogicalType bytesType(LogicalType originalType) {
+ return DataTypes.BYTES().getLogicalType().copy(originalType.isNullable());
+ }
+
+ /**
+ * Creates a row field with a replacement logical type while preserving
metadata.
+ */
+ private static RowType.RowField newRowField(RowType.RowField field,
LogicalType type) {
+ return field.getDescription()
+ .map(description -> new RowType.RowField(field.getName(), type,
description))
+ .orElseGet(() -> new RowType.RowField(field.getName(), type));
+ }
+
+ /**
+ * Converts VECTOR columns in one row from physical bytes to Flink array
data.
+ */
+ private static RowData convertVectorColumns(
+ RowData rowData,
+ RowData.FieldGetter[] fieldGetters,
+ Map<Integer, HoodieSchema.Vector> vectorColumnInfo) {
+ GenericRowData converted = new GenericRowData(rowData.getArity());
+ converted.setRowKind(rowData.getRowKind());
+ for (int i = 0; i < rowData.getArity(); i++) {
+ if (rowData.isNullAt(i)) {
+ converted.setField(i, null);
+ } else if (vectorColumnInfo.containsKey(i)) {
+ converted.setField(i, createVectorArrayData(rowData.getBinary(i),
vectorColumnInfo.get(i)));
+ } else {
+ converted.setField(i, fieldGetters[i].getFieldOrNull(rowData));
+ }
+ }
+ return converted;
+ }
+
+ /**
+ * Creates type-aware field getters for the physical row type.
+ */
+ private static RowData.FieldGetter[] createFieldGetters(RowType rowType) {
+ RowData.FieldGetter[] fieldGetters = new
RowData.FieldGetter[rowType.getFieldCount()];
+ for (int i = 0; i < fieldGetters.length; i++) {
+ fieldGetters[i] = RowData.createFieldGetter(rowType.getTypeAt(i), i);
+ }
+ return fieldGetters;
+ }
+
+ /**
+ * Builds a projected row type from full field types and selected ordinals.
+ */
+ private static RowType createRowType(DataType[] fullFieldTypes, int[]
selectedFields) {
+ List<RowType.RowField> rowFields = new ArrayList<>(selectedFields.length);
+ for (int i = 0; i < selectedFields.length; i++) {
+ rowFields.add(new RowType.RowField(String.valueOf(i),
fullFieldTypes[selectedFields[i]].getLogicalType()));
+ }
+ return new RowType(rowFields);
+ }
+
+ /**
+ * Wraps a decoded primitive vector array as Flink array data.
+ */
+ public static GenericArrayData createVectorArrayData(byte[] bytes, HoodieSchema.Vector vectorSchema) {
+ Object vectorArray = HoodieVectorUtils.decodeVectorBytes(bytes, vectorSchema);
+ if (vectorArray instanceof float[]) {
+ return new GenericArrayData((float[]) vectorArray);
+ } else if (vectorArray instanceof double[]) {
+ return new GenericArrayData((double[]) vectorArray);
+ } else if (vectorArray instanceof byte[]) {
+ return new GenericArrayData((byte[]) vectorArray);
+ }
+ throw new UnsupportedOperationException("Unsupported decoded vector array type: " + vectorArray.getClass());
+ }
+
+ /**
+ * Encodes Flink array data into the canonical Hudi VECTOR fixed-bytes representation.
+ */
+ public static byte[] encodeVectorArrayData(ArrayData arrayData, HoodieSchema.Vector vectorSchema) {
+ int dimension = vectorSchema.getDimension();
+ HoodieSchema.Vector.VectorElementType elementType = vectorSchema.getVectorElementType();
+ ValidationUtils.checkArgument(arrayData.size() == dimension,
+ () -> "Vector dimension mismatch: schema expects " + dimension + " elements but got " + arrayData.size());
+ int bufferSize = Math.multiplyExact(dimension, elementType.getElementSize());
+ ByteBuffer buffer = ByteBuffer.allocate(bufferSize).order(HoodieSchema.VectorLogicalType.VECTOR_BYTE_ORDER);
+ switch (elementType) {
+ case FLOAT:
+ for (int i = 0; i < dimension; i++) {
+ buffer.putFloat(arrayData.getFloat(i));
Review Comment:
🤖 `encodeVectorArrayData` doesn't check `arrayData.isNullAt(i)` before
calling `getFloat`/`getDouble`/`getByte`. Since
`HoodieSchemaConverter.convertVectorField` accepts `ARRAY<FLOAT>` with nullable
elements (the default for Flink DDL `ARRAY<FLOAT>`), a row with a null element
inside the vector would be silently encoded as `0.0` instead of being rejected.
Worth either rejecting nullable element types in `inferVectorElementType` or
guarding here with an explicit null check that throws.
<sub><i>- AI-generated; verify before applying. React 👍/👎 to flag
quality.</i></sub>
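
A rough sketch of the guarded variant, written against plain JDK types so it runs on its own. A boxed `Float[]` stands in for Flink's `ArrayData`, the little-endian byte order is an assumption of this sketch rather than the value of `VECTOR_BYTE_ORDER`, and `FloatVectorEncoder` is a hypothetical name. In the real method the equivalent guard would be the `arrayData.isNullAt(i)` check named in the comment.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class FloatVectorEncoder {

  // Encodes a float vector into fixed-length bytes, rejecting null elements.
  static byte[] encode(Float[] values, int dimension) {
    if (values.length != dimension) {
      throw new IllegalArgumentException(
          "Vector dimension mismatch: schema expects " + dimension
              + " elements but got " + values.length);
    }
    // Byte order is an assumption of this sketch.
    ByteBuffer buffer = ByteBuffer.allocate(Math.multiplyExact(dimension, Float.BYTES))
        .order(ByteOrder.LITTLE_ENDIAN);
    for (int i = 0; i < dimension; i++) {
      // Fail loudly instead of letting a null element turn into 0.0.
      if (values[i] == null) {
        throw new IllegalArgumentException("VECTOR element at index " + i + " must not be null.");
      }
      buffer.putFloat(values[i]);
    }
    return buffer.array();
  }
}
```

The alternative from the comment, rejecting nullable element types during schema inference, would move the failure to table creation time. The per-row guard shown here catches the problem only when a bad row is actually written.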