Zouxxyy commented on code in PR #5905:
URL: https://github.com/apache/paimon/pull/5905#discussion_r2249747912

##########
paimon-common/src/main/java/org/apache/paimon/data/variant/GenericVariantBuilder.java:
##########
@@ -274,7 +274,7 @@ public void appendTimestampNtz(long microsSinceEpoch) {
     public void appendFloat(float f) {
         checkCapacity(1 + 4);
         writeBuffer[writePos++] = primitiveHeader(FLOAT);
-        writeLong(writeBuffer, writePos, Float.floatToIntBits(f), 8);
+        writeLong(writeBuffer, writePos, Float.floatToIntBits(f), 4);

Review Comment:
   Port of [SPARK-52833](https://issues.apache.org/jira/browse/SPARK-52833)


##########
paimon-format/src/main/java/org/apache/paimon/format/parquet/reader/ParquetSplitReaderUtil.java:
##########
@@ -51,98 +52,50 @@ import org.apache.paimon.shade.guava30.com.google.common.collect.ImmutableList;
 
-import org.apache.parquet.ParquetRuntimeException;
-import org.apache.parquet.column.ColumnDescriptor;
 import org.apache.parquet.io.ColumnIO;
 import org.apache.parquet.io.GroupColumnIO;
 import org.apache.parquet.io.MessageColumnIO;
 import org.apache.parquet.io.PrimitiveColumnIO;
 import org.apache.parquet.schema.GroupType;
-import org.apache.parquet.schema.InvalidSchemaException;
 import org.apache.parquet.schema.LogicalTypeAnnotation;
-import org.apache.parquet.schema.PrimitiveType;
 import org.apache.parquet.schema.Type;
 
+import javax.annotation.Nullable;
+
 import java.util.ArrayList;
 import java.util.List;
 import java.util.Objects;
 
-import static org.apache.paimon.utils.Preconditions.checkArgument;
 import static org.apache.parquet.schema.Type.Repetition.REPEATED;
 import static org.apache.parquet.schema.Type.Repetition.REQUIRED;
 
 /** Util for generating parquet readers. */
 public class ParquetSplitReaderUtil {
 
     public static WritableColumnVector createWritableColumnVector(
-            int batchSize,
-            DataType fieldType,
-            Type type,
-            List<ColumnDescriptor> columnDescriptors,

Review Comment:
   `columnDescriptors` is confusing here and is only used for validation, so I just removed it.


##########
paimon-format/src/main/java/org/apache/paimon/format/parquet/reader/ParquetColumnVector.java:
##########
@@ -67,7 +75,20 @@ public class ParquetColumnVector {
             return;
         }
 
-        if (isPrimitive) {
+        if (column.variantFileType().isPresent()) {
+            ParquetField fileContentCol = column.variantFileType().get();
+            WritableColumnVector fileContent =
+                    ParquetSplitReaderUtil.createWritableColumnVector(

Review Comment:
   Here I need a clean `createWritableColumnVector` method, like Spark's `public OffHeapColumnVector(int capacity, DataType type)`. The existing method in Paimon requires a Parquet type, however, so I had no choice but to introduce `parquetType` in `ParquetField`.
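
For context on the `appendFloat` change, here is a minimal standalone sketch of why the width passed to `writeLong` should match the 4 payload bytes reserved by `checkCapacity(1 + 4)`. It is an illustration under assumptions, not the Paimon code: the class `FloatWriteSketch`, the methods `encode` and `overruns`, and the header byte `0x0E` are hypothetical, and `writeLong` is assumed to write `numBytes` bytes in little-endian order.

```java
import java.util.Arrays;

public class FloatWriteSketch {

    // Assumed behavior of the helper in the diff: write the low `numBytes`
    // bytes of `value` into `bytes`, starting at `pos`, in little-endian order.
    public static void writeLong(byte[] bytes, int pos, long value, int numBytes) {
        for (int i = 0; i < numBytes; i++) {
            bytes[pos + i] = (byte) ((value >>> (8 * i)) & 0xFF);
        }
    }

    // Hypothetical encoder: one header byte followed by the float payload,
    // written with the given width into a buffer sized exactly 1 + 4 bytes.
    public static byte[] encode(float f, int width) {
        byte[] buf = new byte[1 + 4]; // the space that checkCapacity(1 + 4) guarantees
        int pos = 0;
        buf[pos++] = 0x0E; // placeholder header, not the real primitiveHeader(FLOAT)
        writeLong(buf, pos, Float.floatToIntBits(f), width);
        return buf;
    }

    // Returns true when writing with the given width runs past the buffer end.
    public static boolean overruns(float f, int width) {
        try {
            encode(f, width);
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Width 4 fits the reserved space.
        System.out.println(Arrays.toString(encode(1.5f, 4)));
        // Width 8 writes past the 5-byte buffer in this sketch.
        System.out.println("width 8 overruns: " + overruns(1.5f, 8));
    }
}
```

In the real builder the buffer is usually larger than the minimum, so an 8-byte write would more often touch bytes beyond the value than throw; the exactly sized buffer here only makes the mismatch visible.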