rahil-c commented on code in PR #18190:
URL: https://github.com/apache/hudi/pull/18190#discussion_r2893040695


##########
hudi-spark-datasource/hudi-spark4.0.x/src/main/scala/org/apache/spark/sql/avro/AvroDeserializer.scala:
##########
@@ -212,6 +215,41 @@ private[sql] class AvroDeserializer(rootAvroType: Schema,
         val decimal = createDecimal(bigDecimal, d.getPrecision, d.getScale)
         updater.setDecimal(ordinal, decimal)
 
+      // Handle VECTOR logical type (FLOAT, DOUBLE, INT8)
+      case (FIXED, ArrayType(elementType, false)) => avroType.getLogicalType match {
+        case vectorLogicalType: VectorLogicalType =>
+          val dimension = vectorLogicalType.getDimension
+          val elementSize = elementType match {
+            case FloatType => 4
+            case DoubleType => 8
+            case ByteType => 1
+            case _ => throw new IncompatibleSchemaException(incompatibleMsg)
+          }
+          (updater, ordinal, value) => {
+            val bytes = value.asInstanceOf[GenericData.Fixed].bytes()
+            val expectedSize = dimension * elementSize
+            if (bytes.length != expectedSize) {
+              throw new IncompatibleSchemaException(
+                s"VECTOR byte size mismatch: expected=$expectedSize, actual=${bytes.length}")
+            }
+            elementType match {
+              case FloatType =>
+                val buffer = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN)

Review Comment:
   @voonhous Thanks for bringing this up. I should have documented this more 
   clearly in the vector type schema PR with a Javadoc comment. Turning the 
   byte order into a shared constant is also a good idea.
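
   A minimal sketch of what such a shared, documented constant could look like for the FLOAT case. The object name `VectorEncoding`, the constant `VectorByteOrder`, and both helper methods are hypothetical names chosen for illustration; they are not taken from the Hudi codebase, and the actual change may be structured differently.

```scala
import java.nio.{ByteBuffer, ByteOrder}

// Hypothetical holder for the shared byte order. The name and location are
// assumptions for illustration only.
object VectorEncoding {
  /** Byte order used when VECTOR elements are packed into Avro FIXED bytes. */
  val VectorByteOrder: ByteOrder = ByteOrder.LITTLE_ENDIAN

  /** Packs floats into a byte array using the shared byte order. */
  def encodeFloats(values: Array[Float]): Array[Byte] = {
    val buffer = ByteBuffer.allocate(values.length * 4).order(VectorByteOrder)
    values.foreach(v => buffer.putFloat(v))
    buffer.array()
  }

  /** Unpacks `dimension` floats, rejecting input of an unexpected size. */
  def decodeFloats(bytes: Array[Byte], dimension: Int): Array[Float] = {
    val expectedSize = dimension * 4
    require(bytes.length == expectedSize,
      s"VECTOR byte size mismatch: expected=$expectedSize, actual=${bytes.length}")
    val buffer = ByteBuffer.wrap(bytes).order(VectorByteOrder)
    Array.fill(dimension)(buffer.getFloat())
  }
}
```

   Routing both the writer and the reader through one constant means the byte order is stated in a single documented place, so the two sides cannot silently drift apart.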


