RussellSpitzer opened a new issue, #4581:
URL: https://github.com/apache/iceberg/issues/4581

   Because of the String -> Fixed Binary conversion, both the readers and the 
writers are incorrect.
   
   The vectorized reader initializes a FixedBinary reader on a column that we 
report as a String, which causes an unsupported reader exception:
   
   ```java
   java.lang.UnsupportedOperationException: Unsupported type: UTF8String
        at org.apache.iceberg.arrow.vectorized.ArrowVectorAccessor.getUTF8String(ArrowVectorAccessor.java:82)
        at org.apache.iceberg.spark.data.vectorized.IcebergArrowColumnVector.getUTF8String(IcebergArrowColumnVector.java:140)
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.sort_addToSorter_0$(Unknown Sour
   ```
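
The reader-side fix is not spelled out in this issue. As a rough sketch only, assuming the column holds 16 big-endian bytes (the same layout the proposed writer further down produces), turning a fixed binary value back into the string Spark expects could look like this. `UuidBytesToString` and `uuidString` are hypothetical names for illustration, not Iceberg APIs.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

public class UuidBytesToString {
  // Hypothetical helper: decode 16 big-endian bytes into the canonical UUID string.
  static String uuidString(byte[] fixed16) {
    if (fixed16.length != 16) {
      throw new IllegalArgumentException("Expected 16 bytes, got " + fixed16.length);
    }
    ByteBuffer buffer = ByteBuffer.wrap(fixed16); // ByteBuffer is big-endian by default
    long mostSigBits = buffer.getLong();
    long leastSigBits = buffer.getLong();
    return new UUID(mostSigBits, leastSigBits).toString();
  }

  public static void main(String[] args) {
    byte[] bytes = new byte[16];
    for (int i = 0; i < bytes.length; i++) {
      bytes[i] = (byte) i; // 0x00, 0x01, ..., 0x0f
    }
    System.out.println(uuidString(bytes)); // 00010203-0405-0607-0809-0a0b0c0d0e0f
  }
}
```

In a real reader the resulting string would presumably still need to be wrapped as a `UTF8String` before being handed to Spark.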
        
   The writer is broken because it receives String columns from Spark but 
needs to write fixed binary.
   
   Something like this is needed as a fix:
   ```java
     private static PrimitiveWriter<UTF8String> uuids(ColumnDescriptor desc) {
       return new UUIDWriter(desc);
     }
   
     private static class UUIDWriter extends PrimitiveWriter<UTF8String> {
       private ByteBuffer buffer = ByteBuffer.allocate(16);
   
       private UUIDWriter(ColumnDescriptor desc) {
         super(desc);
       }
   
       @Override
       public void write(int repetitionLevel, UTF8String string) {
         UUID uuid = UUID.fromString(string.toString());
         buffer.rewind();
         buffer.putLong(uuid.getMostSignificantBits());
         buffer.putLong(uuid.getLeastSignificantBits());
         buffer.rewind();
         column.writeBinary(repetitionLevel, Binary.fromReusedByteBuffer(buffer));
       }
     }
   
   ```
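
To make the byte layout produced by the writer above concrete, here is a standalone sketch of the same String -> 16-byte conversion using only the JDK, without Parquet or Spark. `UuidStringToBytes`, `toFixed16`, and `toHex` are hypothetical names for illustration, and the sample UUID is arbitrary.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

public class UuidStringToBytes {
  // Hypothetical helper: same conversion as the writer, returning a fresh 16-byte array.
  static byte[] toFixed16(String uuidString) {
    UUID uuid = UUID.fromString(uuidString);
    ByteBuffer buffer = ByteBuffer.allocate(16); // big-endian by default
    buffer.putLong(uuid.getMostSignificantBits());
    buffer.putLong(uuid.getLeastSignificantBits());
    return buffer.array();
  }

  // Hypothetical helper: render bytes as lowercase hex for inspection.
  static String toHex(byte[] bytes) {
    StringBuilder hex = new StringBuilder(bytes.length * 2);
    for (byte b : bytes) {
      hex.append(String.format("%02x", b & 0xff));
    }
    return hex.toString();
  }

  public static void main(String[] args) {
    byte[] fixed = toFixed16("f79c3e09-677c-4bbd-a479-3f349cb785e7");
    System.out.println(fixed.length);  // 16
    System.out.println(toHex(fixed));  // f79c3e09677c4bbda4793f349cb785e7
  }
}
```

The bytes are simply the UUID's hex digits in order with the dashes removed, which is why two `putLong` calls on a big-endian buffer are sufficient.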

