jinchengchenghh commented on issue #8948:
URL: 
https://github.com/apache/incubator-gluten/issues/8948#issuecomment-2724825095

   The delete scan reads the metadata column `_file`, but Velox does not provide that information.
   
https://github.com/apache/iceberg/blob/main/spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/source/BatchDataReader.java#L98
   
   In Iceberg, the data file reader creates a constant column for each metadata column:
   ```
   public static Map<Integer, ?> constantsMap(
         ContentScanTask<?> task,
         Types.StructType partitionType,
         BiFunction<Type, Object, Object> convertConstant) {
       PartitionSpec spec = task.spec();
       StructLike partitionData = task.file().partition();
   
       // use java.util.HashMap because partition data may contain null values
       Map<Integer, Object> idToConstant = Maps.newHashMap();
   
       // add _file
       idToConstant.put(
           MetadataColumns.FILE_PATH.fieldId(),
           convertConstant.apply(Types.StringType.get(), task.file().path()));
   
       // add _spec_id
       idToConstant.put(
           MetadataColumns.SPEC_ID.fieldId(),
           convertConstant.apply(Types.IntegerType.get(), task.file().specId()));
   
       // add _partition
       if (partitionType != null) {
         if (!partitionType.fields().isEmpty()) {
           StructLike coercedPartition = coercePartition(partitionType, spec, partitionData);
           idToConstant.put(
               MetadataColumns.PARTITION_COLUMN_ID,
               convertConstant.apply(partitionType, coercedPartition));
         } else {
           // use null as some query engines may not be able to handle empty structs
           idToConstant.put(MetadataColumns.PARTITION_COLUMN_ID, null);
         }
       }
       // ... (remainder of the method omitted)
   ```
   The `_partition` column (`PARTITION_COLUMN_ID`) is likewise a constant column.
   
   So we need to separate out the metadata columns, query only the data columns from Velox, and then append `_deleted` or `_file` as an extra constant vector to the data RowVector.
   
   `_deleted` is used in the equality delete file reader test.
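
The approach described above could look roughly like the sketch below. It is not Gluten or Iceberg code: the class `MetadataColumnSketch`, the helpers `dataColumns` and `attachConstants`, and the use of plain maps in place of a Velox RowVector are all assumptions made for illustration. It only shows the shape of the idea: drop metadata columns from the projection sent to the native reader, then fill them back in from per-file constants.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; rows are modelled as maps rather than Velox vectors.
class MetadataColumnSketch {

  /** Returns the projected columns that must actually be read from the data file. */
  static List<String> dataColumns(List<String> projection, Set<String> metadataColumns) {
    List<String> result = new ArrayList<>();
    for (String name : projection) {
      if (!metadataColumns.contains(name)) {
        result.add(name);
      }
    }
    return result;
  }

  /**
   * Rebuilds each row in projection order. Columns present in the constants map
   * get the same per-file value on every row; all others come from the native row.
   */
  static List<Map<String, Object>> attachConstants(
      List<String> projection,
      List<Map<String, Object>> nativeRows,
      Map<String, Object> constants) {
    List<Map<String, Object>> out = new ArrayList<>();
    for (Map<String, Object> row : nativeRows) {
      Map<String, Object> full = new LinkedHashMap<>();
      for (String name : projection) {
        full.put(name, constants.containsKey(name) ? constants.get(name) : row.get(name));
      }
      out.add(full);
    }
    return out;
  }

  public static void main(String[] args) {
    List<String> projection = List.of("id", "_file", "name");

    // Only these columns would be requested from the native reader.
    List<String> toRead = dataColumns(projection, Set.of("_file"));
    System.out.println(toRead); // [id, name]

    // Stand-in for the rows the native reader returns for one data file.
    List<Map<String, Object>> nativeRows =
        List.of(Map.of("id", 1, "name", "a"), Map.of("id", 2, "name", "b"));

    // Per-file constant, analogous to an entry in Iceberg's constants map.
    Map<String, Object> constants = Map.of("_file", "s3://bucket/data/f1.parquet");

    for (Map<String, Object> row : attachConstants(projection, nativeRows, constants)) {
      System.out.println(row);
    }
  }
}
```

In a real integration the constant would presumably be attached as a constant vector rather than copied per row, so the cost would not grow with the number of rows.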


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

