jayzhan211 commented on issue #11513: URL: https://github.com/apache/datafusion/issues/11513#issuecomment-2380359012
@findepi I think the main idea of the issue is to find the mapping between arrow's DataType and DataFusion's logical type. In fact, we could forget about logical types and process everything in terms of arrow's DataType, but to make it easier to enumerate all the semantically equivalent types we need a more **simplified** type derived from arrow's DataType, and we use it whenever we only need the simplified version.

In this case, the LogicalType, which is the **simplified** version, should have fewer variants than arrow's DataType. Therefore, we can have a one-directional mapping from arrow's DataType and from UserDefined/Extension types to LogicalType.

The LogicalType (the DataFusion native type) is our single source of truth in DataFusion. It plays a role similar to Rust's native types.

What we need is two kinds of trait for type mapping: one for UserDefined types, another for arrow's DataType. If their mapped types are the same, we can decode the value as the expected type; otherwise, it is a type mismatch.

```rust
use std::sync::Arc;

use arrow::datatypes::DataType;

#[derive(Clone)]
pub enum LogicalType {
    Int32,
    String,
    Float32,
    Float64,
    FixedSizeList(Box<LogicalType>, usize),
    // and more
    Extension(Arc<dyn ExtensionType>),
}

// Mapping from a user-defined / extension type to its logical type.
pub trait ExtensionType {
    fn logical_type(&self) -> LogicalType;
}

pub struct JsonType {}

impl ExtensionType for JsonType {
    fn logical_type(&self) -> LogicalType {
        LogicalType::String
    }
}

pub struct GeoType {
    n_dim: usize,
}

impl ExtensionType for GeoType {
    fn logical_type(&self) -> LogicalType {
        LogicalType::FixedSizeList(Box::new(LogicalType::Float64), self.n_dim)
    }
}

// Mapping from arrow's DataType to its logical type.
pub trait PhysicalType {
    fn logical_type(&self) -> LogicalType;
}

impl PhysicalType for DataType {
    fn logical_type(&self) -> LogicalType {
        match self {
            DataType::Int32 => LogicalType::Int32,
            DataType::FixedSizeList(f, n) => {
                LogicalType::FixedSizeList(Box::new(f.data_type().logical_type()), *n as usize)
            }
            _ => todo!(""),
        }
    }
}
```

I'd love to hear feedback on whether this makes sense, or on where my assumptions about the type mapping may fail.
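To make the "same mapped type means decodable, otherwise mismatch" rule concrete, here is a minimal, self-contained sketch. It is only an illustration under several assumptions: `DataType` below is a small stand-in enum rather than arrow's real `DataType`, the `Extension` variant is left out so that `LogicalType` can derive `PartialEq`, the choice to map both `Utf8` and `LargeUtf8` to `String` is an assumed example of semantically equivalent types, and `can_decode` is a hypothetical helper name, not an existing DataFusion API.

```rust
// Stand-in for arrow's DataType so the sketch compiles without the arrow crate.
#[derive(Debug, Clone, PartialEq)]
pub enum DataType {
    Int32,
    Utf8,
    LargeUtf8,
    Float64,
    FixedSizeList(Box<DataType>, i32),
}

// Simplified logical type; the Extension variant is omitted so equality can be derived.
#[derive(Debug, Clone, PartialEq)]
pub enum LogicalType {
    Int32,
    String,
    Float64,
    FixedSizeList(Box<LogicalType>, usize),
}

pub trait ExtensionType {
    fn logical_type(&self) -> LogicalType;
}

pub trait PhysicalType {
    fn logical_type(&self) -> LogicalType;
}

pub struct JsonType;

impl ExtensionType for JsonType {
    fn logical_type(&self) -> LogicalType {
        LogicalType::String
    }
}

pub struct GeoType {
    pub n_dim: usize,
}

impl ExtensionType for GeoType {
    fn logical_type(&self) -> LogicalType {
        LogicalType::FixedSizeList(Box::new(LogicalType::Float64), self.n_dim)
    }
}

impl PhysicalType for DataType {
    fn logical_type(&self) -> LogicalType {
        match self {
            DataType::Int32 => LogicalType::Int32,
            // Assumption: both string encodings collapse to one logical type.
            DataType::Utf8 | DataType::LargeUtf8 => LogicalType::String,
            DataType::Float64 => LogicalType::Float64,
            DataType::FixedSizeList(inner, n) => {
                LogicalType::FixedSizeList(Box::new(inner.logical_type()), *n as usize)
            }
        }
    }
}

/// Hypothetical helper: a value stored as `physical` can be decoded as `ext`
/// only if both sides map to the same logical type.
pub fn can_decode(ext: &dyn ExtensionType, physical: &DataType) -> bool {
    ext.logical_type() == physical.logical_type()
}
```

With these definitions, `can_decode(&JsonType, &DataType::LargeUtf8)` holds because both sides map to `LogicalType::String`, while a 2-element `FixedSizeList` of `Float64` is accepted by `GeoType { n_dim: 2 }` and rejected by `GeoType { n_dim: 3 }`. Supporting the `Extension(Arc<dyn ExtensionType>)` variant would need a manual equality rule, for example comparing the logical types the extensions resolve to.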