jayzhan211 commented on issue #11513:
URL: https://github.com/apache/datafusion/issues/11513#issuecomment-2380359012
@findepi
I think the main idea of this issue is to find the mapping between Arrow's
DataType and DataFusion's logical type.
In fact, we could forget about logical types and handle everything with
Arrow's DataType, but to make it easier to enumerate all the semantically
equivalent types we need a more **simplified** type than Arrow's DataType,
which we can use whenever the simplified version is all we need. In this
case the LogicalType, being the **simplified** version, should be a smaller
set of types than Arrow's DataType. Therefore, we can have a one-directional
mapping from Arrow's DataType and from UserDefined/Extension types to
LogicalType. The LogicalType (DataFusion's native type) is our single source
of truth in DataFusion. It plays a role similar to Rust's native types.
What we need is two kinds of traits for type mapping: one for UserDefined
types and another for Arrow's DataType. If both map to the same logical
type, we can decode the value as the expected type; otherwise, it is a type
mismatch.
```rust
use std::sync::Arc;

use arrow::datatypes::DataType;

/// DataFusion's native (logical) type: the simplified target of every mapping.
#[derive(Clone)]
pub enum LogicalType {
    Int32,
    String,
    Float32,
    Float64,
    FixedSizeList(Box<LogicalType>, usize),
    // and more
    Extension(Arc<dyn ExtensionType>),
}

/// Mapping from a user-defined/extension type to its logical type.
pub trait ExtensionType {
    fn logical_type(&self) -> LogicalType;
}

pub struct JsonType {}

impl ExtensionType for JsonType {
    fn logical_type(&self) -> LogicalType {
        LogicalType::String
    }
}

pub struct GeoType {
    n_dim: usize,
}

impl ExtensionType for GeoType {
    fn logical_type(&self) -> LogicalType {
        LogicalType::FixedSizeList(Box::new(LogicalType::Float64), self.n_dim)
    }
}

/// Mapping from a physical (Arrow) type to its logical type.
pub trait PhysicalType {
    fn logical_type(&self) -> LogicalType;
}

impl PhysicalType for DataType {
    fn logical_type(&self) -> LogicalType {
        match self {
            DataType::Int32 => LogicalType::Int32,
            DataType::FixedSizeList(f, n) => {
                // Map the element type recursively and keep the list length.
                LogicalType::FixedSizeList(Box::new(f.data_type().logical_type()), *n as usize)
            }
            // Remaining Arrow types are left out of this sketch.
            _ => todo!(),
        }
    }
}
```
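To make the "same mapped type means we can decode" rule concrete, here is a minimal, dependency-free sketch of the check. It rests on several assumptions that are not part of the proposal above: `ArrowType` is a hypothetical stand-in for Arrow's `DataType` (so the snippet compiles without the arrow crate), `can_decode` is a hypothetical helper name, the `Extension` variant is left out so that `PartialEq` can be derived, and mapping both `Utf8` and `LargeUtf8` to `String` is only my guess at what "semantically equivalent" would mean for strings.

```rust
/// Simplified logical type. The `Extension` variant is omitted here
/// (assumption) so that equality can be derived.
#[derive(Clone, Debug, PartialEq)]
pub enum LogicalType {
    Int32,
    String,
    Float64,
    FixedSizeList(Box<LogicalType>, usize),
}

/// Hypothetical stand-in for arrow's `DataType`, not the real enum.
#[derive(Clone, Debug)]
pub enum ArrowType {
    Int32,
    Utf8,
    LargeUtf8,
    Float64,
    FixedSizeList(Box<ArrowType>, i32),
}

pub trait ExtensionType {
    fn logical_type(&self) -> LogicalType;
}

pub trait PhysicalType {
    fn logical_type(&self) -> LogicalType;
}

impl PhysicalType for ArrowType {
    fn logical_type(&self) -> LogicalType {
        match self {
            ArrowType::Int32 => LogicalType::Int32,
            // Several physical encodings collapse into one logical type (assumed).
            ArrowType::Utf8 | ArrowType::LargeUtf8 => LogicalType::String,
            ArrowType::Float64 => LogicalType::Float64,
            ArrowType::FixedSizeList(inner, n) => {
                LogicalType::FixedSizeList(Box::new(inner.logical_type()), *n as usize)
            }
        }
    }
}

pub struct JsonType;

impl ExtensionType for JsonType {
    fn logical_type(&self) -> LogicalType {
        LogicalType::String
    }
}

pub struct GeoType {
    pub n_dim: usize,
}

impl ExtensionType for GeoType {
    fn logical_type(&self) -> LogicalType {
        LogicalType::FixedSizeList(Box::new(LogicalType::Float64), self.n_dim)
    }
}

/// Hypothetical helper: an extension type can be decoded from a physical
/// type only if both map to the same logical type.
pub fn can_decode(ext: &dyn ExtensionType, storage: &ArrowType) -> bool {
    ext.logical_type() == storage.logical_type()
}
```

With these definitions, `can_decode(&JsonType, &ArrowType::LargeUtf8)` holds because both sides map to `LogicalType::String`, while `can_decode(&JsonType, &ArrowType::Int32)` is a type mismatch. A `GeoType { n_dim: 2 }` matches a fixed-size list of two `Float64` values but not one of three.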
I'd love to hear feedback on whether this makes sense, or where my
assumptions about the type mapping might fail.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]