jayzhan211 commented on code in PR #12853:
URL: https://github.com/apache/datafusion/pull/12853#discussion_r1816904255


##########
datafusion/common/src/types/native.rs:
##########
@@ -188,6 +190,147 @@ impl LogicalType for NativeType {
     fn signature(&self) -> TypeSignature<'_> {
         TypeSignature::Native(self)
     }
+
+    fn default_cast_for(&self, origin: &DataType) -> Result<DataType> {
+        use DataType::*;
+
+        fn default_field_cast(to: &LogicalField, from: &Field) -> 
Result<FieldRef> {
+            Ok(Arc::new(Field::new(
+                to.name.clone(),
+                to.logical_type.default_cast_for(from.data_type())?,
+                to.nullable,
+            )))
+        }
+
+        Ok(match (self, origin) {
+            (Self::Null, _) => Null,
+            (Self::Boolean, _) => Boolean,
+            (Self::Int8, _) => Int8,
+            (Self::Int16, _) => Int16,
+            (Self::Int32, _) => Int32,
+            (Self::Int64, _) => Int64,
+            (Self::UInt8, _) => UInt8,
+            (Self::UInt16, _) => UInt16,
+            (Self::UInt32, _) => UInt32,
+            (Self::UInt64, _) => UInt64,
+            (Self::Float16, _) => Float16,
+            (Self::Float32, _) => Float32,
+            (Self::Float64, _) => Float64,
+            (Self::Decimal(p, s), _) if p <= &38 => Decimal128(*p, *s),
+            (Self::Decimal(p, s), _) => Decimal256(*p, *s),
+            (Self::Timestamp(tu, tz), _) => Timestamp(*tu, tz.clone()),
+            (Self::Date, _) => Date32,
+            (Self::Time(tu), _) => match tu {
+                TimeUnit::Second | TimeUnit::Millisecond => Time32(*tu),
+                TimeUnit::Microsecond | TimeUnit::Nanosecond => Time64(*tu),
+            },
+            (Self::Duration(tu), _) => Duration(*tu),
+            (Self::Interval(iu), _) => Interval(*iu),
+            (Self::Binary, LargeUtf8) => LargeBinary,
+            (Self::Binary, Utf8View) => BinaryView,
+            (Self::Binary, _) => Binary,
+            (Self::FixedSizeBinary(size), _) => FixedSizeBinary(*size),
+            (Self::Utf8, LargeBinary) => LargeUtf8,
+            (Self::Utf8, BinaryView) => Utf8View,
+            (Self::Utf8, _) => Utf8,

Review Comment:
   Some initial thoughts
   
   First, I think utf8view is the default type for string.
   
   In my proposed way,
   I think the mapping should be based on `arrow::can_cast_to` so if the 1st 
target type is not supported  we go to the next one.
   
   But in this design, each type has a single target type to cast to. Is the 
default type always castable by arrow's `can_cast_to`?  Maybe we can just add 
the check inside. Like
   
   ```rust
   (Self::Utf8, data_type) if can_cast_type(data_type, Utf8View)=> Utf8View,
   (Self::Utf8, data_type) if can_cast_type(data_type, LargeUtf8)=> LargeUtf8,
   (Self::Utf8, data_type) if can_cast_type(data_type, Utf8)=> Utf8,
   ``` 
   
   Another issue is whether the default type is great for every use case, for 
example, I think utf8view should be the default type but is there any usecase 
that require utf8 instead? If yes, how can they modify it? It seems they need 
to copy the whole `default_cast_for` and change it for their logical type. (We 
can assume no such case for now, but just to point out that it may not work if 
there is such a case)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to