jayzhan211 commented on issue #12622:
URL: https://github.com/apache/datafusion/issues/12622#issuecomment-2661197895

   @tobixdev 
   
   I found LogicalScalar to be a largely duplicated concept with only minor 
differences from ScalarValue, mainly aimed at reducing complexity. However, the 
reduction is minimal—for instance, eliminating try_as_str alone doesn’t seem to 
justify introducing LogicalScalar.
   
   Furthermore, if we separate logical and physical types in LogicalPlan and 
PhysicalPlan, we would need to implement coercion and casting logic at both 
layers. Since this isn’t currently the case, adding such complexity doesn’t 
seem worthwhile.
   
   As I previously suggested 
[here](https://github.com/apache/datafusion/issues/14296#issuecomment-2618344472),
 it may be a better approach to retain the "physical-type"—or more precisely, 
the "decoding-type" (arrow::DataType)—within the logical layer. We can still 
use simplified types like NativeType when beneficial, as we already do in 
function definitions.
   
   > Currently the type model in DataFusion, both in logical and physical 
plans, relies purely on arrow’s 
[DataType](https://docs.rs/arrow/latest/arrow/datatypes/enum.DataType.html) 
enum. But whilst this choice makes sense its physical execution engine 
(DataTypes map 1:1 with the physical array representation, defining implicit 
rules for storage, access, and retrieval), it has flaws when dealing with its 
logical plans. This is due to the fact that some logically equivalent array 
types in the Arrow format have very different DataTypes – for example a logical 
string array can be represented in the Arrow array of DataType Utf8, 
Dictionary(Utf8), RunEndEncoded(Utf8), and StringView (without mentioning the 
different indexes types that can be specified for dictionaries and REE arrays).
   
   To solve the the issue mentioned, having function like `try_as_str` or 
`NativeType` makes more sense to me now.
   
   In summary, I think we should keep `ScalarValue` as it is. And utilize 
`NativeType` or `TypeClass` (similar to TypeSignatureClass but not for 
Signature only).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to