findepi commented on issue #14247:
URL: https://github.com/apache/datafusion/issues/14247#issuecomment-2613408823

   Yes, once the plan is lowered into "container" arrow types (like assembly), we 
no longer need to remember what the logical/extension types were. Before the 
lowering happens, the functions and operators need to be resolved. This doesn't 
happen at Expr construction time, though, so IMO it calls for a strict 
separation of phases:
   
   1. Exprs are syntactical (e.g. created directly from SQL syntax or the 
dataframe API).
   2. Then the analyzer needs to "resolve" operators. This needs to be 
type-aware. E.g. for builtin types it can use the `=` operator, but for other 
types it needs to be told how to perform comparisons (and such), so a UDF gets 
inserted into the plan.
      - After this phase, the plan is "resolved" and doesn't need to remember 
the original types (except maybe for output fields metadata).
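
   The two phases above could be sketched roughly as follows. This is not 
DataFusion's API: `LogicalType`, `Expr`, `Analyzer`, `eq_udfs` and 
`udf_compare_geos` are all hypothetical names used only for illustration. The 
sketch assumes each extension type registers the name of a UDF that implements 
equality for it.

```rust
use std::collections::HashMap;

/// Hypothetical logical type: a builtin or a named extension type.
#[derive(Debug, Clone, PartialEq)]
enum LogicalType {
    Int64,
    Extension(String),
}

/// Hypothetical expression tree. `Eq` is purely syntactic (phase 1);
/// `Udf` only appears after the resolve phase (phase 2).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Eq(Box<Expr>, Box<Expr>),
    Udf { name: String, args: Vec<Expr> },
}

/// Hypothetical type-aware analyzer.
struct Analyzer {
    /// Column name -> logical type (stand-in for a schema).
    schema: HashMap<String, LogicalType>,
    /// Extension type name -> name of the UDF implementing equality.
    eq_udfs: HashMap<String, String>,
}

impl Analyzer {
    fn type_of(&self, expr: &Expr) -> Option<LogicalType> {
        match expr {
            Expr::Column(name) => self.schema.get(name).cloned(),
            _ => None, // kept minimal: only columns are typed in this sketch
        }
    }

    /// Phase 2: keep `=` for builtin types, insert a UDF call for extension types.
    fn resolve(&self, expr: Expr) -> Result<Expr, String> {
        match expr {
            Expr::Eq(l, r) => {
                let l = self.resolve(*l)?;
                let r = self.resolve(*r)?;
                match (self.type_of(&l), self.type_of(&r)) {
                    (Some(LogicalType::Extension(a)), Some(LogicalType::Extension(b)))
                        if a == b =>
                    {
                        let name = self
                            .eq_udfs
                            .get(&a)
                            .ok_or_else(|| format!("no equality registered for type {a}"))?;
                        Ok(Expr::Udf { name: name.clone(), args: vec![l, r] })
                    }
                    (Some(a), Some(b)) if a == b => Ok(Expr::Eq(Box::new(l), Box::new(r))),
                    (a, b) => Err(format!("cannot compare {a:?} with {b:?}")),
                }
            }
            other => Ok(other),
        }
    }
}

/// Builds an analyzer with two geometry columns and one integer column.
fn example_analyzer() -> Analyzer {
    let geo = LogicalType::Extension("geometry".to_string());
    let schema = HashMap::from([
        ("geo_col1".to_string(), geo.clone()),
        ("geo_col2".to_string(), geo),
        ("id".to_string(), LogicalType::Int64),
    ]);
    let eq_udfs = HashMap::from([("geometry".to_string(), "udf_compare_geos".to_string())]);
    Analyzer { schema, eq_udfs }
}

fn main() {
    let analyzer = example_analyzer();
    let col = |n: &str| Box::new(Expr::Column(n.to_string()));
    // Extension types: `=` is rewritten into a UDF call.
    println!("{:?}", analyzer.resolve(Expr::Eq(col("geo_col1"), col("geo_col2"))));
    // Builtin types: `=` is left as is.
    println!("{:?}", analyzer.resolve(Expr::Eq(col("id"), col("id"))));
}
```

   After `resolve`, nothing downstream has to know that `geo_col1` was a 
geometry column; the plan only contains a call to a function over the 
container type.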
   
   > I think this might get tricky when multiple extension types were used (it 
might be hard to hook json and geometry without a bunch of glue code)
   
   Extensible coercion rules are a tricky thing indeed. Maybe we can live 
without them (for now).
   
   But there are simpler things to solve as well, like casts: if "my JSON" type 
uses DataType::Binary as its container type, it still wants to define its own 
family of casts to various other types (numbers, text, etc.). So the Cast Expr 
would need to resolve to some UDF when the source type or target type is not a 
native type.
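
   A minimal sketch of that cast resolution, under the assumption that 
extension types register their casts by (source, target) pair. None of this is 
existing DataFusion API: `LogicalType`, `ResolvedCast`, `CastRegistry` and the 
`json_to_*` / `*_to_json` UDF names are hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical logical type: a native type or a named extension type.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum LogicalType {
    Int64,
    Utf8,
    Extension(String),
}

impl LogicalType {
    fn is_native(&self) -> bool {
        !matches!(self, LogicalType::Extension(_))
    }
}

/// What a syntactic Cast resolves to.
#[derive(Debug, PartialEq)]
enum ResolvedCast {
    /// Both sides are native: use the engine's builtin cast.
    Native,
    /// At least one side is an extension type: call this UDF instead.
    Udf(String),
}

/// Hypothetical registry of casts provided by extension types.
struct CastRegistry {
    casts: HashMap<(LogicalType, LogicalType), String>,
}

impl CastRegistry {
    fn resolve(&self, from: &LogicalType, to: &LogicalType) -> Result<ResolvedCast, String> {
        if from.is_native() && to.is_native() {
            return Ok(ResolvedCast::Native);
        }
        self.casts
            .get(&(from.clone(), to.clone()))
            .map(|name| ResolvedCast::Udf(name.clone()))
            .ok_or_else(|| format!("no cast registered from {from:?} to {to:?}"))
    }
}

/// Registry for a "my JSON" type whose container type would be Binary.
fn json_registry() -> CastRegistry {
    let json = LogicalType::Extension("my_json".to_string());
    let casts = HashMap::from([
        ((json.clone(), LogicalType::Int64), "json_to_int64".to_string()),
        ((json.clone(), LogicalType::Utf8), "json_to_utf8".to_string()),
        ((LogicalType::Utf8, json), "utf8_to_json".to_string()),
    ]);
    CastRegistry { casts }
}

fn main() {
    let registry = json_registry();
    let json = LogicalType::Extension("my_json".to_string());
    // Extension source type: resolved to a registered UDF.
    println!("{:?}", registry.resolve(&json, &LogicalType::Int64));
    // Native to native: the builtin cast is kept.
    println!("{:?}", registry.resolve(&LogicalType::Int64, &LogicalType::Utf8));
    // No cast registered for this pair: resolution fails.
    println!("{:?}", registry.resolve(&LogicalType::Int64, &json));
}
```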
   
   
   
   
   > finding any that were relevant to the user defined type and rewriting the 
expressions to use a function (eg. rewrite `geo_col1 = geo_col2` into 
`udf_compare_geos(geo_col1, geo_col2)`
   
   That sounds easy because we don't have to write this logic even once.
   But once such logic is written somewhere, there is no reason for it not to 
be part of datafusion project, for the benefit of all consumers. I think such 
logic should belong to datafusion.
   
   
   


