mingmwang opened a new issue, #7098:
URL: https://github.com/apache/arrow-datafusion/issues/7098

   ### Is your feature request related to a problem or challenge?
   
   Currently, DataFusion does not support interval arithmetic on string types, 
so the stats `analyze` step fails during CBO (cost-based optimizer) statistics 
estimation.
   
   I think we can first normalize the string intervals to numeric intervals 
and then use the numeric intervals to estimate the `selectivity`.
   
   Initial input bound:
   ["aaa", "xyz123"] => [0, 1]
   
   Estimated bound (example):
   ["ccc", "ddd"] => [0.2, 0.3]
   
   Estimated selectivity:
   (0.3 - 0.2) / (1 - 0) = 0.1
   
   The normalization of the string intervals can be based on ASCII code 
ordering; for any non-ASCII character, we can just return early.
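
A rough sketch of the normalization idea, written in Python only for brevity (it is not DataFusion code). It assumes each string is read as a base-128 fraction over its first few ASCII characters and then rescaled against the column bounds. The names `string_to_fraction`, `normalize`, `selectivity` and the `PREFIX_LEN` cap are hypothetical, chosen for illustration.

```python
from typing import Optional

ASCII_BASE = 128  # number of ASCII code points
PREFIX_LEN = 7    # hypothetical cap; 7 base-128 digits fit exactly in a float


def string_to_fraction(s: str) -> Optional[float]:
    """Map an ASCII string to a float in [0, 1) that follows ASCII ordering
    on the first PREFIX_LEN characters. Returns None for non-ASCII input."""
    if not s.isascii():
        return None  # "return early": no numeric interval for this string
    value, scale = 0.0, 1.0
    for ch in s[:PREFIX_LEN]:
        scale /= ASCII_BASE
        value += ord(ch) * scale
    return value


def normalize(s: str, lo: str, hi: str) -> Optional[float]:
    """Position of `s` inside the column bound [lo, hi], rescaled to [0, 1]."""
    values = [string_to_fraction(x) for x in (s, lo, hi)]
    if any(v is None for v in values):
        return None
    v, v_lo, v_hi = values
    if v_hi <= v_lo:
        return None  # degenerate column bound, nothing to rescale against
    return min(1.0, max(0.0, (v - v_lo) / (v_hi - v_lo)))


def selectivity(pred_lo: str, pred_hi: str,
                col_lo: str, col_hi: str) -> Optional[float]:
    """Width of the normalized predicate interval, or None if any input
    cannot be normalized."""
    a = normalize(pred_lo, col_lo, col_hi)
    b = normalize(pred_hi, col_lo, col_hi)
    if a is None or b is None:
        return None
    return max(0.0, b - a)


if __name__ == "__main__":
    print(selectivity("ccc", "ddd", "aaa", "xyz123"))
```

Under this particular mapping the `["ccc", "ddd"]` predicate comes out at roughly 0.04 rather than the illustrative 0.1 above, because the first character dominates the value. The mapping is only non-strictly monotone (strings that differ after the prefix cap collapse to the same number), which should be acceptable for an estimate but not for exact bound checks.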
   
   
   ### Describe the solution you'd like
   
   _No response_
   
   ### Describe alternatives you've considered
   
   _No response_
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
