alamb opened a new issue, #20969:
URL: https://github.com/apache/datafusion/issues/20969

   ### Is your feature request related to a problem or challenge?
   
   While reviewing  this PR from @erratic-pattern 
   - https://github.com/apache/datafusion/pull/20962:
   
   I found myself confused about the intended type contract for InListExpr, 
specifically which combinations of `needle` and `haystack` types are expected 
to be valid. I would have expected `InListExpr` to compare values of the same 
logical type, and ideally for the implementation comments to make that explicit.
   
   In particular, the current logic around dictionary arrays is hard to reason 
about. If InListExpr intentionally works on logically equivalent types rather 
than identical physical Arrow types, I think that should be documented clearly. 
If top-level dictionary haystacks are just an implementation detail, it may be 
cleaner to normalize them when building the static filter so the filter always 
stores the underlying value representation.
   
   
   
   ### Describe the solution you'd like
   
     At a minimum, I think it would help to:
     - document the valid needle / haystack type combinations
     - explain why dictionary needles are special-cased in contains
     - make the tests reflect the intended contract
   
   
   
   ### Describe alternatives you've considered
   
   I think it is potentially worth changing `InListExpr`:
   1.  to *always* flatten dictionaries (only use `values` in the `in_list` and 
then specially handle input `DictionaryArray`s
   2. Only accept the same datatype for `needle` and `haystack`
   
   This would make the API contract easier to reason about (always the same 
types) but the implementation can be optimized internally as well
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to