felipecrv commented on PR #43256:
URL: https://github.com/apache/arrow/pull/43256#issuecomment-2244138609

   > We change SetLookupState::Init so that it takes the SetLookupOptions& 
argument without const. When initializing, we sort and remove duplicates from 
the value set and update the options in place. This effectively makes it so 
that binding an expression can mutate the input in a way that doesn't change 
semantics.
   
   It's probably a bad idea to mutate `SetLookupOptions`. I would simply stored 
the simplified/equivalent `value_set` your simplification builds in the 
`KernelState` itself.
   
   You don't have to do it in `SetLookupState::Init`. You can mutate it after 
the `Init` as an optional pre-processing step that you do to help the kernel 
"execute better".
   
   > Sorting the value set inside SetLookupState would change the behavior of 
index_in. Hence, we'd also need to modify InitSetLookup to plumb through 
whether we are initializing is_in or index_in.
   
   Yeah, don't mutate `SetLookupState`.
   
   > Inside Simplify, if we see an is_in call that is unbound, then we 
immediately fail simplification, since an unbound is_in is not guaranteed to 
have a sorted and unique value set. We cannot bind the expression on the spot 
since we don't have access to a schema.
   
   I don't understand this.
   
   > Once we compute the simplified value set, we must avoid re-binding so that 
we don't re-sorted the value set (and re-construct the memo table). We only 
re-bind once after all inequalities in the guarantee have been processed.
   
   Yeah. You should be able to look at the `KernelState` and decide how to 
proceed.
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to