yujun777 commented on PR #62698: URL: https://github.com/apache/doris/pull/62698#issuecomment-4403142531
I think a boolean `deterministic` property may still be too coarse here. For UDFs, it may be better to consider the same three categories as function volatility:

- `immutable`: the same input always returns the same output, e.g. `def f(x): return x + 1` or `lower`/`abs`-like pure computation.
- `stable`: the result is stable within one statement/query but may change across statements, e.g. a UDF returning the query start time, session/database context, or a statement-level config value.
- `volatile`: each invocation may return a different result, or the call count/location has semantics, e.g. `uuid.uuid4()`, `random.random()`, HTTP/RPC calls, or UDFs with side effects.

So for Python UDFs we probably need to identify which of these cases we want to support. The current `deterministic` true/false split can distinguish immutable from non-immutable functions, but it cannot distinguish stable functions from volatile ones, while optimizer rules need different behavior for those two cases.
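To make the distinction concrete, here is a rough Python sketch of how a three-way volatility property might drive optimizer decisions. Everything in it is hypothetical: the `Volatility` enum, the helper names, and the specific rules are illustrative assumptions, not Doris's actual API or optimizer behavior.

```python
from enum import Enum


class Volatility(Enum):
    """Hypothetical three-way classification, mirroring the categories above."""
    IMMUTABLE = "immutable"
    STABLE = "stable"
    VOLATILE = "volatile"


def candidates_for_flag(deterministic):
    """Categories that a boolean `deterministic` flag is compatible with.

    deterministic=False is ambiguous: it cannot tell STABLE from VOLATILE.
    """
    if deterministic:
        return (Volatility.IMMUTABLE,)
    return (Volatility.STABLE, Volatility.VOLATILE)


# The rules below are assumptions about what an optimizer could safely do
# for a call with constant arguments; they are not taken from the PR.

def can_fold_at_plan_time(v):
    # Only an immutable call can be replaced by its result before execution.
    return v is Volatility.IMMUTABLE


def can_evaluate_once_per_statement(v):
    # A stable call does not change within one statement, so it could be
    # evaluated once at execution time and reused.
    return v in (Volatility.IMMUTABLE, Volatility.STABLE)


def can_merge_duplicate_calls(v):
    # Merging two identical calls changes the call count, which matters
    # for volatile functions such as uuid.uuid4() or random.random().
    return v is not Volatility.VOLATILE


if __name__ == "__main__":
    for v in Volatility:
        print(
            v.value,
            can_fold_at_plan_time(v),
            can_evaluate_once_per_statement(v),
            can_merge_duplicate_calls(v),
        )
```

With only `deterministic=False`, an optimizer in this sketch would have to treat a stable UDF as volatile and give up the once-per-statement evaluation, which is the gap described above.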
