Tushar7012 commented on issue #20210: URL: https://github.com/apache/datafusion/issues/20210#issuecomment-3865033889
Hi @neilconway,

I looked into the issue. The failure seems to come from how apply_cmp currently assumes scalar (flat) inputs when handling operators like LIKE, NOT LIKE, and RegexMatch. When nested expressions (structs, lists, or other nested types) reach this path, the operator logic is still applied, which leads to datatype mismatches and, in some cases, internal assertions instead of a user-facing error.

My approach is to fail fast and explicitly for nested data types. These operators don't have well-defined semantics for nested inputs, so instead of letting such inputs flow through and break later, the plan is to add a clear validation step before operator execution. If a nested type is detected, we return a descriptive execution error (for example, stating that LIKE / NOT LIKE / RegexMatch are not supported for nested types). This keeps the behavior predictable, avoids panics or assertions, and aligns with how other unsupported operator/type combinations are handled in DataFusion.

I'll also add regression tests to ensure that:

- the original failure is covered, and
- the new behavior returns a clean, informative error instead of crashing.

This should resolve the issue without introducing any breaking API changes or large refactors.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
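
For illustration, the validation step proposed in the comment above might look roughly like the sketch below. It is self-contained and does not use DataFusion's or Arrow's real types: `DataType`, `Operator`, `ExecutionError`, and `check_pattern_operands` are all hypothetical stand-ins invented for this example, and the actual change to apply_cmp may be structured quite differently.

```rust
use std::fmt;

/// Hypothetical, simplified stand-in for a column data type.
/// (Not Arrow's real `DataType`; just enough variants for the sketch.)
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Utf8,
    Int64,
    List(Box<DataType>),
    Struct(Vec<(String, DataType)>),
}

impl DataType {
    /// True for types that contain other types (lists, structs).
    fn is_nested(&self) -> bool {
        matches!(self, DataType::List(_) | DataType::Struct(_))
    }
}

/// Hypothetical subset of binary operators.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum Operator {
    Eq,
    Like,
    NotLike,
    RegexMatch,
}

impl Operator {
    /// True for the pattern-matching operators discussed in the comment.
    fn is_pattern_match(self) -> bool {
        matches!(self, Operator::Like | Operator::NotLike | Operator::RegexMatch)
    }
}

/// Hypothetical user-facing error, returned instead of hitting an assertion.
#[derive(Debug, PartialEq)]
struct ExecutionError(String);

impl fmt::Display for ExecutionError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "Execution error: {}", self.0)
    }
}

/// Fail fast: reject pattern-matching operators on nested operands
/// before any operator-specific logic runs. Other operators are not
/// checked here and pass through unchanged.
fn check_pattern_operands(
    op: Operator,
    lhs: &DataType,
    rhs: &DataType,
) -> Result<(), ExecutionError> {
    if op.is_pattern_match() && (lhs.is_nested() || rhs.is_nested()) {
        return Err(ExecutionError(format!(
            "{:?} is not supported for nested types (got {:?} and {:?})",
            op, lhs, rhs
        )));
    }
    Ok(())
}
```

Under these assumptions, calling `check_pattern_operands(Operator::Like, &DataType::List(Box::new(DataType::Utf8)), &DataType::Utf8)` yields an `Err` with a descriptive message, while the same call with two `Utf8` operands yields `Ok(())`. Running the check before dispatching to the operator keeps the error path separate from the comparison logic itself.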
