Jefffrey commented on PR #19415: URL: https://github.com/apache/datafusion/pull/19415#issuecomment-3702129012
I think we only need to look at the `set_ops.rs` file that was modified in this PR; specifically this line:

https://github.com/apache/datafusion/blob/b818f93416d18d06374a0707f5ef571f8a384070/datafusion/functions-nested/src/set_ops.rs#L423

The last argument there, `None`, is usually where the null buffer would be provided, if any. By having it as `None` we essentially can never have nulls in the returned array, which is what we want to change. See the distinct function in the same file:

https://github.com/apache/datafusion/blob/b818f93416d18d06374a0707f5ef571f8a384070/datafusion/functions-nested/src/set_ops.rs#L549-L559

But in our case, we would need to consider the nulls of both input arrays; I recommend exploring the APIs available to see how this would be possible.
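To make "consider the nulls of both input arrays" concrete, here is a minimal, std-only sketch of the idea. It is not the code from the PR: `Validity` and `combine_validity` are hypothetical names, the `Option<Vec<bool>>` is a simplified stand-in for Arrow's optional null buffer, and it assumes an output row should be null whenever either input row is null (the actual semantics wanted for the set operations may differ). arrow-rs appears to offer `NullBuffer::union` for combining two optional null buffers in this way; check the arrow-rs docs for the exact signature before relying on it.

```rust
/// Per-row validity: `None` means "no nulls at all", otherwise `true` = valid.
/// Hypothetical, simplified stand-in for Arrow's optional null buffer.
type Validity = Option<Vec<bool>>;

/// Combine the validity of two equal-length inputs.
/// Assumption: a row stays valid only when it is valid in both inputs.
fn combine_validity(lhs: &Validity, rhs: &Validity) -> Validity {
    match (lhs, rhs) {
        // Neither input has nulls, so the output needs no null buffer.
        (None, None) => None,
        // Only one side has nulls: its validity carries over unchanged.
        (Some(v), None) | (None, Some(v)) => Some(v.clone()),
        // Both sides have nulls: logical AND of the two validity masks.
        (Some(l), Some(r)) => {
            assert_eq!(l.len(), r.len(), "inputs must have the same row count");
            Some(l.iter().zip(r).map(|(a, b)| *a && *b).collect())
        }
    }
}
```

The combined result is what would then be passed in place of the `None` argument, so that rows which are null in an input stay null in the output.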
