jorisvandenbossche commented on pull request #9164: URL: https://github.com/apache/arrow/pull/9164#issuecomment-758682512
Looking at the behavior of `%in%` in R (cc @nealrichardson), NAs are matched there as well: `c(1, 2, NA) %in% c(1, 3)` gives `TRUE, FALSE, FALSE` and `c(1, 2, NA) %in% c(1, 3, NA)` gives `TRUE, FALSE, TRUE`. That is consistent with the behavior we have in Arrow right now.

The SQL `IN` operator does not seem to match Nulls, because there it is shorthand for multiple equality comparisons. But in practice (as far as I know, with only limited SQL knowledge) you can only use it in a WHERE clause. So whether a Null in the column gives False or Null doesn't matter much: in both cases the row is not preserved by the WHERE filter.
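To make the comparison concrete, here is a minimal pure-Python sketch of the two semantics, using `None` to stand in for NA/Null. The helper names (`r_style_in`, `sql_style_in`, `where_filter`) are hypothetical and made up for illustration; this is not Arrow's or any database's actual implementation, only a model of the behavior described above.

```python
def r_style_in(values, value_set):
    """Model of R's %in%: a null matches only if value_set contains a null.

    The result is always True or False, never null.
    """
    return [v in value_set for v in values]


def sql_style_in(values, value_set):
    """Model of SQL's IN as a chain of `=` comparisons joined by OR.

    Any comparison involving a null is itself null (None), and OR follows
    three-valued logic: True wins, otherwise null wins over False.
    """
    result = []
    for v in values:
        comparisons = [
            None if v is None or s is None else v == s
            for s in value_set
        ]
        if any(c is True for c in comparisons):
            result.append(True)
        elif any(c is None for c in comparisons):
            result.append(None)
        else:
            result.append(False)
    return result


def where_filter(rows, mask):
    """Model of a WHERE clause: keep a row only when its predicate is True."""
    return [row for row, keep in zip(rows, mask) if keep is True]


column = [1, 2, None]

r_mask = r_style_in(column, [1, 3])               # [True, False, False]
r_mask_with_null = r_style_in(column, [1, 3, None])  # [True, False, True]
sql_mask = sql_style_in(column, [1, 3])           # [True, False, None]

# False and null differ in the mask, but both drop the row in a filter.
assert where_filter(column, r_mask) == where_filter(column, sql_mask) == [1]
```

In this model the two masks differ only in the last element (False versus null), and the filtered result is the same, which is the point made above about WHERE clauses.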
