[ https://issues.apache.org/jira/browse/IMPALA-8945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alex Rodoni updated IMPALA-8945:
--------------------------------
Description:
Reported by [~icook]

The Impala docs entry for the IS DISTINCT FROM operator states:

    The <=> operator, used like an equality operator in a join query, is more efficient than the equivalent clause: A = B OR (A IS NULL AND B IS NULL). The <=> operator can use a hash join, while the OR expression cannot.

But this expression is not equivalent to A <=> B. See the attached screenshot demonstrating their non-equivalence. An expression that is equivalent to A <=> B is this:

    (A IS NULL AND B IS NULL) OR ((A IS NOT NULL AND B IS NOT NULL) AND (A = B))

This expression should replace the existing incorrect expression.

Another expression that is equivalent to A <=> B is:

    if(A IS NULL OR B IS NULL, A IS NULL AND B IS NULL, A = B)

This one is a bit easier to follow. If you use this one in the docs, just replace the following line with:

    The <=> operator can use a hash join, while the if expression cannot.

was:
The same description, without the "Reported by [~icook]" line.
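The attached screenshot is not reproduced in this message, so the following is only a sketch of one way the non-equivalence can arise, assuming standard SQL three-valued logic. It models NULL as Python's None and compares each expression from the description against a reference definition of A <=> B over a few sample values. The helper names (and3, or3, eq3, null_safe_eq, and so on) are made up for this illustration and are not Impala functions.

```python
from itertools import product

# None stands in for SQL NULL (and for the UNKNOWN truth value).

def and3(x, y):
    """SQL AND under three-valued logic."""
    if x is False or y is False:
        return False
    if x is None or y is None:
        return None
    return True

def or3(x, y):
    """SQL OR under three-valued logic."""
    if x is True or y is True:
        return True
    if x is None or y is None:
        return None
    return False

def eq3(a, b):
    """SQL '=': yields NULL if either operand is NULL."""
    if a is None or b is None:
        return None
    return a == b

def is_null(a):
    """SQL 'IS NULL': always TRUE or FALSE, never NULL."""
    return a is None

def null_safe_eq(a, b):
    """Reference semantics assumed for A <=> B: never yields NULL."""
    if a is None or b is None:
        return a is None and b is None
    return a == b

def documented(a, b):
    # A = B OR (A IS NULL AND B IS NULL)
    return or3(eq3(a, b), and3(is_null(a), is_null(b)))

def proposed_or(a, b):
    # (A IS NULL AND B IS NULL) OR ((A IS NOT NULL AND B IS NOT NULL) AND (A = B))
    both_null = and3(is_null(a), is_null(b))
    both_set = and3(not is_null(a), not is_null(b))
    return or3(both_null, and3(both_set, eq3(a, b)))

def proposed_if(a, b):
    # if(A IS NULL OR B IS NULL, A IS NULL AND B IS NULL, A = B)
    if or3(is_null(a), is_null(b)) is True:
        return and3(is_null(a), is_null(b))
    return eq3(a, b)

def mismatches(expr, values=(None, 1, 2)):
    """Pairs (a, b) where expr disagrees with the reference A <=> B."""
    return [(a, b) for a, b in product(values, repeat=2)
            if expr(a, b) is not null_safe_eq(a, b)]

if __name__ == "__main__":
    for expr in (documented, proposed_or, proposed_if):
        print(expr.__name__, mismatches(expr))
```

Under these assumptions, the documented expression differs from A <=> B exactly when one operand is NULL and the other is not: it evaluates to NULL where A <=> B evaluates to FALSE. A WHERE or join condition rejects the row in both cases, so the difference only becomes visible when the result is selected as a value or negated.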
> Impala Doc: Incorrect Claim of Equivalence in Impala Docs
> ---------------------------------------------------------
>
>                 Key: IMPALA-8945
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8945
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Docs
>            Reporter: Alex Rodoni
>            Assignee: Alex Rodoni
>            Priority: Major

--
This message was sent by Atlassian Jira (v8.3.2#803003)