[
https://issues.apache.org/jira/browse/CALCITE-4292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17419035#comment-17419035
]
Stamatis Zampetakis commented on CALCITE-4292:
----------------------------------------------
Hi [~shlok7296], from the example you provided the current behavior is the
correct one.
From an SQL perspective if the name field does not appear in the data it is
considered {{NULL}}. The SQL semantics indicate that {{NULL <> "NMAX"}}
evaluates to {{UNKNOWN}} and since this condition appears in the {{WHERE}}
clause {{UNKNOWN}} values are treated as {{FALSE}}. So the row with
{{"_id":"02401"}} should not be part of the result since name is {{NULL}}.
I am inclined to close this JIRA as not a problem. Do you agree [~shlok7296]?
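
The three-valued logic described in this comment can be checked with a small, self-contained sketch. It uses Python's built-in {{sqlite3}} module as a stand-in SQL engine (an assumption made purely for illustration; it is not Calcite or the Elasticsearch adapter), loads the three rows from the issue, and stores the missing name as {{NULL}}. The string literal is written with single quotes, as standard SQL expects.

```python
import sqlite3

# In-memory database standing in for the "zips" index (illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE zips (_id TEXT PRIMARY KEY, name TEXT, pop INTEGER)")
conn.executemany(
    "INSERT INTO zips (_id, name, pop) VALUES (?, ?, ?)",
    [
        ("01701", "NMAX", 65046),
        ("02154", "NORTH WALTHAM", 57871),
        ("02401", None, 59498),  # document has no name field, so name is NULL
    ],
)


def ids(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]


# NULL <> 'NMAX' is UNKNOWN, and WHERE keeps only rows that evaluate to TRUE.
not_equal = ids("SELECT _id FROM zips WHERE name <> 'NMAX' ORDER BY _id")

# Rows with a missing name must be requested explicitly.
null_aware = ids(
    "SELECT _id FROM zips WHERE name <> 'NMAX' OR name IS NULL ORDER BY _id"
)

# SQLite reports UNKNOWN as NULL, which Python sees as None.
comparison = conn.execute("SELECT NULL <> 'NMAX'").fetchone()[0]

print(not_equal, null_aware, comparison)
```

If rows with a missing name are wanted, the query has to say so explicitly with {{OR name IS NULL}}, as the second query in the sketch does.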
> Wrong results in ElasticSearch when query contains NOT EQUAL
> ------------------------------------------------------------
>
> Key: CALCITE-4292
> URL: https://issues.apache.org/jira/browse/CALCITE-4292
> Project: Calcite
> Issue Type: Bug
> Components: elasticsearch-adapter
> Reporter: Shlok Srivastava
> Assignee: Bill Neil
> Priority: Major
> Labels: ElasticSearch, NotEquals, QueryBuilder, calcite,
> pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> *Data:*
> {noformat}
> { "_id" : "01701", "name" : "NMAX", "loc" : [ -71.42548600000001, 42.300665
> ], "pop" : 65046, "state" : "MA" }
> { "_id" : "02154", "name" : "NORTH WALTHAM", "loc" : [ -71.236497, 42.382492
> ], "pop" : 57871, "state" : "MA" }
> { "_id" : "02401", "loc" : [ -71.03434799999999, 42.081571 ], "pop" : 59498,
> "state" : "MA" }
> {noformat}
>
> *Query:*
> {code:java}
> SELECT * FROM zips WHERE name <> "NMAX"{code}
>
> *Expected result:*
> {noformat}
> { "_id" : "02154", "name" : "NORTH WALTHAM", "loc" : [ -71.236497, 42.382492
> ], "pop" : 57871, "state" : "MA" }
> { "_id" : "02401", "loc" : [ -71.03434799999999, 42.081571 ], "pop" : 59498,
> "state" : "MA" }
> {noformat}
>
> *Current Result:*
> {noformat}
> { "_id" : "02154", "name" : "NORTH WALTHAM", "loc" : [ -71.236497, 42.382492
> ], "pop" : 57871, "state" : "MA" }
> {noformat}
> RelNode for the same query:
> {code:java}
> relB.not(relB.equals(relB.literal("Name"), relB.literal("NMQAX"))){code}
>
> The elasticsearch query formed for above RelNode is this :
> {code:java}
> {
>   "query": {
>     "constant_score": {
>       "filter": {
>         "bool": {
>           "must": {
>             "exists": {
>               "field": "Name"
>             }
>           },
>           "must_not": {
>             "term": {
>               "Name": "NMQAX"
>             }
>           }
>         }
>       }
>     }
>   }
> }
> {code}
> *Problem*: The above query ignores documents which do not have the _Name_
> field. Elasticsearch would ordinarily include them, but they are dropped
> because of the must exists condition.
> *Solution*: Remove the exists condition from the not-equals query expression.
> Elasticsearch doesn't add this condition itself, so removing it keeps the
> queries in sync.
> [Code|https://github.com/apache/calcite/blob/1050b36cafbb0c487b7a2ade3efd12850609717e/elasticsearch/src/main/java/org/apache/calcite/adapter/elasticsearch/PredicateAnalyzer.java#L782]
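
The effect of the {{exists}} clause discussed above can be sketched with a toy, in-memory model of an Elasticsearch {{bool}} filter. The helpers {{exists}}, {{term}} and {{bool_filter}} below are hypothetical and only approximate the real query semantics (exact match on a single-valued field, no analyzers or scoring). The sketch uses the field name and value from the sample data ({{name}}, {{NMAX}}) rather than the {{Name}} / {{NMQAX}} spelling in the generated query.

```python
# Toy model of an Elasticsearch bool filter (illustration only, not the real engine).
docs = [
    {"_id": "01701", "name": "NMAX", "pop": 65046},
    {"_id": "02154", "name": "NORTH WALTHAM", "pop": 57871},
    {"_id": "02401", "pop": 59498},  # no "name" field
]


def exists(field):
    """Match documents that carry a non-null value for the field."""
    return lambda doc: doc.get(field) is not None


def term(field, value):
    """Match documents whose field equals the value exactly."""
    return lambda doc: doc.get(field) == value


def bool_filter(must=(), must_not=()):
    """Match when every 'must' clause matches and no 'must_not' clause matches."""
    return lambda doc: all(c(doc) for c in must) and not any(c(doc) for c in must_not)


def run(query):
    """Return the ids of all documents accepted by the query."""
    return [doc["_id"] for doc in docs if query(doc)]


# Shape of the query currently generated by the adapter: exists + must_not term.
with_exists = run(bool_filter(must=[exists("name")], must_not=[term("name", "NMAX")]))

# Shape proposed in the issue: must_not term only.
without_exists = run(bool_filter(must_not=[term("name", "NMAX")]))

print(with_exists)
print(without_exists)
```

Under this simplified model, the variant with {{exists}} returns only {{02154}}, which matches the SQL treatment of {{NULL}}. Dropping the clause additionally returns {{02401}}, which is the native Elasticsearch behavior the reporter expected.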