albertogpz commented on pull request #6200:
URL: https://github.com/apache/geode/pull/6200#issuecomment-814286023


   @agingade 
   Thanks for your comments. Let me answer your questions inline.
   
   > @albertogpz
   > Thanks for your contributions to make the query engine robust. Sorry for 
delay in responding...
   > 
   > Here is what I believe should be happening.
   > 
   >     * The result for a query should be consistent in both using index or 
non-index case.
   > 
   This is what I tried to fix with this PR. The new test cases show that
   results are now the same with and without indexes, whereas prior to the PR
   they were different.
   
   >     * The query engine returns UNDEFINED when it is unable to find the 
next level field.
   >       E.g: if address.city and address is null (This is documented).
   >       This is not same when you are looking for a non existing "key" in 
the map; UNDEFINED needs to be returned when positions is null and query is 
trying to access field from it.
   >       E.g.: positions['*']  should be returning UNDEFINED.
   So, do you mean that if positions is not null but only contains a mapping
   for the "SUN" key, positions["ERIC"] should return null instead of UNDEFINED?
   
   > 
   >     * Query engine supports heterogenous objects stored in a region.
   >       E.g: Employee or Customer.
   >       Inline with supporting this, its designed/architected such that if a 
field is not found in the object it will be ignored.
   >       E.g query with employeeID is not going to return customer objects 
unless it has that field.
   > 
   ok. That should not have been changed.
   
   >     * To be inline with the above design (query expectation), when a map 
field is not present available it should ignore that entry/object from adding 
to the result.
   >       E.g. if positions['SUN'] if SUN key is not present query should 
ignore that object.
   >       This is also different from null check.
   >       If there is a SUN key with null value it should be returned for 
queries looking for null value. And non null check will return if the key is 
there and its value is not null.
   Currently, if you have one entry whose positions map contains
   {"SUN" => null} and another entry whose positions map contains
   {"ERICSSON" => "3"}, querying for positions["SUN"] = null returns both
   entries.
   With my PR, only the first one would be returned. Which is the right
   behavior?
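   
   To make the two readings concrete, here is a minimal sketch in plain Java.
   It uses only java.util.Map and is not the Geode query engine; the class
   name, method names and entry names ("entry1", "entry2") are made up for
   illustration. The first method treats a missing key like a null value, the
   second requires the key to be present with a null value.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: models the two candidate semantics of
// positions["SUN"] = null with plain maps, not with Geode classes.
public class MapNullSemantics {

    // Reading 1: compare whatever get() returns with null.
    // A missing key and a key mapped to null are indistinguishable here.
    public static List<String> matchValueIsNull(
            Map<String, Map<String, String>> entries, String key) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> e : entries.entrySet()) {
            if (e.getValue().get(key) == null) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    // Reading 2: the key must be present AND mapped to null.
    // Entries that lack the key are skipped.
    public static List<String> matchKeyPresentAndNull(
            Map<String, Map<String, String>> entries, String key) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> e : entries.entrySet()) {
            Map<String, String> positions = e.getValue();
            if (positions.containsKey(key) && positions.get(key) == null) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    // The two entries from the example above.
    public static Map<String, Map<String, String>> sampleEntries() {
        Map<String, String> first = new HashMap<>();
        first.put("SUN", null);          // key present, value null
        Map<String, String> second = new HashMap<>();
        second.put("ERICSSON", "3");     // no "SUN" key at all

        Map<String, Map<String, String>> entries = new LinkedHashMap<>();
        entries.put("entry1", first);
        entries.put("entry2", second);
        return entries;
    }

    public static void main(String[] args) {
        System.out.println(matchValueIsNull(sampleEntries(), "SUN"));       // [entry1, entry2]
        System.out.println(matchKeyPresentAndNull(sampleEntries(), "SUN")); // [entry1]
    }
}
```

   The first reading corresponds to the current result (both entries), the
   second to the result with my PR (only the first entry).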
   
   
   >       Try the query with non map field and the behavior should be same.
   > 
   > 
   > Please let me know if you have any questions on the expected behavior.
   > 
   > The overall behavior of the query on map should be in-line with non-map 
fields.
   > 
   > The usage of index should not be avoided; unless the results are 
inconsistent with non-index query results. Instead of avoiding/blocking the use 
of index, it will be good to address the issues with indexed queries and make 
the behavior consistent.
   > 
   I tried to support the use of indexes in all queries with my PR, but I
   found no way to do it for != queries that use an index of type positions[*]
   (index on all keys) or positions["SUN", "ERICSSON"].
   For the second type of index, I have an alternative solution in a draft PR
   that allows the index to be used, although it costs more memory (see
   https://github.com/apache/geode/pull/6238).
   
   > Can you confirm above requirements are met in this PR.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

