AneeqYusuf opened a new issue #11439:
URL: https://github.com/apache/druid/issues/11439


   We use Druid as the OLAP store for our BI and Operations teams, with 
several Kafka CDC pipelines indexing data into Druid.
   When we run the following query to fetch all records that have no deleted_at 
value assigned, the output does not match the source system.
   ```
   select __time, id, deleted_at, max(updated_at) as updated_at
   from "issues.issues"
   where deleted_at = 0
   GROUP BY __time, id, deleted_at
   ```
   Several ids that have no deleted_at value in the source system are 
missing from the Druid result. For example, the following query returns 
no rows:
   ```
   select __time, id, deleted_at, max(updated_at) as updated_at
   from "issues.issues"
   where deleted_at = 0 and id = 'dad6e1c5-b9b5-4256-8fce-1cd84b035c71'
   GROUP BY __time, id, deleted_at
   ```
   However, if I remove deleted_at = 0 from the where clause, the original 
record is returned, and its deleted_at value is 0.
   
   I can only conclude that Druid SQL is not fetching all of the relevant 
records, but I am not sure whether this is a Calcite problem or something in 
Druid itself.
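
One way to narrow down whether the planner or the data is responsible is to look at the plan Druid produces for the failing query and at the type it reports for deleted_at. The sketch below only builds the JSON bodies that could be POSTed to the Druid SQL HTTP endpoint (`/druid/v2/sql` on a Broker or Router). The helper names are hypothetical, the table and column names are copied from the queries above, and nothing here is a confirmed diagnosis of this issue.

```python
import json

# Assumption: table and column names are taken from the queries in this report.
TABLE = "issues.issues"
COLUMN = "deleted_at"

# The query from the report that unexpectedly returns no rows.
FAILING_QUERY = (
    "select __time, id, deleted_at, max(updated_at) as updated_at "
    'from "issues.issues" '
    "where deleted_at = 0 and id = 'dad6e1c5-b9b5-4256-8fce-1cd84b035c71' "
    "GROUP BY __time, id, deleted_at"
)


def quote_literal(value: str) -> str:
    """Render a Python string as a SQL string literal (doubles single quotes)."""
    return "'" + value.replace("'", "''") + "'"


def explain_payload(sql: str) -> dict:
    """Hypothetical helper: wrap a query in EXPLAIN PLAN FOR.

    The response shows the plan produced for the SQL, which helps separate
    planning questions from data questions.
    """
    return {"query": "EXPLAIN PLAN FOR " + sql.strip().rstrip(";")}


def column_type_payload(table: str, column: str) -> dict:
    """Hypothetical helper: ask INFORMATION_SCHEMA for the reported column type."""
    return {
        "query": (
            "SELECT COLUMN_NAME, DATA_TYPE FROM INFORMATION_SCHEMA.COLUMNS "
            "WHERE TABLE_NAME = " + quote_literal(table) + " "
            "AND COLUMN_NAME = " + quote_literal(column)
        )
    }


if __name__ == "__main__":
    # Print the request bodies; sending them is left to the HTTP client of choice.
    for payload in (explain_payload(FAILING_QUERY), column_type_payload(TABLE, COLUMN)),:
        for body in payload:
            print(json.dumps(body, indent=2))
```

If the reported type of deleted_at is not numeric, or the plan shows the filter being applied differently than expected, that would point to where the comparison with 0 behaves differently from the source system.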
   
   Any help on this would be much appreciated.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


