scheler opened a new issue, #17822:
URL: https://github.com/apache/druid/issues/17822

   Faulty/incorrect values noticed in FixedBucketsHistogram column after rollup.
   
   ### Affected Version
   
   30.0.1
   
   ### Description
   
   We are noticing some entries where the bucket counts array in the FixedBucketsHistogram column has incorrect values, which leads to incorrect percentile computations. The data is ingested via Kafka, and we verified that the faulty records are not coming from the source. Rollup is configured, and we have narrowed the problem down to the rollup step introducing the faulty records. However, it is not clear how to troubleshoot this further.
   
   For example, the bucket counts in the ingested records look like this:
   
   ```
   1741883320000: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0]
   1741883320000: [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0]
   1741883330000: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0]
   1741883330000: [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0]
   1741883330000: [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0]
   ```
   and after rollup, we see
   
   ```
   1741883330000: [3150, 15, 6, 64, 91, 346, 1602, 1752, 2063, 971, 594, 221, 145, 25, 97, 13, 32, 35, 16, 12, 1, 3, 45, 17, 0]
   ```
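   For reference, rolling up FixedBucketsHistogram rows should merge them by summing bucket counts elementwise, so the three 1741883330000 rows above should combine into a histogram with a total count of 3. Here is a minimal sketch of that expectation in plain Java (this is my own illustration of an elementwise sum, not Druid's actual merge code):
   
   ```java
   import java.util.Arrays;
   
   public class RollupCheck {
       // Elementwise sum of bucket-count arrays, mirroring what a rollup
       // merge of fixed-bucket histograms is expected to produce.
       static long[] merge(long[]... histograms) {
           long[] out = new long[histograms[0].length];
           for (long[] h : histograms) {
               for (int i = 0; i < h.length; i++) {
                   out[i] += h[i];
               }
           }
           return out;
       }
   
       public static void main(String[] args) {
           // The three ingested rows for timestamp 1741883330000:
           long[] a = new long[25]; a[10] = 1;
           long[] b = new long[25]; b[8] = 1;
           long[] c = new long[25]; c[9] = 1;
           long[] merged = merge(a, b, c);
           // Expected rollup result: one count each in buckets 8, 9, and 10
           // (total 3), nowhere near the thousands seen in the faulty row.
           System.out.println(Arrays.toString(merged));
       }
   }
   ```
   
   A quick check like this against the extracted rows makes it easy to quantify how far the rolled-up counts deviate from a plain elementwise sum.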
   
   Note that the entries above were extracted from the base64-encoded column values:
   
   ```
   FixedBucketsHistogram histogram = FixedBucketsHistogram.fromBase64(base64);
   System.out.println(Arrays.toString(histogram.getHistogram()));
   ```
   
   Rollup is configured with a 10-second query granularity:
   
   ```
         "granularitySpec": {
           "type": "uniform",
           "segmentGranularity": "HOUR",
           "queryGranularity": {
             "type": "duration",
             "duration": 10000,
             "origin": "1970-01-01T00:00:00.000Z"
           },
           "rollup": true,
           "intervals": []
         },
   ```
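   For context on how this spec groups rows: a `duration` query granularity of 10000 ms truncates each event timestamp to the start of its 10-second bucket (relative to the epoch origin), so only rows sharing the same truncated timestamp should be rolled up together. A small sketch of that truncation (the modulo arithmetic is my assumption about how duration granularity behaves, not Druid source code):
   
   ```java
   public class GranularityBucket {
       // Truncate a timestamp to the start of its bucket, assuming the
       // bucket boundaries are aligned to the epoch origin.
       static long bucketStart(long timestampMillis, long durationMillis) {
           return timestampMillis - (timestampMillis % durationMillis);
       }
   
       public static void main(String[] args) {
           // The two timestamps in the example fall into different buckets:
           System.out.println(bucketStart(1741883320123L, 10000L)); // 1741883320000
           System.out.println(bucketStart(1741883330000L, 10000L)); // 1741883330000
       }
   }
   ```
   
   So the faulty 1741883330000 row should only ever combine the rows at that exact 10-second bucket, which makes the inflated counts harder to explain.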
   
   These faulty records are rare, maybe 5-10 in a day, but the issue is that when they are included in a topN aggregation query they skew the results. We have been unable to exclude them in the query, so I would appreciate any ideas around that too.
   
   Any ideas on what could be causing this or how to troubleshoot this further?
   
   Please let me know if any additional information would be helpful. Thanks!
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

