a2l007 commented on issue #7741: Indexing tasks containing thetaSketches results in incorrect sketch values
URL: https://github.com/apache/incubator-druid/issues/7741#issuecomment-521781933

@somanullah What is the sketch size used in your case? I've managed to reproduce this issue internally but don't have a fix yet.

@leventov Could you give some pointers on how to debug this issue, since it seems to be entirely based on the rewrite of the index merging code via #5335?

The gist of the issue is that the combined values of theta sketches after index merging with the `RowPointer` approach differ from the combined values with `RowBoat` merging. The mismatch happens only with a fairly large input data file. For example, the test case I have currently has close to 110,000 events with a single thetaSketch metric. Querying the indices merged with the `RowBoat` technique gives a sketch value of 1756, while querying those merged with the `RowPointer` technique gives 1746. The difference widens as the indices being merged get larger.

I don't think this issue is related to sketch merge unpredictability: the sketch size and the data set are the same in both cases, so the results should be consistent.
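
To make the expectation above concrete, here is a minimal sketch of why merging with a fixed sketch size over a fixed data set would be expected to give the same value regardless of merge order. It uses a toy KMV (k minimum values) sketch, which is an assumption for illustration only: it is not the DataSketches theta sketch that Druid uses, whose union behavior may differ. The names `hash01`, `build`, `union`, `estimate`, and `merge_all` are hypothetical helpers, not Druid or DataSketches APIs.

```python
import hashlib
import heapq
import random
from functools import reduce


def hash01(value: str) -> float:
    """Map a value to a stable pseudo-uniform float in [0, 1)."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def build(values, k):
    """Toy KMV sketch: keep the k smallest distinct hash values."""
    return frozenset(heapq.nsmallest(k, {hash01(v) for v in values}))


def union(a, b, k):
    """Merge two toy sketches by keeping the k smallest hashes of both."""
    return frozenset(heapq.nsmallest(k, a | b))


def estimate(sketch, k):
    """Standard KMV estimate: (k - 1) / (k-th smallest hash)."""
    if len(sketch) < k:
        return float(len(sketch))  # fewer than k distinct values: exact count
    return (k - 1) / max(sketch)


def merge_all(sketches, k):
    """Fold a list of sketches into one, left to right."""
    return reduce(lambda a, b: union(a, b, k), sketches)


if __name__ == "__main__":
    k = 256
    values = [f"user-{i}" for i in range(20000)]
    # Split the data into 8 partial "indices", each with its own sketch.
    parts = [build(values[i::8], k) for i in range(8)]

    shuffled = parts[:]
    random.Random(7).shuffle(shuffled)

    in_order = merge_all(parts, k)
    reordered = merge_all(shuffled, k)
    print("same sketch in either merge order:", in_order == reordered)
    print("estimate:", estimate(in_order, k))
```

In this toy model the merged sketch depends only on the set of input values and on `k`, so two merge strategies that see the same rows would be expected to agree. A difference like 1756 versus 1746 would then suggest that the two merge paths are not combining the same set of sketches.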
