patelprateek commented on issue #414: URL: https://github.com/apache/datasketches-java/issues/414#issuecomment-1252631189
Sorry for not being clear. I am trying to create theta sketches for my dataset across the various dimensions that are used for indexing. For certain filtering queries I need to take set intersections to get a cardinality estimate. AFAIK theta sketches are the best fit for set intersection. The cardinalities of these sets vary widely, ranging from 10 to 10 million. Intersecting two theta sketches whose cardinalities differ greatly returns very imprecise answers. Similarly, repeated intersections cause the error to grow. I have also read in some blogs online that this is a known issue with theta sketches. I wanted to get some insights on possible workarounds, or maybe on some other sketch I could look into.
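
To make the effect concrete, here is a minimal, self-contained Java sketch of what I think is happening. It is a toy KMV-style sketch written only for illustration, not the datasketches-java implementation: `ToySketch`, `mix`, `intersectionRetained`, `intersectionEstimate`, the value `k = 4096`, and the set sizes (1,000,000 and 100) are all assumptions made up for this example. The assumption being illustrated is that an intersection keeps only hashes below the smaller of the two thresholds.

```java
import java.util.TreeSet;

// Toy illustration only: NOT the datasketches-java implementation.
class ThetaIntersectionDemo {

    // Hypothetical KMV-style sketch: retains the k smallest 63-bit hashes.
    static final class ToySketch {
        final int k;
        final TreeSet<Long> hashes = new TreeSet<>();
        long thetaLong = Long.MAX_VALUE; // threshold; theta = thetaLong / 2^63

        ToySketch(int k) { this.k = k; }

        void update(long item) {
            long h = mix(item) >>> 1;          // non-negative hash
            if (h >= thetaLong) return;        // above threshold: not sampled
            hashes.add(h);
            if (hashes.size() > k) {
                thetaLong = hashes.pollLast(); // evict largest, lower theta to it
            }
        }

        double theta() { return thetaLong / (double) Long.MAX_VALUE; }

        // Retained entries scaled up by the sampling fraction.
        double estimate() { return hashes.size() / theta(); }
    }

    // SplitMix64 finalizer, used here as a stand-in hash function.
    static long mix(long z) {
        z += 0x9E3779B97F4A7C15L;
        z = (z ^ (z >>> 30)) * 0xBF58476D1CE4E5B9L;
        z = (z ^ (z >>> 27)) * 0x94D049BB133111EBL;
        return z ^ (z >>> 31);
    }

    // Entries of the intersection: hashes present in both sketches and
    // below the smaller of the two thresholds.
    static int intersectionRetained(ToySketch a, ToySketch b) {
        long theta = Math.min(a.thetaLong, b.thetaLong);
        int matches = 0;
        for (long h : a.hashes) {
            if (h < theta && b.hashes.contains(h)) matches++;
        }
        return matches;
    }

    static double intersectionEstimate(ToySketch a, ToySketch b) {
        double theta = Math.min(a.thetaLong, b.thetaLong) / (double) Long.MAX_VALUE;
        return intersectionRetained(a, b) / theta;
    }

    public static void main(String[] args) {
        int k = 4096;
        ToySketch large = new ToySketch(k);
        ToySketch small = new ToySketch(k);
        for (long i = 0; i < 1_000_000; i++) large.update(i);
        for (long i = 0; i < 100; i++) small.update(i); // subset of large
        System.out.println("large estimate: " + large.estimate());
        System.out.println("small estimate: " + small.estimate());
        System.out.println("intersection entries: " + intersectionRetained(large, small));
        System.out.println("intersection estimate (true value 100): "
                + intersectionEstimate(large, small));
    }
}
```

In this toy setup the intersection inherits the large sketch's threshold (roughly k / 1,000,000), so on average fewer than one of the small set's 100 hashes survives, and the estimate can only move in steps of about 1/theta. That matches the imprecision I am seeing when the cardinality difference is large.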
