jltqst27 opened a new issue #7444: Is there documentation on the limitations of 
intersection using theta sketch?
URL: https://github.com/apache/incubator-druid/issues/7444
 
 
   Not sure if this is the right place to ask, but I am testing set 
intersection in Druid using theta sketches, and I am having trouble getting 
accurate results after the second or third intersection.
   
   Below is some sample code that I used to experiment with theta sketches directly:
   
           UpdateSketch sketch1 = UpdateSketch.builder().setLogNominalEntries(26).build();
           UpdateSketch sketch2 = UpdateSketch.builder().setLogNominalEntries(26).build();
           UpdateSketch sketch3 = UpdateSketch.builder().setLogNominalEntries(26).build();
           UpdateSketch sketch4 = UpdateSketch.builder().setLogNominalEntries(26).build();
   
   
           Set<Integer> k1 = new HashSet<>();
           Set<Integer> k2 = new HashSet<>();
           Set<Integer> k3 = new HashSet<>();
           Set<Integer> k4 = new HashSet<>();
   
           Random rand = new Random();
   
           for (int key = 0; key < 1200000; key++) {
               int n1 = rand.nextInt(2000000);
               int n2 = rand.nextInt(2000000);
               int n3 = rand.nextInt(2000000);
               int n4 = rand.nextInt(2000000);
   
               sketch1.update(n1);
               sketch2.update(n2);
               sketch3.update(n3);
               sketch3.update(n4);
               k1.add(n1);
               k2.add(n2);
               k3.add(n3);
               k4.add(n4);
           }
   
           int count1_2 = 0;
           int count1_2_3 = 0;
           int count1_2_3_4 = 0;
           for(Integer k: k1) {
               if (k2.contains(k)) {
                   count1_2 += 1;
               }
   
               if (k2.contains(k) && k3.contains(k)) {
                   count1_2_3 += 1;
               }
   
               if (k2.contains(k) && k3.contains(k) && k4.contains(k)) {
                   count1_2_3_4 += 1;
               }
           }
   
           Intersection intersection = SetOperation.builder().buildIntersection();
           intersection.update(sketch1);
           intersection.update(sketch2);
           Sketch intersectionResult1_2 = intersection.getResult();
           intersection.update(sketch3);
           Sketch intersectionResult1_2_3 = intersection.getResult();
           intersection.update(sketch4);
           Sketch intersectionResult1_2_3_4 = intersection.getResult();
   
           System.out.println(count1_2);
           System.out.println(count1_2_3);
           System.out.println(count1_2_3_4);
           System.out.println(intersectionResult1_2.toString());
           System.out.println(intersectionResult1_2_3.toString());
           System.out.println(intersectionResult1_2_3_4.toString());
   
   
   I am seeing that the first intersection is very accurate, but the second 
and third intersections are not.
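
As a sanity baseline for the exact counts printed by the sample, the expected intersection sizes can be worked out in closed form, since each set is built from independent uniform draws. The sketch below is plain Java with no sketch library involved; the class and method names are hypothetical. For 1,200,000 draws from 2,000,000 values it gives roughly 407,000, 184,000 and 83,000 for the two-, three- and four-way intersections. One thing that may be worth ruling out: the loop in the sample calls `sketch3.update(n4)`, so `sketch3` appears to receive both the `n3` and `n4` streams while `sketch4` is never updated. The last line of `main` models that case, and the three-way value moves to roughly 285,000; intersecting with a sketch that was never updated would then be expected to give an empty result.

```java
/** Closed-form expectations for intersections of independent random sets (illustrative helper). */
final class ExpectedOverlap {

    /** Probability that a fixed value in [0, universe) shows up at least once in `draws` uniform draws. */
    static double hitProbability(long universe, long draws) {
        // 1 - (1 - 1/universe)^draws, written with log1p/expm1 for numerical stability
        return -Math.expm1(draws * Math.log1p(-1.0 / universe));
    }

    /** Expected number of values present in every one of the independently built sets. */
    static double expectedIntersection(long universe, long... drawsPerSet) {
        double p = 1.0;
        for (long draws : drawsPerSet) {
            p *= hitProbability(universe, draws);
        }
        return universe * p;
    }

    public static void main(String[] args) {
        long u = 2_000_000L;  // rand.nextInt(2000000) in the sample
        long m = 1_200_000L;  // loop iterations in the sample
        System.out.printf("1&2       : %.0f%n", expectedIntersection(u, m, m));
        System.out.printf("1&2&3     : %.0f%n", expectedIntersection(u, m, m, m));
        System.out.printf("1&2&3&4   : %.0f%n", expectedIntersection(u, m, m, m, m));
        // Assumed scenario: the third sketch is fed two streams (2*m draws)
        System.out.printf("1&2&(3+4) : %.0f%n", expectedIntersection(u, m, m, 2 * m));
    }
}
```

Comparing these figures with both the exact counts and the sketch estimates helps separate a wiring problem in the test from genuine estimation error.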
   
   Is there any documentation on what accuracy we can expect from 
intersections?
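
For intuition about where intersection error comes from in general, here is a toy K-minimum-values sketch in plain Java. It is a minimal illustration under stated assumptions, not the DataSketches implementation: `ToyThetaSketch`, its hash function and its methods are all hypothetical. A sketch keeps only the `k` smallest hashes; an intersection can only count the retained hashes the two sketches have in common, so the smaller the true intersection is relative to the inputs, the fewer samples support the estimate and the larger its relative error. With `setLogNominalEntries(26)` (2^26 nominal entries) and about a million distinct values per sketch, the real sketches would be expected to still be in exact mode, so this effect is probably not what the sample is hitting.

```java
import java.util.TreeSet;

/** Toy K-minimum-values sketch; illustrative only, not the DataSketches implementation. */
final class ToyThetaSketch {
    final int k;                                   // nominal number of retained hashes
    final TreeSet<Long> hashes = new TreeSet<>();  // retained hashes, all strictly below theta
    long theta = Long.MAX_VALUE;                   // sampling threshold; MAX_VALUE means exact mode

    ToyThetaSketch(int k) { this.k = k; }

    /** splitmix64-style mixer, reduced to a non-negative 63-bit value. */
    static long hash(long x) {
        x += 0x9E3779B97F4A7C15L;
        x = (x ^ (x >>> 30)) * 0xBF58476D1CE4E5B9L;
        x = (x ^ (x >>> 27)) * 0x94D049BB133111EBL;
        x ^= (x >>> 31);
        return x >>> 1;
    }

    void update(long item) {
        long h = hash(item);
        if (h >= theta) return;          // outside the sampled region
        hashes.add(h);                   // duplicates are ignored by the set
        if (hashes.size() > k) {
            theta = hashes.pollLast();   // the (k+1)-th smallest hash becomes the threshold
        }
    }

    /** Fraction of the hash space that lies below the given threshold. */
    static double fraction(long theta) { return (double) theta / Long.MAX_VALUE; }

    double estimate() { return hashes.size() / fraction(theta); }

    /** Estimates |A and B|: count common hashes below the smaller theta, then scale up. */
    static double intersectionEstimate(ToyThetaSketch a, ToyThetaSketch b) {
        long t = Math.min(a.theta, b.theta);
        int common = 0;
        for (long h : a.hashes) {
            if (h < t && b.hashes.contains(h)) common++;
        }
        return common / fraction(t);
    }

    public static void main(String[] args) {
        ToyThetaSketch a = new ToyThetaSketch(4096);
        ToyThetaSketch big = new ToyThetaSketch(4096);    // true overlap with a: 50,000
        ToyThetaSketch small = new ToyThetaSketch(4096);  // true overlap with a: 1,000
        for (long i = 0; i < 100_000; i++) a.update(i);
        for (long i = 50_000; i < 150_000; i++) big.update(i);
        for (long i = 99_000; i < 199_000; i++) small.update(i);
        System.out.println("overlap 50000 -> " + intersectionEstimate(a, big));
        System.out.println("overlap 1000  -> " + intersectionEstimate(a, small));
    }
}
```

In this toy setup the 1,000-element overlap rests on only a few dozen common samples, so its estimate would be expected to be noticeably noisier in relative terms than the 50,000-element one.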
   
   
   
