ravindra-wagh commented on issue #68: Can we add an enhancement to merge 
update_theta_sketch?
URL: 
https://github.com/apache/incubator-datasketches-cpp/issues/68#issuecomment-552300890
 
 
   Suppose we have data for two columns, **X** and **Y**, and we want to 
perform the three standard set operations on them: **union, intersection and 
difference**.
   The column data is large and is distributed across nodes for performance. 
Say we have 4 nodes, and each node creates its own _update_theta_sketch_ for 
the data it receives. Finally, the leader node combines all the sketches into 
one sketch.
   
   **Column X computation:**
   
                    X
                    |
        --------------------------
        |       |       |        |
        sk1    sk2     sk3      sk4   ==> All are update_theta_sketch
        |       |       |        |
         -------------------------
                   |
                 merge()
                   |
          update_theta_sketch_X
        
   **Column Y computation:**
   
                    Y
                    |
        --------------------------
        |       |       |        |
        sk1    sk2     sk3      sk4   ==> All are update_theta_sketch
        |       |       |        |
         -------------------------
                   |
                 merge()
                   |
          update_theta_sketch_Y
   Now with **update_theta_sketch_X** and **update_theta_sketch_Y** sketches, 
we can easily perform union, intersection and difference on them. 
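
   To make the intended flow concrete, here is a minimal sketch that uses 
exact `std::set<int>` values as a stand-in for theta sketches. The names 
`ToySketch`, `merge_sketches`, `union_of`, `intersection_of` and 
`difference_of` are hypothetical and are not part of datasketches-cpp; the 
point is only that the merged result has the same type as the per-node inputs, 
so every set operation applies to it directly.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <vector>

// Exact stand-in for a sketch: a sorted set of distinct values.
using ToySketch = std::set<int>;

// Hypothetical merge(): fold the per-node sketches into one sketch
// of the same type as the inputs.
ToySketch merge_sketches(const std::vector<ToySketch>& per_node) {
    ToySketch combined;
    for (const ToySketch& sk : per_node) {
        combined.insert(sk.begin(), sk.end());
    }
    return combined;
}

// Values present in a or b.
ToySketch union_of(const ToySketch& a, const ToySketch& b) {
    ToySketch out;
    std::set_union(a.begin(), a.end(), b.begin(), b.end(),
                   std::inserter(out, out.end()));
    return out;
}

// Values present in both a and b.
ToySketch intersection_of(const ToySketch& a, const ToySketch& b) {
    ToySketch out;
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                          std::inserter(out, out.end()));
    return out;
}

// Values present in a but not in b.
ToySketch difference_of(const ToySketch& a, const ToySketch& b) {
    ToySketch out;
    std::set_difference(a.begin(), a.end(), b.begin(), b.end(),
                        std::inserter(out, out.end()));
    return out;
}
```

   For example, if the four X nodes hold {1,2}, {2,3}, {3,4}, {4,5} and the 
four Y nodes hold {4,5}, {5,6}, {6,7}, {7,8}, then the merged X is {1,2,3,4,5}, 
the merged Y is {4,5,6,7,8}, their intersection is {4,5} and X minus Y is 
{1,2,3}. A real theta sketch would give estimates rather than exact sets.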
   If we instead use **theta_union** to combine the per-node sketches, we get 
**theta_union_X** and **theta_union_Y**. With these we can perform only union 
operations directly, not intersection or difference. To perform intersection 
and difference, we first have to convert _theta_union_X_ and _theta_union_Y_ 
to **compact_theta_sketch_X** and **compact_theta_sketch_Y** respectively, and 
then apply intersection and difference to those.
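
   The extra conversion step can be illustrated with a toy model, again using 
exact sets. `ToyUnion`, `ToyCompactSketch`, `intersect` and `a_not_b` are 
hypothetical stand-ins that only mimic the shape of the flow described above 
(a union object that must be turned into a compact result before other 
operations accept it); they are not the datasketches-cpp classes.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>

// Read-only result form. In this toy model, intersect() and a_not_b()
// accept only this type.
struct ToyCompactSketch {
    std::set<int> values;
};

// Toy union accumulator: it absorbs per-node data but is not itself
// usable as an input to intersect() or a_not_b().
class ToyUnion {
public:
    void update(const std::set<int>& node_values) {
        state_.insert(node_values.begin(), node_values.end());
    }

    // The extra conversion step: materialize the union as a compact result.
    ToyCompactSketch get_result() const { return ToyCompactSketch{state_}; }

private:
    std::set<int> state_;
};

// Values present in both a and b.
ToyCompactSketch intersect(const ToyCompactSketch& a, const ToyCompactSketch& b) {
    ToyCompactSketch out;
    std::set_intersection(a.values.begin(), a.values.end(),
                          b.values.begin(), b.values.end(),
                          std::inserter(out.values, out.values.end()));
    return out;
}

// Values present in a but not in b.
ToyCompactSketch a_not_b(const ToyCompactSketch& a, const ToyCompactSketch& b) {
    ToyCompactSketch out;
    std::set_difference(a.values.begin(), a.values.end(),
                        b.values.begin(), b.values.end(),
                        std::inserter(out.values, out.values.end()));
    return out;
}
```

   Here a caller has to write `intersect(union_x.get_result(), 
union_y.get_result())`; passing the `ToyUnion` objects themselves would not 
compile, which is the inconvenience in question.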
   If the **update_theta_sketch** class had a **merge()** function, we could 
easily perform all of these operations using the base class only.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
