ravindra-wagh commented on issue #68: Can we add an enhancement to merge update_theta_sketch?
URL: https://github.com/apache/incubator-datasketches-cpp/issues/68#issuecomment-552300890

Suppose we have data for two columns, **X** and **Y**, and we want to perform the three standard set operations on them: **union, intersection and difference**. The column data is large and is distributed among nodes for performance. Say we have 4 nodes, and each node creates its own _update_theta_sketch_ for the data it receives. Finally, the leader node combines all the sketches into one sketch.

**Column X computation:**

                      X
                      |
          --------------------------
          |       |        |       |
         sk1     sk2      sk3     sk4    ==> All are update_theta_sketch
          |       |        |       |
          --------------------------
                      |
                   merge()
                      |
            update_theta_sketch_X

**Column Y computation:**

                      Y
                      |
          --------------------------
          |       |        |       |
         sk1     sk2      sk3     sk4    ==> All are update_theta_sketch
          |       |        |       |
          --------------------------
                      |
                   merge()
                      |
            update_theta_sketch_Y

With the **update_theta_sketch_X** and **update_theta_sketch_Y** sketches, we can easily perform union, intersection and difference on them.

If we use **theta_union** instead of merge() to combine the per-node sketches, we get **theta_union_X** and **theta_union_Y**. With these we can perform only union operations directly, not intersection or difference. To perform intersection and difference, we first have to convert _theta_union_X_ and _theta_union_Y_ to **compact_theta_sketch_X** and **compact_theta_sketch_Y** respectively, and then apply intersection and difference to those.

If a **merge()** function were part of the **update_theta_sketch** class, we could easily perform all of these operations using the base class only.
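As a rough illustration of the data flow described in this comment, here is a self-contained toy model in C++. It does not use the datasketches-cpp API: `toy_sketch`, `hash63`, `merge_nodes`, `unite`, `intersection` and `a_not_b` are all hypothetical names, and `merge()` stands for the proposed operation, not an existing method. The sketch is a simplified KMV-style structure that retains the `k` smallest 63-bit hashes below a threshold `theta`.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <set>
#include <vector>

// Toy model only: this is NOT the datasketches-cpp API.
constexpr std::uint64_t MAX_THETA = std::uint64_t{1} << 63;

// splitmix64-style mixer, shifted to 63 bits so every hash is < MAX_THETA.
std::uint64_t hash63(std::uint64_t x) {
    x += 0x9e3779b97f4a7c15ULL;
    x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
    x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
    x ^= x >> 31;
    return x >> 1;
}

struct toy_sketch {
    std::size_t k = 4096;             // maximum number of retained hashes
    std::uint64_t theta = MAX_THETA;  // sampling threshold
    std::set<std::uint64_t> hashes;   // retained hashes, all < theta

    void update(std::uint64_t item) {
        const std::uint64_t h = hash63(item);
        if (h < theta) { hashes.insert(h); trim(); }
    }

    // Hypothetical merge(): fold another sketch into this one.
    void merge(const toy_sketch& other) {
        theta = std::min(theta, other.theta);
        hashes.erase(hashes.lower_bound(theta), hashes.end());
        for (std::uint64_t h : other.hashes)
            if (h < theta) hashes.insert(h);
        trim();
    }

    // Retained count divided by the sampled fraction of the hash space.
    double estimate() const {
        const double fraction =
            static_cast<double>(theta) / static_cast<double>(MAX_THETA);
        return static_cast<double>(hashes.size()) / fraction;
    }

    // Keep at most k hashes; each evicted maximum becomes the new theta.
    void trim() {
        while (hashes.size() > k) {
            auto last = std::prev(hashes.end());
            theta = *last;
            hashes.erase(last);
        }
    }
};

// Leader node: build one sketch per node partition, then merge them all.
toy_sketch merge_nodes(const std::vector<std::vector<std::uint64_t>>& parts) {
    toy_sketch leader;
    for (const auto& part : parts) {
        toy_sketch node;
        for (std::uint64_t v : part) node.update(v);
        leader.merge(node);
    }
    return leader;
}

toy_sketch unite(toy_sketch a, const toy_sketch& b) {
    a.merge(b);
    return a;
}

toy_sketch intersection(const toy_sketch& a, const toy_sketch& b) {
    toy_sketch r;
    r.k = a.k;
    r.theta = std::min(a.theta, b.theta);
    for (std::uint64_t h : a.hashes)
        if (h < r.theta && b.hashes.count(h) != 0) r.hashes.insert(h);
    return r;
}

toy_sketch a_not_b(const toy_sketch& a, const toy_sketch& b) {
    toy_sketch r;
    r.k = a.k;
    r.theta = std::min(a.theta, b.theta);
    for (std::uint64_t h : a.hashes)
        if (h < r.theta && b.hashes.count(h) == 0) r.hashes.insert(h);
    return r;
}
```

In this model, calling `merge_nodes` once per column yields two sketches of the same type, and `unite`, `intersection` and `a_not_b` all accept them directly. That is the convenience the comment is asking for. While fewer than `k` distinct items have been seen, `theta` stays at its maximum and the estimates are exact counts.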
