ZacBlanco opened a new pull request, #554:
URL: https://github.com/apache/datasketches-java/pull/554

   When working with KllSketches, systems may need to estimate how much 
memory a particular sketch uses. For sketches with fixed-width types the 
answer can be computed efficiently.
   
   With the KllItemsSketch, this is more difficult because the sketch can 
hold variable-width types such as String. This commit implements and exposes 
the `getTotalItemsNumBytes` method so that external systems can roughly track 
the memory utilization of a particular sketch.
   
   The change accomplishes this by intercepting the code paths where a new 
item is added to the items array, or where a new array is generated entirely. 
This adds a slight overhead because the sketch now needs to compute the size 
of each input. For fixed-width types the overhead is low. For strings, it 
requires encoding the string as UTF-8/16 before adding it to the array.
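
   To illustrate the interception idea, here is a minimal standalone sketch. 
It is not the code in this pull request: the class `ByteTrackingBuffer`, its 
constructor, and the `utf8Length` helper are hypothetical names introduced 
only for this example, and UTF-8 is assumed as the encoding. Only the method 
name `getTotalItemsNumBytes` is taken from the description above.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.function.ToLongFunction;

// Hypothetical illustration only; this is not the KllItemsSketch implementation.
final class ByteTrackingBuffer<T> {
  private final Object[] items;
  private final ToLongFunction<T> sizeOf; // caller-supplied size function (assumption)
  private int count = 0;
  private long totalBytes = 0;

  ByteTrackingBuffer(final int capacity, final ToLongFunction<T> sizeOf) {
    this.items = new Object[capacity];
    this.sizeOf = sizeOf;
  }

  // Interception point 1: a single item is added to the items array.
  void add(final T item) {
    if (count == items.length) {
      throw new IllegalStateException("buffer full");
    }
    items[count++] = item;
    totalBytes += sizeOf.applyAsLong(item);
  }

  // Interception point 2: the contents are replaced wholesale,
  // so the running total is recomputed from scratch.
  void replaceAll(final List<T> newItems) {
    if (newItems.size() > items.length) {
      throw new IllegalArgumentException("too many items");
    }
    count = 0;
    totalBytes = 0;
    for (final T item : newItems) {
      add(item);
    }
  }

  long getTotalItemsNumBytes() {
    return totalBytes;
  }

  // For strings the size is only known after encoding; UTF-8 is assumed here.
  static long utf8Length(final String s) {
    return s.getBytes(StandardCharsets.UTF_8).length;
  }
}
```

   A caller could construct it as 
`new ByteTrackingBuffer<String>(4, ByteTrackingBuffer::utf8Length)`. The 
encoding call in `utf8Length` is the per-item cost that the paragraph above 
refers to for strings.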
   
   For fixed-width types, the calculation has little effective overhead 
because it is a single array lookup plus a multiplication by the type 
width.
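
   As a rough sketch of the fixed-width case, the example below assumes that 
the retained-item count can be read from an array of level offsets, so that 
the total is that count multiplied by the type width. The class 
`FixedWidthBytes`, the method `totalItemsNumBytes`, and the layout of the 
`levels` array are assumptions made for illustration and are not taken from 
this pull request.

```java
// Hypothetical illustration only; names and array layout are assumptions.
final class FixedWidthBytes {
  private FixedWidthBytes() { }

  // levels[i] is assumed to hold the start offset of level i in the items
  // array, with levels[numLevels] marking the end of the last level.
  static long totalItemsNumBytes(final int[] levels, final int numLevels,
      final int typeWidthBytes) {
    final int numRetained = levels[numLevels] - levels[0];
    return (long) numRetained * typeWidthBytes;
  }
}
```

   For example, with `levels = {2, 5, 8}`, `numLevels = 2` and 
`Double.BYTES` as the width, six retained items give 48 bytes. No per-item 
work is needed, which is why the overhead is small compared with the string 
case.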


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

