ZacBlanco opened a new pull request, #554: URL: https://github.com/apache/datasketches-java/pull/554
When generating KllSketches, a system may need to know how much memory a particular sketch uses. For sketches over fixed-width types, the answer can be computed efficiently. With the KllItemsSketch this is more difficult, because the sketch supports variable-width item types such as String.

This commit adds implementation support to expose the `getTotalItemsNumBytes` method so that external systems can roughly track the memory utilization of a particular sketch. The change accomplishes this by intercepting the code paths where a new item is added to the items array, or where a new array is generated entirely.

This adds a slight overhead, because the sketch now needs to compute the size of its inputs. For fixed-width types the overhead is low: the computation is a single array-access lookup plus a multiplication by the type width. For strings, it requires a call to encode the string as UTF-8/16 before adding it to the array.
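The accounting idea described above can be illustrated with a minimal, self-contained sketch. This is not the code from the pull request: `ByteTrackingBuffer`, its `add` and `replaceAll` methods, and the `utf8Size` and `fixedWidthBytes` helpers are hypothetical names chosen for illustration, and only the method name `getTotalItemsNumBytes` is taken from the description. It assumes a running byte total is updated whenever an item is added and recomputed whenever the backing array is rebuilt.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntFunction;

/** Hypothetical illustration only; not the datasketches-java implementation. */
final class ByteTrackingBuffer<T> {
  private final List<T> items = new ArrayList<>();
  private final ToIntFunction<T> sizeOf; // size of one item in bytes (assumed sizing function)
  private long totalItemsNumBytes = 0;

  ByteTrackingBuffer(ToIntFunction<T> sizeOf) {
    this.sizeOf = sizeOf;
  }

  /** Intercepts a single insertion and adds the item's size to the running total. */
  void add(T item) {
    items.add(item);
    totalItemsNumBytes += sizeOf.applyAsInt(item);
  }

  /** Models the case where a new items array is generated: the total is recomputed. */
  void replaceAll(List<T> newItems) {
    items.clear();
    long total = 0;
    for (T item : newItems) {
      items.add(item);
      total += sizeOf.applyAsInt(item);
    }
    totalItemsNumBytes = total;
  }

  long getTotalItemsNumBytes() {
    return totalItemsNumBytes;
  }

  /** Variable-width case: the string must be encoded to learn its byte length. */
  static int utf8Size(String s) {
    return s.getBytes(StandardCharsets.UTF_8).length;
  }

  /** Fixed-width case: item count multiplied by the type width. */
  static long fixedWidthBytes(int numItems, int typeWidthBytes) {
    return (long) numItems * typeWidthBytes;
  }
}
```

With a sizing function such as `ByteTrackingBuffer::utf8Size`, each `add` pays for one string encoding, which is the string overhead the description mentions. The fixed-width helper needs only a count and a width.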
