+1

Similar to Sketch, I would also point to ItemsSketch, which is used for Frequencies and for one of the Quantile sketches, as something that could fall under the code cleanup initiative.

On Tue, Sep 30, 2025 at 3:46 PM Jon Malkin <[email protected]> wrote:

> These all sound good to me
>
> On Mon, Sep 29, 2025, 6:34 PM Lee Rhodes <[email protected]> wrote:
>
>> The upcoming 9.0.0 release of the DataSketches-java library is providing
>> an opportunity to clean up and refactor some things in the existing code
>> that should make the code base easier to understand and use.
>>
>> Some of the code has been around for over 10 years and the library has
>> changed considerably since then. This note is an early heads-up into some
>> of the major changes coming to the library. A more complete summary of the
>> changes will be published at the time of release.
>>
>> - Java 25 is the first Java LTS release of the FFM API, and provides
>>   an amazing opportunity for us to obsolete the entire DataSketches-Memory
>>   repository. The DS-Memory repo has served us well for over 10 years by
>>   providing fast access to off-heap data structures. But FFM now provides
>>   that capability directly as part of the Java language. And, as a result,
>>   DS-Memory will be completely replaced with FFM in the DS-Java 9.0.0
>>   release.
>>
>> Our backward compatibility promise is primarily focused on the binary
>> compatibility of being able to read serializations of earlier versions of
>> the code with our latest versions. This is still true. However, up until
>> recently we have been supporting binary compatibility with very early code
>> versions at Yahoo that existed long before our code was first open-sourced
>> in August of 2015 (we became part of ASF in 2020). Since Yahoo is in the
>> process of obsoleting those systems that used some of that code there is no
>> reason that we need to keep supporting it.
>>
>> - With DS-Java 9.0.0, we plan to remove the complex code that allowed
>>   us to read that old code. If someone, for some reason, should need to
>>   read sketches serialized prior to August of 2015, it is still possible
>>   to do so with DS-java versions prior to 9.0.0, e.g., 8.0.0. Those
>>   sketches can then be re-serialized, which will be able to be read with
>>   DS-Java 9.0.0 and beyond.
>>
>> The Theta Sketch family was our first family of sketches that we
>> developed, and as a result it has some peculiarities. For example, the
>> root class is called "Sketch" (because it was the only one). As we
>> developed other sketch families, we made the root class names carry more
>> meaningful information, e.g., HllSketch, CpcSketch, KllSketch, etc.
>>
>> - With DS-Java 9.0.0, we will be bringing the Theta Sketch family up
>>   to the same standard as the other sketches, so the root class will be
>>   called ThetaSketch, and the names of its child classes will be named
>>   accordingly.
>>
>> The Theta Sketch family is one of the largest families of sketches with
>> many classes, many of which just "evolved" -- and unfortunately, with a
>> lot of duplication and incomplete coverage. It is hard to understand all
>> the relationships and where to find the function you might be looking for.
>>
>> - With DS-Java 9.0.0, we will try to remove much of the duplication
>>   and provide some better documentation to help you find your way around.
>>
>> We would certainly like to hear from you.
