+1

Similarly to Sketch, I would also point to ItemsSketch, which is the name
used for Frequencies and for one of the Quantiles sketches, as something that
could fall under the code cleanup initiative.



On Tue, Sep 30, 2025 at 3:46 PM Jon Malkin <[email protected]> wrote:

> These all sound good to me
>
> On Mon, Sep 29, 2025, 6:34 PM Lee Rhodes <[email protected]> wrote:
>
>> The upcoming 9.0.0 release of the DataSketches-java library is providing
>> an opportunity to clean up and refactor some things in the existing code
>> that should make the code base easier to understand and use.
>>
>> Some of the code has been around for over 10 years and the library has
>> changed considerably since then.  This note is an early heads-up about some
>> of the major changes coming to the library.  A more complete summary of the
>> changes will be published at the time of release.
>>
>>    - Java 25 is the first Java LTS release to include the finalized FFM
>>    (Foreign Function & Memory) API, and it provides an amazing opportunity
>>    for us to obsolete the entire DataSketches-Memory repository.  The
>>    DS-Memory repo has served us well for over 10 years by providing fast
>>    access to off-heap data structures.  But FFM now provides that
>>    capability directly as part of the Java platform. As a result,
>>    DS-Memory will be completely replaced with FFM in the DS-Java 9.0.0
>>    release.
>>
>> Our backward compatibility promise is primarily focused on the binary
>> compatibility of being able to read serializations of earlier versions of
>> the code with our latest versions. This is still true.  However, up until
>> recently we have been supporting binary compatibility with very early code
>> versions at Yahoo that existed long before our code was first open-sourced
>> in August of 2015 (we became part of ASF in 2020).  Since Yahoo is in the
>> process of obsoleting those systems that used some of that code, there is
>> no reason that we need to keep supporting it.
>>
>>    - With DS-Java 9.0.0, we plan to remove the complex code that allowed
>>    us to read sketches serialized by that old code.  If someone, for some
>>    reason, should need to read sketches serialized prior to August of 2015,
>>    it is still possible to do so with DS-Java versions prior to 9.0.0,
>>    e.g., 8.0.0. Those sketches can then be re-serialized, and the result
>>    can be read with DS-Java 9.0.0 and beyond.
>>
>> The Theta Sketch family was the first family of sketches that we
>> developed, and as a result it has some peculiarities.  For example, the
>> root class is called "Sketch" (because it was the only one).  As we
>> developed other sketch families, we made the root class names carry more
>> meaningful information, e.g., HllSketch, CpcSketch, KllSketch, etc.
>>
>>    - With DS-Java 9.0.0, we will be bringing the Theta Sketch family up
>>    to the same standard as the other sketches, so the root class will be
>>    called ThetaSketch, and its child classes will be renamed accordingly.
>>
>> The Theta Sketch family is one of the largest families of sketches, with
>> many classes, many of which just "evolved" and, unfortunately, ended up
>> with a lot of duplication and incomplete coverage. It is hard to understand
>> all the relationships and where to find the function you might be looking
>> for.
>>
>>    - With DS-Java 9.0.0, we will try to remove much of the duplication
>>    and provide some better documentation to help you find your way around.
>>
>> We would certainly like to hear from you.
>>
>>
>>
>>
>>
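
As a rough illustration of the off-heap access that the quoted note says FFM
now provides in the JDK, here is a minimal sketch that uses only
java.lang.foreign. The class and method names (FfmOffHeapExample, sumOffHeap)
are made up for this example and are not part of DataSketches; it does not
show how DS-Java 9.0.0 uses FFM internally. It assumes a JDK where the FFM API
is final (Java 22 or later).

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Hypothetical example class, not part of any DataSketches release.
public class FfmOffHeapExample {

  // Writes the values 0..n-1 as longs into off-heap memory,
  // then reads them back and returns their sum.
  static long sumOffHeap(int n) {
    // A confined arena releases its native memory when the try block exits.
    try (Arena arena = Arena.ofConfined()) {
      // Allocate n * 8 bytes off-heap, aligned for 8-byte long access.
      MemorySegment seg = arena.allocate((long) n * Long.BYTES, Long.BYTES);
      for (int i = 0; i < n; i++) {
        seg.setAtIndex(ValueLayout.JAVA_LONG, i, (long) i);
      }
      long sum = 0;
      for (int i = 0; i < n; i++) {
        sum += seg.getAtIndex(ValueLayout.JAVA_LONG, i);
      }
      return sum;
    }
  }

  public static void main(String[] args) {
    System.out.println(sumOffHeap(10)); // prints 45
  }
}
```

The confined arena is used here only because it gives deterministic release of
the native memory from a single thread; other arena kinds exist for shared or
GC-managed lifetimes.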
