The upcoming 9.0.0 release of the DataSketches-java library gives us an
opportunity to clean up and refactor parts of the existing code, which
should make the code base easier to understand and use.

Some of the code has been around for over 10 years and the library has
changed considerably since then.  This note is an early heads-up about some
of the major changes coming to the library.  A more complete summary of the
changes will be published at the time of release.

   - Java 25 is the first Java LTS release to include the finalized Foreign
   Function & Memory (FFM) API, and it provides an amazing opportunity for us
   to retire the entire DataSketches-Memory repository.  The DS-Memory repo
   has served us well for over 10 years by providing fast access to off-heap
   data structures.  But FFM now provides that capability directly as part
   of the Java platform.  As a result, DS-Memory will be completely replaced
   with FFM in the DS-Java 9.0.0 release.
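
For readers who have not used FFM yet, here is a minimal sketch of off-heap
access using only the standard java.lang.foreign API (final since Java 22).
The class and method names are hypothetical and are not part of DataSketches,
and this does not show how DS-Java 9.0.0 will use FFM internally.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Hypothetical example class; not part of DataSketches.
final class FfmOffHeapExample {

  // Writes the values 0..count-1 into off-heap memory as 64-bit longs,
  // reads them back, and returns their sum.
  static long sumOffHeap(final int count) {
    // A confined arena frees its native memory when the try block exits.
    try (Arena arena = Arena.ofConfined()) {
      // Allocate count * 8 bytes of native memory, aligned for long access.
      final MemorySegment seg =
          arena.allocate((long) count * Long.BYTES, Long.BYTES);
      for (int i = 0; i < count; i++) {
        seg.setAtIndex(ValueLayout.JAVA_LONG, i, (long) i);
      }
      long sum = 0;
      for (int i = 0; i < count; i++) {
        sum += seg.getAtIndex(ValueLayout.JAVA_LONG, i);
      }
      return sum;
    }
  }
}
```

Calling FfmOffHeapExample.sumOffHeap(10) allocates 80 bytes off-heap, fills
them, and releases the memory deterministically when the arena closes. This
is roughly the kind of access that DS-Memory used to provide.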

Our backward compatibility promise is primarily focused on binary
compatibility: the latest versions of the code can read sketches serialized
by earlier versions.  This is still true.  However, until recently we have
also been supporting binary compatibility with very early code versions at
Yahoo that existed long before our code was first open-sourced in August of
2015 (we became part of ASF in 2020).  Since Yahoo is in the process of
retiring the systems that used some of that code, there is no reason for us
to keep supporting it.

   - With DS-Java 9.0.0, we plan to remove the complex code that allowed us
   to read those old serialization formats.  If someone, for some reason,
   should need to read sketches serialized prior to August of 2015, it is
   still possible to do so with DS-Java versions prior to 9.0.0, e.g., 8.0.0.
   Those sketches can then be re-serialized, and the result can be read with
   DS-Java 9.0.0 and beyond.

The Theta Sketch family was the first family of sketches that we developed,
and as a result it has some peculiarities.  For example, the root class is
called "Sketch" (because it was the only one).  As we developed other
sketch families, we made the root class names carry more meaningful
information, e.g., HllSketch, CpcSketch, KllSketch, etc.

   - With DS-Java 9.0.0, we will be bringing the Theta Sketch family up to
   the same standard as the other sketches, so the root class will be called
   ThetaSketch, and its child classes will be renamed accordingly.

The Theta Sketch family is one of the largest families of sketches, with
many classes, many of which just "evolved" -- unfortunately with a lot of
duplication and incomplete coverage.  It is hard to understand all the
relationships and to know where to find the function you might be looking
for.

   - With DS-Java 9.0.0, we will try to remove much of the duplication and
   provide some better documentation to help you find your way around.

We would certainly like to hear from you.
