+1 (BINDING)

- LICENSE looks good
- NOTICE looks good
- checked sha1 and gpg signature on all artifacts
- "mvn clean test" runs as expected
- "mvn clean install" runs as expected and produced the expected artifacts

Will Lauer
[email protected]

On Fri, Nov 21, 2025 at 8:42 PM Lee Rhodes <[email protected]> wrote:

> Hello Apache DataSketches PMC and Community,
>
> This is a call for vote to release Apache DataSketches-java candidate
> version: 9.0.0-RC1
>
> - This is the core Java component of the DataSketches library that
>   includes all the sketch algorithms in production-ready packages. These
>   sketches can be called directly from this component or used in conjunction
>   with the adaptor components such as Hadoop Pig, Hadoop Hive, or the
>   aggregator adaptors built into Apache Druid.
>
> Major changes with this release:
>
> This release is a major release where we took the opportunity to do some
> significant refactoring that will constitute incompatible changes from
> previous releases. Any incompatibility with prior releases is always an
> inconvenience to users who wish to just upgrade to the latest release and
> run. However, some of the code in this library was written in 2013, and
> the Java language has evolved enormously since then. We chose to
> use this major release as the opportunity to modernize some of the code to
> achieve the following goals:
>
> *Remove the dependency on the DataSketches-Memory component and use FFM
> instead.*
>
> - The DataSketches-Memory component was originally developed in 2014
>   to address the need for fast access to off-heap memory data structures and
>   used Unsafe and other JVM internals as there were no satisfactory Java
>   language features to do this at the time.
> - The FFM capabilities introduced into the language in Java 22 are
>   now part of the Java 25 LTS release, which we support. Since the
>   capabilities of FFM are a superset of the original DataSketches-Memory
>   component, it made sense to rewrite the code to eliminate the dependency on
>   DataSketches-Memory and use FFM instead. This impacted code across the
>   entire library.
> - This provided several advantages to the code base.
>   By removing this
>   dependency on DataSketches-Memory, there are now no runtime dependencies!
>   This should make integrating this library into other Java systems much
>   simpler. Since FFM is tightly integrated into the Java language, it has
>   improved performance, especially with bulk operations.
> - As an added note: There are numerous other improvements to the Java
>   language that we could perhaps take advantage of in a rewrite, e.g.,
>   Records, text blocks, switch expressions, sealed, var, modules, patterns,
>   etc. However, faced with the risk of accidentally creating bugs due to too
>   many changes at one time, we focused on FFM, which actually improved
>   performance as opposed to just creating syntactic sugar.
>
> *Align public sketch class names so that the sketch family name is part of
> the class name.*
>
> - For example, the Theta sketch family was the first family written
>   for the library and its base class was called *Sketch*. The Tuple
>   sketch family evolved soon after and its base class was also called
>   *Sketch*. If a user wanted to use both the Theta and Tuple families
>   in the same class, one of them had to be fully qualified every time it was
>   referenced.
> - Unfortunately, this style propagated to some of the other early
>   sketch families, where we ended up with two different sketch families with
>   an *ItemsSketch*, etc. For the more recent additions to the library we
>   started including the sketch family name in all the relevant sketch-like
>   public classes of a sketch family.
> - In this release we have refactored these older sketches with new
>   names that now include the sketch family name. This is an incompatible
>   change for user code moving from earlier releases, but this can be readily
>   fixed with search-and-replace tools. This release is not perfect, but
>   hopefully more consistent across all the different sketch families.
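
For anyone following the thread who has not used FFM yet, here is a minimal sketch of the kind of off-heap access the notes above describe. It uses only the standard java.lang.foreign API (final since Java 22). The class name FfmOffHeapSketch and the method doubleOffHeap are made up for illustration; they are not part of DataSketches, and this is not how the library itself is implemented.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.util.Arrays;

// Hypothetical demo class, not part of DataSketches.
public class FfmOffHeapSketch {

    /**
     * Copies the values into off-heap memory, doubles each one in place,
     * and returns the result as a new on-heap array.
     */
    public static long[] doubleOffHeap(long[] values) {
        long byteSize = (long) values.length * Long.BYTES;

        // A confined arena releases its off-heap memory when it is closed.
        try (Arena arena = Arena.ofConfined()) {
            // Allocate byteSize bytes off-heap, aligned for 8-byte longs.
            MemorySegment offHeap = arena.allocate(byteSize, Long.BYTES);

            // Bulk operation: copy the heap array into the off-heap segment.
            MemorySegment.copy(MemorySegment.ofArray(values), 0L, offHeap, 0L, byteSize);

            // Element-wise, bounds-checked access (no Unsafe involved).
            for (long i = 0; i < values.length; i++) {
                long v = offHeap.getAtIndex(ValueLayout.JAVA_LONG, i);
                offHeap.setAtIndex(ValueLayout.JAVA_LONG, i, 2 * v);
            }

            // Bulk operation: copy the segment contents back onto the heap.
            return offHeap.toArray(ValueLayout.JAVA_LONG);
        }
    }

    public static void main(String[] args) {
        long[] out = doubleOffHeap(new long[] {1L, 2L, 3L});
        System.out.println(Arrays.toString(out)); // prints [2, 4, 6]
    }
}
```

Everything here comes from the JDK, so a snippet like this needs no third-party runtime dependency.
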
>
> Known Issues:
>
> *SpotBugs*
>
> - Make sure you configure SpotBugs with the
>   /tools/FindBugsExcludeFilter.xml file. Otherwise, you may get a lot of
>   false-positive or low-risk issues that we have examined and eliminated with
>   this exclusion file.
>
> *Checkstyle*
>
> - At the time of this writing, Checkstyle had not been upgraded to
>   handle Java 25 features.
>
> References for this release:
>
> *Source repository:*
> https://github.com/apache/datasketches-java
>
> *Git Tag for this release:*
> https://github.com/apache/datasketches-java/releases/tag/9.0.0-RC1 on
> branch 9.0.X
>
> *Git HashId for this release starts with:*
> f3b334b on branch 9.0.X
>
> *The Release Candidate / Zip Repository:*
> https://dist.apache.org/repos/dist/dev/datasketches/java/9.0.0-RC1
>
> *The public signing key can be found in the KEYS file:*
> https://dist.apache.org/repos/dist/dev/datasketches/KEYS
>
> *The artifacts have been signed with --keyid-format SHORT:*
> 8CD4A902
>
> *Repository: Maven Central [Nexus](http://repository.apache.org) (Jar Artifacts):*
> https://repository.apache.org/content/groups/staging/org/apache/datasketches/datasketches-java/9.0.0/
>
> *Build & Test Guide:*
> https://github.com/apache/datasketches-java/blob/9.0.0-RC1/README.md
>
> The vote will be performed as follows:
> This letter will be published on dev@ and remain open for at least 72
> hours (excluding weekends and holidays), AND until at least 3 (+1) PMC
> votes or a majority of (+1) PMC votes are acquired. Anyone in the community
> can vote. This vote will close no earlier than Monday Dec 1, 2025, 6:00 PM
> PST.
>
> Please vote accordingly:
>
> [ ] +1 approve
> [ ] +0 no opinion
> [ ] -1 disapprove with the reason
>
> Thanks,
> Lee Rhodes
> [email protected]
