Folks, I hope everyone is safe and healthy during these challenging times! Some updates:
- The website Downloads <https://datasketches.apache.org/docs/Community/Downloads.html> page has been completely redesigned and automated. Whenever one of our components is released to dist, our Release Process <https://dist.apache.org/repos/dist/dev/incubator/datasketches/scripts/APACHE_JAVA_RELEASE_STEPS.md> now includes a step that, just by running a script, automatically updates the Downloads page with the latest release versions.

- We have also added 3 new TODO lists for Java <https://github.com/apache/incubator-datasketches-java/projects/1>, C++ <https://github.com/apache/incubator-datasketches-cpp/projects/1> and the Website <https://github.com/apache/incubator-datasketches-website/projects/1>. These are brand new and will be filling up with tasks soon.

- There are a number of new additions to the website that should make it easier for users to find the right sketches for their applications:

  - Sketch Features Matrix <https://datasketches.apache.org/docs/Architecture/SketchFeaturesMatrix.html>. This provides, in one view, a comparison of the major features of the different sketches and sketch families in the library.

  - Features Matrix for Distinct Count Sketches <https://datasketches.apache.org/docs/DistinctCountFeaturesMatrix.html>. Our library has a wide variety of sketches for counting distinct values, each with different capabilities and trade-offs for different applications. This matrix tries to remove some of the mystery by highlighting the major differences between the various distinct counting sketches.

  - HLL vs CPC Figures of Merit <https://datasketches.apache.org/docs/DistinctCountMeritComparisons.html>. There is always a lot of interest in the Flajolet et al. HyperLogLog (HLL) sketch. Not only do we have leading implementations of the HLL sketch, but our team has also developed a new *Compressed Probabilistic Counting* (CPC) sketch that outperforms the HLL sketch in terms of accuracy per stored space. This new sketch is discussed briefly on our Research <https://datasketches.apache.org/docs/Community/Research.html> page, which also links to the theoretical paper <https://arxiv.org/abs/1708.06839> that discusses the new algorithm. There is also a new section in the Distinct Counting part of the website documentation that discusses the CPC sketch along with programming examples.

  - Sketches by Component Repository <https://datasketches.apache.org/docs/Architecture/SketchesByComponent.html>. This new page organizes the library by the major repository components and lists the sketches that are available in each of the components.

  - Sketch Criteria for Library Inclusion <https://datasketches.apache.org/docs/Architecture/SketchCriteria.html>. For new contributors to the library, this page outlines our current criteria for including new sketch algorithms in the library.

As always, we look forward to your comments and suggestions!

Cheers,
Lee.