As a user, I really love the matrix. The questions I was asking are mostly
answered by it!

May I know how the TODO[1] gets broken down into workable tasks?
How can contributors participate if they are willing to help?

[1] https://github.com/apache/incubator-datasketches-java/projects/1

Evans


leerho <lee...@gmail.com> wrote on Tue, Mar 24, 2020 at 8:40 AM:

> Folks, I hope everyone is safe and healthy during these challenging times!
>
> Some updates:
>
>    - The website Downloads
>    <https://datasketches.apache.org/docs/Community/Downloads.html> page
>    has been completely redesigned and automated.  When any of our components
>    is released to dist, a step in our Release Process
>    <https://dist.apache.org/repos/dist/dev/incubator/datasketches/scripts/APACHE_JAVA_RELEASE_STEPS.md>
>    runs a script that automatically updates the Downloads page with
>    the latest release versions.
>    - We have also added 3 new TODO lists for Java
>    <https://github.com/apache/incubator-datasketches-java/projects/1>, C++
>    <https://github.com/apache/incubator-datasketches-cpp/projects/1> and
>    the Website
>    <https://github.com/apache/incubator-datasketches-website/projects/1>.
>    These are brand new and will be filling up with tasks soon.
>    - There are a number of new additions to the website that should make
>    it easier for users to find the right sketches for their applications:
>       - Sketch Features Matrix
>       <https://datasketches.apache.org/docs/Architecture/SketchFeaturesMatrix.html>.
>       This provides in one view a comparison of the major features of the
>       different sketches and sketch families in the library.
>       - Features Matrix for Distinct Count Sketches
>       <https://datasketches.apache.org/docs/DistinctCountFeaturesMatrix.html>.
>       Our library has a wide variety of sketches for counting distinct values,
>       each with different capabilities and trade-offs for different
>       applications.  This matrix tries to remove some of the mystery by
>       highlighting the major differences between the various distinct counting
>       sketches.
>       - HLL vs CPC Figures of Merit
>       <https://datasketches.apache.org/docs/DistinctCountMeritComparisons.html>.
>       There is always a lot of interest in the Flajolet et al. HyperLogLog (HLL)
>       sketch.  Not only do we have leading implementations of the HLL sketch,
>       our team has also developed a new *Compressed Probabilistic Counting*
>       (CPC) sketch that outperforms the HLL sketch in terms of accuracy per
>       stored space. This new sketch is discussed briefly on our Research
>       <https://datasketches.apache.org/docs/Community/Research.html>
>       page, which also links to the theoretical paper
>       <https://arxiv.org/abs/1708.06839> that discusses the new
>       algorithm. There is also a new section in the Distinct Counting
>       part of the website documentation that discusses the CPC sketch along
>       with programming examples.
>       - Sketches by Component Repository
>       <https://datasketches.apache.org/docs/Architecture/SketchesByComponent.html>.
>       This new page organizes the library by the major repository components
>       and lists the sketches that are available in each of the components.
>       - Sketch Criteria for Library Inclusion
>       <https://datasketches.apache.org/docs/Architecture/SketchCriteria.html>.
>       For new contributors to the library, this page outlines our current
>       criteria for including new sketch algorithms into the library.
>
> As always, we look forward to your comments and suggestions!
>
> Cheers,
>
> Lee.
>
