proost opened a new issue, #96:
URL: https://github.com/apache/datasketches-go/issues/96

   # Summary
   
   Add Go version t-digest.
   
   # Design
   
   ## Double
   
   Add a double-precision t-digest. Go has no equivalent of C++ template 
metaprogramming, so pairing the value and weight types, (float32, uint32) and 
(float64, uint64), is difficult. For now, I will add double precision only.
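   
   To illustrate the pairing problem, here is a minimal sketch. It assumes a 
hypothetical generic `centroid` type (not part of this proposal) with one type 
parameter for the mean and one for the weight. Go generics cannot express that 
the weight type must follow from the value type, so a mismatched pair still 
compiles:
   
   ```go
   package main

   import "fmt"

   // centroid is a hypothetical generic centroid, for illustration only.
   // Nothing in the constraints ties F to W.
   type centroid[F float32 | float64, W uint32 | uint64] struct {
       mean   F
       weight W
   }

   func main() {
       // The intended pairing.
       ok := centroid[float64, uint64]{mean: 1.5, weight: 2}
       // An unintended pairing that the compiler accepts as well.
       mismatched := centroid[float32, uint64]{mean: 1.5, weight: 2}
       fmt.Println(ok, mismatched)
   }
   ```
   
   A concrete `Double` type avoids this by fixing the pair to (float64, uint64).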
   
   ```go
   // Double provides an implementation of t-Digest double precision type for
   // estimating quantiles and ranks.
   // This implementation is based on the paper:
   // Ted Dunning, Otmar Ertl. "Extremely Accurate Quantiles Using t-Digests"
   // and the reference implementation: https://github.com/tdunning/t-digest
   // NOTE: This implementation is similar to MergingDigest in the above Java
   // implementation.
   type Double struct {
        min               float64
        max               float64
        centroids         []doublePrecisionCentroid
        buffer            []float64
        centroidsWeight   uint64
        centroidsCapacity int
        k                 uint16
        reverseMerge      bool
   }
   
   type doublePrecisionCentroid struct {
        mean   float64
        weight uint64
   }
   
   func (c *doublePrecisionCentroid) add(other doublePrecisionCentroid) {
        c.weight += other.weight
        c.mean += (other.mean - c.mean) * float64(other.weight) / float64(c.weight)
   }
   
   // Update adds a value to the TDigest
   func (d *Double) Update(value float64) error
   
   // Merge merges another Double into this one
   func (d *Double) Merge(other *Double) error
   
   // IsEmpty returns true if the TDigest has not seen any data
   func (d *Double) IsEmpty() bool
   
   // MinValue returns the minimum value seen by the TDigest
   func (d *Double) MinValue() (float64, error)
   
   // MaxValue returns the maximum value seen by the TDigest
   func (d *Double) MaxValue() (float64, error)
   
   // TotalWeight returns the total weight of all values
   func (d *Double) TotalWeight() uint64
   
   // K returns the compression parameter k
   func (d *Double) K() uint16
   
   // Rank computes the approximate normalized rank of the given value
   func (d *Double) Rank(value float64) (float64, error)
   
   // Quantile computes the approximate quantile value corresponding to the
   // given normalized rank
   func (d *Double) Quantile(rank float64) (float64, error)
   
   // PMF returns an approximation to the Probability Mass Function (PMF)
   // of the input stream.
   func (d *Double) PMF(splitPoints []float64) ([]float64, error)
   
   // CDF returns an approximation to the Cumulative Distribution Function (CDF)
   // which is the cumulative analog of the PMF of the input stream.
   func (d *Double) CDF(splitPoints []float64) ([]float64, error)
   
   // String returns a human-readable summary of the TDigest
   func (d *Double) String(shouldPrintCentroids bool) string
   
   // SerializedSizeBytes computes the serialized size in bytes of the TDigest.
   func (d *Double) SerializedSizeBytes(withBuffer bool) int
   ```
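   
   The `add` method above is an incremental weighted-mean update. As a small 
runnable sketch (the type is renamed `centroid` here only to keep the example 
self-contained), merging a centroid of mean 20 and weight 3 into one of mean 10 
and weight 1 should give the same result as computing the weighted mean 
directly:
   
   ```go
   package main

   import "fmt"

   // centroid mirrors the proposed doublePrecisionCentroid.
   type centroid struct {
       mean   float64
       weight uint64
   }

   // add folds other into c, moving the mean toward other.mean in
   // proportion to other's share of the combined weight.
   func (c *centroid) add(other centroid) {
       c.weight += other.weight
       c.mean += (other.mean - c.mean) * float64(other.weight) / float64(c.weight)
   }

   func main() {
       c := centroid{mean: 10, weight: 1}
       c.add(centroid{mean: 20, weight: 3})
       // Direct weighted mean: (10*1 + 20*3) / 4 = 17.5
       fmt.Println(c.mean, c.weight) // prints "17.5 4"
   }
   ```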
   
   ## Serialization/Deserialization
   
   - I will use the same Encoder/Decoder pattern as before.
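   
   As a rough sketch of what an Encoder/Decoder pair can look like, the example 
below round-trips a list of centroids through an `io.Writer` and `io.Reader`. 
Everything here is an assumption made for illustration: the `Encoder` and 
`Decoder` signatures, the little-endian layout, and the length prefix are 
hypothetical and are not the DataSketches binary format or the library's 
existing API.
   
   ```go
   package main

   import (
       "bytes"
       "encoding/binary"
       "fmt"
       "io"
   )

   type centroid struct {
       mean   float64
       weight uint64
   }

   // Encoder writes centroids to an io.Writer (hypothetical layout:
   // uint32 count, then mean and weight for each centroid).
   type Encoder struct{ w io.Writer }

   func NewEncoder(w io.Writer) Encoder { return Encoder{w: w} }

   func (e Encoder) Encode(cs []centroid) error {
       if err := binary.Write(e.w, binary.LittleEndian, uint32(len(cs))); err != nil {
           return err
       }
       for _, c := range cs {
           if err := binary.Write(e.w, binary.LittleEndian, c.mean); err != nil {
               return err
           }
           if err := binary.Write(e.w, binary.LittleEndian, c.weight); err != nil {
               return err
           }
       }
       return nil
   }

   // Decoder reads back what Encoder wrote.
   type Decoder struct{ r io.Reader }

   func NewDecoder(r io.Reader) Decoder { return Decoder{r: r} }

   func (d Decoder) Decode() ([]centroid, error) {
       var n uint32
       if err := binary.Read(d.r, binary.LittleEndian, &n); err != nil {
           return nil, err
       }
       cs := make([]centroid, n)
       for i := range cs {
           if err := binary.Read(d.r, binary.LittleEndian, &cs[i].mean); err != nil {
               return nil, err
           }
           if err := binary.Read(d.r, binary.LittleEndian, &cs[i].weight); err != nil {
               return nil, err
           }
       }
       return cs, nil
   }

   func main() {
       var buf bytes.Buffer
       in := []centroid{{mean: 1.5, weight: 2}, {mean: 4, weight: 1}}
       if err := NewEncoder(&buf).Encode(in); err != nil {
           panic(err)
       }
       out, err := NewDecoder(&buf).Decode()
       if err != nil {
           panic(err)
       }
       fmt.Println(out)
   }
   ```
   
   Keeping the encoding logic in separate types leaves the sketch struct free 
of I/O concerns and lets the stream type vary.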
   
   # Release Schedule
   
   I sent a [PR](https://github.com/apache/datasketches-cpp/pull/471) and the 
discussion is open. Once that PR is finished, I will start uploading the 
implementation.
   
   I will upload 2 PRs.
   
   1. A PR for double precision t-digest.
   2. A PR for  and serialization / deserialization.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

