[ 
https://issues.apache.org/jira/browse/FLINK-13924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-13924:
-----------------------------------
    Labels: pull-request-available  (was: )

> Add summarizer and summary for sparse vector and dense vector.
> --------------------------------------------------------------
>
>                 Key: FLINK-13924
>                 URL: https://issues.apache.org/jira/browse/FLINK-13924
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Library / Machine Learning
>            Reporter: Xu Yang
>            Priority: Major
>              Labels: pull-request-available
>
> Summarizer is the class that calculates statistics, and Summary is the 
> result class of a summarizer. Summary defines the methods for retrieving the 
> statistics. The data may contain both dense and sparse vectors, and the 
> vector sizes may differ, so if DenseVectorSummarizer visits a sparse vector, 
> it switches to SparseVectorSummarizer. 
> Statistics include vectorSize, count, mean, variance, min, max, 
> standardDeviation, normL1, and normL2.
>  * Add SparseVectorSummarizer, which calculates statistics for sparse 
> vectors.
>  * Add SparseVectorSummary, which exposes the statistics of sparse vectors.
>  * Add DenseVectorSummarizer, which calculates statistics for dense vectors.
>  * Add DenseVectorSummary, which exposes the statistics of dense vectors.
>  * Add StatisticsUtil which provides utility functions for summarizer and 
> summary.
>  * Add VectorSummarizerUtil which provides utility functions for 
> VectorSummarizer.
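
As an illustration of the summarizer/summary idea described above, here is a 
minimal sketch in Java of how per-dimension statistics for dense vectors 
could be accumulated in one pass. It is not the code from the pull request: 
the class name DenseSummarizerSketch and all of its methods are hypothetical, 
vectors are plain double[] arrays, all vectors are assumed to have the same 
size (the issue allows unequal sizes), and variance is assumed to be the 
sample variance.

```java
import java.util.Arrays;

/** Hypothetical sketch of a dense-vector summarizer; not the Flink API. */
final class DenseSummarizerSketch {
    private long count;          // number of vectors visited so far
    private double[] sum;        // per-dimension sum of values
    private double[] squareSum;  // per-dimension sum of squared values
    private double[] normL1;     // per-dimension sum of absolute values
    private double[] min;        // per-dimension minimum
    private double[] max;        // per-dimension maximum

    /** Folds one dense vector into the running accumulators. */
    DenseSummarizerSketch visit(double[] v) {
        if (sum == null) {
            init(v.length);
        } else if (v.length != sum.length) {
            // Simplification: the issue allows unequal sizes; this sketch does not.
            throw new IllegalArgumentException("vector size mismatch");
        }
        for (int i = 0; i < v.length; i++) {
            sum[i] += v[i];
            squareSum[i] += v[i] * v[i];
            normL1[i] += Math.abs(v[i]);
            min[i] = Math.min(min[i], v[i]);
            max[i] = Math.max(max[i], v[i]);
        }
        count++;
        return this;
    }

    private void init(int n) {
        sum = new double[n];
        squareSum = new double[n];
        normL1 = new double[n];
        min = new double[n];
        max = new double[n];
        Arrays.fill(min, Double.POSITIVE_INFINITY);
        Arrays.fill(max, Double.NEGATIVE_INFINITY);
    }

    long count() { return count; }
    int vectorSize() { return sum == null ? 0 : sum.length; }
    double mean(int i) { return sum[i] / count; }
    double min(int i) { return min[i]; }
    double max(int i) { return max[i]; }
    double normL1(int i) { return normL1[i]; }
    double normL2(int i) { return Math.sqrt(squareSum[i]); }

    /** Sample variance (divides by count - 1); an assumption of this sketch. */
    double variance(int i) {
        if (count < 2) {
            return 0.0;
        }
        double v = (squareSum[i] - sum[i] * sum[i] / count) / (count - 1);
        return Math.max(0.0, v); // guard against tiny negative rounding error
    }

    double standardDeviation(int i) { return Math.sqrt(variance(i)); }
}
```

Keeping only running sums means each vector is visited once and two partial 
summarizers could be merged by adding their accumulators. A sparse variant 
would presumably track the same quantities for non-zero entries only and 
account for the implicit zeros when the statistics are read.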



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
