[
https://issues.apache.org/jira/browse/FLINK-13924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated FLINK-13924:
-----------------------------------
Labels: pull-request-available (was: )
> Add summarizer and summary for sparse vector and dense vector.
> --------------------------------------------------------------
>
> Key: FLINK-13924
> URL: https://issues.apache.org/jira/browse/FLINK-13924
> Project: Flink
> Issue Type: Sub-task
> Components: Library / Machine Learning
> Reporter: Xu Yang
> Priority: Major
> Labels: pull-request-available
>
> Summarizer is the class that calculates statistics, and Summary is its result
> class. Summary defines the methods for reading those statistics. The data may
> contain both dense and sparse vectors, and the vectors may differ in size, so
> if a DenseVectorSummarizer visits a sparse vector, it switches to a
> SparseVectorSummarizer.
> The statistics include vectorSize, count, mean, variance, min, max,
> standardDeviation, normL1, and normL2.
> * Add SparseVectorSummarizer, which calculates statistics for sparse
> vectors.
> * Add SparseVectorSummary, which exposes the statistics of sparse vectors.
> * Add DenseVectorSummarizer, which calculates statistics for dense vectors.
> * Add DenseVectorSummary, which exposes the statistics of dense vectors.
> * Add StatisticsUtil, which provides utility functions for the summarizers
> and summaries.
> * Add VectorSummarizerUtil, which provides utility functions for
> VectorSummarizer.
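
The per-dimension statistics listed above can be accumulated in a single pass.
The following is only a minimal sketch under stated assumptions, not the code
proposed in this issue: the class name VectorSummarizerSketch and all of its
methods are hypothetical, a single accumulator stands in for the
dense-to-sparse switch described above, entries missing from a shorter or
sparse vector are assumed to count as zero, and variance is assumed to be the
sample variance (divisor count - 1).

```java
import java.util.Arrays;

/** Hypothetical sketch; not the classes added by FLINK-13924. */
class VectorSummarizerSketch {
    private long count = 0;                   // number of vectors visited
    private int size = 0;                     // largest vector size seen so far
    private double[] sum = new double[0];
    private double[] squareSum = new double[0];
    private double[] absSum = new double[0];
    private double[] minSeen = new double[0]; // min over explicitly stored values
    private double[] maxSeen = new double[0]; // max over explicitly stored values
    private long[] seen = new long[0];        // explicitly stored values per dimension

    /** Grows all accumulators when a longer vector arrives. */
    private void grow(int newSize) {
        if (newSize <= size) {
            return;
        }
        int old = size;
        sum = Arrays.copyOf(sum, newSize);
        squareSum = Arrays.copyOf(squareSum, newSize);
        absSum = Arrays.copyOf(absSum, newSize);
        seen = Arrays.copyOf(seen, newSize);
        minSeen = Arrays.copyOf(minSeen, newSize);
        maxSeen = Arrays.copyOf(maxSeen, newSize);
        Arrays.fill(minSeen, old, newSize, Double.POSITIVE_INFINITY);
        Arrays.fill(maxSeen, old, newSize, Double.NEGATIVE_INFINITY);
        size = newSize;
    }

    private void add(int i, double x) {
        sum[i] += x;
        squareSum[i] += x * x;
        absSum[i] += Math.abs(x);
        minSeen[i] = Math.min(minSeen[i], x);
        maxSeen[i] = Math.max(maxSeen[i], x);
        seen[i]++;
    }

    /** Visits a dense vector; every entry is an explicit value. */
    void visitDense(double[] values) {
        grow(values.length);
        for (int i = 0; i < values.length; i++) {
            add(i, values[i]);
        }
        count++;
    }

    /** Visits a sparse vector of logical size n; unlisted entries are zero. */
    void visitSparse(int n, int[] indices, double[] values) {
        grow(n);
        for (int k = 0; k < indices.length; k++) {
            add(indices[k], values[k]);
        }
        count++;
    }

    long count() { return count; }
    int vectorSize() { return size; }
    double mean(int i) { return sum[i] / count; }
    double normL1(int i) { return absSum[i]; }
    double normL2(int i) { return Math.sqrt(squareSum[i]); }
    double standardDeviation(int i) { return Math.sqrt(variance(i)); }

    /** Sample variance (assumption); clamped at zero against rounding error. */
    double variance(int i) {
        if (count < 2) {
            return 0.0;
        }
        return Math.max(0.0, (squareSum[i] - sum[i] * sum[i] / count) / (count - 1));
    }

    /** An implicit zero takes part in min/max if some vector did not store dimension i. */
    double min(int i) { return seen[i] < count ? Math.min(minSeen[i], 0.0) : minSeen[i]; }
    double max(int i) { return seen[i] < count ? Math.max(maxSeen[i], 0.0) : maxSeen[i]; }

    public static void main(String[] args) {
        VectorSummarizerSketch s = new VectorSummarizerSketch();
        s.visitDense(new double[] {1.0, 2.0});
        s.visitSparse(3, new int[] {0, 2}, new double[] {3.0, 4.0});
        for (int i = 0; i < s.vectorSize(); i++) {
            System.out.println("dim " + i + ": mean=" + s.mean(i) + " variance=" + s.variance(i)
                + " min=" + s.min(i) + " max=" + s.max(i));
        }
    }
}
```

Keeping only running sums means a sparse vector costs work proportional to its
stored entries, which is presumably the motivation for a separate sparse
summarizer; the sum-of-squares variance formula is simple but can lose
precision when the mean is large relative to the spread.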