[ https://issues.apache.org/jira/browse/SPARK-19634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15923119#comment-15923119 ]
Timothy Hunter commented on SPARK-19634:
----------------------------------------
I was not able to finish it in time, but the bulk of the code is in this branch:
https://github.com/apache/spark/compare/master...thunterdb:19634?expand=1
Note that it currently includes a (non-working) UDAF and an incomplete
TypedImperativeAggregate. It turns out that the UDAF interface is not suited to
this sort of aggregator, which I realized quite late. I started to refactor my
code to use TypedImperativeAggregate, but did not have time to finish it. If
someone wants to pick up this task, he or she is welcome to do so.
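
For anyone picking this up, here is a rough illustration of the kind of
aggregator involved: a summarizer keeps a small mutable buffer per partition,
updates it one value at a time, and merges partial buffers at the end. This is
a minimal sketch in plain Python, not Spark code and not the code in the
branch above. The names SummaryBuffer and summarize are hypothetical, and the
choice of Welford's update with a pairwise merge for mean and variance is an
assumption made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class SummaryBuffer:
    """Hypothetical per-partition state for one column: count, mean, M2."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of squared deviations from the running mean

    def update(self, x: float) -> None:
        # Welford's single-pass update for one new value.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def merge(self, other: "SummaryBuffer") -> None:
        # Combine two partial buffers without revisiting the raw values.
        if other.n == 0:
            return
        if self.n == 0:
            self.n, self.mean, self.m2 = other.n, other.mean, other.m2
            return
        total = self.n + other.n
        delta = other.mean - self.mean
        self.mean += delta * other.n / total
        self.m2 += other.m2 + delta * delta * self.n * other.n / total
        self.n = total

    def variance(self) -> float:
        # Unbiased sample variance; 0.0 when fewer than two values were seen.
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0


def summarize(partitions):
    """Build one buffer per partition, then merge them into a single result."""
    merged = SummaryBuffer()
    for part in partitions:
        buf = SummaryBuffer()
        for x in part:
            buf.update(x)
        merged.merge(buf)
    return merged
```

Calling summarize([[1.0, 2.0], [3.0, 4.0]]) merges two partial buffers into
one covering all four values. A real port would track a vector of such
statistics (plus min, max, and so on) rather than a single scalar column.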
> Feature parity for descriptive statistics in MLlib
> --------------------------------------------------
>
> Key: SPARK-19634
> URL: https://issues.apache.org/jira/browse/SPARK-19634
> Project: Spark
> Issue Type: Sub-task
> Components: ML
> Affects Versions: 2.1.0
> Reporter: Timothy Hunter
> Assignee: Timothy Hunter
>
> This ticket tracks porting the functionality of
> spark.mllib.MultivariateOnlineSummarizer over to spark.ml.
> A design has been discussed in SPARK-19208. Here is a design doc:
> https://docs.google.com/document/d/1ELVpGV3EBjc2KQPLN9_9_Ge9gWchPZ6SGtDW5tTm_50/edit#