[jira] [Commented] (SPARK-19634) Feature parity for descriptive statistics in MLlib

Timothy Hunter (JIRA) Mon, 13 Mar 2017 15:52:55 -0700

    [ https://issues.apache.org/jira/browse/SPARK-19634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15923119#comment-15923119 ]


Timothy Hunter commented on SPARK-19634:
----------------------------------------

I was not able to finish it in time, but the bulk of the code is in this branch:

https://github.com/apache/spark/compare/master...thunterdb:19634?expand=1

Note that it currently includes a (non-working) UDAF and an incomplete 
TypedImperativeAggregate. It turns out that the UDAF interface is not suited to 
this sort of aggregator, which I realized quite late. I started to refactor my 
code to use TypedImperativeAggregate, but did not have time to finish it. If 
someone wants to pick up this task, he or she is welcome to do so.
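
For anyone picking this up, here is a rough sketch of the aggregator shape in 
question: a mutable buffer that is updated once per row and merged across 
partitions. It is plain Python rather than Spark code, and the class name 
OnlineSummarizer and its fields are hypothetical, chosen only for illustration. 
It tracks count, mean and variance only, using the standard streaming 
(Welford-style) update and pairwise merge; it is not the code in the branch 
above.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class OnlineSummarizer:
    """Hypothetical mergeable summarizer (illustration only, not a Spark class)."""
    n: int = 0
    mean: List[float] = field(default_factory=list)
    m2: List[float] = field(default_factory=list)  # sum of squared deviations

    def add(self, sample: Sequence[float]) -> "OnlineSummarizer":
        """Update the buffer in place with one row (Welford's update)."""
        if not self.mean:
            self.mean = [0.0] * len(sample)
            self.m2 = [0.0] * len(sample)
        self.n += 1
        for i, x in enumerate(sample):
            delta = x - self.mean[i]
            self.mean[i] += delta / self.n
            self.m2[i] += delta * (x - self.mean[i])
        return self

    def merge(self, other: "OnlineSummarizer") -> "OnlineSummarizer":
        """Combine two partial buffers, e.g. from different partitions."""
        if other.n == 0:
            return self
        if self.n == 0:
            self.n = other.n
            self.mean = list(other.mean)
            self.m2 = list(other.m2)
            return self
        total = self.n + other.n
        for i in range(len(self.mean)):
            delta = other.mean[i] - self.mean[i]
            self.mean[i] += delta * other.n / total
            self.m2[i] += other.m2[i] + delta * delta * self.n * other.n / total
        self.n = total
        return self

    def variance(self) -> List[float]:
        """Unbiased sample variance per column (zeros if fewer than 2 rows)."""
        if self.n < 2:
            return [0.0] * len(self.mean)
        return [v / (self.n - 1) for v in self.m2]


# Two "partitions" summarized independently, then merged.
left = OnlineSummarizer().add([1.0, 2.0]).add([3.0, 4.0])
right = OnlineSummarizer().add([5.0, 6.0]).add([7.0, 8.0])
merged = left.merge(right)
```

The point of the sketch is only that the buffer is updated in place on every 
row and carries vector-valued state, which is the kind of aggregation this 
ticket needs to express.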

> Feature parity for descriptive statistics in MLlib
> --------------------------------------------------
>
>                 Key: SPARK-19634
>                 URL: https://issues.apache.org/jira/browse/SPARK-19634
>             Project: Spark
>          Issue Type: Sub-task
>          Components: ML
>    Affects Versions: 2.1.0
>            Reporter: Timothy Hunter
>            Assignee: Timothy Hunter
>
> This ticket tracks porting the functionality of 
> spark.mllib.MultivariateOnlineSummarizer over to spark.ml.
> A design has been discussed in SPARK-19208. Here is a design doc:
> https://docs.google.com/document/d/1ELVpGV3EBjc2KQPLN9_9_Ge9gWchPZ6SGtDW5tTm_50/edit#



