[ 
https://issues.apache.org/jira/browse/MAHOUT-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15833743#comment-15833743
 ] 

ASF GitHub Bot commented on MAHOUT-1856:
----------------------------------------

Github user andrewpalumbo commented on a diff in the pull request:

    https://github.com/apache/mahout/pull/246#discussion_r97236955
  
    --- Diff: math-scala/src/main/scala/org/apache/mahout/math/algorithms/transformer/StandardScaler.scala ---
    @@ -0,0 +1,119 @@
    +/**
    +  * Licensed to the Apache Software Foundation (ASF) under one
    +  * or more contributor license agreements. See the NOTICE file
    +  * distributed with this work for additional information
    +  * regarding copyright ownership. The ASF licenses this file
    +  * to you under the Apache License, Version 2.0 (the
    +  * "License"); you may not use this file except in compliance
    +  * with the License. You may obtain a copy of the License at
    +  *
    +  * http://www.apache.org/licenses/LICENSE-2.0
    +  *
    +  * Unless required by applicable law or agreed to in writing,
    +  * software distributed under the License is distributed on an
    +  * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    +  * KIND, either express or implied. See the License for the
    +  * specific language governing permissions and limitations
    +  * under the License.
    +  */
    +
    +package org.apache.mahout.math.algorithms.transformer
    +
    +import org.apache.mahout.math.drm
    +
    +import org.apache.mahout.math.scalabindings._
    +
    +import org.apache.mahout.math.scalabindings.RLikeVectorOps
    +import org.apache.mahout.math.{Vector => MahoutVector}
    +
    +import org.apache.mahout.math.scalabindings.RLikeOps._
    +import org.apache.mahout.math.scalabindings._
    +import org.apache.mahout.math.scalabindings.RLikeVectorOps
    +import org.apache.mahout.math.scalabindings.MatrixOps
    +
    +import org.apache.mahout.math._
    +import org.apache.mahout.math.scalabindings._
    +import org.apache.mahout.math.drm._
    +import org.apache.mahout.math.scalabindings.RLikeOps._
    +import org.apache.mahout.math.drm.RLikeDrmOps._
    +
    +
    +import org.apache.mahout.math.Matrix
    +
    +import collection._
    +import JavaConversions._
    +
    +import Math.sqrt
    +
    +import scala.reflect.{ClassTag,classTag}
    +
    +/**
    +  * Scales columns to mean 0 and unit variance
    +  */
    +class StandardScaler extends Transformer{
    +  var meanVec: MahoutVector = _
    +  var variance: MahoutVector = _
    +  var stdev: MahoutVector = _
    +  var summary = ""
    +
    +  def fit[K](input: DrmLike[K]) = {
    +    val mNv = dcolMeanVars(input)
    +    meanVec = mNv._1
    +    variance = mNv._2
    +    stdev = mNv._2.sqrt
    +    isFit = true
    +
    +  }
    +
    +  def transform[K: ClassTag](input: DrmLike[K]): DrmLike[K] = {
    +
    +    if (!isFit) {
    +      //throw an error
    --- End diff --
    
    In addition, I think this could be another argument for moving
    hyperparameters into `fit(...)`. For example, if for some reason we wanted
    to standardize to N(mean = 0, stdDev = 2), we could still use
    `StandardScaler` and call `fit(Map("mu" -> 0, "sigma" -> 2))`:
    ```
    val drmStandardized = StandardScaler(unscaledDrm).fit(Map("mu" -> 0, "sigma" -> 2)).transform()
    ```
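
To make the suggestion concrete, here is a minimal sketch of a scaler whose hyperparameters arrive with `fit(...)`. It is an illustration under stated assumptions, not the code from the pull request: plain Scala arrays stand in for `DrmLike` so that it runs without Mahout, the class name `SketchStandardScaler` and the `hyperparameters` argument are hypothetical, and the use of population variance (dividing by n) is an assumption about what the fitted statistics should be. It also shows one possible way to fill in the `//throw an error` placeholder from the diff.

```scala
// Hypothetical stand-in for the StandardScaler under review.
// Plain arrays replace DrmLike so the sketch has no Mahout dependency.
class SketchStandardScaler {
  private var colMeans: Array[Double] = Array.empty[Double]
  private var colStdevs: Array[Double] = Array.empty[Double]
  private var targetMu = 0.0
  private var targetSigma = 1.0
  private var isFit = false

  // Hyperparameters are passed to fit(...); "mu" and "sigma" are the keys
  // used in the review comment. Missing keys fall back to N(0, 1).
  def fit(input: Array[Array[Double]],
          hyperparameters: Map[String, Double] = Map.empty): SketchStandardScaler = {
    require(input.nonEmpty && input.head.nonEmpty,
      "input must have at least one row and one column")
    val n = input.length.toDouble
    val nCols = input.head.length

    // Column means.
    colMeans = Array.tabulate(nCols)(j => input.map(_(j)).sum / n)

    // Column standard deviations (assumption: population variance, divide by n).
    colStdevs = Array.tabulate(nCols) { j =>
      math.sqrt(input.map(row => math.pow(row(j) - colMeans(j), 2)).sum / n)
    }

    targetMu = hyperparameters.getOrElse("mu", 0.0)
    targetSigma = hyperparameters.getOrElse("sigma", 1.0)
    isFit = true
    this // returning the scaler allows fit(...).transform(...) chaining
  }

  def transform(input: Array[Array[Double]]): Array[Array[Double]] = {
    // One way to replace the "//throw an error" placeholder in the diff.
    if (!isFit) {
      throw new IllegalStateException("fit(...) must be called before transform(...)")
    }
    input.map { row =>
      row.zipWithIndex.map { case (x, j) =>
        // A constant column has zero spread, so it maps to the target mean.
        val z = if (colStdevs(j) == 0.0) 0.0 else (x - colMeans(j)) / colStdevs(j)
        z * targetSigma + targetMu
      }
    }
  }
}
```

Under these assumptions, `new SketchStandardScaler().fit(data, Map("mu" -> 0.0, "sigma" -> 2.0)).transform(data)` rescales every non-constant column of `data` to mean 0 and standard deviation 2, while calling `transform` on an unfitted scaler fails fast instead of silently returning unscaled data.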


> Create a framework for new Mahout Clustering, Classification, and
> Optimization Algorithms
> ------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-1856
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1856
>             Project: Mahout
>          Issue Type: New Feature
>    Affects Versions: 0.12.1
>            Reporter: Andrew Palumbo
>            Assignee: Trevor Grant
>            Priority: Critical
>             Fix For: 0.13.0
>
>
> To ensure that Mahout does not become "a loose bag of algorithms", create
> basic traits with functions common to each class of algorithm.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
