GitHub user ogeagla commented on a diff in the pull request:

    https://github.com/apache/spark/pull/4140#discussion_r23711914
  
    --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/StandardScaler.scala ---
    @@ -61,19 +61,39 @@ class StandardScaler(withMean: Boolean, withStd: Boolean) extends Logging {
      * :: Experimental ::
      * Represents a StandardScaler model that can transform vectors.
      *
    - * @param withMean whether to center the data before scaling
    - * @param withStd whether to scale the data to have unit standard deviation
    - * @param mean column mean values
      * @param variance column variance values
    + * @param mean column mean values
    + * @param withStd whether to scale the data to have unit standard deviation
    + * @param withMean whether to center the data before scaling
      */
     @Experimental
    -class StandardScalerModel private[mllib] (
    -    val withMean: Boolean,
    -    val withStd: Boolean,
    +class StandardScalerModel (
    +    val variance: Vector,
         val mean: Vector,
    -    val variance: Vector) extends VectorTransformer {
    +    var withStd: Boolean,
    +    var withMean: Boolean) extends VectorTransformer {
     
    -  require(mean.size == variance.size)
    +  def this(variance: Vector, mean: Vector) {
    +    this(variance, mean, withStd = variance != null, withMean = mean != null)
    +    require(this.withStd || this.withMean, "at least one of variance or mean vectors must be provided")
    +    if (this.withStd && this.withMean) require(mean.size == variance.size, "mean and variance vectors must have equal size if both are provided")
    +  }
    +
    +  def this(variance: Vector) = this(variance, null)
    +
    +  @DeveloperApi
    +  def setWithMean(withMean: Boolean): this.type = {
    +    require(!(withMean && this.mean == null),"cannot set withMean to true while mean is null")
    +    this.withMean = withMean
    +    this
    +  }
    +
    +  @DeveloperApi
    +  def setWithStd(withStd: Boolean): this.type = {
    +    require(!(withStd && this.variance == null), "cannot set withStd to true while variance is null")
    +    this.withStd = withStd
    +    this
    +  }
     
       private lazy val factor: Array[Double] = {
    --- End diff --
    
    I agree with that, and I can make those changes. One additional one-time cost is computing the square root of the `variance` from the `summary` in `StandardScaler.fit`, so that the result can be passed to the `StandardScalerModel` constructor.
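
    A minimal sketch of that one-time step, assuming plain arrays in place of MLlib's `Vector` and using hypothetical names (`ScalerModelSketch`, `ScalerSketch.fit`) that are not part of this PR. Mapping zero-std columns to 0.0 is also an assumption made for the sketch:

```scala
// Hypothetical stand-in for StandardScalerModel: it receives the standard
// deviation directly, so no sqrt is needed when transforming.
final class ScalerModelSketch(val std: Array[Double], val mean: Array[Double]) {
  require(std.length == mean.length, "std and mean must have equal size")

  // Center each value, then divide by std; zero-std columns map to 0.0
  // (an assumption of this sketch).
  def transform(v: Array[Double]): Array[Double] = {
    require(v.length == std.length, "input size must match the model size")
    Array.tabulate(v.length) { i =>
      if (std(i) != 0.0) (v(i) - mean(i)) / std(i) else 0.0
    }
  }
}

object ScalerSketch {
  // Plays the role of StandardScaler.fit: the sqrt of each variance is
  // computed once here, before the model is constructed.
  def fit(variance: Array[Double], mean: Array[Double]): ScalerModelSketch =
    new ScalerModelSketch(variance.map(x => math.sqrt(x)), mean)
}
```

    With this shape the sqrt cost is paid once per `fit` call rather than once per model instance or per transformed vector.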

