GitHub user dbtsai commented on the pull request:

    https://github.com/apache/spark/pull/3435#issuecomment-64308394
  
    Wow, with 
    ```scala
      private[this] val factor: Array[Double] = {
        val f = Array.ofDim[Double](variance.size)
        var i = 0
        while (i < f.size) {
          f(i) = if (variance(i) != 0.0) 1.0 / math.sqrt(variance(i)) else 0.0
          i += 1
        }
        f
      }
    
      private[this] val shift: Array[Double] = mean.toArray
    ```
    and
    ```scala
                while (i < size) {
                  values(i) = (values(i) - shift(i)) * factor(i)
                  i += 1
                }
    ```
    I got different bytecode, as follows:
    ```
       L14
        LINENUMBER 108 L14
       FRAME FULL [org/apache/spark/mllib/feature/StandardScalerModel org/apache/spark/mllib/linalg/Vector [D org/apache/spark/mllib/linalg/Vector org/apache/spark/mllib/linalg/DenseVector T [D I I] []
        ILOAD 8
        ILOAD 7
        IF_ICMPGE L15
       L16
        LINENUMBER 109 L16
        ALOAD 6
        ILOAD 8
        ALOAD 6
        ILOAD 8
        DALOAD
        ALOAD 0
        GETFIELD org/apache/spark/mllib/feature/StandardScalerModel.shift : [D
        ILOAD 8
        DALOAD
        DSUB
        ALOAD 0
        GETFIELD org/apache/spark/mllib/feature/StandardScalerModel.factor : [D
        ILOAD 8
        DALOAD
        DMUL
        DASTORE
       L17
        LINENUMBER 110 L17
        ILOAD 8
        ICONST_1
        IADD
        ISTORE 8
        GOTO L14
    ```
    It's slightly slower than the local reference version.
    DenseVector withMean and withStd: 5.92 secs
    DenseVector withMean and withoutStd: 5.36 secs
    DenseVector withoutMean and withStd: 5.51 secs
    SparseVector withoutMean and withStd: 1.30 secs
    
    Instead of `INVOKESPECIAL`, the field access now compiles to `GETFIELD`.
    What's the difference between `private[this]` and `private`? Also, this
    doesn't work with `private[this] lazy val`, which generates the same
    bytecode as `private lazy val`. As a result, `shift` and `factor` will
    always be evaluated when we create the model.
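
    To make the trade-off concrete, here is a minimal sketch. The
    `EagerModel` and `LazyModel` classes and the `evaluations` counter are
    hypothetical and not Spark code. As far as I understand Scala 2, a strict
    `private[this] val` is object-private, so the compiler can skip the
    accessor method and read the field directly, while a `lazy val` always
    needs an accessor to check whether it has been initialized, whatever its
    visibility. The sketch only shows the difference in evaluation time, not
    the bytecode:
    ```scala
    // Hypothetical illustration classes; names are assumptions, not Spark APIs.
    class EagerModel(variance: Array[Double]) {
      // Counts how many times the initializer of `factor` has run.
      var evaluations = 0

      // Strict, object-private val: the initializer runs in the constructor.
      private[this] val factor: Array[Double] = {
        evaluations += 1
        variance.map(v => if (v != 0.0) 1.0 / math.sqrt(v) else 0.0)
      }

      def scale(i: Int, x: Double): Double = x * factor(i)
    }

    class LazyModel(variance: Array[Double]) {
      var evaluations = 0

      // Lazy val: the initializer runs on first access, through an accessor.
      private[this] lazy val factor: Array[Double] = {
        evaluations += 1
        variance.map(v => if (v != 0.0) 1.0 / math.sqrt(v) else 0.0)
      }

      def scale(i: Int, x: Double): Double = x * factor(i)
    }
    ```
    Constructing an `EagerModel` pays for `factor` immediately, even if
    `scale` is never called. A `LazyModel` defers that cost until the first
    `scale` call.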


