Gary King created SPARK-13132:
---------------------------------

             Summary: LogisticRegression spends 35% of its time fetching the 
standardization parameter
                 Key: SPARK-13132
                 URL: https://issues.apache.org/jira/browse/SPARK-13132
             Project: Spark
          Issue Type: Improvement
          Components: ML
    Affects Versions: 1.6.0
            Reporter: Gary King


When L1 regularization is used, the inner functor passed to the quasi-Newton 
optimizer in {{org.apache.spark.ml.classification.LogisticRegression#train}} 
makes repeated calls to {{$(standardization)}}. Because each call ultimately 
triggers string interpolation in 
{{org.apache.spark.ml.param.Param#hashCode}}, this one line of code consumes 
35%-45% of the entire training time in my application.

The range depends on whether the application sets an explicit value for the 
standardization parameter or relies on the default value. The default case 
needs an extra map lookup, and therefore an extra string interpolation, 
compared to the explicitly set case.
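
One possible remedy is to read the parameter once, outside the functor, and 
close over the resulting Boolean. The sketch below illustrates the idea 
without depending on Spark. {{FakeParam}}, {{lookup}}, {{penalty}}, 
{{l1PerCall}} and {{l1Hoisted}} are hypothetical stand-ins written for this 
example, modelled only on the behaviour described above (a {{hashCode}} that 
rebuilds a string, and an explicit map consulted before a default map). They 
are not Spark's actual classes, and the numeric values are made up.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a param whose hashCode rebuilds a string on
// every call, as described in this issue. Not Spark's real Param class.
final class FakeParam(val parent: String, val name: String) {
  var hashCalls = 0 // counts how often hashCode runs
  override def toString: String = s"${parent}__$name"
  override def hashCode: Int = { hashCalls += 1; toString.hashCode }
  override def equals(other: Any): Boolean = other match {
    case p: FakeParam => p.parent == parent && p.name == name
    case _            => false
  }
}

object HoistParamLookup {
  val standardization = new FakeParam("logreg_0", "standardization")

  // Explicitly set values are consulted first, then the defaults.
  private val explicit = mutable.HashMap.empty[FakeParam, Boolean]
  private val defaults = mutable.HashMap[FakeParam, Boolean](standardization -> true)

  // Stand-in for $(param): one map lookup if set explicitly, two otherwise.
  def lookup(p: FakeParam): Boolean = explicit.get(p).orElse(defaults.get(p)).get

  // Made-up values for illustration only.
  val regParamL1 = 0.1
  val featuresStd = Array(2.0, 0.0, 4.0)

  private def penalty(standardize: Boolean, i: Int): Double =
    if (standardize) regParamL1
    else if (featuresStd(i) != 0.0) regParamL1 / featuresStd(i)
    else 0.0

  // Slow variant: the param is looked up each time the functor is invoked.
  val l1PerCall: Int => Double = i => penalty(lookup(standardization), i)

  // Hoisted variant: the param is read once and the closure captures a Boolean.
  val l1Hoisted: Int => Double = {
    val standardize = lookup(standardization)
    i => penalty(standardize, i)
  }

  def main(args: Array[String]): Unit = {
    val indices = 0 until featuresStd.length
    val before = standardization.hashCalls
    indices.foreach(l1PerCall)
    val middle = standardization.hashCalls
    indices.foreach(l1Hoisted)
    val after = standardization.hashCalls
    println(s"per-call variant hashed ${middle - before} times, " +
      s"hoisted variant hashed ${after - middle} times")
  }
}
```

Both variants return the same penalties. The hoisted one simply pays the 
lookup cost once, when the closure is built, instead of on every call made by 
the optimizer.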



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

