Gary King created SPARK-13132:
---------------------------------

             Summary: LogisticRegression spends 35% of its time fetching the standardization parameter
                 Key: SPARK-13132
                 URL: https://issues.apache.org/jira/browse/SPARK-13132
             Project: Spark
          Issue Type: Improvement
          Components: ML
    Affects Versions: 1.6.0
            Reporter: Gary King

When L1 regularization is used, the inner functor passed to the quasi-Newton optimizer in {{org.apache.spark.ml.classification.LogisticRegression#train}} makes repeated calls to {{$(standardization)}}. Because each call ultimately triggers string interpolation in {{org.apache.spark.ml.param.Param#hashCode}}, this line of code consumes 35%-45% of the entire training time in my application.

Where in that range the cost falls depends on whether the application sets an explicit value for the standardization parameter or relies on the default value. The default case needs an extra map lookup, and therefore an extra string interpolation, compared to the explicitly set case.
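
To illustrate the pattern described above, here is a minimal, self-contained Scala sketch. It does not use Spark. {{SketchParam}}, {{SketchParams}}, {{HoistDemo}} and the uid {{logreg_0}} are hypothetical names made up for this example, and the {{hashCode}} shown is only a model of the behavior the report describes, not a copy of Spark's implementation. The sketch compares reading a flag through a hashed lookup on every iteration with reading it once into a local {{val}} before the loop. Hoisting is one possible remedy and is not necessarily the change adopted for this issue.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a parameter key; this is NOT Spark's Param class.
// hashCode rebuilds an interpolated string on every call, which models the
// cost described in the report, and counts how often it is invoked.
final class SketchParam(val parent: String, val name: String) {
  override def toString: String = s"${parent}__$name"
  override def hashCode: Int = {
    SketchParam.hashCalls += 1
    toString.hashCode
  }
  override def equals(other: Any): Boolean = other match {
    case p: SketchParam => p.parent == parent && p.name == name
    case _              => false
  }
}

object SketchParam {
  var hashCalls: Long = 0L // number of hashCode invocations so far
}

// Hypothetical parameter store: explicitly set values shadow defaults.
final class SketchParams {
  private val explicit = mutable.HashMap.empty[SketchParam, Boolean]
  private val defaults = mutable.HashMap.empty[SketchParam, Boolean]
  def setDefault(p: SketchParam, v: Boolean): Unit = defaults.update(p, v)
  def set(p: SketchParam, v: Boolean): Unit = explicit.update(p, v)
  // A miss in `explicit` falls through to a second hashed lookup in `defaults`.
  def get(p: SketchParam): Boolean = explicit.getOrElse(p, defaults(p))
}

object HoistDemo {
  // Reads the flag on every iteration, as the inner functor is reported to do.
  def lookupInsideLoop(params: SketchParams, flag: SketchParam, n: Int): Int = {
    var used = 0
    var i = 0
    while (i < n) {
      if (params.get(flag)) used += 1
      i += 1
    }
    used
  }

  // Reads the flag once into a local val before the loop starts.
  def lookupHoisted(params: SketchParams, flag: SketchParam, n: Int): Int = {
    val standardize = params.get(flag)
    var used = 0
    var i = 0
    while (i < n) {
      if (standardize) used += 1
      i += 1
    }
    used
  }

  // Returns (hashCode calls with the lookup inside the loop,
  //          hashCode calls with the lookup hoisted out of it).
  def measure(n: Int): (Long, Long) = {
    val flag = new SketchParam("logreg_0", "standardization")
    val params = new SketchParams
    params.setDefault(flag, true) // rely on the default, the slower case

    SketchParam.hashCalls = 0L
    val a = lookupInsideLoop(params, flag, n)
    val repeated = SketchParam.hashCalls

    SketchParam.hashCalls = 0L
    val b = lookupHoisted(params, flag, n)
    val hoisted = SketchParam.hashCalls

    require(a == b, "both variants must compute the same result")
    (repeated, hoisted)
  }
}
```

Calling {{HoistDemo.measure(1000)}} returns the two {{hashCode}} call counts. In this model the loop variant hashes the key on every iteration, while the hoisted variant hashes it only for the single lookup made before the loop.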