[GitHub] spark pull request: [SPARK-6278][MLLIB] Mention the change of obje...

srowen Thu, 12 Mar 2015 05:17:43 -0700

Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/4978#discussion_r26296626
  
    --- Diff: docs/mllib-guide.md ---
    @@ -102,6 +102,7 @@ In the `spark.mllib` package, there were several 
breaking changes.  The first ch
         * In `DecisionTree`, the deprecated class method `train` has been 
removed.  (The object/static `train` methods remain.)
         * In `Strategy`, the `checkpointDir` parameter has been removed.  
Checkpointing is still supported, but the checkpoint directory must be set 
before calling tree and tree ensemble training.
     * `PythonMLlibAPI` (the interface between Scala/Java and Python for MLlib) 
was a public API but is now private, declared `private[python]`.  This was 
never meant for external use.
    +* In linear regression (including Lasso and ridge regression), the squared 
loss is now divided by 2. So in order to produce the same result as in 1.2, the 
step size you choose needs to be multiplied by 2.
    --- End diff --
    
    Hm, it also occurred to me that doubling the step size affects the 
regularization parameter too. Doesn't the reg param have to be half as large 
in order to get the same result? I'm probably overlooking something about 
the formulation, but I didn't see the reg param updated in 
https://github.com/apache/spark/commit/a96b72781ae40bb303613990b8d8b4721b84e1c3 
and if the loss term was halved, leaving all else equal, the regularization 
term is relatively twice as large, right?
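
    To make the concern concrete, here is a toy sketch in plain Python. It is 
not MLlib code: the data, the `fit` helper, and the objective 
`loss_scale * mean((w*x - y)^2) + reg_param * 0.5 * w^2` are all assumptions 
made for illustration, and MLlib's updaters may scale things differently. 
Under those assumptions, halving the loss and doubling the step size 
reproduces the old result only if the reg param is halved too:

```python
# Toy 1-D ridge regression with plain gradient descent (hypothetical, not MLlib).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

def fit(loss_scale, step_size, reg_param, iters=200):
    """Minimize loss_scale * mean((w*x - y)^2) + reg_param * 0.5 * w^2."""
    w = 0.0
    n = len(xs)
    for _ in range(iters):
        # Gradient of the scaled squared loss.
        grad_loss = loss_scale * sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        # Gradient of the assumed L2 penalty, reg_param * 0.5 * w^2.
        grad_reg = reg_param * w
        w -= step_size * (grad_loss + grad_reg)
    return w

# "1.2-style" objective: unscaled squared loss.
old = fit(loss_scale=1.0, step_size=0.01, reg_param=1.0)
# Loss halved, step size doubled, reg param left alone.
step_only = fit(loss_scale=0.5, step_size=0.02, reg_param=1.0)
# Loss halved, step size doubled, reg param halved as well.
step_and_reg = fit(loss_scale=0.5, step_size=0.02, reg_param=0.5)

print(old, step_only, step_and_reg)
```

    In this sketch `step_and_reg` matches `old`, while `step_only` lands on a 
smaller weight, because the unchanged penalty is effectively twice as strong 
relative to the halved loss.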



