Github user sethah commented on a diff in the pull request:
https://github.com/apache/spark/pull/18305#discussion_r124384213
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/optim/loss/RDDLossFunction.scala ---
@@ -62,8 +62,8 @@ private[ml] class RDDLossFunction[
val newAgg = instances.treeAggregate(thisAgg)(seqOp, combOp,
aggregationDepth)
val gradient = newAgg.gradient
val regLoss = regularization.map { regFun =>
- val (regLoss, regGradient) = regFun.calculate(coefficients.data)
- BLAS.axpy(1.0, Vectors.dense(regGradient), gradient)
+ val (regLoss, regGradient) =
regFun.calculate(Vectors.fromBreeze(coefficients))
--- End diff --
When you say compose, do you just mean have the same input/output types, or
do you mean you want to do something like this?:
````scala
val combinedLoss: DiffFunction[Vector] = dataLoss.add(regLoss)
````
I do agree it's awkward. Ideally, the loss functions would use Spark
vectors but that isn't possible since Breeze requires that the type being
optimized implements certain type classes. However, we don't directly optimize
the regularization so it doesn't have to be a Breeze vector. Could we make it
all Breeze? Yes, but then we'd have to convert the gradient from the aggregator
to Breeze and add them using Breeze ops... or something. I don't see that one
way is really better than the other, until we can implement something more
comprehensive - for example, implementing optimization framework directly in
Spark would/will solve this: https://issues.apache.org/jira/browse/SPARK-17136
I suppose I'm willing to change this, though I'm not sure how much it
helps, and since it's all internal we aren't locking ourselves into anything.
Additionally, doing these awkward conversions only happens here, in one place
so it's not spread throughout the codebase or anything. Thoughts?
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]