[
https://issues.apache.org/jira/browse/SPARK-17134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15428848#comment-15428848
]
DB Tsai edited comment on SPARK-17134 at 8/19/16 9:21 PM:
----------------------------------------------------------
It may also be worth trying the following. I see some performance improvement
when the number of classes is high, since we avoid recomputing the normalization
(value / featuresStd(index)) for every class. However, with this loop order the
coefficients array is no longer accessed sequentially (the inner loop strides by
numFeaturesPlusIntercept), which changes the CPU cache locality. As a result, I
think the performance will vary case by case. Maybe we can store the
coefficients as a transposed matrix, which may help the locality?
We need more investigation to understand the problem.
{code:borderStyle=solid}
val margins = Array.ofDim[Double](numClasses)
features.foreachActive { (index, value) =>
  if (featuresStd(index) != 0.0 && value != 0.0) {
    var i = 0
    val temp = value / featuresStd(index)
    while (i < numClasses) {
      margins(i) += coefficients(i * numFeaturesPlusIntercept + index) * temp
      i += 1
    }
  }
}
if (fitIntercept) {
  var i = 0
  val length = features.size
  while (i < numClasses) {
    margins(i) += coefficients(i * numFeaturesPlusIntercept + length)
    i += 1
  }
}
val maxMargin = margins.max
val marginOfLabel = margins(label.toInt)
{code}
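To make the transposed-storage idea concrete, here is a minimal sketch, not a tested patch. It assumes a feature-major layout in which the coefficients of all classes for one feature are contiguous, i.e. coefficients(index * numClasses + i), with the intercepts stored in the last block. The object name TransposedMarginsSketch, the helpers toFeatureMajor and computeMargins, and the plain index/value arrays standing in for features.foreachActive are all hypothetical and exist only for illustration.

```scala
object TransposedMarginsSketch {

  /** Copies a class-major array (numClasses rows of `stride` entries each)
    * into a feature-major array (`stride` rows of numClasses entries each). */
  def toFeatureMajor(
      coefficients: Array[Double],
      numClasses: Int,
      stride: Int): Array[Double] = {
    val out = new Array[Double](coefficients.length)
    var i = 0
    while (i < numClasses) {
      var j = 0
      while (j < stride) {
        out(j * numClasses + i) = coefficients(i * stride + j)
        j += 1
      }
      i += 1
    }
    out
  }

  /** Computes the per-class margins from feature-major coefficients.
    * `indices` and `values` hold the active (non-zero) entries of the
    * feature vector; this stands in for features.foreachActive. */
  def computeMargins(
      coefFeatureMajor: Array[Double],
      indices: Array[Int],
      values: Array[Double],
      featuresStd: Array[Double],
      numClasses: Int,
      fitIntercept: Boolean): Array[Double] = {
    val numFeatures = featuresStd.length
    val result = Array.ofDim[Double](numClasses)
    var k = 0
    while (k < indices.length) {
      val index = indices(k)
      val value = values(k)
      if (featuresStd(index) != 0.0 && value != 0.0) {
        // Normalize once per active feature, as in the snippet above.
        val temp = value / featuresStd(index)
        // All classes for this feature sit next to each other in memory.
        val offset = index * numClasses
        var i = 0
        while (i < numClasses) {
          result(i) += coefFeatureMajor(offset + i) * temp
          i += 1
        }
      }
      k += 1
    }
    if (fitIntercept) {
      // Assumption: intercepts occupy the block after the last feature.
      val offset = numFeatures * numClasses
      var i = 0
      while (i < numClasses) {
        result(i) += coefFeatureMajor(offset + i)
        i += 1
      }
    }
    result
  }
}
```

With this layout the inner loop over classes reads a contiguous run of numClasses doubles instead of striding by numFeaturesPlusIntercept. Whether that actually helps still needs benchmarking, and the gradient update would have to use the same layout.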
> Use level 2 BLAS operations in LogisticAggregator
> -------------------------------------------------
>
> Key: SPARK-17134
> URL: https://issues.apache.org/jira/browse/SPARK-17134
> Project: Spark
> Issue Type: Sub-task
> Components: ML
> Reporter: Seth Hendrickson
>
> Multinomial logistic regression uses LogisticAggregator class for gradient
> updates. We should look into refactoring MLOR to use level 2 BLAS operations
> for the updates. Performance testing should be done to show improvements.