GitHub user mengxr commented on the pull request:
https://github.com/apache/spark/pull/3997#issuecomment-70217074
@hhbyyh @srowen Unnecessary index lookups cause some performance issues
here. Making many `other.values(i)` calls is slower than assigning `val
otherValues = other.values` once and then making many `otherValues(i)`
calls. I'm suggesting some code like the following (I didn't try compiling
the code):
~~~scala
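// Hoist the arrays into locals so the loops avoid repeated field lookups.
val ii0 = this.indices
val vv0 = this.values
val ii1 = other.indices
val vv1 = other.values
// Number of stored (explicit) entries on each side.
val sz0 = ii0.length
val sz1 = ii1.length
var j0 = 0
var j1 = 0
var i0 = 0
var i1 = 0
var v0 = 0.0
var v1 = 0.0
// pj0/pj1 track the last position read, so each entry is loaded only once.
var pj0 = -1
var pj1 = -1
var allEqual = true
while (allEqual && j0 < sz0 && j1 < sz1) {
  if (pj0 < j0) {
    i0 = ii0(j0)
    v0 = vv0(j0)
    pj0 = j0
  }
  if (pj1 < j1) {
    i1 = ii1(j1)
    v1 = vv1(j1)
    pj1 = j1
  }
  if (i0 == i1) {
    // Same index on both sides: the values must match.
    allEqual &&= v0 == v1
    j0 += 1
    j1 += 1
  } else if (i0 < i1) {
    // Index present only in this vector: it must be an explicit zero.
    allEqual &&= v0 == 0.0
    j0 += 1
  } else {
    // Index present only in the other vector: it must be an explicit zero.
    allEqual &&= v1 == 0.0
    j1 += 1
  }
}
// Any entries left over on either side must also be explicit zeros.
while (allEqual && j0 < sz0) {
  allEqual &&= vv0(j0) == 0.0
  j0 += 1
}
while (allEqual && j1 < sz1) {
  allEqual &&= vv1(j1) == 0.0
  j1 += 1
}
allEqual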
~~~
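
For trying the idea outside of Spark, here is a minimal standalone sketch of the same merge-style comparison. It assumes each sparse vector is given as a sorted index array plus a parallel value array, and that both vectors have the same logical size. The object name `SparseEqualsSketch` and the helper `sparseEquals` are hypothetical names for illustration, not part of the pull request, and the sketch drops the `pj0`/`pj1` caching for brevity.

~~~scala
object SparseEqualsSketch {
  // Hypothetical helper: ii0/ii1 are sorted indices, vv0/vv1 the parallel values.
  def sparseEquals(
      ii0: Array[Int], vv0: Array[Double],
      ii1: Array[Int], vv1: Array[Double]): Boolean = {
    val sz0 = ii0.length
    val sz1 = ii1.length
    var j0 = 0
    var j1 = 0
    var allEqual = true
    // Walk both index arrays in step, like the merge phase of merge sort.
    while (allEqual && j0 < sz0 && j1 < sz1) {
      val i0 = ii0(j0)
      val i1 = ii1(j1)
      if (i0 == i1) {
        // Index stored on both sides: values must match.
        allEqual = vv0(j0) == vv1(j1)
        j0 += 1
        j1 += 1
      } else if (i0 < i1) {
        // Index stored only on the left: it must be an explicit zero.
        allEqual = vv0(j0) == 0.0
        j0 += 1
      } else {
        // Index stored only on the right: it must be an explicit zero.
        allEqual = vv1(j1) == 0.0
        j1 += 1
      }
    }
    // Leftover entries on either side must be explicit zeros.
    while (allEqual && j0 < sz0) {
      allEqual = vv0(j0) == 0.0
      j0 += 1
    }
    while (allEqual && j1 < sz1) {
      allEqual = vv1(j1) == 0.0
      j1 += 1
    }
    allEqual
  }
}
~~~

With this sketch, `sparseEquals(Array(0, 2), Array(1.0, 0.0), Array(0), Array(1.0))` treats the explicitly stored zero at index 2 as equal to a missing entry, while `sparseEquals(Array(0), Array(1.0), Array(1), Array(1.0))` reports a mismatch.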