GitHub user bwahlgreen opened a pull request:
https://github.com/apache/spark/pull/15296
[SPARK-17721][MLlib][ML] Fix for multiplying transposed SparseMatrix with
SparseVector [WIP]
## What changes were proposed in this pull request?
* changes the implementation of gemv with transposed SparseMatrix and
SparseVector both in mllib-local and mllib (identical)
* adds a test that was failing before this change, but succeeds with these
changes.
The problem in the previous implementation was that it only incremented `i`,
the index that enumerates the column-indices of a row in the SparseMatrix, when
the current row-index of the SparseVector matched the column-index at `i`. When
a row of the SparseMatrix has non-zero values at column-indices lower than the
corresponding non-zero row-indices of the SparseVector, the non-zero values of
the SparseVector are enumerated without ever matching the column-index at
index `i`, so the remaining column-indices `i+1,...,indEnd-1` are never
visited and their contributions are dropped. The test cases in this PR
illustrate this issue.
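To make the failure mode concrete, here is a minimal sketch of a two-pointer
merge that advances whichever index is behind. It is not the code from this
patch: the object name, the function name and the CSR-style parameters
(`rowPtrs`, `colIndices`) are assumptions chosen for illustration, standing in
for the way a transposed SparseMatrix exposes its rows.

```scala
object SparseGemvSketch {
  // Hypothetical CSR-style layout for a transposed sparse matrix:
  // entries of row r live at positions rowPtrs(r) until rowPtrs(r + 1).
  // Column indices within a row and xIndices are assumed sorted ascending.
  def gemvTransposed(
      numRows: Int,
      rowPtrs: Array[Int],
      colIndices: Array[Int],
      values: Array[Double],
      xIndices: Array[Int],
      xValues: Array[Double]): Array[Double] = {
    val y = new Array[Double](numRows)
    var r = 0
    while (r < numRows) {
      var i = rowPtrs(r)          // position in the matrix row
      val indEnd = rowPtrs(r + 1)
      var k = 0                   // position in the sparse vector
      var sum = 0.0
      while (i < indEnd && k < xIndices.length) {
        if (colIndices(i) == xIndices(k)) {
          // Indices match: accumulate the product and advance both sides.
          sum += values(i) * xValues(k)
          i += 1
          k += 1
        } else if (colIndices(i) < xIndices(k)) {
          // Matrix entry has no partner in x; skip it instead of stalling.
          i += 1
        } else {
          // Vector entry has no partner in this row; skip it.
          k += 1
        }
      }
      y(r) = sum
      r += 1
    }
    y
  }
}
```

As a usage example, take the 2x3 matrix with rows `(1, 0, 2)` and `(0, 3, 0)`
and a sparse vector whose only non-zero is `4.0` at index 2. Calling
`SparseGemvSketch.gemvTransposed(2, Array(0, 2, 3), Array(0, 2, 1),
Array(1.0, 2.0, 3.0), Array(2), Array(4.0))` yields `Array(8.0, 0.0)`. A loop
that advances `i` only on a match would exhaust the vector while still looking
at column 0 of the first row and return `0.0` for that row.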
## How was this patch tested?
I have run the specific `gemv` tests in both mllib-local and mllib. I am
currently still running `./dev/run-tests`.
## ___
As per instructions, I hereby state that this is my original work and that
I license the work to the project (Apache Spark) under the project's open
source license.
Mentioning @dbtsai, @viirya and @brkyvz, who I can see have worked on or
authored these parts before.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/bwahlgreen/spark bugfix-spark-17721
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/15296.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #15296
----
commit 276591c25418cb0766fa82853fa361d113b02e2c
Author: Bjarne Fruergaard <[email protected]>
Date: 2016-09-29T09:32:08Z
fix gemv
commit 60bc8a4c2d5fa5feb7bf8bf15b483c37a410b017
Author: Bjarne Fruergaard <[email protected]>
Date: 2016-09-29T09:32:35Z
additional tests for gemv
----