On Sat, Dec 12, 2009 at 8:58 AM, Sean Owen <[email protected]> wrote:
> ...
> I think that's the culprit in fact, having to load all the column
> vectors, since they're not light.
>
If the vector-matrix product is done like this:
Vector w = v.like();
MultiplyAdd scale = new MultiplyAdd();
// walk only the non-zero entries of v, accumulating w += v_j * column_j
Iterator<Vector.Element> it = v.iterateNonZero();
while (it.hasNext()) {
  Vector.Element element = it.next();
  scale.setScale(element.get());
  w.assign(getColumn(element.index()), scale);
}
Then you might see better speed, especially since you can lazily load
just the columns you want. Google Collections has an interesting
dynamic map builder that would be useful for this.
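If that builder is the one I'm thinking of (MapMaker), a lazy column
cache might look roughly like the sketch below, where loadColumn(int)
is a made-up name standing in for whatever actually fetches one column:

import java.util.concurrent.ConcurrentMap;
import com.google.common.base.Function;
import com.google.common.collect.MapMaker;

// Columns are fetched on first access and cached afterwards;
// softValues() lets the GC drop cached columns under memory pressure.
ConcurrentMap<Integer, Vector> columns = new MapMaker()
    .softValues()
    .makeComputingMap(new Function<Integer, Vector>() {
      public Vector apply(Integer index) {
        return loadColumn(index);   // hypothetical single-column fetch
      }
    });

Vector getColumn(int index) {
  return columns.get(index);
}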
>
> One approach is to make the user vectors more sparse by throwing out
> data, though I don't like it so much.
>
This is often useful, actually, but more in the sense of retaining
only recent events than of sparsification as such. If you get lots of
data per user, then this isn't much of a problem. If your events are
rarer, you may have more of an issue (ratings are the prime example).
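As a sketch of the retain-recent idea (Event and getTimestamp() are
stand-ins for whatever your log actually records):

import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Keep only the newest maxEvents events for a user rather than
// thinning the data at random.
List<Event> retainRecent(List<Event> events, int maxEvents) {
  Collections.sort(events, new Comparator<Event>() {
    public int compare(Event a, Event b) {
      long diff = b.getTimestamp() - a.getTimestamp();   // newest first
      return diff < 0 ? -1 : (diff > 0 ? 1 : 0);
    }
  });
  return events.subList(0, Math.min(maxEvents, events.size()));
}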
>
> One question -- in SparseVector, can't we internally remove entries
> when they are set to 0.0? since implicitly missing entries are 0?
>
Absolutely. Depending on the representation, deleting elements may not
help much unless subsequent insertions fill in the holes (with open
addressing, a removal just leaves a tombstone slot that a later
insertion can reclaim).
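The removal itself is just a guard in set(); a minimal sketch,
assuming the vector is backed by some index-to-value map:

// Treat set(i, 0.0) as removal, since an implicitly missing entry
// already reads back as 0.0. "map" is a stand-in for whatever
// int-to-double map backs the sparse vector.
public void set(int index, double value) {
  if (value == 0.0) {
    map.remove(index);
  } else {
    map.put(index, value);
  }
}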
--
Ted Dunning, CTO
DeepDyve