Re: [jira] Updated: (MAHOUT-121) Speed up distance calculations for sparse vectors

Grant Ingersoll Wed, 24 Jun 2009 16:21:20 -0700


On Jun 24, 2009, at 6:53 PM, Ted Dunning wrote:

Grant,

This optimization should have made a large difference.  Did it?

Yes. Still quantifying, but very promising. Still having a hard time finding good t1, t2 values for the simple tests I am running clustering Wikipedia data, so that is clouding things. It seems no matter what I pick, I get one vector per canopy. Obviously, something is wrong, but I don't know what. Sigh. Of course, it could be the fact the docs I'm clustering aren't related, I guess. I'm only doing the first 1000 from a dump. I'll try a bigger version now.

All the tests pass with the changes, though, and I had the same problem before.
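To illustrate the "one vector per canopy" symptom described above, here is a minimal sketch of canopy clustering over sparse vectors. It is not Mahout's implementation: the function names (`cosine_distance`, `canopy_cluster`), the dict-based sparse vectors, the choice of cosine distance, and the toy documents are all assumptions made for illustration. It shows two ways every point can end up alone: thresholds that are tiny relative to the typical pairwise distance, and documents that share no terms at all.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between sparse vectors stored as {index: weight} dicts."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the shorter vector only
    dot = sum(w * b.get(i, 0.0) for i, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def canopy_cluster(points, t1, t2, distance=cosine_distance):
    """Return canopies as lists of point indices; t1 is the loose threshold, t2 the tight one."""
    assert t1 > t2, "canopy clustering expects t1 > t2"
    remaining = list(range(len(points)))
    canopies = []
    while remaining:
        center = remaining.pop(0)  # next unassigned point becomes a canopy center
        members = [center]
        still_remaining = []
        for idx in remaining:
            d = distance(points[center], points[idx])
            if d < t1:
                members.append(idx)  # loosely close: joins this canopy
            if d >= t2:
                still_remaining.append(idx)  # not tightly close: may still seed or join others
        remaining = still_remaining
        canopies.append(members)
    return canopies


# Hypothetical toy documents: two related pairs plus one unrelated document.
docs = [
    {0: 1.0, 1: 1.0},
    {0: 1.0, 1: 0.8},
    {2: 1.0, 3: 1.0},
    {2: 0.9, 3: 1.0},
    {4: 1.0},
]

tiny = canopy_cluster(docs, t1=0.001, t2=0.0005)    # thresholds below every pairwise distance
looser = canopy_cluster(docs, t1=0.5, t2=0.25)      # thresholds that admit the related pairs
print(len(tiny), looser)
```

With the tiny thresholds every document seeds its own canopy. With the looser ones the related pairs merge, while the unrelated document still sits alone, which matches the suspicion that unrelated docs could produce the same symptom. Printing a histogram of pairwise distances on a sample is one way to see which range t1 and t2 need to fall in.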

The triangle inequality trick should help by a factor of two or more as well.
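The message does not spell out which form of the trick is meant. One common variant, sketched here as an assumption, uses precomputed vector norms as a cheap lower bound: for Euclidean distance, | ||x|| - ||c|| | <= ||x - c||, so if the norms alone differ by at least the threshold, the full distance never has to be computed. The names `within_threshold` and `stats` are hypothetical, and the bound applies only to true metrics such as Euclidean distance, not to cosine distance.

```python
import math


def norm(v):
    """Euclidean norm of a sparse vector stored as an {index: weight} dict."""
    return math.sqrt(sum(w * w for w in v.values()))


def euclidean(a, b):
    """Full Euclidean distance between two sparse vectors."""
    keys = a.keys() | b.keys()
    return math.sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in keys))


def within_threshold(x, x_norm, c, c_norm, t, stats):
    """Return True if dist(x, c) < t, skipping the full computation when the bound rules it out."""
    # Triangle inequality lower bound: | ||x|| - ||c|| | <= ||x - c||
    if abs(x_norm - c_norm) >= t:
        stats["skipped"] += 1
        return False
    stats["computed"] += 1
    return euclidean(x, c) < t


# Hypothetical data: one center and three candidate points.
center = {0: 1.0, 1: 1.0}
points = [
    {0: 1.0, 1: 1.2},    # close to the center
    {0: 10.0, 5: 3.0},   # much larger norm: pruned by the bound
    {2: 1.0, 3: 1.0},    # same norm as the center but far away: bound cannot prune
]

stats = {"skipped": 0, "computed": 0}
center_norm = norm(center)
results = [within_threshold(p, norm(p), center, center_norm, 0.5, stats) for p in points]
print(results, stats)
```

The third point shows the limit of this bound: vectors with similar norms still need the full distance, so the actual saving depends on how the norms are spread across the data.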
