PS I think the issue is really more like this, after some more testing.
When lambda (overfitting parameter) is high, the X and Y in the
factorization A = X*Y' are forced to have a small (Frobenius) norm.
They underfit A, potentially a lot, if lambda is high; the values of A
are always small and
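
To make the shrinkage concrete, here is a rough sketch rather than Mahout's implementation: a rank-1, dense, pure-Python alternating least squares with a plain L2 penalty lambda * (|x|^2 + |y|^2). The function names, the toy matrix and the two lambda values are all made up for illustration, and lambda is assumed to be positive. It only shows that a larger lambda yields smaller factors and a worse fit to A.

```python
import math

def als_rank1(A, lam, iters=200):
    """Rank-1 ALS for min sum_ij (A_ij - x_i*y_j)^2 + lam*(|x|^2 + |y|^2).

    Illustrative only: dense input, rank 1, lam > 0 assumed.
    """
    m, n = len(A), len(A[0])
    x = [1.0] * m
    y = [1.0] * n
    for _ in range(iters):
        # Solve for x with y fixed (regularized least squares, closed form).
        yy = sum(v * v for v in y) + lam
        x = [sum(A[i][j] * y[j] for j in range(n)) / yy for i in range(m)]
        # Solve for y with x fixed.
        xx = sum(v * v for v in x) + lam
        y = [sum(A[i][j] * x[i] for i in range(m)) / xx for j in range(n)]
    return x, y

def factor_norm(x, y):
    """Combined Frobenius norm of the two factors."""
    return math.sqrt(sum(v * v for v in x) + sum(v * v for v in y))

def fit_error(A, x, y):
    """Frobenius norm of the residual A - x*y'."""
    return math.sqrt(sum((A[i][j] - x[i] * y[j]) ** 2
                         for i in range(len(A)) for j in range(len(A[0]))))

# Hypothetical toy data, not taken from the thread.
A = [[5.0, 3.0, 4.0],
     [4.0, 2.0, 5.0],
     [3.0, 5.0, 4.0]]

low = als_rank1(A, lam=0.1)
high = als_rank1(A, lam=8.0)

print("lambda=0.1: factor norm %.3f, fit error %.3f" % (factor_norm(*low), fit_error(A, *low)))
print("lambda=8.0: factor norm %.3f, fit error %.3f" % (factor_norm(*high), fit_error(A, *high)))
```

With the larger lambda the factor norm comes out smaller and the residual larger, which is the underfitting described above.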
Okay, it sheds some light on the problem.
Thanks for sharing.
On Mon, Apr 8, 2013 at 4:33 AM, Sean Owen sro...@gmail.com wrote:
PS I think the issue is really more like this, after some more testing.
When lambda (overfitting parameter) is high, the X and Y in the
factorization A = X*Y' are
This sounds like the best suggestion so far.
On Apr 3, 2013, at 8:45 AM, Julien Nioche wrote:
This is typically what Behemoth can be used for
https://github.com/DigitalPebble/behemoth. It has a Mahout module to
generate vectors in the same format as SparseVectorsFromSequenceFiles.
I don't see the problem here. We only want to compare two items so Jaccard and
Tanimoto are identical.
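
A quick way to check the equivalence, as a minimal sketch and assuming each item is represented by the set of users who interacted with it (binary data). The helper names and the sample sets are hypothetical and are not Mahout code.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)

def tanimoto(x, y):
    """Tanimoto coefficient of two vectors: x.y / (|x|^2 + |y|^2 - x.y)."""
    dot = sum(p * q for p, q in zip(x, y))
    denom = sum(p * p for p in x) + sum(q * q for q in y) - dot
    return dot / denom if denom else 0.0

def to_binary_vector(members, universe):
    """Encode set membership as a 0/1 vector over a fixed ordering."""
    return [1 if u in members else 0 for u in universe]

# Hypothetical example: users who interacted with each of two items.
universe = ["u1", "u2", "u3", "u4", "u5"]
item1 = {"u1", "u2", "u3"}
item2 = {"u2", "u3", "u4"}

j = jaccard(item1, item2)
t = tanimoto(to_binary_vector(item1, universe),
             to_binary_vector(item2, universe))
print(j, t)
```

For 0/1 vectors the dot product counts the intersection and the denominator counts the union, so the two values agree. With non-binary weights the Tanimoto formula no longer reduces to the set form.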
Could you file a JIRA and suggest a javadoc patch?
Why did this take you to an ancient journal instead of Wikipedia?
On Apr 7, 2013, at 6:54 AM, James Endicott wrote:
As far as I can
I didn't want to file a suggestion for a javadoc patch without hearing from
someone who knows a bit more about the math history behind it because I
didn't want to suggest something that may be in error. When I checked the
Wikipedia article on it, the article noted that there was confusion an
Hi,
It seems that in-memory kmeans clustering has been removed from Mahout 0.7.
Does this mean that it is no longer possible to do in-memory kmeans clustering
with Mahout?
Or is Hadoop-based kmeans clustering the only option?
Thanks
Ahmet
On Sat, Apr 6, 2013 at 3:26 PM, Pat Ferrel p...@occamsmachete.com wrote:
I guess I don't understand this issue.
In my case both the item ids and user ids of the separate DistributedRowMatrix
instances will match, and I know the size of the entire space from a previous
step where I create id maps. I
To my mind, you as the reader have a major voice here.
So if you were confused/not happy with the doc, then there is a problem.
You will know best how to fix that when you get done.
So let us know how!
On Mon, Apr 8, 2013 at 2:16 PM, James Endicott endicott.ja...@gmail.com wrote:
I didn't