I think the downsampling belongs in RowSimilarityJob. But I also think that we need a special CrossRowSimilarityJob that computes B'A and also downsamples during the computation. Furthermore, it should compute LLR similarities between the rows, not dot products.
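To make the idea concrete, here is a rough sketch in plain Python (not Mahout code) of what such a job would compute: cap each user's interactions while building the cross-cooccurrence counts for B'A, then score every (B item, A item) pair with a log-likelihood ratio instead of the raw count. The names `cross_llr`, `downsample` and `max_items_per_user` are made up for illustration, and the LLR is the standard 2x2 contingency-table formulation from Dunning, which I am assuming is what we want here.

```python
import math
import random
from collections import defaultdict


def x_log_x(x):
    return 0.0 if x == 0 else x * math.log(x)


def entropy(*counts):
    # Unnormalized entropy: N*log(N) - sum(k*log(k))
    return x_log_x(sum(counts)) - sum(x_log_x(k) for k in counts)


def llr(k11, k12, k21, k22):
    # Log-likelihood ratio of a 2x2 contingency table of user counts.
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    return max(0.0, 2.0 * (row + col - mat))


def downsample(items, max_items, rng):
    # Keep at most max_items interactions for one user.
    items = sorted(items)
    if len(items) <= max_items:
        return items
    return rng.sample(items, max_items)


def cross_llr(a_by_user, b_by_user, max_items_per_user=500, seed=42):
    # a_by_user / b_by_user: dict of user -> set of items (the rows of A and B).
    rng = random.Random(seed)
    cooc = defaultdict(int)     # entries of B'A after downsampling
    a_count = defaultdict(int)  # users per A item
    b_count = defaultdict(int)  # users per B item
    users = sorted(set(a_by_user) | set(b_by_user))
    for user in users:
        a_items = downsample(a_by_user.get(user, ()), max_items_per_user, rng)
        b_items = downsample(b_by_user.get(user, ()), max_items_per_user, rng)
        for a in a_items:
            a_count[a] += 1
        for b in b_items:
            b_count[b] += 1
        for b in b_items:
            for a in a_items:
                cooc[(b, a)] += 1
    n = len(users)
    scores = {}
    for (b, a), k11 in cooc.items():
        k12 = b_count[b] - k11       # users with b but not a
        k21 = a_count[a] - k11       # users with a but not b
        k22 = n - k11 - k12 - k21    # users with neither
        scores[(b, a)] = llr(k11, k12, k21, k22)
    return scores
```

The point of downsampling inside the loop is that a perversely prolific user contributes at most `max_items_per_user` squared pairs to B'A, rather than being trimmed somewhere the cross product never sees.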
--sebastian

On 05.08.2013 16:14, Pat Ferrel wrote:
> OK, I see it in my build now. Also not sufficient repos in the pom.
>
> Looks like some major refactoring of RowSimilarity is in progress.
>
> Sebastian, are you sure downsampling belongs in RowSimilarity? It won't be
> applied to [B'A]?
>
> If so I'll update to the latest Mahout trunk.
>
> On Aug 4, 2013, at 8:57 PM, B Lyon wrote:
>
> Hi Pat
>
> Below is the compilation error - it's what led me to look at the SAMPLE_SIZE
> stuff in the first place, where I confirmed via javap that the downloaded
> mahout jar did not have it any more and then I started looking at the svn
> source. Mebbe I've got something else misconfigured somehow, although I
> don't see how it would compile if it's looking for that static field that's
> removed.
>
> [ERROR] Failed to execute goal
> org.apache.maven.plugins:maven-compiler-plugin:2.5.1:compile
> (default-compile) on project solr-recommender: Compilation failure:
> Compilation failure:
> [ERROR]
> /Users/bradflyon/Documents/solr-recommender/src/main/java/finderbots/recommenders/hadoop/PrepareActionMatrixesJob.java:[120,71]
> cannot find symbol
> [ERROR] symbol : variable SAMPLE_SIZE
> [ERROR] location: class
> org.apache.mahout.cf.taste.hadoop.preparation.ToItemVectorsMapper
> [ERROR]
> /Users/bradflyon/Documents/solr-recommender/src/main/java/finderbots/recommenders/hadoop/PrepareActionMatrixesJob.java:[168,71]
> cannot find symbol
> [ERROR] symbol : variable SAMPLE_SIZE
> [ERROR] location: class
> org.apache.mahout.cf.taste.hadoop.preparation.ToItemVectorsMapper
> [
>
> On Sun, Aug 4, 2013 at 8:57 PM, Pat Ferrel wrote:
> Just updated to today's Mahout trunk and everything works for me.
>
> Can you send me the error?
>
> Sebastian, do we really want this limit in RowSimilarity? It will not be
> applied to [B'A] unless you also do a mod to give us RowSimilarity on two
> matrices. Now that would be very nice indeed…
>
> On Aug 3, 2013, at 9:48 PM, B Lyon wrote:
>
> Hi Pat
>
> I was going to just play with building the solr-recommender stuff in its
> current wip state and noticed a compile error (running mvn install) I think
> because the 0.9 snapshot has some changes on July 30th
>
> http://svn.apache.org/viewvc?view=revision&revision=1508302
>
> Basically, back on June 18, Ted noticed that the downsampling might not be
> being done at the right place to actually avoid overwork due to "perversely
> prolific users" (thread is here:
> http://web.archiveorange.com/archive/v/z6zxQatCzHoFxbdLF0of), and someone
> else (Sebastian Schelter) has already acted on this (July 30) to move the
> downsampling to somewhere else (MAHOUT-1289 -
> https://issues.apache.org/jira/browse/MAHOUT-1289), which (among other
> things) removes the SAMPLE_SIZE static variable from ToItemVectorsMapper. I
> don't know how the general changes affect what you were setting up/playing
> with. Let me know if I've missed something here.
