Sean Owen <srowen <at> gmail.com> writes:

> 
> Parallel ALS is exactly an example of where you can use matrix
> factorization for "0/1" data.
> 
> On Mon, May 6, 2013 at 9:22 PM, Tevfik Aytekin <tevfik.aytekin <at> gmail.com> wrote:
> > Hi Sean,
> > Aren't boolean preferences supported in the context of memory-based
> > recommendation algorithms in Mahout?
> > Are there matrix factorization algorithms in Mahout that can work
> > with this kind of data (that is, data consisting of users and the
> > movies they have seen)?
> >
> > On Mon, May 6, 2013 at 10:34 PM, Sean Owen <srowen <at> gmail.com> wrote:
> >> Yes, it goes by the name 'boolean prefs' in the project since target
> >> variables don't have values -- they just exist or don't.
> >> So, yes it's certainly supported but the question here is how to
> >> evaluate the output.
> >>
> >> On Mon, May 6, 2013 at 8:29 PM, Tevfik Aytekin <tevfik.aytekin <at> gmail.com> wrote:
> >>> This is called the one-class classification problem. In the domain
> >>> of collaborative filtering it is called one-class collaborative
> >>> filtering (since all you have are positive preferences). You may
> >>> search the web with these keywords to find papers proposing
> >>> solutions. I'm not sure whether Mahout has algorithms for one-class
> >>> collaborative filtering.
> >>>
> >>> On Mon, May 6, 2013 at 1:42 PM, Sean Owen <srowen <at> gmail.com> wrote:
> >>>> ALS-WR weights the error on each term differently, so the average
> >>>> error doesn't really have meaning here, even if you are comparing the
> >>>> difference with "1". I think you will need to fall back to mean
> >>>> average precision or something.
> >>>>
> >>>> On Mon, May 6, 2013 at 11:24 AM, William <icswilliam2010 <at> gmail.com> wrote:
> >>>>> Sean Owen <srowen <at> gmail.com> writes:
> >>>>>
> >>>>>>
> >>>>>> If you have no ratings, how are you using RMSE? This typically
> >>>>>> measures error in reconstructing ratings.
> >>>>>> I think you are probably measuring something meaningless.
> >>>>>>
> >>>>>
> >>>>>
> >>>>> I assume the rating of each seen movie is 1. Is that right?
> >>>>> If I use Collaborative Filtering with ALS-WR to get some
> >>>>> recommendations, must I have a real rating matrix?
> >>>>>
> >>>>>
> >>>>>
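
For reference, the mean average precision suggested in the quoted thread could be computed along the following lines. This is only a minimal sketch and not Mahout code: the function names, the dict-based inputs, and the choice to normalize by the number of held-out items are all assumptions made for illustration. It expects, per user, a ranked recommendation list without duplicates and a set of held-out items the user actually saw.

```python
def average_precision(ranked_items, relevant_items):
    """Average precision of one ranked list against a set of relevant items.

    Assumption: normalizes by the number of relevant (held-out) items,
    which is one common variant of the metric.
    """
    relevant = set(relevant_items)
    if not relevant:
        return 0.0
    hits = 0
    score = 0.0
    for rank, item in enumerate(ranked_items, start=1):
        if item in relevant:
            hits += 1
            # Precision at this rank, counted only where a hit occurs.
            score += hits / rank
    return score / len(relevant)


def mean_average_precision(recommendations, held_out):
    """Mean of average_precision over users with at least one held-out item.

    recommendations: dict of user -> ranked list of recommended items
    held_out: dict of user -> collection of items hidden from training
    (both structures are hypothetical, chosen for this sketch)
    """
    users = [u for u in held_out if held_out[u]]
    if not users:
        return 0.0
    total = sum(
        average_precision(recommendations.get(u, []), held_out[u])
        for u in users
    )
    return total / len(users)
```

With boolean data, the held-out items play the role of the "1" entries removed from the training set, so no rating values are needed to score the ranking.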

I was wondering what format the output produced by parallelALS is
stored in. More specifically, I am looking for a way to decode/read this
information.

I have been able to run the mahout parallelALS command, calculate RMSE using
mahout evaluateFactorization, and generate recommendations via mahout
recommendfactorized.

However, I would like to take a closer look at things like the factorized
products for my probeSet (stored in --tempDir by the 'mahout
evaluateFactorization' command) and the actual feature vectors stored in the
/out/U/ and /out/M/ directories.

thanks
AJ 

