The machinery of SVD is almost always described in terms of least squares
matrix approximation without mentioning the probabilistic underpinnings of
why least-squares is a good idea. The connection, however, goes all the way
back to Gauss' reduction of planetary position observations.
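That least-squares view can be made concrete: by the Eckart-Young theorem, the truncated SVD gives the best rank-k fit to a matrix in the squared-error sense. A small NumPy sketch (toy data, not thread code) of that property:

```python
import numpy as np

# Least-squares view of SVD (Eckart-Young): among all rank-k matrices,
# the one built from the top-k singular triples minimizes the squared
# (Frobenius) reconstruction error.
rng = np.random.default_rng(0)
A = rng.standard_normal((8, 5))

U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
A_k = (U[:, :k] * s[:k]) @ Vt[:k, :]  # best rank-2 least-squares fit to A

# The residual squared error is exactly the energy in the discarded
# singular values -- no rank-k matrix can do better.
residual = np.linalg.norm(A - A_k, "fro") ** 2
assert np.isclose(residual, np.sum(s[k:] ** 2))
```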
Thanks David, that helped.
On Wed, Apr 1, 2009 at 1:47 AM, David Hall wrote:
On Wed, Apr 1, 2009 at 1:30 AM, Ted Dunning wrote:
> I would hope that your SVD implementation would not be limited to NetFlix
> like problems, but would be applicable to any reasonably sparse matrix-like
> data.
>
Yes, of course. It would apply to any large sparse matrix implementation.
questions in line.
On Wed, Apr 1, 2009 at 1:27 AM, Ted Dunning wrote:
> Nobody is working on SVD yet, but one GSOC applicant has said that they
> would like to work on LDA which is a probabilistic relative of SVD.
>
I do not understand the relation between LDA and SVD.
I would hope that your SVD implementation would not be limited to NetFlix
like problems, but would be applicable to any reasonably sparse matrix-like
data.
Likewise, I would expect a good SVD implementation to be useful for nearest
neighbor methods or direct prediction by smoothing the history vectors.
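The nearest-neighbor use mentioned above can be sketched with invented toy ratings: embed items in the latent space from a truncated SVD, compare them there, and use the low-rank reconstruction as the smoothed prediction. (The matrix and item labels here are made up for illustration.)

```python
import numpy as np

# Toy user x item ratings (zeros = unrated), invented for illustration.
R = np.array([
    [5., 4., 0., 1., 0.],
    [4., 5., 1., 0., 0.],
    [0., 1., 5., 4., 4.],
    [1., 0., 4., 5., 5.],
])

U, s, Vt = np.linalg.svd(R, full_matrices=False)
k = 2

# Embed items in the k-dimensional latent space and compare them there,
# instead of in the sparse raw rating space.
item_factors = Vt[:k, :].T * s[:k]
norms = np.linalg.norm(item_factors, axis=1)
sims = (item_factors @ item_factors.T) / np.outer(norms, norms)

# Smoothed rating estimates: the rank-k reconstruction fills in the zeros.
R_hat = (U[:, :k] * s[:k]) @ Vt[:k, :]

nearest_to_item0 = int(np.argsort(-sims[0])[1])  # [0] is item 0 itself
```

With this data the nearest neighbor of item 0 comes out as item 1, the other item rated highly by the same users.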
Nobody is working on SVD yet, but one GSOC applicant has said that they
would like to work on LDA which is a probabilistic relative of SVD.
The approach in your reference (3) is highly amenable to parallel
implementation.
Large-scale SVD would be a very interesting application for Mahout.
>I agree that getting a parallel SVD running is in and of itself
>probably a good project in terms of size. On the other hand it would
>be better to end up with a basic recommender as a final product, even
>if SVD doesn't make up a complete unit by itself for collaborative
>filtering.
SVD or a cousin was a very common feature among the leading Netflix
entries. SVD is, indeed, very slow if you do a complete decomposition. The
point, of course, for large sparse matrices is that you want an
approximation, so you only compute the first few singular vectors/values.
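The cheap-partial-decomposition point can be sketched with plain power iteration (a simplification; production solvers use Lanczos-style iterations with reorthogonalization, but the kernel is the same: only matrix-vector products, which stay cheap when the matrix is sparse):

```python
import numpy as np

# Power iteration on A^T A recovers the leading singular value/vector
# without ever forming the full decomposition.
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 100))

v = rng.standard_normal(100)
for _ in range(500):
    v = A.T @ (A @ v)        # one multiply by A, one by A^T
    v /= np.linalg.norm(v)

sigma1 = np.linalg.norm(A @ v)               # estimated top singular value
sigma1_exact = np.linalg.svd(A, compute_uv=False)[0]
```

Deflating (subtracting the recovered component) and repeating gives the next few singular triples, which is all a large sparse problem usually needs.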
Simple co-occurrence counting is at the heart of most large-scale
recommendation systems. Counting plus simple (but sound) statistical
filtering suffices for a broad range of recommendation tasks with very high
quality results. For statistical filtering, I typically recommend the G^2
statistic.
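The G^2 (log-likelihood ratio) score mentioned here works on a 2x2 contingency table of co-occurrence counts. A minimal sketch in the standard entropy formulation (toy counts, not thread code):

```python
import math

def g2(k11, k12, k21, k22):
    """G^2 (log-likelihood ratio) score for a 2x2 co-occurrence table:
    k11 = both events together, k12/k21 = one without the other,
    k22 = neither. Large scores flag pairs that co-occur far more
    (or less) often than independence predicts."""
    def h(*ks):
        # Unnormalized entropy term: sum of k*ln(k/N) over nonzero counts.
        n = sum(ks)
        return sum(k * math.log(k / n) for k in ks if k > 0)
    return 2.0 * (h(k11, k12, k21, k22)
                  - h(k11 + k12, k21 + k22)    # row totals
                  - h(k11 + k21, k12 + k22))   # column totals

strong = g2(10, 0, 0, 10)  # perfect association scores high
indep = g2(5, 5, 5, 5)     # independence scores ~0
```

Filtering co-occurrence counts by such a score keeps the statistically surprising pairs and discards the ones explained by item popularity alone.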
On Thu, Mar 5, 2009 at 4:24 AM, Sean Owen wrote:
> This would be a fantastic project, implementing a Recommender based on
> this approach . I tried implementing an SVD technique a couple years
> ago and it was waaay too slow on one machine. Revisiting with Hadoop
> sounds great.
On Thu, Mar 5, 2009 at 1:08 PM, QIU, Yin wrote:
> Glad that you are so positive about this. I just googled and found the
> article addressing parallel SVD [1], which was devised by Google. I
> shall spend some time reading this.
Hi Sean,
> Really, I have never run this code in a real Hadoop environment. There
> could be bugs, or improvements, that fall out from that. For example
> there might be some more efficient way to use Hadoop that I don't see.
> I don't have anything specific in mind -- these are unknown-unknowns
>
On Thu, Mar 5, 2009 at 7:27 AM, QIU, Yin wrote:
> I don't know slope one recommender yet. Maybe I should read that first
> to know how you manage to divide the tasks. However, a little
> explanation in advance would be appreciated.
http://en.wikipedia.org/wiki/Slope_One explains slope one pretty well.
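For reference, the Slope One scheme discussed in this thread fits in a few lines: learn average per-pair rating differences, then predict from a user's known ratings weighted by support. A minimal in-memory sketch with invented toy ratings (not Mahout's distributed code):

```python
from collections import defaultdict

# Toy data: user -> {item: rating}, invented for illustration.
ratings = {
    "u1": {"a": 5.0, "b": 3.0, "c": 2.0},
    "u2": {"a": 3.0, "b": 4.0},
    "u3": {"b": 2.0, "c": 5.0},
}

diff = defaultdict(float)   # (i, j) -> sum over users of (r_i - r_j)
count = defaultdict(int)    # (i, j) -> number of users who rated both

for prefs in ratings.values():
    for i in prefs:
        for j in prefs:
            if i != j:
                diff[(i, j)] += prefs[i] - prefs[j]
                count[(i, j)] += 1

def predict(user, item):
    # Weighted Slope One: each rated item j votes (avg_diff + r_j),
    # weighted by how many users rated both items.
    num = den = 0.0
    for j, r in ratings[user].items():
        c = count.get((item, j), 0)
        if c:
            num += (diff[(item, j)] / c + r) * c
            den += c
    return num / den if den else None

p = predict("u2", "c")  # u2 has not rated "c"
```

The per-pair diff/count tables are exactly the part that splits naturally across machines, which is what the Hadoop job above distributes.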
Hi!
> Yes there is a framework in the code for running a Recommender across
> machines in Hadoop, and a Hadoop job which distributes part of the
> processing for a slope one recommender.
I don't know slope one recommender yet. Maybe I should read that first
to know how you manage to divide the tasks.
(Oops, of course. Didn't mean to imply there should be a side conversation
but that's how it came out. I just mean there is definitely at least one
person here who could and would 'mentor' such a project.)
On Mar 4, 2009, at 3:55 AM, Sean Owen wrote:
Yes there is a framework in the code for running a Recommender across
machines in Hadoop, and a Hadoop job which distributes part of the
processing for a slope one recommender.
Both could use testing, refinement and enhancement.
I do not know of an algorithm which is by nature efficiently distributable.
Hi mahout folks,
For this year's GSoC, I'm particularly interested in CF-related
algorithms running on MapReduce-like environments. Could anyone tell me
about the current status of recommender algorithms in Mahout, please?
Do they need any improvement?
Thanks a lot.
--
Yin Qiu