So if X is the array of existing ratings, would y be a 2D array then? If not,
how do you map the ratings given back to a single user (since y is
typically, to my knowledge, 1D in sklearn)?

I am still a little confused, but your example helped. Could you go
into a little more detail on X, x, and y?

Let's take an example with 5 users and 11 total items. That would make X a
5x11 matrix, right? What about y and x?
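
To make the question concrete, here is a minimal sketch of one way the
shapes could work out. It is only my assumption, not the proposed sklearn
API: the ratings are made up, the names (X_pairs, R, item_cosine_similarity,
predict_rating) are hypothetical, and I use plain cosine similarity with
unrated entries treated as zero, which is a simplification. The idea is that
y stays 1D if each row of X is a (user, item) pair, and the 5x11 matrix is
built from those pairs.

```python
import numpy as np

n_users, n_items = 5, 11

# Hypothetical ratings, one entry per (user, item) pair that was rated.
users = np.array([0, 0, 1, 1, 2, 2, 3, 4, 4])
items = np.array([0, 3, 0, 5, 3, 5, 10, 0, 3])
y = np.array([5.0, 3.0, 4.0, 2.0, 4.0, 1.0, 5.0, 2.0, 5.0])  # 1D, as usual

# One possible X: each row is a (user, item) pair, so y[i] rates X_pairs[i].
X_pairs = np.column_stack([users, items])  # shape (9, 2)

# The dense 5x11 user-by-item matrix; 0 means "not rated" in this sketch.
R = np.zeros((n_users, n_items))
R[users, items] = y


def item_cosine_similarity(R):
    """Cosine similarity between item columns (unrated entries count as 0)."""
    norms = np.linalg.norm(R, axis=0)
    norms[norms == 0] = 1.0  # avoid dividing by zero for unrated items
    normalized = R / norms
    return normalized.T @ normalized  # shape (n_items, n_items)


def predict_rating(R, sim, user, item):
    """Similarity-weighted average of the ratings the user already gave."""
    rated = np.flatnonzero(R[user])
    weights = sim[item, rated]
    denom = np.abs(weights).sum()
    if denom == 0:
        return 0.0  # no similar rated item to base a prediction on
    return float(weights @ R[user, rated] / denom)


sim = item_cosine_similarity(R)
x = R[0]  # a single user's ratings: a 1D vector of length 11
pred = predict_rating(R, sim, user=0, item=5)
print(X_pairs.shape, y.shape, R.shape, x.shape, round(pred, 3))
```

With this layout x would just be one row of the 5x11 matrix, and the
prediction for an unrated item only needs the item-item similarities plus
that row. Is that roughly what you had in mind?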


On Thu, Jan 16, 2014 at 8:29 AM, Manoj Kumar <manojkumarsivaraj...@gmail.com
> wrote:

> Thanks for your responses.
>
> @Kyle:
> At the risk of sounding really naive, I'd like to make the following
> comments. I'm referring to this paper that Sukru had posted,
> http://www.stat.osu.edu/~dmsl/Sarwar_2001.pdf, which is about item-based
> collaborative filtering. I don't think there is really any need for masking
> the items that are not selected by the target user (or the user for which
> you need to predict the item rating) here. I believe it would work for
> dense cases too. Let's look at a sample session here.
>
>     from sklearn.recsys import item_cf  # Tentative names.
>     # Arguments like the similarity criterion and the number of
>     # recommendations can be given in __init__.
>     clf = item_cf()
>     # Let's say there are n users who have already rated items.
>     # X is a 2-D array whose first dimension is n; the second can vary
>     # according to the number of items each user has rated.
>     # y is the ratings they have provided. These can be either binary,
>     # like +1 or -1, or continuous.
>     clf.fit(X, y)
>     # After clf.fit(X, y), an attribute clf.items_ would return the
>     # total number of items.
>     clf.predict(x)  # This will return the top n recommendations for x.
>     # For each item in clf.items_ that is not in x, similarity is
>     # calculated by taking the top k similar items in x.
>
> For user based CF, yes we need to provide a mask for the item for which we
> need to predict the rating, but I suppose that can be provided in the
> __init__ (can't it)?
>
> @Alex and Nick: Thanks for your references, I'll have a look right now.
>
> However, one point I don't intuitively understand is what clf.transform() /
> clf.fit_transform() should be doing in these cases. Any pointers? As for
> the mentor problem, I don't think that would be an issue if the community
> is genuinely interested in this project. If I do get a +1, I can start
> thinking about the timeline, the algorithms I'd like to implement, etc. I'm
> really looking forward to extending my really minor scikit-learn work right
> now as part of GSoC.
>
>
>
> ------------------------------------------------------------------------------
> CenturyLink Cloud: The Leader in Enterprise Cloud Services.
> Learn Why More Businesses Are Choosing CenturyLink Cloud For
> Critical Workloads, Development Environments & Everything In Between.
> Get a Quote or Start a Free Trial Today.
>
> http://pubads.g.doubleclick.net/gampad/clk?id=119420431&iu=/4140/ostg.clktrk
> _______________________________________________
> Scikit-learn-general mailing list
> Scikit-learn-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>
>