Well, y can be 2-D too; there are estimators like MultiTaskElasticNet
that are meant specifically for multi-task y.
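
To make the 2-D y point concrete, here is a minimal sketch of fitting
MultiTaskElasticNet on a target with several columns. The data is random
and purely illustrative; the shapes, the alpha value and the "true"
weights W are assumptions made up for this example.

```python
import numpy as np
from sklearn.linear_model import MultiTaskElasticNet

rng = np.random.RandomState(0)
X = rng.randn(20, 4)   # 20 samples, 4 features (illustrative sizes)
W = rng.randn(4, 3)    # hypothetical true weights for 3 tasks
Y = X @ W              # 2-D target of shape (20, 3)

est = MultiTaskElasticNet(alpha=0.1)  # alpha chosen arbitrarily
est.fit(X, Y)                         # y is 2-D: (n_samples, n_tasks)

pred = est.predict(X)
print(pred.shape)       # one column of predictions per task
print(est.coef_.shape)  # (n_tasks, n_features)
```

Note that y here is still a rectangular array, so every sample needs a
value for every task.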

I was thinking something along these lines. Let's say
["ham", "spam", "ram", "bam", "tam"] are the five items.

If the first user gives
"ham": 2
"spam": 3

and the second user gives
"ram": 1
"bam": -3
"tam": 4

then I was thinking X = [["ham", "spam"], ["ram", "bam", "tam"]] and
y = [[2, 3], [1, -3, 4]]
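
As a rough sketch of how such ragged X and y could be turned into a
regular users-by-items array, the snippet below fills a NumPy array with
NaN for unrated items. The second user's ratings are taken to be
[1, -3, 4], as in the listing above. The names item_index and ratings
are only illustrative, and NaN as the "not rated" marker is an
assumption, not a settled design.

```python
import numpy as np

items = ["ham", "spam", "ram", "bam", "tam"]
X = [["ham", "spam"], ["ram", "bam", "tam"]]  # items rated by each user
y = [[2, 3], [1, -3, 4]]                      # matching ratings

# Map each item name to a column index (hypothetical helper).
item_index = {name: j for j, name in enumerate(items)}

# Users x items array; NaN marks "not rated" (an assumption).
ratings = np.full((len(X), len(items)), np.nan)
for user, (user_items, user_ratings) in enumerate(zip(X, y)):
    for name, rating in zip(user_items, user_ratings):
        ratings[user, item_index[name]] = rating

print(ratings.shape)  # (number of users, number of items)
print(ratings)
```

For a large, mostly empty ratings table a sparse matrix would probably
be the better container; the dense array is just easier to read here.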

On Thu, Jan 16, 2014 at 8:56 PM, Kyle Kastner <kastnerk...@gmail.com> wrote:

> So X is the array of existing ratings, would y be a 2D array then? If not,
> how do you map the ratings given back to a single user (since y is
> typically, to my knowledge, 1D in sklearn)?
>
> I am still a little confused, but your example helped. Could you go
> into a little more detail on X, x, and y?
>
> Let's say for an example of 5 users, 11 total items. That would make X a
> 5x11 matrix, right? What about y and x?
>
>
> On Thu, Jan 16, 2014 at 8:29 AM, Manoj Kumar <
> manojkumarsivaraj...@gmail.com> wrote:
>
>> Thanks for your responses.
>>
>> @Kyle:
>> At the risk of sounding really naive, I'd like to make the following
>> comments. I'm referring to this paper that Sukru had posted,
>> http://www.stat.osu.edu/~dmsl/Sarwar_2001.pdf which is item based
>> collaborative filtering. I don't think there is really any need for masking
>> the items that are not selected by the target user (or the user for which
>> you need to predict the item rating) here. I believe it would work for
>> dense cases too. Let's look at a sample session here.
>>
>>     from sklearn.recsys import item_cf  # Tentative names.
>>     # Arguments like the similarity criterion and the number of
>>     # recommendations can be given in __init__.
>>     clf = item_cf()
>>     # Let's say there are n users who have already rated items.
>>     # X is a 2-D array whose first dimension is n; the second can
>>     # vary according to the number of items each user has rated.
>>     # y is the ratings they have provided. These can be either binary,
>>     # like +1 or -1, or continuous.
>>     clf.fit(X, y)
>>     # After clf.fit(X, y), an attribute clf.items_ would return
>>     # the total number of items.
>>     clf.predict(x)  # This will return the top n recommendations for x.
>>     # For each item in clf.items_ that is not in x, similarity
>>     # is calculated by taking the top k similar items in x.
>>
>> For user based CF, yes we need to provide a mask for the item for which
>> we need to predict the rating, but I suppose that can be provided in the
>> __init__ (can't it)?
>>
>> @Alex and Nick: Thanks for your references, I'll have a look right now.
>>
>> However, one point I don't intuitively understand is what
>> clf.transform() / clf.fit_transform() should do in these cases. Any
>> pointers? Regarding the mentor problem, I don't think that would be an
>> issue if the community is genuinely interested in this project. If I do
>> get a +1, I can start thinking about the timeline, the algorithms I'd
>> like to implement, etc. I'm really looking forward to extending my
>> so-far minor scikit-learn work as part of GSoC.
>>
>>
>>
>> ------------------------------------------------------------------------------
>> CenturyLink Cloud: The Leader in Enterprise Cloud Services.
>> Learn Why More Businesses Are Choosing CenturyLink Cloud For
>> Critical Workloads, Development Environments & Everything In Between.
>> Get a Quote or Start a Free Trial Today.
>>
>> http://pubads.g.doubleclick.net/gampad/clk?id=119420431&iu=/4140/ostg.clktrk
>> _______________________________________________
>> Scikit-learn-general mailing list
>> Scikit-learn-general@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>
>>
>


-- 
Regards,
Manoj Kumar,
Mech Undergrad
http://manojbits.wordpress.com
