Hi Sean, I am reading the paper on implicit training, "Collaborative Filtering for Implicit Feedback Datasets" <http://labs.yahoo.com/files/HuKorenVolinsky-ICDM08.pdf>.
It says: "To this end, let us introduce a set of binary variables p_ui, which indicates the preference of user u to item i. The p_ui values are derived by binarizing the r_ui values: p_ui = 1 if r_ui > 0 and p_ui = 0 if r_ui = 0."

If I leave user-item pairs without interactions out of the training data, then every r_ui will be > 0 and every p_ui will be 1. Is that right? Or does MLlib's implementation automatically take care of the user-product pairs that have no interaction?

On Thu, Feb 12, 2015 at 3:13 PM, Sean Owen <so...@cloudera.com> wrote:

> Where there is no user-item interaction, you provide no interaction,
> not an interaction with strength 0. Otherwise your input is fully
> dense.
>
> On Thu, Feb 12, 2015 at 11:09 PM, Crystal Xing <crystalxin...@gmail.com>
> wrote:
> > Hi,
> >
> > I have some implicit rating data, such as purchasing data. I read the
> > paper about the implicit training algorithm used in Spark, and it
> > mentions that for user-product pairs which do not have implicit rating
> > data, such as no purchase, we need to provide the value as 0.
> >
> > This is different from explicit training, where for a user-product pair
> > without a rating we simply leave it out of the training data instead of
> > adding a user-product pair with rating 0.
> >
> > Am I understanding this correctly?
> >
> > Or, for the implicit training implementation in Spark, will the missing
> > data be automatically filled in as zero, so that we do not need to add
> > it to the training data set?
> >
> > Thanks,
> >
> > Crystal.
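
For reference, the binarization quoted from the paper can be written out as a small Python sketch. It reflects only my reading of the paper, not MLlib's internal code: the user and item IDs, the counts, the value of alpha, and the helper name preference_and_confidence are all made up for illustration. The confidence formula c_ui = 1 + alpha * r_ui comes from the same paper. The point is that a pair absent from the input is simply read as r_ui = 0, so it gets p_ui = 0 without ever being listed.

```python
# Observed implicit feedback only (e.g. purchase counts): (user, item) -> r_ui.
# These IDs and counts are made-up illustration data.
observed = {
    ("u1", "i1"): 3.0,
    ("u1", "i2"): 1.0,
    ("u2", "i2"): 2.0,
}
users = ["u1", "u2"]
items = ["i1", "i2", "i3"]

# Confidence scaling factor from the paper; this value is an assumption.
alpha = 40.0


def preference_and_confidence(user, item):
    """Return (p_ui, c_ui) for one pair (hypothetical helper name).

    A pair missing from `observed` is treated as r_ui = 0, which gives
    p_ui = 0 with the minimal confidence c_ui = 1.
    """
    r = observed.get((user, item), 0.0)
    p = 1 if r > 0 else 0   # p_ui = 1 if r_ui > 0, else 0
    c = 1.0 + alpha * r     # c_ui = 1 + alpha * r_ui
    return p, c


# Dense view over all user-item pairs, built from the sparse input alone.
table = {(u, i): preference_and_confidence(u, i) for u in users for i in items}

for pair in sorted(table):
    print(pair, table[pair])
```

Only the three observed pairs are supplied as input; the other three pairs still appear in the table, with p_ui = 0, because they are derived from their absence rather than from explicit zero entries.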