So in your opinion, should the NLP task be done in the engine using a library like MALLET, or should it be implemented in an algorithm-focused library such as Mahout?

2017-05-10 23:52 GMT+04:00 Pat Ferrel <[email protected]>:

> That is how to make personalized content-based recommendations. You’d have
> to input content by attaching it to items and recording it separately as a
> usage event per content bit. The input, for instance, would be every term
> in the description of an item the user purchased. The input would be huge
> and the current UR + PIO is not optimized for that kind of input. It is not
> a recommended mode to use the UR and is of dubious value without NLP
> techniques such as word2vec or NER instead of bag-of-words type content. It
> might be ok if you have rich metadata like categories or tags.
>
> In general content based recommendations are often little better than some
> filtering of popular or rotating promoted items (with no purchase history);
> both can be done fairly easily with the UR.
>
> Content based with NLP techniques for short lived items like news can work
> well but require extra phases in front of the recommender to do the NLP.
>
> On May 10, 2017, at 12:33 PM, Marius Rabenarivo <
> [email protected]> wrote:
>
> Hello,
>
> So what do the matrix T and vector h_t in this slide correspond to?
> https://docs.google.com/presentation/d/1MzIGFsATNeAYnLfoR6797ofcLeFRKSX7KB8GAYNtNPY/edit#slide=id.gf4d43b9e8_1_24
>
> 2017-05-10 21:10 GMT+04:00 Pat Ferrel <[email protected]>:
>
>> Content based recommendations are based on, well, content. You can really
>> only make recs if you have an example item as with the recommendations you
>> see at the bottom of a product page on Amazon.
>>
>> For this make sure to have lots of properties of items, even keywords from
>> descriptions will work, but also categories, tags, brands, price ranges,
>> etc. These all must be encoded as JSON arrays of strings so prices might be
>> one of ["$0-$1", "$1-$5", …]; other things like descriptions, categories or
>> tags can have several strings attached.
>>
>> Then issue an item-based query with itemBias set higher (>1) to make use
>> of usage information first before content since it performs better. Then
>> add query fields for the various properties but include the values of the
>> item referenced in the "item" field.
>>
>> You will get similar items based on usage data unless there is none, then
>> content will take over to recommend things with similar content. Play with
>> the itemBias, try >1 by varying amounts since you want usage based
>> similarity over content most of the time you have usage based data in the
>> model. There is no hard rule for the bias.
>>
>>
>> On May 10, 2017, at 6:36 AM, Dennis Honders <[email protected]>
>> wrote:
>>
>> According to the docs, the UR is considered as hybrid collaborative
>> filtering / content-based filtering.
>> In my case I have a purchase history. Quite a lot of products are never
>> bought so traditional techniques won't be able to make recommendations. For
>> those products (never bought/sold), will recommendations be made with
>> content-based filtering techniques?
>> If so, what techniques are used in UR?
>>
>> 2017-05-08 19:02 GMT+02:00 Pat Ferrel <[email protected]>:
>>
>>> yes to all for UR v0.5.0
>>>
>>> UR v0.6.0 is sitting in the `develop` branch waiting for one more minor
>>> fix to be released. It uses the latest release of Mahout 0.13.0 so no need
>>> to build it for the project. Several new features too. I expect it to be
>>> out this week.
>>>
>>>
>>> On May 8, 2017, at 3:07 AM, Dennis Honders <[email protected]>
>>> wrote:
>>>
>>> Hi,
>>>
>>> Are the following docs up-to-date?
>>>
>>> PredictionIO: http://actionml.com/docs/pio_quickstart.
>>> Is version 0.11.0 suitable for UR?
>>>
>>> The UR: http://actionml.com/docs/ur.
>>> Is 0.5.0 the latest version?
>>> Is Mahout still necessary?
>>>
>>> Thanks,
>>>
>>> Dennis
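
For reference, the item-based query described in the quoted thread (itemBias above 1, plus query fields that repeat the referenced item's own property values) might look roughly like the sketch below. It only builds the JSON body and does not call PredictionIO. The key names `item`, `itemBias`, `fields`, `name`, `values` and `bias` are taken from the UR query format as I understand it and should be checked against the UR version in use. The property names (`categories`, `tags`, `priceRange`), the item ID, the price buckets beyond "$0-$1" and "$1-$5", and both bias values are hypothetical and chosen only for illustration.

```python
import json

# Hypothetical price buckets; the thread only shows "$0-$1" and "$1-$5".
PRICE_BUCKETS = [(0, 1), (1, 5), (5, 20)]


def price_bucket(price):
    """Map a numeric price to a string label such as "$1-$5"."""
    for low, high in PRICE_BUCKETS:
        if low <= price < high:
            return f"${low}-${high}"
    # Anything at or above the last upper bound goes in an open-ended bucket.
    return f"${PRICE_BUCKETS[-1][1]}+"


def item_properties(categories, tags, price):
    """Encode every property as a list of strings, as the thread requires."""
    return {
        "categories": list(categories),
        "tags": list(tags),
        "priceRange": [price_bucket(price)],
    }


def item_based_query(item_id, properties, item_bias=2.0, field_bias=1.5):
    """Build an item-based query body.

    item_bias > 1 favours usage-based similarity; each field repeats the
    referenced item's own property values so that content can take over
    when no usage data exists. Both bias defaults are arbitrary.
    """
    return {
        "item": item_id,
        "itemBias": item_bias,
        "fields": [
            {"name": name, "values": values, "bias": field_bias}
            for name, values in properties.items()
        ],
    }


# Hypothetical item used only to show the shape of the query.
props = item_properties(["electronics"], ["tablet", "wifi"], 3.5)
query = item_based_query("item-123", props)
print(json.dumps(query, indent=2))
```

Since there is no hard rule for the bias, `item_bias` and `field_bias` are left as parameters so they can be varied during experiments.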
