As you described here:
https://groups.google.com/forum/#!topic/actionml-user/yPKzj1Ej7hM

2017-06-05 0:28 GMT+04:00 Marius Rabenarivo <[email protected]>:

> You previously said that the combo of w2v + LDA can be combined with the
> existing UR but would be a separate template add-on to create enriching
> events for the UR.
>
> Can you give some guidance about how it should be implemented?
>
> 2017-06-04 23:14 GMT+04:00 Marius Rabenarivo <[email protected]>:
>
>> Thank you very much for all these clarifications.
>>
>> Yes, I have items with no conversions.
>> I did read in the literature that content-based recs are less sensitive to
>> the cold-start problem,
>> so I headed in that direction.
>>
>> You suggested using Word2Vec in a previous post for items with little
>> content attached to them.
>>
>> I already computed Word2Vec vectors for my items using a simple sum and want
>> to use them to do some smoothing in the sparse user-item matrix.
>>
>> I was thinking that some kind of tensor operation could be used with CF and
>> the Word2Vec vectors attached to items.
>>
>> 2017-06-04 23:05 GMT+04:00 Pat Ferrel <[email protected]>:
>>
>>> TT’ does not solve cold start because you need user history for
>>> personalization. There are several other techniques that I’ve mentioned
>>> many times on the list that help with cold start, but TT’ is for a slightly
>>> different thing. Its use is when you have a user’s history of item
>>> preferences but the items are too old to recommend and you only want to
>>> recommend new ones with no history. News is close to this case, as are
>>> patent applications, legal opinions, or judgments. To be helpful there
>>> needs to be a lot of content for each item and you only want new things
>>> recommended.
>>>
>>> Which cold start do you need to “solve”: new anonymous users with no
>>> history, or items with no conversions? Search the PIO list and AML group for
>>> past posts on this.
>>>
>>> Tag use is implemented as both CF and content similarity (not TT’). If
>>> you ask for item-based recommendation and the item has no conversions, you
>>> will get popular items by default. If you boost items with the same tags as
>>> the item the user is looking at, you get popular items mostly with similar
>>> tags. If you disable the popularity part you get items with similar tags.
>>> This requires that you attach tags to the items with $set and your query
>>> should contain the tags (or any other properties) of the example item.
>>> There are many ways of mixing this. You could also just get recs and mix-in
>>> new inventory by some small random amount. You can use different placements
>>> for these so you aren’t ruining recs with too much randomized cold-items.
>>>
>>> Anyway, the best way to do this depends on your GUI and data.
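The tag setup described above (attach tags to items with $set, then pass the example item's tags in the query as a boost) could be sketched as follows. This is only an illustration: the property name "tags", the helper names, and the bias value are hypothetical, and the JSON field names reflect my reading of the PredictionIO event format and the UR query format, so check them against the docs before relying on them.

```python
import json

def set_item_tags(item_id, tags):
    # Hypothetical helper: a $set event that attaches tag properties to an item.
    return {
        "event": "$set",
        "entityType": "item",
        "entityId": item_id,
        "properties": {"tags": list(tags)},
    }

def item_query_with_tag_boost(item_id, tags, bias=2.0, num=10):
    # Hypothetical helper: an item-based query that boosts items sharing
    # the example item's tags. A bias above 1 is assumed to mean "boost".
    return {
        "item": item_id,
        "num": num,
        "fields": [{"name": "tags", "values": list(tags), "bias": bias}],
    }

event = set_item_tags("item-42", ["scala", "spark"])
query = item_query_with_tag_boost("item-42", ["scala", "spark"])
print(json.dumps(event))
print(json.dumps(query))
```

The query carries the tags itself, which matches the point above that the query should contain the properties of the example item.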
>>>
>>>
>>> On Jun 4, 2017, at 11:35 AM, Marius Rabenarivo <
>>> [email protected]> wrote:
>>>
>>> I didn't mean to tell you what it means, but I just wanted to make it
>>> clear for my part.
>>>
>>> As I understand it, the T part is a personalization that we should make if
>>> we want to use content-based information when making recommendations.
>>>
>>> For my use case, I want to use it to overcome the cold-start problem.
>>>
>>> I was thinking that it was already implemented, as you documented it in
>>> the slides, but I didn't find tag use in the code.
>>>
>>> Is it SimilarityAnalysis.rowSimilarity() in Mahout that implements TT'?
>>> (just to confirm)
>>>
>>> 2017-06-04 22:06 GMT+04:00 Pat Ferrel <[email protected]>:
>>>
>>>> No offense Marius but I wrote the slides and the equation so I do
>>>> indeed know what they are saying. Whether a user writes a tag or you are
>>>> detecting the user preference for a tag you wrote, they are user indicators
>>>> of preference. The LLR filtering of these secondary indicators is what CCO
>>>> is all about and leaves you with a model that can be compared to a user’s
>>>> history and contains only indicators that correlate to some conversion
>>>> behavior.
>>>>
>>>> T in the "whole enchilada" is used to personalize content-based
>>>> recommendations. Each row of T represents an item and its content as
>>>> tokens. Tokens are stemmed, tokenized text terms, or can be entities in the
>>>> item’s text (using some form of NLP) or tags, etc. TT’ then gives you,
>>>> for each item, the items that are most similar in terms of whatever
>>>> content you used in T. Now you take the user’s history of content item
>>>> preferences (which articles they read, for instance) and find the most
>>>> similar items in TT’. These will be personalized content-based
>>>> recommendations.
>>>>
>>>> This is not implemented in the UR but is in the CCO tools in Mahout.
>>>> The reason it is not implemented is that it still requires user history
>>>> and content-based recs are worse predictors than collaborative filtering
>>>> with user history. In CF you treat the terms or tags as indicators of
>>>> preference; you do not find items similar by content.
>>>>
>>>> The personalized content-based recs may serve for edge conditions where
>>>> recommending items with no usage behavior is the most common case,
>>>> like news articles, where you have new items all the time with no usage
>>>> events. In this case extracting something better than “bag-of-words” for
>>>> content is quite important, so highly detailed user tagging or NLP
>>>> techniques can greatly increase the quality of results.
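A toy version of the TT’ idea described above, as a minimal pure-Python sketch rather than what Mahout does internally. It assumes T is binary (an item either has a token or not), so each entry of TT’ is just the number of tokens two items share; the item ids, tokens, and function names are made up for illustration.

```python
def item_similarity(T):
    # T maps item id -> set of tokens. With a binary T, entry (i, j) of TT'
    # is the number of tokens that items i and j have in common.
    items = list(T)
    return {i: {j: len(T[i] & T[j]) for j in items if j != i} for i in items}

def recommend(T, history, top_n=3):
    # Score unseen items by their summed similarity to items in the user's history.
    sim = item_similarity(T)
    scores = {}
    for seen in history:
        for other, shared in sim.get(seen, {}).items():
            if other not in history and shared > 0:
                scores[other] = scores.get(other, 0) + shared
    # Highest score first; ties broken by item id for a stable order.
    return sorted(scores, key=lambda k: (-scores[k], k))[:top_n]

T = {
    "article-1": {"mahout", "spark", "recommender"},
    "article-2": {"spark", "recommender", "llr"},
    "article-3": {"cooking", "pasta"},
    "article-4": {"mahout", "llr"},
}
recs = recommend(T, history={"article-1"})
print(recs)
```

Note that this still needs the user's history, which is why it does not help with brand-new users.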
>>>>
>>>>
>>>>
>>>>
>>>> On Jun 4, 2017, at 4:09 AM, Marius Rabenarivo <
>>>> [email protected]> wrote:
>>>>
>>>> IMHO, T represents an Anonymous tag (or property) labeling task
>>>> and what you propose is Personalized tag (or property) labeling
>>>> as described in https://arxiv.org/pdf/1203.4487.pdf (Section 1.4.5
>>>> Emerging new classification) p. 40
>>>>
>>>> 2017-06-04 8:14 GMT+04:00 Marius Rabenarivo <[email protected]
>>>> >:
>>>>
>>>>> And what is the T in the slides for?
>>>>>
>>>>> How can we implement it if it is not implemented yet?
>>>>>
>>>>> 2017-06-04 8:11 GMT+04:00 Pat Ferrel <[email protected]>:
>>>>>
>>>>>> By purchasing an item with a tag that you have given it, they are
>>>>>> displaying a preference for that tag.
>>>>>>
>>>>>>
>>>>>> On Jun 3, 2017, at 12:36 PM, Marius Rabenarivo <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>> So the tag here is assumed to be a tag given by the user to an item?
>>>>>>
>>>>>> I was thinking that it was some kind of tag we give to the item by
>>>>>> some means (classification, LDA, etc.)
>>>>>>
>>>>>> 2017-06-03 21:14 GMT+04:00 Pat Ferrel <[email protected]>:
>>>>>>
>>>>>>> A = history of all purchases (in the e-com case)
>>>>>>> B = history of all tag preferences
>>>>>>>
>>>>>>> r = [A’A]h_a + [A’B]h_b
>>>>>>>
>>>>>>> The part in the slides about content-based recs is not needed here
>>>>>>> because you have captured them as user preferences.
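To make the matrix shapes in the formula above concrete, here is a small pure-Python sketch. It assumes A is a |U| x |I| purchase matrix and B is a |U| x |T| tag-preference matrix, both 0/1, and it uses raw co-occurrence counts. The actual CCO model keeps only LLR-filtered entries, which this toy leaves out; the data and helper names are invented.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(x, y):
    yt = transpose(y)
    return [[sum(a * b for a, b in zip(row, col)) for col in yt] for row in x]

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

# A: 3 users x 4 items (purchases)
A = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [1, 0, 0, 1],
]
# B: 3 users x 2 tags (tag preferences)
B = [
    [1, 0],
    [1, 1],
    [0, 1],
]

At = transpose(A)
AtA = matmul(At, A)  # |I| x |I|
AtB = matmul(At, B)  # |I| x |T|

h_a = [1, 0, 0, 0]   # this user's purchase history, length |I|
h_b = [0, 1]         # this user's tag-preference history, length |T|

# r = [A'A]h_a + [A'B]h_b: both terms are length |I|, so they can be added.
r = [x + y for x, y in zip(matvec(AtA, h_a), matvec(AtB, h_b))]
print(r)
```

Because the primary matrix A appears on the left of both products, every term ends up as one score per item, whatever the width of the secondary matrix.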
>>>>>>>
>>>>>>>
>>>>>>> On Jun 2, 2017, at 7:22 PM, Marius Rabenarivo <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>> Please correct side to size in my previous e-mail
>>>>>>>
>>>>>>> 2017-06-03 6:14 GMT+04:00 Marius Rabenarivo <mariusrabenarivo@gmail.com>:
>>>>>>>
>>>>>>>> What will be the size of the matrix if we send an event like
>>>>>>>> tag-pref?
>>>>>>>> We will get a |U|x|T| matrix I think (where T is the set of all
>>>>>>>> tags).
>>>>>>>>
>>>>>>>> So [AtA] will be a |T| x |T| matrix and we will do a dot product
>>>>>>>> with the user history hT to get recommendations, right?
>>>>>>>>
>>>>>>>> I was assuming that A should be of side |U| x |I| where I is the
>>>>>>>> set of all items as it should be added to other terms of the whole
>>>>>>>> enchilada formula afterwards.
>>>>>>>>
>>>>>>>> Thank you for your guidance Pat.
>>>>>>>>
>>>>>>>> 2017-06-02 21:35 GMT+04:00 Pat Ferrel <[email protected]>:
>>>>>>>>
>>>>>>>>> Please refer to the documents. The “event” is the name of the type
>>>>>>>>> of event or indicator of preference; it implies the type of
>>>>>>>>> the targetEntityId. So a “tag-pref” event would be accompanied by
>>>>>>>>> a targetEntityId = tag-id. This is separate from attaching “tag”
>>>>>>>>> properties to items with the $set event for use with filter and boost
>>>>>>>>> rules. One looks at the data as a possible preference indicator and
>>>>>>>>> the other is used to restrict results. This is why we usually name
>>>>>>>>> events so they sound like a user preference of some type, whereas item
>>>>>>>>> property values are simply item attributes, intrinsic to the items and
>>>>>>>>> independent of an individual user.
>>>>>>>>>
>>>>>>>>> The event can have any name that makes sense to you.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Jun 2, 2017, at 9:19 AM, Marius Rabenarivo <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>> so, the event field should be the token and targetEntityId the
>>>>>>>>> item ID, right?
>>>>>>>>>
>>>>>>>>> 2017-06-02 20:07 GMT+04:00 Pat Ferrel <[email protected]>:
>>>>>>>>>
>>>>>>>>>> Yes, each is analyzed separately as a separate event. If you are
>>>>>>>>>> using REST you can send up to 50 events in a single array. Some SDKs
>>>>>>>>>> may support this too.
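One way to prepare the per-tag events and respect the 50-events-per-array limit mentioned above is sketched here. The event name, the "tag" target entity type, and the helper names are hypothetical, and the HTTP call itself is left out; only the grouping into arrays is shown.

```python
import json

def tag_events(user_id, tags, event_name="category-preference"):
    # One preference event per tag, following the entityId / event /
    # targetEntityId layout discussed in this thread.
    return [
        {
            "event": event_name,
            "entityType": "user",
            "entityId": user_id,
            "targetEntityType": "tag",  # assumed name for the target type
            "targetEntityId": tag,
        }
        for tag in tags
    ]

def batches(events, size=50):
    # Split the event list into arrays of at most `size` events each.
    return [events[i:i + size] for i in range(0, len(events), size)]

events = tag_events("user-1", [f"tag-{n}" for n in range(120)])
payloads = [json.dumps(batch) for batch in batches(events)]
print([len(batch) for batch in batches(events)])
```

Each string in `payloads` would then be the body of one batch request.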
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Jun 2, 2017, at 8:56 AM, Marius Rabenarivo <
>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>> So I have to send an event like category-preference for each tag
>>>>>>>>>> associated with an item, right?
>>>>>>>>>>
>>>>>>>>>> entityId: user-id
>>>>>>>>>> event: category-preference
>>>>>>>>>> targetEntityId: tag/token
>>>>>>>>>>
>>>>>>>>>> 2017-06-02 19:47 GMT+04:00 Pat Ferrel <[email protected]>:
>>>>>>>>>>
>>>>>>>>>>> When a user expresses a preference for a tag, word or term as in
>>>>>>>>>>> search or even in content like descriptions, these can be considered
>>>>>>>>>>> secondary events. The most useful are tags and search terms in our
>>>>>>>>>>> experience. Content can be used, but each term/token needs to be sent
>>>>>>>>>>> as a separate preference. Search phrases can be used too, though again
>>>>>>>>>>> turning them into tokens may be better.
>>>>>>>>>>>
>>>>>>>>>>> Please look through the docs here: http://actionml.com/docs/ur or
>>>>>>>>>>> the slide deck here:
>>>>>>>>>>> https://www.slideshare.net/pferrel/unified-recommender-39986309
>>>>>>>>>>>
>>>>>>>>>>> The major innovation of CCO, the algorithm behind the UR, is the
>>>>>>>>>>> use of these cross-domain indicators. They are not guaranteed to
>>>>>>>>>>> predict conversions, but the CCO algo tests them and weights them low
>>>>>>>>>>> if they do not. So we tend to test for strength of prediction of the
>>>>>>>>>>> entire category of indicator and drop it if weak, or set a minLLR
>>>>>>>>>>> threshold and filter weak individual indicators out.
>>>>>>>>>>>
>>>>>>>>>>> Technically these are not called latent; that has another
>>>>>>>>>>> meaning in Machine Learning having to do with Latent Factor
>>>>>>>>>>> Analysis.
>>>>>>>>>>>
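For readers unfamiliar with the LLR test and minLLR-style threshold mentioned above, here is a minimal sketch of the log-likelihood ratio on a 2x2 contingency table, followed by a simple threshold filter. It is a plain-Python illustration of the statistic, not Mahout's code; the counts, indicator names, and `min_llr` parameter are invented for the example.

```python
import math

def x_log_x(x):
    return 0.0 if x == 0 else x * math.log(x)

def entropy(*counts):
    # Unnormalized entropy of a list of counts.
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)

def llr(k11, k12, k21, k22):
    # k11: users with both events, k12: first only,
    # k21: second only, k22: neither.
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    # Clamp tiny negative values caused by floating-point rounding.
    return max(0.0, 2.0 * (row + col - mat))

def filter_indicators(scores, min_llr):
    # Keep only indicators whose score reaches the threshold.
    return {name: s for name, s in scores.items() if s >= min_llr}

scores = {
    "tag-a": llr(10, 0, 0, 10),    # always co-occurs with the conversion
    "tag-b": llr(10, 10, 10, 10),  # independent of the conversion
}
kept = filter_indicators(scores, min_llr=5.0)
print(kept)
```

An indicator that is statistically independent of the conversion scores close to zero and is dropped, which is the behavior described for weak individual indicators.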
>>>>>>>>>>>
>>>>>>>>>>> On Jun 1, 2017, at 11:26 PM, Marius Rabenarivo <
>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hello everyone!
>>>>>>>>>>>
>>>>>>>>>>> Do you have an idea of how to use latent information associated
>>>>>>>>>>> with items, like tags or word vector embeddings, in Mahout's
>>>>>>>>>>> SimilarityAnalysis.cooccurrences?
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>>
>>>>>>>>>>> Marius
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> You received this message because you are subscribed to the
>>>>>>>>>>> Google Groups "actionml-user" group.
>>>>>>>>>>> To unsubscribe from this group and stop receiving emails from
>>>>>>>>>>> it, send an email to [email protected].
>>>>>>>>>>> To post to this group, send email to actionml-user@googlegroups.com.
>>>>>>>>>>> To view this discussion on the web visit
>>>>>>>>>>> https://groups.google.com/d/msgid/actionml-user/CAC-ATVEO_YON-5E95iPJjBR-FUgEv8TQsOA0rtD-xg0u-tNA_g%40mail.gmail.com.
>>>>>>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>
