You previously said that the combination of w2v + LDA can be used with the
existing UR, but that it would be a separate template add-on that creates
enriching events for the UR.

Can you give some guidance about how it should be implemented?

2017-06-04 23:14 GMT+04:00 Marius Rabenarivo <[email protected]>:

> Thank you very much for all these clarifications.
>
> Yes, I have items with no conversions.
> I read in the literature that content-based recs are less sensitive to
> the cold-start problem, so I turned to them.
>
> You suggested using Word2Vec in a previous post for items with little
> content attached to them.
>
> I have already computed Word2Vec vectors for my items using a simple sum
> and want to use them to do some smoothing in the sparse user-item matrix.
>
> I was thinking that some kind of tensor operation could be used with CF
> and the Word2Vec vectors attached to items.
>
> 2017-06-04 23:05 GMT+04:00 Pat Ferrel <[email protected]>:
>
>> TT’ does not solve cold start because you need user history for
>> personalization. There are several other techniques that I’ve mentioned
>> many times on the list that help with cold start, but TT’ is for a slightly
>> different thing. Its use is when you have a user’s history of item
>> preferences but the items are too old to recommend and you only want to
>> recommend new ones with no history. If you think about news, it is close to
>> being like this. Or patent applications, legal opinions or judgments too. To
>> be helpful there needs to be a lot of content for each item and you only
>> want new things recommended.
>>
>> Which cold start do you need to “solve”: new anonymous users with no
>> history, or items with no conversions? Search the PIO list and AML group for
>> past posts on this.
>>
>> Tag use is implemented as both CF and content similarity (not TT’). If
>> you ask for item-based recommendations and the item has no conversions, you
>> will get popular items by default. If you boost items with the same tags as
>> the item the user is looking at, you get popular items, mostly with similar
>> tags. If you disable the popularity part you get items with similar tags.
>> This requires that you attach tags to the items with $set, and your query
>> should contain the tags (or any other properties) of the example item.
>> There are many ways of mixing this. You could also just get recs and mix in
>> new inventory by some small random amount. You can use different placements
>> for these so you aren’t ruining recs with too many randomized cold items.
>>
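
To make the tag-boost idea above concrete, here is a minimal sketch in Python
that only builds the JSON payloads and sends nothing. The item ID, the tag
values, the property name "tags" and the bias value are made up for
illustration. The $set event shape follows the PredictionIO event format and
the fields/bias query layout follows the UR docs as I understand them, so
please check both against the version you run.

```python
import json


def item_set_event(item_id, tags):
    """Build a $set event that attaches tags to an item as a property."""
    return {
        "event": "$set",
        "entityType": "item",
        "entityId": item_id,
        # "tags" is an assumed property name, use whatever your data calls it
        "properties": {"tags": list(tags)},
    }


def item_query_with_tag_boost(item_id, tags, bias=2.0):
    """Build an item-based query carrying the example item's tags.

    The bias value is an assumption; a value above 1 is intended as a boost.
    """
    return {
        "item": item_id,
        "fields": [{"name": "tags", "values": list(tags), "bias": bias}],
    }


# Hypothetical item and tags, for illustration only
event = item_set_event("item-42", ["scifi", "space"])
query = item_query_with_tag_boost("item-42", ["scifi", "space"])
print(json.dumps(event, indent=2))
print(json.dumps(query, indent=2))
```

The query repeats the tags of the example item because, as said above, the
query itself has to contain the properties you want to boost on.
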
>> Anyway, the best way to do this depends on your GUI and data.
>>
>>
>> On Jun 4, 2017, at 11:35 AM, Marius Rabenarivo <
>> [email protected]> wrote:
>>
>> I didn't mean to tell you what it means, but I just wanted to make it
>> clear for my part.
>>
>> As I understand it, the T part is a personalization step that we should
>> apply if we want to use content-based information when making
>> recommendations.
>>
>> For my use case, I want to use it to overcome the cold-start problem.
>>
>> I was thinking that it was already implemented as you documented it in
>> the slides
>> but I didn't find tag use in the code.
>>
>> Is it SimilarityAnalysis.rowSimilarity() in Mahout that implements TT'?
>> (just to confirm)
>>
>> 2017-06-04 22:06 GMT+04:00 Pat Ferrel <[email protected]>:
>>
>>> No offense Marius but I wrote the slides and the equation so I do indeed
>>> know what they are saying. Whether a user writes a tag or you are detecting
>>> the user preference for a tag you wrote, they are user indicators of
>>> preference. The LLR filtering of these secondary indicators is what CCO is
>>> all about and leaves you with a model that can be compared to a user’s
>>> history and contains only indicators that correlate to some conversion
>>> behavior.
>>>
>>> T in the "whole enchilada" is used to personalize content-based
>>> recommendations. Each row of T represents an item and its content as
>>> tokens. Tokens are stemmed, tokenized text terms, or can be entities in the
>>> item’s text (using some form of NLP) or tags, etc. TT’ then gives you,
>>> for each item, the items that are most similar in terms of whatever content
>>> you were using in T. Now you take the user’s history of content item
>>> preference, which articles they read for instance, and find the most similar
>>> items in TT’. These will be personalized content-based recommendations.
>>>
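
A small sketch of the TT’ idea described above, in Python with NumPy. It is
not the Mahout implementation: similarity here is a raw count of shared
tokens, with no LLR or other weighting, and the toy matrix, the function name
content_recs and the top_n parameter are all assumptions for illustration.

```python
import numpy as np


def content_recs(T, history, top_n=2):
    """Rank unseen items by content similarity to the items in a user's history.

    T       : items x tokens matrix (1 if the item contains the token)
    history : vector over items (1 if the user preferred the item)
    """
    S = T @ T.T                          # item-item similarity: shared token counts
    scores = (S @ history).astype(float)
    scores[history > 0] = -np.inf        # never recommend what was already consumed
    ranked = np.argsort(-scores)         # highest score first
    return [int(i) for i in ranked if np.isfinite(scores[i])][:top_n]


# Toy data: 3 items described by 4 tokens
T = np.array([
    [1, 1, 0, 0],   # item 0
    [1, 0, 1, 0],   # item 1 shares one token with item 0
    [0, 0, 1, 1],   # item 2 shares no token with item 0
])
history = np.array([1, 0, 0])            # the user read item 0 only
print(content_recs(T, history))
```

Note that the history vector is still required, which is why this does not
help a brand new user with no history.
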
>>> This is not implemented in the UR but is in the CCO tools in Mahout. The
>>> reason it is not implemented is that it still requires user history, and
>>> content-based recs are worse predictors than collaborative filtering with
>>> user history. In CF you treat the terms or tags as indicators of preference;
>>> you do not find items similar by content.
>>>
>>> The personalized content-based recs may serve for edge conditions where
>>> recommending items with no usage behavior is the most common case,
>>> like news articles, where you have new items all the time with no usage
>>> events. In this case extracting something better than “bag-of-words” for
>>> content is quite important. So highly detailed user tagging or NLP
>>> techniques can greatly increase the quality of results.
>>>
>>>
>>>
>>>
>>> On Jun 4, 2017, at 4:09 AM, Marius Rabenarivo <
>>> [email protected]> wrote:
>>>
>>> IMHO, T represents tags in an anonymous tag (or property) labeling task,
>>> and what you propose is personalized tag (or property) labeling,
>>> as described in https://arxiv.org/pdf/1203.4487.pdf (Section 1.4.5
>>> Emerging new classification) p. 40
>>>
>>> 2017-06-04 8:14 GMT+04:00 Marius Rabenarivo <[email protected]>
>>> :
>>>
>>>> And what is the T in the slides for?
>>>>
>>>> How can we implement it if it is not implemented yet?
>>>>
>>>> 2017-06-04 8:11 GMT+04:00 Pat Ferrel <[email protected]>:
>>>>
>>>>> By purchasing an item with a tag that you have given it, they are
>>>>> displaying a preference for that tag.
>>>>>
>>>>>
>>>>> On Jun 3, 2017, at 12:36 PM, Marius Rabenarivo <
>>>>> [email protected]> wrote:
>>>>>
>>>>> So the tag here is assumed to be a tag given by the user to an item?
>>>>>
>>>>> I was thinking that it was some kind of tag we give to the item by
>>>>> some means (classification, LDA, etc.)
>>>>>
>>>>> 2017-06-03 21:14 GMT+04:00 Pat Ferrel <[email protected]>:
>>>>>
>>>>>> A = history of all purchases (in the e-com case)
>>>>>> B = history of all tag preferences
>>>>>>
>>>>>> r = [A’A]h_a + [A’B]h_b
>>>>>>
>>>>>> The part in the slides about content-based recs is not needed here
>>>>>> because you have captured them as user preferences.
>>>>>>
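
To make the shapes in the formula above concrete, here is a toy sketch in
Python with NumPy. It uses raw cooccurrence counts only; the real CCO
algorithm also applies the LLR filtering mentioned elsewhere in this thread,
which is left out here. The matrices, the function name score and the
variable names are assumptions for illustration.

```python
import numpy as np


def score(A, B, h_a, h_b):
    """Compute r = [A'A] h_a + [A'B] h_b with raw cooccurrence counts.

    A   : users x items  (primary event, e.g. purchases)
    B   : users x tags   (secondary event, e.g. tag preferences)
    h_a : the current user's history over items
    h_b : the current user's history over tags
    """
    AtA = A.T @ A            # items x items
    AtB = A.T @ B            # items x tags
    return AtA @ h_a + AtB @ h_b   # one score per item


# Toy data: 4 users, 3 items, 2 tags
A = np.array([[1, 1, 0],
              [0, 1, 1],
              [1, 0, 1],
              [1, 1, 0]])
B = np.array([[1, 0],
              [0, 1],
              [1, 1],
              [1, 0]])
h_a = np.array([1, 0, 0])    # the user bought item 0
h_b = np.array([1, 0])       # the user showed a preference for tag 0

r = score(A, B, h_a, h_b)
print(r.shape, r)
```

Because both terms are multiplied by A' on the left, each one yields a vector
with one entry per item, so they can be added even though B has |T| columns
rather than |I|.
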
>>>>>>
>>>>>> On Jun 2, 2017, at 7:22 PM, Marius Rabenarivo <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>> Please correct side to size in my previous e-mail
>>>>>>
>>>>>> 2017-06-03 6:14 GMT+04:00 Marius Rabenarivo <mariusrabenarivo@gmail.com>:
>>>>>>
>>>>>>> What will be the size of the matrix if we send an event like tag-pref?
>>>>>>>
>>>>>>> We will get a |U|x|T| matrix I think (where T is the set of all
>>>>>>> tags).
>>>>>>>
>>>>>>> So [AtA] will be a |T| x |T| matrix and we will do a dot product
>>>>>>> with the user history hT to get recommendations, right?
>>>>>>>
>>>>>>> I was assuming that A should be of side |U| x |I| where I is the set
>>>>>>> of all items as it should be added to other terms of the whole enchilada
>>>>>>> formula afterwards.
>>>>>>>
>>>>>>> Thank you for your guidance Pat.
>>>>>>>
>>>>>>> 2017-06-02 21:35 GMT+04:00 Pat Ferrel <[email protected]>:
>>>>>>>
>>>>>>>> Please refer to the documents. The “event” is the name of the type
>>>>>>>> of event or indicator of preference; it implies the type of
>>>>>>>> the targetEntityId. So a “tag-pref” event would be accompanied by
>>>>>>>> a targetEntityId = tag-id. This is separate from attaching “tag”
>>>>>>>> properties to items with the $set event for use with filter and boost
>>>>>>>> rules. One looks at the data as a possible preference indicator and
>>>>>>>> the other is used to restrict results. This is why we usually name
>>>>>>>> events so they sound like a user preference of some type, whereas item
>>>>>>>> property values are simply item attributes, intrinsic to the items and
>>>>>>>> independent of an individual user.
>>>>>>>>
>>>>>>>> The event can have any name that makes sense to you.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Jun 2, 2017, at 9:19 AM, Marius Rabenarivo <
>>>>>>>> [email protected]> wrote:
>>>>>>>>
>>>>>>>> so, the event field should be the token and targetEntityId the item
>>>>>>>> ID, right?
>>>>>>>>
>>>>>>>> 2017-06-02 20:07 GMT+04:00 Pat Ferrel <[email protected]>:
>>>>>>>>
>>>>>>>>> Yes, each is analyzed separately as a separate event. If you are
>>>>>>>>> using REST you can send up to 50 events in a single array. Some SDKs
>>>>>>>>> may support this too.
>>>>>>>>>
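
Given the 50-event limit per array mentioned above, a client would need to
split a longer list before sending. A minimal sketch in Python, with nothing
sent over the network; the function name chunk_events and the placeholder
events are assumptions for illustration.

```python
def chunk_events(events, max_batch=50):
    """Split a list of events into consecutive batches of at most max_batch."""
    if max_batch < 1:
        raise ValueError("max_batch must be at least 1")
    return [events[i:i + max_batch] for i in range(0, len(events), max_batch)]


# Hypothetical placeholder events, only the count matters here
events = [{"event": "tag-pref", "entityId": "u1", "targetEntityId": f"tag-{n}"}
          for n in range(120)]
batches = chunk_events(events)
print([len(b) for b in batches])
```

Each batch would then be sent as one JSON array in a single REST call.
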
>>>>>>>>>
>>>>>>>>> On Jun 2, 2017, at 8:56 AM, Marius Rabenarivo <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>> So I have to send an event like category-preference for each tag
>>>>>>>>> associated with an item, right?
>>>>>>>>>
>>>>>>>>> entityId: user-id
>>>>>>>>> event: category-preference
>>>>>>>>> targetEntityId: tag/token
>>>>>>>>>
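
The three fields listed above could be turned into one event per tag roughly
like this. It is a sketch in Python that only builds the dictionaries. The
entityType and targetEntityType values, the user ID and the tags are
assumptions for illustration, so check the required values in the docs.

```python
def tag_preference_events(user_id, tags, event_name="category-preference"):
    """Build one preference event per tag for a single user."""
    return [
        {
            "event": event_name,
            "entityType": "user",          # assumed value
            "entityId": user_id,
            "targetEntityType": "item",    # assumed value, verify in the docs
            "targetEntityId": tag,         # the tag/token the user preferred
        }
        for tag in tags
    ]


# Hypothetical user and tags
tag_events = tag_preference_events("user-1", ["scifi", "space"])
for e in tag_events:
    print(e)
```
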
>>>>>>>>> 2017-06-02 19:47 GMT+04:00 Pat Ferrel <[email protected]>:
>>>>>>>>>
>>>>>>>>>> When a user expresses a preference for a tag, word or term, as in
>>>>>>>>>> search or even in content like descriptions, these can be considered
>>>>>>>>>> secondary events. The most useful are tags and search terms in our
>>>>>>>>>> experience. Content can be used, but each term/token needs to be sent
>>>>>>>>>> as a separate preference. Search phrases can be used too, though again
>>>>>>>>>> turning them into tokens may be better.
>>>>>>>>>>
>>>>>>>>>> Please look through the docs here: http://actionml.com/docs/ur or
>>>>>>>>>> the slide deck here:
>>>>>>>>>> https://www.slideshare.net/pferrel/unified-recommender-39986309
>>>>>>>>>>
>>>>>>>>>> The major innovation of CCO, the algorithm behind the UR, is the
>>>>>>>>>> use of these cross-domain indicators. They are not guaranteed to
>>>>>>>>>> predict conversions, but the CCO algo tests them and weights them low
>>>>>>>>>> if they do not. So we tend to test for strength of prediction of the
>>>>>>>>>> entire category of indicator and drop it if weak, or set a minLLR
>>>>>>>>>> threshold and filter weak individual indicators out.
>>>>>>>>>>
>>>>>>>>>> Technically these are not called latent, that has another meaning
>>>>>>>>>> in Machine Learning having to do with Latent Factor Analysis.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Jun 1, 2017, at 11:26 PM, Marius Rabenarivo <
>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>> Hello everyone!
>>>>>>>>>>
>>>>>>>>>> Do you have an idea of how to use latent information associated
>>>>>>>>>> with items, like tags or word vector embeddings, in Mahout's
>>>>>>>>>> SimilarityAnalysis.cooccurrences?
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>>
>>>>>>>>>> Marius
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> You received this message because you are subscribed to the
>>>>>>>>>> Google Groups "actionml-user" group.
>>>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>>>> send an email to [email protected].
>>>>>>>>>> To post to this group, send email to actionml-user@googlegroups.com.
>>>>>>>>>> To view this discussion on the web visit
>>>>>>>>>> https://groups.google.com/d/msgid/actionml-user/CAC-ATVEO_YON-5E95iPJjBR-FUgEv8TQsOA0rtD-xg0u-tNA_g%40mail.gmail.com
>>>>>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>
>
