Let me check whether I've understood how LLR is used in UR. Let P be the matrix
for the primary conversion indicator (say, purchases) and Pt its transpose.

Next we take a second matrix, which can be P again to make PtP, or a matrix
for a secondary indicator (say, L for likes) to make PtL. We take a row from
Pt (item A) and a column from the second matrix (item B, from either P or L
in this example) and compute the table that Ted Dunning explains on his
webpage: the number of cooccurrences, i.e. the number of times that items A
*AND* B have both been purchased (or purchased AND liked), the number of
times that item A *OR* B has been purchased (or purchased OR liked), and the
number of times that *neither* item A nor B has been purchased (or purchased
or liked). With these counts we calculate the LLR following the formulas that
Ted Dunning provides, and the resulting LLR is what goes into the AB element
of matrix PtP or PtL. Correct?
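
As a concrete sketch of the computation described above, here is a small
Python example. It assumes the usual 2x2 contingency table from Dunning's
post, which has four cells: A and B together (k11), A without B (k12), B
without A (k21), and neither (k22). Note that this splits the "A OR B" count
into two separate cells. The function names and the toy data are hypothetical
and only for illustration; this is not the UR or Mahout source code, just the
entropy-based form of the LLR as I understand it.

```python
import math


def x_log_x(x):
    # Convention: 0 * log(0) is treated as 0.
    return x * math.log(x) if x > 0 else 0.0


def entropy(*counts):
    # Unnormalized Shannon entropy of a set of counts.
    return x_log_x(sum(counts)) - sum(x_log_x(k) for k in counts)


def llr(k11, k12, k21, k22):
    # Log-likelihood ratio for a 2x2 contingency table (hypothetical helper).
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    # Clamp at zero to guard against small negative rounding errors.
    return max(0.0, 2.0 * (row + col - mat))


def contingency_counts(col_a, col_b):
    # col_a: per-user 0/1 flags for item A in the primary matrix (P).
    # col_b: per-user 0/1 flags for item B in the second matrix (P or L).
    pairs = list(zip(col_a, col_b))
    k11 = sum(1 for a, b in pairs if a and b)          # A and B
    k12 = sum(1 for a, b in pairs if a and not b)      # A without B
    k21 = sum(1 for a, b in pairs if b and not a)      # B without A
    k22 = sum(1 for a, b in pairs if not a and not b)  # neither
    return k11, k12, k21, k22


# Toy data (made up): 8 users, one flag per user.
purchased_a = [1, 1, 1, 0, 0, 0, 0, 0]
liked_b = [1, 1, 0, 1, 0, 0, 0, 0]

counts = contingency_counts(purchased_a, liked_b)
score = llr(*counts)  # candidate value for the AB element of PtL
```

A table whose rows and columns are independent, such as (1, 1, 1, 1), gives
an LLR of zero, while strongly associated items give a large positive score.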

Thank you!

On 16 November 2017 at 17:03, Noelia Osés Fernández <[email protected]>
wrote:

> Wonderful! Thanks Daniel!
>
> Suneel, I'm still new to the Apache ecosystem, so I only vaguely knew that
> Mahout is used... I still don't know the different parts well enough
> to have a good understanding of what each of them does (Spark, MLLib, PIO,
> Mahout,...)
>
> Thank you both!
>
> On 16 November 2017 at 16:59, Suneel Marthi <[email protected]> wrote:
>
>> Indeed so. Ted Dunning is an Apache Mahout PMC member and committer, and the
>> whole idea of search-based recommenders stems from his work and insights.
>> In case you didn't know, the PIO UR uses Apache Mahout under the hood, which
>> is why you see the LLR.
>>
>> On Thu, Nov 16, 2017 at 3:49 PM, Daniel Gabrieli <
>> [email protected]> wrote:
>>
>>> I am pretty sure the LLR stuff in UR is based off of this blog post and
>>> associated paper:
>>>
>>> http://tdunning.blogspot.com/2008/03/surprise-and-coincidence.html
>>>
>>> Accurate Methods for the Statistics of Surprise and Coincidence
>>> by Ted Dunning
>>>
>>> http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.14.5962
>>>
>>>
>>> On Thu, Nov 16, 2017 at 10:26 AM Noelia Osés Fernández <
>>> [email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>> I've been trying to understand how the UR algorithm works and I think I
>>>> have a general idea. But I would like to have a *mathematical
>>>> description* of the step in which the LLR comes into play. In the CCO
>>>> presentations I have found it says:
>>>>
>>>> (PtP) compares column to column using
>>>> *log-likelihood based correlation test*
>>>>
>>>> However, I have searched for "log-likelihood based correlation test" on
>>>> Google without success. All I get are explanations of the likelihood-ratio
>>>> test for comparing two models.
>>>>
>>>> I would very much appreciate a math explanation of log-likelihood based
>>>> correlation test. Any pointers to papers or any other literature that
>>>> explains this specifically are much appreciated.
>>>>
>>>> Best regards,
>>>> Noelia
>>>>
>>>
>>
>
>
