Re: Log-likelihood based correlation test?

2017-11-23 Thread Pat Ferrel
Use the default. Tuning with a threshold is only for atypical data, and unless you have a cross-validation harness you would not know whether you were making things worse or better. We have our own tools for this but have never needed threshold tuning. Yes, llrDownsampled(PtP) is the

Re: Log-likelihood based correlation test?

2017-11-22 Thread Noelia Osés Fernández
Thanks Pat! How can I tune the threshold? And when you say "compare to each item in the model", do you mean each row in PtP? On 21 November 2017 at 19:56, Pat Ferrel wrote: > No PtP non-zero elements have LLR calculated. The highest scores in the > row are kept, or

Re: Log-likelihood based correlation test?

2017-11-21 Thread Noelia Osés Fernández
Pat, If I understood your explanation correctly, you say that some elements of PtP are removed by the LLR (set to zero, to be precise). But the elements that survive are calculated by matrix multiplication. The final PtP is put into Elasticsearch and when we query for user recommendations ES uses

Re: Log-likelihood based correlation test?

2017-11-20 Thread Pat Ferrel
Yes, this will show the model. But if you do this a lot there are tools like Restlet that you plug in to Chrome. They will allow you to build queries of all sorts. For instance GET http://localhost:9200/urindex/_search?pretty will show the item rows of the UR model put into the index for the

Re: Log-likelihood based correlation test?

2017-11-20 Thread Daniel Gabrieli
There is a REST client for Elasticsearch and bindings in many popular languages, but to get started quickly I found these commands helpful: List Indices: curl -XGET 'localhost:9200/_cat/indices?v' Get some documents from an index: curl -XGET 'localhost:9200//_search?q=*' Then look at the
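For anyone who would rather script these lookups than type curl commands, here is a minimal Python sketch using only the standard library. The `_cat/indices` and `_search` endpoints are the ones named in the thread. The helper names (`indices_url`, `search_url`, `fetch`) are hypothetical, and the default host `localhost:9200` and the index name `urindex` in the usage note are taken from the messages in this thread, so adjust them to your setup.

```python
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def indices_url(host="localhost:9200"):
    """URL that lists all indices (same as the 'List Indices' curl call)."""
    return f"http://{host}/_cat/indices?v"


def search_url(index, host="localhost:9200", query=None):
    """URL that returns documents from one index, pretty-printed.

    `query` is an optional query-string search (for example "*").
    """
    params = {"pretty": "true"}
    if query is not None:
        params["q"] = query
    return f"http://{host}/{quote(index)}/_search?{urlencode(params)}"


def fetch(url, timeout=5):
    """Issue a GET request and return the response body as text.

    Needs a reachable Elasticsearch instance; hypothetical helper.
    """
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

With Elasticsearch running locally, `print(fetch(indices_url()))` should list the indices and `print(fetch(search_url("urindex")))` should show documents from the index, assuming it is named `urindex` as in Pat's example.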

Re: Log-likelihood based correlation test?

2017-11-20 Thread Noelia Osés Fernández
Thanks Daniel! And excuse my ignorance but... how do you inspect the ES index? On 20 November 2017 at 15:29, Daniel Gabrieli wrote: > There is this cli tool and article with more information that does produce > scores: > >

Re: Log-likelihood based correlation test?

2017-11-20 Thread Daniel Gabrieli
There is this CLI tool and article with more information that does produce scores: https://mahout.apache.org/users/algorithms/intro-cooccurrence-spark.html But I don't know of any commands that return diagnostics about LLR from the PIO framework / UR engine. That would be a nice feature if it

Re: Log-likelihood based correlation test?

2017-11-20 Thread Noelia Osés Fernández
This thread is very enlightening, thank you very much! Is there a way I can see what the P, PtP, and PtL matrices of an app are? In the handmade case, for example? Are there any pio calls I can use to get these? On 17 November 2017 at 19:52, Pat Ferrel wrote: > Mahout
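To make the matrix names concrete, here is a small hand-made sketch in plain Python. It assumes P and L are binary user-by-item matrices for a primary and a secondary event type, and it computes only the raw cooccurrence counts PtP = PᵀP and PtL = PᵀL, before any LLR scoring or downsampling. The data and the helper names are made up for illustration; this is not how the UR stores or exposes its model.

```python
def transpose(m):
    """Return the transpose of a list-of-rows matrix."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Multiply two list-of-rows matrices."""
    b_cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols]
            for row in a]


# Hypothetical data: 4 users (rows), 3 primary-event items (columns).
P = [[1, 1, 0],
     [1, 0, 1],
     [0, 1, 1],
     [1, 1, 0]]

# Hypothetical data: same 4 users, 2 secondary-event items.
L = [[1, 0],
     [1, 1],
     [0, 1],
     [1, 0]]

# Entry [a][b] counts the users who have both item a (in P) and item b.
PtP = matmul(transpose(P), P)
PtL = matmul(transpose(P), L)

print(PtP)  # [[3, 2, 1], [2, 3, 1], [1, 1, 2]]
print(PtL)  # [[3, 1], [2, 1], [1, 2]]
```

The diagonal of PtP is just the number of users per item; the off-diagonal counts are what the LLR test is then applied to.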

Re: Log-likelihood based correlation test?

2017-11-17 Thread Daniel Gabrieli
Maybe someone can correct me if I am wrong but in the code I believe Elasticsearch is used instead of "resulting LLR is what goes into the AB element in matrix PtP or PtL." By default the strongest 50 LLR scores get set as searchable values in Elasticsearch per item-event pair. You can configure
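As a rough illustration of the "strongest 50 LLR scores" idea, here is a Python sketch that keeps only the top-scoring entries of one row. The default of 50 comes from the message above. The function name, the `min_llr` threshold parameter, and the dict-based row format are assumptions made for this example, not the UR's actual code or configuration names.

```python
import heapq


def downsample_row(scores, max_per_row=50, min_llr=None):
    """Keep the highest-scoring entries of one row.

    scores      -- dict mapping correlated item id -> LLR score
    max_per_row -- how many of the strongest scores to keep
    min_llr     -- optional threshold; entries below it are dropped first
    """
    kept = scores.items()
    if min_llr is not None:
        kept = [(item, score) for item, score in kept if score >= min_llr]
    # nlargest returns the entries sorted from strongest to weakest.
    return dict(heapq.nlargest(max_per_row, kept, key=lambda kv: kv[1]))
```

For example, `downsample_row({"a": 5.0, "b": 1.0, "c": 3.0}, max_per_row=2)` keeps only the entries for "a" and "c".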

Re: Log-likelihood based correlation test?

2017-11-16 Thread Noelia Osés Fernández
Wonderful! Thanks Daniel! Suneel, I'm still new to the Apache ecosystem, so I know that Mahout is used but only vaguely... I still don't know the different parts well enough to have a good understanding of what each of them does (Spark, MLLib, PIO, Mahout,...) Thank you both! On 16 November

Re: Log-likelihood based correlation test?

2017-11-16 Thread Suneel Marthi
Indeed so. Ted Dunning is an Apache Mahout PMC member and committer, and the whole idea of Search-based Recommenders stems from his work and insights. If you didn't know, the PIO UR uses Apache Mahout under the hood, and hence you see the LLR. On Thu, Nov 16, 2017 at 3:49 PM, Daniel Gabrieli

Re: Log-likelihood based correlation test?

2017-11-16 Thread Daniel Gabrieli
I am pretty sure the LLR stuff in UR is based on this blog post and the associated paper: http://tdunning.blogspot.com/2008/03/surprise-and-coincidence.html "Accurate Methods for the Statistics of Surprise and Coincidence" by Ted Dunning
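For readers who want to see the test itself, here is a minimal Python sketch of the log-likelihood ratio (G²) for a 2x2 contingency table, written in the entropy form described in the blog post. The function and argument names are chosen for this example: k11 is the number of times both events occur together, k12 and k21 the times only one of them occurs, and k22 the times neither occurs. Treat it as a sketch of the statistic, not as the code the UR actually runs.

```python
import math


def x_log_x(x):
    """x * ln(x), with the usual convention that 0 * ln(0) = 0."""
    return 0.0 if x == 0 else x * math.log(x)


def entropy(*counts):
    """Unnormalized Shannon entropy of a set of counts."""
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)


def llr(k11, k12, k21, k22):
    """Log-likelihood ratio for a 2x2 table of counts.

    Near 0 when the two events look independent; larger values mean
    the cooccurrence is more surprising.
    """
    row_entropy = entropy(k11 + k12, k21 + k22)
    col_entropy = entropy(k11 + k21, k12 + k22)
    matrix_entropy = entropy(k11, k12, k21, k22)
    # Clamp at zero to absorb floating-point round-off.
    return max(0.0, 2.0 * (row_entropy + col_entropy - matrix_entropy))
```

A table with no association, such as `llr(10, 10, 10, 10)`, scores essentially zero, while a perfectly associated table such as `llr(100, 0, 0, 100)` scores much higher than a weakly associated one such as `llr(60, 40, 40, 60)`.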