Use the default. Tuning with a threshold is only for atypical data, and unless
you have a cross-validation harness you would not know whether you were making
things better or worse. We have our own tools for this but have never needed
threshold tuning.
Yes, llrDownsampled(PtP) is the
Thanks Pat!
How can I tune the threshold?
And when you say "compare to each item in the model", do you mean each row
in PtP?
On 21 November 2017 at 19:56, Pat Ferrel wrote:
> No, PtP non-zero elements have LLR calculated. The highest scores in the
> row are kept, or
Pat,
If I understood your explanation correctly, you say that some elements of
PtP are removed by the LLR (set to zero, to be precise). But the elements
that survive are calculated by matrix multiplication. The final PtP is put
into Elasticsearch and when we query for user recommendations ES uses
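
One way to picture the "handmade" case discussed in this thread is to build PtP by hand from a tiny interaction matrix. The sketch below is only an illustration in plain Python, not code from the UR or Mahout: `cooccurrence` and `contingency` are hypothetical helper names, and P is assumed to be a binary user-by-item matrix. It computes the raw cooccurrence counts (P transposed times P) and the 2x2 counts for one item pair, which is the input an LLR test would then score.

```python
def cooccurrence(P):
    """Return PtP (P transposed times P) for a binary user-by-item matrix P."""
    n_items = len(P[0])
    PtP = [[0] * n_items for _ in range(n_items)]
    for row in P:  # one row per user
        for a in range(n_items):
            if row[a]:
                for b in range(n_items):
                    if row[b]:
                        PtP[a][b] += 1
    return PtP


def contingency(PtP, n_users, a, b):
    """2x2 counts for items a and b; the diagonal of PtP holds per-item totals."""
    k11 = PtP[a][b]              # users with both a and b
    k12 = PtP[a][a] - k11        # users with a but not b
    k21 = PtP[b][b] - k11        # users with b but not a
    k22 = n_users - k11 - k12 - k21  # users with neither
    return k11, k12, k21, k22


# Hypothetical data: 4 users, 3 items.
P = [[1, 1, 0],
     [1, 0, 1],
     [1, 1, 0],
     [0, 0, 1]]
PtP = cooccurrence(P)
```

With this toy P, `contingency(PtP, len(P), 0, 1)` gives the four counts for items 0 and 1; every non-zero off-diagonal cell would be scored the same way.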
Yes, this will show the model. But if you do this a lot there are tools like
Restlet that you plug in to Chrome. They will allow you to build queries of all
sorts. For instance
GET http://localhost:9200/urindex/_search?pretty
will show the item rows of the UR model put into the index for the
There is a REST client for Elasticsearch and bindings in many popular
languages but to get started quickly I found these commands helpful:
List Indices:
curl -XGET 'localhost:9200/_cat/indices?v'
Get some documents from an index:
curl -XGET 'localhost:9200//_search?q=*'
Then look at the
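
For anyone who prefers a script to curl, the same two requests can be sketched with only the Python standard library. This is a rough sketch, not an official client: `ES_BASE`, `indices_url`, `search_url`, and `fetch` are hypothetical names, the host and port are assumed to be the local defaults used in the commands above, and `urindex` is the index name taken from the GET example earlier in the thread.

```python
import urllib.request

ES_BASE = "http://localhost:9200"  # assumption: local Elasticsearch on the default port


def indices_url(base=ES_BASE):
    """URL for the same listing as: curl -XGET 'localhost:9200/_cat/indices?v'"""
    return f"{base}/_cat/indices?v"


def search_url(index, base=ES_BASE, query="*"):
    """URL that returns some documents from the given index, pretty-printed."""
    return f"{base}/{index}/_search?q={query}&pretty"


def fetch(url):
    """Issue a GET request and return the response body as text."""
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8")


# Usage (needs a running Elasticsearch instance):
#   print(fetch(indices_url()))
#   print(fetch(search_url("urindex")))
```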
Thanks Daniel!
And excuse my ignorance but... how do you inspect the ES index?
On 20 November 2017 at 15:29, Daniel Gabrieli wrote:
> There is this cli tool and article with more information that does produce
> scores:
>
>
There is this cli tool and article with more information that does produce
scores:
https://mahout.apache.org/users/algorithms/intro-cooccurrence-spark.html
But I don't know of any commands that return diagnostics about LLR from the
PIO framework / UR engine. That would be a nice feature if it
This thread is very enlightening, thank you very much!
Is there a way I can see what the P, PtP, and PtL matrices of an app are?
In the handmade case, for example?
Are there any pio calls I can use to get these?
On 17 November 2017 at 19:52, Pat Ferrel wrote:
> Mahout
Maybe someone can correct me if I am wrong but in the code I believe
Elasticsearch is used instead of "resulting LLR is what goes into the AB
element in matrix PtP or PtL."
By default the strongest 50 LLR scores get set as searchable values in
Elasticsearch per item-event pair.
You can configure
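
The "keep the strongest scores per row" step described above can be sketched as follows. This is an illustration under assumptions, not the UR's code: `keep_top_k` is a hypothetical helper, a row is assumed to be a mapping from item id to LLR score, and the default of 50 is taken from the message above.

```python
def keep_top_k(row_scores, k=50):
    """Keep only the k highest-scoring entries of one row (item id -> LLR score)."""
    ranked = sorted(row_scores.items(), key=lambda kv: kv[1], reverse=True)
    return dict(ranked[:k])


# Hypothetical row with three correlated items; keep the two strongest.
row = {"item-a": 1.0, "item-b": 3.0, "item-c": 2.0}
strongest = keep_top_k(row, k=2)
```

Entries that fall outside the top k are dropped, which corresponds to setting those elements to zero in the sparse matrix.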
Wonderful! Thanks Daniel!
Suneel, I'm still new to the Apache ecosystem and so I know that Mahout is
used but only vaguely... I still don't know the different parts well enough
to have a good understanding of what each of them does (Spark, MLlib, PIO,
Mahout,...)
Thank you both!
On 16 November
Indeed so. Ted Dunning is an Apache Mahout PMC member and committer, and the
whole idea of search-based recommenders stems from his work and insights. If
you didn't know, the PIO UR uses Apache Mahout under the hood, and hence you
see the LLR.
On Thu, Nov 16, 2017 at 3:49 PM, Daniel Gabrieli
I am pretty sure the LLR stuff in UR is based off of this blog post and
associated paper:
http://tdunning.blogspot.com/2008/03/surprise-and-coincidence.html
Accurate Methods for the Statistics of Surprise and Coincidence
by Ted Dunning
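
For readers who want to see the arithmetic, here is a minimal sketch of the log-likelihood ratio for a 2x2 contingency table, written in the entropy form commonly used for this test. It is an illustration and not the code the UR runs: the names `x_log_x`, `entropy`, and `llr` are chosen here, and k11 is assumed to be the count of cases where both events occur, k12 and k21 the counts where only one occurs, and k22 the count where neither occurs.

```python
import math


def x_log_x(x):
    """x * ln(x), with the convention that 0 * ln(0) = 0."""
    return 0.0 if x == 0 else x * math.log(x)


def entropy(*counts):
    """Unnormalized Shannon entropy of a set of counts."""
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)


def llr(k11, k12, k21, k22):
    """Log-likelihood ratio score for a 2x2 contingency table."""
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    # Clamp tiny negative values caused by floating-point round-off.
    return max(0.0, 2.0 * (row + col - mat))
```

A table with no association, such as `llr(10, 10, 10, 10)`, scores zero up to round-off, while `llr(1, 0, 0, 1)` equals 4 ln 2 (about 2.77); larger scores indicate cooccurrence that is less likely to be chance.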