Hi,
Sorry, my bad: I searched for the piece of code on the internet and your
repository came up first. We do have the right repo in prod
(https://github.com/actionml/universal-recommender); I double-checked. Sorry
for the misunderstanding.
We still have the same problem.
Here is the engine.json we use:
{
  "comment":"",
  "id": "default",
  "description": "settings",
  "engineFactory": "org.template.RecommendationEngine",
  "datasource": {
    "params" : {
      "name": "sample-handmade-data.txt",
      "appName": "piourcluster",
      "eventNames": ["facet","view"]
    }
  },
  "sparkConf": {
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
    "spark.kryo.registrator": "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator",
    "spark.kryo.referenceTracking": "false",
    "spark.kryoserializer.buffer": "300m",
    "es.index.auto.create": "true",
    "es.nodes":"espionode1:9200,espionode2:9200,espionode3:9200"
  },
  "algorithms": [
    {
      "name": "ur",
      "params": {
        "appName": "piourcluster",
        "indexName": "urindex",
        "typeName": "items",
        "eventNames": ["facet", "view"],
        "blacklistEvents": [],
        "maxEventsPerEventType": 50000,
        "maxCorrelatorsPerEventType": 50,
        "maxQueryEvents": 100,
        "num": 11,
        "rankings": [
          {
            "name": "popRank",
            "type": "popular"
          }
        ],
        "returnSelf": true
      }
    }
  ]
}
We don't have a blacklist in our query. The query is basic: we use the Java
API, giving it the user id and the number of recommendations we want back.
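
For what it's worth, here is how we currently read the getExcludedItems
function quoted further down in this thread. This is only a minimal,
self-contained sketch: the Event and Query case classes and the extra function
parameters are simplified stand-ins of our own (the real template reads
blacklistEvents, modelEventNames and returnSelf from its own fields), and we
assume the deployed version behaves like the quoted code and that
modelEventNames follows the order of eventNames. Under those assumptions, an
empty blacklistEvents falls into the "case Nil" branch, which keeps the events
of the primary event type, so their target items end up excluded:

```scala
// Hypothetical, simplified stand-ins for the template's Event and Query types.
case class Event(event: String, targetEntityId: Option[String])
case class Query(
    user: Option[String] = None,
    item: Option[String] = None,
    blacklistItems: Option[Seq[String]] = None,
    returnSelf: Option[Boolean] = None)

object ExcludedItemsSketch {

  // Same logic as the quoted getExcludedItems, with the algorithm's settings
  // passed in as explicit parameters (an assumption made for this sketch).
  def getExcludedItems(
      userEvents: Seq[Event],
      query: Query,
      modelEventNames: Seq[String],
      blacklistEvents: Seq[String],
      returnSelf: Boolean): Seq[String] = {

    val blacklistedItems = userEvents.filter { event =>
      blacklistEvents match {
        // Empty list: fall back to the primary (first) event name.
        case Nil => modelEventNames.head == event.event
        // Non-empty list: keep only events named in the list.
        case _ => blacklistEvents.contains(event.event)
      }
    }.map(_.targetEntityId.getOrElse("")) ++
      query.blacklistItems.getOrElse(Seq.empty).distinct

    // Conditionally add the query item itself.
    val includeSelf = query.returnSelf.getOrElse(returnSelf)
    val allExcludedItems =
      if (!includeSelf && query.item.nonEmpty) blacklistedItems :+ query.item.get
      else blacklistedItems

    allExcludedItems.distinct
  }

  def main(args: Array[String]): Unit = {
    // Made-up user history: two primary ("facet") events and one "view" event.
    val history = Seq(
      Event("facet", Some("string")),
      Event("facet", Some("body")),
      Event("view", Some("some-product")))

    val excluded = getExcludedItems(
      history,
      Query(user = Some("some-user")),
      modelEventNames = Seq("facet", "view"),
      blacklistEvents = Nil, // corresponds to "blacklistEvents": []
      returnSelf = true)

    // The two "facet" targets are excluded even though the list is empty.
    println(excluded)
  }
}
```

If this reading is right, "blacklistEvents": [] does not mean "exclude
nothing", which would explain why must_not holds the facet values in our logs.
We have not checked this against the v0.5.0 code.
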
2017-03-31 21:55 GMT+02:00 Pat Ferrel <[email protected]>:
> You should not be using code from that repo. See the pio template gallery;
> it points to the correct template. My personal version is for experimental
> branches.
>
> The repo is here: https://github.com/actionml/universal-recommender
>
> The function is here:
> https://github.com/actionml/universal-recommender/blob/master/src/main/scala/URAlgorithm.scala#L634
> and looks like it is doing the right thing.
>
> Try it with UR v0.5.0 from the correct repo and if it doesn’t work, I’ll
> take a look. Please send along the engine.json you used, just to be sure we
> are on the same page. BTW, are you using a blacklist in your query also?
> Please give an example query.
>
>
> On Mar 31, 2017, at 6:45 AM, Bruno LEBON <[email protected]> wrote:
>
> Hi,
>
> Thanks for your answer. We tried that already but it doesn't change
> anything: we still have blacklisted items (mainly or only primary events,
> from what I see).
>
> I think the piece of code in charge of blacklisting is this one (from
> https://github.com/pferrel/template-scala-parallel-universal-recommendation/blob/master/src/main/scala/URAlgorithm.scala):
>
>   /** Create a list of item ids that the user has interacted with or are
>     * not to be included in recommendations */
>   def getExcludedItems(userEvents: Seq[Event], query: Query): Seq[String] = {
>
>     val blacklistedItems = userEvents.filter { event =>
>       // either a list or an empty list of filtering events so honor them
>       blacklistEvents match {
>         case Nil => modelEventNames.head equals event.event
>         case _ => blacklistEvents contains event.event
>       }
>     }.map(_.targetEntityId.getOrElse("")) ++ query.blacklistItems.getOrEmpty.distinct
>
>     // Now conditionally add the query item itself
>     val includeSelf = query.returnSelf.getOrElse(returnSelf)
>     val allExcludedItems = if (!includeSelf && query.item.nonEmpty) {
>       blacklistedItems :+ query.item.get
>     } // add the query item to be excluded
>     else {
>       blacklistedItems
>     }
>     allExcludedItems.distinct
>   }
>
> But my knowledge of Scala is very limited, so I don't understand the
> details. Does it say that if the parameter blacklistEvents is empty, i.e.
> [], then no events are to be excluded (apart from the includeSelf option)?
>
> Do I have the right version of UR?
> (https://github.com/pferrel/template-scala-parallel-universal-recommendation)
>
> 2017-03-30 20:00 GMT+02:00 Pat Ferrel <[email protected]>:
>
>> "blacklistEvents": [[]], should be "blacklistEvents": [],
>>
>>
>> On Mar 30, 2017, at 8:57 AM, Bruno LEBON <[email protected]> wrote:
>>
>> Hello,
>>
>> We are testing the Universal Recommender on a cluster built following the
>> tutorial from ActionML. Once the build/train/deploy is done, we send PIO a
>> request to get recommendations.
>> For example:
>> curl -H "Content-Type: application/json" \
>>   -d '{ "user": "4e810ef4-977a-4f04-b585-cf2c2996ec93", "num": 11 }' \
>>   http://localhost:8001/queries.json
>>
>> In the pio.log we see the requests made to Elasticsearch. They look like:
>>
>> {"size":11,"query":{"bool":{"should":[{"terms":{"facet":["estag_begin-couleur-noir-estag_end","cocooning","sexy","charme","estag_begin-taille-105h-estag_end","estag_begin-taille-4-estag_end","estag_begin-primadonna-estag_end","transparent","estag_begin-aubade-estag_end","estag_begin-couleur-rouge-estag_end","une-piece","estag_begin-simone-perele-estag_end","maintien","moins-de-20-euros-intervalle-de-prix","estag_begin-taille-taille-unique-estag_end","estag_begin-moins-50-pour-cent-estag_end","elasthanne","blouse","body","coque","string","slip","estag_begin-taille-95a-estag_end"]}},{"terms":{"view":[]}},{"constant_score":{"filter":{"match_all":{}},"boost":0}}],"must":[],"must_not":{"ids":{"values":["estag_begin-taille-95a-estag_end","string","estag_begin-aubade-estag_end","slip","elasthanne","coque","body","blouse","estag_begin-moins-50-pour-cent-estag_end","estag_begin-primadonna-estag_end","estag_begin-taille-taille-unique-estag_end","moins-de-20-euros-intervalle-de-prix","maintien","estag_begin-simone-perele-estag_end","une-piece","estag_begin-couleur-rouge-estag_end","transparent","sexy","estag_begin-taille-4-estag_end","estag_begin-taille-105h-estag_end","charme","cocooning","estag_begin-couleur-noir-estag_end"],"boost":0}},"minimum_should_match":1}},"sort":[{"_score":{"order":"desc"}},{"popRank":{"unmapped_type":"double","order":"desc"}}]}
>>
>> The important part is that the must_not clause is not empty. We want it
>> to be empty, and we have the following engine.json:
>> {
>>   "comment":"",
>>   "id": "default",
>>   "description": "settings",
>>   "engineFactory": "org.template.RecommendationEngine",
>>   "datasource": {
>>     "params" : {
>>       "name": "sample-handmade-data.txt",
>>       "appName": "piourcluster",
>>       "eventNames": ["facet","view"]
>>     }
>>   },
>>   "sparkConf": {
>>     "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
>>     "spark.kryo.registrator": "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator",
>>     "spark.kryo.referenceTracking": "false",
>>     "spark.kryoserializer.buffer": "300m",
>>     "es.index.auto.create": "true",
>>     "es.nodes":"espionode1:9200,espionode2:9200,espionode3:9200"
>>   },
>>   "algorithms": [
>>     {
>>       "name": "ur",
>>       "params": {
>>         "appName": "piourcluster",
>>         "indexName": "urindex",
>>         "typeName": "items",
>>         "eventNames": ["facet", "view"],
>>         "blacklistEvents": [[]],
>>         "maxEventsPerEventType": 50000,
>>         "maxCorrelatorsPerEventType": 50,
>>         "maxQueryEvents": 100,
>>         "num": 11,
>>         "rankings": [
>>           {
>>             "name": "popRank",
>>             "type": "popular"
>>           }
>>         ],
>>         "returnSelf": true
>>       }
>>     }
>>   ]
>> }
>>
>> From what we understand, setting the blacklistEvents parameter to an array
>> containing an empty array tells UR that we don't want any event to be
>> blacklisted, not even the primary one.
>> We also added the parameter "returnSelf": true to ask UR not to blacklist
>> any items that are part of the query.
>>
>> So why do we have blacklisted events in our query (i.e. the must_not part
>> of it)?
>>
>> (Note that when we change the engine.json and launch a deploy, we see some
>> of the parameter values appear in the log, so we know we are modifying the
>> right engine.json file.)
>>
>> Regards
>> Bruno