Hi,

Thanks for your answer. We tried that already but it doesn't change anything: we still have blacklisted items (mainly or only items from the primary event, from what I see).
I think the piece of code in charge of blacklisting is this one (from https://github.com/pferrel/template-scala-parallel-universal-recommendation/blob/master/src/main/scala/URAlgorithm.scala ):

    /** Create a list of item ids that the user has interacted with or are not to be included in recommendations */
    def getExcludedItems(userEvents: Seq[Event], query: Query): Seq[String] = {

      val blacklistedItems = userEvents.filter { event =>
        // either a list or an empty list of filtering events so honor them
        blacklistEvents match {
          case Nil => modelEventNames.head equals event.event
          case _ => blacklistEvents contains event.event
        }
      }.map(_.targetEntityId.getOrElse("")) ++ query.blacklistItems.getOrEmpty.distinct

      // Now conditionally add the query item itself
      val includeSelf = query.returnSelf.getOrElse(returnSelf)
      val allExcludedItems = if (!includeSelf && query.item.nonEmpty) {
        blacklistedItems :+ query.item.get
      } // add the query item to be excluded
      else {
        blacklistedItems
      }
      allExcludedItems.distinct
    }

But my knowledge of Scala is very limited, so I don't understand the details. Does it say that if the parameter blacklistEvents is empty, i.e. [], then no events are to be excluded (plus or minus the includeSelf option)? Do I have the right version of the UR? (https://github.com/pferrel/template-scala-parallel-universal-recommendation) I tried to walk through my reading of it in a small sketch after the quoted message below.

2017-03-30 20:00 GMT+02:00 Pat Ferrel <[email protected]>:

> "blacklistEvents": [[]], should be "blacklistEvents": [],
>
>
> On Mar 30, 2017, at 8:57 AM, Bruno LEBON <[email protected]> wrote:
>
> Hello,
>
> We are testing the Universal Recommender on a cluster set up following the
> ActionML tutorial. Once the build/train/deploy is done we send PIO a request
> to get recommendations. For example:
>
>     curl -H "Content-Type: application/json" -d '{ "user": "4e810ef4-977a-4f04-b585-cf2c2996ec93", "num": 11 }' http://localhost:8001/queries.json
>
> In pio.log we see the requests made to Elasticsearch. They look like:
>
>     {"size":11,"query":{"bool":{"should":[{"terms":{"facet":["estag_begin-couleur-noir-estag_end","cocooning","sexy","charme","estag_begin-taille-105h-estag_end","estag_begin-taille-4-estag_end","estag_begin-primadonna-estag_end","transparent","estag_begin-aubade-estag_end","estag_begin-couleur-rouge-estag_end","une-piece","estag_begin-simone-perele-estag_end","maintien","moins-de-20-euros-intervalle-de-prix","estag_begin-taille-taille-unique-estag_end","estag_begin-moins-50-pour-cent-estag_end","elasthanne","blouse","body","coque","string","slip","estag_begin-taille-95a-estag_end"]}},{"terms":{"view":[]}},{"constant_score":{"filter":{"match_all":{}},"boost":0}}],"must":[],"must_not":{"ids":{"values":["estag_begin-taille-95a-estag_end","string","estag_begin-aubade-estag_end","slip","elasthanne","coque","body","blouse","estag_begin-moins-50-pour-cent-estag_end","estag_begin-primadonna-estag_end","estag_begin-taille-taille-unique-estag_end","moins-de-20-euros-intervalle-de-prix","maintien","estag_begin-simone-perele-estag_end","une-piece","estag_begin-couleur-rouge-estag_end","transparent","sexy","estag_begin-taille-4-estag_end","estag_begin-taille-105h-estag_end","charme","cocooning","estag_begin-couleur-noir-estag_end"],"boost":0}},"minimum_should_match":1}},"sort":[{"_score":{"order":"desc"}},{"popRank":{"unmapped_type":"double","order":"desc"}}]}
>
> The important part is that the must_not clause is not empty.
> We want it to be empty. We have the following engine.json:
>
>     {
>       "comment": "",
>       "id": "default",
>       "description": "settings",
>       "engineFactory": "org.template.RecommendationEngine",
>       "datasource": {
>         "params": {
>           "name": "sample-handmade-data.txt",
>           "appName": "piourcluster",
>           "eventNames": ["facet", "view"]
>         }
>       },
>       "sparkConf": {
>         "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
>         "spark.kryo.registrator": "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator",
>         "spark.kryo.referenceTracking": "false",
>         "spark.kryoserializer.buffer": "300m",
>         "es.index.auto.create": "true",
>         "es.nodes": "espionode1:9200,espionode2:9200,espionode3:9200"
>       },
>       "algorithms": [
>         {
>           "name": "ur",
>           "params": {
>             "appName": "piourcluster",
>             "indexName": "urindex",
>             "typeName": "items",
>             "eventNames": ["facet", "view"],
>             "blacklistEvents": [[]],
>             "maxEventsPerEventType": 50000,
>             "maxCorrelatorsPerEventType": 50,
>             "maxQueryEvents": 100,
>             "num": 11,
>             "rankings": [
>               {
>                 "name": "popRank",
>                 "type": "popular"
>               }
>             ],
>             "returnSelf": true
>           }
>         }
>       ]
>     }
>
> From what we understand, having an array containing an empty array as the
> value of blacklistEvents tells the UR that we don't want any event to be
> blacklisted, not even the primary one.
> We also added the parameter "returnSelf": true to ask the UR not to
> blacklist any item that is part of the query.
>
> So why do we still have blacklisted items in our query (i.e. the must_not
> part of it)?
>
> (Note that when we change engine.json and launch a deploy, we see some of
> the parameter values appearing in the log, so we know we are modifying the
> right engine.json file.)
>
> Regards,
> Bruno
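
To check my reading of the pattern match above, here is a minimal, runnable Scala sketch. The Event and Query case classes are simplified stand-ins I made up for this test (only the fields used by getExcludedItems), and I replaced the UR helper getOrEmpty with a plain getOrElse(Seq.empty), so this is only my interpretation, not the real PredictionIO classes. With blacklistEvents empty, the case Nil branch matches and every item of the primary event (modelEventNames.head, i.e. "facet" in our engine.json) is excluded:

    // Simplified stand-ins for the real PredictionIO/UR types (assumption:
    // only the fields used by getExcludedItems are modelled here).
    case class Event(event: String, targetEntityId: Option[String])
    case class Query(
      blacklistItems: Option[Seq[String]],
      returnSelf: Option[Boolean],
      item: Option[String])

    object BlacklistSketch extends App {
      val modelEventNames = Seq("facet", "view")   // our eventNames, primary event first
      val blacklistEvents: Seq[String] = Seq.empty // what "blacklistEvents": [] should end up as (assumption)
      val returnSelf = true                        // the default we also set in engine.json

      def getExcludedItems(userEvents: Seq[Event], query: Query): Seq[String] = {
        val blacklistedItems = userEvents.filter { event =>
          blacklistEvents match {
            // empty config: fall back to the primary event name
            case Nil => modelEventNames.head equals event.event
            case _   => blacklistEvents contains event.event
          }
        }.map(_.targetEntityId.getOrElse("")) ++
          query.blacklistItems.getOrElse(Seq.empty).distinct

        // conditionally add the query item itself
        val includeSelf = query.returnSelf.getOrElse(returnSelf)
        val allExcludedItems =
          if (!includeSelf && query.item.nonEmpty) blacklistedItems :+ query.item.get
          else blacklistedItems
        allExcludedItems.distinct
      }

      val history = Seq(
        Event("facet", Some("sexy")),   // primary event
        Event("view",  Some("blouse"))  // secondary event
      )
      val query = Query(blacklistItems = None, returnSelf = Some(true), item = None)

      // Prints List(sexy): only the primary-event item is excluded,
      // even though blacklistEvents is empty.
      println(getExcludedItems(history, query))
    }

If that reading is right, then even with "blacklistEvents": [] the facet items from our history would still end up in the must_not clause, which matches what we observe.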
