Re: UR optimizing results

Pat Ferrel Wed, 24 May 2017 10:43:34 -0700

I suggest you read the docs here: http://actionml.com/docs/ur Pay particular 
attention to attaching properties to items and using fields to query for those 
properties. This is the only way to get items with no usage data. You could 
promote items with business rules or adopt some kind of ordering of items that 
puts new items ahead of popular ones. So check custom "rankings" and "item 
properties".
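
As a rough illustration of querying by item properties, here is a minimal Python sketch that builds a query body with a "fields" entry. The property name "category" and its values come from the $set event quoted later in this thread; the helper name build_property_query is hypothetical, and the bias convention (negative means filter, greater than 1 means boost) is my reading of the UR docs, so check it against the docs for your version.

```python
import json


def build_property_query(field_name, values, bias=-1, num=10):
    """Build a UR query body that filters or boosts on one item property.

    Assumption: a negative bias acts as a filter (only matching items are
    returned) and a bias greater than 1 acts as a boost. Verify this
    against the UR docs for the version you run.
    """
    return {
        "num": num,
        "fields": [
            {"name": field_name, "values": list(values), "bias": bias}
        ],
    }


# No "user" or "item" key: the query relies on item properties alone,
# so items without usage data can still be returned.
query = build_property_query("category", ["1", "2"])
print(json.dumps(query, indent=2))
```

The printed JSON would be POSTed to the deployed engine's queries.json endpoint. With no user or item in the query, the order of the results should come from whatever "rankings" are configured.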


itemBias is used for item-based queries and refers to item-similarity based on 
usage data, not content similarity.
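
To make the distinction concrete, here is a small Python sketch of an item-based query that carries itemBias. The item id "12" is taken from the events quoted below; the helper name build_item_query is hypothetical, and the exact effect of the bias value should be checked in the UR docs.

```python
import json


def build_item_query(item_id, item_bias=None, num=10):
    """Build an item-based UR query body.

    itemBias weights the item-similarity part of the query, which is
    derived from usage data. It does not look at item properties; those
    are addressed with "fields" instead.
    """
    query = {"item": str(item_id), "num": num}
    if item_bias is not None:
        # Overrides the itemBias default from engine.json for this query.
        query["itemBias"] = float(item_bias)
    return query


query = build_item_query(12, item_bias=2.0)
print(json.dumps(query, indent=2))
```

If item "12" has never appeared in an event, there is no similarity data to bias, which is why itemBias cannot stand in for content-based recommendations.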

It is difficult to truly mix content-based recs, where no usage data exists, 
with collaborative filtering because you would be giving up the advantage of CF. 
Therefore I suggest some separate rolling promotion mechanism in a separate 
placement. Then you’ll get usage data for the new items, at least detail views.


On May 24, 2017, at 10:33 AM, Dennis Honders <[email protected]> wrote:

Thanks again for the answer. I will read the paper soon. 
How can recommendations be configured for content-based filtering (based on 
item properties) for products which have never been sold, instead of falling 
back to, e.g., popular items? 

Boosting with these properties is done with itemBias. 

On 24 May 2017, at 17:54, Pat Ferrel <[email protected] 
<mailto:[email protected]>> wrote:

> I split answers in 2 since the config is a completely separate thing.
> 
> Increasing maxCorrelatorsPerEventType is usually the wrong thing to do. It 
> makes the model fuzzier, for lack of a better term. In fact we’d like to 
> restrict the correlators to only the best, and maxCorrelatorsPerEventType is 
> a crude way to do this that gets worse the more you allow. Another, newer 
> method is an LLR threshold, which can be set per indicator to use the 
> correlation value as a threshold for inclusion as a correlator. 
> maxCorrelatorsPerEventType just takes the top ones even if their scores are 
> low. This is why making this number big will not make things better: it will 
> include more correlators of lower quality.
> 
> Also, increasing maxEventsPerEventType increases memory usage and takes far 
> longer to calculate the model for very little, if any, gain. This is from a 
> paper by Sebastian Schelter, one of the inventors of CCO: 
> https://ssc.io/pdf/rec11-schelter.pdf
> 
> I’d leave those as defaulted and measure a baseline KPI before doing A/B 
> tests or cross-validation to try different numbers there.
> 
> 
> On May 24, 2017, at 8:28 AM, Dennis Honders <[email protected] 
> <mailto:[email protected]>> wrote:
> 
> Current data: 
> 
> {"event": "cart-transaction", "entityId": "1", "entityType": "user", 
> "targetEntityId": "12", "targetEntityType": "item"}, 
> 
> {"event": "$set", "entityType": "item", "entityId": "12", "properties": 
> {"category": ["1", "2", "3", "4", "5", "6", "7"], "manufacturer": 1, "label": 
> "test", "price": "$1-$2"}}
> 
> Questions: 
> 
> Cart-transaction is the primary event for shopping cart recommendations. 
> Should I use user-buy-item as a secondary event, or is there no link between 
> them?
> 
> Item-based queries are for similar items. For shopping cart recommendations, 
> would complementary recommendations suit better? If so, those are made by 
> 'user-id' (cart-id). How can this be done?
> 
> I would like to do content-based recommendation for items that haven't been 
> in a transaction. I think this can be configured in the engine.json. Any 
> advice for doing this?
> 
> Engine.json: 
> 
> {
>   "comment": "This config file uses default settings for all but the required values see README.md for docs",
>   "id": "default",
>   "description": "Default settings",
>   "engineFactory": "com.actionml.RecommendationEngine",
>   "datasource": {
>     "params" : {
>       "name": "ur-name",
>       "appName": "Test",
>       "eventNames": ["cart-transaction"]
>     }
>   },
>   "sparkConf": {
>     "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
>     "spark.kryo.registrator": "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator",
>     "spark.kryo.referenceTracking": "false",
>     "spark.kryoserializer.buffer.mb": "300",
>     "spark.kryoserializer.buffer": "300m",
>     "es.index.auto.create": "true"
>   },
>   "algorithms": [
>     {
>       "comment": "simplest setup where all values are default, popularity based backfill, must add eventsNames",
>       "name": "ur",
>       "params": {
>               "appName": "Test",
>               "indexName": "test",
>               "typeName": "cart",
>               "comment": "must have data for the first event or the model will not build, other events are optional",
>               "eventNames": ["cart-transaction"],
>               "maxEventsPerEventType": 50000,
>               "maxCorrelatorsPerEventType": 5000,
>               "num": 10, 
>               "itemBias": 2.0,
>               "rankings": [{
>                       "name": "preferredRank",
>                       "type": "userDefined"
>               }]
>       }
>     }
>   ]
> }
> 
>
