I’m the maintainer of the Universal Recommender. We have OSS support at 
https://groups.google.com/forum/#!forum/actionml-user 

Do you wish to take advantage of the same user being in multiple 
datasets/tenants? The answer below assumes no.

There are several ways to do this. First, the PIO EventServer is multi-tenant: 
just keep data in separate “apps” (which really should be named “datasets”). 
They are identified by access keys generated when you run `pio app new <your-app-name>`.
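
For example (the app names and the default EventServer port 7070 are just 
placeholders here), you create one dataset per tenant and send events keyed to it:

  pio app new tenant-a    # prints the access key for tenant-a’s dataset
  pio app new tenant-b    # prints the access key for tenant-b’s dataset

  # send an event into tenant-a’s dataset only
  curl -X POST "http://localhost:7070/events.json?accessKey=<tenant-a-key>" \
    -H "Content-Type: application/json" \
    -d '{
          "event": "purchase",
          "entityType": "user",
          "entityId": "u1",
          "targetEntityType": "item",
          "targetEntityId": "i42"
        }'

`pio app list` will show the datasets and their keys.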

The PredictionServer is not multi-tenant, but you can run a separate process per 
tenant on different ports. You would train each tenant from a different directory 
containing the UR and the correct engine.json for that tenant/dataset, then 
deploy it on a port specific to that tenant/model. This creates a somewhat 
heavyweight process for each port.
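
A rough sketch of that layout (the directories and port numbers are just examples):

  cd /path/to/ur-tenant-a
  pio train
  pio deploy --port 8001

  cd /path/to/ur-tenant-b
  pio train
  pio deploy --port 8002

Queries for tenant A then go to http://<host>:8001/queries.json, tenant B to 
port 8002, and so on.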

We have a version of PIO that supports multi-tenancy with lightweight Actors, 
one per tenant. You deploy with a resource-id, and when you make queries you 
include the REST resource-id in the URI. All engines are on the same port, 
running in the same process, so it’s very lightweight and performant. Otherwise 
the query works the same. Private message me to hear more.
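
Just to illustrate the shape of it (the exact URI scheme belongs to that 
version, so treat this path as purely hypothetical), a query hits the one 
shared port with the tenant’s resource-id in the path, something like:

  curl -X POST "http://<host>:8000/engines/<resource-id>/queries.json" \
    -H "Content-Type: application/json" \
    -d '{ "user": "xyz" }'

The body of the query itself is unchanged.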

I would not advise the item property method. Unless you know there is no 
overlap in user-ids, it may produce undesired results in the model, and these 
may leak into recommendations. You can solve that with a filter (instead of the 
boost below), but there are better ways to solve this.
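
For completeness, if I remember the UR docs correctly a negative bias turns the 
field into a hard filter rather than a boost, so the filtered form of the query 
below would look something like:

  {
    "user": "xyz",
    "fields": [
      {
        "name": "tenant_id",
        "values": ["12"],
        "bias": -1
      }
    ]
  }

That keeps cross-tenant items out of the results, though the model is still 
trained on the mixed data.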


On Sep 8, 2016, at 4:08 PM, David Jones <[email protected]> wrote:

Hi All,

I have a use case where I have events coming in from many separate tenants and 
I want to use the Universal Recommender engine. The challenge is separating 
data from each tenant throughout the PIO process.

I can think of three possible ways to solve this issue, but they all have 
tradeoffs:

1) Create Multiple Apps

You have one app per tenant. When you create events, you use the access key 
specific to that tenant. Then you query for recommendations using that same 
access key to get recommendations for just that app.

Issue: each engine has to specify an “appName” in engine.json. So now you have 
to have an engine per tenant (AKA app) with all the same source code except 
that the “appName” will be different.
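
For reference, the bit of engine.json in question looks something like this 
(only the relevant fragment, field names from memory):

  {
    "datasource": {
      "params": {
        "appName": "tenant-a",
        "eventNames": ["purchase", "view"]
      }
    }
  }

Every other tenant’s copy would be identical apart from that appName value.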

This’ll result in a bunch of duplicated code and you’ll have to train and 
deploy each one individually.

There is also no API for creating apps, so something will need to be built to 
bridge that gap and allow a new tenant to be onboarded.

2) Use Channels

You create one app, but create a channel per tenant. When you create an event 
you specify the channel.
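
For example (assuming the stock EventServer on its default port 7070, and that 
the channel has already been created with `pio app channel-new`), an event for 
one tenant would be posted something like:

  curl -X POST "http://localhost:7070/events.json?accessKey=<the-app-key>&channel=tenant-12" \
    -H "Content-Type: application/json" \
    -d '{
          "event": "view",
          "entityType": "user",
          "entityId": "xyz",
          "targetEntityType": "item",
          "targetEntityId": "i42"
        }'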

Issue: the Universal Recommender engine can be modified to look at data for a 
single channel name, but that name cannot be chosen dynamically at query time; 
it’ll be hardcoded into DataSource.scala. So now you’re in the same situation 
where you’ll need to create one engine per tenant, where each engine has the 
exact same source code except for a one-line change in the DataSource.scala file.

3) Use Product Properties

Provided your user ids are unique across all tenants, you could set a property 
on each product with a tenant id.
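
Setting that property would just be a standard $set event on each item, 
something like (again assuming the default EventServer port):

  curl -X POST "http://localhost:7070/events.json?accessKey=<access-key>" \
    -H "Content-Type: application/json" \
    -d '{
          "event": "$set",
          "entityType": "item",
          "entityId": "product-1",
          "properties": { "tenant_id": ["12"] }
        }'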

This way you can use one app, one engine, and simply query for recommendations 
and supply a significant bias to products that contain the tenant id property.

Example: give me the top recommendations for user xyz, who is on tenant_id 12.

{
  "user": "xyz",
  "fields": [
    {
      "name": tenant_id",
      "values": ["12"],
      "bias": 10
    }
  ]
}

Issues: since all the data for all tenants is in one place, you’re going to 
have to train over every tenant’s data each time. There are also issues around 
the risk of deleting data from the wrong tenant should a tenant leave.

-
I was wondering if anyone has done something along the lines of any of these 
options? Perhaps there are other options? Are there any better ones? I’m thinking 
option 3) might be the best for our needs.

Thanks,
David.
