Could you please provide links to resources on the version of PIO that
supports multi-tenancy with lightweight Actors, one per tenant?

On Thu, Sep 8, 2016 at 7:52 PM, Pat Ferrel <[email protected]> wrote:

> I’m the maintainer of the Universal Recommender. We have OSS support at
> https://groups.google.com/forum/#!forum/actionml-user
>
> Do you wish to take advantage of the same user being in multiple
> datasets/tenants? The answer below is assuming no.
>
> There are several ways to do this. First, the PIO EventServer is
> multi-tenant: just keep data in separate “apps”, which really should be
> named “datasets”. They are identified by keys generated when you run
> `pio app new <your-app-name>`.
>
> The PredictionServer is not multi-tenant, but you can run a separate
> process per tenant, each on its own port. You would train each tenant from a different
> directory containing the UR and the correct engine.json for that
> tenant/dataset. Then deploy it on some port that is specific to the
> tenant/model. This will create somewhat heavyweight processes for each port.
>
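A minimal sketch of the one-process-per-tenant routing described above, assuming each tenant's engine has already been trained and deployed on its own port. The tenant names, port numbers, and the `TENANT_PORTS` table are hypothetical; `/queries.json` is the query endpoint of a deployed PIO engine.

```python
import json
import urllib.request

# Hypothetical mapping of tenant -> port. Each port hosts one engine
# deployed from that tenant's own directory and engine.json.
TENANT_PORTS = {"tenant_a": 8001, "tenant_b": 8002}


def query_url(tenant, host="localhost"):
    """Return the query endpoint of the engine deployed for this tenant."""
    return f"http://{host}:{TENANT_PORTS[tenant]}/queries.json"


def build_request(tenant, user_id, host="localhost"):
    """Build (but do not send) a POST request asking for recommendations."""
    body = json.dumps({"user": user_id}).encode("utf-8")
    return urllib.request.Request(
        query_url(tenant, host),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result of `build_request("tenant_a", "xyz")` to `urllib.request.urlopen` would send the query to that tenant's running engine.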
> We have a version of PIO that supports multi-tenancy with lightweight
> Actors, one per tenant. You deploy with a resource-id and, when you make
> queries, include the REST resource id in the URI. All engines are on the
> same port running in the same process so it’s very light-weight and
> performant. Otherwise the query works the same. Private message me to hear
> more.
>
> I would not advise the item property method. Unless you know there is no
> overlap in user-ids, it may produce undesired results in the model, and these
> may leak into recommendations. You can solve that with a filter (instead of
> the boost below), but there are better ways to solve this.
>
>
>
> On Sep 8, 2016, at 4:08 PM, David Jones <[email protected]> wrote:
>
> Hi All,
>
> I have a use case where I have events coming in from many separate tenants
> and I want to use the Universal Product Recommender engine. The challenge
> is separating data from each tenant throughout the PIO process.
>
> I can think of three possible ways to solve this issue, but they all have
> tradeoffs:
>
> *1) Create Multiple Apps*
>
> You have one app per tenant. When you create events, you use the access
> key specific to that tenant. Then you query for recommendations using that
> same access key to get recommendations for just that app.
>
> Issue: each engine has to specify an “appName” in engine.json. So now you
> have to have an engine per tenant (AKA app), each with the same source
> code except for a different “appName”.
>
> This’ll result in a bunch of duplicated code and you’ll have to train and
> deploy each one individually.
>
> There is also no API for creating apps, so something will need to be
> created to bridge that gap and allow a new tenant to be onboarded.
>
> *2) Use Channels*
>
> You create one app, but create a channel per tenant. When you create an
> event you specify the channel.
>
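A minimal sketch of how options 1 and 2 differ when sending events, assuming the EventServer's default port (7070) and its `/events.json` endpoint with `accessKey` and optional `channel` query parameters. The tenant names and key values are hypothetical placeholders.

```python
from urllib.parse import urlencode

# Hypothetical per-tenant access keys, as printed by `pio app new`.
ACCESS_KEYS = {"tenant_a": "KEY_A", "tenant_b": "KEY_B"}


def event_url(access_key, channel=None, host="localhost", port=7070):
    """Build the EventServer URL that events are POSTed to."""
    params = {"accessKey": access_key}
    if channel is not None:
        params["channel"] = channel
    return f"http://{host}:{port}/events.json?{urlencode(params)}"


# Option 1: one app per tenant, so the access key selects the tenant.
per_app = event_url(ACCESS_KEYS["tenant_a"])

# Option 2: one shared app, so the channel name selects the tenant.
per_channel = event_url("SHARED_KEY", channel="tenant_a")
```

Either way the separation happens at ingestion time. What differs is whether the engine reads a tenant's data by app name or by channel name.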
> Issue: the Universal Recommender engine can be modified to look at data
> for a single channel name but that name cannot be dynamically queried,
> it’ll be hardcoded into DataSource.scala. So now you’re in this same
> situation where you’ll need to create one engine per tenant, where each
> engine has the exact same source code except a one line change in the
> DataSource.scala file.
>
> *3) Use Product Properties*
>
> Provided your user ids are unique over all tenants, you could set a
> property on each product with a tenant id.
>
> This way you can use one app, one engine, and simply query for
> recommendations and supply a significant bias to products that contain the
> tenant id property.
>
> Example, give me the top recommendations for user xyz who is on tenant_id
> 12.
>
> {
>   "user": "xyz",
>   "fields": [
>     {
>       "name": "tenant_id",
>       "values": ["12"],
>       "bias": 10
>     }
>   ]
> }
>
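A minimal sketch of building the query above programmatically, with the bias left as a parameter so the same helper covers both the boost shown here and the filter variant mentioned earlier in the thread. The helper name `tenant_query` is hypothetical, and treating a negative bias such as -1 as a filter is an assumption about the Universal Recommender's bias convention that should be checked against its documentation.

```python
import json


def tenant_query(user_id, tenant_id, bias=10):
    """Build a query that biases results by a tenant_id item property.

    Assumption: a positive bias boosts matching items, while a negative
    bias (e.g. -1) restricts results to matching items (a filter).
    """
    return {
        "user": user_id,
        "fields": [
            {"name": "tenant_id", "values": [str(tenant_id)], "bias": bias}
        ],
    }


boost = tenant_query("xyz", 12)            # same shape as the JSON above
only_tenant = tenant_query("xyz", 12, -1)  # filter variant (assumed semantics)
print(json.dumps(boost, indent=2))
```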
> Issues: since all the data for all tenants is in one place, you’re going
> to have to train over all tenants’ data each time. There is also a risk
> of deleting data from the wrong tenant should a tenant leave.
>
> -
> I was wondering if anyone has done something similar to any of these options?
> Perhaps there are other options? Are there any better ones? I’m thinking
> option 3) might be the best for our needs.
>
> Thanks,
> David.
>
>
