No, both engines can even use the same data. Two engines are far simpler to 
construct and manage. You make 2 queries, but a single engine with multiple 
algorithms would also run 2 queries internally. The only thing a combined 
engine saves you is running 2 deployed engine processes instead of 1, and that 
comes at the cost of added complexity.
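
For illustration only, here is a rough client-side sketch of the two-engine setup, assuming each engine is deployed on its own port and answers POSTs at /queries.json. The engine names, ports, and query fields are hypothetical placeholders; substitute whatever your templates expect.

```python
import json
from urllib import request


def post_json(url, payload, timeout=5.0):
    """POST a JSON body to url and return the decoded JSON response."""
    req = request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def query_engines(engine_urls, queries, post=post_json):
    """Send one query to each deployed engine; return results keyed by engine name.

    `post` is injectable so the function can be exercised without a live server.
    """
    return {name: post(url, queries[name]) for name, url in engine_urls.items()}


# Hypothetical deployment: one engine per port (assumed, not from the thread).
ENGINE_URLS = {
    "classifier": "http://localhost:8000/queries.json",
    "recommender": "http://localhost:8001/queries.json",
}
```

Calling query_engines(ENGINE_URLS, {"classifier": {...}, "recommender": {...}}) issues the 2 queries mentioned above and keeps each engine's result separate, so the web app decides how to combine them.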


On Feb 28, 2017, at 6:11 PM, Kanak Singh <[email protected]> wrote:

This way, I can deploy one engine for one user of my web app, and that can take 
care of all prediction needs of that user.
If I deploy multiple engines for one user, it will limit the number of users I 
can have, right?

On Tue, Feb 28, 2017 at 4:49 PM, Kanak Singh <[email protected] 
<mailto:[email protected]>> wrote:
This way, I only have to deploy that one engine, instead of deploying multiple 
ones.

On Tue, Feb 28, 2017 at 4:19 PM, Pat Ferrel <[email protected] 
<mailto:[email protected]>> wrote:
May I ask why you are packaging these as multiple algorithms in a template 
rather than multiple templates with one algo each?


On Feb 28, 2017, at 3:29 PM, Kanak Singh <[email protected] 
<mailto:[email protected]>> wrote:

Hi all,
This is my first email so pardon me for any mistakes in email etiquette.

I am trying to create a multi-purpose engine in which I can include multiple 
algorithms that operate on multiple data sets, and that is what I need some 
help with.

I have read the documentation on including multiple algorithms and handling 
multiple events. I see that the DataSource and Preparator can collect data 
across different channels/different event types and pass it on as individual 
RDD[LabeledPoint] fields of the PreparedData object. And I can add multiple 
Algorithm files and have each of them access the respective PreparedData field 
that I choose.

However, for this I have to decide which algorithms I can offer to the client 
and how many of them, at the time of creating an Engine template folder. I am 
trying to go one step further by making one engine handle an unspecified number 
of data sets (dynamically uploaded by the client using POST requests) and operate 
on them using any algorithm (specified perhaps in a 'pio train --algorithm' 
option).
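
To make the idea concrete, here is a rough, framework-agnostic sketch of that kind of dispatch: data sets are stored by name, and the algorithm is looked up by name at training time. Nothing here is PredictionIO API, the 'pio train --algorithm' option above is only a wished-for flag, and the two toy algorithms and all function names are hypothetical.

```python
from statistics import mean


def train_mean(rows):
    """Toy algorithm: always predict the mean label of the training rows."""
    m = mean(label for _, label in rows)
    return lambda features: m


def train_majority(rows):
    """Toy algorithm: always predict the most frequent training label."""
    labels = [label for _, label in rows]
    top = max(set(labels), key=labels.count)
    return lambda features: top


# Registry mapping an algorithm name to its training function (hypothetical).
ALGORITHMS = {"mean": train_mean, "majority": train_majority}


def train(datasets, dataset_name, algorithm_name):
    """Train the named algorithm on the named data set and return a predictor."""
    if algorithm_name not in ALGORITHMS:
        raise ValueError(f"unknown algorithm: {algorithm_name}")
    if dataset_name not in datasets:
        raise KeyError(f"unknown data set: {dataset_name}")
    return ALGORITHMS[algorithm_name](datasets[dataset_name])
```

Each data set here is a list of (features, label) pairs, so train(datasets, "sales", "mean") would return a predictor for a data set uploaded under the name "sales".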

First of all, is this a good idea?
If so, what is the best way to do it?

Any help or leads would be appreciated.

Best.



