One of my biggest work items is scaling deployments, including all backing 
services, to fit the data, then adjusting that as the data changes, or 
fine-tuning to reduce costs. Upgrading has not been difficult for the PIO part, 
but it certainly is when upgrading storage backends or, heaven forbid, changing 
the type of storage backend.
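
For context, the storage backend type is wired into each install's 
`pio-env.sh`, so switching backends means rewriting that configuration and 
migrating the data by hand. A minimal sketch of the relevant settings, with 
purely illustrative names and values, assuming a PostgreSQL-backed install:

```shell
# pio-env.sh (excerpt) -- illustrative values only.
# Each repository (metadata, event data, model data) is bound to a named source.
PIO_STORAGE_REPOSITORIES_METADATA_NAME=pio_meta
PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=PGSQL
PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME=pio_event
PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE=PGSQL
PIO_STORAGE_REPOSITORIES_MODELDATA_NAME=pio_model
PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=PGSQL

# Changing the backend type means replacing this whole block (e.g. JDBC ->
# HBase/Elasticsearch) and migrating whatever data lives in the old backend.
PIO_STORAGE_SOURCES_PGSQL_TYPE=jdbc
PIO_STORAGE_SOURCES_PGSQL_URL=jdbc:postgresql://localhost/pio
PIO_STORAGE_SOURCES_PGSQL_USERNAME=pio
PIO_STORAGE_SOURCES_PGSQL_PASSWORD=pio
```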

I’d love to see this addressed. We have Chef recipes and a collection of Docker 
containers, all OSS, as well as some closed-source Terraform scripts for 
spinning up AWS. Would some form of this help? I know Heroku has its own 
install system.


On Jul 11, 2017, at 10:31 AM, Pat Ferrel <p...@occamsmachete.com> wrote:

Understood, you have immediate practical reasons for 1 integrated deployment 
with the 2 endpoints. But Apache is a do-ocracy, meaning those who do something 
win the argument as long as they have enough consensus. I have enough 
experience with PIO that I have chosen to fix a lot of issues with the 
prototype design, having already gone down the “quick hack” path once. You may 
want to do something else if you have the resources.

I fear that my deeper changes will not get enough consensus and we may end up 
with a competing ML/AI server framework some day. That is another ASF tendency. 
Innovations happen before going into ASF, often not under ASF rules.

In any case—how much of your problem is workflow vs installation vs bundling of 
APIs? Can you explain it more?


On Jul 11, 2017, at 9:37 AM, Mars Hall <m...@heroku.com> wrote:

> On Jul 10, 2017, at 18:03, Kenneth Chan <kenn...@apache.org> wrote:
> 
> it's all the same set of events collected for my application and I can create 
> multiple engines to use these data for different purposes.


Clear to me, ⬆️ this is the prevailing reasoning behind the "separateness" of 
the Eventserver. I do not forsake this design goal, but ask that we consider 
the usability & durability of PredictionIO when deploying multiple engines with 
different versions of PIO and different storage configurations. This will 
probably happen for anyone who uses PredictionIO long-term in production, as 
their new projects come on-line with newer & better versions & configurations.

I encounter this situation of needing separate PIO installs regularly when 
testing the next release or development builds of PIO and when evaluating 
engine templates or algorithms that require new, different storage configs. 
Also, those in the consulting world are frequently required to keep client data 
separated for all kinds of privacy & legal reasons; with the storage corruption 
bug I reported, one client's data could become visible to or intermingled with 
another client's app.

In starting this thread, I was hoping to find some traction with the idea of 
making it possible to completely self-contain a PredictionIO app by adding the 
Events API to the process started with `pio deploy`.

Goal: Queries & Events APIs in the same process.
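
Concretely, today a self-contained app requires two long-running processes, 
coordinated out-of-band; the proposal would collapse them into one. A hedged 
sketch of the difference using the standard `pio` CLI (ports are the usual 
defaults; the `--with-event-server` flag is purely hypothetical, named here 
only to illustrate the desired behavior):

```shell
# Today: two processes per app.
pio eventserver --port 7070 &   # Events API
pio deploy --port 8000          # Queries API

# Proposed: one process started by `pio deploy`, serving both APIs.
# (The flag below does not exist; it only names the goal.)
pio deploy --port 8000 --with-event-server
```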

When considering the architecture of apps, sharing a database between two or 
more apps is considered a very naughty way to get around having clear, clean, 
inter-process APIs. My team at Salesforce/Heroku has been struck by this exact 
issue with PredictionIO. So, I am seeking a way to fix this without requiring a 
rewrite of PredictionIO. I am excited to hear about the new architecture 
prototypes, yet our reality is that this is an issue now.

*Mars

( <> .. <> )
