This sounds like a good case for Donald’s suggestion. What I was trying to add to the discussion is a way to make all commands rely on state in the metastore, rather than on a file on some machine in the cluster, on the order of execution, or on being run from a particular location in a directory structure. All commands would then be stateless.
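A minimal sketch of that idea, in Python rather than PIO's Scala, and not taken from PIO itself. `EngineRecord`, `MetadataStore`, and `deploy` are made-up names for illustration, and the in-memory dictionary stands in for whatever shared store would actually be used. The point is only that the command takes a resource id and reads everything else from the store:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EngineRecord:
    """Everything a command needs, kept centrally (hypothetical schema)."""
    resource_id: str
    artifact_uri: str  # where the built engine artifacts live
    variant: str       # which engine variant / parameters to use


class MetadataStore:
    """In-memory stand-in for the shared metadata store (an assumption)."""

    def __init__(self):
        self._records = {}

    def put(self, record):
        self._records[record.resource_id] = record

    def get(self, resource_id):
        try:
            return self._records[resource_id]
        except KeyError:
            raise LookupError(f"unknown resource id: {resource_id}") from None


def deploy(store, resource_id):
    """Stateless deploy: depends only on its arguments and the store,
    not on the current directory, local files, or earlier commands."""
    record = store.get(resource_id)
    # A real command would start a PredictionServer here; this sketch only
    # returns the launch plan it would act on.
    return {"serve": record.artifact_uri, "variant": record.variant}
```

Because `deploy` reads nothing but the store, any freshly provisioned machine that can reach the store could run it with the same result.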
This enables real use cases like provisioning PIO machines and running `pio deploy <resource-id>` to get a new PredictionServer. Provisioning can then be container- and discovery-based rather cleanly.

On Sep 17, 2016, at 5:26 PM, Mars Hall <m...@heroku.com> wrote:

Hello folks,

Great to hear about this possibility. I've been working on running PredictionIO on Heroku https://www.heroku.com

Heroku's 12-factor architecture https://12factor.net prefers "stateless builds" to ensure that compiled artifacts result in processes which may be cheaply restarted, replaced, and scaled via process count & size. I imagine this stateless property would be valuable for others as well.

The fact that `pio build` inserts stateful metadata into a database causes ripples throughout the lifecycle of PIO engines on Heroku:

* An engine cannot be built for production without the production database available. When a production database contains PII (personally identifiable information) which has security compliance requirements, the build system may not be privileged to access that PII data. This also affects CI (continuous integration/testing), where engines would need to be rebuilt in production, defeating assurances CI is supposed to provide.

* The build artifacts cannot be reliably reused. "Slugs" at Heroku are intended to be stateless, so that you can roll back to a previous version during the lifetime of an app. With `pio build` causing database side-effects, there's a greater-than-zero probability of slug-to-metadata inconsistencies eventually surfacing in a long-running system.

From my user perspective, a few changes to the CLI would fix it:

1. add a "skip registration" option, `pio build --without-engine-registration`
2. a new command `pio app register` that could be run separately in the built engine (before training)

Alas, I do not know PredictionIO internals, so I can only offer a suggestion for how this might be solved.
Donald, one specific note,

Regarding "No automatic version matching of PIO binary distribution and artifacts version used in the engine template": The Heroku slug contains the PredictionIO binary distribution used to build the engine, so there's never a version matching issue. I guess some systems might deploy only the engine artifacts to production where a pre-existing PIO binary is available, but that seems like a risky practice for long-running systems.

Thanks for listening,

Mars Hall
Customer Facing Architect
Salesforce App Cloud / Heroku
San Francisco, California

> On Sep 16, 2016, at 10:42, Donald Szeto <don...@apache.org> wrote:
>
> Hi all,
>
> I want to start the discussion of removing engine registration. How many
> people actually take advantage of being able to run pio commands everywhere
> outside of an engine template directory? This will be a nontrivial change on
> the operational side so I want to gauge the potential impact to existing
> users.
>
> Pros:
> - Stateless build. This would work well with many PaaS.
> - Eliminate the "pio build" command once and for all.
> - Ability to use your own build system, i.e. Maven, Ant, Gradle, etc.
> - Potentially better experience with IDE since engine templates no longer
> depend on an SBT plugin.
>
> Cons:
> - Inability to run pio engine training and deployment commands outside of
> engine template directory.
> - No automatic version matching of PIO binary distribution and artifacts
> version used in the engine template.
> - A less unified user experience: from pio-build-train-deploy to build, then
> pio-train-deploy.
>
> Regards,
> Donald