Great to hear about this possibility. I've been working on running PredictionIO
on Heroku (https://www.heroku.com).
Heroku's 12-factor architecture (https://12factor.net) prefers "stateless builds"
to ensure that compiled artifacts result in processes which may be cheaply
restarted, replaced, and scaled via process count & size. I imagine this
stateless property would be valuable for others as well.
The fact that `pio build` inserts stateful metadata into a database causes
ripples throughout the lifecycle of PIO engines on Heroku:
* An engine cannot be built for production without the production database
available. When a production database contains PII (personally identifiable
information) that is subject to security compliance requirements, the build
system may not be privileged to access it. This also affects CI (continuous
integration/testing): an engine that passes CI would need to be rebuilt against
production, defeating the assurances CI is supposed to provide.
* The build artifacts cannot be reliably reused. "Slugs" at Heroku are intended
to be stateless, so that you can roll back to a previous version at any point
during the lifetime of an app. Because `pio build` causes database side effects,
there's a nonzero chance of slug-to-metadata inconsistencies eventually
surfacing in a long-running system.
From my perspective as a user, a few changes to the CLI would fix this:
1. Add a "skip registration" option: `pio build --without-engine-registration`
2. Add a new command, `pio app register`, that could be run separately in the
built engine (before training)
Alas, I do not know PredictionIO internals, so I can only offer a suggestion
for how this might be solved.
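
To make the proposed split concrete, here is a rough Python sketch of the idea.
It is not PredictionIO code: every name in it (`EngineArtifact`, `build`,
`register`, the dict standing in for the metadata database) is hypothetical and
only illustrates a build step with no database side effects followed by a
separate, repeatable registration step.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class EngineArtifact:
    """Hypothetical output of a stateless build: all data needed to register later."""
    engine_id: str
    engine_version: str
    manifest_json: str


def build(engine_id: str, source_files: dict) -> EngineArtifact:
    """Pure build step: derives the manifest from its inputs and touches no database.

    `source_files` maps a relative path (str) to file contents (bytes).
    """
    digest = hashlib.sha1()
    for path in sorted(source_files):
        digest.update(path.encode("utf-8"))
        digest.update(source_files[path])
    version = digest.hexdigest()
    manifest = {"id": engine_id, "version": version, "files": sorted(source_files)}
    return EngineArtifact(engine_id, version, json.dumps(manifest, sort_keys=True))


def register(artifact: EngineArtifact, metadata_store: dict) -> None:
    """Separate step, run in the target environment before training.

    Keyed on (id, version), so registering the same artifact twice is a no-op.
    """
    key = (artifact.engine_id, artifact.engine_version)
    metadata_store[key] = artifact.manifest_json
```

In this sketch the same artifact can be built once (for example in CI, with no
database access) and then registered against whichever metadata store the
target environment uses; rolling back to an older artifact only requires
registering it again.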
Donald, one specific note regarding "No automatic version matching of PIO
binary distribution and artifacts version used in the engine template":
The Heroku slug contains the PredictionIO binary distribution used to build the
engine, so there's never a version matching issue. I guess some systems might
deploy only the engine artifacts to production where a pre-existing PIO binary
is available, but that seems like a risky practice for long-running systems.
Thanks for listening,
Customer Facing Architect
Salesforce App Cloud / Heroku
San Francisco, California
> On Sep 16, 2016, at 10:42, Donald Szeto <don...@apache.org> wrote:
> Hi all,
> I want to start the discussion of removing engine registration. How many
> people actually take advantage of being able to run pio commands everywhere
> outside of an engine template directory? This will be a nontrivial change on
> the operational side so I want to gauge the potential impact to existing
> - Stateless build. This would work well with many PaaS.
> - Eliminate the "pio build" command once and for all.
> - Ability to use your own build system, i.e. Maven, Ant, Gradle, etc.
> - Potentially better experience with IDE since engine templates no longer
> depends on an SBT plugin.
> - Inability to run pio engine training and deployment commands outside of
> engine template directory.
> - No automatic version matching of PIO binary distribution and artifacts
> version used in the engine template.
> - A less unified user experience: from pio-build-train-deploy to build, then