After much thought, and based on the fact that I've been using the 
Heroku/Web2py stack for more than a year now, I think I'll try to build a 
scaffolding app dedicated to Heroku in order to demonstrate exactly how one 
should go about building a web2py-based application on a PaaS cloud with no 
persistent filesystem.
I could also contribute the doc section on Heroku but I'm not quite sure 
I'm senior enough just yet.

That being said, the most difficult issue I've run into so far with 
web2py on Heroku (and one I still haven't solved) is *how to handle 
migration files*.

If you guys can help me do that then it's pretty much all clear down the 
road ;)


So, basically, there are 4 approaches to storing data on cloud systems:

   1. Use a remote bucket (e.g. Amazon S3)
   2. Store data in the database
   3. Postdeploy hooks
   4. VCS (Git/Mercurial...)

Let's go over those:

*1. Bucket*

If you use a library like pyfs, you can replace the filesystem used to 
access local files with a remote system (a bucket). That's very handy for 
instance if you want to manage uploads (e.g. pictures) in web2py using a 
remote storage service like Amazon S3.

HOWEVER, since web2py reads migration files frequently and needs them to 
load fast, any latency added to those reads would drastically hurt your 
app's responsiveness.
Even with a CDN service like CloudFront in front of the bucket, I would 
highly recommend not going for that option.
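For the upload use-case where the bucket approach does shine, here's a toy, 
stdlib-only sketch of the pattern — an in-memory stand-in for what pyfs + S3 
would give you. The class and paths are purely illustrative (not web2py API); 
only the uploadfs hint in the final comment reflects actual web2py usage:

```python
# Toy in-memory "bucket" standing in for a remote store like Amazon S3
# (which you'd reach through pyfs in a real app).
class MemoryBucket:
    def __init__(self):
        self._blobs = {}

    def setcontents(self, path, data):   # pyfs-style write
        self._blobs[path] = bytes(data)

    def getcontents(self, path):         # pyfs-style read
        return self._blobs[path]

bucket = MemoryBucket()
bucket.setcontents("uploads/photo.jpg", b"fake-jpeg-bytes")
print(bucket.getcontents("uploads/photo.jpg") == b"fake-jpeg-bytes")  # True

# In an actual web2py model you'd hand a real pyfs filesystem to an
# upload field instead, e.g.:
#   Field('photo', 'upload', uploadfs=my_s3_fs)
```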

*2. Database*

That is the preferred option so far, and the only one offered in web2py's 
doc.
gluon/contrib/heroku.py uses the UseDatabaseStoredFiles class to do most of 
the heavy lifting here.

Problem is: this class was built with GAE in mind, and it contains several 
GAE-specific code paths (e.g. if gae: ...).
I'm sure it can be improved to fix the inconsistencies that show up when 
the class is used with Heroku (for example this issue: 
https://groups.google.com/forum/#!topic/web2py/w2RJBqKIwRE)

Using the database is a solid option when using an ephemeral filesystem.
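To make the idea concrete, here is a minimal stdlib sketch of 
database-stored files: each migration file lives as a row keyed by its path. 
The table and column names below mirror what I understand web2py's 
convention to be (a web2py_filesystem table), but treat the exact schema as 
an assumption:

```python
import sqlite3

# In-memory DB stands in for Heroku's postgres here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE web2py_filesystem (path TEXT PRIMARY KEY, content BLOB)"
)

def save_file(path, content):
    # Upsert the file's bytes under its path.
    conn.execute(
        "INSERT OR REPLACE INTO web2py_filesystem VALUES (?, ?)",
        (path, content),
    )
    conn.commit()

def read_file(path):
    row = conn.execute(
        "SELECT content FROM web2py_filesystem WHERE path = ?", (path,)
    ).fetchone()
    return row[0] if row else None

save_file("databases/abc123_person.table", b"pickled-table-metadata")
print(read_file("databases/abc123_person.table") == b"pickled-table-metadata")
```

Since the dyno's filesystem is wiped on every restart, keeping the .table 
files in the database is what lets migrations survive redeploys at all.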

*3. Postdeploy hook*

This option hasn't been explored at all so far.

Based on Heroku's doc <https://devcenter.heroku.com/articles/buildpack-api>, 
one can explicitly specify shell commands to run when building an app.

If there were a Python script that could run the migrations (migrate_enabled = 
True, fake_migrate_all = False, lazy_tables = False), you could run it at 
post-deploy time, then disable migrations consistently throughout your 
project without having to worry about how a model change will affect your 
production DB.
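As a sketch of what that could look like: Heroku also supports a release 
process type in the Procfile that runs a command on every deploy, and 
web2py's CLI can execute a script inside an app's environment with models 
loaded (-S app -M -R script). The app name and the migrate.py path below 
are hypothetical:

```
release: python web2py.py -S yourapp -M -R applications/yourapp/private/migrate.py
```

This would sit alongside your existing web: line. The hypothetical 
migrate.py would define the tables with migrate_enabled=True, call 
db.commit(), and exit; your regular models would then ship with migrations 
disabled.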

*4. VCS*

Including .table files in my Git changesets is the solution I'm currently 
using.
It makes sure your local and production environments are on the same page 
migration-wise, and it's a good reminder that your changeset may trigger 
migrations.

I'm not 100% sure it's the best way to do things, and there are times when 
it breaks: if you don't specify explicit names for your migration files, 
you end up with tons of .table files in your changesets because the hashed 
prefix keeps changing (why does it change all the time anyway??).
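As far as I can tell, that hashed prefix is derived from the connection 
URI, which would explain why it differs between a local sqlite setup and 
Heroku's postgres. A stdlib sketch of the naming scheme as I understand it 
(the exact hash web2py/pyDAL uses is an assumption here):

```python
import hashlib

def table_filename(uri, tablename):
    # Hypothetical reconstruction: migration files seem to be named
    # "<hash(connection uri)>_<table>.table", so a different URI
    # (local sqlite vs. Heroku postgres) yields a different prefix.
    uri_hash = hashlib.md5(uri.encode("utf8")).hexdigest()
    return "%s_%s.table" % (uri_hash, tablename)

local = table_filename("sqlite://storage.sqlite", "person")
heroku = table_filename("postgres://user:pw@host/dbname", "person")
print(local.endswith("_person.table"))  # True
print(local != heroku)                  # True: same table, different prefix
```

Passing an explicit filename instead, e.g. 
db.define_table('person', Field('name'), migrate='person.table'), sidesteps 
the prefix entirely and keeps the changesets clean.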


These are the only options I know of, and none of them is fully 
satisfactory so far.

What do you think? What's the best way to handle migrations on a git-based 
deploy system that has an ephemeral filesystem?

-- 
Resources:
- http://web2py.com
- http://web2py.com/book (Documentation)
- http://github.com/web2py/web2py (Source code)
- https://code.google.com/p/web2py/issues/list (Report Issues)
--- 
You received this message because you are subscribed to the Google Groups 
"web2py-users" group.