I'm opening a thread dedicated to website performance issues in web2py, 
giving the answers I've got so far so that others may contribute as well.


Static assets (fonts, vendor CSS & JS)

How do you manage & distribute these ?

Fonts are easiest to manage using Google Fonts 
<https://www.google.com/fonts>. It generates an optimized loading snippet 
for you and serves the font files through Google's CDN.

If at all possible, your static files need to be compiled into one big 
bundle per page (main.css & main.js) so that you can optimize its delivery. 
This can be done using Bower to download the source code and Gulp or Grunt 
to concatenate it into one file.

Otherwise, I'd recommend using the minified version of your vendors and 
using your own CDN for delivery, as publicly available ones aren't very 
reliable in terms of speed.


Dynamic assets (fast-changing CSS & JS, e.g. your main.css & main.js)

The challenge here is minification, variable replacement for JS (if you're 
not using AngularJS), compression & versioning.

Grunt & Gulp can take care of all this through a deployment task system, 
and you can get a manifest in JSON format giving you the versioned filename 
for each of your generated files.
See gulp-rev-all <https://github.com/smysnk/gulp-rev-all> for the 
versioning / manifest generation part.
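
On the server side, the manifest can then be used to map a logical filename to its versioned counterpart. Here is a minimal sketch in plain Python; the manifest content, the '/static/dist/' prefix and the function names are made up for illustration, and the exact shape of the JSON depends on the plugin you use:

```python
import json

# Hypothetical manifest content: original path -> versioned path.
MANIFEST_JSON = (
    '{"css/main.css": "css/main.4f2a9c1b.css",'
    ' "js/main.js": "js/main.91be07d3.js"}'
)


def load_manifest(text):
    """Parse the manifest text into a plain dict."""
    return json.loads(text)


def versioned_path(manifest, name, prefix='/static/dist/'):
    """Return the URL path of the versioned file, falling back to the
    unversioned name when the manifest has no entry for it."""
    return prefix + manifest.get(name, name)


manifest = load_manifest(MANIFEST_JSON)
print(versioned_path(manifest, 'css/main.css'))
```

In a real app you would read the manifest from disk once and call something like versioned_path() from your layout instead of hardcoding filenames.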

Personally, I use Gulp to generate the minified bundles, then Django's 
collectstatic combined with Whitenoise to generate a gzipped & versioned 
file for each of my assets (I can go into details on how to combine 
Django's collectstatic with web2py in another thread if someone is 
interested).


Static images

Google PageSpeed Insights 
<https://developers.google.com/speed/pagespeed/insights> recommends that 
you resize & compress your images as much as can be done.

Compressing is a matter of choosing a higher compression setting when 
exporting your files to .png or .jpg (using Photoshop, GIMP...), as both 
formats have compression built in.
If you need a format other than these, I'd recommend gzipping... but really 
you don't wanna stray too far from PNG & JPG.

As far as I know, resizing has to be done manually since no machine can 
know what the maximum viewable size of your content may be (especially with 
responsive websites). It is a delicate step that requires a lot of human 
input from your web designer.

Compression, on the other hand, can be done automatically with a gulp task 
<https://www.npmjs.com/package/gulp-image-optimization>.


Dynamic images (uploads)

Again, they need to be optimized in terms of size & compression.

Resizing can be easily done in web2py with a computed field (i.e. 
something like Field('image_resized', 'upload', compute=THUMB(200,200), 
uploadfolder='uploads/resized')).

Compression can be done in the compute function I guess, but I confess I 
haven't spent much time on this yet.
The Python Imaging Library, forked as Pillow 
<https://python-pillow.github.io/> these days, has all sorts of functions 
to achieve that.
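
To give an idea of what the resize + compress step could look like with Pillow, here is a rough sketch. The function names, the 200x200 box and the quality value are my own choices for illustration, and wiring it into a web2py compute function (which receives the row) is left out:

```python
def fit_within(width, height, max_width, max_height):
    """Scale (width, height) down to fit the box, keeping the aspect ratio.
    Images already smaller than the box are left unchanged."""
    ratio = min(float(max_width) / width, float(max_height) / height, 1.0)
    return (max(1, int(round(width * ratio))),
            max(1, int(round(height * ratio))))


def make_thumbnail(src_path, dst_path, box=(200, 200), quality=80):
    """Write a resized, recompressed JPEG copy of src_path to dst_path."""
    # Pillow is imported lazily so that fit_within has no dependency on it
    from PIL import Image

    img = Image.open(src_path)
    size = fit_within(img.size[0], img.size[1], box[0], box[1])
    # JPEG has no alpha channel, so convert the mode before saving
    img = img.convert('RGB').resize(size)
    # quality and optimize are Pillow's JPEG save options
    img.save(dst_path, 'JPEG', quality=quality, optimize=True)
    return dst_path
```

A lower quality value gives smaller files at the cost of visible artifacts, so it is worth checking a few sample uploads by eye before settling on a number.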


HTML responses

Once you've compressed static assets, Google PageSpeed will (rightfully) 
nag you about compressing your HTML response itself.

Web2py includes a contribution for HTML minification 
(contrib/minify/html_minify.py) which can be used easily with a decorator.

I've created a small decorator that handles compression:

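import zlib

def deflate(func):
    """Compress the action's output when the client accepts 'deflate'."""
    def _f():
        out = func()
        # Render the view ourselves if the action returned a dict
        render = response.render(out) if isinstance(out, dict) else out
        # The header is None when the client sends no Accept-Encoding
        if 'deflate' in (request.env.http_accept_encoding or ''):
            response.headers['Content-Encoding'] = 'deflate'
            return zlib.compress(render)
        else:
            return render
    return _f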


Using these decorators has to be done in the right order (minification, 
then compression) otherwise you'll run into trouble :)

Here's how to do it :

@deflate   # outermost: compresses whatever minify returns
@minify    # innermost: runs first, on the action's output
def index():
    return dict()


I've managed to reduce the size of my homepage's HTML code from 36.6KB to 
7.6KB using both minification & compression.
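
To see why the order matters, here is a small standalone sketch outside web2py. The minifier works on text while the compressor turns text into opaque bytes, so minification has to come first. Note that naive_minify is a crude stand-in I wrote for this example, not web2py's minifier:

```python
import re
import zlib


def naive_minify(html):
    """Collapse whitespace between tags. A crude stand-in for a real minifier."""
    return re.sub(r'>\s+<', '><', html.strip())


def minify_then_deflate(html):
    """Minify first (text in, text out), then compress (text in, bytes out)."""
    return zlib.compress(naive_minify(html).encode('utf-8'))


page = '<html>\n  <body>\n' + '    <p>Hello</p>\n' * 200 + '  </body>\n</html>\n'
body = minify_then_deflate(page)

# Sizes of the original page, the minified page and the compressed body
print(len(page), len(naive_minify(page)), len(body))
```

The sample page is highly repetitive, so the ratio it prints is far better than what a real page would get.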

Please note that these decorators do mean a CPU overhead for your webserver.
It is up to you to use one or both depending on what kind of hardware your 
server runs on.
There is also the possibility of using a caching strategy to reduce 
processing time. I'd recommend web2py's @cache.action(...) decorator for 
that.


HTML response headers

Now for the most important part: setting the right headers on your 
responses.

Most of the performance-oriented websites nowadays will rely on CDNs to act 
as proxies and cache resources.
This is especially useful for static assets, as these can be served through 
your webserver and then cached in your CDN.

If you use Amazon CloudFront for instance, you can set your website 
(http://your-website.com) as the origin of the distribution, which means 
CloudFront will look for the resource on your webserver once and then 
basically cache it "forever", serving it to your clients from whichever 
CloudFront endpoint is closest to them.

I say "forever" because this behavior can and should be controlled with 
specific headers.
If you're versioning your files (following the above-mentioned advice), 
then you can set far-future cache headers without any risk of stale data. 
Otherwise, you need to be very careful about how long you want to cache 
your data.

Now basically, you want to set 3 headers:


   1. Content-Encoding: how did you compress your data? (e.g. 'gzip', 
   'deflate'). You need to check the Accept-Encoding header in your client's 
   request beforehand to make sure their browser supports the proposed 
   compression algorithm (all browsers nowadays support on-the-fly 
   decompression so this is mainly for old browsers)
   
   2. Content-Type: if you did use a compression algorithm, then you need 
   to specify this header so that your client's browser interprets your 
   data correctly (rather than guessing from its content, which would 
   otherwise be treated as 'application/octet-stream'). Amazon S3 will 
   suggest default values based on the file extension when setting this 
   manually; otherwise web2py has an amazing solution for that: 
   http://web2py.readthedocs.org/en/latest/contenttype.html
   
   3. Cache-Control: this determines how long your client's browser keeps 
   the data, avoiding unnecessary requests to your webserver. Web2py will 
   implicitly set it if you use the @cache.action(...) decorator. To 
   understand the basics of leveraging browser caching, I'd recommend this 
   article: 
   <https://developers.google.com/web/fundamentals/performance/optimizing-content-efficiency/http-caching?hl=en#defining-optimal-cache-control-policy>
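
The three headers can be put together in one small helper. This is only a sketch under my own assumptions: it uses the standard library's mimetypes module as a stand-in for web2py's contenttype helper, the function name and the 300-second fallback are arbitrary, and it ignores q-values in Accept-Encoding:

```python
import mimetypes


def static_headers(filename, accept_encoding, versioned=False, has_gzip=False):
    """Build the three response headers for a static file.

    accept_encoding is the raw Accept-Encoding request header (may be None);
    has_gzip says whether a pre-compressed .gz copy exists on disk.
    """
    headers = {}

    # Content-Type: guessed from the extension, never from the (possibly
    # compressed) bytes.
    content_type, _ = mimetypes.guess_type(filename)
    headers['Content-Type'] = content_type or 'application/octet-stream'

    # Content-Encoding: only when the client announced support for it.
    accepted = [token.split(';')[0].strip().lower()
                for token in (accept_encoding or '').split(',')]
    if has_gzip and 'gzip' in accepted:
        headers['Content-Encoding'] = 'gzip'

    # Cache-Control: far-future only for versioned filenames.
    if versioned:
        headers['Cache-Control'] = 'public, max-age=315360000'
    else:
        headers['Cache-Control'] = 'public, max-age=300'  # arbitrary short lifetime

    return headers


print(static_headers('main.4f2a9c1b.css', 'gzip, deflate',
                     versioned=True, has_gzip=True))
```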



Possible improvements

My main issue with web2py so far is serving static files with custom 
headers.
Web2py's wsgi implements a very basic caching strategy:
if static_file:
    if eget('QUERY_STRING', '').startswith('attachment'):
        response.headers['Content-Disposition'] \
            = 'attachment'
    if version:
        response.headers['Cache-Control'] = 'max-age=315360000'
        response.headers[
            'Expires'] = 'Thu, 31 Dec 2037 23:59:59 GMT'
    response.stream(static_file, request=request)

That code requires programmers to use web2py's versioning system which, no 
offense, isn't even close to Gulp's or Django's.

I've ended up writing my own static file controller as follows :

import os
from gluon.contenttype import contenttype

def serve_static():
    # No session needed for static files
    session.forget(response)
    fullpath = os.path.join(request.folder, 'static', 'dist', *request.args)
    if os.path.isfile(fullpath):
        response.headers['Cache-Control'] = \
            'max-age=315360000, s-maxage=315360000, no-transform, public'
        # Set the type from the original name, before switching to the .gz file
        response.headers['Content-Type'] = contenttype(fullpath)
        accepts_gzip = 'gzip' in (request.env.http_accept_encoding or '')
        if accepts_gzip and os.path.isfile(fullpath + '.gz'):
            # Gzipped version exists and the client can decode it
            fullpath = fullpath + '.gz'
            response.headers['Content-Encoding'] = 'gzip'
        return response.stream(open(fullpath, 'rb'), chunk_size=10**6)
    else:
        raise HTTP(404)

My main issue with it is that there is an unnecessary overhead generated by 
loading models before resolving this function, whereas web2py's default 
static management goes directly through the wsgi.

Possible solutions may be: organizing the models so that none of the other 
models load when serving through this function? Using a dedicated WSGI 
layer such as Whitenoise <http://whitenoise.evans.io/en/latest/>? 
(I've made several attempts but I've never managed to plug it into web2py 
successfully. Any advice on that?)
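
To illustrate the "dedicated WSGI" idea, here is a generic middleware sketch that answers static requests itself and hands everything else to the wrapped application, so no models get loaded for static files. This is not Whitenoise and I haven't tested it inside web2py; the class name, the '/static/dist/' prefix and the root directory are assumptions:

```python
import mimetypes
import os


class StaticMiddleware(object):
    """Serve files under url_prefix from root; pass everything else to app."""

    def __init__(self, app, root, url_prefix='/static/dist/'):
        self.app = app
        self.root = os.path.abspath(root)
        self.url_prefix = url_prefix

    def __call__(self, environ, start_response):
        path = environ.get('PATH_INFO', '')
        if not path.startswith(self.url_prefix):
            return self.app(environ, start_response)

        relative = path[len(self.url_prefix):]
        fullpath = os.path.abspath(os.path.join(self.root, relative))
        # Refuse anything that escapes root (e.g. via '..') or is not a file
        inside = fullpath.startswith(self.root + os.sep)
        if not inside or not os.path.isfile(fullpath):
            start_response('404 Not Found', [('Content-Type', 'text/plain')])
            return [b'Not found']

        content_type = (mimetypes.guess_type(fullpath)[0]
                        or 'application/octet-stream')
        headers = [('Content-Type', content_type),
                   ('Cache-Control', 'public, max-age=315360000')]
        # Prefer a pre-gzipped sibling when the client accepts gzip
        accepts_gzip = 'gzip' in environ.get('HTTP_ACCEPT_ENCODING', '')
        if accepts_gzip and os.path.isfile(fullpath + '.gz'):
            fullpath += '.gz'
            headers.append(('Content-Encoding', 'gzip'))
        with open(fullpath, 'rb') as f:
            body = f.read()
        headers.append(('Content-Length', str(len(body))))
        start_response('200 OK', headers)
        return [body]
```

The idea would be to wrap whatever WSGI callable your deployment exposes, e.g. application = StaticMiddleware(application, '/path/to/static/dist'). It reads each file fully into memory, which is fine for CSS & JS bundles but not for large downloads.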


I'm interested in finding a way to use the gzip compression algorithm 
(instead of deflate) to compress HTML responses. Gzip is more standard 
these days, but as far as I can tell Python's gzip module works on file 
objects, not on strings directly.
Maybe it can be used with temporary files (ugly), streams (less ugly) or 
maybe there's another library that does just that. Do tell if you know 
about any.
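
On the streams idea: gzip.GzipFile accepts any file-like object through its fileobj argument, so an in-memory buffer should do the job without touching the disk. A minimal sketch follows (the function names are mine; on Python 3.2+ there is also gzip.compress(), which does the same in one call):

```python
import gzip
import io


def gzip_bytes(data, level=6):
    """Gzip a byte string entirely in memory and return the compressed bytes."""
    buf = io.BytesIO()
    # GzipFile writes into the in-memory buffer instead of a file on disk
    with gzip.GzipFile(fileobj=buf, mode='wb', compresslevel=level) as gz:
        gz.write(data)
    return buf.getvalue()


def gunzip_bytes(data):
    """Inverse of gzip_bytes, used here only to check the round trip."""
    with gzip.GzipFile(fileobj=io.BytesIO(data), mode='rb') as gz:
        return gz.read()


html = b'<html><body>' + b'<p>Hello</p>' * 100 + b'</body></html>'
compressed = gzip_bytes(html)
print(len(html), len(compressed))
```

In a decorator like the one above, this would replace zlib.compress(), with Content-Encoding set to 'gzip' and the check done against 'gzip' in Accept-Encoding.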
