Hi Benj,

On Thu, Jan 21, 2016 at 6:41 AM, Benj <webko...@gmail.com> wrote:

> Hi,
>
> on my Django project, at one point I resize an image with Pillow before
> associating it with a Django field, then saving the instance.
>
> Is there any benefit in making the image resize / saving to disk async?
>

Yes - although async isn’t the way I’d describe it - I’d describe this as
background task processing.

When a user hits your web server, the server generates a page, and returns
the content to the user. The user’s browser then displays the page. The
user doesn’t see *any* content until the entire page has been generated and
sent back to them. This means that it is critical that the page can be
computed quickly - if it isn’t, the user will observe this as a slow page
load.

So - if you’ve got a time-intensive task, like resizing an image, you’re
generally advised to do that *outside* the request/response loop. You *can*
do it inline (or synchronously), and if you’re dealing with a low traffic
site with users that don’t mind an occasional delay, you might do it inline
just to expedite the development process. But if you’re on a high traffic
site, or if users will notice a delay, you should get any time-expensive
processing out of the page generation process.

There’s another reason to get time-expensive processes out of the page
generation process: server load. Web servers generally have a fixed
capacity - they can only be processing N requests at a time. Some web
servers can *accept* very large numbers of simultaneous connections (nginx,
for example, can handle thousands) - but that’s only *accepting* the
connection - web servers will generally *process* a handful of requests at
a time (often some small multiple of the number of processors).

So - if you have something that takes a non-trivial amount of time to
perform, you will be locking up that web server thread until the processing
is completed. That means you’ve just reduced the number of available web
server threads. That means *everybody else* visiting your website will have
degraded performance, not just the person whose request caused the
time-expensive task. If you’ve got a lot of users who simultaneously request the
same time-expensive view, *nobody* will be able to get a request processed,
because the web server will be tied up doing the time-expensive tasks.
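
To put rough numbers on that, here is a back-of-the-envelope sketch. The
worker count and timings are made-up values for illustration, not
measurements from any real server, and the function name is my own.

```python
def max_requests_per_second(workers: int, seconds_per_request: float) -> float:
    """Best-case throughput when every worker is busy all the time."""
    if workers <= 0 or seconds_per_request <= 0:
        raise ValueError("workers and seconds_per_request must be positive")
    return workers / seconds_per_request


# Hypothetical figures: 4 workers, 50 ms for a normal page,
# 2 s for a page that resizes an image inline.
WORKERS = 4
fast_pages = max_requests_per_second(WORKERS, 0.05)
slow_pages = max_requests_per_second(WORKERS, 2.0)

print(f"normal views: about {fast_pages:.0f} requests/second at best")
print(f"inline resize views: about {slow_pages:.0f} requests/second at best")
```

Under those assumptions, four simultaneous uploads are enough to occupy
every worker for two seconds, during which nobody else gets served.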

The usual approach for this sort of problem (and image processing is the
classic use case) is a worker thread. The user submits their image, the web
server receives it, puts the image onto a work queue, and immediately
responds with a success message acknowledging receipt. In a completely separate
worker thread, images are taken off the queue, processed, and stored. This
means the user gets a fast response, doesn’t drag everyone else down with
them, but the work is still done.
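
The shape of that pattern can be sketched with nothing but the Python
standard library. This is only an illustration: `handle_upload` and
`resize_image` are hypothetical names, the "resize" is a stand-in for real
Pillow work, and a real deployment would normally run the worker in a
separate process (Celery, RQ, ...) rather than in a thread inside the web
server.

```python
import queue
import threading

work_queue = queue.Queue()   # jobs waiting to be processed
results = {}                 # image_id -> processed bytes


def resize_image(data: bytes) -> bytes:
    """Stand-in for the slow step; real code might call Pillow here."""
    return data[: len(data) // 2]


def worker() -> None:
    """Pull jobs off the queue forever and process them one at a time."""
    while True:
        image_id, data = work_queue.get()
        try:
            results[image_id] = resize_image(data)
        finally:
            # Tell the queue this job is finished, even if it failed.
            work_queue.task_done()


def handle_upload(image_id: str, data: bytes) -> dict:
    """Runs inside the request: enqueue the job and return straight away."""
    work_queue.put((image_id, data))
    return {"status": "accepted", "image_id": image_id}


# Start one background worker; daemon=True lets the program exit freely.
threading.Thread(target=worker, daemon=True).start()
```

The request handler never touches the slow function; it only calls
`work_queue.put()`, which returns almost immediately.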

This obviously imposes some extra overhead on your code - in your example,
you can’t assume the image exists, so you have to put in fallback
mechanisms when the user requests a page where the image needs to be
displayed.
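
One possible fallback, sketched under assumptions: the placeholder URL and
the function name below are hypothetical, and checking the filesystem is
just one way to find out whether the worker has finished (a status column
on the model would work too).

```python
import os

# Hypothetical placeholder shown while the resized image is not ready yet.
PLACEHOLDER_URL = "/static/img/processing.png"


def display_url(resized_path: str, resized_url: str) -> str:
    """Return the resized image's URL if its file exists, else a placeholder."""
    if resized_path and os.path.exists(resized_path):
        return resized_url
    return PLACEHOLDER_URL
```

The page template would then render whatever `display_url()` returns, so
a visitor who arrives before the worker has finished sees the placeholder
instead of a broken image.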

To implement this sort of feature, you need to have a worker queue - Celery
is the heavy duty answer for this; if you just need a cheap and cheerful
answer, RQ is a fairly easy-to-use option, or you can roll-your-own in the
database without too much trouble.
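
For the roll-your-own route, a minimal sketch might look like the
following. It uses `sqlite3` from the standard library purely so the
example is self-contained; in a Django project you would more likely use a
model. The table and function names are invented for illustration, and
`claim_next` assumes a single worker - with several workers you would need
proper locking.

```python
import sqlite3


def create_queue(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) job table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS resize_jobs ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " image_path TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'pending')"
    )
    conn.commit()


def enqueue(conn: sqlite3.Connection, image_path: str) -> int:
    """Called from the web request: record the job and return its id."""
    cur = conn.execute(
        "INSERT INTO resize_jobs (image_path) VALUES (?)", (image_path,)
    )
    conn.commit()
    return cur.lastrowid


def claim_next(conn: sqlite3.Connection):
    """Called from the worker: take the oldest pending job, or None."""
    row = conn.execute(
        "SELECT id, image_path FROM resize_jobs"
        " WHERE status = 'pending' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    conn.execute("UPDATE resize_jobs SET status = 'running' WHERE id = ?", (row[0],))
    conn.commit()
    return row


def mark_done(conn: sqlite3.Connection, job_id: int) -> None:
    """Called from the worker once the resized image has been stored."""
    conn.execute("UPDATE resize_jobs SET status = 'done' WHERE id = ?", (job_id,))
    conn.commit()
```

A worker script would then loop: call `claim_next()`, do the resize, call
`mark_done()`, and sleep briefly whenever there is nothing to claim.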

As an aside - when web developers talk about “asynchronous” behaviour, they
are generally referring to things like chat clients. This is a situation
where the server is able to send data back to the client at will. “Classic
web” is client-driven; user requests a page, server provides it.
“Asynchronous web” is a different mode of operation, where the user
requests a specific thing which isn’t available yet; server provides it
when it is available. This approach *could* be used for something like
image processing, but it isn’t something Django is well set-up to manage at
present.


> I heard that since django is a non async framework, there is no interest
> in doing so. But is there ?
>

I’m not sure exactly what you’ve heard - but it sounds like whoever told
you was misinformed.

Django doesn’t *currently* have any built-in asynchronous tools, but that
doesn’t mean we don’t want to add them, or that there aren’t options for
doing asynchronous work right now. There are a couple of patches, in
various stages of development, including one that just received a large
grant from Mozilla, that will add various flavours of asynchronous handling
to Django.

Yours,
Russ Magee %-)

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.