[google-appengine] Re: Performance issues

Eric Walker Sat, 18 Apr 2009 00:28:05 -0700

Thanks for the suggestion.  I'll try #3 as a first approach and see
where that takes me (and look forward to #5).  I think app-engine-
patch has a main() declared, and I assume the app is getting cached as
a result, but I'm not sure if that's the case.

Re the file caching idea, I don't understand the present environment
enough to do more than guess about how this might be done, so this is
just brainstorming.

I assume that there are two different factors that are going into the
longer request times that I've been seeing in my own case.  There's
the zipimporter call, which decompresses the Django 1.0 archive that I
keep in a subdirectory of my app.  I imagine this takes a while each
time it happens, and it might account for much of the request time.
Since zipimporter isn't called with every request and some requests
are pretty quick, I assume in those instances I'm hitting a server on
which the needed python symbols were resolved during a previous
request.  Hence the application was probably memory resident, and the
zipimporter call wasn't needed.  In other instances it's possible that
the application had to be reloaded from the network, which might
account for any time above that necessary for the zipimporter call.
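
One way to check the guess about zipimporter would be to time a cold import from an archive against a repeat import that is served from memory.  The following is only a local sketch under that assumption: fakelib is a made-up one-line module standing in for the Django archive, and the numbers from a desktop machine say nothing definite about the App Engine servers.

```python
import importlib
import os
import sys
import tempfile
import time
import zipfile


def build_archive(directory, module_name="fakelib"):
    """Write a one-module zip archive and return its path."""
    path = os.path.join(directory, module_name + ".zip")
    with zipfile.ZipFile(path, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(module_name + ".py", "VALUE = 42\n")
    return path


def timed_import(module_name):
    """Import a module and return (module, seconds taken)."""
    start = time.perf_counter()
    module = importlib.import_module(module_name)
    return module, time.perf_counter() - start


with tempfile.TemporaryDirectory() as tmp:
    archive_path = build_archive(tmp)
    sys.path.insert(0, archive_path)         # zipimport handles .zip path entries
    importlib.invalidate_caches()
    module, cold = timed_import("fakelib")   # first import: read and decompress
    _, warm = timed_import("fakelib")        # repeat import: found in sys.modules
    sys.path.remove(archive_path)
    print("cold %.6fs, warm %.6fs, VALUE=%d" % (cold, warm, module.VALUE))
```

The warm case corresponds to hitting a server where the application is still memory resident; the cold case is the minimum extra cost before any network load is counted.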

If I remember correctly, the reason the Django files are being
archived is that there's a 1000 file limit on what can be uploaded (in
a single upload?).  Using an archive instead of an uncompressed
directory is a way to get around this limit.

What I had in mind in the postscript was something like a distributed,
shared memory for the cloud as a whole that doesn't know or care about
the actual data that is being shared (python or Java files) or where
it comes from (application developers).  If 1000 applications request
a file often enough (e.g., a core Django library file), it would be
pushed out to the servers in such a way that symbols could be loaded
by the python interpreter from memory rather than from a file on disk
or over the network.  We would know how often a file was being
requested by taking a strong cryptographic checksum of the contents of
each file in an application the first time open() is called on the
file after a new version of the application has been
uploaded.  Perhaps such an open() call would happen during a python
import.  Once the checksum has been obtained, the usage of the file
could then be tracked across applications and across requests, and it
could be checked against a large table of checksums on which
statistics are being maintained.  Although some of this might happen
during the request, I think a lot of it could be deferred until
later.  Since we're working at a pretty low level, hopefully it
wouldn't matter if the original source of a file was an archive rather
than an uncompressed directory; one issue, though, is that the
uncompressed files from an archive will obviously be (much) larger
than the archive itself and will require more memory.  The problem
with making use of the archive itself is that the zipimporter call
seems to cost a lot, but perhaps it is small compared to a network
call.
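
The bookkeeping I have in mind could be sketched roughly as follows.  This is brainstorming in code form, not a description of anything App Engine does: ChecksumTable, record_open and PUSH_THRESHOLD are hypothetical names, SHA-256 is just one choice of strong checksum, and a real table would be distributed rather than a dict in one process.

```python
import hashlib

PUSH_THRESHOLD = 3  # hypothetical: distinct apps needed before a file is "hot"


class ChecksumTable:
    """Track which applications have opened files with identical contents."""

    def __init__(self):
        self.apps_by_digest = {}  # hex digest -> set of application ids

    def record_open(self, app_id, content):
        """Hash the file contents once and note that app_id uses them."""
        digest = hashlib.sha256(content).hexdigest()
        self.apps_by_digest.setdefault(digest, set()).add(app_id)
        return digest

    def hot_digests(self):
        """Digests shared by enough apps to be worth pushing into memory."""
        return sorted(
            digest
            for digest, apps in self.apps_by_digest.items()
            if len(apps) >= PUSH_THRESHOLD
        )


table = ChecksumTable()
common = b"# a core Django library file\n"
for app in ("app-a", "app-b", "app-c"):
    table.record_open(app, common)
table.record_open("app-a", b"# app-a's own views\n")
print(table.hot_digests())
```

Only the digest and a count need to be stored, so the table never has to know what the file is or who uploaded it.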

In this setup, if 1000 applications use a common subset of Django
files (0.96, 1.0 or whatever), the total server memory across the
cloud devoted to those files at any given time wouldn't necessarily be
less than in the current setup, since some or all of the subset will
have been pushed out.  But it wouldn't be necessary to push out a
single application in its entirety as often, since some of the files
might already be in memory.  I'm not sure how this would affect other
applications, e.g., how much it would reduce the need to recycle
memory currently being used for other applications when there is a new
request.  But hopefully it would result in a more efficient use of
existing resources.  There would be an incentive for people to use the
same libraries so that their applications would be more responsive,
and perhaps a pricing signal could be devised that would encourage
this.

There would be a need to ensure that the files that are understood to
be shared really are the same files, so that someone isn't able to
inject a malicious file that is then used by a number of other
applications.  If A uploads a common file that is profiled and then
gets pushed out and B's application is to make use of it, it would be
important for A's version of the file to be the same as B's version,
so that B's application isn't compromised.  I was thinking that a
strong cryptographic checksum might be adequate for this, but I'm sure
there are challenges I'm not aware of.  Something like MD5 probably
wouldn't be suitable, since I understand there are ways to make
different files that have the same MD5 hash.  Perhaps including a
check of the file size would be enough.  Computation time needed to
derive the hash would be an important factor.
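
As a sketch of the check described above, a shared copy could be keyed on both a SHA-256 digest and the file size, with a byte-for-byte comparison the first time B's version is seen.  SharedFileCache, publish and fetch are hypothetical names, and this is only one possible arrangement.  As far as I know, the published MD5 collisions are pairs of files of equal length, so a size check alone would probably not rescue MD5.

```python
import hashlib


def fingerprint(content):
    """Return (SHA-256 hex digest, size in bytes) for the given contents."""
    return hashlib.sha256(content).hexdigest(), len(content)


class SharedFileCache:
    """Hold one shared copy per fingerprint (hypothetical design)."""

    def __init__(self):
        self._by_fingerprint = {}

    def publish(self, content):
        """Store the first copy seen for a fingerprint and return the key."""
        key = fingerprint(content)
        self._by_fingerprint.setdefault(key, bytes(content))
        return key

    def fetch(self, own_content):
        """Return the shared copy only if it matches the caller's own file."""
        shared = self._by_fingerprint.get(fingerprint(own_content))
        if shared is not None and shared == own_content:
            return shared
        return None


cache = SharedFileCache()
cache.publish(b"def safe(): return 1\n")          # A's upload
print(cache.fetch(b"def safe(): return 1\n"))     # B has the same file
print(cache.fetch(b"def evil(): return 2\n"))     # different contents
```

The byte comparison costs a full read, but it would only be needed once per uploaded version, not on every request.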

You guys are obviously facing a pretty challenging problem.  Best of
luck in this regard.

Eric

On Apr 16, 2:49 pm, "Jeff S (Google)" <[email protected]> wrote:
> Hi Eric,
>
> I'm not sure if you've seen the documentation on app caching, but it
> seems like this is related to the questions you have about the initial
> load time:
>
> http://code.google.com/appengine/docs/python/runtime.html#App_Caching
>
> Of the five ideas that you listed, all of them would probably have
> some positive improvement but I'd personally lean towards 3 and/or 5
> (profiling + inline imports and wait for improvements). I think these
> would be the least disruptive to the code you've already written and,
> as you mention, the initial startup cost should affect a smaller
> percentage of requests as your app sees more usage. It is really up to
> you though.
>
> Your idea about speeding up access to commonly accessed files is
> interesting, and I think we may already do something similar (also
> described under app caching). I'm interested in hearing more, please
> elaborate :-)
>
> Happy coding,
>
> Jeff
>
> On Apr 14, 7:29 pm, Eric Walker <[email protected]> wrote:
>
> > I apologize for the recent spate of posts.  I don't think I'll need
> > much attention once I get going.  This is partly a Django-specific
> > question and partly a general question, so I'll ask it here rather
> > than on the python list.
>
> > I'm seeing latencies of 2000 to almost 3000 milliseconds with a small
> > GAE application in connection with requests that are reading a short
> > list from memcache.  I don't think times like these are going to work
> > for me, and I'm trying to figure out how to proceed from here.
>
> > I'm using Django 1.0.2 together with app-engine-patch 1.0.  There's a
> > possibility that I'm doing something wrong, but after reading a number
> > of posts, the impression I get is that this is not something that is
> > out of the usual for this setup.  The problem seems to be due to a
> > zipimporter call that is made to load Django.  Logs that don't include
> > this call are as low as 800 ms; not an ideal time, but one I think I
> > can live with.
>
> > I understand from what I've read that the high latency is due to the
> > entirety of the application image being loaded with each request,
> > either because the request hits a different server or because a server
> > that previously served a request has flushed the application from
> > memory in the intervening time.  If I've understood things correctly,
> > the application would be loaded into memory on certain of the servers
> > and response times would go down if there were to be a sustained
> > increase in traffic.  But since the traffic to my application is very
> > low right now, this doesn't happen.
>
> > Following are some possible ways forward, and I'm wondering whether
> > anyone has any thoughts on which of them might be the best way to go:
>
> > 1. Back-port the application to Django 0.96, which I understand is
> > memory-resident on the GAE servers, on the assumption that things
> > would significantly speed up given my application's traffic profile.
> > I don't think I've used too many Django 1.0 features at this point, so
> > hopefully this wouldn't be difficult.
>
> > 2. Stick with the current setup, which I like in other respects, and
> > remove the unused portions of the Django sources, as is suggested in
> > http://code.google.com/appengine/articles/django10_zipimport.html.
>
> > 3. Stick with the current setup, profile the application carefully and
> > do things like inline import statements into application code and use
> > Django templates more carefully, on the assumption that the problem
> > arises because I'm not using Django efficiently rather than as a
> > result of the zipimport call that is being made.
>
> > 4. Port the application over to a lighter-weight framework.
>
> > 5. Live with 2000+ ms latencies for now, since the App Engine
> > infrastructure will probably handle a case like this more optimally in
> > the medium term.
>
> > In a different connection, one question I had for the Google folks as
> > I was thinking through this issue was whether it might not make sense
> > to keep a cryptographically secure hash of each file for which a
> > system call is made (e.g., in connection with an import statement in
> > python).  When a given hash is seen a lot, the file could be pushed
> > out onto the servers into read-only memory and a copy-on-write done as
> > necessary.  This was just a thought.  There might be some security
> > issues that would prevent such an approach.
>
>