[appengine-java] Re: High latency issue

Steve Pritchard Mon, 22 Feb 2010 14:29:27 -0800

This reply is really to the Google folks.

This issue has come up in a number of threads, some blaming DataNucleus
and others reporting timeouts during startup.

I will soon deploy my app and I have the same concerns because it will
not have a high hit rate.  Unless it is solved I can foresee many apps
unable to deploy because the response times will be erratic and will
not meet the user's level of acceptance.

The design of Java is such that it 'learns' as it warms up.  Classes
get loaded, byte-code gets compiled into machine code and Singletons
get initialized.  The standard design of web applications is also to
cache many of their computed values during this warm-up.  Once a JDO
implementation such as DataNucleus is thrown into the mix, the warm-up
takes even longer, to the point where it is noticeable.  Speeding up
the warm-up would help, but it does not really solve the latency issue.
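
To make the warm-up cost concrete, here is a minimal Java sketch.  It
is only an illustration: ExpensiveService, evict() and the cache
contents are hypothetical names I made up, standing in for something
like a JDO factory and for the container discarding an idle instance.
It is not how App Engine or DataNucleus is actually implemented.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an expensive, lazily built singleton
// (for example, a JDO persistence factory). Not a real library class.
final class ExpensiveService {
    private static ExpensiveService instance;   // built on first use only
    static int initCount = 0;                   // how many times warm-up ran

    private final Map<Integer, Long> cache = new HashMap<>();

    private ExpensiveService() {
        initCount++;
        // Simulated warm-up work: precompute values that requests reuse.
        for (int i = 0; i < 1000; i++) {
            cache.put(i, (long) i * i);
        }
    }

    static synchronized ExpensiveService get() {
        if (instance == null) {
            instance = new ExpensiveService();  // the "unlucky" request pays here
        }
        return instance;
    }

    // Assumption: models the container throwing out an idle application.
    static synchronized void evict() {
        instance = null;
    }

    // Valid keys in this sketch are 0..999.
    long handleRequest(int key) {
        return cache.get(key);
    }
}

final class WarmUpDemo {
    public static void main(String[] args) {
        long t0 = System.nanoTime();
        ExpensiveService.get().handleRequest(7);   // cold: includes warm-up
        long t1 = System.nanoTime();
        ExpensiveService.get().handleRequest(7);   // warm: cache already built
        long t2 = System.nanoTime();
        System.out.println("cold ns: " + (t1 - t0) + ", warm ns: " + (t2 - t1));
    }
}
```

Every call to evict() forces the next request to repeat the warm-up,
which is the effect described above.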

In essence the issue is that the App gets thrown out by the Google app
manager because it is deemed to be idle.  Then when it needs to
service another request it gets rolled back in, somewhere. The
'somewhere' notion is great as it makes the whole Google App Engine
concept that Google is building feasible. The trouble is that the App
has to warm-up again.
This can happen even in a high-volume App because somebody will be the
unlucky user that has to warm the App up.

My background includes IBM's MVS of 35 plus years ago when they first
introduced the concept of Virtual Storage and page faults.  As an
optimization at this time they also introduced the notion of a
'Working Set'.  That is, when an application was idle they threw the
whole thing out on disk but noted its 'Working Set'.  When it was
restarted they bulk loaded (cheaper) the working set and so the
application worked in a reasonable manner.
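
The working-set idea can be sketched as a toy model in Java.  This is
my own simplification, not MVS code: PagedApp and its method names are
hypothetical, and "pages" are just integers.  It counts single-page
faults separately from bulk loads to show what the saved working set
buys.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

// Toy model of demand paging with a recorded working set.
// All names here are illustrative assumptions.
final class PagedApp {
    private final Set<Integer> resident = new HashSet<>();          // pages in memory
    private final Set<Integer> workingSet = new LinkedHashSet<>();  // pages touched since last swap-out
    int faults = 0;      // single-page loads caused by a miss
    int bulkLoads = 0;   // whole-working-set loads

    // Access a page; a miss costs one page fault.
    void touch(int page) {
        workingSet.add(page);
        if (resident.add(page)) {
            faults++;
        }
    }

    // Swap the application out and return a copy of its working set.
    Set<Integer> swapOut() {
        Set<Integer> saved = new LinkedHashSet<>(workingSet);
        resident.clear();
        workingSet.clear();
        return saved;
    }

    // Swap in by loading the saved working set in one operation.
    void swapInWithWorkingSet(Set<Integer> saved) {
        resident.addAll(saved);
        bulkLoads++;
    }
}
```

With the saved set preloaded, touching the same pages again causes no
new faults; without it, every page faults in one at a time.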

I am wondering if such a notion could be used to really solve this
problem.  I can only guess at the underlying structure of the servers
running the Google Web App.  Here is my guess with a potential
solution.

Guess:
64Bit Servers running some sort of VMWare.
32bit Linux used as the host operating system.

Potential solution:
(1) Configure the server's VMWare to run n images of Linux called
Linux-1 through Linux-N, generically referred to as Linux-n.
(2) When an App is first started it is run on Linux-n.  The IP address
settings are virtualized and the VMWare maps them to the physical
address for its physical server.  The same for the 'mounted drives' if
necessary and any other interfaces that need to be externalized.
(3) The Page Table for the Linux-n image uses huge Page Table entries,
not the 4K pages that Linux uses by default.
(4) When it is thrown out the Page Table entries are copied to disk.
It might be possible to only write the modified pages.
(5) When an App needs to run, a server is found that has the Linux-n
image available.  It is restarted by bulk loading the whole image.
(6) The App will thus run in its Warmed-up-state.
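
The roll-out and roll-in steps above, including writing only the
modified pages, can be sketched as follows.  This is a simulation under
my own assumptions: ImageSnapshot is a hypothetical class, the "disk"
is an in-memory map, and the 4096-byte page size is only illustrative.
It says nothing about how VMWare or Google's servers really work.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Simulated memory image with dirty-page tracking (illustrative only).
final class ImageSnapshot {
    static final int PAGE_SIZE = 4096;   // assumed page size for the sketch
    private final byte[][] pages;        // the image's memory
    private final BitSet dirty;          // pages modified since last roll-out
    private final Map<Integer, byte[]> disk = new HashMap<>();  // stand-in for disk

    ImageSnapshot(int pageCount) {
        pages = new byte[pageCount][PAGE_SIZE];
        dirty = new BitSet(pageCount);
    }

    void write(int page, int offset, byte value) {
        pages[page][offset] = value;
        dirty.set(page);
    }

    byte read(int page, int offset) {
        return pages[page][offset];
    }

    // Roll out: copy only the modified pages to disk; returns pages written.
    int rollOut() {
        int written = 0;
        for (int p = dirty.nextSetBit(0); p >= 0; p = dirty.nextSetBit(p + 1)) {
            disk.put(p, pages[p].clone());
            written++;
        }
        dirty.clear();
        return written;
    }

    // Roll in: wipe memory (as if evicted), then restore all saved pages in one pass.
    void rollIn() {
        for (byte[] page : pages) {
            Arrays.fill(page, (byte) 0);
        }
        for (Map.Entry<Integer, byte[]> e : disk.entrySet()) {
            System.arraycopy(e.getValue(), 0, pages[e.getKey()], 0, PAGE_SIZE);
        }
        dirty.clear();
    }
}
```

A second rollOut() with no intervening writes copies nothing, which is
the saving from writing only the modified pages.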

Notes:
(1) Linux, I assume to be generic, makes no use of Intel's wonderful
GDT, LDT design.  Instead, it plays with CR3 which points to the Page
Table.  This makes the Page Table mechanism work but it is not
optimized for Intel's design.  It also means the application deals
with Page Faults as encountered and there is no way to pre-load pages
(working set notion).

(2) Potentially the 64 Bit VMWare can define Page Table entries that
are very large.  That is what 64bit machines brought to the table.
This could be used to emulate a 'working set'.  This is not off-the-
shelf VMWare and so it would need some custom tweaks.

(3) It might be possible to share the Linux and JVM classes between
Linux-n images and thus reduce the footprint of what gets rolled out
and in.

(4) My guesses could be way off and this whole strategy out to lunch.

Steve Pritchard

On Feb 22, 12:41 pm, Soichi Hayashi <[email protected]> wrote:
> Hi. I just deployed one of my first GAE applications, and I noticed
> immediately that it has an unacceptably high latency issue. My
> application uses Ajax for certain parts of the app, and something that
> should take less than 100 milliseconds (on our Tomcat container) is
> often taking as much as 6.5 or more for the content to be returned
> in GAE.
>
> Is there a problem with my account? Or is this perhaps the expected
> level of performance for non-paid users who are test driving Google
> App Engine (and will it improve once we pay the minimum fee)?
>
> Thank you,
> Soichi
