On 16 Oct 2006, at 4:29, Shane Ambler wrote:
Harvell F wrote:
Getting back to the original posting, as I remember it, the
question was about seldom changed information. In that case, and
assuming a repetitive query as above, a simple query results cache
that is keyed on the passed SQL statement string and that simply
returns the previously cooked result set would be a really big
I believe the main point that Mark made was the extra overhead is
in the sql parsing and query planning - this is the part that
postgres won't get around. Even if you set up simple tables for
caching it still goes through the parser and planner and loses the
benefits that memcached has. Or you fork those requests before the
planner and lose the benefits of postgres.
The main benefit of using memcached is to bypass the parsing and
That was the basis of my suggestion to just use the passed query
string as the key. No parsing or processing of the query, just a
simple string match.
You will find there is more to SQL parsing than you first think. It
needs to find the components that make up the SQL statement (tables,
column names, functions), check that they exist and can be used in
the context of the given SQL, and check that the given data matches
the context it is to be used in. It needs to check that the current
user has enough privileges to perform the requested task. Then it
locates the data, whether it be in the memory cache, on disk or in
an integrated version of memcached; this also includes checks to
make sure another user hasn't locked the data to change it and
whether more than one version of the data exists (committed and
uncommitted). Only then does it send the results back to the
requesting client.
The user permissions checking is a potential issue but again, for
the special case of repeated queries by the same user (the webserver
process) for the same data, matching on the original query string
_and_ the original query user would still be very simple.
The big savings from having the simple results cache would be the
elimination of the parsing, planning, locating, combining, and
sorting of the result set.
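
To make the idea concrete, here is a minimal sketch of such a results cache in Python. It is only an illustration of the lookup, not PostgreSQL code: the ResultCache class, its execute callback, and the hit/miss counters are all hypothetical names. The key is the exact (user, SQL text) pair, so nothing is parsed on a hit.

```python
from typing import Callable, Dict, List, Tuple

Row = Tuple
Key = Tuple[str, str]  # (user, exact SQL text)


class ResultCache:
    """Hypothetical results cache keyed on the user and the raw SQL string."""

    def __init__(self, execute: Callable[[str, str], List[Row]]) -> None:
        # `execute` stands in for the full parse/plan/execute path (assumption).
        self._execute = execute
        self._results: Dict[Key, List[Row]] = {}
        self.hits = 0
        self.misses = 0

    def query(self, user: str, sql: str) -> List[Row]:
        key = (user, sql)
        cached = self._results.get(key)
        if cached is not None:
            # Plain dictionary lookup: no parsing, planning or sorting.
            self.hits += 1
            return cached
        # Miss: take the normal (expensive) path once and keep the result.
        self.misses += 1
        rows = self._execute(user, sql)
        self._results[key] = rows
        return rows


def slow_path(user: str, sql: str) -> List[Row]:
    # Stand-in for the real database work; returns a fixed result set.
    return [("headline", 1)]


cache = ResultCache(slow_path)
for _ in range(1000):
    cache.query("webserver", "SELECT title, id FROM news WHERE day = '2006-10-16'")
```

Because the match is on the literal string, the same query with different whitespace or letter case would count as a separate entry. That is the price of skipping the parser entirely.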
I don't believe normal locking plays a part in the cache (there
are basic cache integrity locking issues though), nor do
versioning or commit states, beyond the invalidation of the cache
upon a commit to a referenced table. It may be that the invalidation
needs to happen whenever a table is locked as well. (The hooks for
the invalidation would be done during the original caching of the
I know that the suggestion is a very simple-minded one and
is limited to a very small subset of the potential query types and
interactions; however, at least for web applications, it would be a
very big win. Many websites want to display today's data on their
webpage and have it change as dates change (or as users change). The
data in the source table doesn't change very often (especially
compared to the traffic of a popular website) and the number of times
that the exact same query could be issued between changes can run into
the hundreds of thousands or more. Putting even this simple results
cache into the database would really simplify the programmer's life
and improve reliability (and the use of PostgreSQL).
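
The invalidation on commit mentioned earlier can be sketched the same way. This is again a hypothetical illustration in Python, not a description of anything PostgreSQL does: the InvalidatingCache class and its store, lookup and on_commit methods are invented names, and it assumes the set of tables a query references is known at the time its result is first cached.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Set, Tuple

Row = Tuple
Key = Tuple[str, str]  # (user, exact SQL text)


class InvalidatingCache:
    """Hypothetical results cache that drops entries when a table changes."""

    def __init__(self) -> None:
        self._results: Dict[Key, List[Row]] = {}
        # Reverse index: table name -> cache keys whose result depends on it.
        self._by_table: Dict[str, Set[Key]] = defaultdict(set)

    def store(self, user: str, sql: str,
              tables: Iterable[str], rows: List[Row]) -> None:
        # The invalidation hooks are registered when the result is cached.
        key = (user, sql)
        self._results[key] = rows
        for table in tables:
            self._by_table[table].add(key)

    def lookup(self, user: str, sql: str) -> Optional[List[Row]]:
        return self._results.get((user, sql))

    def on_commit(self, changed_tables: Iterable[str]) -> int:
        """Drop every cached result that references a changed table.

        Returns the number of entries removed.
        """
        dropped = 0
        for table in changed_tables:
            for key in self._by_table.pop(table, set()):
                if self._results.pop(key, None) is not None:
                    dropped += 1
        return dropped


cache = InvalidatingCache()
cache.store("webserver", "SELECT * FROM news", ["news"], [("a",)])
cache.store("webserver", "SELECT * FROM news JOIN users USING (uid)",
            ["news", "users"], [("b",)])
cache.store("webserver", "SELECT * FROM users", ["users"], [("c",)])
removed = cache.on_commit(["news"])  # only the two news-dependent entries go
```

Between commits to the referenced tables every repeat of the query is a dictionary lookup. A commit costs one pass over the entries that depend on the changed table.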