I've looked at PREPARE, but apparently it only lasts per-session - that's worthless in our case (web based service, one connection per data-requiring connection).
You don't use persistent connections???
Your problem might simply be the connection time overhead (also including a
Tatsuo Ishii [EMAIL PROTECTED] writes:
First, it's not a particular problem with pgpool. As far as I know any
connection pool solution has exactly the same problem. Second, it's
easy to fix if PostgreSQL provided functionality such as: drop all
temporary tables, if any.
I don't like that
I don't know what exactly you are referring to in the above URL
when you talk about potential pitfalls of pooling.
Please explain more.
Sorry, I wasn't implying that pgpool doesn't deal with the issues, just that
some people aren't necessarily aware of them up front. For instance,
On Fri, Oct 01, 2004 at 10:10:40AM -0700, Josh Berkus wrote:
Transparent query caching is the industry standard for how these things are handled. However, Postgres' lack of this feature has made me consider other approaches, and I'm starting to wonder if the standard query caching --
pgpool (which makes some rather questionable claims IMO); any decent web application language/environment will support connection pooling.
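The kind of application-side pooling being described can be sketched in a few lines of Python; the `connect` callable here is a stand-in for whatever the real DB driver would provide, so this is just the shape of the idea, not any particular library's API:

```python
import queue

class ConnectionPool:
    """Minimal fixed-size connection pool (a sketch; a real app would use
    its framework's pooling, or something like pgpool, instead)."""

    def __init__(self, connect, size=5):
        self._pool = queue.Queue(maxsize=size)
        for _ in range(size):
            self._pool.put(connect())  # pay the connection cost once, up front

    def acquire(self):
        return self._pool.get()        # blocks if every connection is in use

    def release(self, conn):
        self._pool.put(conn)

# Usage with a dummy connect(); a real app would call its DB driver here.
pool = ConnectionPool(connect=lambda: object(), size=2)
conn = pool.acquire()
pool.release(conn)
```

The point of the thread applies directly: once connections are reused like this, per-request connect overhead disappears and plan/result caching becomes worth measuring.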
That's why it should not be tied to something as specific as pgpool.
If you want performance, which is the case here, you usually have a webserver serving
1) The materialized data is available in 3 different forms: a list, a detail view, and a spreadsheet. Each form has somewhat different columns and different rules about ordering, which would likely confuse an SQC planner.
In this implementation, all 3 forms are able to share the same cache.
If it were in pgpool or something similar, I could devote a separate machine just to caching results, leaving the db server untouched.
BUT you would be limited to caching complete queries. There is a more
efficient strategy...
More to the point though, I think this is a feature that really really
should be in the DB, because then it's trivial for people to use.
How does putting it into PGPool make it any less trivial for people to
use?
The answers are at
Aaron,
I'm not sure I understand your req fully.
I'm not surprised. I got wrapped up in an overly involved example and completely left off the points I was illustrating. So here are the points, in brief:
1) Query caching is not a single problem, but rather several different
problems
Josh Berkus wrote:
1) Query caching is not a single problem, but rather several different
problems requiring several different solutions.
2) Of these several different solutions, any particular query result caching
implementation (but particularly MySQL's) is rather limited in its
William,
Just my 2 cents on this whole issue. I would lean towards having result caching in pgpool versus the main backend. I want every ounce of memory on a database server devoted to the database. Caching results would double the effect of cache flushing ... i.e., now both the results and the
On Fri, Oct 01, 2004 at 06:43:42AM +0100, Matt Clark wrote:
If you're not using a connection pool of some kind then you might as
well forget query plan caching, because your connect overhead will swamp
the planning cost. This does not mean you have to use something like
pgpool (which makes
OK, that'd work too... the point is if you're re-connecting
all the time it doesn't really matter what else you do for
performance.
Yeah, although there is the chap who was asking questions on the list
recently who had some very long-running code on his app servers, so was best
off closing
[ discussion of server side result caching ]
and let's not forget PG's major fork it will throw into things: MVCC.
The results of query A may hold true for txn 1, but not txn 2, and so on.
That would have to be taken into account as well and would greatly complicate things.
People:
Transparent query caching is the industry standard for how these things are handled. However, Postgres' lack of this feature has made me consider other approaches, and I'm starting to wonder if the standard query caching -- where a materialized query result, or some reduction,
they're in memory. I interleaved some notes into your posting.
- Original Message -
From: Josh Berkus [EMAIL PROTECTED]
To: Postgresql Performance [EMAIL PROTECTED]
Sent: Friday, October 01, 2004 1:10 PM
Subject: Re: [PERFORM] Caching of Queries
On Mon, Sep 27, 2004 at 09:30:31PM +0100, Matt Clark wrote:
It's certainly the case that the typical web app (which, along with
warehouses, seems to be one half of the needy apps), could probably do
worse than use pooling as well. I'm not well up enough on pooling to
know how bulletproof
Aaron Werman [EMAIL PROTECTED] writes:
I imagine a design where a shared plan cache would consist of the plans,
indexed by a statement hash and again by dependent objects. A statement to
be planned would be hashed and matched to the cache. DDL would need to
synchronously destroy all dependent
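A toy sketch of the design Aaron describes - plans keyed by a hash of the statement text, with a second index by referenced object so that DDL can synchronously invalidate everything that depends on it. The class, the string stand-in for a "plan", and the table names are all illustrative only:

```python
import hashlib
from collections import defaultdict

class PlanCache:
    """Sketch: plans indexed by statement hash and again by dependent object."""

    def __init__(self):
        self.plans = {}                    # statement hash -> cached "plan"
        self.by_object = defaultdict(set)  # object name -> statement hashes

    @staticmethod
    def key(sql):
        return hashlib.sha1(sql.encode()).hexdigest()

    def put(self, sql, plan, objects):
        h = self.key(sql)
        self.plans[h] = plan
        for obj in objects:
            self.by_object[obj].add(h)

    def get(self, sql):
        # A statement to be planned is hashed and matched to the cache.
        return self.plans.get(self.key(sql))

    def invalidate(self, obj):
        # What DDL against `obj` would call, synchronously.
        for h in self.by_object.pop(obj, set()):
            self.plans.pop(h, None)

cache = PlanCache()
cache.put("SELECT * FROM users WHERE id = $1",
          plan="hypothetical-index-scan-plan", objects=["users"])
```

Real shared-memory plan caches are far harder (locking, invalidation across backends), which is much of what the thread is arguing about.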
Neil Conway [EMAIL PROTECTED] writes:
I think the conclusion of past discussions about this feature is that
it's a bad idea. Last I
[ discussion of server side result caching ]
and let's not forget PG's major fork it will throw into things: MVCC.
The results of query A may hold true for txn 1, but not txn 2, and so on.
That would have to be taken into account as well and would greatly complicate things.
It is always possible
On Mon, 27 Sep 2004 18:20:48 +0100, Matt Clark [EMAIL PROTECTED] wrote:
This is very true. Client side caching is an enormous win for apps, but it
requires quite a lot of logic, triggers to update last-modified fields on
relevant tables, etc etc. Moving some of this logic to the DB would
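The client-side scheme Matt describes - trigger-maintained last-modified fields checked before reusing a cached result - can be sketched as below. A logical clock stands in for the trigger-updated timestamps, and every name here is hypothetical:

```python
class LastModifiedCache:
    """Client-side result cache validated against a per-table last-modified
    stamp (which, in the real scheme, triggers keep current on every write).
    An integer logical clock replaces wall-clock time for determinism."""

    def __init__(self):
        self.clock = 0
        self.results = {}   # query text -> (cached_at, rows)
        self.modified = {}  # table name -> last modification stamp

    def touch(self, table):
        # In the real scheme, an ON INSERT/UPDATE/DELETE trigger does this.
        self.clock += 1
        self.modified[table] = self.clock

    def get(self, query, table, fetch):
        cached = self.results.get(query)
        stamp = self.modified.get(table, 0)
        if cached and cached[0] >= stamp:
            return cached[1]              # still fresh: skip the round trip
        self.clock += 1
        rows = fetch()                    # stale or missing: re-run the query
        self.results[query] = (self.clock, rows)
        return rows
```

As the post says, this is an enormous win but needs real logic on the application side; the DB only supplies the invalidation signal.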
On Thu, Sep 23, 2004 at 08:29:25AM -0700, Mr Pink wrote:
Not knowing anything about the internals of pg, I don't know how this relates, but in theory, query plan caching is not just about saving time re-planning queries, it's about scalability. Optimizing queries requires shared locks on
Basically you set a default in seconds for the HTML results to be
cached, and then have triggers set that force the cache to regenerate
(whenever CRUD happens to the content, for example).
Can't speak for Perl/Python/Ruby/.Net/Java, but Cache_Lite sure made a
believer out of me!
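The Cache_Lite pattern described above - a default TTL in seconds, plus write-side hooks that force regeneration early - amounts to something like this sketch (not Cache_Lite's actual API, just the idea in Python):

```python
import time

class TtlCache:
    """Cache rendered HTML for a default number of seconds; invalidate()
    is what a CRUD trigger/hook would call to force early regeneration."""

    def __init__(self, ttl_seconds=60.0):
        self.ttl = ttl_seconds
        self.store = {}   # key -> (expires_at, html)

    def get(self, key, render):
        entry = self.store.get(key)
        now = time.monotonic()
        if entry and now < entry[0]:
            return entry[1]                     # cache hit: serve stored HTML
        html = render()                         # miss or expired: regenerate
        self.store[key] = (now + self.ttl, html)
        return html

    def invalidate(self, key):
        self.store.pop(key, None)               # next get() regenerates
```

Caching at the HTML layer like this sidesteps the DB entirely on a hit, which is why it beats query-level caching for read-heavy pages.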
Nice to have
More to the point though, I think this is a feature that really really
should be in the DB, because then it's trivial for people to use.
How does putting it into PGPool make it any less trivial for people to
use?
On Mon, Sep 27, 2004 at 09:19:12PM +0100, Matt Clark wrote:
Basically you set a default in seconds for the HTML results to be
cached, and then have triggers set that force the cache to regenerate
(whenever CRUD happens to the content, for example).
Can't speak for
Any competently written application where caching results would be a
suitable performance boost can already implement application or
middleware caching fairly easily, and increase performance much more
than putting result caching into the database would.
I guess the performance increase is
Iain [EMAIL PROTECTED] writes:
I can only tell you (roughly) how it works with Oracle,
Which unfortunately has little to do with how it works with Postgres.
This latches stuff is irrelevant to us.
In practice, any repetitive planning in PG is going to be consulting
catalog rows that it draws
Neil Conway wrote:
Another idea would be to improve the quality of the plan we generate at PREPARE time:
for instance you could generate 'n' plans for various combinations of
input parameters, and then choose the best query plan at EXECUTE time.
It's a difficult problem to solve, however (consider
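One way to picture Neil's suggestion: several candidate plans generated once at PREPARE time, with the cheapest chosen at EXECUTE time from the actual parameter's estimated selectivity. The cost formulas and numbers below are entirely made up for illustration:

```python
def make_plans(table_rows):
    """Generate candidate 'plans' up front (hypothetical cost model:
    index scans pay per matching row, seq scans pay for the whole table)."""
    return [
        ("index scan", lambda matches: 4 * matches),   # random-I/O penalty
        ("seq scan",   lambda matches: table_rows),    # cost fixed by table size
    ]

def choose_plan(plans, estimated_matches):
    """At EXECUTE time, pick the cheapest pre-generated plan for this
    parameter's estimated number of matching rows."""
    return min(plans, key=lambda p: p[1](estimated_matches))[0]

plans = make_plans(table_rows=1_000_000)
```

As the quote says, the hard part is choosing which parameter combinations to plan for in advance, not the selection step itself.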
In article [EMAIL PROTECTED],
Scott Kirkwood [EMAIL PROTECTED] writes:
I couldn't find anything in the docs or in the mailing list on this,
but it is something that Oracle appears to do as does MySQL.
The idea, I believe, is to do a quick (hash) string lookup of the
query and if it's exactly
Jason Coene [EMAIL PROTECTED] writes:
All of our postgres processes end up in the semwai state - seemingly
waiting on other queries to complete. If the system isn't taxed in CPU or
disk, I have a good feeling that this may be the cause.
Whatever that is, I'll bet lunch that it's got 0 to do
Jason Coene wrote:
I'm not an expert, but I've been hunting down a killer performance problem for a while now. It seems this may be the cause.
At peak load, our database slows to a trickle. The CPU and disk utilization are normal - 20-30% CPU used and disk performance good.
For a peak load
Jason Coene [EMAIL PROTECTED] writes:
All of our postgres processes end up in the semwai state - seemingly
waiting on other queries to complete. If the system isn't taxed in CPU or
disk, I have a good feeling that this may be the cause.
Well, it's possible contention of some sort is an
Jason Coene [EMAIL PROTECTED] writes:
I'm not sure how I go about getting the stack traceback you need. Any info
on this? Results of ps below. System is dual xeon 2.6, 2gb ram, hardware
raid 10 running FreeBSD 5.2.1.
Hmm. Dual Xeon sets off alarm bells ...
I think you are probably looking
I've looked at PREPARE, but apparently it only lasts
per-session - that's worthless in our case (web based
service, one connection per data-requiring connection).
That's a non-sequitur. Most 'normal' high volume web apps have persistent
DB connections, one per http server process. Are you
Jason Coene wrote:
Hi Tom,
Easily recreated with Apache benchmark, ab -n 3 -c 3000
http://webserver . This runs 1 query per page, everything else is cached
on webserver.
That test requires 3 accesses with 3000 connections, which is not a normal
load. Describe your HW.
3000 connections
Hi, Jason,
On Thu, 23 Sep 2004 12:53:25 -0400
Jason Coene [EMAIL PROTECTED] wrote:
I've looked at PREPARE, but apparently it only lasts per-session - that's
worthless in our case (web based service, one connection per data-requiring
connection).
This sounds like the loads of connection init
Tom,
I think you are probably looking at the same problem previously reported
by Josh Berkus among others. Does the rate of context swaps shown by
vmstat go through the roof when this happens? If you strace or ktrace
one of the backends, do you see lots of semop()s and little else?
That
Josh Berkus [EMAIL PROTECTED] writes:
I think you are probably looking at the same problem previously reported
by Josh Berkus among others.
That would be interesting. Previously we've only demonstrated the
problem on long-running queries, but I suppose it could also affect
massive
Hi All,
It does sound like we should be pooling connections somehow. I'll be
looking at implementing that shortly. I'd really like to understand what
the actual problem is, though.
Sorry, I meant 30,000 with 300 connections - not 3,000. The 300 connections
/ second is realistic, if not
Update:
I just tried running the same test (ab with 150 concurrent connections)
while connecting to postgres through 35 persistent connections (PHP
library), and had roughly the same type of results. This should eliminate
the new connection overhead. I've attached top and vmstat. I let it run
Jason,
Sorry, I meant 30,000 with 300 connections - not 3,000. The 300 connections / second is realistic, if not underestimated. As is the nature of our site (realtime information about online gaming), there's a huge fan base and as a big upset happens, we'll do 50,000 page views in a
On Thu, Sep 23, 2004 at 09:23:51PM -0400, Jason Coene wrote:
I ran some previous queries to get pgpool to pre-establish all the
connections, and ab ran for a few minutes (with one query per page, eek!).
It was still exhibiting the same problems as before. While so many new
connections at once
I couldn't find anything in the docs or in the mailing list on this,
but it is something that Oracle appears to do as does MySQL.
The idea, I believe, is to do a quick (hash) string lookup of the
query and if it's exactly the same as another query that has been done
recently to re-use the old
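The exact-text lookup being described - what MySQL's query cache does - amounts to the sketch below. `run` stands in for actually executing the query, and the MVCC point raised elsewhere in the thread (a result valid for one transaction may not be valid for another) is why flushing has to be so blunt:

```python
import hashlib

class QueryResultCache:
    """Sketch of exact-text result caching: hash the query string and
    reuse the stored result if the hash has been seen before."""

    def __init__(self):
        self.cache = {}   # sha1 of query text -> stored result

    def execute(self, sql, run):
        h = hashlib.sha1(sql.encode()).hexdigest()
        if h in self.cache:
            return self.cache[h]      # byte-identical query seen recently
        result = run(sql)
        self.cache[h] = result
        return result

    def flush(self):
        # Any write to a referenced table (or MVCC visibility concern)
        # forces this conservative whole-cache flush in the sketch.
        self.cache.clear()
```

Note the weakness the thread keeps circling: only byte-identical queries hit, so a single whitespace or literal change is a miss.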
On Thu, 2004-09-23 at 05:59, Tom Lane wrote:
I think this would allow the problems of cached plans to bite
applications that were previously not subject to them :-(.
An app that wants plan re-use can use PREPARE to identify the
queries that are going to be re-executed.
I agree; if you want to