Re: [PERFORM] Caching of Queries

2004-12-23 Thread Pierre-Frdric Caillaud
I've looked at PREPARE, but apparently it only lasts per-session - that's worthless in our case (web based service, one connection per data-requiring connection). You don't use persistent connections ??? Your problem might simply be the connection time overhead (also including a

Re: [PERFORM] Caching of Queries

2004-10-07 Thread Tom Lane
Tatsuo Ishii [EMAIL PROTECTED] writes: First, it's not a particular problem with pgpool. As far as I know any connection pool solution has exactly the same problem. Second, it's easy to fix if PostgreSQL provides a functionarity such as:drop all temporary tables if any. I don't like that

Re: [PERFORM] Caching of Queries

2004-10-07 Thread Tatsuo Ishii
Tatsuo Ishii [EMAIL PROTECTED] writes: First, it's not a particular problem with pgpool. As far as I know any connection pool solution has exactly the same problem. Second, it's easy to fix if PostgreSQL provides a functionarity such as:drop all temporary tables if any. I don't like

Re: [PERFORM] Caching of Queries

2004-10-06 Thread Tatsuo Ishii
I don't know what you are exactly referring to in above URL when you are talking about potential pitfalls of pooling. Please explain more. Sorry, I wasn't implying that pgpool doesn't deal with the issues, just that some people aren't necessarily aware of them up front. For instance,

Re: [PERFORM] Caching of Queries

2004-10-04 Thread Jim C. Nasby
On Fri, Oct 01, 2004 at 10:10:40AM -0700, Josh Berkus wrote: Transparent query caching is the industry standard for how these things are handled. However, Postgres' lack of this feature has made me consider other approaches, and I'm starting to wonder if the standard query caching --

Re: [PERFORM] Caching of Queries

2004-10-03 Thread Pierre-Frdric Caillaud
pgpool (which makes some rather questionable claims IMO); any decent web application language/environment will support connection pooling. That's why it should not be tied to something specific as pgpool. If you want performance, which is the case here, usually you have a webserver serving

Re: [PERFORM] Caching of Queries

2004-10-03 Thread Pierre-Frdric Caillaud
1) The materialized data is available in 3 different forms; a list, a detail view, and a spreadsheet. Each form as somewhat different columns and different rules about ordering, which would likely confuse an SQC planner. In this implementation, all 3 forms are able to share the same cache.

Re: [PERFORM] Caching of Queries

2004-10-03 Thread Pierre-Frdric Caillaud
If it was in pgpool or something similar, I could devote a separate machine just for caching results leaving the db server untouched. BUT you would be limited to caching complete queries. There is a more efficient strategy... ---(end of

Re: [PERFORM] Caching of Queries

2004-10-03 Thread Tatsuo Ishii
More to the point though, I think this is a feature that really really should be in the DB, because then it's trivial for people to use. How does putting it into PGPool make it any less trivial for people to use? The answers are at

Re: [PERFORM] Caching of Queries

2004-10-02 Thread Josh Berkus
Aaron, I'm not sure I understand your req fully. I'm not surprised.I got wrapped up in an overly involved example and completely left off the points I was illustrating. So here's the points, in brief: 1) Query caching is not a single problem, but rather several different problems

Re: [PERFORM] Caching of Queries

2004-10-02 Thread William Yu
Josh Berkus wrote: 1) Query caching is not a single problem, but rather several different problems requiring several different solutions. 2) Of these several different solutions, any particular query result caching implementation (but particularly MySQL's) is rather limited in its

Re: [PERFORM] Caching of Queries

2004-10-02 Thread Josh Berkus
William, Just my 2 cents on this whole issue. I would lean towards having result caching in pgpool versus the main backend. I want every ounce of memory on a database server devoted to the database. Caching results would double the effect of cache flushing ... ie, now both the results and the

Re: [PERFORM] Caching of Queries

2004-10-01 Thread Jim C. Nasby
On Fri, Oct 01, 2004 at 06:43:42AM +0100, Matt Clark wrote: If you're not using a connection pool of some kind then you might as well forget query plan caching, because your connect overhead will swamp the planning cost. This does not mean you have to use something like pgpool (which makes

Re: [PERFORM] Caching of Queries

2004-10-01 Thread Matt Clark
OK, that'd work too... the point is if you're re-connecting all the time it doesn't really matter what else you do for performance. Yeah, although there is the chap who was asking questions on the list recently who had some very long-running code on his app servers, so was best off closing

Re: [PERFORM] Caching of Queries

2004-10-01 Thread Aaron Werman
] Caching of Queries [ discussion of server side result caching ] and lets not forget PG's major fork it will throw into things: MVCC The results of query A may hold true for txn 1, but not txn 2 and so on . That would have to be taken into account as well and would greatly complicate things

Re: [PERFORM] Caching of Queries

2004-10-01 Thread Josh Berkus
People: Transparent query caching is the industry standard for how these things are handled. However, Postgres' lack of this feature has made me consider other approaches, and I'm starting to wonder if the standard query caching -- where a materialized query result, or some reduction

Re: [PERFORM] Caching of Queries

2004-10-01 Thread Aaron Werman
, they're in memory. I interleaved some notes into your posting. - Original Message - From: Josh Berkus [EMAIL PROTECTED] To: Postgresql Performance [EMAIL PROTECTED] Sent: Friday, October 01, 2004 1:10 PM Subject: Re: [PERFORM] Caching of Queries People: Transparent query

Re: [PERFORM] Caching of Queries

2004-09-30 Thread Jim C. Nasby
On Mon, Sep 27, 2004 at 09:30:31PM +0100, Matt Clark wrote: It's certainly the case that the typical web app (which, along with warehouses, seems to be one half of the needy apps), could probably do worse than use pooling as well. I'm not well up enough on pooling to know how bulletproof

Re: [PERFORM] Caching of Queries

2004-09-28 Thread Tom Lane
Aaron Werman [EMAIL PROTECTED] writes: I imagine a design where a shared plan cache would consist of the plans, indexed by a statement hash and again by dependant objects. A statement to be planned would be hashed and matched to the cache. DDL would need to synchronously destroy all dependant

Re: [PERFORM] Caching of Queries

2004-09-28 Thread Aaron Werman
- Original Message - From: Tom Lane [EMAIL PROTECTED] To: Aaron Werman [EMAIL PROTECTED] Cc: Iain [EMAIL PROTECTED]; Jim C. Nasby [EMAIL PROTECTED]; [EMAIL PROTECTED] Sent: Tuesday, September 28, 2004 9:58 AM Subject: Re: [PERFORM] Caching of Queries Aaron Werman [EMAIL PROTECTED

Re: [PERFORM] Caching of Queries

2004-09-28 Thread Matthew Nuzum
Of Tom Lane Sent: Monday, September 27, 2004 1:19 AM To: Neil Conway Cc: Aaron Werman; Scott Kirkwood; [EMAIL PROTECTED] Subject: Re: [PERFORM] Caching of Queries Neil Conway [EMAIL PROTECTED] writes: I think the conclusion of past discussions about this feature is that it's a bad idea. Last I

Re: [PERFORM] Caching of Queries

2004-09-27 Thread Jeff
[ discussion of server side result caching ] and lets not forget PG's major fork it will throw into things: MVCC The results of query A may hold true for txn 1, but not txn 2 and so on . That would have to be taken into account as well and would greatly complicate things. It is always possible

Re: [PERFORM] Caching of Queries

2004-09-27 Thread Mitch Pirtle
On Mon, 27 Sep 2004 18:20:48 +0100, Matt Clark [EMAIL PROTECTED] wrote: This is very true. Client side caching is an enormous win for apps, but it requires quite a lot of logic, triggers to update last-modified fields on relevant tables, etc etc. Moving some of this logic to the DB would

Re: [PERFORM] Caching of Queries

2004-09-27 Thread Jim C. Nasby
On Thu, Sep 23, 2004 at 08:29:25AM -0700, Mr Pink wrote: Not knowing anything about the internals of pg, I don't know how this relates, but in theory, query plan caching is not just about saving time re-planning queries, it's about scalability. Optimizing queries requires shared locks on

Re: [PERFORM] Caching of Queries

2004-09-27 Thread Matt Clark
Basically you set a default in seconds for the HTML results to be cached, and then have triggers set that force the cache to regenerate (whenever CRUD happens to the content, for example). Can't speak for Perl/Python/Ruby/.Net/Java, but Cache_Lite sure made a believer out of me! Nice to have

Re: [PERFORM] Caching of Queries

2004-09-27 Thread Rod Taylor
More to the point though, I think this is a feature that really really should be in the DB, because then it's trivial for people to use. How does putting it into PGPool make it any less trivial for people to use? ---(end of broadcast)--- TIP

Re: [PERFORM] Caching of Queries

2004-09-27 Thread Steve Atkins
On Mon, Sep 27, 2004 at 09:19:12PM +0100, Matt Clark wrote: Basically you set a default in seconds for the HTML results to be cached, and then have triggers set that force the cache to regenerate (whenever CRUD happens to the content, for example). Can't speak for

Re: [PERFORM] Caching of Queries

2004-09-27 Thread Matt Clark
More to the point though, I think this is a feature that really really should be in the DB, because then it's trivial for people to use. How does putting it into PGPool make it any less trivial for people to use? The answers are at

Re: [PERFORM] Caching of Queries

2004-09-27 Thread Matt Clark
Any competently written application where caching results would be a suitable performance boost can already implement application or middleware caching fairly easily, and increase performance much more than putting result caching into the database would. I guess the performance increase is

Re: [PERFORM] Caching of Queries

2004-09-27 Thread Tom Lane
Iain [EMAIL PROTECTED] writes: I can only tell you (roughly) how it works wth Oracle, Which unfortunately has little to do with how it works with Postgres. This latches stuff is irrelevant to us. In practice, any repetitive planning in PG is going to be consulting catalog rows that it draws

Re: [PERFORM] Caching of Queries

2004-09-27 Thread Iain
ROTECTED] Sent: Tuesday, September 28, 2004 12:17 PM Subject: Re: [PERFORM] Caching of Queries "Iain" [EMAIL PROTECTED] writes: I can only tell you (roughly) how it works wth Oracle, Which unfortunately has little to do with how it works with Postgres.

Re: [PERFORM] Caching of Queries

2004-09-23 Thread Gaetano Mendola
Neil Conway wrote: Another idea would be to improve the quality of the plan we generate at PREPARE time: for instance you could generate 'n' plans for various combinations of input parameters, and then choose the best query plan at EXECUTE time. It's a difficult problem to solve, however (consider

Re: [PERFORM] Caching of Queries

2004-09-23 Thread Harald Fuchs
In article [EMAIL PROTECTED], Scott Kirkwood [EMAIL PROTECTED] writes: I couldn't find anything in the docs or in the mailing list on this, but it is something that Oracle appears to do as does MySQL. The idea, I believe, is to do a quick (hash) string lookup of the query and if it's exactly

Re: [PERFORM] Caching of Queries

2004-09-23 Thread Mr Pink
Not knowing anything about the internals of pg, I don't know how this relates, but in theory, query plan caching is not just about saving time re-planning queries, it's about scalability. Optimizing queries requires shared locks on the database metadata, which, as I understand it causes

Re: [PERFORM] Caching of Queries

2004-09-23 Thread Jason Coene
: Re: [PERFORM] Caching of Queries Not knowing anything about the internals of pg, I don't know how this relates, but in theory, query plan caching is not just about saving time re-planning queries, it's about scalability. Optimizing queries requires shared locks on the database metadata, which

Re: [PERFORM] Caching of Queries

2004-09-23 Thread Tom Lane
Jason Coene [EMAIL PROTECTED] writes: All of our postgres processes end up in the semwai state - seemingly waiting on other queries to complete. If the system isn't taxed in CPU or disk, I have a good feeling that this may be the cause. Whatever that is, I'll bet lunch that it's got 0 to do

Re: [PERFORM] Caching of Queries

2004-09-23 Thread Gaetano Mendola
Jason Coene wrote: I'm not an expert, but I've been hunting down a killer performance problem for a while now. It seems this may be the cause. At peak load, our database slows to a trickle. The CPU and disk utilization are normal - 20-30% used CPU and disk performance good. For a peak load

Re: [PERFORM] Caching of Queries

2004-09-23 Thread jason.servetar
on the server then we would get any other way. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jason Coene Sent: Thursday, September 23, 2004 10:53 AM To: 'Mr Pink'; 'Scott Kirkwood' Cc: [EMAIL PROTECTED] Subject: Re: [PERFORM] Caching of Queries I'm

Re: [PERFORM] Caching of Queries

2004-09-23 Thread Jason Coene
: [PERFORM] Caching of Queries Jason Coene [EMAIL PROTECTED] writes: All of our postgres processes end up in the semwai state - seemingly waiting on other queries to complete. If the system isn't taxed in CPU or disk, I have a good feeling that this may be the cause. Whatever

Re: [PERFORM] Caching of Queries

2004-09-23 Thread Greg Stark
Jason Coene [EMAIL PROTECTED] writes: All of our postgres processes end up in the semwai state - seemingly waiting on other queries to complete. If the system isn't taxed in CPU or disk, I have a good feeling that this may be the cause. Well, it's possible contention of some sort is an

Re: [PERFORM] Caching of Queries

2004-09-23 Thread Tom Lane
Jason Coene [EMAIL PROTECTED] writes: I'm not sure how I go about getting the stack traceback you need. Any info on this? Results of ps below. System is dual xeon 2.6, 2gb ram, hardware raid 10 running FreeBSD 5.2.1. Hmm. Dual Xeon sets off alarm bells ... I think you are probably looking

Re: [PERFORM] Caching of Queries

2004-09-23 Thread Matt Clark
I've looked at PREPARE, but apparently it only lasts per-session - that's worthless in our case (web based service, one connection per data-requiring connection). That's a non-sequitur. Most 'normal' high volume web apps have persistent DB connections, one per http server process. Are you

Re: [PERFORM] Caching of Queries

2004-09-23 Thread Gaetano Mendola
Jason Coene wrote: Hi Tom, Easily recreated with Apache benchmark, ab -n 3 -c 3000 http://webserver . This runs 1 query per page, everything else is cached on webserver. That test require 3 access with 3000 connections that is not a normal load. Describe us your HW. 3000 connections

Re: [PERFORM] Caching of Queries

2004-09-23 Thread Markus Schaber
Hi, Jason, On Thu, 23 Sep 2004 12:53:25 -0400 Jason Coene [EMAIL PROTECTED] wrote: I've looked at PREPARE, but apparently it only lasts per-session - that's worthless in our case (web based service, one connection per data-requiring connection). This sounds like the loads of connection init

Re: [PERFORM] Caching of Queries

2004-09-23 Thread Josh Berkus
Tom, I think you are probably looking at the same problem previously reported by Josh Berkus among others. Does the rate of context swaps shown by vmstat go through the roof when this happens? If you strace or ktrace one of the backends, do you see lots of semop()s and little else? That

Re: [PERFORM] Caching of Queries

2004-09-23 Thread Tom Lane
Josh Berkus [EMAIL PROTECTED] writes: I think you are probably looking at the same problem previously reported by Josh Berkus among others. That would be interesting. Previously we've only demonstrated the problem on long-running queries, but I suppose it could also affect massive

Re: [PERFORM] Caching of Queries

2004-09-23 Thread Jason Coene
Hi All, It does sound like we should be pooling connections somehow. I'll be looking at implementing that shortly. I'd really like to understand what the actual problem is, though. Sorry, I meant 30,000 with 300 connections - not 3,000. The 300 connections / second is realistic, if not

Re: [PERFORM] Caching of Queries

2004-09-23 Thread Jason Coene
Update: I just tried running the same test (ab with 150 concurrent connections) while connecting to postgres through 35 persistent connections (PHP library), and had roughly the same type of results. This should eliminate the new connection overhead. I've attached top and vmstat. I let it run

Re: [PERFORM] Caching of Queries

2004-09-23 Thread Josh Berkus
Jason, Sorry, I meant 30,000 with 300 connections - not 3,000. The 300 connections / second is realistic, if not underestimated. As is the nature of our site (realtime information about online gaming), there's a huge fan base and as a big upset happens, we'll do 50,000 page views in a

Re: [PERFORM] Caching of Queries

2004-09-23 Thread Joshua D. Drake
Sorry, I meant 30,000 with 300 connections - not 3,000. The 300 connections / second is realistic, if not underestimated. As is the nature of our site (realtime information about online gaming), there's a huge fan base and as a big upset happens, we'll do 50,000 page views in a span

Re: [PERFORM] Caching of Queries (now with pgpool)

2004-09-23 Thread Jason Coene
PROTECTED]; [EMAIL PROTECTED] Subject: Re: [PERFORM] Caching of Queries Jason, Sorry, I meant 30,000 with 300 connections - not 3,000. The 300 connections / second is realistic, if not underestimated. As is the nature of our site (realtime information about online gaming), there's

Re: [PERFORM] Caching of Queries (now with pgpool)

2004-09-23 Thread Michael Adler
On Thu, Sep 23, 2004 at 09:23:51PM -0400, Jason Coene wrote: I ran some previous queries to get pgpool to pre-establish all the connections, and ab ran for a few minutes (with one query per page, eek!). It was still exhibiting the same problems as before. While so many new connections at once

[PERFORM] Caching of Queries

2004-09-22 Thread Scott Kirkwood
I couldn't find anything in the docs or in the mailing list on this, but it is something that Oracle appears to do as does MySQL. The idea, I believe, is to do a quick (hash) string lookup of the query and if it's exactly the same as another query that has been done recently to re-use the old

Re: [PERFORM] Caching of Queries

2004-09-22 Thread Neil Conway
On Thu, 2004-09-23 at 05:59, Tom Lane wrote: I think this would allow the problems of cached plans to bite applications that were previously not subject to them :-(. An app that wants plan re-use can use PREPARE to identify the queries that are going to be re-executed. I agree; if you want to