Re: [HACKERS] Again, sorry, caching.
Greg Copeland wrote: At this point in time, I think we've both pretty well beaten this topic up. Obviously there are two primary ways of viewing the situation. I don't think anyone is saying it's a bad idea...I think everyone is saying that it's easier to address elsewhere and that overall, the net returns may be at the expense of some other workloads. So, unless there are new pearls to be shared and gleaned, I think the topic's been fairly well addressed. Does more need to be said?

With a PREPARE/EXECUTE patch now out for approval, can I assume we will go with that first, see how far it gets us, and then revisit the idea of cached results? In this case, we are caching the query plan. The query still executes again in the executor, so the data is always fresh. In a sense, the buffer cache and disk are the caches, which don't need separate invalidation if some data changes in the table. The plan can become invalid if it calls a non-cachable function, if the schema changes, if the constants used to generate the plan in the optimizer would generate a different plan from the constants used in a later query, or if the analyze statistics changed.

The MVCC ramifications of cached results and invalidation could be quite complex. The commit of a transaction could change tuple visibility rules even if the data-modifying statement was executed much earlier in the transaction.

Also, on the NOTIFY/trigger idea: triggers are called on statement end, not transaction end, so if an UPDATE query is in a multi-statement transaction, another backend looking for the NOTIFY will receive it before the transaction commits, meaning it will not see the update. That seems like a problem. We do have deferrable constraints, which only do their checking at transaction end, but I am not sure if that can be used for NOTIFY on transaction commit. Anyone?
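As a concrete illustration of the plan-caching approach Bruce describes, here is a minimal sketch using the PREPARE/EXECUTE syntax as it was later shipped (the table and statement names are invented for the example):

```sql
-- Plan the query once; only the plan is cached, never the result,
-- so every EXECUTE re-runs the executor against current data.
PREPARE get_user (text) AS
    SELECT * FROM users WHERE name = $1;

EXECUTE get_user('alice');  -- uses the saved plan, reads fresh data
EXECUTE get_user('bob');    -- reuses the same plan with a new constant

DEALLOCATE get_user;        -- drop the cached plan
```

This is why no separate invalidation machinery is needed for data changes: the buffer cache and disk remain the only data caches, and only schema changes or shifted statistics can make the saved plan stale.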
-- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup.| Drexel Hill, Pennsylvania 19026 ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Again, sorry, caching.
Bruce Momjian [EMAIL PROTECTED] writes: Also, on the NOTIFY/trigger idea, triggers are called on statement end, not transaction end, so if an UPDATE query is in a multi-statement transaction, another backend looking for the NOTIFY will receive it before the transaction commits, meaning it will not see the update. No it won't. regards, tom lane
Re: [HACKERS] Again, sorry, caching.
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: Also, on the NOTIFY/trigger idea, triggers are called on statement end, not transaction end, so if an UPDATE query is in a multi-statement transaction, another backend looking for the NOTIFY will receive it before the transaction commits, meaning it will not see the update. No it won't. Is this because NOTIFY is held for transaction end or because the triggers are held until transaction commit? -- Bruce Momjian
Re: [HACKERS] Again, sorry, caching.
Bruce Momjian [EMAIL PROTECTED] writes: No it won't. Is this because NOTIFY is held for transaction end or because the triggers are held until transaction commit? The former. regards, tom lane
Re: [HACKERS] Again, sorry, caching.
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: No it won't. Is this because NOTIFY is held for transaction end or because the triggers are held until transaction commit? The former. Thanks. I see it in the NOTIFY manual page now. Very nice. -- Bruce Momjian
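The behavior Tom confirms here (NOTIFY queued until commit) can be sketched as a two-session example; the channel and table names are invented for illustration:

```sql
-- Session 1: subscribe to the invalidation channel.
LISTEN mytable_changed;

-- Session 2: modify data inside a multi-statement transaction.
BEGIN;
UPDATE mytable SET foo = 'baz' WHERE foo = 'bar';
NOTIFY mytable_changed;   -- queued, NOT yet delivered to listeners
-- ...other statements; session 1 still sees nothing...
COMMIT;                   -- only now is the notification delivered

-- Session 1 receives mytable_changed only after the commit,
-- so a listener that re-reads the table is guaranteed to see
-- the committed update.
```

This is exactly why the NOTIFY-based cache-invalidation schemes discussed in this thread do not have the visibility race Bruce worried about.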
Re: [HACKERS] Again, sorry, caching, (Tom What do you think: function
mlw wrote: Jan Wieck wrote: mlw wrote: [...] IMHO modifying the function manager to allow the return of a full row, and a set of full rows, answers a LOT of issues I have seen over the years with PostgreSQL extensibility. Sure. Actually I think you'll have an easy project with this one, because all the work has been done by Tom already. The function manager isn't the problem any more. It is that you cannot have such a set-returning function in the rangetable. So you have no mechanism to USE the result. I'm not sure I follow you. OK, maybe I identified the wrong portion of code. The idea is that the first return value could return an array of varlenas, one for each column, then a set of varlenas, one for each column. Is there a way to return this to PostgreSQL? There is a way to return anything. The problem in PostgreSQL is to actually USE it. Our idea originally was to extend the capabilities of a rangetable entry. Currently, rangetable entries can only hold a relation, which is a table or a view. After rewriting, they are down to real tables only. But basically, a rangetable entry should just be a row source, so that a function returning a row set could occur there too. In order to avoid multiple calls to the function because of nestloops and the like, I think when a set function occurs in an RTE, its result should be dumped into a sort-tape and that is used as the row source in the rest of the plan. Jan
Re: [HACKERS] Again, sorry, caching, (Tom What do you think: function
Neil, Following is an email I sent the other day detailing how this works. The entry point to the underlying invalidation system is the heap manipulation functions: heap_delete(), heap_update(). (I've just had a quick look at heap_insert() and cannot find where the cache modification takes place.) These call RelationInvalidateHeapTuple() -> PrepareForTupleInvalidation() -> RegisterCatcacheInvalidation()/RegisterRelcacheInvalidation(). These feed linked lists which get processed at the end of the transaction, as is detailed below. Clearly, this is a much better way of running LISTEN/NOTIFY than storing them in the system tables. Gavin -- Forwarded message -- Date: Wed, 20 Mar 2002 02:17:09 +1100 (EST) From: Gavin Sherry [EMAIL PROTECTED] To: Greg Copeland [EMAIL PROTECTED] Cc: mlw [EMAIL PROTECTED], Jeff Davis [EMAIL PROTECTED], PostgreSQL-development [EMAIL PROTECTED] Subject: Re: [HACKERS] Again, sorry, caching, (Tom What do you think: function On 19 Mar 2002, Greg Copeland wrote: On Tue, 2002-03-19 at 07:46, mlw wrote: [snip] Right now, the function manager can only return one value, or one set of values for a column. It should be possible, though it would require a lot of research, to enable the function manager to return a set of rows. If we could get that working, it could be fairly trivial to implement a cache as a contrib project. It would work something like this: select querycache(select * from mytable where foo='bar'); Interesting concept...but how would you know when the cache has become dirty? That would give you a set of rows...but I don't understand what would let you know your result set is invalid? Perhaps: select querycache( foobar_event, select * from mytable where foo='bar' ); would automatically create a listen for you?? Personally, I think this method of providing query caching is very messy. Why not just implement it alongside the system relation cache?
This may be slightly more time consuming, but it will perform better and will be able to take advantage of Postgres's current MVCC. There would be three times when the cache would be interacted with:

1) Update of the cache. ExecRetrieve() would need to be modified to handle a prepare-for-cache-update kind of feature. This would involve adding the tuple table slot data into a linked list. At the end of processing/transaction, and if the query was successful, the prepare-for-cache-update list could be processed by AtCommit_Cache() (called from CommitTransaction()) and the shared cache updated.

2) Attempt to get the result set from cache. Before planning in postgres.c, test if the query will produce an already-cached result set. If so, send the data off from cache.

3) Modification of the underlying heap. Like (1), produce a list inside the executor (ExecAppend(), ExecDelete(), ExecReplace() -> RelationInvalidateHeapTuple() -> PrepareForTupleInvalidation()) which gets processed by AtEOXactInvalidationMessages(). This results in the affected entries being purged.

--- I'm not sure that cached results is a direction postgres need move in. But if it does, I think this is a better way to do it (given that I may have overlooked something) than modifying the function manager (argh!). Gavin
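As a language-neutral sketch of the commit-time discipline Gavin outlines (the class and method names here are invented; they only mirror the roles of AtCommit_Cache() and AtEOXactInvalidationMessages(), not the actual backend code), result publication and invalidation are both staged during the transaction and applied only at commit:

```python
class QueryCache:
    """Toy model of a commit-time maintained shared result cache.

    Mirrors the three interaction points described above:
    results are staged during execution, lookups happen before
    planning, and invalidations are staged by heap changes; both
    staged lists are applied only when the transaction commits.
    """

    def __init__(self):
        self.cache = {}               # query string -> result rows
        self.tables_for = {}          # query string -> tables it reads
        self.pending_results = []     # staged by the executor
        self.pending_invalid = set()  # tables touched by writes

    def stage_result(self, query, tables, rows):
        # interaction point 1: prepare-for-cache-update list
        self.pending_results.append((query, tables, rows))

    def stage_invalidation(self, table):
        # interaction point 3: heap modification noted, not yet applied
        self.pending_invalid.add(table)

    def lookup(self, query):
        # interaction point 2: consulted before planning
        return self.cache.get(query)

    def at_commit(self):
        # like AtEOXactInvalidationMessages(): purge affected entries
        for q in [q for q, t in self.tables_for.items()
                  if t & self.pending_invalid]:
            del self.cache[q]
            del self.tables_for[q]
        # like AtCommit_Cache(): publish staged results
        for query, tables, rows in self.pending_results:
            self.cache[query] = rows
            self.tables_for[query] = set(tables)
        self.pending_results = []
        self.pending_invalid = set()
```

Because nothing is published or purged until at_commit(), other sessions never observe results from an uncommitted transaction, which is the MVCC property the in-backend approach buys over an external cache.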
Re: [HACKERS] Again, sorry, caching.
On Tue, Mar 19, 2002 at 08:28:19PM -0500, Neil Conway wrote: On Tue, 2002-03-19 at 19:20, F Harvell wrote: I feel that the caching should be SQL transparent. If it is implemented reasonably well, the performance gain should be pretty much universal. Well, the simple query cache scheme that is currently being proposed would use a byte-by-byte comparison of the incoming query. I think the consensus is that for a lot of workloads, this would be a bad idea. Apparently, no one actually read _my_ proposal, they just replied to it. All the arguments about whether this kind of cache is any good have been thrashed out, up-thread. Apparently Mr. Harvell didn't feel the need to go back and read them. Going over them again is not productive - the next step is to see if there is anything to actually _code_ here. Caching is a hard problem, once you start digging into it. Going from no cache to some cache is (almost) always a win, but multiple caches in a datapath can interact in non-intuitive ways. And we _already_ have several well-tuned, clever caches in place. Anything that messes with them is going to be rejected, for sure. What I proposed is a sort of compromise: it is clear to me that the core developers are not very interested in the kind of cache Neil is talking about above, and would rather see query caching done in the app. What I proposed is extending the backend's support for client-side caching, to make it easier (or possible) for middleware to automate the task. The bare bones are: flag a query in some way so the backend auto-generates the appropriate NOTIFY triggers, so the middleware can do proper cache maintenance by LISTENing. I think I'll go away and write up my compromise proposal a little more clearly, and post it under a new subject, later. Perhaps we can then have a productive discussion about _it_, and not rehash old arguments. Ross

P.S. HACKER sociological opinion below - feel free to skip - There are only three reasons to discuss features on HACKERS: to see if a proposed feature would be rejected, so you don't waste time implementing it; to refine a proposed implementation, so it doesn't have to be reworked; and to discuss an actual in-hand implementation. Notice that there's no way to skip step one: if the CVS committers don't like the feature, arguing for it on HACKERS won't make it magically better; providing an implementation that doesn't do bad things _might_. And you can always maintain an independent patch, or fork. So, we have a number of people who think a query cache would be a good idea. And core developers who are not convinced. I think one of the reasons is that, while it might be useful in some situations (even fairly common situations), it's neither elegant nor flexible. The PostgreSQL project has a long tradition of turning down narrow, 'good enough - it works for me' solutions, while looking for a better, more inclusive solution. Sometimes this has been a problem with missing features, but in the long run, it's been a win.
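What Ross's "auto-generated NOTIFY triggers" would spare the middleware author is roughly the following hand-written setup (all names here are hypothetical; the trigger body uses era-appropriate quoted PL/pgSQL syntax):

```sql
-- What flagging a query as cacheable would automate: a trigger
-- that notifies listeners whenever the underlying table changes.
CREATE FUNCTION notify_mytable() RETURNS opaque AS '
BEGIN
    NOTIFY mytable_changed;
    RETURN NULL;
END;
' LANGUAGE 'plpgsql';

CREATE TRIGGER mytable_cache_inval
    AFTER INSERT OR UPDATE OR DELETE ON mytable
    FOR EACH ROW EXECUTE PROCEDURE notify_mytable();

-- The caching middleware then needs only:
LISTEN mytable_changed;
-- ...and flushes its cached results for mytable when the
-- notification arrives (which, per up-thread, happens at commit).
```

The compromise is that the backend would derive the trigger set from the flagged query automatically, while the actual result caching stays in the middleware.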
Re: [HACKERS] Again, sorry, caching.
Neil Conway [EMAIL PROTECTED] writes: My impression (I could be wrong) is that LISTEN/NOTIFY doesn't get the press that it deserves. If this model isn't widely used because of some deficiencies in the LISTEN/NOTIFY implementation, IMHO our time would be better spent fixing those problems than implementing the proposed caching scheme. I would have to say I think a large part of the problem is lack of press---I've been hanging around pgsql-hackers for two or three years now, and until all this discussion, had never heard of LISTEN/NOTIFY. That doesn't mean I didn't miss prior mentions, but it certainly doesn't seem to come up often or get a lot of discussion when it does. Now that I know about it, well, it looks like it would be trivial to use it to implement cache invalidation logic in my HTML::Mason-based application---I need only have a long-lived process running on the web server(s) that uses the perl Pg interface, and sits listening, and when it sees notifies on given conditions, flush the appropriate local caches. I'd actually been contemplating cramming logic to do this down into the library I use for implementing the system logic, but had resisted doing it because it would make the library too tied to the web---this would be much easier. I won't even have to re-grab the results from the DB and reformat and all that crap, I can just spew the output from the last time the page was assembled---sounds better to me than what MySQL provides. Of course, I get a lot of this for free as a result of the tools I'm using, but surely this sort of thing shouldn't be all that hard to implement in other systems. Mike.
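Driver details aside (Mike's version would block on the Perl Pg connection waiting for NOTIFY events), the long-lived invalidation process he describes reduces to: wait for a notification, flush the matching local caches. A hypothetical, stand-alone sketch with the database connection stubbed out by a queue (all page paths and channel names are invented):

```python
# Hypothetical long-lived cache-maintenance daemon. A real version
# would receive these events from LISTEN channels on a persistent
# database connection; here they arrive via a queue so the control
# flow is runnable on its own.
from queue import Queue

page_cache = {
    "/books/scifi": "<html>...last assembled page...</html>",
    "/authors": "<html>...last assembled page...</html>",
}

# notification channel -> cached pages it invalidates
watched = {
    "books_changed": ["/books/scifi"],
    "authors_changed": ["/authors", "/books/scifi"],
}

def run_daemon(notifications: Queue):
    """Flush local caches as notifications arrive; stop on None."""
    while True:
        event = notifications.get()
        if event is None:
            return
        for page in watched.get(event, []):
            page_cache.pop(page, None)  # next hit rebuilds the page
```

Note that the daemon never re-queries the database itself: as Mike says, the next request simply finds the entry missing and reassembles the page, which is then re-cached.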
Re: [HACKERS] Again, sorry, caching.
Yes...I was thinking that a generic library interface with a nice design pattern might meet this need rather well. Done properly, I think we can make it where all that, more or less, would be needed is application hooks which accept the result set to be cached and a mechanism to signal invalidation of the current cache...obviously that's not an exhaustive list... :) A library implies that the application is running long enough to actually hear the notification. Web apps start up, read from the database, and before any cache is needed they're done and the next one starts up, reading again from the database. Only currently open connections receive the notification. I think that you do need an entire layer... but that's not a bad thing necessarily. Have a daemon that stays connected for a long time and, when a notification arrives, rewrite the cache (or mark it dirty). Other applications can read data from static files or shared memory, or even another communication socket with your daemon. There may be some way around running a daemon, so if you have a better solution please let me know. I think I am in favor of client caching in general, but mlw (sorry, I can't find your real name in the emails at hand) makes some good points. The most important one is that we don't want to change application architectures on everyone. It's easy if you just have to add iscachable to a query; it's hard if you have to start writing against a different set of routines (to grab from your client cache rather than a database). However, I am perfectly happy writing a client-side cache or using temp tables to store a result set. I also don't care that much if someone chooses PostgreSQL for their website (unless I'm responsible for its success in some way :) ). That's me personally; if you want to attract more users from mysql, iscachable is very likely an attractive feature. Regards, Jeff
Re: [HACKERS] Again, sorry, caching, (Tom What do you think: function
On Tue, 2002-03-19 at 07:46, mlw wrote: [snip] Right now, the function manager can only return one value, or one set of values for a column. It should be possible, though it would require a lot of research, to enable the function manager to return a set of rows. If we could get that working, it could be fairly trivial to implement a cache as a contrib project. It would work something like this: select querycache(select * from mytable where foo='bar'); Interesting concept...but how would you know when the cache has become dirty? That would give you a set of rows...but I don't understand what would let you know your result set is invalid? Perhaps: select querycache( foobar_event, select * from mytable where foo='bar' ); would automatically create a listen for you?? This does two things that I would like to see: the ability to cache subselects independent of the full query, and the ability to control which queries get cached. If we can get this row functionality into the function manager for 7.3, we could then implement MANY MANY enterprise-level functionalities. Remote queries, query caching, external tables, etc. as contrib projects rather than full-blown modifications to PostgreSQL. Correct me if I'm wrong, but this concept would also be applicable to some clustering/distributed query needs too (is that what you meant by remote queries)? Greg
Re: [HACKERS] Again, sorry, caching, (Tom What do you think: function
On Tue, 2002-03-19 at 07:46, mlw wrote: I was thinking about this. There seems to be a consensus that caching means no ACID compliance. And everyone seems to think it needs to be limited. We can implement a non-ACID cache as a contrib function with some work to the function manager. Until now, I hadn't really thought about it...I just took it for granted since it was asserted...however, what isn't ACID about the approach that I offered? A - Not affected...it's read only and provided directly from the database; thus, it's still a function of the database. Any change resulting from atomic operations is notified to the cache, whereby it is repopulated. C - Not affected...the database is still responsible for keeping consistency. The cache is still read only. State is ensured as invalidation is notified by the database, and the data set should be returned consistent by the database or the database is broken. I - Again, the database is still performing this task and notifies the cache when updates need to take place. Again, isolation isn't an issue because the cache is still read-only. D - Durability isn't a question either as, again, the database is still doing this. In the event of cache failure, it would be repopulated from the database...so it would be as durable as the database. Please help me understand. Thanks, Greg
Re: [HACKERS] Again, sorry, caching.
Jeff Davis [EMAIL PROTECTED] writes: A library implies that the application is running long enough to actually hear the notification. Web apps start up, read from the database, and before any cache is needed they're done and the next one starts up, reading again from the database. Only currently open connections receive the notification. If your web app works this way then you already don't care about performance. People doing scalable web apps these days use connection pooling and session data kept in memory, so you already have a persistent layer running (whether it's your JVM, Apache process for mod_perl or PHP, or whatever). Really big apps definitely have a long-running daemon process that handles caching, session management (so you can have multiple webservers), etc... -Doug -- Doug McNaught Wireboard Industries http://www.wireboard.com/ Custom software development, systems and network consulting. Java PostgreSQL Enhydra Python Zope Perl Apache Linux BSD...
Re: [HACKERS] Again, sorry, caching, (Tom What do you think: function
Gavin Sherry wrote: I'm not sure that cached results is a direction postgres need move in. But if it does, I think this is a better way to do it (given that I may have overlooked something) than modifying the function manager (argh!). I actually had an ulterior motive. Your comment about caching not being a direction in which PostgreSQL needs to move says it all. The general rank and file seems to agree. I think caching could speed up a number of things, certainly some of the stuff I have been working on. I think it would be more likely to get some sort of caching from a contrib project than to sway the core team. IMHO modifying the function manager to allow the return of a full row, and a set of full rows, answers a LOT of issues I have seen over the years with PostgreSQL extensibility. With a full-row function API we can implement: (1) Remote queries: select remotequery(hostname, port, 'select * from foo'); (2) External queries: select mysqlquery(hostname, port, 'select * from foo'); (3) Cached queries: select cachedquery('select * from foo'); (4) Full text search: select ftssquery(hostname, port, 'word1 and word2 and word3 not word4'); Again, with full-row functions, we could prototype/implement many advanced features in PostgreSQL as contrib projects.
Re: [HACKERS] Again, sorry, caching.
On Tue, 19 Mar 2002 12:12:52 CST, Ross J. Reedstrom wrote: On Mon, Mar 18, 2002 at 09:35:51PM -0500, Neil Conway wrote: It is an application issue This is completely wrong. Caching can not be done against a database without knowledge of the database, i.e. when the data changes. ... If we're looking to provide a quick and easy caching scheme for users attracted to MySQL's query cache, why not provide this functionality through another application? ... What does everyone think? Neil, this sounds like exactly the approach to follow up on: ... Seems like a win all around. Anyone else have comments? ... I'm not certain of the full direction of the thinking here; however, it seems to me that there are a few considerations that I would like to see kept in mind: I feel that the caching should be SQL transparent. If it is implemented reasonably well, the performance gain should be pretty much universal. Yes, a large number of queries would never be called again; however, the results still need to be fetched into memory, and caching them for later reuse should involve little more than a skipped free (think filesystem cache). It makes more sense to specify non-cachable in a query for tuning than cacheable. This also means that just switching databases to PostgreSQL improves my performance. Also, it is very important that the caching should be transparent to the application. This means that the application should be able to connect to the database using the standard application interface (i.e., ODBC, PHP, Perl/DBI, etc.). This allows me to port my existing Oracle/DB2/MySQL/etc. application to pgsql through normal porting. If I have to implement a non-standard interface, I can likely gain even more performance through custom code and maintain reasonable database independence. While I am a strong believer in PostgreSQL, many of my customers have other demands/requirements. I still want to be able to use my existing code and libraries when using their database.
Sticking with the standards allows me to develop best-of-class applications and interface to best-of-class databases. It also allows others to easily realize the value of PostgreSQL. Thanks, F Harvell
Re: [HACKERS] Again, sorry, caching.
On Tue, 2002-03-19 at 19:20, F Harvell wrote: I feel that the caching should be SQL transparent. If it is implemented reasonably well, the performance gain should be pretty much universal. Well, the simple query cache scheme that is currently being proposed would use a byte-by-byte comparison of the incoming query. I think the consensus is that for a lot of workloads, this would be a bad idea. Cheers, Neil -- Neil Conway [EMAIL PROTECTED] PGP Key ID: DB3C29FC
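To make Neil's objection concrete: keying the cache on the literal query bytes means trivially different spellings of the same logical query all miss. A toy illustration (the queries and the stand-in executor are invented for the example):

```python
# Byte-by-byte query cache: the cache key is the exact query text.
cache = {}

def cached_query(sql, execute):
    """Return a result, executing only on an exact-text cache miss."""
    if sql not in cache:
        cache[sql] = execute(sql)
    return cache[sql]

executed = []  # record which queries actually hit the "database"

def fake_execute(sql):
    executed.append(sql)
    return [("row",)]

cached_query("SELECT * FROM t WHERE x = 1", fake_execute)  # miss
cached_query("SELECT * FROM t WHERE x = 1", fake_execute)  # hit
cached_query("select * from t where x = 1", fake_execute)  # miss: case differs
cached_query("SELECT * FROM t WHERE x = 2", fake_execute)  # miss: constant differs
# Three of the four calls reached the database; only one was a hit.
```

For workloads where constants vary per request (session IDs, timestamps), nearly every query is a miss, so the cache adds lookup and invalidation overhead for no benefit, which is the consensus concern.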
Re: [HACKERS] Again, sorry, caching.
On Sat, Mar 16, 2002 at 09:01:28AM -0500, mlw wrote: If it is mostly static data, why not just make it a static page? Because a static page is a maintenance nightmare. One uses a database in a web site to allow content to be changed and upgraded dynamically and with a minimum of work. That's a weak argument for a DB cache. Why not generate the web page after a data change and serve it as static until the next change? I was thinking that it could be implemented as a keyword or comment in a query. Such as: select * from table where column = 'foo' cacheable You can insert mostly static data into a temp table and use this temp table in subsequent queries. After an update/delete/insert, your application can rebuild the temp table (or a trigger can?). Karel -- Karel Zak [EMAIL PROTECTED] http://home.zf.jcu.cz/~zakkr/ C, PostgreSQL, PHP, WWW, http://docs.linux.cz, http://mape.jcu.cz
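A minimal sketch of Karel's temp-table suggestion (the table and column names are invented): materialize the mostly-static subset once, read from it cheaply, and rebuild it after the base table changes:

```sql
-- Materialize the mostly-static subset once per session or interval.
SELECT * INTO TEMP TABLE t_cache
    FROM mytable WHERE col = 'foo';

-- Subsequent reads hit the small temp table instead of the base table.
SELECT * FROM t_cache;

-- After the application (or a trigger) changes mytable, rebuild:
DROP TABLE t_cache;
SELECT * INTO TEMP TABLE t_cache
    FROM mytable WHERE col = 'foo';
```

This gets most of the win of a result cache without any server changes, at the cost of the application knowing when the base data changed, which is exactly mlw's objection in the reply below.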
Re: [HACKERS] Again, sorry, caching.
Karel Zak wrote: On Sat, Mar 16, 2002 at 09:01:28AM -0500, mlw wrote: If it is mostly static data, why not just make it a static page? Because a static page is a maintenance nightmare. One uses a database in a web site to allow content to be changed and upgraded dynamically and with a minimum of work. That's a weak argument for a DB cache. Why not generate the web page after a data change and serve it as static until the next change? I was thinking that it could be implemented as a keyword or comment in a query. Such as: select * from table where column = 'foo' cacheable You can insert mostly static data into a temp table and use this temp table in subsequent queries. After an update/delete/insert, your application can rebuild the temp table (or a trigger can?). Yes, I could, as could most of the guys reading these messages. I am thinking about a feature in PostgreSQL that would make that easier for the average DBA or web producer. Let's face it, MySQL wins a lot of people because they put in features that people want. All the ways people have suggested to compete with MySQL's caching have been ugly kludges. I understand that there is an amount of work involved with doing caching, and the value of caching is debatable by some; however, it is demonstrable that caching can improve a very common, albeit specific, set of deployments. Also, managing data is the job of the database, not the application. It does belong in PostgreSQL: if someone is forced to write a caching scheme around PostgreSQL, it is because PostgreSQL lacks that feature.
Re: [HACKERS] Again, sorry, caching.
On Monday 18 March 2002 13:23, mlw wrote: Let's face it, MySQL wins a lot of people because they put in features that people want. MySQL is very interested in benchmarks. It does not really care for data consistency. Cheers, Jean-Michel POURE
Re: [HACKERS] Again, sorry, caching.
Yes. EVERY person that I've ever known who runs MySQL runs it for two very simple reasons. First, they believe it to be wicked fast. Second, they don't understand what ACID is, what a transaction is, or why running a single session against a database to perform a benchmark is a completely bogus concept. In case it's not obvious, these are usually people that are trying to take a step up from Access. While I do believe MySQL, from a performance perspective, is a step up from Access, I always tell my clients: if you wouldn't use an Access database for this project, you shouldn't use MySQL either. To me, this means we need better advertising, PR, and education rather than a result set cache. :P Speaking of which, I'm wondering if there are any design patterns we can look at which would address client-side caching...well, at least make it easier to implement as well as implement it in a consistent manner. Greg On Mon, 2002-03-18 at 07:32, Jean-Michel POURE wrote: On Monday 18 March 2002 13:23, mlw wrote: Let's face it, MySQL wins a lot of people because they put in features that people want. MySQL is very interested in benchmarks. It does not really care for data consistency. Cheers, Jean-Michel POURE
Re: [HACKERS] Again, sorry, caching.
Jean-Michel POURE wrote: On Monday 18 March 2002 13:23, mlw wrote: Let's face it, MySQL wins a lot of people because they put in features that people want. MySQL is very interested in benchmarks. It does not really care for data consistency. In no way am I suggesting we avoid ACID compliance. In no way am I suggesting that PostgreSQL change. All I am suggesting is that tables which change infrequently can and should be cached. select * from table where foo = 'bar' need not be executed twice if the table has not changed. select * from table1, (select * from table2 where foo='bar' cacheable) as subset where subset.col1 = table1.col1; In the above query, if table two changes 4 times a day, and it is queried a couple of times a minute or second, the caching of the subset could save a huge amount of disk I/O. This sort of query could improve many catalog-based implementations, from music, to movies, to books. A library could implement a SQL query for book lookups like this: select * from authors, (select * from books where genre = 'scifi' cacheable) as subset where authors.id = subset.authorid and authors.id in () Yes, it is arguable that index scans may work better and, obviously, summary tables may help, etc., but imagine a more complex join which produces fewer records but is executed frequently. Caching could help the performance of PostgreSQL in some very real applications. MySQL's quest for benchmarking numbers, I agree, is shameful because they create numbers which are not really applicable in the real world. This time, however, I think they may be on to something.

(1) PostgreSQL uses a cacheable or iscacheable keyword. (2) If the query uses functions which are not marked as iscacheable, then it is not cached. (3) If any table contained within the cacheable portion of the query is modified, the cache is marked as dirty. (4) No provisions are made to recreate the cache after an insert/update/delete. (5) The first query marked as iscacheable that encounters a dirty flag in a table does an exhaustive search on the cache and removes all entries that are affected.

As far as I can see, if the above parameters are used to define caching, it could improve performance on sites where a high number of transactions are made and where there is also a large amount of static data, i.e. an ecommerce site, library, etc. If the iscacheable keyword is not used, PostgreSQL will not incur any performance degradation. However, if the iscacheable keyword is used, the performance loss could very well be made up by the benefits of caching.
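The five rules mlw proposes can be sketched as a small model (all class, method, and query names here are invented for illustration, not an actual patch): entries exist only for queries marked cacheable, writes merely set a per-table dirty flag, and the next cacheable query performs the sweep.

```python
class DirtyFlagCache:
    """Toy model of the five-rule cacheable-keyword scheme above."""

    def __init__(self):
        self.entries = {}   # query -> (tables read, cached rows)
        self.dirty = set()  # tables modified since the last sweep

    def write(self, table):
        # Rules 3 and 4: mark dirty; no eager cache rebuild.
        self.dirty.add(table)

    def query(self, sql, tables, execute, cacheable=False):
        if not cacheable:
            # Rules 1 and 2: unmarked queries bypass the cache entirely,
            # so they incur no overhead at all.
            return execute()
        if self.dirty:
            # Rule 5: the first cacheable query that sees a dirty flag
            # sweeps out every entry touching a dirty table.
            for q in [q for q, (t, _) in self.entries.items()
                      if t & self.dirty]:
                del self.entries[q]
            self.dirty.clear()
        if sql not in self.entries:
            self.entries[sql] = (set(tables), execute())
        return self.entries[sql][1]
```

The design choice worth noting is rule 4: deferring all invalidation work to the next cacheable read keeps the write path cheap, at the cost of one expensive sweep on that first read.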
Re: [HACKERS] Again, sorry, caching.
Matthew T. O'Connor wrote: My big problem with putting the cache outside of the database is that it is now incumbent on the applications programmer to write a cache. A database should manage the data; the application should handle how the data is presented. Forcing the application to implement a cache feels wrong. I believe someone suggested a possible solution that was in the pg client using NOTIFY and triggers. The argument given against it was that it would not be ACID compliant. I say, who cares. I would think that the cacheable select would only be allowed for simple selects; it would not be used for select for update or anything else. Any time you are given the result of a simple select, you are not guaranteed that the data won't change underneath you. Not true; if you begin a transaction, you can be isolated from changes made to the database. The primary use that you have suggested is for web sites, and they certainly won't mind if the cache is 0.3 seconds out of date. Again, if they don't care about accuracy, then they will use MySQL. PostgreSQL is a far better system. Making PostgreSQL less accurate, less correct takes away, IMHO, the very reasons to use it.
Re: [HACKERS] Again, sorry, caching.
On Mon, 2002-03-18 at 08:15, mlw wrote: Matthew T. O'Connor wrote: [snip] The primary use that you have suggested is for web sites, and they certainly won't mind if the cache is 0.3 seconds out of date. Again, if they don't care about accuracy, then they will use MySQL. PostgreSQL is a far better system. Making PostgreSQL less accurate, less correct takes away, IMHO, the very reasons to use it. If you are using a web site and you need real-time data within 0.3s, you've implemented on the wrong platform. It's as simple as that. In the web world, there are few applications where a 0.3s window is notable. After all, that 0.3s window can be anywhere within the system, including the web server, network, any front-end caches, DNS resolutions, etc. I tend to agree with Matthew. Granted, there are some application domains where this can be critical...generally speaking, web serving isn't one of them. That's why all of the solutions I offered were pointedly addressing a web server scenario and not a generalized database cache. I completely agree with you on that. In a generalized situation, the database should be managing and caching the data (which it already does). Greg
Re: [HACKERS] Again, sorry, caching.
Greg Copeland wrote: On Mon, 2002-03-18 at 08:15, mlw wrote: Matthew T. O'Connor wrote: [snip] The primary use that you have suggested is for web sites, and they certainly won't mind if the cache is 0.3 seconds out of date. Again, if they don't care about accuracy, then they will use MySQL. PostgreSQL is a far better system. Making PostgreSQL less accurate, less correct takes away, IMHO, the very reasons to use it. If you are using a web site and you need real-time data within 0.3s, you've implemented on the wrong platform. It's as simple as that. In the web world, there are few applications where a 0.3s window is notable. After all, that 0.3s window can be anywhere within the system, including the web server, network, any front-end caches, DNS resolutions, etc. This is totally wrong! An out-of-date cache can cause errors by returning results that are no longer valid, thus causing lookup issues. That is what ACID compliance is all about. I tend to agree with Matthew. Granted, there are some application domains where this can be critical...generally speaking, web serving isn't one of them. That's why all of the solutions I offered were pointedly addressing a web server scenario and not a generalized database cache. I completely agree with you on that. In a generalized situation, the database should be managing and caching the data (which it already does). But it does not cache a query. An expensive query which does an index range scan and filters by a where clause can evict a good number of buffers from the buffer cache. If this query, or a number of queries like it, is frequently repeated, verbatim, against a seldom-changed table, why not cache them within PostgreSQL? It would improve overall performance by preserving more blocks in the buffer cache and eliminate a number of queries being executed. I don't see how the usefulness of caching can be a matter of applicability. I can understand it from a time/work point of view, but to debate that it is a useful feature seems ludicrous.
Re: [HACKERS] Again, sorry, caching.
On Mon, 2002-03-18 at 10:08, mlw wrote: Greg Copeland wrote: On Mon, 2002-03-18 at 08:15, mlw wrote: Matthew T. O'Connor wrote: [snip] If you are using a web site and you need real-time data within 0.3s, you've implemented on the wrong platform. It's as simple as that. In the web world, there are few applications where a 0.3s window is notable. After all, that 0.3s window can be anywhere within the system, including the web server, network, any front-end caches, DNS resolutions, etc. This is totally wrong! An out-of-date cache can cause errors by returning results that are no longer valid, thus causing lookup issues. That is what ACID compliance is all about. I understand what ACID is about. Question: was the result set valid when it was cached? Yes. So will it be valid when it's returned as a cached result set? Yes. Might it be an out-of-date view? Sure...with a horribly small window for becoming out of date. Will it cause lookup problems? It might. No more than what you are proposing. In the meantime, the FE-cached result set, performance wise, is beating the pants off of the database-cached solution on both a specific work load and overall system performance. I should point out that once the FE cache has been notified that its cache is invalid, the FE would no longer return the invalidated result set. I consider that to be a given; however, from some of your comments I get the impression that you think the invalid result set would continue to be served. Another way of thinking about it is...it's really not any different from the notification acting as the returned result set...from a validity perspective. That is...if the notification had been the returned result set from the database...it would be accurate (which in this case means the FE cache is now dirty and treated as such)...if the query is refreshed because it is now invalid...the result set is once again accurate and reflective of the database. Example...
  Database cache:
    query result set
    result set returned (cached on database)
    local change to database (result set cache invalid)
    new query based on out-of-date result set

  Application cache:
    query result set (cached)
    result set returned
    local change to database (app cache invalid and signaled)
    new query based on out-of-date result set

Both have that problem, since transactional boundaries are hard to keep across HTTP requests. This, again, is why for web applications a FE cache is perfectly acceptable for *most* needs. Also notice that your margin for error is more or less the same. [snip] I don't see how the usefulness of caching can be a matter of applicability. I can understand it from a time/work point of view, but to debate that it is a useful feature seems ludicrous. I don't think I'm arguing whether it's applicable or useful. Rather, I'm saying that faster results can be yielded by implementing it in the client with far less effort than it would take to implement in the BE. I am arguing that its impact on overall system performance (though I really didn't do more than just touch on this topic) is questionable...granted, it may greatly enhance specific work loads...at the expense of others. Which shouldn't be too surprising, as trade-offs of some type are pretty common. At this point in time, I think we've both pretty well beat this topic up. Obviously there are two primary ways of viewing the situation. I don't think anyone is saying it's a bad idea...I think everyone is saying that it's easier to address elsewhere and that overall, the net returns may be at the expense of some other work loads. So, unless there are new pearls to be shared and gleaned, I think the topic's been fairly well addressed. Does more need to be said? Greg
Re: [HACKERS] Again, sorry, caching.
On Sat, 2002-03-16 at 09:01, mlw wrote: On a web site, a few specific queries get executed, unchanged, repeatedly. Think about an ecommerce site: most of the time it is just a handful of basic queries. These basic queries are usually against pretty large product tables. A caching mechanism would make these queries pretty lightweight. The arguments against caching: "It is an application issue." This is completely wrong. Caching cannot be done against a database without knowledge of the database, i.e. when the data changes. But can't this be achieved by using a LISTEN/NOTIFY model, with user-created rules to NOTIFY the appropriate listener when a table changes? With a good notification scheme like this, you don't need to continually poll the DB for changes. You don't need to teach your cache a lot of things about the database, since most of that knowledge is encapsulated inside the rules and supporting tables. My impression (I could be wrong) is that LISTEN/NOTIFY doesn't get the press that it deserves. If this model isn't widely used because of some deficiencies in the LISTEN/NOTIFY implementation, IMHO our time would be better spent fixing those problems than implementing the proposed caching scheme. If we're looking to provide a quick and easy caching scheme for users attracted to MySQL's query cache, why not provide this functionality through another application? I'm thinking of a generic caching layer that would sit in between Postgres and the database client. It could speak the FE/BE protocol as necessary; it would use LISTEN/NOTIFY to allow it to efficiently be aware of database changes; and it would create the necessary rules for the user, providing a simple interface for enabling query caching for a table or a set of tables. What does everyone think? OK, let me have it, tell me how terrible an idea this is. Tell me how wrong I am.
I think your goals are laudable (and I also appreciate the effort that you and everyone else puts into Postgres); I just think we could get most of the benefits without needing to implement potentially complex changes to Postgres internals. Cheers, Neil -- Neil Conway [EMAIL PROTECTED] PGP Key ID: DB3C29FC
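The invalidation side of the middle layer Neil describes might look roughly like the sketch below. This is a hedged illustration, not a design: in a real layer the `on_notify` calls would be driven by LISTEN/NOTIFY messages (one channel per table, raised by the user-created rules), while here they are plain method calls so the logic stands alone, and every name is invented.

```python
# Illustrative core of a caching layer between Postgres and its clients:
# cached result sets indexed both by query text and by the tables they read.
class CachingLayer:
    def __init__(self):
        self.by_query = {}   # query text -> cached rows
        self.by_table = {}   # table name -> set of query texts

    def cache(self, query, tables, rows):
        self.by_query[query] = rows
        for t in tables:
            self.by_table.setdefault(t, set()).add(query)

    def on_notify(self, channel):
        # A NOTIFY on a table's channel drops every query cached against it.
        for q in self.by_table.pop(channel, set()):
            self.by_query.pop(q, None)

    def get(self, query):
        # None means "cache miss: forward the query to the backend".
        return self.by_query.get(query)
```

The design choice worth noting is the reverse index `by_table`: it makes invalidation proportional to the number of queries touching the changed table, rather than a scan of the whole cache.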
Re: [HACKERS] Again, sorry, caching.
On Mon, 2002-03-18 at 20:35, Neil Conway wrote: [snip] My impression (I could be wrong) is that LISTEN/NOTIFY doesn't get the press that it deserves. If this model isn't widely used because of some deficiencies in the LISTEN/NOTIFY implementation, IMHO our time would be better spent fixing those problems than implementing the proposed caching scheme. If we're looking to provide a quick and easy caching scheme for users attracted to MySQL's query cache, why not provide this functionality through another application? I'm thinking of a generic caching layer that would sit in between Postgres and the database client. It could speak the FE/BE protocol as necessary; it would use LISTEN/NOTIFY to allow it to efficiently be aware of database changes; and it would create the necessary rules for the user, providing a simple interface for enabling query caching for a table or a set of tables. What does everyone think? Yes...I was thinking that a generic library interface with a nice design pattern might meet this need rather well. Done properly, I think we can make it so that all that would be needed, more or less, is application hooks which accept the result set to be cached and a mechanism to signal invalidation of the current cache...obviously that's not an exhaustive list... :) I haven't spent much time on this, but I'm fairly sure some library routines can be put together which would greatly reduce the effort needed by application coders to support FE data caches and still be portable, even for the Win32 port. Greg
Re: [HACKERS] Again, sorry, caching.
Andrew Sullivan wrote: On Sat, Mar 16, 2002 at 09:01:28AM -0500, mlw wrote: If it is mostly static data, why not just make it a static page? Because a static page is a maintenance nightmare. One uses a database in a web site to allow content to be changed and upgraded dynamically and with a minimum of work. This seems wrong to me. Why not build an extra bit of functionality so that when the admin makes a static-data change, the new static data gets pushed into the static files? I was originally intrigued by the suggestion you made, but the more I thought about it (and read the arguments of others) the more convinced I became that the MySQL approach is a mistake. It's probably worth it for their users, who seem not to care that much about ACID anyway. But I think for a system that really wants to play in the big leagues, the cache is a big feature that requires a lot of development, but which is not adequately useful for most cases. If we had infinite developer resources, it might be worth it. In the actual case, I think it's too low a priority. Again, I can't speak to priority, but I can name a few common applications where caching would be a great benefit. The more I think about it, the more I like the idea of a 'cacheable' keyword in the select statement. My big problem with putting the cache outside of the database is that it is now incumbent on the applications programmer to write a cache. A database should manage the data; the application should handle how the data is presented. Forcing the application to implement a cache feels wrong.
Re: [HACKERS] Again, sorry, caching.
Greg Copeland wrote: On Sat, 2002-03-16 at 08:36, mlw wrote: Triggers and asynchronous notification are not substitutes for real hard ACID compliant caching. The way you suggest implies only one access model. Take the notion of a library: they have both web and application access. These should both be able to use the cache. Well, obviously, you'd need to re-implement the client side cache in each implementation of the client. That is a down side and I certainly won't argue that. As for the no substitute comment, I guess I'll plead ignorance because I'm not sure what I'm missing here. What am I missing that would not be properly covered by that model? It would not be guaranteed to be up to date with the state of the database. By implementing the cache within the database, PostgreSQL could maintain the consistency. Also, your suggestion does not address the sub-select case, which I think is much bigger, performance wise, and more efficient than MySQL's cache. I'm really not sure what you mean by that. Doesn't address it but is more efficient? Maybe it's because I've not had my morning coffee yet... ;) If an internal caching system can be implemented within PostgreSQL, and trust me, I understand what a hairball it would be with multiversion concurrency, complex queries such as:

select * from (select * from mytable where foo = 'bar' cacheable) as subset where subset.col = 'value'

could benefit. The 'cacheable' keyword applied to the query would mean that PostgreSQL could keep that result set handy for later use. If that subselect against mytable always does a table scan, no one can argue that this subquery caching could be a huge win. As a side note, I REALLY like the idea of a keyword for caching as opposed to automated caching. It would allow the DBA or developer more control over PostgreSQL's behavior, and potentially make the feature easier to implement.
This whole discussion could be moot, and this could be developed as an extension, if there were a function API that could return sets of whole rows. Currently a function can only return one value, or a setof a single type implemented as one function call for each entry in the set. If there were a function interface which could return a row, and multiple rows similar to the 'setof' return, that would be very cool. That way caching could be implemented as:

select * from pgcache('select * from mytable where foo = ''bar''') as subset where subset.col = 'value';
Re: [HACKERS] Again, sorry, caching.
I think the notion that cached data must be managed outside of the database is bogus. Query caching can improve performance in some specific, but popular, scenarios. Saying it does not belong within the database and is the job of the application is like saying file caching is not a job of the file system but is the job of the application. This is a functionality many users want, and it can be justified by some very specific, but very common, scenarios. It is not for me to say if it is worth the work, or if it should be done, but I want to make the argument that, from the perspective of the user, having this capability within the database is an important feature. Greg Copeland wrote: I previously replied to you vaguely describing a way you could implement this by using a combination of client side caching and database tables and triggers to allow you to determine if your cache is still valid. Someone came right behind me, Tom maybe??, and indicated that the proper/ideal way to do this would be to use postgres' asynchronous database notification mechanisms (LISTEN/NOTIFY I believe were the semantics) to alert your application that your cache has become invalid. Basically, a couple of triggers and the use of the listen/notify model, and you should be all set. Done properly, a client side cache which is asynchronously notified by the database when its contents become invalid should be faster than relying on MySQL's database caching scheme. Basically, a strong client side cache is going to prevent your database from even having to return a cached result set, while a database side cache is going to always return a result set. Of course, one of the extra cool things you can do is to cache a gzip'd copy of the data contents, which would further act as an optimization, preventing the client or web server (in case they are different) from having to recompress every result set. In the long run, again, if properly done, you should be able to beat MySQL's implementation without too much extra effort. Why?
Because a client side cache can be much smarter in the way it uses its cached contents, much in the same way an application is able to cache its data better than the file system can. This is why a client side cache should be preferred over a database result set cache. Greg
Re: [HACKERS] Again, sorry, caching.
I previously replied to you vaguely describing a way you could implement this by using a combination of client side caching and database tables and triggers to allow you to determine if your cache is still valid. Someone came right behind me, Tom maybe??, and indicated that the proper/ideal way to do this would be to use postgres' asynchronous database notification mechanisms (LISTEN/NOTIFY I believe were the semantics) to alert your application that your cache has become invalid. Basically, a couple of triggers and the use of the listen/notify model, and you should be all set. Done properly, a client side cache which is asynchronously notified by the database when its contents become invalid should be faster than relying on MySQL's database caching scheme. Basically, a strong client side cache is going to prevent your database from even having to return a cached result set, while a database side cache is going to always return a result set. Of course, one of the extra cool things you can do is to cache a gzip'd copy of the data contents, which would further act as an optimization, preventing the client or web server (in case they are different) from having to recompress every result set. In the long run, again, if properly done, you should be able to beat MySQL's implementation without too much extra effort. Why? Because a client side cache can be much smarter in the way it uses its cached contents, much in the same way an application is able to cache its data better than the file system can. This is why a client side cache should be preferred over a database result set cache. Greg References: http://www.postgresql.org/idocs/index.php?sql-notify.html http://www.postgresql.org/idocs/index.php?sql-listen.html On Sat, 2002-03-16 at 08:01, mlw wrote: I traded a couple of emails with a guy using one of my open source projects. To make a long story short, he is going to the new version of MySQL for his website because of the new caching feature.
He is convinced that it will speed up his web site, and he is probably right. On a web site, a few specific queries get executed, unchanged, repeatedly. Think about an ecommerce site: most of the time it is just a handful of basic queries. These basic queries are usually against pretty large product tables. A caching mechanism would make these queries pretty lightweight. The arguments against caching:

"It is an application issue." This is completely wrong. Caching cannot be done against a database without knowledge of the database, i.e. when the data changes.

"If it is mostly static data, why not just make it a static page?" Because a static page is a maintenance nightmare. One uses a database in a web site to allow content to be changed and upgraded dynamically and with a minimum of work.

"It isn't very useful." I disagree completely. A cache of most frequently used queries, or specific ones, could make for REALLY good performance in some very specific, but very common, applications. Any system that has a hierarchical drill-down interface to a data set (ecommerce, libraries, document management systems, etc.) will greatly benefit from a query cache.

I was thinking that it could be implemented as a keyword or comment in a query. Such as:

select * from table where column = 'foo' cacheable

or

select * from table where column = 'bar' /* cacheable */

Either way, it would speed up a lot of common application types. It would even be very cool if you could just cache the results of sub queries, such as:

select * from (select * from table where col1 = 'foo' cacheable) as subset where subset.col2 = 'bar';

Which would mean that the subquery gets cached, but the greater select need not be. The cache could be like a global temp table.
Perhaps the user could even name the cache entry:

select * from table where column = 'foo' cache on foo

Where one could also do:

select * from cache_foo

Using a keyword is probably a better idea; it can be picked up by the parser and instruct PostgreSQL to use the cache, otherwise there will be no additional overhead. Having caching within PostgreSQL will be good for data integrity. Application caches can't tell when an update/delete/insert happens; they often have to use a time-out mechanism. OK, let me have it, tell me how terrible an idea this is. Tell me how wrong I am.
Re: [HACKERS] Again, sorry, caching.
Triggers and asynchronous notification are not substitutes for real hard ACID compliant caching. The way you suggest implies only one access model. Take the notion of a library: they have both web and application access. These should both be able to use the cache. Also, your suggestion does not address the sub-select case, which I think is much bigger, performance wise, and more efficient than MySQL's cache. This whole discussion could be moot, and this could be developed as an extension, if there were a function API that could return sets of whole rows. Greg Copeland wrote: I previously replied to you vaguely describing a way you could implement this by using a combination of client side caching and database tables and triggers to allow you to determine if your cache is still valid. Someone came right behind me, Tom maybe??, and indicated that the proper/ideal way to do this would be to use postgres' asynchronous database notification mechanisms (LISTEN/NOTIFY I believe were the semantics) to alert your application that your cache has become invalid. Basically, a couple of triggers and the use of the listen/notify model, and you should be all set. Done properly, a client side cache which is asynchronously notified by the database when its contents become invalid should be faster than relying on MySQL's database caching scheme. Basically, a strong client side cache is going to prevent your database from even having to return a cached result set, while a database side cache is going to always return a result set. Of course, one of the extra cool things you can do is to cache a gzip'd copy of the data contents, which would further act as an optimization, preventing the client or web server (in case they are different) from having to recompress every result set. In the long run, again, if properly done, you should be able to beat MySQL's implementation without too much extra effort. Why?
Because a client side cache can be much smarter in the way it uses its cached contents, much in the same way an application is able to cache its data better than the file system can. This is why a client side cache should be preferred over a database result set cache. Greg References: http://www.postgresql.org/idocs/index.php?sql-notify.html http://www.postgresql.org/idocs/index.php?sql-listen.html On Sat, 2002-03-16 at 08:01, mlw wrote: I traded a couple of emails with a guy using one of my open source projects. To make a long story short, he is going to the new version of MySQL for his website because of the new caching feature. He is convinced that it will speed up his web site, and he is probably right. On a web site, a few specific queries get executed, unchanged, repeatedly. Think about an ecommerce site: most of the time it is just a handful of basic queries. These basic queries are usually against pretty large product tables. A caching mechanism would make these queries pretty lightweight. The arguments against caching:

"It is an application issue." This is completely wrong. Caching cannot be done against a database without knowledge of the database, i.e. when the data changes.

"If it is mostly static data, why not just make it a static page?" Because a static page is a maintenance nightmare. One uses a database in a web site to allow content to be changed and upgraded dynamically and with a minimum of work.

"It isn't very useful." I disagree completely. A cache of most frequently used queries, or specific ones, could make for REALLY good performance in some very specific, but very common, applications. Any system that has a hierarchical drill-down interface to a data set (ecommerce, libraries, document management systems, etc.) will greatly benefit from a query cache.

I was thinking that it could be implemented as a keyword or comment in a query.
Such as:

select * from table where column = 'foo' cacheable

or

select * from table where column = 'bar' /* cacheable */

Either way, it would speed up a lot of common application types. It would even be very cool if you could just cache the results of sub queries, such as:

select * from (select * from table where col1 = 'foo' cacheable) as subset where subset.col2 = 'bar';

Which would mean that the subquery gets cached, but the greater select need not be. The cache could be like a global temp table. Perhaps the user could even name the cache entry:

select * from table where column = 'foo' cache on foo

Where one could also do:

select * from cache_foo

Using a keyword is probably a better idea; it can be picked up by the parser and instruct PostgreSQL to use the cache, otherwise there will be no additional overhead. Having caching within PostgreSQL will be good for data integrity. Application caches can't tell when an update/delete/insert happens; they often have to use a time-out mechanism.
Re: [HACKERS] Again, sorry, caching.
On Sat, 2002-03-16 at 08:01, mlw wrote: [snip] If it is mostly static data, why not just make it a static page? Because a static page is a maintenance nightmare. One uses a database in a web site to allow content to be changed and upgraded dynamically and with a minimum of work. Oh ya, I forgot to reply to that part. I think you are forgetting that you can use a database to generate a static page. That is, only regenerate the static page when the data within the database changes. Again, this is another example of efficient application caching. If you have an application which listens for your cache invalidation event, you can then recreate your static page. Again, database result set caching is not required. And again, this should be significantly faster than MySQL's result set caching. Also worth noting that you could then gzip your static page (keeping both static pages -- compressed and uncompressed), resulting in yet another optimization for most web servers and browsers. Greg
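The regenerate-on-invalidation idea above can be sketched as follows. The path and the `render` callable are hypothetical, and this is only the shape of the handler: when the invalidation event for a page fires, re-render it once from the database and keep a pre-compressed twin alongside it, so the web server serves static files and never recompresses.

```python
import gzip
from pathlib import Path

def regenerate_page(path, render):
    """Handler for a cache-invalidation event covering this page.

    `render` is whatever function re-queries the database and returns
    the page's HTML; it is only called when the underlying data changed.
    """
    html = render()
    page = Path(path)
    page.write_text(html)
    # Keep a gzip'd copy next to the page for clients that accept it.
    with gzip.open(str(page) + ".gz", "wt") as f:
        f.write(html)
    return html
```

Between invalidation events, neither the database nor the application does any per-request work for this page, which is the whole point of the argument above.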
Re: [HACKERS] Again, sorry, caching.
On Sat, 2002-03-16 at 08:36, mlw wrote: Triggers and asynchronous notification are not substitutes for real hard ACID compliant caching. The way you suggest implies only one access model. Take the notion of a library: they have both web and application access. These should both be able to use the cache. Well, obviously, you'd need to re-implement the client side cache in each implementation of the client. That is a down side and I certainly won't argue that. As for the no substitute comment, I guess I'll plead ignorance because I'm not sure what I'm missing here. What am I missing that would not be properly covered by that model? Also, your suggestion does not address the sub-select case, which I think is much bigger, performance wise, and more efficient than MySQL's cache. I'm really not sure what you mean by that. Doesn't address it but is more efficient? Maybe it's because I've not had my morning coffee yet... ;) This whole discussion could be moot, and this could be developed as an extension, if there were a function API that could return sets of whole rows. Maybe...but you did ask for feedback. :) Greg
Re: [HACKERS] Again, sorry, caching.
I was thinking that it could be implemented as a keyword or comment in a query. Such as:

select * from table where column = 'foo' cacheable

or

select * from table where column = 'bar' /* cacheable */

Having caching within PostgreSQL will be good for data integrity. Application caches can't tell when an update/delete/insert happens; they often have to use a time-out mechanism. OK, let me have it, tell me how terrible an idea this is. Tell me how wrong I am. I don't think it's a bad idea, but a cache that takes a query string (or subquery string) and looks for a match based on that is flawed without special consideration for non-cacheable functions and constructs (CURRENT_USER, things that depend on timezone, things that depend on datestyle). We'd also need to work out an appropriate mechanism for dealing with cache invalidation and adding things to the cache.
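As a toy illustration of that caveat: a cache keyed on query text would at minimum need a screening step before admitting a query. This is an assumption-laden sketch; a real implementation would inspect the parse tree and per-function cacheability flags rather than the raw string, and the pattern below only catches a few of the obvious session-dependent constructs named above.

```python
import re

# Constructs whose value depends on session state or the clock, so a
# result set keyed only on the query text would be wrong to reuse.
# Deliberately incomplete: timezone- and datestyle-sensitive casts,
# sequences, etc. cannot be caught reliably by text matching at all.
UNCACHEABLE = re.compile(
    r"\b(current_user|session_user|current_timestamp|current_date"
    r"|current_time|now\s*\()",
    re.IGNORECASE)

def is_cacheable(query_text):
    """Return True if no obviously non-cacheable construct appears."""
    return UNCACHEABLE.search(query_text) is None
```

The limits of this string-level check are exactly the point of the message above: only the parser knows whether `now` here is the SQL function or a column name, which is why the screening belongs inside the backend, not in a text-matching cache.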