Re: [HACKERS] Possible Commit Syntax Change for Improved TPS
Seun Osewa wrote:
> I observed that in many applications there are some transactions that are more critical than others. I may have the same database instance managing website visitor accounting and financial transactions. I could tolerate the loss of a few transactions whose only job is to tell me a user has clicked a page on my website but would not dare risk this for any of the real financial work my web-based app is doing.

It is possible to split the data over 2 database clusters: one which contains the important data (this cluster will be configured with fsync enabled), and a second one which contains the less important data (configured with fsync=off for speed reasons).

Cheers,
Adrian Maier ([EMAIL PROTECTED])

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Possible Commit Syntax Change for Improved TPS
On Thu, Oct 02, 2003 at 05:31:52AM -0700, Seun Osewa wrote:
> The beauty of the scheme is that the WAL syncs which sync everyone's changes so far would cost about the same as the WAL syncs for just one transaction being committed. But when there are so many transactions we would not have to sync the WAL so often.

In that case, why not go to a lazy policy in high-load situations, where subsequent commits are bundled up into a single physical write? Just hold up a commit until either there's a full buffer's worth of commits waiting to be written, or some timer says it's time to flush so the client doesn't wait too long.

It would increase per-client latency when viewed in isolation, but if it really improves throughput that much you might end up getting a faster response after all.

(BTW I haven't looked at the code involved so this may be completely wrong, impossible, and/or how it works already)

Jeroen
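The lazy policy described above can be sketched as a toy simulation. This is an illustration only and not PostgreSQL code: the class `GroupCommitLog`, the helper `simulate`, and the parameters `max_batch` and `max_wait_ms` are all hypothetical names, and time is modelled as plain integer milliseconds rather than a real clock.

```python
from dataclasses import dataclass, field


@dataclass
class GroupCommitLog:
    """Toy model: bundle waiting commits into one physical log write."""
    max_batch: int      # flush as soon as this many commits are waiting
    max_wait_ms: int    # or once the oldest waiting commit is this old
    pending: list = field(default_factory=list)   # arrival times of waiting commits
    flushes: list = field(default_factory=list)   # (flush_time_ms, commits_in_batch)

    def _flush(self, at_ms):
        # One physical write covers every commit that is currently waiting.
        self.flushes.append((at_ms, len(self.pending)))
        self.pending.clear()

    def tick(self, now_ms):
        """Timer check: flush if the oldest waiting commit hit its deadline."""
        if self.pending:
            deadline = self.pending[0] + self.max_wait_ms
            if now_ms >= deadline:
                self._flush(deadline)

    def commit(self, now_ms):
        """A client asks to commit at time now_ms."""
        self.tick(now_ms)              # the timer may have expired meanwhile
        self.pending.append(now_ms)
        if len(self.pending) >= self.max_batch:
            self._flush(now_ms)        # buffer is full: write immediately


def simulate(arrivals_ms, max_batch, max_wait_ms):
    """Feed commit arrival times through the log and return its flushes."""
    log = GroupCommitLog(max_batch, max_wait_ms)
    for t in arrivals_ms:
        log.commit(t)
    if log.pending:                    # let the timer drain the last batch
        log.tick(log.pending[0] + max_wait_ms)
    return log.flushes
```

With `simulate([0, 1, 2, 5, 20], max_batch=3, max_wait_ms=10)` the first three commits share one write, while the two stragglers each wait the full 10 ms before their own write. That is the trade-off named above: fewer physical writes under load, at the cost of extra latency for an isolated commit.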
[HACKERS] Possible Commit Syntax Change for Improved TPS
Hi,

I have been studying the basic limitation on the number of committed transactions per second possible in a relational database. Since each transaction requires at least its write-ahead log data to be flushed to disk, the upper bound on transactions per second is equal to the number of independent disk writes possible per second. Most of what I know is from the performance docs of PostgreSQL and MySQL.

It's often possible to increase the total transaction processing speed by turning off the compulsory disk syncing at each commit, which means that in the case of system failure some transactions may be lost *but* the database would still be consistent if we are careful to make sure the log is always written first.

I observed that in many applications there are some transactions that are more critical than others. I may have the same database instance managing website visitor accounting and financial transactions. I could tolerate the loss of a few transactions whose only job is to tell me a user has clicked a page on my website but would not dare risk this for any of the real financial work my web-based app is doing. Also, in the case of bulk inserts, or in some special cases, I might be able to code around the need for guaranteed *durability* on transaction commit as long as the database is consistent.

So I want to ask, what if databases have a 'COMMIT NOSYNC;' option? Then we can really improve transaction-per-second performance for a database that has lots of non-critical transactions while not jeopardising the durability of critical transactions in the (relatively unlikely) case of system failure. The gain would come primarily from combining the log updates for several non-critical transactions.

    COMMIT;        -- COMMIT SYNC; (guarantees atomic, consistent, durable write)
    COMMIT NOSYNC; -- (sacrifice durability of non-critical transaction for overall speed).

So, the question is what people, especially those who have done DBMS work, think about this!

Seun Osewa.
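A back-of-envelope sketch of the upper bound described above. It rests on an assumption that is not in the post: on a single rotating disk with no write cache, each synchronous log write waits for roughly one platter rotation, so the number of syncs per second is about rpm / 60. The function names and the `commits_per_sync` parameter are made up for illustration.

```python
def max_syncs_per_second(rpm):
    """Rough ceiling on synchronous log writes per second.

    Assumption: one platter rotation per sync, no battery-backed or
    volatile write cache absorbing the flush.
    """
    return rpm / 60.0


def max_tps(rpm, commits_per_sync=1):
    """Ceiling on committed transactions per second.

    commits_per_sync is the (hypothetical) number of commits whose log
    records are written by the same physical sync.
    """
    return max_syncs_per_second(rpm) * commits_per_sync


print(max_tps(7200))        # one sync per commit
print(max_tps(7200, 10))    # ten commits sharing each sync
```

Under that assumption the ceiling scales linearly with the number of commits that share each sync, which is where the hoped-for gain from combining log updates would come from.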
Re: [HACKERS] Possible Commit Syntax Change for Improved TPS
Hi Christopher,

Just to go through your points.

> > COMMIT NOSYNC; -- (sacrifice durability of non-critical transaction for overall speed).
> > So, the question is what people, especially those who have done DBMS work, think about this!
>
> I think that whenever my organization cares THAT much about performance, I'll probably be able to get enough budget to pay for a SCSI RAID card that has battery backed cache that makes that issue go away, as it allows the fsync() to become _nearly_ as fast as a no-op.

I agree, but I would not want to throw hardware at something that can be easily implemented in software. I think the functionality is in about every RDBMS today, just not under the database users' control.

> The case you suggest, where there are a lot of 'unimportant' transactions, seems of dubious likelihood. If some updates actually commit, why shouldn't others?

I feel that if people had the choice, they would feel free to use the DBMS for some functions they don't use it for now because of the limited update speeds without battery backup. For example, the Microsoft ASP.NET docs repeat that it's slower to use a database to manage visitor sessions. In many cases I can afford to risk forgetting information about the activity of a user (out of thousands) who visited a shopping site without ordering anything. The ASP.NET script would get to choose which COMMIT to use depending on a number of factors.

> And if the users know they can't really trust the COMMIT NOSYNC updates, won't it be tough to convince them to trust the really committed stuff?

Actually, I see it the other way round. The existence of COMMIT NOSYNC (faster, not durable in case of a crash) should remind users that the other COMMIT [SYNC], though slower, is durable.

> The battery backed cache idea winds up helping out _all_ updates, in a HUGE way. That seems the way to go. At least in part because having universal answers (e.g. - that helps ALL transactions) is likely to be simpler than having everything be a special case.

I think if database programmers have it, they will use it to optimize their applications. Aside from increased speed, there is the possibility that people will get to do some things they have just not been doing. I think it's a nice concept, which can be exploited for performance if implemented in an RDBMS.

Seun Osewa.
Re: [HACKERS] Possible Commit Syntax Change for Improved TPS
[EMAIL PROTECTED] (Tom Lane) wrote in message news:[EMAIL PROTECTED]...
> Christopher Browne [EMAIL PROTECTED] writes:
> > In the last exciting episode, [EMAIL PROTECTED] (Seun Osewa) wrote:
> > > So I want to ask, what if databases have a 'COMMIT NOSYNC;' option?
> >
> > Another possibility in this would be to have not one, but TWO backends. One database, on one port, is running in FSYNC mode, so that the really vital stuff is sure to get committed quickly. The other, on another port, has FSYNC turned off in its postgresql.conf file, and the set of untrusted files go there.
>
> They would have in fact to be two separate installations (not two databases under one postmaster). There is no way to make some transactions less safe than others in a single installation, because they're all hitting the same WAL log, and potentially modifying the same disk buffers to boot. Anyone's WAL sync therefore syncs everyone's changes-so-far.

The beauty of the scheme is that the WAL syncs which sync everyone's changes so far would cost about the same as the WAL syncs for just one transaction being committed. But when there are so many transactions we would not have to sync the WAL so often.

Seun Osewa
Re: [HACKERS] Possible Commit Syntax Change for Improved TPS
[EMAIL PROTECTED] (Seun Osewa) wrote:
> COMMIT;        -- COMMIT SYNC; (guarantees atomic, consistent, durable write)
> COMMIT NOSYNC; -- (sacrifice durability of non-critical transaction for overall speed).
>
> So, the question is what people, especially those who have done DBMS work, think about this!

I think that whenever my organization cares THAT much about performance, I'll probably be able to get enough budget to pay for a SCSI RAID card that has battery backed cache that makes that issue go away, as it allows the fsync() to become _nearly_ as fast as a no-op.

The case you suggest, where there are a lot of 'unimportant' transactions, seems of dubious likelihood. If some updates actually commit, why shouldn't others?

And if the users know they can't really trust the COMMIT NOSYNC updates, won't it be tough to convince them to trust the really committed stuff?

The battery backed cache idea winds up helping out _all_ updates, in a HUGE way. That seems the way to go. At least in part because having universal answers (e.g. - that helps ALL transactions) is likely to be simpler than having everything be a special case.

--
(reverse (concatenate 'string gro.gultn @ enworbbc))
http://www.ntlug.org/~cbbrowne/spiritual.html
This is Linux country. On a quiet night, you can hear NT re-boot.
Re: [HACKERS] Possible Commit Syntax Change for Improved TPS
In the last exciting episode, [EMAIL PROTECTED] (Seun Osewa) wrote:
> So I want to ask, what if databases have a 'COMMIT NOSYNC;' option? Then we can really improve transaction-per-second performance for a database that has lots of non-critical transactions while not jeopardising the durability of critical transactions in the (relatively unlikely) case of system failure. Primarily through combining the log updates for several non-critical transactions.

Another possibility in this would be to have not one, but TWO backends.

One database, on one port, is running in FSYNC mode, so that the really vital stuff is sure to get committed quickly.

The other, on another port, has FSYNC turned off in its postgresql.conf file, and the set of untrusted files go there.

That has the added merit that you can do other tuning that distinguishes between the important and unimportant data. For instance, if the unimportant stuff is a set of logs that don't get directly referred to, you might set caching real low on that backend so that cache isn't being wasted on unimportant data.

So if you really want this, you can have it right now without anyone doing any implementation work.

--
let name=aa454 and tld=freenet.carleton.ca in String.concat @ [name;tld];;
http://www.ntlug.org/~cbbrowne/internet.html
God is real unless declared integer.
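On the application side, the two-backend arrangement above might look roughly like this. It is a minimal sketch that assumes two separately installed servers, each with its own data directory and port. The hosts, ports, transaction kinds and the function `backend_for` are all hypothetical, and no database driver is involved.

```python
# Hypothetical connection settings for two separate installations.
BACKENDS = {
    "durable": {"host": "localhost", "port": 5432},  # fsync left enabled
    "fast": {"host": "localhost", "port": 5433},     # fsync off in its postgresql.conf
}

# Hypothetical list of transaction kinds the application can afford to lose.
NON_CRITICAL_KINDS = {"page_view", "session_touch"}


def backend_for(kind):
    """Return the connection settings a transaction of this kind should use."""
    # Only kinds explicitly marked as expendable go to the fsync-off server;
    # anything unknown defaults to the durable one.
    if kind in NON_CRITICAL_KINDS:
        return BACKENDS["fast"]
    return BACKENDS["durable"]


print(backend_for("page_view"))
print(backend_for("payment"))
```

Defaulting unknown kinds to the durable server is a deliberate choice in this sketch: a forgotten classification then costs speed rather than data.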
Re: [HACKERS] Possible Commit Syntax Change for Improved TPS
Christopher Browne [EMAIL PROTECTED] writes:
> In the last exciting episode, [EMAIL PROTECTED] (Seun Osewa) wrote:
> > So I want to ask, what if databases have a 'COMMIT NOSYNC;' option?
>
> Another possibility in this would be to have not one, but TWO backends. One database, on one port, is running in FSYNC mode, so that the really vital stuff is sure to get committed quickly. The other, on another port, has FSYNC turned off in its postgresql.conf file, and the set of untrusted files go there.

They would have in fact to be two separate installations (not two databases under one postmaster). There is no way to make some transactions less safe than others in a single installation, because they're all hitting the same WAL log, and potentially modifying the same disk buffers to boot. Anyone's WAL sync therefore syncs everyone's changes-so-far.

regards, tom lane
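The point that anyone's WAL sync syncs everyone's changes-so-far can be illustrated with a toy model of a single shared, append-only log. This is not how the PostgreSQL WAL is implemented. `SharedLog` and its methods are hypothetical names, and the "disk" is just a counter.

```python
class SharedLog:
    """Toy model of one append-only log shared by every transaction."""

    def __init__(self):
        self.records = []       # every transaction appends here, in order
        self.flushed_upto = 0   # number of records known to be on disk
        self.sync_count = 0     # how many physical syncs were issued

    def append(self, record):
        """Add a commit record; return the log position just past it."""
        self.records.append(record)
        return len(self.records)

    def sync(self):
        """One physical flush covers every record appended so far."""
        self.flushed_upto = len(self.records)
        self.sync_count += 1

    def is_durable(self, position):
        return position <= self.flushed_upto


log = SharedLog()
click_1 = log.append("page click 1")   # committed without a sync
click_2 = log.append("page click 2")   # committed without a sync
payment = log.append("payment")        # critical commit
log.sync()                             # the critical commit forces a flush
print(log.is_durable(click_1), log.is_durable(click_2), log.sync_count)
```

In this model the earlier non-critical records become durable as a side effect of the critical commit's sync, and a single sync serves all three commits.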