Re: [HACKERS] Possible Commit Syntax Change for Improved TPS

2003-10-08 Thread Adrian Maier
Seun Osewa wrote:
 I observed that in many applications there are some transactions
 that are more critical than others.  I may have the same database
 instance managing website visitor accounting and financial
 transactions.  I could tolerate the loss of a few transactions whose
 only job is to tell me a user has clicked a page on my website but
 would not dare risk this for any of the real financials work my
 web-based app is doing.

It is possible to split the data over 2 database clusters:
one which contains important data (this cluster will be configured
with fsync enabled), and a second one that contains the less
important data (configured with fsync=off for speed reasons).
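
A minimal sketch of how an application might route work between two such clusters. The connection strings, ports, and database names are hypothetical, chosen only for illustration; the only setting taken from the discussion is fsync being on for one cluster and off for the other.

```python
# Hypothetical connection strings: hosts, ports, and database names are
# assumptions made for this sketch, not values from the thread.
DURABLE_DSN = "host=localhost port=5432 dbname=financials"  # cluster with fsync = on
FAST_DSN = "host=localhost port=5433 dbname=clicklog"       # cluster with fsync = off


def choose_dsn(critical: bool) -> str:
    """Return the connection string of the cluster whose durability
    matches what the transaction needs."""
    return DURABLE_DSN if critical else FAST_DSN


if __name__ == "__main__":
    print(choose_dsn(critical=True))   # financial work goes to the fsync'd cluster
    print(choose_dsn(critical=False))  # page-click accounting goes to the fast one
```

One consequence of this layout is that a single transaction cannot atomically touch both kinds of data, since they live in separate clusters.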



Cheers,

Adrian Maier
([EMAIL PROTECTED])
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [HACKERS] Possible Commit Syntax Change for Improved TPS

2003-10-08 Thread Jeroen T. Vermeulen
On Thu, Oct 02, 2003 at 05:31:52AM -0700, Seun Osewa wrote:

 The beauty of the scheme is that the WAL syncs which sync everyone's
 changes so far would cost about the same as the WAL syncs for just
 one transaction being committed.  But when there are so many
 transactions we would not have to sync the WAL so often.

In that case, why not go to a lazy policy in high-load situations,
where subsequent commits are bundled up into a single physical write?
Just hold up a commit until either there's a full buffer's worth of 
commits waiting to be written, or some timer says it's time to flush
so the client doesn't wait too long.

It would increase per-client latency when viewed in isolation, but if
it really improves throughput that much you might end up getting a
faster response after all.
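
The lazy policy described above can be sketched as a toy model. This is not how any real database implements it; the class name, the batch size, and the timeout are assumptions, and time is passed in explicitly so the behaviour is deterministic.

```python
class CommitBatcher:
    """Toy model: hold commits until the batch is full or the oldest
    waiting commit has waited max_wait seconds, then do one flush."""

    def __init__(self, batch_size: int, max_wait: float):
        self.batch_size = batch_size
        self.max_wait = max_wait
        self.pending = []   # (txn_id, arrival_time) for commits not yet flushed
        self.flushes = []   # one list of txn ids per simulated physical write

    def _flush(self):
        # A single simulated physical write covers every pending commit.
        if self.pending:
            self.flushes.append([txn for txn, _ in self.pending])
            self.pending = []

    def tick(self, now: float):
        """Timer check: flush if the oldest pending commit waited long enough."""
        if self.pending and now - self.pending[0][1] >= self.max_wait:
            self._flush()

    def commit(self, txn_id, now: float):
        self.tick(now)
        self.pending.append((txn_id, now))
        if len(self.pending) >= self.batch_size:
            self._flush()


if __name__ == "__main__":
    batcher = CommitBatcher(batch_size=3, max_wait=0.010)
    for txn_id, arrival in enumerate([0.000, 0.001, 0.002, 0.003, 0.020]):
        batcher.commit(txn_id, arrival)
    batcher.tick(0.040)
    print(batcher.flushes)  # [[0, 1, 2], [3], [4]]
```

In this run five commits cost three physical writes: the first three share one write because the batch filled up, while the last two are each flushed by the timer.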

(BTW I haven't looked at the code involved so this may be completely
wrong, impossible, and/or how it works already)


Jeroen




[HACKERS] Possible Commit Syntax Change for Improved TPS

2003-10-07 Thread Seun Osewa
Hi,

I have been studying the basic limit on the number of committed
transactions per second possible in a relational database.  Since
each transaction requires at least its write-ahead log data to be
flushed to disk, the upper bound on transactions per second is equal
to the number of independent disk writes possible per second.  Most of
what I know is from the performance docs of PostgreSQL and MySQL.
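
As a rough worked example of that bound, assume a single disk with no write cache, where each commit's log flush has to wait for its own platter rotation. That assumption is an illustrative model, not something stated in the docs mentioned above.

```python
def rotation_time_ms(rpm: float) -> float:
    """Time for one full platter rotation, in milliseconds."""
    return 60_000.0 / rpm


def max_synced_commits_per_second(rpm: float) -> float:
    """Upper bound on commits per second if every commit waits for
    its own rotation (the assumption of this sketch)."""
    return rpm / 60.0


if __name__ == "__main__":
    for rpm in (7200, 15000):
        print(rpm, max_synced_commits_per_second(rpm))
    # 7200 120.0
    # 15000 250.0
```

Under this model a 7200 RPM disk tops out at about 120 individually synced commits per second, however fast the CPU is.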

It's often possible to increase the total transaction processing speed
by turning off the compulsory disk syncing at each commit, which means
that in the case of system failure some transactions may be lost, *but*
the database would still be consistent if we are careful to make sure
the log is always written first.

I observed that in many applications there are some transactions
that are more critical than others.  I may have the same database
instance managing website visitor accounting and financial
transactions.  I could tolerate the loss of a few transactions whose
only job is to tell me a user has clicked a page on my website but
would not dare risk this for any of the real financials work my
web-based app is doing.

In the case of bulk inserts, also, or in some special cases I might be
able to code around the need for guaranteed *durability* on
transaction commit as long as the database is consistent.

So I want to ask, what if databases have a 'COMMIT NOSYNC;' option?
Then we can really improve transaction-per-second performance for a
database that has lots of non-critical transactions while not
jeopardising the durability of critical transactions in the
(relatively unlikely) case of system failure.  Primarily through
combining the log updates for several non-critical transactions.

COMMIT;         -- same as COMMIT SYNC; guarantees an atomic, consistent,
                -- durable write
COMMIT NOSYNC;  -- sacrifices durability of a non-critical transaction
                -- for overall speed

So, the question is what people, especially those who have done DBMS
work, think about this!
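
To make the intended saving concrete, here is a toy model of a log that counts how many syncs a workload needs. The class and its `sync` flag are hypothetical stand-ins for the proposed COMMIT SYNC / COMMIT NOSYNC; no existing database API is being described.

```python
class ToyWAL:
    """Toy write-ahead log: every commit appends a record, but only
    sync commits force the log to disk."""

    def __init__(self):
        self.records = []       # log records written so far, in order
        self.flushed_upto = 0   # number of records known to be on disk
        self.fsync_calls = 0    # how many simulated disk syncs were done

    def commit(self, txn_id, sync=True):
        self.records.append(txn_id)  # the log record is always written first
        if sync:
            # One sync makes every record written so far durable.
            self.fsync_calls += 1
            self.flushed_upto = len(self.records)


if __name__ == "__main__":
    wal = ToyWAL()
    for i in range(9):
        wal.commit(f"click-{i}", sync=False)  # non-critical page clicks
    wal.commit("payment-1", sync=True)        # critical transaction
    print(wal.fsync_calls, wal.flushed_upto)  # 1 10
```

Ten commits cost one sync here, where ten plain COMMITs would cost ten in the same model.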

Seun Osewa.



Re: [HACKERS] Possible Commit Syntax Change for Improved TPS

2003-10-07 Thread Seun Osewa
Hi Christopher,

Just to go through your points.

  COMMIT NOSYNC; -- (sacrifice durability of non-critical transaction
  for overall speed).  So, the question is what people, especially those
  who have done DBMS work, think about this!
 I think that whenever my organization cares THAT much about
 performance, I'll probably be able to get enough budget to pay for a
 SCSI RAID card that has battery backed cache that makes that issue go
 away, as it allows the fsync() to become _nearly_ as fast as a no-op.
I agree, but I would not want to throw hardware at something that can be
easily implemented in software. I think the functionality is in
just about every RDBMS today, just not under the database users' control.

 The case you suggest, where there are a lot of 'unimportant'
 transactions, seems of dubious likelihood.  If some updates actually
 commit, why shouldn't others?  
I feel that if people have the choice, they would feel free to use the DBMS
for some functions they don't use it for now because of the limited update
speeds without battery backup.  For example, the Microsoft ASP.NET docs
repeat that it's slower to use a database to manage visitor sessions.  In
many cases I can afford to risk forgetting information about the
activity of a user (out of thousands) who visited a shopping site without
ordering anything.  The ASP.NET script would get to choose which COMMIT
to use depending on a number of factors.

 And if the users know they can't
 really trust the COMMIT NOSYNC updates, won't it be tough to
 convince them to trust the really committed stuff?
Actually, I see it the other way round.  The existence of
COMMIT NOSYNC (faster, not durable in case of crash)
should remind users that the other COMMIT [SYNC], though
slower, is durable.

 The battery backed cache idea winds up helping out _all_ updates, in a
 HUGE way.  That seems the way to go.  At least in part because having
 universal answers (e.g. - that helps ALL transactions) is likely to be
 simpler than having everything be a special case.
I think if database programmers have it,
they will use it to optimize their applications.
Aside from increased speed, there is the possibility that people
will get to do some things they have simply not been
doing.  I think it's a nice concept, which can be exploited
for performance if implemented in an RDBMS.

Seun Osewa.



Re: [HACKERS] Possible Commit Syntax Change for Improved TPS

2003-10-07 Thread Seun Osewa
[EMAIL PROTECTED] (Tom Lane) wrote in message news:[EMAIL PROTECTED]...
 Christopher Browne [EMAIL PROTECTED] writes:
  In the last exciting episode, [EMAIL PROTECTED] (Seun Osewa) wrote:
  So I want to ask, what if databases have a 'COMMIT NOSYNC;' option? 
  Another possibility in this would be to have not one, but TWO
  backends.  
  One database, on one port, is running in FSYNC mode, so that the
  really vital stuff is sure to get committed quickly.  The other, on
  another port, has FSYNC turned off in its postgresql.conf file, and
  the set of untrusted files go there.
 They would have in fact to be two separate installations (not two
 databases under one postmaster).  There is no way to make some
 transactions less safe than others in a single installation, because
 they're all hitting the same WAL log, and potentially modifying the
 same disk buffers to boot.  Anyone's WAL sync therefore syncs everyone's
 changes-so-far.
The beauty of the scheme is that the WAL syncs which sync everyone's
changes so far would cost about the same as the WAL syncs for just
one transaction being committed.  But when there are so many
transactions we would not have to sync the WAL so often.

Seun Osewa



Re: [HACKERS] Possible Commit Syntax Change for Improved TPS

2003-09-30 Thread Christopher Browne
[EMAIL PROTECTED] (Seun Osewa) wrote:
 COMMIT; -- COMMIT SYNC; (guarantees atomic, consistent, durable
 write)
 COMMIT NOSYNC; -- (sacrifice durability of non-critical transaction
 for overall speed).  So, the question is what people, especially those
 who have done DBMS work, think about this!

I think that whenever my organization cares THAT much about
performance, I'll probably be able to get enough budget to pay for a
SCSI RAID card that has battery backed cache that makes that issue go
away, as it allows the fsync() to become _nearly_ as fast as a no-op.

The case you suggest, where there are a lot of 'unimportant'
transactions, seems of dubious likelihood.  If some updates actually
commit, why shouldn't others?  And if the users know they can't
really trust the COMMIT NOSYNC updates, won't it be tough to
convince them to trust the really committed stuff?

The battery backed cache idea winds up helping out _all_ updates, in a
HUGE way.  That seems the way to go.  At least in part because having
universal answers (e.g. - that helps ALL transactions) is likely to be
simpler than having everything be a special case.
-- 
(reverse (concatenate 'string gro.gultn @ enworbbc))
http://www.ntlug.org/~cbbrowne/spiritual.html
This is Linux country.  On a quiet night, you can hear NT re-boot.



Re: [HACKERS] Possible Commit Syntax Change for Improved TPS

2003-09-30 Thread Christopher Browne
In the last exciting episode, [EMAIL PROTECTED] (Seun Osewa) wrote:
 So I want to ask, what if databases have a 'COMMIT NOSYNC;' option?
 Then we can really improve transaction-per-second performance for a
 database that has lots of non-critical transactions while not
 jeopardising the durability of critical transactions in the
 (relatively unlikely) case of system failure.  Primarily through
 combining the log updates for several non-critical transactions.

Another possibility in this would be to have not one, but TWO
backends.  

One database, on one port, is running in FSYNC mode, so that the
really vital stuff is sure to get committed quickly.  The other, on
another port, has FSYNC turned off in its postgresql.conf file, and
the set of untrusted files go there.

That has the added merit that you can do other tuning that
distinguishes between the important and unimportant data.  For
instance, if the unimportant stuff is a set of logs that don't get
directly referred to, you might set caching real low on that backend
so that cache isn't being wasted on unimportant data.

So if you really want this, you can have it right now without anyone
doing any implementation work.
-- 
let name=aa454 and tld=freenet.carleton.ca in String.concat @ [name;tld];;
http://www.ntlug.org/~cbbrowne/internet.html
God is real unless declared integer.



Re: [HACKERS] Possible Commit Syntax Change for Improved TPS

2003-09-30 Thread Tom Lane
Christopher Browne [EMAIL PROTECTED] writes:
 In the last exciting episode, [EMAIL PROTECTED] (Seun Osewa) wrote:
 So I want to ask, what if databases have a 'COMMIT NOSYNC;' option?

 Another possibility in this would be to have not one, but TWO
 backends.  
 One database, on one port, is running in FSYNC mode, so that the
 really vital stuff is sure to get committed quickly.  The other, on
 another port, has FSYNC turned off in its postgresql.conf file, and
 the set of untrusted files go there.

They would have in fact to be two separate installations (not two
databases under one postmaster).  There is no way to make some
transactions less safe than others in a single installation, because
they're all hitting the same WAL log, and potentially modifying the
same disk buffers to boot.  Anyone's WAL sync therefore syncs everyone's
changes-so-far.
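
The shared-log point can be illustrated with a small model: since all transactions append to one log, a sync issued for any of them also covers every earlier record. The function below is a hypothetical sketch that assumes nothing reaches disk except through an explicit sync; a real system may write more than that in the background.

```python
def durable_after_crash(commits):
    """Given (txn_id, sync) pairs in log order, return the txn ids that are
    guaranteed to be on disk if the system crashes right after the last one.

    Assumption of this sketch: the log reaches disk only when a commit
    with sync=True is issued, and that sync covers all earlier records."""
    last_sync = -1
    for position, (_, sync) in enumerate(commits):
        if sync:
            last_sync = position
    return [txn_id for txn_id, _ in commits[: last_sync + 1]]


if __name__ == "__main__":
    log = [("click-1", False), ("payment-1", True), ("click-2", False)]
    print(durable_after_crash(log))  # ['click-1', 'payment-1']
```

The unsynced "click-1" becomes durable as a side effect of the payment's sync, while "click-2", written after the last sync, is the only record at risk.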

regards, tom lane
