Re: [HACKERS] Performance and WAL on big inserts/updates

2004-03-12 Thread Tom Lane
[EMAIL PROTECTED] writes: - Re uni-directional logs Of course. I forgot about PG's non-in-place update mechanisms and the use of VACUUM .. with versioning there is really no undo logging necessary. I guess that means that during VACUUM you might have to do significant work in indexes? I'm

[HACKERS] Performance and WAL on big inserts/updates

2004-03-11 Thread Marty Scholes
I combed the archives but could not find a discussion on this and am amazed this hasn't been discussed. My experience with Oracle (and now limited experience with Pg) is that the major choke point in performance is not the CPU or read I/O, it is the log performance of big update and select

Re: [HACKERS] Performance and WAL on big inserts/updates

2004-03-11 Thread Rod Taylor
If a transaction will do large updates or inserts, why don't we just log the parsed statements in the WAL instead of the individual data blocks? UPDATE table SET col = random();

Re: [HACKERS] Performance and WAL on big inserts/updates

2004-03-11 Thread Sailesh Krishnamurthy
"Marty" == Marty Scholes [EMAIL PROTECTED] writes: Marty> Why have I not seen this in any database? Marty> There must be a reason. For ARIES-style systems, logging parsed statements (commonly called logical logging) is not preferred compared to logging data items (physical or physiological

Re: [HACKERS] Performance and WAL on big inserts/updates

2004-03-11 Thread Tom Lane
Marty Scholes [EMAIL PROTECTED] writes: My experience with Oracle (and now limited experience with Pg) is that the major choke point in performance is not the CPU or read I/O, it is the log performance of big update and select statements. If your load is primarily big update statements,

Re: [HACKERS] Performance and WAL on big inserts/updates

2004-03-11 Thread Marty Scholes
I can see that and considered it. The seed state would need to be saved, or any particular command that is not reproducible would need to be exempted from this sort of logging. Again, this would apply only to situations where a small SQL command created huge changes. Marty Rod Taylor wrote:
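
To make the random() objection and the "save the seed state" answer concrete, here is a minimal sketch in Python. It is a toy model and not PostgreSQL code: run_update, saved_seed and the in-memory list standing in for a table are all hypothetical names made up for illustration.

```python
import random

def run_update(table, rng):
    """Toy model of UPDATE table SET col = random(): every row gets a fresh value."""
    return [rng.random() for _ in table]

table = [0.0] * 5

# Original execution. The seed stands in for the "seed state" that would
# have to be saved alongside the logged statement (an assumption of this sketch).
saved_seed = 1
original = run_update(table, random.Random(saved_seed))

# Physical-style logging: the resulting values are recorded, and replay
# simply copies them back.
physical_log = list(original)
physical_replay = list(physical_log)

# Statement logging replayed without the seed state: the statement runs
# again with a different generator state and produces different rows.
naive_replay = run_update(table, random.Random(2))

# Statement logging replayed with the saved seed state: same rows as before.
seeded_replay = run_update(table, random.Random(saved_seed))

print(physical_replay == original)  # value-level replay matches
print(naive_replay == original)     # statement replay without the seed does not
print(seeded_replay == original)    # statement replay with the seed matches
```

The same reasoning would apply to any other non-reproducible input (timestamps, sequence values), which is why the alternative of exempting such commands from statement logging comes up.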

Re: [HACKERS] Performance and WAL on big inserts/updates

2004-03-11 Thread Rod Taylor
On Thu, 2004-03-11 at 21:04, Marty Scholes wrote: I can see that and considered it. The seed state would need to be saved, or any particular command that is not reproducible would need to be exempted from this sort of logging. Again, this would apply only to situations where a small SQL

Re: [HACKERS] Performance and WAL on big inserts/updates

2004-03-11 Thread Marty Scholes
A major reason for this is that logical logs make recovery contingent on being able to execute the parsed statements. This execution might, however, not be possible if the system is itself not in a consistent state .. as is normally the case during recovery. I am not sure I follow you here.

Re: [HACKERS] Performance and WAL on big inserts/updates

2004-03-11 Thread Marty Scholes
If your load is primarily big update statements, maybe so... It is. Maybe we are anomalous here. I don't think I buy that claim. We don't normally fsync the log file except at transaction commit (and read-only transactions don't generate any commit record, so they don't cause an fsync).

Re: [HACKERS] Performance and WAL on big inserts/updates

2004-03-11 Thread Marty Scholes
Anyway, it really doesn't matter. You're trying to save a large amount of time that simply isn't spent in this area in PostgreSQL. fsync() happens once with commit -- and on a busy system, a single fsync call may be sufficient for a number of parallel backends. I think you may be right. I
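
The "one fsync may cover several backends" point can be pictured with a small Python sketch. This is only a toy model of the idea under stated assumptions; WalBuffer, commit and flush are hypothetical names and do not correspond to PostgreSQL internals.

```python
class WalBuffer:
    """Toy log buffer: commits queue up, one flush makes them all durable."""

    def __init__(self):
        self.pending = []     # commit records written but not yet synced
        self.durable = []     # commit records covered by a completed sync
        self.fsync_calls = 0  # how many (simulated) fsync calls were issued

    def commit(self, xid):
        # A backend appends its commit record to the shared buffer.
        self.pending.append(xid)

    def flush(self):
        # Nothing to sync (e.g. only read-only work happened): no fsync at all.
        if not self.pending:
            return
        # A single simulated fsync covers every commit record queued so far.
        self.fsync_calls += 1
        self.durable.extend(self.pending)
        self.pending.clear()

wal = WalBuffer()
for xid in (101, 102, 103):   # three backends commit at about the same time
    wal.commit(xid)
wal.flush()                   # one sync makes all three durable
wal.flush()                   # no pending commits, so no extra sync

print(wal.fsync_calls, wal.durable)
```

In this model the cost of the sync is paid per flush, not per row changed, which is the gist of the argument that the time is not being spent where the proposal assumes.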

Re: [HACKERS] Performance and WAL on big inserts/updates

2004-03-11 Thread Sailesh Krishnamurthy
(Just a note: my comments are not pg-specific .. indeed I don't know much about pg recovery). "Marty" == Marty Scholes [EMAIL PROTECTED] writes: Marty> If the DB state cannot be put back to a consistent state prior to a SQL statement in the log, then NO amount of logging

Re: [HACKERS] Performance and WAL on big inserts/updates

2004-03-11 Thread Marty Scholes
The point is that with redo logging, you can just blindly apply the log to the data pages in question, without even really restarting the database. I also am not a recovery expert, but I have watched it happen more than once. You bring up a good point. My (perhaps false) understanding with
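
As a rough illustration of "blindly apply the log to the data pages", here is a Python sketch of redo replay. It assumes an ARIES-style scheme in which each page remembers the log sequence number (LSN) of the last change applied to it; the record layout (lsn, page_id, slot, value) and the dict-based pages are hypothetical and chosen only for this example.

```python
def redo(pages, log):
    """Replay log records in order; skip changes a page already contains."""
    for lsn, page_id, slot, value in log:
        page = pages[page_id]
        # The page already reflects this record (it reached disk before the crash).
        if page["lsn"] >= lsn:
            continue
        # Otherwise apply the change without executing any SQL.
        page["slots"][slot] = value
        page["lsn"] = lsn
    return pages

# Page 1 was written out after record 2; page 2 never reached disk.
pages = {
    1: {"lsn": 2, "slots": {0: "b"}},
    2: {"lsn": 0, "slots": {}},
}
log = [
    (1, 1, 0, "a"),
    (2, 1, 0, "b"),
    (3, 1, 1, "c"),
    (4, 2, 0, "d"),
]

redo(pages, log)
redo(pages, log)  # running recovery twice leaves the pages unchanged

print(pages)
```

Nothing here depends on the catalog, the planner or the executor being usable, which is the contrast being drawn with replaying parsed statements.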

Re: [HACKERS] Performance and WAL on big inserts/updates

2004-03-11 Thread Tom Lane
Sailesh Krishnamurthy [EMAIL PROTECTED] writes: (Just a note: my comments are not pg-specific .. indeed I don't know much about pg recovery). ... BTW, logging raw datafile blocks would be pretty gross (physical logging) and so ARIES logs the changes to each tuple in logical fashion .. so if