Re: [HACKERS] postmaster.exe vs postgres.exe (was: CVS HEAD busted on

2006-06-23 Thread Bruce Momjian
Tom Lane wrote: Dave Page dpage@vale-housing.co.uk writes: though - Magnus I were wondering if Peter's change means we no longer need to ship postmaster.exe and postgres.exe with pgInstaller. Presumably we can just use postgres.exe for everything now? Won't we still need to know if

Re: [HACKERS] [CORE] GPL Source and Copyright Questions

2006-06-23 Thread Dave Page
-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Andrew Dunstan Sent: 22 June 2006 23:09 To: Tom Lane Cc: Bort, Paul; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] [CORE] GPL Source and Copyright Questions Something has broken

Re: [HACKERS] postmaster.exe vs postgres.exe (was: CVS HEAD busted on Windows?)

2006-06-23 Thread Dave Page
-Original Message- From: Bruce Momjian [mailto:[EMAIL PROTECTED] Sent: 23 June 2006 07:09 To: Tom Lane Cc: Dave Page; Andrew Dunstan; Peter Eisentraut; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] postmaster.exe vs postgres.exe (was: CVS HEAD busted on Windows?)

Re: [HACKERS] [CORE] GPL Source and Copyright Questions

2006-06-23 Thread Magnus Hagander
Andrew Dunstan [EMAIL PROTECTED] writes: Frankly this patch has significant infelicities. For example, what is the reason for removing the standard protection against double inclusion that header files should usually have from pg_config.h.win32? I've got to admit, I don't recall that. It

[HACKERS] Shared library conflicts

2006-06-23 Thread Thomas Hallgren
There was a discussion some time back concerning the linking of the postgres backend. Some libraries where linked although they where not needed, mainly because it was convenient. I had a problem with PL/Java since a Sun JVM ships with their own version of libz.so (they call it libzip.so). Sun

Re: [HACKERS] [CORE] GPL Source and Copyright Questions

2006-06-23 Thread Michael Meskes
On Thu, Jun 22, 2006 at 11:37:08AM -0400, Tom Lane wrote: item #3: Carsten Wolff copyright in informix.c file The file informix.c contains a copyright from Carsten Wolff. Did Carsten directly contribute this file to the PostgreSQL project? This code was added by Michael Meskes in

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Csaba Nagy
Good advice, except if the table is huge :-) ... Then the table shouldn't be designed to be huge. That represents a design error. [snip] This demonstrates that archival material and active data should be kept separately. They have different access patterns; kludging them into the same

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread PFC
Well, then please help me find a better design cause I can't see one... what we have here is a big membership table of email lists. When there's a sendout then the memberships of the affected group are heavily read/updated, otherwise they are idle. None of the memberships is archive data, they

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Csaba Nagy
I suppose you have a table memberships (user_id, group_id) or something like it ; it should have as few columns as possible ; then try regularly clustering on group_id (maybe once a week) so that all the records for a particular group are close together. Getting the members of a

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Luke Lonergan
Csaba, Regularly clustering is out of question as it would render the system unusable for hours. There's no 0 activity hour we could use for such stuff. There's always something happening, only the overall load is smaller at night... We are planning to implement a btree organized table,

Re: [HACKERS] Shared library conflicts

2006-06-23 Thread William ZHANG
Thomas Hallgren [EMAIL PROTECTED] Another related question. What happens when I use --without-zlib? Does it have any effect on besides disabling compression for the dump/restore utilities? Is there anyway to make it affect the backend only? --without-zlib will affected LIBS and HAVE_LIBZ

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Mark Woodward
The example is a very active web site, the flow is this: query for session information process HTTP request update session information This happens for EVERY http request. Chances are that you won't have concurrent requests for the same row, but you may have well over 100 HTTP server

Re: [HACKERS] Shared library conflicts

2006-06-23 Thread Martijn van Oosterhout
On Fri, Jun 23, 2006 at 10:45:29AM +0200, Thomas Hallgren wrote: There was a discussion some time back concerning the linking of the postgres backend. Some libraries where linked although they where not needed, mainly because it was convenient. AIUI, this was fixed in -HEAD. In the Makefile

Re: [HACKERS] Shared library conflicts

2006-06-23 Thread Thomas Hallgren
That's good news. Any chance of getting this fix backported to 8.1? Or at least, the libz part of it? Regards, Thomas Hallgren Martijn van Oosterhout wrote: On Fri, Jun 23, 2006 at 10:45:29AM +0200, Thomas Hallgren wrote: There was a discussion some time back concerning the linking of the

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Mark Woodward
I suppose you have a table memberships (user_id, group_id) or something like it ; it should have as few columns as possible ; then try regularly clustering on group_id (maybe once a week) so that all the records for a particular group are close together. Getting the members of a group to

Re: [HACKERS] xlog viewer proposal

2006-06-23 Thread Simon Riggs
On Thu, 2006-06-22 at 14:57 -0300, Diogo Biazus wrote: Agree, the project must choose one path as the starting point. But the two options can be given in the long run. I'm acting as Diogo's mentor for the SoC, so I'm trying to let Diogo discuss his ideas in the community manner without too much

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Csaba Nagy
Let me ask a question, you have this hundred million row table. OK, how much of that table is read/write? Would it be posible to divide the table into two (or more) tables where one is basically static, only infrequent inserts and deletes, and the other is highly updated? Well, all of it is

Re: [HACKERS] [CORE] GPL Source and Copyright Questions

2006-06-23 Thread Andrew Dunstan
Dave Page wrote: Is there any real reason to continue to support Cygwin? We've always said it's not a first class port, and now we have the native port which is it seems somewhat pointless expending further effort on it. Some people still use it for development, I believe. Similar

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Zeugswetter Andreas DCP SD
back and forth the data between an archive table and the live table, based on how active the groups are, I can't imagine any other way of partitioning it. And that would also mean some quite big load given the pretty high dynamics of the groups. You said the activity comes in bursts per

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Mark Woodward
Let me ask a question, you have this hundred million row table. OK, how much of that table is read/write? Would it be posible to divide the table into two (or more) tables where one is basically static, only infrequent inserts and deletes, and the other is highly updated? Well, all of it is

[HACKERS] Planning without reason.

2006-06-23 Thread Tzahi Fadida
Hi, I think there is a bug/misscalculation of some rare query i am using. Suppose we only query one specific relation R. R contains indices but not on all attributes or not on all ordered subset of keys. Query example: (SELECT * FROM R WHERE a=3, b=6,. ...) UNION (SELECT * FROM R WHERE b=5,

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Csaba Nagy
You said the activity comes in bursts per group, so the obvious partitioning would be per group. If you have too many groups to have one partition per group you could try to find some modulo or other rule to spread them into separate partitions. This could be a solution... but then I'm not

Re: [HACKERS] Planning without reason.

2006-06-23 Thread Martijn van Oosterhout
On Fri, Jun 23, 2006 at 03:57:19PM +0300, Tzahi Fadida wrote: R contains indices but not on all attributes or not on all ordered subset of keys. Query example: (SELECT * FROM R WHERE a=3, b=6,. ...) UNION (SELECT * FROM R WHERE b=5, d=2,. ...) UNION And lots of unions. Do you

Re: [HACKERS] Overhead for stats_command_string et al, take 2

2006-06-23 Thread Alvaro Herrera
Tom Lane wrote: I redid my previous measurements after finishing up the weekend's hacking. The numbers shown below are elapsed time in seconds for time psql -f testfile.sql postgres /dev/null Average of 5 runs, for the first two cases, on the x86 machine that shows high overhead in

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Csaba Nagy
Well, the only thing left is to cluster the database. There are a couple ways to do this, one switch to a platform that supports clustering or create an API to wrap multiple databases. If your queries are simple and limited, you could create an HTTP/XML service that wraps a number of

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Hannu Krosing
Ühel kenal päeval, N, 2006-06-22 kell 12:41, kirjutas Mark Woodward: Depending on exact details and optimisations done, this can be either slower or faster than postgresql's way, but they still need to do something to get transactional visibility rules implemented. I think they have a

Re: [HACKERS] [CORE] GPL Source and Copyright Questions

2006-06-23 Thread John DeSoi
On Jun 23, 2006, at 3:10 AM, Dave Page wrote: Is there any real reason to continue to support Cygwin? We've always said it's not a first class port, and now we have the native port which is it seems somewhat pointless expending further effort on it. Are all the tools needed to compile

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Martijn van Oosterhout
On Fri, Jun 23, 2006 at 03:30:45PM +0200, Csaba Nagy wrote: It sounds like you have a big problem and you need a big solution. Well, Postgres does a decent job as it is. The problem is under peek load, sometimes it gets bogged down and the usual things like vacuum will not help immediately.

Re: [HACKERS] [CORE] GPL Source and Copyright Questions

2006-06-23 Thread Tom Lane
Michael Meskes [EMAIL PROTECTED] writes: On Thu, Jun 22, 2006 at 11:37:08AM -0400, Tom Lane wrote: item #3: Carsten Wolff copyright in informix.c file The file informix.c contains a copyright from Carsten Wolff. Did Carsten directly contribute this file to the PostgreSQL project? This

Re: [HACKERS] xlog viewer proposal

2006-06-23 Thread Diogo Biazus
On 6/23/06, Simon Riggs [EMAIL PROTECTED] wrote: - give more flexibility for managing the xlogs remotelyNot sure what you mean. - I think it's faster to implement and to have a working and usable tool.Why do you think that? It sounds like you've got more work since you effectively need to

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Alvaro Herrera
Csaba Nagy wrote: This could be a solution... but then I'm not sure how well would do queries which need the first 10 records based on some criteria which does not include the group id. I guess limit queries across the union of the partitions don't work too well for now, and we do have such

Re: [HACKERS] xlog viewer proposal

2006-06-23 Thread Diogo Biazus
On 6/23/06, Diogo Biazus [EMAIL PROTECTED] wrote: On 6/23/06, Simon Riggs [EMAIL PROTECTED] wrote: - give more flexibility for managing the xlogs remotelyNot sure what you mean.I can connect to the server if I want to query xlogs in a remote machine. If i depend on a standalone tool that reads

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Jonah H. Harris
On 6/22/06, Alvaro Herrera [EMAIL PROTECTED] wrote: Not in all systems. A few now perform in-memory UNDO and only write it to disk if and when it is required. How does that work? If the last transaction is not finished after it wrote the tuple when the power goes out, and the UNDO is not

Re: [HACKERS] Full Disjunction

2006-06-23 Thread Jonah H. Harris
On 6/23/06, Christopher Kings-Lynne [EMAIL PROTECTED] wrote: What IS this full disjunction business? Tzahi Fadida is working on a Summer of Code project to implement a contrib module which can perform full disjunctions within PostgreSQL. It's pretty cool. -- Jonah H. Harris, Software

Re: [HACKERS] Planning without reason.

2006-06-23 Thread Tzahi Fadida
On Friday 23 June 2006 16:14, Martijn van Oosterhout wrote: On Fri, Jun 23, 2006 at 03:57:19PM +0300, Tzahi Fadida wrote: R contains indices but not on all attributes or not on all ordered subset of keys. Query example: (SELECT * FROM R WHERE a=3, b=6,. ...) UNION (SELECT * FROM R

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Mark Woodward
Ühel kenal päeval, N, 2006-06-22 kell 12:41, kirjutas Mark Woodward: Depending on exact details and optimisations done, this can be either slower or faster than postgresql's way, but they still need to do something to get transactional visibility rules implemented. I think they have a

Re: [HACKERS] [CORE] GPL Source and Copyright Questions

2006-06-23 Thread Dave Page
-Original Message- From: John DeSoi [mailto:[EMAIL PROTECTED] Sent: 23 June 2006 14:56 To: Dave Page Cc: Andrew Dunstan; Tom Lane; Bort, Paul; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] [CORE] GPL Source and Copyright Questions On Jun 23, 2006, at 3:10 AM, Dave Page

Re: [HACKERS] [CORE] GPL Source and Copyright Questions

2006-06-23 Thread Tom Lane
Dave Page dpage@vale-housing.co.uk writes: Is there any real reason to continue to support Cygwin? We've always said it's not a first class port, and now we have the native port which is it seems somewhat pointless expending further effort on it. I think the day will come when there's a good

Re: [HACKERS] [CORE] GPL Source and Copyright Questions

2006-06-23 Thread Dave Page
-Original Message- From: Tom Lane [mailto:[EMAIL PROTECTED] Sent: 23 June 2006 15:15 To: Dave Page Cc: Andrew Dunstan; Bort, Paul; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] [CORE] GPL Source and Copyright Questions Actually, my gripe about this one is that it wasn't

Re: [HACKERS] xlog viewer proposal

2006-06-23 Thread Simon Riggs
On Fri, 2006-06-23 at 11:03 -0300, Diogo Biazus wrote: On 6/23/06, Diogo Biazus [EMAIL PROTECTED] wrote: On 6/23/06, Simon Riggs [EMAIL PROTECTED] wrote: - give more flexibility for managing the xlogs remotely Not

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Alvaro Herrera
Mark Woodward wrote: In case of the number of actively modified rows being in only tens or low hundreds of thousands of rows, (i.e. the modified set fits in memory) the continuous vacuum process shows up as just another backend, not really taking order of magnitude more resources. It

Anyone still care about Cygwin? (was Re: [HACKERS] [CORE] GPL Source and Copyright Questions)

2006-06-23 Thread Tom Lane
Dave Page dpage@vale-housing.co.uk writes: Actually, my gripe about this one is that it wasn't detected promptly. That patch went in two weeks ago; we should have known about the problem within a couple days at most. Seems like the Windows members of the buildfarm don't run often enough.

Re: [HACKERS] xlog viewer proposal

2006-06-23 Thread Simon Riggs
On Fri, 2006-06-23 at 10:59 -0300, Diogo Biazus wrote: On 6/23/06, Simon Riggs [EMAIL PROTECTED] wrote: - give more flexibility for managing the xlogs remotely Not sure what you mean. - I think it's faster to implement and to have a working and

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread A.M.
On Fri, June 23, 2006 9:56 am, Martijn van Oosterhout wrote: On Fri, Jun 23, 2006 at 03:30:45PM +0200, Csaba Nagy wrote: It sounds like you have a big problem and you need a big solution. Well, Postgres does a decent job as it is. The problem is under peek load, sometimes it gets bogged

Re: Anyone still care about Cygwin? (was Re: [HACKERS] [CORE] GPL Source and Copyright Questions)

2006-06-23 Thread Larry Rosenman
Tom Lane wrote: Dave Page dpage@vale-housing.co.uk writes: Actually, my gripe about this one is that it wasn't detected promptly. That patch went in two weeks ago; we should have known about the problem within a couple days at most. Seems like the Windows members of the buildfarm don't

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Mark Woodward
Mark Woodward wrote: In case of the number of actively modified rows being in only tens or low hundreds of thousands of rows, (i.e. the modified set fits in memory) the continuous vacuum process shows up as just another backend, not really taking order of magnitude more resources. It

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Csaba Nagy
These sort of solutions, IMHO, don't show how good PostgreSQL is, but show where it is very lacking. We all know Postgres is lacking; some of us try to improve it (some with more success than others). People who know the current limitations but like the capabilities, try to find

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Zeugswetter Andreas DCP SD
This could be a solution... but then I'm not sure how well would do queries which need the first 10 records based on some criteria which does not include the group id. I guess limit queries across the union of the partitions don't work too well for now, and we do have such queries.

Re: [HACKERS] Planning without reason.

2006-06-23 Thread Martijn van Oosterhout
On Fri, Jun 23, 2006 at 05:12:14PM +0300, Tzahi Fadida wrote: My initial reasoning was to avoid extra sorts but i guess that the planner just doesn't get the LIMIT 1. I see now that UNION should be better for the planner to undestand (not performance wise). However, UNION alone, doesn't

Re: [HACKERS] [CORE] GPL Source and Copyright Questions

2006-06-23 Thread Michael Meskes
On Fri, Jun 23, 2006 at 09:58:42AM -0400, Tom Lane wrote: That sounds fine --- could you add a note in the source code to this effect? Contributed under the PostgreSQL License or something like that after the copyright notice would be sufficient. No problem. Just committed it. Michael --

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Hannu Krosing
Ühel kenal päeval, N, 2006-06-22 kell 13:49, kirjutas Mark Woodward: Christopher Browne [EMAIL PROTECTED] writes: Basically there's no free lunch: if you want the benefits of MVCC it's going to cost you somewhere. In the Postgres design you pay by having to do VACUUM pretty often for

[HACKERS] Webcluster session storage, was vacuum, performance, and MVCC

2006-06-23 Thread James Robinson
Verging on offtopic, but ... Regarding the best place for session data, memcached isn't really the answer, for opposite reasons as to why it isn't so great to store it in the central DB for a bug web farm. Folks on the memcached lists propose this [ I keep all my session data in

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Csaba Nagy
First of all, I base my assumptions on what I recall to have read on this list, as I didn't try yet partitioning using inheritance. It's not trivial to set up and I didn't have the time to play with it yet. So I wouldn't know for sure that it won't work fine with our application, and that will

Re: [HACKERS] Planning without reason.

2006-06-23 Thread Tzahi Fadida
On Friday 23 June 2006 17:47, Martijn van Oosterhout wrote: On Fri, Jun 23, 2006 at 05:12:14PM +0300, Tzahi Fadida wrote: My initial reasoning was to avoid extra sorts but i guess that the planner just doesn't get the LIMIT 1. I see now that UNION should be better for the planner to

Re: [HACKERS] Webcluster session storage, was vacuum, performance, and MVCC

2006-06-23 Thread Martijn van Oosterhout
On Fri, Jun 23, 2006 at 10:31:22AM -0400, James Robinson wrote: Regarding the best place for session data, memcached isn't really the answer, for opposite reasons as to why it isn't so great to store it in the central DB for a bug web farm. The thought that occurred to me while reading

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread mark
On Fri, Jun 23, 2006 at 10:24:06AM -0400, Alvaro Herrera wrote: Mark Woodward wrote: In case of the number of actively modified rows being in only tens or low hundreds of thousands of rows, (i.e. the modified set fits in memory) the continuous vacuum process shows up as just another

Re: [HACKERS] Planning without reason.

2006-06-23 Thread Martijn van Oosterhout
On Fri, Jun 23, 2006 at 06:10:33PM +0300, Tzahi Fadida wrote: On Friday 23 June 2006 17:47, Martijn van Oosterhout wrote: On Fri, Jun 23, 2006 at 05:12:14PM +0300, Tzahi Fadida wrote: My initial reasoning was to avoid extra sorts but i guess that the planner just doesn't get the LIMIT 1.

Re: Anyone still care about Cygwin? (was Re: [HACKERS] [CORE] GPL

2006-06-23 Thread Andrew Dunstan
Tom Lane wrote: Dave Page dpage@vale-housing.co.uk writes: Actually, my gripe about this one is that it wasn't detected promptly. That patch went in two weeks ago; we should have known about the problem within a couple days at most. Seems like the Windows members of the buildfarm don't

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Hannu Krosing
Ühel kenal päeval, R, 2006-06-23 kell 10:30, kirjutas Mark Woodward: What is interesting is setting up the server so that you can service your loads comfortably. Running the server at 100% lead is not anything you want to do on production server. There will be things you need to do

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Csaba Nagy
Usually it gets really bad if you *don't* run vacuum continuously, maybe hopeing to do it in slower times at night. For high-update db you have to run it continuously, maybe having some 5-15 sec pauses between runs. And how much I/O does this take? Surprisingly its mostly WAL

Re: [HACKERS] Planning without reason.

2006-06-23 Thread Tom Lane
Martijn van Oosterhout kleptog@svana.org writes: On Fri, Jun 23, 2006 at 03:57:19PM +0300, Tzahi Fadida wrote: (SELECT * FROM R WHERE a=3, b=6,. ...) UNION (SELECT * FROM R WHERE b=5, d=2,. ...) UNION And lots of unions. Do you need UNION, or do you actually mean UNION ALL? Also,

Re: [HACKERS] Overhead for stats_command_string et al, take 2

2006-06-23 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes: Average of 5 runs, for the first two cases, on the x86 machine that shows high overhead in gettimeofday. I used only 3 SELECT 1 queries instead of 100k. 3 SELECT 1; HEAD8.1 no overhead 21.9

Re: Anyone still care about Cygwin? (was Re: [HACKERS] [CORE] GPL Source and Copyright Questions)

2006-06-23 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes: Anyway, the lack of daily Cygwin builds is not permanent. There are several supported platforms not represented on the buildfarm - e.g. the one HPUX member has never actually reported any results. Yeah, and this is not a good thing. Eventually I'd

Re: [HACKERS] checking on buildfarm member thrush

2006-06-23 Thread Gaetano Mendola
Tom Lane wrote: Could you get a gdb stack trace from that crash? If the buildfarm run is under a suitable ulimit, it should be leaving a core file in the test PGDATA directory. Unfortunately the core size for the user pgfarm is 0: $ulimit -c 0 However I did a configure, make and make check

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Tom Lane
Csaba Nagy [EMAIL PROTECTED] writes: Surprisingly its mostly WAL traffic, the heap/index pages themselves are often not yet synced to disk by time of vacuum, so no additional traffic there. If you had made 5 updates per page and then vacuum it, then you make effectively 1 extra WAL write

Re: [HACKERS] Planning without reason.

2006-06-23 Thread Tzahi Fadida
Perhaps it is over the top just for my specific query. Basically, i wish not to do something the system should do because, as i already noticed, when versions changes the database can break your code if you don't keep up. I guess i can make a map of attributes participating in an index of a

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Mark Woodward
Bottom line: there's still lots of low-hanging fruit. Why are people feeling that we need to abandon or massively complicate our basic architecture to make progress? regards, tom lane I, for one, see a particularly nasty unscalable behavior in the implementation of

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread David Fetter
On Fri, Jun 23, 2006 at 02:30:29PM -0400, Mark Woodward wrote: Bottom line: there's still lots of low-hanging fruit. Why are people feeling that we need to abandon or massively complicate our basic architecture to make progress? regards, tom lane I, for one, see

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Jonah H. Harris
On 6/23/06, Mark Woodward [EMAIL PROTECTED] wrote: I, for one, see a particularly nasty unscalable behavior in the implementation of MVCC with regards to updates. I think this is a fairly common acceptance. The overhead required to perform an UPDATE in PostgreSQL is pretty heavy. Actually,

Re: Anyone still care about Cygwin? (was Re: [HACKERS] [CORE] GPL

2006-06-23 Thread Andrew Dunstan
Tom Lane wrote: Andrew Dunstan [EMAIL PROTECTED] writes: There are several supported platforms not represented on the buildfarm - e.g. the one HPUX member has never actually reported any results. Yeah, and this is not a good thing. Eventually I'd like to get to a point where every

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Jochem van Dieten
On 6/23/06, Mark Woodward wrote: For each update to a row additional work needs to be done to access that row. Surely a better strategy can be done, especially considering that the problem being solved is a brief one. The only reason why you need previous versions of a row is for transactions

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Bruce Momjian
Tom Lane wrote: If you're doing heavy updates of a big table then it's likely to end up visiting most of the table anyway, no? There is talk of keeping a map of dirty pages, but I think it'd be a win for infrequently-updated tables, not ones that need constant vacuuming. I think a lot of

Re: Anyone still care about Cygwin? (was Re: [HACKERS] [CORE] GPL

2006-06-23 Thread Adrian Maier
On 23/06/06, Andrew Dunstan [EMAIL PROTECTED] wrote: Tom Lane wrote: Andrew Dunstan [EMAIL PROTECTED] writes: There are several supported platforms not represented on the buildfarm - e.g. the one HPUX member has never actually reported any results. Yeah, and this is not a good thing.

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Florian Weimer
* Gavin Sherry: Tom Lane's MVCC talk: http://www.postgresql.org/files/developer/transactions.pdf Is this still up-to-date with regard to to partial page writes? I hope that has been fixed (like the fsync issue). ---(end of broadcast)--- TIP 9:

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Mark Woodward
On 6/23/06, Mark Woodward [EMAIL PROTECTED] wrote: I, for one, see a particularly nasty unscalable behavior in the implementation of MVCC with regards to updates. I think this is a fairly common acceptance. The overhead required to perform an UPDATE in PostgreSQL is pretty heavy.

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes: I think at some point we have to admit that _polling_ the tables, which is what autovacuum does, just isn't going to work well, no matter how much it is tweeked, and another approach should be considered for certain workload cases. Autovacuum polls in

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Bruce Momjian
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: I think at some point we have to admit that _polling_ the tables, which is what autovacuum does, just isn't going to work well, no matter how much it is tweeked, and another approach should be considered for certain workload cases.

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Hannu Krosing
Ühel kenal päeval, R, 2006-06-23 kell 18:05, kirjutas Csaba Nagy: Usually it gets really bad if you *don't* run vacuum continuously, maybe hopeing to do it in slower times at night. For high-update db you have to run it continuously, maybe having some 5-15 sec pauses between runs.

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Jonah H. Harris
On 6/23/06, Mark Woodward [EMAIL PROTECTED] wrote: Rather than keep references to all versions of all rows in indexes, keep only a reference to the first or key row of each row, and have the first version of a row form the head of a linked list to subsequent versions of each row. The list will

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Mark Woodward
Tom Lane wrote: If you're doing heavy updates of a big table then it's likely to end up visiting most of the table anyway, no? There is talk of keeping a map of dirty pages, but I think it'd be a win for infrequently-updated tables, not ones that need constant vacuuming. I think a lot of

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Jonah H. Harris
On 6/23/06, Mark Woodward [EMAIL PROTECTED] wrote: Vacuum is findimentally inefficient The theory of database cleanup (VACUUM) is good, but has nothing to do with the overhead we currently encounter in performing an update. I have been ranting about a first row strategy, one where the first

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Stefan Kaltenbrunner
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: I think at some point we have to admit that _polling_ the tables, which is what autovacuum does, just isn't going to work well, no matter how much it is tweeked, and another approach should be considered for certain workload cases.

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Jonah H. Harris
On 6/23/06, Tom Lane [EMAIL PROTECTED] wrote: What I see in this discussion is a huge amount of the grass must be greener on the other side syndrome, and hardly any recognition that every technique has its downsides and complications. I'm being totally objective. I don't think we should

Re: [HACKERS] Planning without reason.

2006-06-23 Thread Greg Stark
Tom Lane [EMAIL PROTECTED] writes: It's conceivable that the planner could prove that neither effect is possible in a particular query and then make the transformation automatically, but I'm not about to expend that kind of planning effort on such an odd case --- checking for it would waste

Re: [HACKERS] Planning without reason.

2006-06-23 Thread Martijn van Oosterhout
On Fri, Jun 23, 2006 at 08:14:07PM +0300, Tzahi Fadida wrote: I guess i can make a map of attributes participating in an index of a relation. Also, i would have to take into account the type of index used. For example, a btree should have the capability to do prefix key searches while hash

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Bruce Momjian
Jonah H. Harris wrote: On 6/23/06, Tom Lane [EMAIL PROTECTED] wrote: What I see in this discussion is a huge amount of the grass must be greener on the other side syndrome, and hardly any recognition that every technique has its downsides and complications. I'm being totally objective.

[HACKERS] Buffers to Nest Loop Join and him pages

2006-06-23 Thread Daniel Xavier de Sousa
Hi for all Please, I have two doubts. Can somebody help me please?? First doubt: I want know how many pages Postgres use when execute one query for Nest-loop-Join. I have used “explain analyze”, but it don’t give number of pages… Second doubt: Normally when some SGBD

Re: [HACKERS] [GENERAL] Out of memory error in 8.1.0 Win32

2006-06-23 Thread Tom Lane
Ah-hah. I made a table similar to yours (with a lot of dummy data) and went trawling through the backend memory to try to see where the space was going. I found two significant inefficiencies in HashAggregate's space usage: * It stores a representative tuple for each input group, containing the

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Rick Gigger
Just out of curiosity Mark, didn't you write your session daemon so that you don't have to put sessions in postgres anymore? Or are you just giving that as an example of a very wide, very heavily updated table? My session tables have been an extreme case of this problem, but no other

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Rick Gigger
On Jun 22, 2006, at 10:33 AM, Mark Woodward wrote: Ühel kenal päeval, N, 2006-06-22 kell 09:59, kirjutas Mark Woodward: After a long battle with technology, [EMAIL PROTECTED] (Mark Woodward), an earthling, wrote: Clinging to sanity, [EMAIL PROTECTED] (Mark Woodward) mumbled into It

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Rick Gigger
On Jun 22, 2006, at 2:36 PM, Mark Woodward wrote: What you seem not to grasp at this point is a large web-farm, about 10 or more servers running PHP, Java, ASP, or even perl. The database is usually the most convenient and, aside from the particular issue we are talking about, best

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread mark
On Fri, Jun 23, 2006 at 03:08:34PM -0400, Bruce Momjian wrote: Tom Lane wrote: ... suggesting. We're having a hard enough time debugging and optimizing *one* storage model. I think the correct path forward is to stick with the same basic storage model and vacuuming concept, and address

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Mark Woodward
Just out of curiosity Mark, didn't you write your session daemon so that you don't have to put sessions in postgres anymore? The original project started as a shared key/value system for a beowulf cluster in the late 90s, but got reworked to be a session handler for PHP when I worked with Stig,