Re: [PATCHES] [HACKERS] Full page writes improvement, code update
Simon; Thanks a lot for your comments/advice. I'd like to write some feedback.

Simon Riggs wrote:
> On Tue, 2007-03-27 at 11:52 +0900, Koichi Suzuki wrote:
>> Here's an update of the code to improve full page writes as proposed in
>>
>> http://archives.postgresql.org/pgsql-hackers/2007-01/msg01491.php
>> and
>> http://archives.postgresql.org/pgsql-patches/2007-01/msg00607.php
>>
>> The update includes some modifications for error handling in the
>> archiver and restoration command.
>>
>> In the previous threads, I posted several evaluations and showed that
>> we can keep all the full page writes needed for full XLOG crash
>> recovery, while compressing archive log size considerably better than
>> gzip, with less CPU consumption. I've found no further objection to
>> this proposal but would still like to hear comments/opinions/advice.
>
> Koichi-san,
>
> Looks interesting. I like the small amount of code to do this. A few
> thoughts:
>
> - Not sure why we need "full_page_compress", why not just mark them
>   always? That harms noone. (Did someone else ask for that? If so, keep it)

No, no one asked to have a separate option. There would be no harm in doing so. As written below, full page writes can be categorized as follows:

1) Needed for crash recovery: the first page update after each checkpoint. This has to be kept in WAL.

2) Needed for archive recovery: page updates between pg_start_backup and pg_stop_backup. These have to be kept in the archive log.

3) For a log-shipping slave such as pg_standby: no full page writes are needed for this purpose.

My proposal deals with 2). So, if we mark each full page write, I'd rather mark it when it is needed. We still need only one bit, because case 3) does not need any mark.

> - OTOH I'd like to see an explicit parameter set during recovery, since
>   you're asking the main recovery path to act differently in case a
>   single bit is set/unset. If you are using that form of recovery, we
>   should say so explicitly, to keep everybody else safe.
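The three categories above can be expressed as a tiny classifier. The sketch below is purely illustrative (the type and field names are invented, not from the patch): a full-page image is kept in WAL when it is the first change after a checkpoint, kept in the archive copy when it falls inside a backup window, and is otherwise removable by the compression tool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sketch of the three full-page-write categories; all names
 * here are invented for illustration and do not appear in the patch. */
typedef struct
{
	uint64_t	lsn;				/* location of the WAL record */
	bool		first_after_ckpt;	/* category 1: first update after checkpoint */
	bool		in_backup_window;	/* category 2: between pg_start/stop_backup */
} FullPageImage;

typedef enum { KEEP_IN_WAL, KEEP_IN_ARCHIVE, REMOVABLE } FpwClass;

FpwClass
classify_fpw(const FullPageImage *img)
{
	if (img->first_after_ckpt)
		return KEEP_IN_WAL;		/* crash recovery needs it */
	if (img->in_backup_window)
		return KEEP_IN_ARCHIVE;	/* archive recovery needs it */
	return REMOVABLE;			/* category 3: log shipping never needs it */
}
```

As the mail says, a single bit per record would suffice for the marking, because category 3 needs no mark at all.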
The only thing I had to do was to create "dummy" full page writes to maintain LSNs. Full page writes are omitted in the archive log, but we have to keep the LSNs the same as those in the original WAL. In this case, recovery has to read the logical log, not the "dummy" full page writes. On the other hand, if both a logical log and a "real" full page write are found in a log record, recovery has to use the "real" full page write.

> - I'd rather mark just the nonremovable blocks. But no real difference

That sounds nicer. According to the full page write categories above, we can mark full page writes as needed for "crash recovery" or for "archive recovery". Please give some feedback on the above full page write categories.

> - We definitely don't want a normal elog in an XLogInsert critical
>   section, especially at DEBUG1 level

I agree. I'll remove the elog calls from critical sections.

> - diff -c format is easier and the standard

I'll change the diff option.

> - pg_logarchive and pg_logrestore should be called by a name that
>   reflects what they actually do. Possibly pg_compresslog and
>   pg_decompresslog etc.. I've not reviewed those programs, but:

I wasn't careful to base the command names on what they actually do. I'll change the command names.

> - Not sure why we have to compress away page headers. Touch as little
>   as you can has always been my thinking with recovery code.

Eliminating page headers gives a slightly better compression rate, a couple of percent. To make the code simpler, I'll drop this compression.

> - I'm very uncomfortable with touching the LSN. Maybe I misunderstand?

The reason we need pg_logarchive (or pg_decompresslog) is to keep the LSNs the same as those in the original WAL. For this purpose, we need the "dummy" full page writes. I don't like to touch the LSNs either, and this is why pg_decompresslog needs some extra work; it eliminates many other changes in the recovery code.

> - Have you thought about how pg_standby would integrate with this
>   option? Can you please?

Yes, I believe so.
As pg_standby has no chance of meeting partial page writes, I believe you can omit all the full page writes. Of course, as Tom Lane suggested in http://archives.postgresql.org/pgsql-hackers/2007-02/msg00034.php, removing full page writes loses the chance to recover from partial/inconsistent writes after a crash of pg_standby. In this case, we have to import a backup and archive logs (with full page writes taken during the backup) to recover. (We have to import them when the file system crashes anyway.) If that's okay, I believe pg_compresslog/pg_decompresslog can be integrated with pg_standby. Maybe we can work together to include pg_compresslog/pg_decompresslog in pg_standby.

> - I'll do some docs for this after Freeze, if you'd like. I have some
>   other changes to make there, so I can do this at the same time.

I'll be looking forward to your writing.

Best regards;
--
Koichi Suzuki

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[PATCHES] Small code clean-up
Here are two small code clean-ups in initdb and win32_shmem. pg_char_to_encoding() was redundant in initdb because pg_valid_server_encoding() returns the same result if the encoding is valid. The changes in win32_shmem suppress the following warnings:

| pg_shmem.c: In function `PGSharedMemoryCreate':
| pg_shmem.c:137: warning: long unsigned int format, Size arg (arg 2)
| pg_shmem.c:159: warning: long unsigned int format, Size arg (arg 2)

*** initdb.c.orig	Mon Mar 19 01:50:43 2007
--- initdb.c	Wed Mar 28 10:02:42 2007
*************** get_encoding_id(char *encoding_name)
*** 709,716 ****
  	if (encoding_name && *encoding_name)
  	{
! 		if ((enc = pg_char_to_encoding(encoding_name)) >= 0 &&
! 			pg_valid_server_encoding(encoding_name) >= 0)
  			return encodingid_to_string(enc);
  	}
  	fprintf(stderr, _("%s: \"%s\" is not a valid server encoding name\n"),
--- 709,715 ----
  	if (encoding_name && *encoding_name)
  	{
! 		if ((enc = pg_valid_server_encoding(encoding_name)) >= 0)
  			return encodingid_to_string(enc);
  	}
  	fprintf(stderr, _("%s: \"%s\" is not a valid server encoding name\n"),

*** win32_shmem.c.orig	Wed Mar 28 10:17:14 2007
--- win32_shmem.c	Wed Mar 28 10:16:36 2007
*************** PGSharedMemoryCreate(Size size, bool mak
*** 136,142 ****
  	if (!hmap)
  		ereport(FATAL,
  				(errmsg("could not create shared memory segment: %lu", GetLastError()),
! 				 errdetail("Failed system call was CreateFileMapping(size=%lu, name=%s)",
  						   size, szShareMem)));
  
  	/*
  	 * If the segment already existed, CreateFileMapping() will return a
--- 136,142 ----
  	if (!hmap)
  		ereport(FATAL,
  				(errmsg("could not create shared memory segment: %lu", GetLastError()),
! 				 errdetail("Failed system call was CreateFileMapping(size=%lu, name=%s)",
  						   (unsigned long) size, szShareMem)));
  
  	/*
  	 * If the segment already existed, CreateFileMapping() will return a
*************** PGSharedMemoryCreate(Size size, bool mak
*** 158,164 ****
  	if (!hmap)
  		ereport(FATAL,
  				(errmsg("could not create shared memory segment: %lu", GetLastError()),
! 				 errdetail("Failed system call was CreateFileMapping(size=%lu, name=%s)",
  						   size, szShareMem)));
  
  	if (GetLastError() == ERROR_ALREADY_EXISTS)
  		ereport(FATAL,
--- 158,164 ----
  	if (!hmap)
  		ereport(FATAL,
  				(errmsg("could not create shared memory segment: %lu", GetLastError()),
! 				 errdetail("Failed system call was CreateFileMapping(size=%lu, name=%s)",
  						   (unsigned long) size, szShareMem)));
  
  	if (GetLastError() == ERROR_ALREADY_EXISTS)
  		ereport(FATAL,

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
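For reference, the warning comes from passing a Size (size_t) argument to a %lu conversion; since size_t and unsigned long need not be the same type on Win32, the patch adds an explicit cast. A minimal standalone illustration of the same idiom (the helper name is invented):

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Format a size_t through %lu by casting explicitly, as the patch does for
 * the CreateFileMapping() errdetail. The helper name is invented. */
int
format_size(char *buf, size_t bufsz, size_t size)
{
	/* without the cast, compilers warn wherever size_t != unsigned long */
	return snprintf(buf, bufsz, "size=%lu", (unsigned long) size);
}
```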
Re: [PATCHES] [PATCH] add CLUSTER table ORDER BY index
Gregory Stark <[EMAIL PROTECTED]> writes:
> "Holger Schurig" <[EMAIL PROTECTED]> writes:
>> * psql tab-completion, it now favours CLUSTER table ORDER BY index
> It occurs to me (sorry that I didn't think of this earlier) that if we're
> going to use "ORDER BY" it really ought to take a list of columns.

Surely you jest. The point is to be ordered the same as the index, no?

			regards, tom lane
Re: [PATCHES] autovacuum: multiple workers
On Tue, 2007-03-27 at 17:41 -0400, Alvaro Herrera wrote:
> The main change is to have an array of Worker structs in shared memory;
> each worker checks the current table of all other Workers, and skips a
> table that's being vacuumed by any of them. It also rechecks the table
> before vacuuming, which removes the problem of redundant vacuuming.

Slightly OT: personally, I'd like it if we added an array for all special backends, with configurable behaviour. That way it would be easier to have multiple copies of other backends of any flavour using the same code, as well as to add others without cutting and pasting each time. That part of the postmaster code has oozed sideways in the past few years and seems in need of some love. (A former sinner repents.)

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
Re: [PATCHES] Patch for circular buffer in tuplestore to optimizemerge joins (v1)
On Tue, 2007-03-27 at 13:32 +0100, stark wrote:
> This patch implements a circular buffer in tuplestore which drops old
> tuples as they're no longer needed. It uses this for merge joins to
> avoid having to spill the tuplestore if no single value exceeds
> work_mem. It also is what's needed for both recursive query support and
> OLAP window functions (hence why it implements the more complex circular
> buffer rather than just moving the single tuple up to the head of the
> buffer).
>
> This was mostly already done by Simon, I just finished the logic in
> tuplesort.c.

Cool. This item was previously discussed here:
http://archives.postgresql.org/pgsql-hackers/2007-01/msg00096.php

(This changes tuplestore.c, not tuplesort.c, though it's clear in the patch.)

> This is actually not quite polished so I guess it's still a WIP but it's
> certainly ready to be reviewed. All that remains is polishing. If
> there's anything in there people object to now I would like to know.

I guess a performance test would be a great eyecatcher here, but it does seem otherwise complete. Any eager merge join testers out there?

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
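The idea of a buffer that "drops old tuples as they're no longer needed" can be modelled as a fixed-capacity ring with an explicit trim operation. This is only a schematic sketch of the behaviour described, not code from the patch (names are invented; ints stand in for tuples): a put fails when the window is full — the point at which the real tuplestore would have to spill — and trimming past consumed entries frees slots again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical circular tuple window: the consumer trims entries it has
 * moved past, so memory use stays bounded by the capacity (think work_mem).
 * All names are invented for illustration. */
typedef struct
{
	int	   *items;
	int		cap;	/* maximum entries held at once */
	int		head;	/* index of oldest live entry */
	int		count;	/* live entries */
} Ring;

void
ring_init(Ring *r, int cap)
{
	r->items = malloc(sizeof(int) * cap);
	r->cap = cap;
	r->head = 0;
	r->count = 0;
}

/* Returns false when the window is full (the caller would spill to disk). */
bool
ring_put(Ring *r, int v)
{
	if (r->count == r->cap)
		return false;
	r->items[(r->head + r->count) % r->cap] = v;
	r->count++;
	return true;
}

/* Drop the n oldest entries once the merge has moved past them. */
void
ring_trim(Ring *r, int n)
{
	if (n > r->count)
		n = r->count;
	r->head = (r->head + n) % r->cap;
	r->count -= n;
}

int
ring_oldest(const Ring *r)
{
	return r->items[r->head];
}
```

In a merge join this is exactly the mark/restore window: the inner side is trimmed up to the last restore point, so nothing spills as long as no single join value's group of tuples exceeds the capacity.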
[PATCHES] autovacuum: multiple workers
Hi,

This is the patch to put multiple workers into autovacuum. It applies after the recheck patch I just posted.

The main change is to have an array of Worker structs in shared memory; each worker checks the current table of all other workers, and skips a table that's being vacuumed by any of them. It also rechecks the table before vacuuming, which removes the problem of redundant vacuuming.

It also introduces the business of SIGUSR1 between workers and launcher. The launcher keeps a database list in memory and schedules workers to vacuum databases according to that list. The actual database selected may differ from what was in the schedule; in that case, the list is reconstructed.

There are two main FIXMEs in this code:

1. Have the list reconstruction and scheduling be smarter, so that databases are not ganged together in the schedule. The only difficulty is keeping the sort order that the databases had.

2. Have a way to clean up after failed workers filling up the Worker array and thus starving other databases of vacuuming. I don't really know a way to do this that works in all cases. The only idea I have so far is that workers that started more than autovacuum_naptime seconds ago are considered to have failed to start.

Neither of these is really minor, but I think they are solvable.
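The skip logic in the first paragraph can be sketched as a scan over the shared array. This is an illustrative model only (the struct and function names are invented, not the patch's WorkerInfo code): before claiming a table, a worker checks whether any other slot already names the same (database, table) pair.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the shared worker array described above; field and
 * function names are invented for illustration. */
typedef unsigned int Oid;

typedef struct
{
	Oid		dboid;		/* database the worker is attached to; 0 = slot free */
	Oid		tableoid;	/* table currently being vacuumed */
} WorkerSlot;

/* True if some *other* worker is already vacuuming (dboid, tableoid). */
bool
table_claimed_elsewhere(const WorkerSlot *slots, int nslots,
						int myindex, Oid dboid, Oid tableoid)
{
	for (int i = 0; i < nslots; i++)
	{
		if (i == myindex)
			continue;			/* our own claim doesn't count */
		if (slots[i].dboid == dboid && slots[i].tableoid == tableoid)
			return true;		/* someone else is on it: skip */
	}
	return false;
}
```

Note that this check alone is racy; in the real patch the array lives in shared memory and updates to a worker's current table would have to be made under a lock for the skip test to be reliable.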
--
Alvaro Herrera                               http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

*** 14recheck/src/backend/postmaster/autovacuum.c	2007-03-27 16:43:31.0 -0400
--- 12vacuum/src/backend/postmaster/autovacuum.c	2007-03-27 17:40:19.0 -0400
***************
*** 52,57 ****
--- 52,58 ----
  #include "utils/syscache.h"
  
+ static volatile sig_atomic_t got_SIGUSR1 = false;
  static volatile sig_atomic_t got_SIGHUP = false;
  static volatile sig_atomic_t avlauncher_shutdown_request = false;
***************
*** 59,64 ****
--- 60,66 ----
   * GUC parameters
   */
  bool autovacuum_start_daemon = false;
+ int autovacuum_max_workers;
  int autovacuum_naptime;
  int autovacuum_vac_thresh;
  double autovacuum_vac_scale;
***************
*** 69,75 ****
  int autovacuum_vac_cost_delay;
  int autovacuum_vac_cost_limit;
  
! /* Flag to tell if we are in the autovacuum daemon process */
  static bool am_autovacuum_launcher = false;
  static bool am_autovacuum_worker = false;
--- 71,77 ----
  int autovacuum_vac_cost_delay;
  int autovacuum_vac_cost_limit;
  
! /* Flags to tell if we are in an autovacuum process */
  static bool am_autovacuum_launcher = false;
  static bool am_autovacuum_worker = false;
***************
*** 86,91 ****
--- 88,94 ----
  typedef struct autovac_dbase
  {
  	Oid ad_datid;
+ 	TimestampTz ad_next_worker;
  	char *ad_name;
  	TransactionId ad_frozenxid;
  	PgStat_StatDBEntry *ad_entry;
***************
*** 110,123 ****
  	int at_vacuum_cost_limit;
  } autovac_table;
  
  typedef struct
  {
! 	Oid process_db;		/* OID of database to process */
! 	int worker_pid;		/* PID of the worker process, if any */
  } AutoVacuumShmemStruct;
  
  static AutoVacuumShmemStruct *AutoVacuumShmem;
  
  #ifdef EXEC_BACKEND
  static pid_t avlauncher_forkexec(void);
  static pid_t avworker_forkexec(void);
--- 113,158 ----
  	int at_vacuum_cost_limit;
  } autovac_table;
  
+ /*-
+  * This struct holds information about a single worker's whereabouts. We keep
+  * an array of these in shared memory, sized according to
+  * autovacuum_max_workers.
+  *
+  * wi_dboid		OID of the database this worker is supposed to work on
+  * wi_tableoid	OID of the table currently being vacuumed
+  * wi_workerpid	PID of the running worker, 0 if not yet started
+  * wi_finished	True when the worker is done and about to exit
+  *-
+  */
+ typedef struct
+ {
+ 	Oid wi_dboid;
+ 	Oid wi_tableoid;
+ 	int wi_workerpid;
+ 	bool wi_finished;
+ } WorkerInfo;
+ 
  typedef struct
  {
! 	pid_t av_launcherpid;
! 	WorkerInfo av_workers[1];
! 	/* VARIABLE LENGTH STRUCT */
  } AutoVacuumShmemStruct;
  
+ /* Macro to iterate over all workers. Beware multiple evaluation of args! */
+ #define foreach_worker(_i, _worker) \
+ 	_worker = (WorkerInfo *) (AutoVacuumShmem + \
+ 		offsetof(AutoVacuumShmemStruct, av_workers)); \
+ 	for (_i = 0; _i < autovacuum_max_workers; _i++, _worker += sizeof(WorkerInfo))
+ 
  static AutoVacuumShmemStruct *AutoVacuumShmem;
+ 
+ /* number of currently free worker slots; only valid in the launcher */
+ static int free_workers;
+ 
+ /* the database list in the launcher, and the context that contains it */
+ static Dllist *DatabaseList = NULL;
+ static MemoryContext DatabaseListCxt = NULL;
+ 
  #ifdef EXEC_BACKEND
  static pid_t avlauncher_forkexec(void);
  static pid_t avworker_forkexec(void);
***************
*** 125,133 ****
  NON_EXEC_STATIC void AutoVacWorkerMain(int argc, char *argv[]);
  NON_EXEC_STATIC void AutoVacLauncherMain(int argc, char *argv[]);
  
! static void do_start_wo
Re: [PATCHES] [PATCH] add CLUSTER table ORDER BY index
"Holger Schurig" <[EMAIL PROTECTED]> writes:
> * psql tab-completion, it now favours CLUSTER table ORDER BY index

It occurs to me (sorry that I didn't think of this earlier) that if we're going to use "ORDER BY" it really ought to take a list of columns. It would be more consistent with what ORDER BY means in queries, and one day we may want to add support for ordering by arbitrary lists of columns even if no index exists.

It may be reasonable to allow both to coexist and just have an arbitrary rule that if an index of the specified name exists, it takes precedence over any columns. I've never seen anyone name their indexes in a way that would conflict with a column anyway.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Re: [PATCHES] [HACKERS] Full page writes improvement, code update
On Tue, 2007-03-27 at 11:52 +0900, Koichi Suzuki wrote:
> Here's an update of the code to improve full page writes as proposed in
>
> http://archives.postgresql.org/pgsql-hackers/2007-01/msg01491.php
> and
> http://archives.postgresql.org/pgsql-patches/2007-01/msg00607.php
>
> The update includes some modifications for error handling in the
> archiver and restoration command.
>
> In the previous threads, I posted several evaluations and showed that
> we can keep all the full page writes needed for full XLOG crash
> recovery, while compressing archive log size considerably better than
> gzip, with less CPU consumption. I've found no further objection to
> this proposal but would still like to hear comments/opinions/advice.

Koichi-san,

Looks interesting. I like the small amount of code to do this. A few thoughts:

- Not sure why we need "full_page_compress", why not just mark them always? That harms noone. (Did someone else ask for that? If so, keep it)

- OTOH I'd like to see an explicit parameter set during recovery, since you're asking the main recovery path to act differently in case a single bit is set/unset. If you are using that form of recovery, we should say so explicitly, to keep everybody else safe.

- I'd rather mark just the nonremovable blocks. But no real difference.

- We definitely don't want a normal elog in an XLogInsert critical section, especially at DEBUG1 level.

- diff -c format is easier and the standard.

- pg_logarchive and pg_logrestore should be called by a name that reflects what they actually do. Possibly pg_compresslog and pg_decompresslog etc.. I've not reviewed those programs, but:

- Not sure why we have to compress away page headers. Touch as little as you can has always been my thinking with recovery code.

- I'm very uncomfortable with touching the LSN. Maybe I misunderstand?

- Have you thought about how pg_standby would integrate with this option? Can you please?

- I'll do some docs for this after Freeze, if you'd like. I have some other changes to make there, so I can do this at the same time.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
[PATCHES] autovacuum: recheck logic
Hi,

This is the first non-trivial patch towards multiple autovacuum workers. It adds the "recheck" logic to the autovacuum worker: the worker first builds its table list, and then rechecks pgstat before vacuuming each table, to verify that no one has vacuumed the table in the meantime. In the current autovacuum world this only means that a worker will not vacuum a table that a user has vacuumed manually; as discussed, it will be much more useful as soon as multiple workers are running concurrently.

To do this, I separated the task of calculating the autovacuum parameters (freeze_min_age, vacuum cost limit and delay, etc.) from the autovacuum equation calculation (freeze_max_age, pg_class.reltuples, etc.). We now keep track of three lists in the initial pg_class scan:

1. tables that need vacuum or analyze, per the equations
2. tables not in (1) that have toast tables
3. toast tables that need vacuum

Then we append those tables in (2) whose toast tables are in (3) to the (1) list. The rest are discarded. So when we do the rechecking, we need to process only those tables that actually needed vacuuming. With the previous coding, we would end up rechecking almost all tables every time (to be exact, all tables that have a toast table). The autovacuum parameters are only calculated in the second pass (the rechecking); the first pass only yields boolean results.

Unless there are objections I'll apply this tomorrow.

--
Alvaro Herrera                               http://www.PlanetPostgreSQL.org
"La naturaleza, tan frágil, tan expuesta a la muerte...
y tan viva"

Index: src/backend/postmaster/autovacuum.c
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/backend/postmaster/autovacuum.c,v
retrieving revision 1.39
diff -c -p -r1.39 autovacuum.c
*** src/backend/postmaster/autovacuum.c	27 Mar 2007 20:36:03 -	1.39
--- src/backend/postmaster/autovacuum.c	27 Mar 2007 20:43:31 -
*************** typedef struct autovac_dbase
*** 91,96 ****
--- 91,103 ----
  	PgStat_StatDBEntry *ad_entry;
  } autovac_dbase;
  
+ /* struct to keep track of tables to vacuum and/or analyze, in 1st pass */
+ typedef struct av_relation
+ {
+ 	Oid ar_relid;
+ 	Oid ar_toastrelid;
+ } av_relation;
+ 
  /* struct to keep track of tables to vacuum and/or analyze, after rechecking */
  typedef struct autovac_table
  {
*************** NON_EXEC_STATIC void AutoVacLauncherMain
*** 121,137 ****
  static void do_start_worker(void);
  static void do_autovacuum(Oid dbid);
  static List *autovac_get_database_list(void);
! static void test_rel_for_autovac(Oid relid, PgStat_StatTabEntry *tabentry,
! 				Form_pg_class classForm,
! 				Form_pg_autovacuum avForm,
! 				List **vacuum_tables,
! 				List **toast_table_ids);
  static void autovacuum_do_vac_analyze(Oid relid, bool dovacuum,
  				bool doanalyze, int freeze_min_age);
  static HeapTuple get_pg_autovacuum_tuple_relid(Relation avRel, Oid relid);
  static PgStat_StatTabEntry *get_pgstat_tabentry_relid(Oid relid, bool isshared,
  				PgStat_StatDBEntry *shared, PgStat_StatDBEntry *dbentry);
  static void autovac_report_activity(VacuumStmt *vacstmt, Oid relid);
  static void avl_sighup_handler(SIGNAL_ARGS);
  static void avlauncher_shutdown(SIGNAL_ARGS);
--- 128,152 ----
  static void do_start_worker(void);
  static void do_autovacuum(Oid dbid);
  static List *autovac_get_database_list(void);
! 
! static void relation_check_autovac(Oid relid, Form_pg_class classForm,
! 				Form_pg_autovacuum avForm, PgStat_StatTabEntry *tabentry,
! 				List **table_oids, List **table_toast_list,
! 				List **toast_oids);
! static autovac_table *table_recheck_autovac(Oid relid,
! 				PgStat_StatDBEntry *shared,
! 				PgStat_StatDBEntry *dbentry);
  static void autovacuum_do_vac_analyze(Oid relid, bool dovacuum,
  				bool doanalyze, int freeze_min_age);
  static HeapTuple get_pg_autovacuum_tuple_relid(Relation avRel, Oid relid);
  static PgStat_StatTabEntry *get_pgstat_tabentry_relid(Oid relid, bool isshared,
  				PgStat_StatDBEntry *shared, PgStat_StatDBEntry *dbentry);
+ static void relation_needs_vacanalyze(Oid relid, Form_pg_autovacuum avForm,
+ 				Form_pg_class classForm,
+ 				PgStat_StatTabEntry *tabentry, bool *dovacuum,
+ 				bool *doanalyze);
+ static HeapTuple get_pg_autovacuum_tuple_relid(Relation avRel, Oid relid);
  static void autovac_report_activity(VacuumStmt *vacstmt, Oid relid);
  static void avl_sighup_handler(SIGNAL_ARGS);
  static void avlauncher_shutdown(SIGNAL_ARGS);
*************** do_autovacuum(Oid dbid)
*** 872,879 ****
  	HeapTuple tuple;
  	HeapScanDesc relScan;
  	Form_pg_database dbForm;
! 	List *vacuum_tables = NIL;
! 	List *toast_table_ids = NIL;
  	ListCell *cell;
  	PgStat_StatDBEntry *shared;
  	PgStat_StatDBEntry *dbentry;
--- 887,895 ----
  	HeapTuple tuple;
  	HeapScanDesc relScan;
  	F
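The three-list bookkeeping described in the mail can be modelled compactly. The sketch below is illustrative only (names invented; plain arrays stand in for the backend's List machinery): tables from list (2) are appended to list (1) exactly when their toast table appears in list (3).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the three-list merge described above; all names are
 * invented for illustration. */
typedef unsigned int Oid;

typedef struct
{
	Oid		relid;			/* a table from list (2) */
	Oid		toastrelid;		/* its toast table */
} RelPair;

bool
oid_member(const Oid *oids, int n, Oid oid)
{
	for (int i = 0; i < n; i++)
		if (oids[i] == oid)
			return true;
	return false;
}

/* Append to 'out' (list 1) every candidate from list (2) whose toast table
 * is in list (3); returns the new length of 'out'. */
int
append_toast_owners(Oid *out, int outlen,
					const RelPair *candidates, int ncand,
					const Oid *toast_needing_vacuum, int ntoast)
{
	for (int i = 0; i < ncand; i++)
		if (oid_member(toast_needing_vacuum, ntoast,
					   candidates[i].toastrelid))
			out[outlen++] = candidates[i].relid;
	return outlen;
}
```

The payoff is the one the mail states: the recheck pass then walks only the merged list, instead of revisiting every table that happens to own a toast table.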
Re: [PATCHES] [pgsql-patches] O_DIRECT support for Windows
IIRC, we're still waiting for performance numbers showing there exists a win from this patch.

//Magnus

Bruce Momjian wrote:
> Magnus, where are we on this?
>
> Magnus Hagander wrote:
>> We're ok with the alignment issues provided there is code added to
>> reject O_DIRECT if the sector size is too large.
>>
>> We also said we need to see some performance numbers on the effect of
>> the patch before it goes in.
>>
>> //Magnus
>>
>> Bruce Momjian wrote:
>>> So, do we want this patch? Are we OK on WIN32 alignment issues?
>>>
>>> ITAGAKI Takahiro wrote:
>>>> The attached is a patch to define O_DIRECT by ourselves on Windows,
>>>> and to map O_DIRECT to FILE_FLAG_NO_BUFFERING.
>>>>
>>>> There will be a consistency in our support between Windows and other
>>>> OSes that have O_DIRECT. Also, there is the following comment that
>>>> says, I read, we should do so.
>>>> | handle other flags? (eg FILE_FLAG_NO_BUFFERING/FILE_FLAG_WRITE_THROUGH)
>>>>
>>>> Is this worth doing? Do we need more performance reports for the change?
>>>>
>>>> Regards,
>>>> ITAGAKI Takahiro
>>>> NTT Open Source Software Center
>>>
>>> [ Attachment, skipping... ]
[PATCHES] [PATCH] add CLUSTER table ORDER BY index
The following patch adds a new variant of the CLUSTER command. The old variants are preserved, as suggested in the TODO entry.

Things I changed:
* the grammar
* psql help text
* psql tab-completion, it now favours CLUSTER table ORDER BY index
* two uses of CLUSTER in the regression tests, so that both the old and the new syntax get checked

Things to consider:
* not yet in the documentation
* psql should probably no longer emit "CLUSTER index ON table"

Index: src/doc/TODO
===================================================================
*** src.orig/doc/TODO	2007-03-27 21:18:01.0 +0200
--- src/doc/TODO	2007-03-27 21:18:26.0 +0200
***************
*** 624,631 ****
  	o %Add VERBOSE option to report tables as they are processed,
  	  like VACUUM VERBOSE
  
- 	o Add more logical syntax CLUSTER table ORDER BY index;
- 	  support current syntax for backward compatibility
  
  * COPY
--- 624,629 ----
Index: src/doc/src/FAQ/TODO.html
===================================================================
*** src.orig/doc/src/FAQ/TODO.html	2007-03-27 21:18:01.0 +0200
--- src/doc/src/FAQ/TODO.html	2007-03-27 21:18:26.0 +0200
***************
*** 558,565 ****
  	%Add VERBOSE option to report tables as they are processed,
  	like VACUUM VERBOSE
  
- 	Add more logical syntax CLUSTER table ORDER BY index;
- 	support current syntax for backward compatibility
  
  COPY
--- 558,563 ----
Index: src/src/backend/parser/gram.y
===================================================================
*** src.orig/src/backend/parser/gram.y	2007-03-27 21:18:01.0 +0200
--- src/src/backend/parser/gram.y	2007-03-27 21:18:26.0 +0200
***************
*** 209,215 ****
  %type <str>	relation_name copy_file_name
  		database_name access_method_clause access_method attr_name
! 		index_name name file_name
  
  %type <list>	func_name handler_name qual_Op qual_all_Op subquery_Op
  		opt_class opt_validator
--- 209,215 ----
  %type <str>	relation_name copy_file_name
  		database_name access_method_clause access_method attr_name
! 		index_name name file_name opt_cluster_order_by
  
  %type <list>	func_name handler_name qual_Op qual_all_Op subquery_Op
  		opt_class opt_validator
***************
*** 5327,5332 ****
--- 5327,5333 ----
   *
   *	QUERY:
   *		cluster <index_name> on <qualified_name>
+  *		cluster <qualified_name> ORDER BY <index_name>
   *		cluster <qualified_name>
   *		cluster
   *
***************
*** 5340,5350 ****
  			n->indexname = $2;
  			$$ = (Node*)n;
  		}
! 	| CLUSTER qualified_name
  		{
  			ClusterStmt *n = makeNode(ClusterStmt);
  			n->relation = $2;
! 			n->indexname = NULL;
  			$$ = (Node*)n;
  		}
  	| CLUSTER
--- 5341,5351 ----
  			n->indexname = $2;
  			$$ = (Node*)n;
  		}
! 	| CLUSTER qualified_name opt_cluster_order_by
  		{
  			ClusterStmt *n = makeNode(ClusterStmt);
  			n->relation = $2;
! 			n->indexname = $3;
  			$$ = (Node*)n;
  		}
  	| CLUSTER
***************
*** 5356,5361 ****
--- 5357,5368 ----
  		}
  	;
  
+ opt_cluster_order_by:
+ 		ORDER BY index_name	{ $$ = $3; }
+ 		| /*EMPTY*/		{ $$ = NULL; }
+ 	;
+ 
+ 
  /*
   *	QUERY:
Index: src/src/bin/psql/sql_help.h
===================================================================
*** src.orig/src/bin/psql/sql_help.h	2007-03-27 21:18:01.0 +0200
--- src/src/bin/psql/sql_help.h	2007-03-27 21:18:26.0 +0200
***************
*** 119,125 ****
  
  	{ "CLUSTER",
  	  N_("cluster a table according to an index"),
! 	  N_("CLUSTER indexname ON tablename\nCLUSTER tablename\nCLUSTER") },
  
  	{ "COMMENT",
  	  N_("define or change the comment of an object"),
--- 119,125 ----
  
  	{ "CLUSTER",
  	  N_("cluster a table according to an index"),
! 	  N_("CLUSTER indexname ON tablename\nCLUSTER tablename [ORDER BY indexname]\nCLUSTER") },
  
  	{ "COMMENT",
Re: [PATCHES] DEALLOCATE ALL
On 3/27/07, Tom Lane <[EMAIL PROTECTED]> wrote:
> Alvaro Herrera <[EMAIL PROTECTED]> writes:
>> Marko Kreen wrote:
>>> When pooling connections where prepared statements are in use,
>>> it is hard to give a new client a totally clean connection, as
>>> there may be allocated statements that give errors when the
>>> new client starts preparing statements again.
>> Huh, didn't we have a RESET SESSION command to do just that? What about
>> cursors, for example?
> We don't actually *have* one, but I believe it was agreed that that is
> the right API to provide. If a pooler has to remember to clear prepared
> statements, GUCs, cursors, and who knows what else, it'll be perpetually
> broken because there'll be something it omits.

Well. Please apply the following patch then:

http://archives.postgresql.org/pgsql-patches/2004-12/msg00228.php

Even if it is incomplete, the missing parts can be added later. I see no reason to keep it from users.

> There might be a use-case for DEALLOCATE ALL, but the needs of poolers
> aren't it. I'd be inclined to vote against this unless someone can
> point to a better use-case.

Ok, a non-pooler argument: prepared statements are supposed to be garbage-collected by the user. That's it. There should be a friendly way to get a clean state without the need for the user to specifically keep track of what's allocated, or to do messy exception handling around PREPARE/DEALLOCATE. (PREPARE OR REPLACE and DEALLOCATE IF EXISTS would also lessen the pain.)

Then a pooler argument: there is one pooler where RandomJoe executes queries and another for a specific app where the subset of SQL it uses is known. I want to RESET only specific things in the app case. So it would be good if the RESETs for specific areas were available.

Also, the objections to Hans' patch give the impression that different pooling solutions want different RESET EVERYTHING behaviour, so again, it would be good if RESETs for different areas were available and the all-encompassing RESET EVERYTHING just tied all the specific RESETs together.

--
marko
Re: [PATCHES] [pgsql-patches] O_DIRECT support for Windows
Magnus, where are we on this?

Magnus Hagander wrote:
> We're ok with the alignment issues provided there is code added to
> reject O_DIRECT if the sector size is too large.
>
> We also said we need to see some performance numbers on the effect of
> the patch before it goes in.
>
> //Magnus
>
> Bruce Momjian wrote:
>> So, do we want this patch? Are we OK on WIN32 alignment issues?
>>
>> ITAGAKI Takahiro wrote:
>>> The attached is a patch to define O_DIRECT by ourselves on Windows,
>>> and to map O_DIRECT to FILE_FLAG_NO_BUFFERING.
>>>
>>> There will be a consistency in our support between Windows and other
>>> OSes that have O_DIRECT. Also, there is the following comment that
>>> says, I read, we should do so.
>>> | handle other flags? (eg FILE_FLAG_NO_BUFFERING/FILE_FLAG_WRITE_THROUGH)
>>>
>>> Is this worth doing? Do we need more performance reports for the change?
>>>
>>> Regards,
>>> ITAGAKI Takahiro
>>> NTT Open Source Software Center
>>
>> [ Attachment, skipping... ]

--
Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Re: [PATCHES] Numeric patch to add special-case representations for < 8 bytes
Gregory Stark wrote:
> The main design issue is that I was proposing to make it impossible to
> access the internals of the numeric storage using macros. Currently some
> of the data (the sign, dscale, and weight) is visible without having to
> call any special numeric functions. I was proposing to use
> representations where those might not be as easily accessible. However I
> don't think we have any consumers of that data outside of numeric.c
> anyways. Is there anything using that functionality currently? Do we
> mind losing it?

The data would still be available through a function, right? If there's no one accessing that information currently, there's no backwards-compatibility issue. I think this is a non-issue.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
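The design question here — field-reading macros versus an accessor function — can be made concrete with a toy header word. This is an illustration only (the layout and names are invented, not numeric.c's actual representation): the macro bakes the bit layout into every caller, while the function leaves the implementation free to introduce additional compact representations later.

```c
#include <assert.h>

/* Invented toy header: a sign flag packed above a 14-bit display scale.
 * This is NOT the real numeric layout; it only illustrates the API choice. */
typedef struct
{
	unsigned short sign_dscale;		/* sign flag in bit 14, dscale below */
} NumHeader;

/* Macro access: cheap, but every caller now depends on the bit layout. */
#define NUM_SIGN(h)		((h)->sign_dscale & 0x4000)
#define NUM_DSCALE(h)	((h)->sign_dscale & 0x3FFF)

/* Function access: the layout stays private, so a second, more compact
 * representation could be decoded here without touching any caller. */
int
num_dscale(const NumHeader *h)
{
	return NUM_DSCALE(h);
}
```

With the accessor, dropping the macros from the public header is enough to make outside consumers impossible, which is exactly the property being discussed.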
Re: [PATCHES] DEALLOCATE ALL
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Marko Kreen wrote:
>> When pooling connections where prepared statements are in use,
>> it is hard to give a new client a totally clean connection, as
>> there may be allocated statements that give errors when the
>> new client starts preparing statements again.
> Huh, didn't we have a RESET SESSION command to do just that? What about
> cursors, for example?

We don't actually *have* one, but I believe it was agreed that that is the right API to provide. If a pooler has to remember to clear prepared statements, GUCs, cursors, and who knows what else, it'll be perpetually broken because there'll be something it omits.

There might be a use-case for DEALLOCATE ALL, but the needs of poolers aren't it. I'd be inclined to vote against this unless someone can point to a better use-case.

			regards, tom lane
Re: [PATCHES] Recalculating OldestXmin in a long-running vacuum
Heikki Linnakangas wrote: > Tom Lane wrote: > > Heikki Linnakangas <[EMAIL PROTECTED]> writes: > >> Maybe we should keep this issue open until we resolve the vacuum WAL > >> flush issue? I can then rerun the same tests to see if this patch is a > >> win after that. > > > > Sounds like a plan, if you are willing to do that. > > Sure, just rerunning the same tests isn't much work. > > Bruce Momjian wrote: > > Would you like to add a TODO item? > > I don't know how we track things like this. Maybe add to the end of the > patch queue, with link to this discussion so that we remember that it > needs more testing? OK, I added this email to the patch queue. -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PATCHES] DEALLOCATE ALL
Marko Kreen wrote: > When pooling connections where prepared statements are in use, > it is hard to give new client totally clean connection as > there may be allocated statements that give errors when > new client starts preparing statements again. Huh, didn't we have a RESET SESSION command to do just that? What about cursors, for example? > I did it in a slightly hacky way - if DeallocateStmt->name is > NULL it signifies the DEALLOCATE ALL command. All the code > that looks into DeallocateStmt seems to survive the situation > so it should be no problem. If a new node is still needed > or an additional field in the node I can rework the patch. Wouldn't it be easier to just add a bool to DeallocateStmt? -- Alvaro Herrera http://www.PlanetPostgreSQL.org "If a stranger comes up to you and gives you an Ubuntu CD ... That is ... Eau de Tux" ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PATCHES] [pgsql-patches] pg_get_domaindef
I have removed this TODO item: * %Add pg_get_acldef(), pg_get_typedefault(), pg_get_attrdef(), pg_get_tabledef(), pg_get_domaindef(), pg_get_functiondef() These would be for application use, not for use by pg_dump. Seems there is a lack of interest in adding this feature because of maintenance concerns. The attached patch is rejected for the same reason. Sorry for the confusion. --- FAST PostgreSQL wrote: > On Thu, 25 Jan 2007 02:25, Tom Lane wrote: > > Andrew Dunstan <[EMAIL PROTECTED]> writes: > > > FAST PostgreSQL wrote: > > >> Please find attached the patch with modifications > > > > > > are you proposing to implement the other functions in this TODO item > > > (pg_get_acldef(), pg_get_typedefault(), pg_get_attrdef(), > > > pg_get_tabledef(), pg_get_functiondef() ) ? > > > > I haven't entirely understood the use case for any of these. It's not > > Any consensus on these functions? If we decide against having these then it's > better to remove them from the TODO list temporarily/permanently. > > Rgds, > Arul Shaji > > > > pg_dump, for a number of reasons: one being that pg_dump still has to > > support older backend versions, and another being that every time we > > let backend SnapshotNow functions get involved, we take another hit to > > pg_dump's claim to produce a consistent MVCC snapshot. > > > > But my real objection is: do we really want to support duplicative code > > in both pg_dump and the backend? Updating pg_dump is already a major > > PITA whenever one adds a new feature; doubling that work isn't > > attractive. (And it'd be double, not just a copy-and-paste, because of > > the large difference in the operating environment.) So I want to hear a > > seriously convincing use-case that will justify the maintenance load we > > are setting up for ourselves. "Somebody might want this" is not > > adequate. 
> > > > Perhaps a better area of work would be the often-proposed refactoring of > > pg_dump into a library and driver program, wherein the library could > > expose individual functions such as "fetch the SQL definition of this > > object". Unfortunately, that'll be a huge project with no payoff until > > the end... > > > > regards, tom lane > This is an email from Fujitsu Australia Software Technology Pty Ltd, ABN 27 > 003 693 481. It is confidential to the ordinary user of the email address to > which it was addressed and may contain copyright and/or legally privileged > information. No one else may read, print, store, copy or forward all or any > of it or its attachments. If you receive this email in error, please return > to sender. Thank you. > > If you do not wish to receive commercial email messages from Fujitsu > Australia Software Technology Pty Ltd, please email [EMAIL PROTECTED] > > > ---(end of broadcast)--- > TIP 3: Have you checked our extensive FAQ? > >http://www.postgresql.org/docs/faq -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PATCHES] Recalculating OldestXmin in a long-running vacuum
Tom Lane wrote: Heikki Linnakangas <[EMAIL PROTECTED]> writes: Maybe we should keep this issue open until we resolve the vacuum WAL flush issue? I can then rerun the same tests to see if this patch is a win after that. Sounds like a plan, if you are willing to do that. Sure, just rerunning the same tests isn't much work. Bruce Momjian wrote: Would you like to add a TODO item? I don't know how we track things like this. Maybe add to the end of the patch queue, with link to this discussion so that we remember that it needs more testing? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PATCHES] Recalculating OldestXmin in a long-running vacuum
Heikki Linnakangas <[EMAIL PROTECTED]> writes: > Maybe we should keep this issue open until we resolve the vacuum WAL > flush issue? I can then rerun the same tests to see if this patch is a > win after that. Sounds like a plan, if you are willing to do that. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PATCHES] Recalculating OldestXmin in a long-running vacuum
Heikki Linnakangas wrote: > Tom Lane wrote: > > Heikki Linnakangas <[EMAIL PROTECTED]> writes: > >> Bruce Momjian wrote: > >>> So are you stopping work on the patch? I assume so. > > > >> Yes, at least for now. I can't believe the patch actually hurts > >> performance, but I'm not going to spend time investigating it. > > > > Are we withdrawing the patch from consideration for 8.3 then? > > I had assumed it was still a live candidate, but if it seems to > > lose in pgbench maybe we had better set it aside. > > I haven't tried pgbench, the tests I ran were with DBT-2. > > Just to summarize again: the patch did help to keep the stock table > smaller, but the response times were higher with the patch. > > Maybe we should keep this issue open until we resolve the vacuum WAL > flush issue? I can then rerun the same tests to see if this patch is a > win after that. Would you like to add a TODO item? -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PATCHES] Recalculating OldestXmin in a long-running vacuum
Tom Lane wrote: Heikki Linnakangas <[EMAIL PROTECTED]> writes: Bruce Momjian wrote: So are you stopping work on the patch? I assume so. Yes, at least for now. I can't believe the patch actually hurts performance, but I'm not going to spend time investigating it. Are we withdrawing the patch from consideration for 8.3 then? I had assumed it was still a live candidate, but if it seems to lose in pgbench maybe we had better set it aside. I haven't tried pgbench, the tests I ran were with DBT-2. Just to summarize again: the patch did help to keep the stock table smaller, but the response times were higher with the patch. Maybe we should keep this issue open until we resolve the vacuum WAL flush issue? I can then rerun the same tests to see if this patch is a win after that. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PATCHES] Recalculating OldestXmin in a long-running vacuum
Heikki Linnakangas <[EMAIL PROTECTED]> writes: > Bruce Momjian wrote: >> So are you stopping work on the patch? I assume so. > Yes, at least for now. I can't believe the patch actually hurts > performance, but I'm not going to spend time investigating it. Are we withdrawing the patch from consideration for 8.3 then? I had assumed it was still a live candidate, but if it seems to lose in pgbench maybe we had better set it aside. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PATCHES] Recalculating OldestXmin in a long-running vacuum
Gregory Stark wrote: "Heikki Linnakangas" <[EMAIL PROTECTED]> writes: Yes, at least for now. I can't believe the patch actually hurts performance, but I'm not going to spend time investigating it. Isn't this exactly what you would expect? It will clean up more tuples so it'll dirty more pages. Especially given the pessimal way vacuum's dirty buffers are handled until Simon's patch to fix that goes in. Hmm. Yeah, maybe it'll get better when we get that fixed.. The benefit of the patch that we would expect to see is that you won't need to run VACUUM as often. In the long term we would expect the stock table to grow less too but I doubt these tests were long enough to demonstrate that effect. The size did reach a steady state about half-way through the test, see the logs here: patched http://community.enterprisedb.com/oldestxmin/92/server/relsizes.log unpatched http://community.enterprisedb.com/oldestxmin/93/server/relsizes.log The test was a success in that sense, the patch did reduce the steady state size of the stock table. Maybe we would see a gain in transactions per minute or response times if we traded off the smaller table size to run vacuum slightly less frequently.. But as I said I don't want to spend time running more tests for what seems like a small benefit. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PATCHES] Recalculating OldestXmin in a long-running vacuum
"Heikki Linnakangas" <[EMAIL PROTECTED]> writes: > But what surprises me is that response times went up with the patch. I > don't know why. Maybe because of increased contention of ProcArrayLock? (I assume you are using that, although I haven't seen the patch) >>> I am, but I doubt that's it. The response times are dominated by I/O, so any >>> increase in lock contention would hardly show up. And the patch is only >>> adding one GetOldestXmin call every 1000 scanned pages, which is nothing >>> compared to the thousands of GetSnapshot calls the normal transactions are >>> issuing per minute. >>> >>> The patch must have changed the I/O pattern in some subtle way. >> >> So are you stopping work on the patch? I assume so. > > Yes, at least for now. I can't believe the patch actually hurts performance, > but I'm not going to spend time investigating it. Isn't this exactly what you would expect? It will clean up more tuples so it'll dirty more pages. Especially given the pessimal way vacuum's dirty buffers are handled until Simon's patch to fix that goes in. The benefit of the patch that we would expect to see is that you won't need to run VACUUM as often. In the long term we would expect the stock table to grow less too but I doubt these tests were long enough to demonstrate that effect. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[PATCHES] Concurrent psql v4 [WIP]
This is just an update against CVS. The interface is still as described at this URL: http://community.enterprisedb.com/concurrent/index.html It's true there isn't any documentation in the patch but I'm not sure we're actually settled on the interface. I think our experience here is that the \cwait command was a mistake. You never need it in interactive use, and you need to use it *religiously* when writing regression tests. If you miss one your regression test will fail randomly depending on the timing. Instead we should just have a psql setting that makes it automatically wait before every connection switch. That ensures you catch any bugs where a command fails to block when it should, and also ensures you catch the output of an unblocked command as soon as you switch connections to it. The other issue is the cursor behaviour. Currently it looks like below and it starts a new pager for each block. I'm not sure this is really all that bad myself but I have a feeling others will disagree. I'm not exactly sure what the right behaviour is though, if this is an asynchronous command issued with \cnowait then it would be strange to suddenly start executing synchronously once the first bit of data arrives. postgres[2]=# select * from generate_series(1,4); generate_series - 1 2 3 4 (10 rows) postgres[2]=# \set FETCH_COUNT 1 postgres[2]=# select * from generate_series(1,10); generate_series - 1 (1 row) generate_series - 2 (1 row) generate_series - 3 (1 row) generate_series - 4 (1 row) generate_series - (0 rows) concurrent-psql-v4.patch.gz Description: Binary data -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PATCHES] WIP patch - INSERT-able log statements
What is the status of this patch? --- FAST PostgreSQL wrote: > Hi, > > I've been working on the following TODO item and attached is an initial > patch. (It is only partial and not yet completely functional) > > "Allow server log information to be output as INSERT statements > This would allow server log information to be easily loaded into a database > for analysis. " > > I want to confirm, if what I have done so far is what the community is looking > for and also want to clear some doubts. > > What is done so far > --- > > Two postgresql.conf variables > > #log_output_type = 'text' #Valid values are 'SQL' or 'text' > #log_output_table_name = 'auditlogs' > > These control how to output the log. Defaults to 'text' which is status quo. > If it is set to 'SQL' log will be output as INSERT commands. > > The second variable is of interest. We need to specify a table in the insert > command. My preferred option is for the user to give one and he can create it > if and when he wants to. The alternative is we decide the table name and make > initdb to create one. > > The proposed log output structure > -- > INSERT INTO user_defined_table values( timestamp_with_milliseconds, > timestamp, username, databasename, sessionid, host_and_port, host, proc_id, > command_tag, session_start, transaction_id, error_severity, > SQL_State_Code, error_message); > > All these columns will follow the current rules of log output. ie, unless > explicitly requested by the user, these columns will have NULL. User can still > give log_line_prefix in any order he wants, and logger will output it in > appropriate columns. The code has been modified to do > this. > > Issues/Questions are: > - How about 'Statement duration log'. This will come to the logger as a > single string and after the query execution. In the existing log we can make > sense of the duration log by matching it with the statement above it or by > the statement which gets printed besides it (Again as > a single string). 
But when this is loaded onto a table it doesn't make much > sense unless everything is in a single row. (My preferred option is to add > another column to the table structure defined above as 'duration'. But > haven't figured out how to achieve this, because the > statement is printed first and then the duration as another log.) > > - If the SQL log output is to the syslog, then it becomes pretty awkward and > possibly useless because our current syslog writer function breaks up the log > into several lines to accommodate various platforms. Syslog also then adds > other information before outputting it, which > cannot be loaded onto a table. The preferred option is to educate the user > through documentation that SQL type log output is best served when it is > output to stderr and redirected to a file? Same goes with other aspects such > as verbose and various other statistics log. > > - There are also other minor issues such as, the actual query currently gets > output in log as 'Statement: CREATE '. For sql type log we may not > need the 'Statement:' part as it will be in a column? Do we remove this in > both text and SQL outputs? > > Rgds, > Arul Shaji > [ Attachment, skipping... 
] > > ---(end of broadcast)--- > TIP 9: In versions below 8.0, the planner will ignore your desire to >choose an index scan if your joining column's datatypes do not >match -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 6: explain analyze is your friend
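The proposal above renders each log record as a row of a user-named table. As a rough, hedged sketch of that formatting step — a hypothetical function using a shortened subset of the proposed column list, with NULL for fields the user did not request (illustrative only, not the patch's code):

```python
def log_as_insert(table: str, record: dict) -> str:
    """Render a server-log record as an INSERT statement.
    Column order loosely follows the proposed structure; any field
    absent from `record` is emitted as NULL, matching the rule that
    unrequested log_line_prefix fields stay NULL."""
    cols = ("log_time", "username", "databasename", "sessionid",
            "command_tag", "error_severity", "sql_state_code", "error_message")

    def literal(v):
        if v is None:
            return "NULL"
        # Double single quotes so the value survives as a SQL string literal.
        return "'" + str(v).replace("'", "''") + "'"

    values = ", ".join(literal(record.get(c)) for c in cols)
    return f"INSERT INTO {table} VALUES ({values});"
```

This also makes the duration question concrete: a `duration` value would simply become one more column filled in on the statement's own row, which is why emitting it as a separate log line is awkward.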
Re: [PATCHES] Recalculating OldestXmin in a long-running vacuum
Bruce Momjian wrote: Heikki Linnakangas wrote: Alvaro Herrera wrote: Heikki Linnakangas wrote: I ran two 24h test runs with DBT-2, one with the patch and one without. To get comparable, predictable results, I turned autovacuum off and ran a manual vacuum in a loop on the stock-table alone. As expected, the steady-state of the stock table is smaller with the patch. But only by ~2%, that's slightly less than I expected. But what surprises me is that response times went up with the patch. I don't know why. Maybe because of increased contention of ProcArrayLock? (I assume you are using that, although I haven't seen the patch) I am, but I doubt that's it. The response times are dominated by I/O, so any increase in lock contention would hardly show up. And the patch is only adding one GetOldestXmin call every 1000 scanned pages, which is nothing compared to the thousands of GetSnapshot calls the normal transactions are issuing per minute. The patch must have changed the I/O pattern in some subtle way. So are you stopping work on the patch? I assume so. Yes, at least for now. I can't believe the patch actually hurts performance, but I'm not going to spend time investigating it. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PATCHES] Fast CLUSTER
On Tue, 2007-03-27 at 10:28 -0400, Bruce Momjian wrote: > Simon Riggs wrote: > > On Thu, 2007-02-22 at 22:49 +, Simon Riggs wrote: > > Could we add this to the unapplied patches queue? It seems to have been > > missed off the list. Thanks. > > Done. Many thanks -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PATCHES] Recalculating OldestXmin in a long-running vacuum
Heikki Linnakangas wrote: > Alvaro Herrera wrote: > > Heikki Linnakangas wrote: > > > >> I ran two 24h test runs with DBT-2, one with the patch and one without. > >> To get comparable, predictable results, I turned autovacuum off and ran > >> a manual vacuum in a loop on the stock-table alone. > >> > >> As expected, the steady-state of the stock table is smaller with the > >> patch. But only by ~2%, that's slightly less than I expected. > >> > >> But what surprises me is that response times went up with the patch. I > >> don't know why. > > > > Maybe because of increased contention of ProcArrayLock? (I assume you > > are using that, although I haven't seen the patch) > > I am, but I doubt that's it. The response times are dominated by I/O, so > any increase in lock contention would hardly show up. And the patch is > only adding one GetOldestXmin call every 1000 scanned pages, which is > nothing compared to the thousands of GetSnapshot calls the normal > transactions are issuing per minute. > > The patch must have changed the I/O pattern in some subtle way. So are you stopping work on the patch? I assume so. -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PATCHES] Warning about LISTEN names
Tom Lane wrote: > "Greg Sabino Mullane" <[EMAIL PROTECTED]> writes: > > I'll save the full rant for my blog :), but wanted to submit this > > documentation > > patch for this listen gotcha that's been bugging me for a while. I'd like > > to see LISTEN and NOTIFY changed to use a simple text string, but until > > then, > > I think we should probably warn about the chopping off of the left-hand > > part. > > Let's change it to a plain ColId, so you get a syntax error if you try > that. OK, so should I make this change for 8.3? -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PATCHES] Unbroke srcdir!=builddir compilation
Applied. --- Marko Kreen wrote: > Let's be decent. > > -- > marko [ Attachment, skipping... ] > > ---(end of broadcast)--- > TIP 6: explain analyze is your friend -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[PATCHES] DEALLOCATE ALL
When pooling connections where prepared statements are in use, it is hard to give new client totally clean connection as there may be allocated statements that give errors when new client starts preparing statements again. Currently PgPool solves the situation by parsing all queries and keeping a list of prepared statements. This may not be a big problem for it as it parses the queries anyway, but for a simple pooler like PgBouncer (see pgfoundry) that does not look inside libpq packets it is a huge problem. Attached is a patch that allows the keyword ALL to be used with DEALLOCATE and then frees all prepared queries. I think it is useful for most pooling situations, not only PgBouncer. Also it's similar in spirit to RESET ALL. I did it in a slightly hacky way - if DeallocateStmt->name is NULL it signifies the DEALLOCATE ALL command. All the code that looks into DeallocateStmt seems to survive the situation so it should be no problem. If a new node is still needed or an additional field in the node I can rework the patch. -- marko dealloc_all.diff Description: Binary data ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
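The pooling argument is that one fixed command replaces query parsing: the pooler can wipe every prepared statement without knowing any of their names. A minimal sketch of that reset step, with a hypothetical pooler API (not PgBouncer's actual code):

```python
class PooledConnection:
    """Sketch of a pooler handing a backend connection to a new client."""

    def __init__(self, execute):
        # `execute` is any callable that runs a SQL string on the backend.
        self.execute = execute

    def reset_for_new_client(self):
        # With DEALLOCATE ALL (plus RESET ALL for GUCs, as the thread
        # notes) the pooler need not inspect libpq traffic to learn
        # which statements a departing client prepared.
        self.execute("DEALLOCATE ALL")
        self.execute("RESET ALL")
```

Without DEALLOCATE ALL, the alternative is what PgPool does: parse every query, record each PREPARE by name, and issue one DEALLOCATE per name at handoff.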
Re: [PATCHES] Fast CLUSTER
Simon Riggs wrote: > On Thu, 2007-02-22 at 22:49 +, Simon Riggs wrote: > > On Tue, 2007-02-20 at 14:38 -0300, Alvaro Herrera wrote: > > > > > Cool. I noticed that the SGML seems broken here: > > > > Corrected. > > > > > You need to close the and opened in the COPY mention. > > > > > > > + > > > > + static void > > > > + heap_sync_relation(Relation rel) > > > > + { > > > > + if (!rel->rd_istemp) > > > > > > No comment in this function? > > > > I've added more comments as you suggest. > > > > Thanks for the review. > > Could we add this to the unapplied patches queue? It seems to have been > missed off the list. Thanks. Done. -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PATCHES] Fast CLUSTER
Your patch has been added to the PostgreSQL unapplied patches list at: http://momjian.postgresql.org/cgi-bin/pgpatches It will be applied as soon as one of the PostgreSQL committers reviews and approves it. --- Simon Riggs wrote: > On Tue, 2007-02-20 at 14:38 -0300, Alvaro Herrera wrote: > > > Cool. I noticed that the SGML seems broken here: > > Corrected. > > > You need to close the and opened in the COPY mention. > > > > > + > > > + static void > > > + heap_sync_relation(Relation rel) > > > + { > > > + if (!rel->rd_istemp) > > > > No comment in this function? > > I've added more comments as you suggest. > > Thanks for the review. > > -- > Simon Riggs > EnterpriseDB http://www.enterprisedb.com > [ Attachment, skipping... ] > > ---(end of broadcast)--- > TIP 5: don't forget to increase your free space map settings -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PATCHES] Improvement of procArray.xmin for VACUUM
Tom Lane wrote: > Bruce Momjian <[EMAIL PROTECTED]> writes: > > Tom Lane wrote: > >> Bruce Momjian <[EMAIL PROTECTED]> writes: > >>> Well, my secondary addition was to start MyProc->xmin with the current > >>> xid counter, rather than your own xid. > >> > >> Don't tell me you seriously believe that would work. > > > I do. Please tell me why it will not. > > You can't have GlobalXmin greater than your own xid, else log space > (particularly pg_subtrans) may be recycled too soon. OK, so maybe GlobalXmin would have to be split into two new variables that control log space and dead-row detection separately. -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PATCHES] Improvement of procArray.xmin for VACUUM
Bruce Momjian <[EMAIL PROTECTED]> writes: > Tom Lane wrote: >> Bruce Momjian <[EMAIL PROTECTED]> writes: >>> Well, my secondary addition was to start MyProc->xmin with the current >>> xid counter, rather than your own xid. >> >> Don't tell me you seriously believe that would work. > I do. Please tell me why it will not. You can't have GlobalXmin greater than your own xid, else log space (particularly pg_subtrans) may be recycled too soon. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PATCHES] patch adding new regexp functions
Your patch has been added to the PostgreSQL unapplied patches list at: http://momjian.postgresql.org/cgi-bin/pgpatches It will be applied as soon as one of the PostgreSQL committers reviews and approves it. --- Jeremy Drake wrote: > On Thu, 22 Mar 2007, Tom Lane wrote: > > > I'd vote for making this new code look like the rest of it, to wit > > hardwire the values. > > Attached please find a patch which does this. > > > > -- > Although written many years ago, Lady Chatterley's Lover has just been > reissued by the Grove Press, and this pictorial account of the > day-to-day life of an English gamekeeper is full of considerable > interest to outdoor minded readers, as it contains many passages on > pheasant-raising, the apprehending of poachers, ways to control vermin, > and other chores and duties of the professional gamekeeper. > Unfortunately, one is obliged to wade through many pages of extraneous > material in order to discover and savour those sidelights on the > management of a midland shooting estate, and in this reviewer's opinion > the book cannot take the place of J. R. Miller's "Practical > Gamekeeping." > -- Ed Zern, "Field and Stream" (Nov. 1959) Content-Description: [ Attachment, skipping... ] > > ---(end of broadcast)--- > TIP 5: don't forget to increase your free space map settings -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PATCHES] patch adding new regexp functions
Jeremy Drake wrote: > On Mon, 26 Mar 2007, Bruce Momjian wrote: > > > Tom Lane wrote: > > > Or were you speaking to the question of whether to adjust the regexp > > > patch to conform more nearly to the coding practices found elsewhere? > > > I agree with that, but I thought there was already a submitted patch > > > for it. > > > > Yes, regex patch adjustment, and I have not seen a patch which makes > > such adjustments. > > http://archives.postgresql.org/pgsql-patches/2007-03/msg00285.php Ah, yes, I still have that in my mailbox, but didn't think it was related because there was discussion _after_ the patch was posted. Will add the patch to the queue now. Thanks. -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PATCHES] Improvement of procArray.xmin for VACUUM
Tom Lane wrote: > Bruce Momjian <[EMAIL PROTECTED]> writes: > > Tom Lane wrote: > >> Uh, no, that's not very clear. A long-running transaction will be a > >> VACUUM bottleneck because of its own XID, never mind its xmin. > > > Well, my secondary addition was to start MyProc->xmin with the current > > xid counter, rather than your own xid. > > Don't tell me you seriously believe that would work. I do. Please tell me why it will not. -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[PATCHES] Patch for circular buffer in tuplestore to optimize merge joins (v1)
This patch implements a circular buffer in tuplestore which drops old tuples as they're no longer needed. It uses this for merge joins to avoid having to spill the tuplestore if no single value exceeds work_mem. It also is what's needed for both recursive query support and OLAP window functions (hence why it implements the more complex circular buffer rather than just moving the single tuple up to the head of the buffer). This was mostly already done by Simon, I just finished the logic in tuplesort.c. This is actually not quite polished so I guess it's still a WIP but it's certainly ready to be reviewed. All that remains is polishing. If there's anything in there people object to now I would like to know. mergejoin-circbuffer-v1.patch.gz Description: Binary data -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 6: explain analyze is your friend
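The core mechanism — keep only the tuples the join may still rescan, discarding everything behind an advancing trim point — can be sketched briefly. A hedged Python illustration with hypothetical names (the patch itself works inside tuplestore's C API):

```python
from collections import deque

class TrimmingBuffer:
    """Circular-buffer-style tuple store: tuples are addressed by
    absolute position, and everything before an advancing trim point
    is dropped, so memory holds only the still-rescannable window."""

    def __init__(self):
        self._buf = deque()
        self._start = 0  # absolute position of _buf[0]

    def append(self, tup):
        self._buf.append(tup)

    def fetch(self, pos):
        if pos < self._start:
            raise IndexError("tuple already discarded")
        return self._buf[pos - self._start]

    def trim(self, pos):
        """Caller promises never to fetch positions before pos again,
        e.g. a merge join advancing past a run of equal key values."""
        while self._start < pos and self._buf:
            self._buf.popleft()
            self._start += 1
```

In a merge join, trim() would advance as the outer side moves past each run of duplicate keys, so memory is bounded by the largest single group rather than the whole inner input — which is why no spill is needed unless one value's group of matches exceeds work_mem.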
Re: [PATCHES] Numeric patch to add special-case representations for < 8 bytes
"Bruce Momjian" <[EMAIL PROTECTED]> writes:

> Greg, do you want to submit a patch for this or add a TODO item for this?

Well, I never got any approval or rejection in principle, so I don't know if such a patch would be accepted even if it were implemented reasonably. If it has the consensus needed to be a TODO item, I would go ahead and do it.

The main design issue is that I was proposing to make it impossible to access the internals of the numeric storage using macros. Currently some of the data (the sign, dscale, and weight) is visible without having to call any special numeric functions. I was proposing to use representations where those might not be as easily accessible.

However, I don't think we have any consumers of that data outside of numeric.c anyway. Is there anything using that functionality currently? Do we mind losing it?

> ---
>
> Gregory Stark wrote:
>>
>> I've uploaded a quick hack to store numerics in < 8 bytes when possible.
>>
>> http://community.enterprisedb.com/numeric-hack-1.patch
>>
>> This is a bit of a kludge since it doesn't actually provide any interface for
>> external clients of the numeric module to parse the resulting data. Ie, the
>> macros in numeric.h will return garbage.
>>
>> But I'm not entirely convinced that matters. It's not like those macros were
>> really useful to any other modules anyways since there was no way to extract
>> the actual digits.
>>
>> The patch is also slightly unsatisfactory because as I discovered numbers like
>> 1.1 are stored as two digits currently. But it does work and it does save a
>> substantial amount of space for integers.
>>
>> postgres=# select n,pg_column_size(n) from n;
>>      n | pg_column_size
>> -------+----------------
>>      1 | 2
>>     11 | 2
>>    101 | 2
>>   1001 | 3
>>  10001 | 9
>>     11 | 9
>>    1.1 | 9
>>   10.1 | 9
>>  100.1 | 9
>> 1000.1 | 9
>>    1.1 | 11
>>   10.1 | 11
>>
>> I had hoped to get the second batch to be 3-4 bytes. But even now note how
>> much space is saved for integers <1.
--
Gregory Stark
EnterpriseDB  http://www.enterprisedb.com
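The space savings in the table above come from the usual variable-length trick: a small integer needs far fewer bytes than a general digit-array representation. The sketch below is a hedged illustration of that principle in Python, not the patch's actual on-disk format; `encode_small_int` and its 1-byte header are invented stand-ins for the real header bits (sign, dscale, weight).

```python
# Toy variable-length encoding in the spirit of the patch: store a small
# non-negative integer in the fewest whole bytes, prefixed by a 1-byte
# header carrying the length (a stand-in for sign/scale information).

def encode_small_int(n):
    body = n.to_bytes(max(1, (n.bit_length() + 7) // 8), "big")
    return bytes([len(body)]) + body

for n in (1, 11, 101, 1001, 100001):
    print(n, len(encode_small_int(n)))
# 1, 11, and 101 fit in 2 bytes total; 1001 needs 3 -- mirroring the
# shape of the pg_column_size figures quoted above.
```

The hard part the thread is debating isn't this encoding itself but the interface cost: once several representations exist, fields like sign and dscale can no longer be read with simple macros.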
[PATCHES] Unbroke srcdir!=builddir compilation
Let's be decent.

--
marko

pg13.diff
Description: Binary data
Re: [PATCHES] Fast CLUSTER
On Thu, 2007-02-22 at 22:49 +, Simon Riggs wrote:
> On Tue, 2007-02-20 at 14:38 -0300, Alvaro Herrera wrote:
>
> > Cool. I noticed that the SGML seems broken here:
>
> Corrected.
>
> > You need to close the and opened in the COPY mention.
>
> > > +
> > > + static void
> > > + heap_sync_relation(Relation rel)
> > > + {
> > > +     if (!rel->rd_istemp)
> >
> > No comment in this function?
>
> I've added more comments as you suggest.
>
> Thanks for the review.

Could we add this to the unapplied patches queue? It seems to have been missed off the list. Thanks.

There is a probable conflict with Heikki's recent CLUSTER patch, so I'm happy to re-write this patch after that has been applied. So it's OK to stick it at the bottom of the queue.

--
Simon Riggs
EnterpriseDB  http://www.enterprisedb.com
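The quoted heap_sync_relation() fragment hints at the pattern behind the fast-CLUSTER patch: bulk-write without WAL, then force the relation file to disk once at the end, skipping the sync for temp relations (which don't survive a crash anyway). The sketch below is a hedged Python illustration of that pattern, not the patch's C code; the function signature is invented.

```python
import os
import tempfile

# Illustrative sketch: after a WAL-bypassing bulk operation, the relation
# file must be fsync'd once at the end so the data survives a crash --
# unless the relation is temporary, in which case the sync is skipped.

def heap_sync_relation(path, is_temp):
    if is_temp:
        return False              # temp relations need no crash safety
    fd = os.open(path, os.O_RDWR)
    try:
        os.fsync(fd)              # force the bulk-loaded pages to disk
    finally:
        os.close(fd)
    return True

with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"bulk-loaded data")
    path = f.name
print(heap_sync_relation(path, is_temp=False))  # -> True
print(heap_sync_relation(path, is_temp=True))   # -> False
os.remove(path)
```

The payoff is that the bulk operation writes each page once (no WAL copy), at the cost of a single sync at commit — which is also why Alvaro's review asks for a comment explaining the `rd_istemp` test.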