Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-03-27 Thread Koichi Suzuki

Simon;

Thanks a lot for your comments and advice. I'd like to offer some feedback.

Simon Riggs wrote:

On Tue, 2007-03-27 at 11:52 +0900, Koichi Suzuki wrote:


Here's an update of the code to improve full page writes as proposed in

http://archives.postgresql.org/pgsql-hackers/2007-01/msg01491.php
and
http://archives.postgresql.org/pgsql-patches/2007-01/msg00607.php

This update includes some modifications to error handling in the archiver
and the restoration command.

In the previous threads, I posted several evaluations and showed that we
can keep all the full page writes needed for full XLOG crash recovery,
while compressing the archive log considerably better than gzip, with
less CPU consumption.  I've found no further objections to this proposal
but would still like to hear comments, opinions, and advice.


Koichi-san,

Looks interesting. I like the small amount of code to do this.

A few thoughts:

- Not sure why we need "full_page_compress", why not just mark them
always? That harms no one. (Did someone else ask for that? If so, keep
it)


No, no one asked for a separate option, and marking them always would
do no harm. As described below, full page writes can be categorized as
follows:

1) Needed for crash recovery: the first update of each page after a
checkpoint. These have to be kept in WAL.

2) Needed for archive recovery: page updates between pg_start_backup and
pg_stop_backup. These have to be kept in the archive log.

3) For a log-shipping standby such as pg_standby: no full page writes
are needed for this purpose.

My proposal deals with 2). So if we mark each full page write, I'd
rather mark it when it is needed. We still need only one bit, because
case 3) does not need any mark.
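The three categories above can be sketched as two retention predicates. This is an illustrative sketch only; the struct and function names are invented for the example and are not the actual PostgreSQL symbols.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-image flags; names are illustrative. */
typedef struct
{
    bool first_after_checkpoint; /* category 1: needed for crash recovery */
    bool during_base_backup;     /* category 2: needed for archive recovery */
} FpwInfo;

/* Must the full page image stay in the online WAL? (category 1) */
static bool
keep_in_wal(const FpwInfo *fpw)
{
    return fpw->first_after_checkpoint;
}

/* Must pg_compresslog keep the image in the archived log? (category 2) */
static bool
keep_in_archive(const FpwInfo *fpw)
{
    return fpw->during_base_backup;
}

/* A log-shipping standby (category 3) needs neither predicate to hold,
 * so a compressed archive may drop any image for which
 * keep_in_archive() is false. */
```

Only one bit per image is needed because category 3 never requires a mark, exactly as described above.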


- OTOH I'd like to see an explicit parameter set during recovery since
you're asking the main recovery path to act differently in case a single
bit is set/unset. If you are using that form of recovery, we should say
so explicitly, to keep everybody else safe.


The only thing I had to do is create "dummy" full page writes to
maintain LSNs. Full page writes are omitted from the archive log, but we
have to keep LSNs the same as those in the original WAL. In this case,
recovery has to read the logical log, not the "dummy" full page writes.
On the other hand, if both a logical log and a "real" full page write
are found in a log record, recovery has to use the "real" full page
write.


- I'd rather mark just the nonremovable blocks. But no real difference


That sounds nicer. According to the full page write categories above, we
can mark full page writes as needed for "crash recovery" or for "archive
recovery". Please give some feedback on the full page write categories
above.



- We definitely don't want a normal elog in an XLogInsert critical
section, especially at DEBUG1 level


I agree. I'll remove elog calls from critical sections.


- diff -c format is easier and the standard


I'll change the diff option.


- pg_logarchive and pg_logrestore should be called by a name that
reflects what they actually do. Possibly pg_compresslog and
pg_decompresslog etc.. I've not reviewed those programs, but:


I wasn't careful to base the command names on what they actually do.
I'll change the command names.




- Not sure why we have to compress away page headers. Touch as little as
you can has always been my thinking with recovery code.


Eliminating page headers improves the compression ratio only slightly, by
a couple of percent.   To keep the code simpler, I'll drop this compression.




- I'm very uncomfortable with touching the LSN. Maybe I misunderstand?


The reason we need pg_logarchive (or pg_decompresslog) is to keep LSNs
the same as those in the original WAL. For this purpose, we need "dummy"
full page writes. I don't like touching LSNs either, and this is why
pg_decompresslog needs some extra work; it eliminates many other changes
in the recovery code.
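The length bookkeeping behind the "dummy" full page writes can be sketched as follows. An LSN is essentially a byte position in WAL, so each record's stored length determines the LSNs of everything after it. The header size and function names below are invented for illustration, not the real WAL layout.

```c
#include <assert.h>
#include <stddef.h>

#define REC_HEADER_SZ 32        /* assumed record header size */

/* Length of a record in the original WAL: header, logical data, and
 * the full page image. */
static size_t
original_len(size_t logical_len, size_t image_len)
{
    return REC_HEADER_SZ + logical_len + image_len;
}

/* pg_compresslog drops the image from the archived copy... */
static size_t
compressed_len(size_t logical_len)
{
    return REC_HEADER_SZ + logical_len;
}

/* ...and pg_decompresslog restores a zero-filled dummy image of the
 * same size, so the restored record occupies exactly as many bytes as
 * the original and all later LSNs still line up. */
static size_t
restored_len(size_t logical_len, size_t image_len)
{
    return REC_HEADER_SZ + logical_len + image_len;
}
```

Recovery then reads the logical data and ignores the zero-filled dummy, which is why the main recovery path needs to know this mode is in use.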



- Have you thought about how pg_standby would integrate with this
option? Can you please?


Yes, I believe so. As pg_standby has no chance of encountering partially
written pages, I believe you can omit all the full page writes. Of
course, as Tom Lane suggested in
http://archives.postgresql.org/pgsql-hackers/2007-02/msg00034.php
removing full page writes loses the chance to recover from
partial/inconsistent writes after a crash of pg_standby. In that case,
we have to import a backup and archive logs (with the full page writes
taken during the backup) to recover. (We have to import them when the
file system crashes anyway.) If that's okay, I believe
pg_compresslog/pg_decompresslog can be integrated with pg_standby.

Maybe we can work together to include pg_compresslog/pg_decompresslog in 
pg_standby.




- I'll do some docs for this after Freeze, if you'd like. I have some
other changes to make there, so I can do this at the same time.


I'll be looking forward to your documentation.

Best regards;

--
Koichi Suzuki

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

[PATCHES] Small code clean-up

2007-03-27 Thread ITAGAKI Takahiro
Here are two small code clean-ups in initdb and win32_shmem.

pg_char_to_encoding() was redundant in initdb, because
pg_valid_server_encoding() returns the same result when the encoding is
valid.
Changes in win32_shmem suppress the following warnings.
| pg_shmem.c: In function `PGSharedMemoryCreate':
| pg_shmem.c:137: warning: long unsigned int format, Size arg (arg 2)
| pg_shmem.c:159: warning: long unsigned int format, Size arg (arg 2)
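The fix for these warnings is the explicit cast visible in the diff below: `%lu` expects `unsigned long`, while `Size` may be a different unsigned type depending on platform. A minimal standalone illustration of the pattern (`format_size` is an invented helper, not a PostgreSQL function):

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef size_t Size;    /* PostgreSQL's Size is a typedef of size_t */

/* Cast the Size to the exact type the format specifier expects, so the
 * call is correct (and warning-free) on every platform. */
static int
format_size(char *buf, size_t buflen, Size size)
{
    return snprintf(buf, buflen, "size=%lu", (unsigned long) size);
}
```

On C99 platforms `%zu` would avoid the cast entirely, but the cast keeps the code portable to compilers without `%zu` support, which matters for the Windows builds this patch targets.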


*** initdb.c.orig   Mon Mar 19 01:50:43 2007
--- initdb.cWed Mar 28 10:02:42 2007
*** get_encoding_id(char *encoding_name)
*** 709,716 
  
if (encoding_name && *encoding_name)
{
!   if ((enc = pg_char_to_encoding(encoding_name)) >= 0 &&
!   pg_valid_server_encoding(encoding_name) >= 0)
return encodingid_to_string(enc);
}
fprintf(stderr, _("%s: \"%s\" is not a valid server encoding name\n"),
--- 709,715 
  
if (encoding_name && *encoding_name)
{
!   if ((enc = pg_valid_server_encoding(encoding_name)) >= 0)
return encodingid_to_string(enc);
}
fprintf(stderr, _("%s: \"%s\" is not a valid server encoding name\n"),


*** win32_shmem.c.orig  Wed Mar 28 10:17:14 2007
--- win32_shmem.c   Wed Mar 28 10:16:36 2007
*** PGSharedMemoryCreate(Size size, bool mak
*** 136,142 
if (!hmap)
ereport(FATAL,
(errmsg("could not create shared memory 
segment: %lu", GetLastError()),
!errdetail("Failed system call was 
CreateFileMapping(size=%lu, name=%s)", size, szShareMem)));
  
/*
 * If the segment already existed, CreateFileMapping() will return a
--- 136,142 
if (!hmap)
ereport(FATAL,
(errmsg("could not create shared memory 
segment: %lu", GetLastError()),
!errdetail("Failed system call was 
CreateFileMapping(size=%lu, name=%s)", (unsigned long) size, szShareMem)));
  
/*
 * If the segment already existed, CreateFileMapping() will return a
*** PGSharedMemoryCreate(Size size, bool mak
*** 158,164 
if (!hmap)
ereport(FATAL,
(errmsg("could not create shared memory 
segment: %lu", GetLastError()),
!errdetail("Failed system call was 
CreateFileMapping(size=%lu, name=%s)", size, szShareMem)));
  
if (GetLastError() == ERROR_ALREADY_EXISTS)
ereport(FATAL,
--- 158,164 
if (!hmap)
ereport(FATAL,
(errmsg("could not create shared memory 
segment: %lu", GetLastError()),
!errdetail("Failed system call was 
CreateFileMapping(size=%lu, name=%s)", (unsigned long) size, szShareMem)));
  
if (GetLastError() == ERROR_ALREADY_EXISTS)
ereport(FATAL,


Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PATCHES] [PATCH] add CLUSTER table ORDER BY index

2007-03-27 Thread Tom Lane
Gregory Stark <[EMAIL PROTECTED]> writes:
> "Holger Schurig" <[EMAIL PROTECTED]> writes:
>> * psql tab-completion: it now favours CLUSTER table ORDER BY index

> It occurs to me (sorry that I didn't think of this earlier) that if we're
> going to use "ORDER BY" it really ought to take a list of columns.

Surely you jest.  The point is to be ordered the same as the index, no?

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PATCHES] autovacuum: multiple workers

2007-03-27 Thread Simon Riggs
On Tue, 2007-03-27 at 17:41 -0400, Alvaro Herrera wrote:

> The main change is to have an array of Worker structs in shared memory;
> each worker checks the current table of all other Workers, and skips a
> table that's being vacuumed by any of them.  It also rechecks the table
> before vacuuming, which removes the problem of redundant vacuuming.

Slightly OT: Personally, I'd like it if we added an array for all
special backends, with configurable behaviour. That way it would be
easier to have multiple copies of other backends of any flavour using
the same code, as well as adding others without cutting and pasting each
time. That part of the postmaster code has oozed sideways in the past
few years and seems in need of some love. (A former sinner repents).

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PATCHES] Patch for circular buffer in tuplestore to optimizemerge joins (v1)

2007-03-27 Thread Simon Riggs
On Tue, 2007-03-27 at 13:32 +0100, stark wrote:

> This patch implements a circular buffer in tuplestore which drops old tuples
> as they're no longer needed. It uses this for merge joins to avoid having to
> spill the tuplestore if no single value exceeds work_mem. It also is what's
> needed for both recursive query support and OLAP window functions (hence why
> it implements the more complex circular buffer rather than just moving the
> single tuple up to the head of the buffer).
> 
> This was mostly already done by Simon, I just finished the logic in 
> tuplesort.c.

Cool. 

This item was previously discussed here:
http://archives.postgresql.org/pgsql-hackers/2007-01/msg00096.php

(this changes tuplestore.c, not tuplesort.c, though that's clear in the
patch)

> This is actually not quite polished so I guess it's still a WIP but it's
> certainly ready to be reviewed. All that remains is polishing. If there's
> anything in there people object to now I would like to know. 

I guess a performance test would be a great eyecatcher here, but it does
seem otherwise complete. Any eager merge join testers out there?
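The core idea being reviewed, a buffer that drops old tuples once the merge join can no longer rescan them, can be sketched in miniature. This toy ring holds ints instead of tuples and uses invented names; the real tuplestore spills to disk where this sketch simply reports the buffer as full.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define BUF_CAP 4               /* toy stand-in for work_mem */

typedef struct
{
    int  items[BUF_CAP];
    int  head;                  /* slot of the oldest retained tuple */
    int  count;                 /* tuples currently held */
    long first;                 /* logical position of the tuple at head */
} TupleRing;

static void
ring_init(TupleRing *r)
{
    memset(r, 0, sizeof(*r));
}

/* Append one tuple; false means the ring is full (the real tuplestore
 * would have to spill to disk at this point). */
static bool
ring_put(TupleRing *r, int tup)
{
    if (r->count == BUF_CAP)
        return false;
    r->items[(r->head + r->count) % BUF_CAP] = tup;
    r->count++;
    return true;
}

/* Drop all tuples with logical position < pos: the merge join calls
 * this once it knows it will never rescan before pos, freeing space
 * without spilling. */
static void
ring_trim(TupleRing *r, long pos)
{
    while (r->count > 0 && r->first < pos)
    {
        r->head = (r->head + 1) % BUF_CAP;
        r->first++;
        r->count--;
    }
}
```

As the thread notes, the full circular buffer (rather than moving one tuple to the head) is what recursive queries and OLAP window functions would also need.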

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[PATCHES] autovacuum: multiple workers

2007-03-27 Thread Alvaro Herrera
Hi,

This is the patch to put multiple workers into autovacuum.  This patch
applies after the recheck patch I just posted.

The main change is to have an array of Worker structs in shared memory;
each worker checks the current table of all other Workers, and skips a
table that's being vacuumed by any of them.  It also rechecks the table
before vacuuming, which removes the problem of redundant vacuuming.
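The skip check described above can be sketched as a scan over the shared Worker array. This is an illustrative reduction, not the patch's code: the field names follow the patch's WorkerInfo struct, but the function and its locking-free form are invented for the example (the real check would run under a lock).

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int Oid;

/* Minimal stand-in for the WorkerInfo array kept in shared memory. */
typedef struct
{
    Oid wi_dboid;       /* database the worker is assigned to */
    Oid wi_tableoid;    /* table currently being vacuumed, 0 if none */
    int wi_workerpid;   /* PID of the worker, 0 if slot unused */
} WorkerInfo;

/* Is some other worker already vacuuming this table?  A worker checks
 * this when building its list, and again right before vacuuming. */
static bool
table_claimed_elsewhere(const WorkerInfo *workers, int nworkers,
                        int mypid, Oid tableoid)
{
    for (int i = 0; i < nworkers; i++)
    {
        if (workers[i].wi_workerpid != 0 &&
            workers[i].wi_workerpid != mypid &&
            workers[i].wi_tableoid == tableoid)
            return true;
    }
    return false;
}
```

Combined with the pgstat recheck, this is what removes the redundant-vacuuming problem once several workers run concurrently.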

It also introduces the business of SIGUSR1 between workers and launcher.
The launcher keeps a database list in memory and schedules workers to
vacuum databases depending on that list.  The actual database selected
may differ from what was in the schedule; in that case, the list is
reconstructed.

There are two main FIXMEs in this code:

1. have the list reconstruction and scheduling be smarter so that
databases are not ganged together in the schedule.  The only difficulty
is keeping the sort order that the databases had.

2. have a way to clean up after failed workers filling up the Worker
array and thus starving other databases of vacuuming.  I don't really
know a way to do this that works in all cases.  The only idea I have so
far is to treat workers that started more than autovacuum_naptime
seconds ago as having failed to start.


Neither of these is really minor, but I think they are solvable.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
*** 14recheck/src/backend/postmaster/autovacuum.c	2007-03-27 16:43:31.0 -0400
--- 12vacuum/src/backend/postmaster/autovacuum.c	2007-03-27 17:40:19.0 -0400
***
*** 52,57 
--- 52,58 
  #include "utils/syscache.h"
  
  
+ static volatile sig_atomic_t got_SIGUSR1 = false;
  static volatile sig_atomic_t got_SIGHUP = false;
  static volatile sig_atomic_t avlauncher_shutdown_request = false;
  
***
*** 59,64 
--- 60,66 
   * GUC parameters
   */
  bool		autovacuum_start_daemon = false;
+ int			autovacuum_max_workers;
  int			autovacuum_naptime;
  int			autovacuum_vac_thresh;
  double		autovacuum_vac_scale;
***
*** 69,75 
  int			autovacuum_vac_cost_delay;
  int			autovacuum_vac_cost_limit;
  
! /* Flag to tell if we are in the autovacuum daemon process */
  static bool am_autovacuum_launcher = false;
  static bool am_autovacuum_worker = false;
  
--- 71,77 
  int			autovacuum_vac_cost_delay;
  int			autovacuum_vac_cost_limit;
  
! /* Flags to tell if we are in an autovacuum process */
  static bool am_autovacuum_launcher = false;
  static bool am_autovacuum_worker = false;
  
***
*** 86,91 
--- 88,94 
  typedef struct autovac_dbase
  {
  	Oid			ad_datid;
+ 	TimestampTz	ad_next_worker;
  	char	   *ad_name;
  	TransactionId ad_frozenxid;
  	PgStat_StatDBEntry *ad_entry;
***
*** 110,123 
  	int			at_vacuum_cost_limit;
  } autovac_table;
  
  typedef struct
  {
! 	Oid		process_db;			/* OID of database to process */
! 	int		worker_pid;			/* PID of the worker process, if any */
  } AutoVacuumShmemStruct;
  
  static AutoVacuumShmemStruct *AutoVacuumShmem;
  
  #ifdef EXEC_BACKEND
  static pid_t avlauncher_forkexec(void);
  static pid_t avworker_forkexec(void);
--- 113,158 
  	int			at_vacuum_cost_limit;
  } autovac_table;
  
+ /*-
+  * This struct holds information about a single worker's whereabouts.  We keep
+  * an array of these in shared memory, sized according to
+  * autovacuum_max_workers.
+  *
+  * wi_dboid		OID of the database this worker is supposed to work on
+  * wi_tableoid	OID of the table currently being vacuumed
+  * wi_workerpid	PID of the running worker, 0 if not yet started
+  * wi_finished	True when the worker is done and about to exit
+  *-
+  */
+ typedef struct
+ {
+ 	Oid			wi_dboid;
+ 	Oid			wi_tableoid;
+ 	int			wi_workerpid;
+ 	bool		wi_finished;
+ } WorkerInfo;
+ 
  typedef struct
  {
! 	pid_t		av_launcherpid;
! 	WorkerInfo	av_workers[1];
! 	/* VARIABLE LENGTH STRUCT */
  } AutoVacuumShmemStruct;
  
+ /* Macro to iterate over all workers.  Beware multiple evaluation of args! */
+ #define foreach_worker(_i, _worker) \
+ 	_worker = (WorkerInfo *) (AutoVacuumShmem + \
+ 			  offsetof(AutoVacuumShmemStruct, av_workers)); \
+ 	for (_i = 0; _i < autovacuum_max_workers; _i++, _worker += sizeof(WorkerInfo))
+ 
  static AutoVacuumShmemStruct *AutoVacuumShmem;
  
+ /* number of currently free worker slots; only valid in the launcher */
+ static int free_workers;
+ /* the database list in the launcher, and the context that contains it */
+ static Dllist *DatabaseList = NULL;
+ static MemoryContext DatabaseListCxt = NULL;
+ 
  #ifdef EXEC_BACKEND
  static pid_t avlauncher_forkexec(void);
  static pid_t avworker_forkexec(void);
***
*** 125,133 
  NON_EXEC_STATIC void AutoVacWorkerMain(int argc, char *argv[]);
  NON_EXEC_STATIC void AutoVacLauncherMain(int argc, char *argv[]);
  
! static void do_start_wo

Re: [PATCHES] [PATCH] add CLUSTER table ORDER BY index

2007-03-27 Thread Gregory Stark
"Holger Schurig" <[EMAIL PROTECTED]> writes:

> * psql tab-completion: it now favours CLUSTER table ORDER BY index

It occurs to me (sorry that I didn't think of this earlier) that if we're
going to use "ORDER BY" it really ought to take a list of columns. It would be
more consistent with what ORDER BY means in queries and one day we may want to
add support for ordering by arbitrary lists of columns even if no index
exists.

It may be reasonable to allow both to coexist, with an arbitrary rule
that an index of the specified name takes precedence over any column.
I've never seen anyone name their indexes in a way that would conflict
with a column anyway.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-03-27 Thread Simon Riggs
On Tue, 2007-03-27 at 11:52 +0900, Koichi Suzuki wrote:

> Here's an update of the code to improve full page writes as proposed in
> 
> http://archives.postgresql.org/pgsql-hackers/2007-01/msg01491.php
> and
> http://archives.postgresql.org/pgsql-patches/2007-01/msg00607.php
> 
> This update includes some modifications to error handling in the archiver
> and the restoration command.
> 
> In the previous threads, I posted several evaluations and showed that we
> can keep all the full page writes needed for full XLOG crash recovery,
> while compressing the archive log considerably better than gzip, with
> less CPU consumption.  I've found no further objections to this proposal
> but would still like to hear comments, opinions, and advice.

Koichi-san,

Looks interesting. I like the small amount of code to do this.

A few thoughts:

- Not sure why we need "full_page_compress", why not just mark them
always? That harms no one. (Did someone else ask for that? If so, keep
it)

- OTOH I'd like to see an explicit parameter set during recovery since
you're asking the main recovery path to act differently in case a single
bit is set/unset. If you are using that form of recovery, we should say
so explicitly, to keep everybody else safe.

- I'd rather mark just the nonremovable blocks. But no real difference

- We definitely don't want a normal elog in an XLogInsert critical
section, especially at DEBUG1 level

- diff -c format is easier and the standard

- pg_logarchive and pg_logrestore should be called by a name that
reflects what they actually do. Possibly pg_compresslog and
pg_decompresslog etc.. I've not reviewed those programs, but:

- Not sure why we have to compress away page headers. Touch as little as
you can has always been my thinking with recovery code.

- I'm very uncomfortable with touching the LSN. Maybe I misunderstand?

- Have you thought about how pg_standby would integrate with this
option? Can you please?

- I'll do some docs for this after Freeze, if you'd like. I have some
other changes to make there, so I can do this at the same time.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


[PATCHES] autovacuum: recheck logic

2007-03-27 Thread Alvaro Herrera
Hi,

This is the first non-trivial patch to autovacuum multiple workers.

This patch adds the "recheck" logic to the autovacuum worker.  With it,
the worker first builds its table list, and then rechecks pgstat before
each table to verify that no one has vacuumed it in the meantime.

In the current autovacuum world this only means that a worker will not
vacuum a table that a user has vacuumed manually.  As discussed, it will
be much more useful as soon as multiple workers are running
concurrently.

To do this, I separated the task of calculating autovacuum parameters
(freeze_min_age, vacuum cost limit and delay, etc) from the autovac
equation calculation (freeze_max_age, pg_class.reltuples, etc).

We now keep track of three lists at the initial pg_class scan:

1. tables that need vacuum or analyze, per equations
2. tables not in (1) that have toast tables
3. toast tables that need vacuum

Then we append those tables in (2) whose toast tables are in (3), to the
(1) list.  The rest are discarded.  So when we need to do the
rechecking, we need to process only those tables that actually needed
vacuuming.  With the previous coding, we would end up rechecking almost
all tables every time (to be exact, all tables that have a toast table).
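The three-list merge described above can be sketched with arrays standing in for the lists. The function and helper names are invented for illustration; the patch itself uses PostgreSQL's List machinery.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int Oid;

/* A list-2 entry: a table plus its toast table. */
typedef struct
{
    Oid relid;
    Oid toastrelid;
} RelToast;

static bool
oid_in(const Oid *arr, int n, Oid oid)
{
    for (int i = 0; i < n; i++)
        if (arr[i] == oid)
            return true;
    return false;
}

/* Append to table_oids (list 1) every table from table_toast_list
 * (list 2) whose toast table appears in toast_oids (list 3); the rest
 * are discarded.  Returns the new length of list 1, which must have
 * room for the appended entries. */
static int
merge_toast_lists(Oid *table_oids, int ntables,
                  const RelToast *table_toast_list, int npairs,
                  const Oid *toast_oids, int ntoast)
{
    for (int i = 0; i < npairs; i++)
        if (oid_in(toast_oids, ntoast, table_toast_list[i].toastrelid))
            table_oids[ntables++] = table_toast_list[i].relid;
    return ntables;
}
```

The payoff is that the second (recheck) pass only ever sees tables that actually needed vacuuming, instead of every table with a toast table.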

The autovacuum parameters are only calculated in the second pass (the
rechecking).  The first pass only yields boolean parameters.

Unless there are objections I'll apply this tomorrow.

-- 
Alvaro Herrerahttp://www.PlanetPostgreSQL.org
"La naturaleza, tan frágil, tan expuesta a la muerte... y tan viva"
Index: src/backend/postmaster/autovacuum.c
===
RCS file: /home/alvherre/cvs/pgsql/src/backend/postmaster/autovacuum.c,v
retrieving revision 1.39
diff -c -p -r1.39 autovacuum.c
*** src/backend/postmaster/autovacuum.c	27 Mar 2007 20:36:03 -	1.39
--- src/backend/postmaster/autovacuum.c	27 Mar 2007 20:43:31 -
*** typedef struct autovac_dbase
*** 91,96 
--- 91,103 
  	PgStat_StatDBEntry *ad_entry;
  } autovac_dbase;
  
+ /* struct to keep track of tables to vacuum and/or analyze, in 1st pass */
+ typedef struct av_relation
+ {
+ 	Oid		ar_relid;
+ 	Oid		ar_toastrelid;
+ } av_relation;
+ 
  /* struct to keep track of tables to vacuum and/or analyze, after rechecking */
  typedef struct autovac_table
  {
*** NON_EXEC_STATIC void AutoVacLauncherMain
*** 121,137 
  static void do_start_worker(void);
  static void do_autovacuum(Oid dbid);
  static List *autovac_get_database_list(void);
! static void test_rel_for_autovac(Oid relid, PgStat_StatTabEntry *tabentry,
! 	 Form_pg_class classForm,
! 	 Form_pg_autovacuum avForm,
! 	 List **vacuum_tables,
! 	 List **toast_table_ids);
  static void autovacuum_do_vac_analyze(Oid relid, bool dovacuum,
  		  bool doanalyze, int freeze_min_age);
  static HeapTuple get_pg_autovacuum_tuple_relid(Relation avRel, Oid relid);
  static PgStat_StatTabEntry *get_pgstat_tabentry_relid(Oid relid, bool isshared,
  		  PgStat_StatDBEntry *shared,
  		  PgStat_StatDBEntry *dbentry);
  static void autovac_report_activity(VacuumStmt *vacstmt, Oid relid);
  static void avl_sighup_handler(SIGNAL_ARGS);
  static void avlauncher_shutdown(SIGNAL_ARGS);
--- 128,152 
  static void do_start_worker(void);
  static void do_autovacuum(Oid dbid);
  static List *autovac_get_database_list(void);
! 
! static void relation_check_autovac(Oid relid, Form_pg_class classForm,
! 	   Form_pg_autovacuum avForm, PgStat_StatTabEntry *tabentry,
! 	   List **table_oids, List **table_toast_list,
! 	   List **toast_oids);
! static autovac_table *table_recheck_autovac(Oid relid,
! 			PgStat_StatDBEntry *shared,
! 			PgStat_StatDBEntry *dbentry);
  static void autovacuum_do_vac_analyze(Oid relid, bool dovacuum,
  		  bool doanalyze, int freeze_min_age);
  static HeapTuple get_pg_autovacuum_tuple_relid(Relation avRel, Oid relid);
  static PgStat_StatTabEntry *get_pgstat_tabentry_relid(Oid relid, bool isshared,
  		  PgStat_StatDBEntry *shared,
  		  PgStat_StatDBEntry *dbentry);
+ static void relation_needs_vacanalyze(Oid relid, Form_pg_autovacuum avForm,
+ 		  Form_pg_class classForm,
+ 		  PgStat_StatTabEntry *tabentry, bool *dovacuum,
+ 		  bool *doanalyze);
+ static HeapTuple get_pg_autovacuum_tuple_relid(Relation avRel, Oid relid);
  static void autovac_report_activity(VacuumStmt *vacstmt, Oid relid);
  static void avl_sighup_handler(SIGNAL_ARGS);
  static void avlauncher_shutdown(SIGNAL_ARGS);
*** do_autovacuum(Oid dbid)
*** 872,879 
  	HeapTuple	tuple;
  	HeapScanDesc relScan;
  	Form_pg_database dbForm;
! 	List	   *vacuum_tables = NIL;
! 	List	   *toast_table_ids = NIL;
  	ListCell   *cell;
  	PgStat_StatDBEntry *shared;
  	PgStat_StatDBEntry *dbentry;
--- 887,895 
  	HeapTuple	tuple;
  	HeapScanDesc relScan;
  	F

Re: [PATCHES] [pgsql-patches] O_DIRECT support for Windows

2007-03-27 Thread Magnus Hagander
IIRC, we're still waiting for performance numbers showing there exists a
win from this patch.

//Magnus

Bruce Momjian wrote:
> Magnus, where are we on this?
> 
> ---
> 
> Magnus Hagander wrote:
>> We're ok with the alignment issues provided there is code added to reject
>> O_DIRECT if the sector size is too large.
>>
>> We also said we need to see some performance numbers on the effect of
>> the patch before it goes in.
>>
>> //Magnus
>>
>>
>> Bruce Momjian wrote:
>>> So, do we want this patch?  Are we OK on WIN32 alignment issues?
>>>
>>> ---
>>>
>>> ITAGAKI Takahiro wrote:
 The attached is a patch to define O_DIRECT by ourselves on Windows,
 and to map O_DIRECT to FILE_FLAG_NO_BUFFERING.

 There will be consistency in our support between Windows and other OSes
 that have O_DIRECT. Also, there is the following comment that says, as I
 read it, we should do so.
 | handle other flags? (eg FILE_FLAG_NO_BUFFERING/FILE_FLAG_WRITE_THROUGH)

 Is this worth doing? Do we need more performance reports for the change?

 Regards,
 ---
 ITAGAKI Takahiro
 NTT Open Source Software Center
>>> [ Attachment, skipping... ]
>>>
 ---(end of broadcast)---
 TIP 2: Don't 'kill -9' the postmaster
>>
>> ---(end of broadcast)---
>> TIP 7: You can help support the PostgreSQL project by donating at
>>
>> http://www.postgresql.org/about/donate
> 


---(end of broadcast)---
TIP 6: explain analyze is your friend


[PATCHES] [PATCH] add CLUSTER table ORDER BY index

2007-03-27 Thread Holger Schurig
The following patch adds a new variant of the CLUSTER command. The old
variants are preserved, as suggested in the TODO entry.

Things I changed:

* The grammar
* psql help text
* psql tab-completion: it now favours CLUSTER table ORDER BY index
* two uses of CLUSTER in the regression tests, so that both the old and
  new syntax get checked

Things to consider:

* not yet in the documentation
* psql should probably no longer emit "CLUSTER index ON table"


Index: src/doc/TODO
===
*** src.orig/doc/TODO   2007-03-27 21:18:01.0 +0200
--- src/doc/TODO2007-03-27 21:18:26.0 +0200
***
*** 624,631 
  
o %Add VERBOSE option to report tables as they are processed,
  like VACUUM VERBOSE
-   o Add more logical syntax CLUSTER table ORDER BY index; 
- support current syntax for backward compatibility
  
  
  * COPY
--- 624,629 
Index: src/doc/src/FAQ/TODO.html
===
*** src.orig/doc/src/FAQ/TODO.html  2007-03-27 21:18:01.0 +0200
--- src/doc/src/FAQ/TODO.html   2007-03-27 21:18:26.0 +0200
***
*** 558,565 
  
  %Add VERBOSE option to report tables as they are processed,
like VACUUM VERBOSE
- Add more logical syntax CLUSTER table ORDER BY index; 
-   support current syntax for backward compatibility

COPY

--- 558,563 
Index: src/src/backend/parser/gram.y
===
*** src.orig/src/backend/parser/gram.y  2007-03-27 21:18:01.0 +0200
--- src/src/backend/parser/gram.y   2007-03-27 21:18:26.0 +0200
***
*** 209,215 
  
  %typerelation_name copy_file_name
database_name access_method_clause 
access_method attr_name
!   index_name name file_name
  
  %type   func_name handler_name qual_Op qual_all_Op subquery_Op
opt_class opt_validator
--- 209,215 
  
  %typerelation_name copy_file_name
database_name access_method_clause 
access_method attr_name
!   index_name name file_name opt_cluster_order_by
  
  %type   func_name handler_name qual_Op qual_all_Op subquery_Op
opt_class opt_validator
***
*** 5327,5332 
--- 5327,5333 
   *
   *QUERY:
   *cluster  on 
+  *cluster  ORDER BY 
   *cluster 
   *cluster
   *
***
*** 5340,5350 
   n->indexname = $2;
   $$ = (Node*)n;
}
!   | CLUSTER qualified_name
{
   ClusterStmt *n = makeNode(ClusterStmt);
   n->relation = $2;
!  n->indexname = NULL;
   $$ = (Node*)n;
}
| CLUSTER
--- 5341,5351 
   n->indexname = $2;
   $$ = (Node*)n;
}
!   | CLUSTER qualified_name opt_cluster_order_by
{
   ClusterStmt *n = makeNode(ClusterStmt);
   n->relation = $2;
!  n->indexname = $3;
   $$ = (Node*)n;
}
| CLUSTER
***
*** 5356,5361 
--- 5357,5368 
}
;
  
+ opt_cluster_order_by:
+   ORDER BY index_name { $$ = $3; }
+   | /*EMPTY*/ { $$ = NULL; }
+   ;
+ 
+ 
  /*
   *
   *QUERY:
Index: src/src/bin/psql/sql_help.h
===
*** src.orig/src/bin/psql/sql_help.h2007-03-27 21:18:01.0 +0200
--- src/src/bin/psql/sql_help.h 2007-03-27 21:18:26.0 +0200
***
*** 119,125 
  
  { "CLUSTER",
N_("cluster a table according to an index"),
!   N_("CLUSTER indexname ON tablename\nCLUSTER tablename\nCLUSTER") },
  
  { "COMMENT",
N_("define or change the comment of an object"),
--- 119,125 
  
  { "CLUSTER",
N_("cluster a table according to an index"),
!   N_("CLUSTER indexname ON tablename\nCLUSTER tablename [ORDER BY 
indexname]\nCLUSTER") },
  
  { "COMMENT",
   

Re: [PATCHES] DEALLOCATE ALL

2007-03-27 Thread Marko Kreen

On 3/27/07, Tom Lane <[EMAIL PROTECTED]> wrote:

Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Marko Kreen wrote:
>> When pooling connections where prepared statements are in use,
>> it is hard to give new client totally clean connection as
>> there may be allocated statements that give errors when
>> new client starts preparing statements again.

> Huh, didn't we have a RESET SESSION command to do just that?  What about
> cursors, for example?

We don't actually *have* one, but I believe it was agreed that that is
the right API to provide.  If a pooler has to remember to clear prepared
statements, GUCs, cursors, and who knows what else, it'll be perpetually
broken because there'll be something it omits.


Well.  Please apply the following patch then:

http://archives.postgresql.org/pgsql-patches/2004-12/msg00228.php

Even if it is incomplete, the missing parts can be added later.
I see no reason to keep it from users.


There might be a use-case for DEALLOCATE ALL, but needs of poolers
aren't it.  I'd be inclined to vote against this unless someone can
point to a better use-case.


Ok, a non-pooler argument: prepared statements are supposed to be
garbage-collected by the user.  That's it.  There should be a friendly
way to get a clean state without the user having to specifically keep
track of what's allocated, or to do messy exception handling around
PREPARE/DEALLOCATE.  (PREPARE OR REPLACE and DEALLOCATE IF EXISTS
would also lessen the pain.)
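The bookkeeping burden being argued about can be made concrete. Without DEALLOCATE ALL, a pooler (or user) must track every statement name and emit one DEALLOCATE per name before handing the connection to a new client. This sketch only builds the cleanup SQL as a string; all names and types are invented for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_STMTS 8

/* Per-connection tracker a pooler would have to maintain. */
typedef struct
{
    char names[MAX_STMTS][32];
    int  count;
} StmtTracker;

static void
track_prepare(StmtTracker *t, const char *name)
{
    if (t->count < MAX_STMTS)
        snprintf(t->names[t->count++], sizeof(t->names[0]), "%s", name);
}

/* Emit the cleanup SQL into buf: one DEALLOCATE per tracked name.
 * With DEALLOCATE ALL, all of this tracking would collapse to a
 * single fixed command. */
static void
cleanup_sql(const StmtTracker *t, char *buf, size_t buflen)
{
    buf[0] = '\0';
    for (int i = 0; i < t->count; i++)
    {
        strncat(buf, "DEALLOCATE ", buflen - strlen(buf) - 1);
        strncat(buf, t->names[i], buflen - strlen(buf) - 1);
        strncat(buf, "; ", buflen - strlen(buf) - 1);
    }
}
```

And as Tom Lane notes, statements are only one kind of session state; GUCs and cursors would each need their own tracker, which is the case for a single RESET SESSION.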

Then a pooler argument: there is one pooler where RandomJoe executes
queries, and another for a specific app where the subset of SQL it uses
is known.  I want to RESET only specific things in the app case.  So it
would be good if RESETs for specific areas were available.

Also, the objections to Hans' patch give the impression that different
pooling solutions want a different RESET EVERYTHING, so again, it would
be good if RESETs for different areas were available and the
all-encompassing RESET EVERYTHING just tied all the specific RESETs
together.

--
marko

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] [pgsql-patches] O_DIRECT support for Windows

2007-03-27 Thread Bruce Momjian

Magnus, where are we on this?

---

Magnus Hagander wrote:
> We're ok with the alignment issues provided there is code added to reject
> O_DIRECT if the sector size is too large.
> 
> We also said we need to see some performance numbers on the effect of
> the patch before it goes in.
> 
> //Magnus
> 
> 
> Bruce Momjian wrote:
> > So, do we want this patch?  Are we OK on WIN32 alignment issues?
> > 
> > ---
> > 
> > ITAGAKI Takahiro wrote:
> >> The attached is a patch to define O_DIRECT by ourselves on Windows,
> >> and to map O_DIRECT to FILE_FLAG_NO_BUFFERING.
> >>
> >> There will be consistency in our support between Windows and other OSes
> >> that have O_DIRECT. Also, there is the following comment that says, as I
> >> read it, we should do so.
> >> | handle other flags? (eg FILE_FLAG_NO_BUFFERING/FILE_FLAG_WRITE_THROUGH)
> >>
> >> Is this worth doing? Do we need more performance reports for the change?
> >>
> >> Regards,
> >> ---
> >> ITAGAKI Takahiro
> >> NTT Open Source Software Center
> > 
> > [ Attachment, skipping... ]
> > 
> >> ---(end of broadcast)---
> >> TIP 2: Don't 'kill -9' the postmaster
> > 
> 
> 
> ---(end of broadcast)---
> TIP 7: You can help support the PostgreSQL project by donating at
> 
> http://www.postgresql.org/about/donate

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] Numeric patch to add special-case representations for < 8 bytes

2007-03-27 Thread Heikki Linnakangas

Gregory Stark wrote:
> The main design issue is that I was proposing to make it impossible to access
> the internals of the numeric storage using macros. Currently some of the data
> (the sign, dscale, and weight) is visible without having to call any special
> numeric functions. I was proposing to use representations where those might
> not be as easily accessible.
>
> However, I don't think we have any consumers of that data outside of numeric.c
> anyway. Is there anything using that functionality currently? Do we mind
> losing it?


The data would still be available through a function, right? If there's 
no-one accessing that information currently, there's no 
backwards-compatibility issue. I think this is a non-issue.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PATCHES] DEALLOCATE ALL

2007-03-27 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Marko Kreen wrote:
>> When pooling connections where prepared statements are in use,
>> it is hard to give a new client a totally clean connection, as
>> there may be allocated statements that give errors when
>> the new client starts preparing statements again.

> Huh, didn't we have a RESET SESSION command to do just that?  What about
> cursors, for example?

We don't actually *have* one, but I believe it was agreed that that is
the right API to provide.  If a pooler has to remember to clear prepared
statements, GUCs, cursors, and who knows what else, it'll be perpetually
broken because there'll be something it omits.

There might be a use-case for DEALLOCATE ALL, but needs of poolers
aren't it.  I'd be inclined to vote against this unless someone can
point to a better use-case.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PATCHES] Recalculating OldestXmin in a long-running vacuum

2007-03-27 Thread Bruce Momjian
Heikki Linnakangas wrote:
> Tom Lane wrote:
> > Heikki Linnakangas <[EMAIL PROTECTED]> writes:
> >> Maybe we should keep this issue open until we resolve the vacuum WAL 
> >> flush issue? I can then rerun the same tests to see if this patch is a 
> >> win after that.
> > 
> > Sounds like a plan, if you are willing to do that.
> 
> Sure, just rerunning the same tests isn't much work.
> 
> Bruce Momjian wrote:
> >  Would you like to add a TODO item?
> 
> I don't know how we track things like this. Maybe add to the end of the 
> patch queue, with link to this discussion so that we remember that it 
> needs more testing?

OK, I added this email to the patch queue.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] DEALLOCATE ALL

2007-03-27 Thread Alvaro Herrera
Marko Kreen wrote:
> When pooling connections where prepared statements are in use,
> it is hard to give a new client a totally clean connection, as
> there may be allocated statements that give errors when
> the new client starts preparing statements again.

Huh, didn't we have a RESET SESSION command to do just that?  What about
cursors, for example?

> I did it in a slightly hacky way - if DeallocateStmt->name is
> NULL, it signifies a DEALLOCATE ALL command.  All the code
> that looks into DeallocateStmt seems to survive the situation,
> so it should be no problem.  If a new node or an additional
> field in the node is still needed, I can rework the patch.

Wouldn't it be easier to just add a bool to DeallocateStmt?

-- 
Alvaro Herrerahttp://www.PlanetPostgreSQL.org
"Si un desconocido se acerca y te regala un CD de Ubuntu ...
 Eso es ...  Eau de Tux"

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] [pgsql-patches] pg_get_domaindef

2007-03-27 Thread Bruce Momjian

I have removed this TODO item:

* %Add pg_get_acldef(), pg_get_typedefault(), pg_get_attrdef(),
  pg_get_tabledef(), pg_get_domaindef(), pg_get_functiondef()

  These would be for application use, not for use by pg_dump.

It seems there is a lack of interest in adding this feature because of
maintenance concerns.

The attached patch is rejected for the same reason.  Sorry for the
confusion.


---

FAST PostgreSQL wrote:
> On Thu, 25 Jan 2007 02:25, Tom Lane wrote:
> > Andrew Dunstan <[EMAIL PROTECTED]> writes:
> > > FAST PostgreSQL wrote:
> > >> Please find attached the patch with modifications
> > >
> > > are you proposing to implement the other functions in this TODO item
> > > (pg_get_acldef(), pg_get_typedefault(), pg_get_attrdef(),
> > > pg_get_tabledef(), pg_get_functiondef() ) ?
> >
> > I haven't entirely understood the use case for any of these.  It's not
> 
> Any consensus on these functions? If we decide against having these then its 
> better to remove them from the TODO list temporarily/permanently.
> 
> Rgds,
> Arul Shaji
> 
> 
> > pg_dump, for a number of reasons: one being that pg_dump still has to
> > support older backend versions, and another being that every time we
> > let backend SnapshotNow functions get involved, we take another hit to
> > pg_dump's claim to produce a consistent MVCC snapshot.
> >
> > But my real objection is: do we really want to support duplicative code
> > in both pg_dump and the backend?  Updating pg_dump is already a major
> > PITA whenever one adds a new feature; doubling that work isn't
> > attractive.  (And it'd be double, not just a copy-and-paste, because of
> > the large difference in the operating environment.)  So I want to hear a
> > seriously convincing use-case that will justify the maintenance load we
> > are setting up for ourselves.  "Somebody might want this" is not
> > adequate.
> >
> > Perhaps a better area of work would be the often-proposed refactoring of
> > pg_dump into a library and driver program, wherein the library could
> > expose individual functions such as "fetch the SQL definition of this
> > object".  Unfortunately, that'll be a huge project with no payoff until
> > the end...
> >
> > regards, tom lane
> This is an email from Fujitsu Australia Software Technology Pty Ltd, ABN 27 
> 003 693 481. It is confidential to the ordinary user of the email address to 
> which it was addressed and may contain copyright and/or legally privileged 
> information. No one else may read, print, store, copy or forward all or any 
> of it or its attachments. If you receive this email in error, please return 
> to sender. Thank you.
> 
> If you do not wish to receive commercial email messages from Fujitsu 
> Australia Software Technology Pty Ltd, please email [EMAIL PROTECTED]
> 
> 
> ---(end of broadcast)---
> TIP 3: Have you checked our extensive FAQ?
> 
>http://www.postgresql.org/docs/faq

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] Recalculating OldestXmin in a long-running vacuum

2007-03-27 Thread Heikki Linnakangas

Tom Lane wrote:
> Heikki Linnakangas <[EMAIL PROTECTED]> writes:
>> Maybe we should keep this issue open until we resolve the vacuum WAL 
>> flush issue? I can then rerun the same tests to see if this patch is a 
>> win after that.
>
> Sounds like a plan, if you are willing to do that.

Sure, just rerunning the same tests isn't much work.

Bruce Momjian wrote:
> Would you like to add a TODO item?

I don't know how we track things like this. Maybe add to the end of the 
patch queue, with link to this discussion so that we remember that it 
needs more testing?


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PATCHES] Recalculating OldestXmin in a long-running vacuum

2007-03-27 Thread Tom Lane
Heikki Linnakangas <[EMAIL PROTECTED]> writes:
> Maybe we should keep this issue open until we resolve the vacuum WAL 
> flush issue? I can then rerun the same tests to see if this patch is a 
> win after that.

Sounds like a plan, if you are willing to do that.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] Recalculating OldestXmin in a long-running vacuum

2007-03-27 Thread Bruce Momjian
Heikki Linnakangas wrote:
> Tom Lane wrote:
> > Heikki Linnakangas <[EMAIL PROTECTED]> writes:
> >> Bruce Momjian wrote:
> >>> So are you stopping work on the patch?  I assume so.
> > 
> >> Yes, at least for now. I can't believe the patch actually hurts 
> >> performance, but I'm not going to spend time investigating it.
> > 
> > Are we withdrawing the patch from consideration for 8.3 then?
> > I had assumed it was still a live candidate, but if it seems to
> > lose in pgbench maybe we had better set it aside.
> 
> I haven't tried pgbench, the tests I ran were with DBT-2.
> 
> Just to summarize again: the patch did help to keep the stock table 
> smaller, but the response times were higher with the patch.
> 
> Maybe we should keep this issue open until we resolve the vacuum WAL 
> flush issue? I can then rerun the same tests to see if this patch is a 
> win after that.

Would you like to add a TODO item?

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] Recalculating OldestXmin in a long-running vacuum

2007-03-27 Thread Heikki Linnakangas

Tom Lane wrote:
> Heikki Linnakangas <[EMAIL PROTECTED]> writes:
>> Bruce Momjian wrote:
>>> So are you stopping work on the patch?  I assume so.
>
>> Yes, at least for now. I can't believe the patch actually hurts 
>> performance, but I'm not going to spend time investigating it.
>
> Are we withdrawing the patch from consideration for 8.3 then?
> I had assumed it was still a live candidate, but if it seems to
> lose in pgbench maybe we had better set it aside.

I haven't tried pgbench, the tests I ran were with DBT-2.

Just to summarize again: the patch did help to keep the stock table 
smaller, but the response times were higher with the patch.

Maybe we should keep this issue open until we resolve the vacuum WAL 
flush issue? I can then rerun the same tests to see if this patch is a 
win after that.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PATCHES] Recalculating OldestXmin in a long-running vacuum

2007-03-27 Thread Tom Lane
Heikki Linnakangas <[EMAIL PROTECTED]> writes:
> Bruce Momjian wrote:
>> So are you stopping work on the patch?  I assume so.

> Yes, at least for now. I can't believe the patch actually hurts 
> performance, but I'm not going to spend time investigating it.

Are we withdrawing the patch from consideration for 8.3 then?
I had assumed it was still a live candidate, but if it seems to
lose in pgbench maybe we had better set it aside.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PATCHES] Recalculating OldestXmin in a long-running vacuum

2007-03-27 Thread Heikki Linnakangas

Gregory Stark wrote:
> "Heikki Linnakangas" <[EMAIL PROTECTED]> writes:
>> Yes, at least for now. I can't believe the patch actually hurts performance,
>> but I'm not going to spend time investigating it.
>
> Isn't this exactly what you would expect? It will clean up more tuples so
> it'll dirty more pages. Especially given the pessimal way vacuum's dirty
> buffers are handled until Simon's patch to fix that goes in.

Hmm. Yeah, maybe it'll get better when we get that fixed..

> The benefit of the patch that we would expect to see is that you won't need to
> run VACUUM as often. In the long term we would expect the stock table to grow
> less too but I doubt these tests were long enough to demonstrate that effect.

The size did reach a steady state about half-way through the test, see 
the logs here:

patched
http://community.enterprisedb.com/oldestxmin/92/server/relsizes.log

unpatched
http://community.enterprisedb.com/oldestxmin/93/server/relsizes.log

The test was a success in that sense, the patch did reduce the steady 
state size of the stock table.

Maybe we would see a gain in transactions per minute or response times 
if we traded off the smaller table size to run vacuum slightly less 
frequently.. But as I said I don't want to spend time running more tests 
for what seems like a small benefit.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PATCHES] Recalculating OldestXmin in a long-running vacuum

2007-03-27 Thread Gregory Stark

"Heikki Linnakangas" <[EMAIL PROTECTED]> writes:

>>>>> But what surprises me is that response times went up a bit with the patch. I
>>>>> don't know why.
>>>> Maybe because of increased contention of ProcArrayLock?  (I assume you
>>>> are using that, although I haven't seen the patch)
>>> I am, but I doubt that's it. The response times are dominated by I/O, so any
>>> increase in lock contention would hardly show up. And the patch is only
>>> adding one GetOldestXmin call every 1000 scanned pages, which is nothing
>>> compared to the thousands of GetSnapshot calls the normal transactions are
>>> issuing per minute.
>>>
>>> The patch must have changed the I/O pattern in some subtle way.
>>
>> So are you stopping work on the patch?  I assume so.
>
> Yes, at least for now. I can't believe the patch actually hurts performance,
> but I'm not going to spend time investigating it.

Isn't this exactly what you would expect? It will clean up more tuples so
it'll dirty more pages. Especially given the pessimal way vacuum's dirty
buffers are handled until Simon's patch to fix that goes in.

The benefit of the patch that we would expect to see is that you won't need to
run VACUUM as often. In the long term we would expect the stock table to grow
less too but I doubt these tests were long enough to demonstrate that effect.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[PATCHES] Concurrent psql v4 [WIP]

2007-03-27 Thread stark

This is just an update against CVS.

The interface is still as described at this URL:

http://community.enterprisedb.com/concurrent/index.html

It's true there isn't any documentation in the patch but I'm not sure we're
actually settled on the interface.

I think our experience here is that the \cwait command was a mistake. You
never need it in interactive use, and you need to use it *religiously* when
writing regression tests. If you miss one your regression test will fail
randomly depending on the timing. Instead we should just have a psql setting
that makes it automatically wait before every connection switch. That ensures
you catch any bugs where a command fails to block when it should, and also
ensures you catch the output of an unblocked command as soon as you switch
connections to it.

The other issue is the cursor behaviour. Currently it looks like below and it
starts a new pager for each block. I'm not sure this is really all that bad
myself but I have a feeling others will disagree. I'm not exactly sure what
the right behaviour is though, if this is an asynchronous command issued with
\cnowait then it would strange to suddenly start executing synchronously once
the first bit of data arrives.

postgres[2]=# select * from generate_series(1,4);
 generate_series 
-----------------
               1
               2
               3
               4
(4 rows)

postgres[2]=# \set FETCH_COUNT 1

postgres[2]=# select * from generate_series(1,10);
 generate_series 
-----------------
               1
(1 row)

 generate_series 
-----------------
               2
(1 row)

 generate_series 
-----------------
               3
(1 row)

 generate_series 
-----------------
               4
(1 row)

 generate_series 
-----------------
(0 rows)




concurrent-psql-v4.patch.gz
Description: Binary data


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] WIP patch - INSERT-able log statements

2007-03-27 Thread Bruce Momjian

What is the status of this patch?

---

FAST PostgreSQL wrote:
> Hi,
> 
> I've been working on the following TODO item and attached is an initial 
> patch. (It is only partial and not yet completely functional)
> 
> "Allow server log information to be output as INSERT statements 
> This would allow server log information to be easily loaded into a database 
> for analysis. "
> 
> I want to confirm if what I have done so far is what the community is looking 
> for, and also want to clear some doubts.
> 
> What is done so far
> ---
> 
> Two postgresql.conf variables
> 
> #log_output_type = 'text' #Valid values are 'SQL' or 'text'
> #log_output_table_name = 'auditlogs'
> 
> These control how to output the log. Defaults to 'text', which is the status 
> quo. If it is set to 'SQL', the log will be output as INSERT commands.
> 
> The second variable is of interest. We need to specify a table in the insert 
> command. My preferred option is for the user to give one and he can create it 
> if and when he wants to. The alternative is that we decide the table name and 
> make initdb create one.
> 
> The proposed log output structure
> --
> INSERT INTO user_defined_table values( timestamp_with_milliseconds,  
> timestamp, username,  databasename, sessionid,  host_and_port, host, proc_id, 
> command_tag,  session_start, transaction_id,  error_severity,  
> SQL_State_Code, error_message);
> 
> All these columns will follow the current rules of log output, i.e., unless 
> explicitly requested by the user, these columns will have NULL. The user can 
> still give log_line_prefix in any order he wants, and the logger will output 
> it in the appropriate columns. The code has been modified to do 
> this.
> 
> Issues/Questions are:
> - How about the statement-duration log?  This will come to the logger as a 
> single string, and after the query execution. In the existing log we can make 
> sense of the duration log by matching it with the statement above it or with 
> the statement that gets printed beside it (again as 
> a single string). But when this is loaded into a table it doesn't make much 
> sense unless everything is in a single row. (My preferred option is to add 
> another column, 'duration', to the table structure defined above. But I 
> haven't figured out how to achieve this, because the 
> statement is printed first and then the duration as another log entry.)
> 
> - If the SQL log output is to the syslog, then it becomes pretty awkward and 
> possibly useless because our current syslog writer function breaks up the log 
> into several lines to accommodate various platforms. Syslog also then adds 
> other information before outputting it, which 
> cannot be loaded onto a table. The preferred option is to educate the user 
> through documentation that SQL type log output is best served when it is 
> output to stderr and redirected to a file? Same goes with other aspects such 
> as verbose and various other statistics log. 
> 
> - There are also other minor issues such as, the actual query currently gets 
> output in log as 'Statement: CREATE '. For sql type log we may not 
> need the 'Statement:' part, as it will be in a column? Do we remove this in 
> both text and SQL outputs?
> 
> Rgds,
> Arul Shaji
> 

[ Attachment, skipping... ]

> 
> ---(end of broadcast)---
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>choose an index scan if your joining column's datatypes do not
>match

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] Recalculating OldestXmin in a long-running vacuum

2007-03-27 Thread Heikki Linnakangas

Bruce Momjian wrote:
> Heikki Linnakangas wrote:
>> Alvaro Herrera wrote:
>>> Heikki Linnakangas wrote:
>>>> I ran two 24h test runs with DBT-2, one with the patch and one without. 
>>>> To get comparable, predictable results, I turned autovacuum off and ran 
>>>> a manual vacuum in a loop on the stock-table alone.
>>>>
>>>> As expected, the steady-state of the stock table is smaller with the 
>>>> patch. But only by ~2%, that's slightly less than I expected.
>>>>
>>>> But what surprises me is that response times went up a bit with the patch. I 
>>>> don't know why.
>>> Maybe because of increased contention of ProcArrayLock?  (I assume you
>>> are using that, although I haven't seen the patch)
>> I am, but I doubt that's it. The response times are dominated by I/O, so 
>> any increase in lock contention would hardly show up. And the patch is 
>> only adding one GetOldestXmin call every 1000 scanned pages, which is 
>> nothing compared to the thousands of GetSnapshot calls the normal 
>> transactions are issuing per minute.
>>
>> The patch must have changed the I/O pattern in some subtle way.
>
> So are you stopping work on the patch?  I assume so.

Yes, at least for now. I can't believe the patch actually hurts 
performance, but I'm not going to spend time investigating it.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [PATCHES] Fast CLUSTER

2007-03-27 Thread Simon Riggs
On Tue, 2007-03-27 at 10:28 -0400, Bruce Momjian wrote:
> Simon Riggs wrote:
> > On Thu, 2007-02-22 at 22:49 +, Simon Riggs wrote:
> > Could we add this to the unapplied patches queue? It seems to have been
> > missed off the list. Thanks.
> 
> Done.

Many thanks

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PATCHES] Recalculating OldestXmin in a long-running vacuum

2007-03-27 Thread Bruce Momjian
Heikki Linnakangas wrote:
> Alvaro Herrera wrote:
> > Heikki Linnakangas wrote:
> > 
> >> I ran two 24h test runs with DBT-2, one with the patch and one without. 
> >> To get comparable, predictable results, I turned autovacuum off and ran 
> >> a manual vacuum in a loop on the stock-table alone.
> >>
> >> As expected, the steady-state of the stock table is smaller with the 
> >> patch. But only by ~2%, that's slightly less than I expected.
> >>
> >> But what surprises me is that response times went up a bit with the patch. I 
> >> don't know why.
> > 
> > Maybe because of increased contention of ProcArrayLock?  (I assume you
> > are using that, although I haven't seen the patch)
> 
> I am, but I doubt that's it. The response times are dominated by I/O, so 
> any increase in lock contention would hardly show up. And the patch is 
> only adding one GetOldestXmin call every 1000 scanned pages, which is 
> nothing compared to the thousands of GetSnapshot calls the normal 
> transactions are issuing per minute.
> 
> The patch must have changed the I/O pattern in some subtle way.

So are you stopping work on the patch?  I assume so.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] Warning about LISTEN names

2007-03-27 Thread Bruce Momjian
Tom Lane wrote:
> "Greg Sabino Mullane" <[EMAIL PROTECTED]> writes:
> > I'll save the full rant for my blog :), but wanted to submit this 
> > documentation 
> > patch for this listen gotcha that's been bugging me for a while. I'd like 
> > to see LISTEN and NOTIFY changed to use a simple text string, but until 
> > then, 
> > I think we should probably warn about the chopping off of the left-hand 
> > part.
> 
> Let's change it to a plain ColId, so you get a syntax error if you try
> that.

OK, so should I make this change for 8.3?

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PATCHES] Unbroke srcdir!=builddir compilation

2007-03-27 Thread Bruce Momjian

Applied.

---

Marko Kreen wrote:
> Let's be decent.
> 
> -- 
> marko

[ Attachment, skipping... ]

> 
> ---(end of broadcast)---
> TIP 6: explain analyze is your friend

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[PATCHES] DEALLOCATE ALL

2007-03-27 Thread Marko Kreen

When pooling connections where prepared statements are in use,
it is hard to give a new client a totally clean connection, as
there may be allocated statements that give errors when
the new client starts preparing statements again.

Currently PgPool solves the situation by parsing all queries
and keeping a list of prepared statements.  This may not be a big
problem for it, as it parses the queries anyway, but for a simple
pooler like PgBouncer (see pgfoundry) that does not look inside
libpq packets it is a huge problem.

Attached is a patch that allows the keyword ALL to be used with
DEALLOCATE and then frees all prepared queries.  I think it is
useful for most pooling situations, not only PgBouncer.
Also, it is similar in spirit to RESET ALL.

I did it in a slightly hacky way - if DeallocateStmt->name is
NULL, it signifies a DEALLOCATE ALL command.  All the code
that looks into DeallocateStmt seems to survive the situation,
so it should be no problem.  If a new node or an additional
field in the node is still needed, I can rework the patch.

--
marko


dealloc_all.diff
Description: Binary data

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] Fast CLUSTER

2007-03-27 Thread Bruce Momjian
Simon Riggs wrote:
> On Thu, 2007-02-22 at 22:49 +, Simon Riggs wrote:
> > On Tue, 2007-02-20 at 14:38 -0300, Alvaro Herrera wrote:
> > 
> > > Cool.  I noticed that the SGML seems broken here:
> > 
> > Corrected. 
> > 
> > > You need to close the  and  opened in the COPY mention.
> > > 
> > > > + 
> > > > + static void
> > > > + heap_sync_relation(Relation rel)
> > > > + {
> > > > +   if (!rel->rd_istemp)
> > > 
> > > No comment in this function?
> > 
> > I've added more comments as you suggest.
> > 
> > Thanks for the review.
> 
> Could we add this to the unapplied patches queue? It seems to have been
> missed off the list. Thanks.

Done.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PATCHES] Fast CLUSTER

2007-03-27 Thread Bruce Momjian

Your patch has been added to the PostgreSQL unapplied patches list at:

http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---


Simon Riggs wrote:
> On Tue, 2007-02-20 at 14:38 -0300, Alvaro Herrera wrote:
> 
> > Cool.  I noticed that the SGML seems broken here:
> 
> Corrected. 
> 
> > You need to close the  and  opened in the COPY mention.
> > 
> > > + 
> > > + static void
> > > + heap_sync_relation(Relation rel)
> > > + {
> > > + if (!rel->rd_istemp)
> > 
> > No comment in this function?
> 
> I've added more comments as you suggest.
> 
> Thanks for the review.
> 
> -- 
>   Simon Riggs 
>   EnterpriseDB   http://www.enterprisedb.com
> 

[ Attachment, skipping... ]

> 
> ---(end of broadcast)---
> TIP 5: don't forget to increase your free space map settings

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] Improvement of procArray.xmin for VACUUM

2007-03-27 Thread Bruce Momjian
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > Tom Lane wrote:
> >> Bruce Momjian <[EMAIL PROTECTED]> writes:
> >>> Well, my secondary addition was to start MyProc->xmin with the current
> >>> xid counter, rather than your own xid.
> >> 
> >> Don't tell me you seriously believe that would work.
> 
> > I do.  Please tell me why it will not.
> 
> You can't have GlobalXmin greater than your own xid, else log space
> (particularly pg_subtrans) may be recycled too soon.

OK, so maybe GlobalXmin would have to be split into two new variables
that control log space and dead-row detection separately.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PATCHES] Improvement of procArray.xmin for VACUUM

2007-03-27 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> Bruce Momjian <[EMAIL PROTECTED]> writes:
>>> Well, my secondary addition was to start MyProc->xmin with the current
>>> xid counter, rather than your own xid.
>> 
>> Don't tell me you seriously believe that would work.

> I do.  Please tell me why it will not.

You can't have GlobalXmin greater than your own xid, else log space
(particularly pg_subtrans) may be recycled too soon.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] patch adding new regexp functions

2007-03-27 Thread Bruce Momjian

Your patch has been added to the PostgreSQL unapplied patches list at:

http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---


Jeremy Drake wrote:
> On Thu, 22 Mar 2007, Tom Lane wrote:
> 
> > I'd vote for making this new code look like the rest of it, to wit
> > hardwire the values.
> 
> Attached please find a patch which does this.
> 
> 
> 
> -- 
> Although written many years ago, Lady Chatterley's Lover has just been
> reissued by the Grove Press, and this pictorial account of the
> day-to-day life of an English gamekeeper is full of considerable
> interest to outdoor minded readers, as it contains many passages on
> pheasant-raising, the apprehending of poachers, ways to control vermin,
> and other chores and duties of the professional gamekeeper.
> Unfortunately, one is obliged to wade through many pages of extraneous
> material in order to discover and savour those sidelights on the
> management of a midland shooting estate, and in this reviewer's opinion
> the book cannot take the place of J. R. Miller's "Practical
> Gamekeeping."
>   -- Ed Zern, "Field and Stream" (Nov. 1959)
Content-Description: 

[ Attachment, skipping... ]

> 

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com




Re: [PATCHES] patch adding new regexp functions

2007-03-27 Thread Bruce Momjian
Jeremy Drake wrote:
> On Mon, 26 Mar 2007, Bruce Momjian wrote:
> 
> > Tom Lane wrote:
> > > Or were you speaking to the question of whether to adjust the regexp
> > > patch to conform more nearly to the coding practices found elsewhere?
> > > I agree with that, but I thought there was already a submitted patch
> > > for it.
> >
> > Yes, regex patch adjustment, and I have not seen a patch which makes
> > such adjustments.
> 
> http://archives.postgresql.org/pgsql-patches/2007-03/msg00285.php

Ah, yes, I still have that in my mailbox, but didn't think it was
related because there was discussion _after_ the patch was posted.  Will
add the patch to the queue now.  Thanks.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com




Re: [PATCHES] Improvement of procArray.xmin for VACUUM

2007-03-27 Thread Bruce Momjian
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > Tom Lane wrote:
> >> Uh, no, that's not very clear.  A long-running transaction will be a
> >> VACUUM bottleneck because of its own XID, never mind its xmin.
> 
> > Well, my secondary addition was to start MyProc->xmin with the current
> > xid counter, rather than your own xid.
> 
> Don't tell me you seriously believe that would work.

I do.  Please tell me why it will not.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com




[PATCHES] Patch for circular buffer in tuplestore to optimize merge joins (v1)

2007-03-27 Thread stark

This patch implements a circular buffer in tuplestore which drops old tuples
as they're no longer needed. It uses this for merge joins to avoid having to
spill the tuplestore if no single value exceeds work_mem. It also is what's
needed for both recursive query support and OLAP window functions (hence why
it implements the more complex circular buffer rather than just moving the
single tuple up to the head of the buffer).
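
As a rough illustration of the idea (a toy Python model with invented names — the real logic lives in tuplestore.c): a merge join only ever rescans back to its last mark, so any tuple before the mark can be discarded instead of spilled:

```python
from collections import deque

class TrimmingTuplestore:
    """Toy model of a tuplestore that drops tuples older than the
    earliest position a caller can still rescan from (the 'mark')."""
    def __init__(self):
        self.buf = deque()   # live window of tuples
        self.base = 0        # absolute index of buf[0]
        self.mark = 0        # earliest position we may rescan from
        self.pos = 0         # current read position

    def put(self, tup):
        self.buf.append(tup)

    def get(self):
        tup = self.buf[self.pos - self.base]
        self.pos += 1
        return tup

    def set_mark(self):
        # The caller promises never to restore earlier than this point,
        # so everything before it can be trimmed from the window.
        self.mark = self.pos
        while self.base < self.mark:
            self.buf.popleft()
            self.base += 1

    def restore_mark(self):
        self.pos = self.mark

store = TrimmingTuplestore()
for t in ["a", "b", "c", "d"]:
    store.put(t)
store.get(); store.get()   # read "a", "b"
store.set_mark()           # "a" and "b" are dropped from the buffer
store.restore_mark()       # rescan starts at "c"
```

In the merge-join case the mark advances each time the outer side moves to a new join key, which is why nothing spills as long as no single key's group of inner tuples exceeds work_mem.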

This was mostly already done by Simon; I just finished the logic in tuplesort.c.

This is not quite polished yet, so I suppose it's still a WIP, but it's
certainly ready to be reviewed. All that remains is polishing. If there's
anything in there people object to, I would like to know now.



mergejoin-circbuffer-v1.patch.gz
Description: Binary data


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com



Re: [PATCHES] Numeric patch to add special-case representations for < 8 bytes

2007-03-27 Thread Gregory Stark
"Bruce Momjian" <[EMAIL PROTECTED]> writes:

> Greg, do you want to submit a patch for this or add a TODO item for this?

Well, I never got any approval or rejection in principle, so I don't know
whether such a patch would be accepted even if it were implemented reasonably.
If it has the consensus needed to be a TODO item, I would go ahead and do it.

The main design issue is that I was proposing to make it impossible to access
the internals of the numeric storage using macros. Currently some of the data
(the sign, dscale, and weight) is visible without having to call any special
numeric functions. I was proposing to use representations where those might
not be as easily accessible. 

However, I don't think we have any consumers of that data outside of numeric.c
anyway. Is there anything using that functionality currently? Do we mind
losing it?
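
As a toy sketch of the general idea (not the proposed on-disk format — the field widths and flag bit here are invented): small values get a short encoding whose format is flagged in the first byte, while everything else falls back to a wider header carrying sign, display scale and weight explicitly:

```python
import struct

# Illustrative only: small non-negative integers (dscale 0) fit in a
# 2-byte "short" form whose high bit flags the format; anything else
# uses a "long" form with an explicit header.  Widths are invented.

def encode(value, dscale=0):
    if dscale == 0 and 0 <= value < 0x8000:
        # Short form: flag bit + 15-bit value (integers only).
        return struct.pack(">H", 0x8000 | value)
    # Long form: flag byte (high bit clear) + sign + dscale + spare
    # byte + 8-byte scaled payload.
    sign = 0 if value >= 0 else 1
    return struct.pack(">BBBBq", 0, sign, dscale, 0,
                       abs(int(round(value * 10**dscale))))

def decode(data):
    if data[0] & 0x80:
        return struct.unpack(">H", data)[0] & 0x7FFF
    _, sign, dscale, _, payload = struct.unpack(">BBBBq", data)
    v = payload / 10**dscale if dscale else payload
    return -v if sign else v
```

The space saving comes entirely from not storing the header fields when the format itself implies them, which is also why external code could no longer read sign/dscale/weight through simple macros.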

> ---
>
> Gregory Stark wrote:
>> 
>> I've uploaded a quick hack to store numerics in < 8 bytes when possible. 
>> 
>>  http://community.enterprisedb.com/numeric-hack-1.patch
>> 
>> This is a bit of a kludge since it doesn't actually provide any interface for
>> external clients of the numeric module to parse the resulting data. Ie, the
>> macros in numeric.h will return garbage.
>> 
>> But I'm not entirely convinced that matters. It's not like those macros were
>> really useful to any other modules anyways since there was no way to extract
>> the actual digits.
>> 
>> The patch is also slightly unsatisfactory because as I discovered numbers 
>> like
>> 1.1 are stored as two digits currently. But it does work and it does save a
>> substantial amount of space for integers.
>> 
>> postgres=# select n,pg_column_size(n) from n;
>>      n     | pg_column_size
>> -----------+----------------
>>          1 |              2
>>         11 |              2
>>        101 |              2
>>       1001 |              3
>>      10001 |              9
>>     100001 |              9
>>        1.1 |              9
>>       10.1 |              9
>>      100.1 |              9
>>     1000.1 |              9
>>    10000.1 |             11
>>   100000.1 |             11
>> I had hoped to get the second batch to be 3-4 bytes. But even now note how
>> much space is saved for integers <1.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com




[PATCHES] Unbroke srcdir!=builddir compilation

2007-03-27 Thread Marko Kreen

Let's be decent.

--
marko


pg13.diff
Description: Binary data



Re: [PATCHES] Fast CLUSTER

2007-03-27 Thread Simon Riggs
On Thu, 2007-02-22 at 22:49 +, Simon Riggs wrote:
> On Tue, 2007-02-20 at 14:38 -0300, Alvaro Herrera wrote:
> 
> > Cool.  I noticed that the SGML seems broken here:
> 
> Corrected. 
> 
> > You need to close the  and  opened in the COPY mention.
> > 
> > > + 
> > > + static void
> > > + heap_sync_relation(Relation rel)
> > > + {
> > > + if (!rel->rd_istemp)
> > 
> > No comment in this function?
> 
> I've added more comments as you suggest.
> 
> Thanks for the review.

Could we add this to the unapplied patches queue? It seems to have been
missed off the list. Thanks.

There is a probable conflict with Heikki's recent CLUSTER patch, so I'm
happy to re-write this patch after that has been applied. So it's OK to
stick it at the bottom of the queue.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com


