Re: [HACKERS] make async slave to wait for lsn to be replayed
Ants Aasma wrote 2017-10-26 17:29:
> On Mon, Oct 23, 2017 at 12:29 PM, Ivan Kartyshov <i.kartys...@postgrespro.ru> wrote:
>> Ants Aasma wrote 2017-09-26 13:00:
>>> Exposing this interface as WAITLSN will encode that visibility order matches LSN order. This removes any chance of fixing, for example, the visibility order of async vs. sync transactions. Just renaming it so the token is an opaque commit-visibility token that merely happens to be an LSN would still allow for progress in transaction management. For example, making PostgreSQL distributed will likely want timestamp- and/or vector-clock-based visibility rules.
>> I'm sorry, I did not understand exactly what you meant. Please explain this in more detail.
> Currently, transactions on the master become visible when the xid is removed from the proc array. This depends on the order of acquiring ProcArrayLock, which can happen in a different order from inserting the commit record into the WAL. On the standby, by contrast, transactions become visible in the same order that their commit records appear in the WAL. The difference can be quite large when transactions use different values of synchronous_commit. Fixing this is not easy, but we may want to do it someday. IIRC the CSN threads contained more discussion on this topic. If we do fix it, it seems likely that what we need to wait on is not an LSN, but some other kind of value. If the UI were something like "WAITVISIBILITY token", then we can later change the token to something other than an LSN.
> Regards, Ants Aasma

That sounds reasonable. I can offer the following version:

WAIT LSN lsn_number;
WAIT LSN lsn_number TIMEOUT delay;
WAIT LSN lsn_number INFINITE;
WAIT LSN lsn_number NOWAIT;

WAIT [token] wait_value [option];
token  - [LSN, TIME | TIMESTAMP | CSN | XID]
option - [TIMEOUT | INFINITE | NOWAIT]

I am ready to listen to your suggestions.
-- Ivan Kartyshov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
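The option semantics proposed above (TIMEOUT / INFINITE / NOWAIT) can be sketched as a small decision loop. The following is an illustrative C simulation with hypothetical names (`wait_lsn`, `simulate_replay`, `replayed_lsn`), not the patch's actual code — the real backend sleeps on a latch that the startup process sets, rather than spinning:

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { WAIT_INFINITE, WAIT_TIMEOUT, WAIT_NOWAIT } wait_option;

/* Stand-in for the last-replayed position (XLogCtl->lastReplayedEndRecPtr
 * in PostgreSQL); here it is just a plain variable for illustration. */
static uint64_t replayed_lsn = 0;

static void simulate_replay(uint64_t up_to)
{
    replayed_lsn = up_to;
}

/* Sketch of the proposed semantics: returns true iff the target LSN is
 * replayed within the wait budget.  NOWAIT answers immediately; TIMEOUT
 * gives up after timeout_ms; INFINITE waits until the LSN arrives. */
static bool wait_lsn(uint64_t target, wait_option opt, int timeout_ms)
{
    int waited_ms = 0;

    for (;;)
    {
        if (replayed_lsn >= target)
            return true;                /* LSN reached */
        if (opt == WAIT_NOWAIT)
            return false;               /* report current state, no wait */
        if (opt == WAIT_TIMEOUT && waited_ms >= timeout_ms)
            return false;               /* "LSN is not reached" */
        waited_ms++;                    /* stand-in for a 1 ms latch wait */
    }
}
```

Note that under these semantics NOWAIT is just the degenerate case of TIMEOUT with a zero budget, which is one argument for folding all three options into a single WAIT statement as proposed.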
Re: [HACKERS] WIP: long transactions on hot standby feedback replica / proof of concept
Hello. I made some bugfixes and rewrote the patch.

Simon Riggs wrote 2017-09-05 14:44:
> As Alexander says, simply skipping truncation if the standby is busy isn't a great plan. If we defer an action on standby replay, when and by whom will it be applied? What happens if the standby is shut down or crashes while an action is pending?

After a crash, the standby server will go back to the nearest checkpoint and replay all of its WAL to reach a consistent state, and only then open for read-only load. (All collected WAL will be replayed in the right order — I mean both the WAL records that truncate the table and the WAL records that make the table grow.)

-- Ivan Kartyshov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c index 9a5fde0..68decab 100644 --- a/src/backend/catalog/storage.c +++ b/src/backend/catalog/storage.c @@ -31,6 +31,9 @@ #include "storage/smgr.h" #include "utils/memutils.h" #include "utils/rel.h" +#include "storage/lmgr.h" +#include "storage/procarray.h" +#include "access/transam.h" /* * We keep a list of all relations (represented as RelFileNode values) @@ -498,50 +501,56 @@ smgr_redo(XLogReaderState *record) reln = smgropen(xlrec->rnode, InvalidBackendId); - /* - * Forcibly create relation if it doesn't exist (which suggests that - * it was dropped somewhere later in the WAL sequence). As in - * XLogReadBufferForRedo, we prefer to recreate the rel and replay the - * log as best we can until the drop is seen. - */ - smgrcreate(reln, MAIN_FORKNUM, true); - - /* - * Before we perform the truncation, update minimum recovery point to - * cover this WAL record. Once the relation is truncated, there's no - * going back. The buffer manager enforces the WAL-first rule for - * normal updates to relation files, so that the minimum recovery - * point is always updated before the corresponding change in the data - * file is flushed to disk. We have to do the same manually here. 
- * - * Doing this before the truncation means that if the truncation fails - * for some reason, you cannot start up the system even after restart, - * until you fix the underlying situation so that the truncation will - * succeed. Alternatively, we could update the minimum recovery point - * after truncation, but that would leave a small window where the - * WAL-first rule could be violated. - */ - XLogFlush(lsn); - - if ((xlrec->flags & SMGR_TRUNCATE_HEAP) != 0) + if (!ConditionalLockRelationOid(reln->smgr_rnode.node.relNode, AccessExclusiveLock)) { - smgrtruncate(reln, MAIN_FORKNUM, xlrec->blkno); + /* + * Forcibly create relation if it doesn't exist (which suggests that + * it was dropped somewhere later in the WAL sequence). As in + * XLogReadBufferForRedo, we prefer to recreate the rel and replay the + * log as best we can until the drop is seen. + */ + smgrcreate(reln, MAIN_FORKNUM, true); + + /* + * Before we perform the truncation, update minimum recovery point to + * cover this WAL record. Once the relation is truncated, there's no + * going back. The buffer manager enforces the WAL-first rule for + * normal updates to relation files, so that the minimum recovery + * point is always updated before the corresponding change in the data + * file is flushed to disk. We have to do the same manually here. + * + * Doing this before the truncation means that if the truncation fails + * for some reason, you cannot start up the system even after restart, + * until you fix the underlying situation so that the truncation will + * succeed. Alternatively, we could update the minimum recovery point + * after truncation, but that would leave a small window where the + * WAL-first rule could be violated. 
+ */ + XLogFlush(lsn); - /* Also tell xlogutils.c about it */ - XLogTruncateRelation(xlrec->rnode, MAIN_FORKNUM, xlrec->blkno); - } + if ((xlrec->flags & SMGR_TRUNCATE_HEAP) != 0) + { +smgrtruncate(reln, MAIN_FORKNUM, xlrec->blkno); + +/* Also tell xlogutils.c about it */ +XLogTruncateRelation(xlrec->rnode, MAIN_FORKNUM, xlrec->blkno); + } - /* Truncate FSM and VM too */ - rel = CreateFakeRelcacheEntry(xlrec->rnode); + /* Truncate FSM and VM too */ + rel = CreateFakeRelcacheEntry(xlrec->rnode); - if ((xlrec->flags & SMGR_TRUNCATE_FSM) != 0 && - smgrexists(reln, FSM_FORKNUM)) - FreeSpaceMapTruncateRel(rel, xlrec->blkno); - if ((xlrec->flags & SMGR_TRUNCATE_VM) != 0 && - smgrexists(reln, VISIBILITYMAP_FORKNUM)) - visibilitymap_truncate(rel, xlrec->blkno); + if ((xlrec->flags & SMGR_TRUNCATE_FSM) != 0 && +smgrexists(reln, FSM_FORKNUM)) +FreeSpaceMapTruncateRel(rel, xlrec->blkno); + if ((xlrec->flags & SMGR_TRUNCATE_VM) != 0 &&
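The trade-off being debated above — what to do with a truncation that replay cannot apply immediately — comes down to a conditional-lock gate. A minimal sketch of that gate follows, with hypothetical names (`relation_busy`, `try_access_exclusive_lock`, `redo_truncate`); the real patch works in terms of ConditionalLockRelationOid() and smgrtruncate(), and still has to answer Simon's deferral questions:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a conflicting standby query; in PostgreSQL
 * the conflict is a lock that ConditionalLockRelationOid() fails on. */
static bool relation_busy = false;

static bool try_access_exclusive_lock(void)
{
    return !relation_busy;      /* conditional: never blocks replay */
}

/* Sketch of the gating idea: attempt the lock without blocking the redo
 * loop, and only perform the truncation when it is safe to do so. */
static bool redo_truncate(uint32_t *nblocks, uint32_t new_size)
{
    if (!try_access_exclusive_lock())
        return false;           /* conflicting reader: skip/defer for now */

    *nblocks = new_size;        /* the smgrtruncate() work itself */
    return true;
}
```

The open design question raised in the thread is exactly the false branch: a skipped truncation must eventually be applied (or made unnecessary), and crash recovery must replay the deferred WAL in order, as described above.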
Re: [HACKERS] make async slave to wait for lsn to be replayed
Alexander Korotkov писал 2017-10-23 13:19: Despite code cleanup, you still have some random empty lines removals in your patch. I reconfigured my IDE to avoid this in the future. -- Ivan Kartyshov Postgres Professional: http://www.postgrespro.com The Russian Postgres Companydiff --git a/doc/src/sgml/ref/allfiles.sgml b/doc/src/sgml/ref/allfiles.sgml index 01acc2e..6792eb0 100644 --- a/doc/src/sgml/ref/allfiles.sgml +++ b/doc/src/sgml/ref/allfiles.sgml @@ -181,6 +181,7 @@ Complete list of usable sgml source files in this directory. + diff --git a/doc/src/sgml/ref/waitlsn.sgml b/doc/src/sgml/ref/waitlsn.sgml new file mode 100644 index 000..6f389ca --- /dev/null +++ b/doc/src/sgml/ref/waitlsn.sgml @@ -0,0 +1,144 @@ + + + + + WAITLSN + + + + WAITLSN + 7 + SQL - Language Statements + + + + WAITLSN + wait for the target LSN to be replayed + + + + +WAITLSN 'LSN' [ INFINITELY ] +WAITLSN 'LSN' TIMEOUT wait_time +WAITLSN 'LSN' NOWAIT + + + + + Description + + + interprocess communication mechanism to wait for the target log sequence + number (LSN) on standby in + databases with master-standby asynchronous replication. When run with the + LSN option, the WAITLSN command + waits for the specified LSN to be replayed. By default, wait + time is unlimited. Waiting can be interrupted using Ctrl+C, or + by shutting down the postgres server. You can also limit the wait + time using the TIMEOUT option, or check the target LSN + status immediately using the NOWAIT option. + + + + + + Parameters + + + +LSN + + + Specify the target log sequence number to wait for. + + + + + INFINITELY + + + Wait until the target LSN is replayed on standby. + This is an optional parameter reinforcing the default behavior. + + + + + + + + TIMEOUT wait_time + + + Limit the time to wait for the LSN to be replayed. + The specified wait_time must be an integer + and is measured in milliseconds. + + + + + + NOWAIT + + + Report whether the target LSN has been replayed already, + without any waiting. 
+ + + + + + + Examples + + + Run WAITLSN from psql, + limiting wait time to 1 milliseconds: + + +WAITLSN '0/3F07A6B1' TIMEOUT 1; +NOTICE: LSN is not reached. Try to increase wait time. +LSN reached +- + f +(1 row) + + + + + Wait until the specified LSN is replayed: + +WAITLSN '0/3F07A611'; +LSN reached +- + t +(1 row) + + + + + Limit LSN wait time to 50 milliseconds, and then cancel the command: + +WAITLSN '0/3F0FF791' TIMEOUT 50; +^CCancel request sent +NOTICE: LSN is not reached. Try to increase wait time. +ERROR: canceling statement due to user request + LSN reached +- + f +(1 row) + + + + + + Compatibility + + + There is no WAITLSN statement in the SQL + standard. + + + diff --git a/doc/src/sgml/reference.sgml b/doc/src/sgml/reference.sgml index 9000b3a..0c5951a 100644 --- a/doc/src/sgml/reference.sgml +++ b/doc/src/sgml/reference.sgml @@ -209,6 +209,7 @@ + diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index dd028a1..117cc9b 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -40,6 +40,7 @@ #include "catalog/pg_control.h" #include "catalog/pg_database.h" #include "commands/tablespace.h" +#include "commands/waitlsn.h" #include "miscadmin.h" #include "pgstat.h" #include "port/atomics.h" @@ -7312,6 +7312,15 @@ StartupXLOG(void) break; } +/* + * After update lastReplayedEndRecPtr set Latches in SHMEM array + */ +if (XLogCtl->lastReplayedEndRecPtr >= GetMinWaitLSN()) +{ + + WaitLSNSetLatch(XLogCtl->lastReplayedEndRecPtr); +} + /* Else, try to fetch the next WAL record */ record = ReadRecord(xlogreader, InvalidXLogRecPtr, LOG, false); } while (record != NULL); diff --git a/src/backend/commands/Makefile b/src/backend/commands/Makefile index 4a6c99e..0d10117 100644 --- a/src/backend/commands/Makefile +++ b/src/backend/commands/Makefile @@ -20,6 +20,6 @@ OBJS = amcmds.o aggregatecmds.o alter.o analyze.o async.o cluster.o comment.o \ policy.o portalcmds.o prepare.o proclang.o publicationcmds.o \ 
schemacmds.o seclabel.o sequence.o statscmds.o subscriptioncmds.o \ tablecmds.o tablespace.o trigger.o tsearchcmds.o typecmds.o user.o \ - vacuum.o vacuumlazy.o variable.o view.o + vacuum.o vacuumlazy.o variable.o view.o waitlsn.o include $(top_srcdir)/src/backend/common.mk diff --git a/src/backend/commands/waitlsn.c b/src/backend/commands/waitlsn.c new file mode 100644 index 000..db2f549 --- /dev/null +++ b/src/backend/commands/waitlsn.c @@ -0,0 +1,273 @@ +/*- + * + * waitlsn.c + * WaitLSN statment:
Re: [HACKERS] make async slave to wait for lsn to be replayed
New little cleanup code changes -- Ivan Kartyshov Postgres Professional: http://www.postgrespro.com The Russian Postgres Companycommit 217f842726531edb1b0056a5c5727ab01bab7f9b Author: i.kartyshov <i.kartys...@postgrespro.com> Date: Mon Oct 23 12:08:59 2017 +0300 Cherry picked and ported 11dev diff --git a/doc/src/sgml/ref/allfiles.sgml b/doc/src/sgml/ref/allfiles.sgml index 01acc2e..6792eb0 100644 --- a/doc/src/sgml/ref/allfiles.sgml +++ b/doc/src/sgml/ref/allfiles.sgml @@ -181,6 +181,7 @@ Complete list of usable sgml source files in this directory. + diff --git a/doc/src/sgml/ref/waitlsn.sgml b/doc/src/sgml/ref/waitlsn.sgml new file mode 100644 index 000..6f389ca --- /dev/null +++ b/doc/src/sgml/ref/waitlsn.sgml @@ -0,0 +1,144 @@ + + + + + WAITLSN + + + + WAITLSN + 7 + SQL - Language Statements + + + + WAITLSN + wait for the target LSN to be replayed + + + + +WAITLSN 'LSN' [ INFINITELY ] +WAITLSN 'LSN' TIMEOUT wait_time +WAITLSN 'LSN' NOWAIT + + + + + Description + + + interprocess communication mechanism to wait for the target log sequence + number (LSN) on standby in + databases with master-standby asynchronous replication. When run with the + LSN option, the WAITLSN command + waits for the specified LSN to be replayed. By default, wait + time is unlimited. Waiting can be interrupted using Ctrl+C, or + by shutting down the postgres server. You can also limit the wait + time using the TIMEOUT option, or check the target LSN + status immediately using the NOWAIT option. + + + + + + Parameters + + + +LSN + + + Specify the target log sequence number to wait for. + + + + + INFINITELY + + + Wait until the target LSN is replayed on standby. + This is an optional parameter reinforcing the default behavior. + + + + + + + + TIMEOUT wait_time + + + Limit the time to wait for the LSN to be replayed. + The specified wait_time must be an integer + and is measured in milliseconds. 
+ + + + + + NOWAIT + + + Report whether the target LSN has been replayed already, + without any waiting. + + + + + + + Examples + + + Run WAITLSN from psql, + limiting wait time to 1 milliseconds: + + +WAITLSN '0/3F07A6B1' TIMEOUT 1; +NOTICE: LSN is not reached. Try to increase wait time. +LSN reached +- + f +(1 row) + + + + + Wait until the specified LSN is replayed: + +WAITLSN '0/3F07A611'; +LSN reached +- + t +(1 row) + + + + + Limit LSN wait time to 50 milliseconds, and then cancel the command: + +WAITLSN '0/3F0FF791' TIMEOUT 50; +^CCancel request sent +NOTICE: LSN is not reached. Try to increase wait time. +ERROR: canceling statement due to user request + LSN reached +- + f +(1 row) + + + + + + Compatibility + + + There is no WAITLSN statement in the SQL + standard. + + + diff --git a/doc/src/sgml/reference.sgml b/doc/src/sgml/reference.sgml index 9000b3a..0c5951a 100644 --- a/doc/src/sgml/reference.sgml +++ b/doc/src/sgml/reference.sgml @@ -209,6 +209,7 @@ + diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index dd028a1..117cc9b 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -40,6 +40,7 @@ #include "catalog/pg_control.h" #include "catalog/pg_database.h" #include "commands/tablespace.h" +#include "commands/waitlsn.h" #include "miscadmin.h" #include "pgstat.h" #include "port/atomics.h" @@ -149,7 +150,6 @@ const struct config_enum_entry sync_method_options[] = { {NULL, 0, false} }; - /* * Although only "on", "off", and "always" are documented, * we accept all the likely variants of "on" and "off". 
@@ -7312,6 +7312,15 @@ StartupXLOG(void) break; } +/* + * After update lastReplayedEndRecPtr set Latches in SHMEM array + */ +if (XLogCtl->lastReplayedEndRecPtr >= GetMinWaitLSN()) +{ + + WaitLSNSetLatch(XLogCtl->lastReplayedEndRecPtr); +} + /* Else, try to fetch the next WAL record */ record = ReadRecord(xlogreader, InvalidXLogRecPtr, LOG, false); } while (record != NULL); diff --git a/src/backend/commands/Makefile b/src/backend/commands/Makefile index 4a6c99e..0d10117 100644 --- a/src/backend/commands/Makefile +++ b/src/backend/commands/Makefile @@ -20,6 +20,6 @@ OBJS = amcmds.o aggregatecmds.o alter.o analyze.o async.o cluster.o comment.o \ policy.o portalcmds.o prepare.o proclang.o publicationcmds.o \ schemacmds.o seclabel.o sequence.o statscmds.o subscriptioncmds.o \ tablecmds.o tablespace.o trigger.o tsearchcmds.o typecmds.o user.o \ - vacuum.o vacuumlazy.o variable.o view.o + vacuum.o vacuumlazy.o variable.o view.o waitlsn.o include $(top_srcdir)/src/backend/common.mk diff --g
Re: [HACKERS] make async slave to wait for lsn to be replayed
Alexander Korotkov писал 2017-09-26 12:07: I propose following syntax options. WAITLSN lsn; WAITLSN lsn TIMEOUT delay; WAITLSN lsn INFINITE; WAITLSN lsn NOWAIT; For me that looks rather better. What do you think? I agree with you, now syntax looks better. New patch attached to tha mail. Ants Aasma писал 2017-09-26 13:00: Exposing this interface as WAITLSN will encode that visibility order matches LSN order. This removes any chance of fixing for example visibility order of async/vs/sync transactions. Just renaming it so the token is an opaque commit visibility token that just happens to be a LSN would still allow for progress in transaction management. For example, making PostgreSQL distributed will likely want timestamp and/or vector clock based visibility rules. I'm sorry I did not understand exactly what you meant. Please explain this in more detail. -- Ivan Kartyshov Postgres Professional: http://www.postgrespro.com The Russian Postgres Companycommit 217f842726531edb1b0056a5c5727ab01bab7f9b Author: i.kartyshov <i.kartys...@postgrespro.com> Date: Mon Oct 23 12:08:59 2017 +0300 Cherry picked and ported 11dev diff --git a/doc/src/sgml/ref/allfiles.sgml b/doc/src/sgml/ref/allfiles.sgml index 01acc2e..6792eb0 100644 --- a/doc/src/sgml/ref/allfiles.sgml +++ b/doc/src/sgml/ref/allfiles.sgml @@ -181,6 +181,7 @@ Complete list of usable sgml source files in this directory. + diff --git a/doc/src/sgml/ref/waitlsn.sgml b/doc/src/sgml/ref/waitlsn.sgml new file mode 100644 index 000..6f389ca --- /dev/null +++ b/doc/src/sgml/ref/waitlsn.sgml @@ -0,0 +1,144 @@ + + + + + WAITLSN + + + + WAITLSN + 7 + SQL - Language Statements + + + + WAITLSN + wait for the target LSN to be replayed + + + + +WAITLSN 'LSN' [ INFINITELY ] +WAITLSN 'LSN' TIMEOUT wait_time +WAITLSN 'LSN' NOWAIT + + + + + Description + + + interprocess communication mechanism to wait for the target log sequence + number (LSN) on standby in + databases with master-standby asynchronous replication. 
When run with the + LSN option, the WAITLSN command + waits for the specified LSN to be replayed. By default, wait + time is unlimited. Waiting can be interrupted using Ctrl+C, or + by shutting down the postgres server. You can also limit the wait + time using the TIMEOUT option, or check the target LSN + status immediately using the NOWAIT option. + + + + + + Parameters + + + +LSN + + + Specify the target log sequence number to wait for. + + + + + INFINITELY + + + Wait until the target LSN is replayed on standby. + This is an optional parameter reinforcing the default behavior. + + + + + + + + TIMEOUT wait_time + + + Limit the time to wait for the LSN to be replayed. + The specified wait_time must be an integer + and is measured in milliseconds. + + + + + + NOWAIT + + + Report whether the target LSN has been replayed already, + without any waiting. + + + + + + + Examples + + + Run WAITLSN from psql, + limiting wait time to 1 milliseconds: + + +WAITLSN '0/3F07A6B1' TIMEOUT 1; +NOTICE: LSN is not reached. Try to increase wait time. +LSN reached +- + f +(1 row) + + + + + Wait until the specified LSN is replayed: + +WAITLSN '0/3F07A611'; +LSN reached +- + t +(1 row) + + + + + Limit LSN wait time to 50 milliseconds, and then cancel the command: + +WAITLSN '0/3F0FF791' TIMEOUT 50; +^CCancel request sent +NOTICE: LSN is not reached. Try to increase wait time. +ERROR: canceling statement due to user request + LSN reached +- + f +(1 row) + + + + + + Compatibility + + + There is no WAITLSN statement in the SQL + standard. 
+ + + diff --git a/doc/src/sgml/reference.sgml b/doc/src/sgml/reference.sgml index 9000b3a..0c5951a 100644 --- a/doc/src/sgml/reference.sgml +++ b/doc/src/sgml/reference.sgml @@ -209,6 +209,7 @@ + diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index dd028a1..117cc9b 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -40,6 +40,7 @@ #include "catalog/pg_control.h" #include "catalog/pg_database.h" #include "commands/tablespace.h" +#include "commands/waitlsn.h" #include "miscadmin.h" #include "pgstat.h" #include "port/atomics.h" @@ -149,7 +150,6 @@ const struct config_enum_entry sync_method_options[] = { {NULL, 0, false} }; - /* * Although only "on", "off", and "always" are documented, * we accept all the likely variants of "on" and "off". @@ -7312,6 +7312,15 @@ StartupXLOG(void) break; } +/* + * After update lastReplayedEndRecPtr set Latches in SHMEM array + */ +if (XLogCtl->lastReplayedEndRecPtr >= GetMinWaitLSN()) +
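The patch hunk above wakes waiters from StartupXLOG after lastReplayedEndRecPtr advances, guarded by a cheap GetMinWaitLSN() check. The mechanism can be sketched as follows; this is an illustrative simulation with hypothetical names (`WaitSlot`, `slots`, `wake_waiters`), not the patch's shared-memory layout — in the real code each slot carries a backend latch and SetLatch() does the wakeup:

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_WAITERS 4

/* Hypothetical waiter table; the patch keeps an analogous array in
 * shared memory, one slot per waiting backend. */
typedef struct
{
    bool     in_use;
    uint64_t target_lsn;
    bool     latch_set;     /* stand-in for SetLatch() on the backend */
} WaitSlot;

static WaitSlot slots[MAX_WAITERS];

/* Cheap gate corresponding to GetMinWaitLSN(): the smallest LSN any
 * backend is currently waiting for. */
static uint64_t min_wait_lsn(void)
{
    uint64_t min = UINT64_MAX;

    for (int i = 0; i < MAX_WAITERS; i++)
        if (slots[i].in_use && slots[i].target_lsn < min)
            min = slots[i].target_lsn;
    return min;
}

/* What the startup process does after lastReplayedEndRecPtr advances:
 * bail out cheaply if nobody can be satisfied yet, else wake every
 * waiter whose target LSN is now replayed.  Returns the number woken. */
static int wake_waiters(uint64_t last_replayed)
{
    int woken = 0;

    if (last_replayed < min_wait_lsn())
        return 0;                       /* fast path: no one is ready */

    for (int i = 0; i < MAX_WAITERS; i++)
        if (slots[i].in_use && slots[i].target_lsn <= last_replayed)
        {
            slots[i].latch_set = true;
            slots[i].in_use = false;
            woken++;
        }
    return woken;
}
```

The point of the min-LSN gate is that the redo loop is extremely hot: in the common case (no waiter is satisfied) the startup process pays one comparison rather than a scan of the whole array.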
Re: [HACKERS] make async slave to wait for lsn to be replayed
Hello,

On 07.03.2017 15:58, Masahiko Sawada wrote:
> I've read the discussion so far but I didn't see the reason why you've changed it to a contrib module. Could you tell me about that?

I did it on the initiative of our customer, who preferred an implementation in the form of a contrib module. The contrib realization of WAITLSN adds only a single hook to core.

On 07.03.2017 15:58, Masahiko Sawada wrote:
> I guess this feature would be more useful if provided as a core feature and we need to discuss about syntax as Thomas mentioned.

Here I attach the rebased patch waitlsn_10dev_v3 (core feature). I will leave the choice of implementation (core/contrib) to the discretion of the community. I will be glad to hear your suggestions about the syntax, the code, and the patch.

-- Ivan Kartyshov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

diff --git a/doc/src/sgml/ref/allfiles.sgml b/doc/src/sgml/ref/allfiles.sgml index 2bc4d9f..6d5a81e 100644 --- a/doc/src/sgml/ref/allfiles.sgml +++ b/doc/src/sgml/ref/allfiles.sgml @@ -178,6 +178,7 @@ Complete list of usable sgml source files in this directory. + diff --git a/doc/src/sgml/ref/waitlsn.sgml b/doc/src/sgml/ref/waitlsn.sgml new file mode 100644 index 000..338187b --- /dev/null +++ b/doc/src/sgml/ref/waitlsn.sgml @@ -0,0 +1,134 @@ + + + + + WAITLSN + + + + WAITLSN + 7 + SQL - Language Statements + + + + WAITLSN + wait until target LSN has been replayed + + + + +WAITLSN 'LSN' [ , delay ] +WAITLSN_INFINITE 'LSN' +WAITLSN_NO_WAIT 'LSN' + + + + + Description + + + The WAITLSN wait till target LSN will + be replayed with an optional delay (milliseconds by default + infinity) to be wait for LSN to replayed. + + + + WAITLSN provides a simple + interprocess LSN wait mechanism for a backends on slave + in master-slave replication scheme on PostgreSQL database. + + + + The WAITLSN_INFINITE wait till target LSN will + be replayed on slave . + + + + The WAITLSN_NO_WAIT will tell if target LSN was replayed without any wait. 
+ + + + + Parameters + + + +LSN + + + Target log sequence number to be wait for. + + + + +delay + + + Time in miliseconds to waiting for LSN to be replayed. + + + + + + + + Notes + + + Delay time for waiting till LSN to be replayed must be integer. For + default it is infinity. Waiting can be interupped using Ctl+C, or + by Postmaster death. + + + + + GUCs + + + Add two GUCs which help tuning influence on StartupXLOG: + count_waitlsn (denominator to check not each LSN) + int count_waitlsn= 10; + + + + interval_waitlsn (Interval in milliseconds to additional LSN check) + int interval_waitlsn = 100; + + + + + Examples + + + Configure and execute a waitlsn from + psql: + + +WAITLSN '0/3F07A6B1', 1; +NOTICE: LSN is not reached. Try to make bigger delay. +WAITLSN + +WAITLSN '0/3F07A611'; +WAITLSN + +WAITLSN '0/3F0FF791', 50; +^CCancel request sent +NOTICE: LSN is not reached. Try to make bigger delay. +ERROR: canceling statement due to user request + + + + + + Compatibility + + + There is no WAITLSN statement in the SQL + standard. 
+ + + diff --git a/doc/src/sgml/reference.sgml b/doc/src/sgml/reference.sgml index c8191de..a5100a2 100644 --- a/doc/src/sgml/reference.sgml +++ b/doc/src/sgml/reference.sgml @@ -206,6 +206,7 @@ + diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 2dcff7f..9c2def1 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -39,6 +39,7 @@ #include "catalog/pg_control.h" #include "catalog/pg_database.h" #include "commands/tablespace.h" +#include "commands/waitlsn.h" #include "miscadmin.h" #include "pgstat.h" #include "port/atomics.h" @@ -145,6 +146,9 @@ const struct config_enum_entry sync_method_options[] = { {NULL, 0, false} }; +/* GUC variable */ +intcount_waitlsn = 10; +intinterval_waitlsn = 100; /* * Although only "on", "off", and "always" are documented, @@ -6948,6 +6952,8 @@ StartupXLOG(void) { ErrorContextCallback errcallback; TimestampTz xtime; + TimestampTz time_waitlsn = GetCurrentTimestamp(); + int counter_waitlsn = 0; InRedo = true; @@ -7174,6 +7180,17 @@ StartupXLOG(void) break; } +/* + * After update lastReplayedEndRecPtr set Latches in SHMEM array + */ +if (counter_waitlsn % count_waitlsn == 0 + || TimestampDifferenceExceeds(time_waitlsn,GetCurrentTimestamp(),interval_waitlsn)) +{ + WaitLSNSetLatch(); + time_waitlsn = GetCurrentTimestamp(); +} +counter_waitlsn++; + /* Else, try to fetch the next WAL record */ record = ReadRecord(xlogreader, InvalidXLogRecPtr, LOG, false); } while (record != NULL)
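The hunk above throttles the latch probing with the two GUCs: the waiter array is checked only on every count_waitlsn-th record, or once interval_waitlsn milliseconds have elapsed. The throttle logic can be sketched in isolation (hypothetical function name `should_set_latches`; the patch inlines this test in the StartupXLOG redo loop using TimestampDifferenceExceeds):

```c
#include <stdbool.h>

/* The two GUCs from the patch, with their default values. */
static int count_waitlsn    = 10;   /* check only every Nth record */
static int interval_waitlsn = 100;  /* ...or after this many ms     */

/* Probing the waiter array on every replayed record would be too
 * expensive, so the check fires either on every count_waitlsn-th
 * record or when interval_waitlsn ms have passed since the last probe. */
static bool should_set_latches(int counter, long now_ms, long *last_probe_ms)
{
    if (counter % count_waitlsn == 0 ||
        now_ms - *last_probe_ms > interval_waitlsn)
    {
        *last_probe_ms = now_ms;
        return true;
    }
    return false;
}
```

The interval term is what bounds the extra latency a waiter can see: even if fewer than count_waitlsn records arrive, a waiting backend is woken at most interval_waitlsn ms late.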
Re: [HACKERS] make async slave to wait for lsn to be replayed
Rebase done. Meanwhile I made some more changes.

Changes
===
1) WAITLSN is now implemented as an extension called "pg_waitlsn"
2) Call the new hook "lsn_updated_hook" right after xact_redo_commit (xlog.c)
3) Corresponding functions:
pg_waitlsn('0/693FF800', 1) - wait 10 seconds
pg_waitlsn_infinite('0/693FF800') - for infinite wait
pg_waitlsn_no_wait('0/693FF800') - check once whether the LSN has been replayed or not
4) Added two GUCs which help tune the influence on StartupXLOG:
count_waitlsn (denominator, so that not every LSN is checked)
int count_waitlsn = 10;
interval_waitlsn (interval in milliseconds for an additional LSN check)
int interval_waitlsn = 100;
5) Optimized the loop that sets latches.

How to use it
==
Master:
1) Set "wal_level = replica"
Slave:
2) Add shared_preload_libraries = 'pg_waitlsn' and hot_standby = on (in postgresql.conf)
3) CREATE EXTENSION pg_waitlsn;
4) Then in hot standby you can wait for an LSN (like pg_sleep); when the LSN is replayed on the slave, pg_waitlsn will release:
select pg_waitlsn('LSN' [, timeout in ms]);
select pg_waitlsn_infinite('LSN');
select pg_waitlsn_no_wait('LSN');

#Wait until LSN 0/303EC60 is replayed, or 10 seconds have passed.
select pg_waitlsn('0/303EC60', 1);
#Or the same without a timeout.
select pg_waitlsn('0/303EC60');
select pg_waitlsn_infinite('0/693FF800');
#To check whether the LSN has been replayed:
select pg_waitlsn_no_wait('0/693FF800');

Notice: select pg_waitlsn will release on postmaster death or interruption events if they come earlier than the target LSN or the timeout. 
-- Ivan Kartyshov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company diff --git a/contrib/pg_waitlsn/Makefile b/contrib/pg_waitlsn/Makefile new file mode 100644 index 000..49a326c --- /dev/null +++ b/contrib/pg_waitlsn/Makefile @@ -0,0 +1,21 @@ +# pg_waitlsn/Makefile + +MODULE_big = pg_waitlsn +OBJS = pg_waitlsn.o +EXTENSION = pg_waitlsn +DATA = pg_waitlsn--1.0.sql + + + +ifdef USE_PGXS + + PG_CONFIG = pg_config + PGXS := $( shell $( PG_CONFIG ) --pgxs ) + include $(PGXS) +else + + subdir = contrib/pg_waitlsn + top_builddir = ../.. + include $(top_builddir)/src/Makefile.global + include $(top_srcdir)/contrib/contrib-global.mk +endif diff --git a/contrib/pg_waitlsn/pg_waitlsn--1.0.sql b/contrib/pg_waitlsn/pg_waitlsn--1.0.sql new file mode 100644 index 000..8b251f3 --- /dev/null +++ b/contrib/pg_waitlsn/pg_waitlsn--1.0.sql @@ -0,0 +1,19 @@ +/* contrib/pg_waitlsn/pg_waitlsn--1.0.sql */ + +-- complain if script is sourced in psql, rather than via CREATE EXTENSION +\echo Use "CREATE EXTENSION pg_waitlsn" to wait target LSN to been replayed, delay for waiting in miliseconds (default infinity) \quit + +CREATE FUNCTION pg_waitlsn(lsn pg_lsn, delay int default 0) +RETURNS bool +AS 'MODULE_PATHNAME', 'pg_waitlsn' +LANGUAGE C IMMUTABLE STRICT ; + +CREATE FUNCTION pg_waitlsn_infinite(lsn pg_lsn) +RETURNS bool +AS 'MODULE_PATHNAME', 'pg_waitlsn_infinite' +LANGUAGE C IMMUTABLE STRICT ; + +CREATE FUNCTION pg_waitlsn_no_wait(lsn pg_lsn) +RETURNS bool +AS 'MODULE_PATHNAME', 'pg_waitlsn_no_wait' +LANGUAGE C IMMUTABLE STRICT ; diff --git a/contrib/pg_waitlsn/pg_waitlsn.c b/contrib/pg_waitlsn/pg_waitlsn.c new file mode 100644 index 000..d210678 --- /dev/null +++ b/contrib/pg_waitlsn/pg_waitlsn.c @@ -0,0 +1,299 @@ +/*- + * + * pg_waitlsn + * + * Portions Copyright (c) 2012-2017, PostgresPro Global Development Group + * + * IDENTIFICATION + * contrib/pg_waitlsn/pg_waitlsn.c + * + *- + */ + +#include "postgres.h" +#include "fmgr.h" +#include "pgstat.h" 
+#include "access/xlog.h" +#include "utils/pg_lsn.h" +#include "storage/latch.h" +#include "miscadmin.h" +#include "storage/spin.h" +#include "storage/backendid.h" +#include "access/xact.h" +#include "storage/shmem.h" +#include "storage/ipc.h" +#include "utils/timestamp.h" +#include "storage/pmsignal.h" +#include "storage/proc.h" +#include "access/transam.h" +#include "utils/guc.h" + +PG_MODULE_MAGIC; + +static bool pg_waitlsn_internal(XLogRecPtr lsn, uint64_t delay); + +/* Hooks values */ +static lsn_updated_hook_type prev_lsn_updated_hook = NULL; +static shmem_startup_hook_type prev_shmem_startup_hook = NULL; +static void wl_lsn_updated_hook(void); +static uint32 estimate_shmem_size(void); + +/* Latches Own-DisownLatch and AbortCаllBack */ +static void disown_latches_on_abort(XactEvent event, void *arg); +static void wl_own_latch(void); +static void wl_disown_latch(void); + +/* GUC variable */ +intcount_waitlsn = 10; +intinterval_waitlsn = 100; + +/* Globals */ +TimestampTz time_waitlsn = 0; +intcounter_waitlsn = 0; + +void _PG_init(void); + +/* Shared memory structu
Re: [HACKERS] make async slave to wait for lsn to be replayed
Thank you for reviews and suggested improvements. I rewrote patch to make it more stable. Changes === I've made a few changes: 1) WAITLSN now doesn`t depend on snapshot 2) Check current replayed LSN rather than in xact_redo_commit 3) Add syntax WAITLSN_INFINITE '0/693FF800' - for infinite wait and WAITLSN_NO_WAIT '0/693FF800' for check if LSN was replayed as you advised. 4) Reduce the count of loops with GUCs (WalRcvForceReply() which in 9.5 doesn`t exist). 5) Optimize loop that set latches. 6) Add two GUCs that helps us to configure influence on StartupXLOG: count_waitlsn (denominator to check not each LSN) interval_waitlsn (Interval in milliseconds to additional LSN check) Feedback On 09/15/2016 05:41 AM, Thomas Munro wrote: You hold a spinlock in one arbitrary slot, but that doesn't seem sufficient: another backend may also read it, compute a new value and then write it, while holding a different spin lock. Or am I missing something? We acquire an individual spinlock on each member of array, so you cannot compute new value and write it concurrently. Tested == We have been tested it on different servers and OS`s, in different cases and workloads. New version is nearly as fast as vanilla on primary and bring tiny influence on standby performance. Hardware: 144 Intel Cores with HT 3TB RAM all data on ramdisk primary + hotstandby on the same node. A dataset was created with "pgbench -i -s 1000" command. For each round of test we pause replay on standby, make 100 transaction on primary with pgbench, start replay on standby and measure replication gap disappearing time under different standby workload. The workload was "WAITLSN ('Very/FarLSN', 1000ms timeout)" followed by "select abalance from pgbench_accounts there aid = random_aid;" For vanilla 1000ms timeout was enforced on pgbench side by -R option. GUC waitlsn parameters was adopted for 1000ms timeout on standby with 35000 tps rate on primary. 
interval_waitlsn = 500 (ms) count_waitlsn = 3 On 200 clients, slave caching up master as vanilla without significant delay. On 500 clients, slave caching up master 3% slower then vanilla. On 1000 clients, 12% slower. On 5000 clients, 3 time slower because it far above our hardware ability. How to use it == WAITLSN ‘LSN’ [, timeout in ms]; WAITLSN_INFINITE ‘LSN’; WAITLSN_NO_WAIT ‘LSN’; #Wait until LSN 0/303EC60 will be replayed, or 10 second passed. WAITLSN ‘0/303EC60’, 1; #Or same without timeout. WAITLSN ‘0/303EC60’; orfile:///home/vis/Downloads/waitlsn_10dev_v2.patch WAITLSN_INFINITE '0/693FF800'; #To check if LSN is replayed can be used. WAITLSN_NO_WAIT '0/693FF800'; Notice: WAITLSN will release on PostmasterDeath or Interruption events if they come earlier then target LSN or timeout. Thank you for reading, will be glad to get your feedback. -- Ivan Kartyshov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company diff --git a/doc/src/sgml/ref/allfiles.sgml b/doc/src/sgml/ref/allfiles.sgml index 77667bdebd..72c5390695 100644 --- a/doc/src/sgml/ref/allfiles.sgml +++ b/doc/src/sgml/ref/allfiles.sgml @@ -172,6 +172,7 @@ Complete list of usable sgml source files in this directory. 
+ diff --git a/doc/src/sgml/reference.sgml b/doc/src/sgml/reference.sgml index 8acdff1393..3733ad960b 100644 --- a/doc/src/sgml/reference.sgml +++ b/doc/src/sgml/reference.sgml @@ -200,6 +200,7 @@ + diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index aa9ee5a0dd..9696b5dbb5 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -39,6 +39,7 @@ #include "catalog/pg_control.h" #include "catalog/pg_database.h" #include "commands/tablespace.h" +#include "commands/waitlsn.h" #include "miscadmin.h" #include "pgstat.h" #include "port/atomics.h" @@ -143,6 +144,9 @@ const struct config_enum_entry sync_method_options[] = { {NULL, 0, false} }; +/* GUC variable */ +intcount_waitlsn = 10; +intinterval_waitlsn = 100; /* * Although only "on", "off", and "always" are documented, @@ -6781,6 +6785,8 @@ StartupXLOG(void) { ErrorContextCallback errcallback; TimestampTz xtime; + TimestampTz time_waitlsn = GetCurrentTimestamp(); + int counter_waitlsn = 0; InRedo = true; @@ -6998,6 +7004,17 @@ StartupXLOG(void) break; } +/* + * After update lastReplayedEndRecPtr set Latches in SHMEM array + */ +if (counter_waitlsn % count_waitlsn == 0 + || TimestampDifferenceExceeds(time_waitlsn,GetCurrentTimestamp(),interval_waitlsn)) +{ + WaitLSNSetLatch(); + time_waitlsn = GetCurrentTimestamp(); +} +counter_waitlsn++; + /* Else, try to fetch the next WAL record */ record = ReadRecord(xlogreader, InvalidXLogRecPtr, LOG, false); } whil
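The interval/count throttling that the patch adds to the replay loop can be sketched as a standalone model. This is a Python stand-in for the C logic above; the parameter names mirror the two GUCs, but the function itself is illustrative and not part of the patch:

```python
import time

def should_set_latches(counter, last_wakeup, count_waitlsn, interval_waitlsn_ms,
                       now=None):
    """Decide whether the startup process should wake WAITLSN waiters:
    either on every count_waitlsn-th replayed record, or whenever
    interval_waitlsn_ms milliseconds have elapsed since the last wakeup."""
    if now is None:
        now = time.monotonic()
    elapsed_ms = (now - last_wakeup) * 1000.0
    return counter % count_waitlsn == 0 or elapsed_ms >= interval_waitlsn_ms
```

With the benchmark settings above (interval_waitlsn = 500, count_waitlsn = 3), waiters are woken on every third record, or as soon as 500 ms have elapsed since the previous wakeup.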
Re: [HACKERS] less expensive pg_buffercache on big shmem
Hello everyone, the patch has been rebased. Thank you, Tomas, for reviewing this patch and for your valuable comments. From the very beginning we had a misunderstanding about the naming of the methods.

> It'd be really useful if you could provide actual numbers, explain what
> metrics you compare and how. I'm sure it makes sense to you but it's
> utterly confusing for everyone else, and it also makes it impossible to
> reproduce the benchmark.

I tested it as I wrote earlier: I ran it several times, collecting TPS in each series, and calculated the average value.

> Secondly, I see this bit added to the loop over buffers:
>
> if (bufHdr->tag.forkNum == -1)
> {
> fctx->record[i].blocknum = InvalidBlockNumber;
> continue;
> }
>
> and I have no idea why this is needed (when it was not needed before).

This helps to skip unused buffer pages. It is valuable with big shared memory that has not been warmed up.

On 09/02/2016 12:01 PM, Robert Haas wrote:
> I think we certainly want to lock the buffer header, because otherwise
> we might get a torn read of the buffer tag, which doesn't seem good. But
> it's not obvious to me that there's any point in taking the lock on the
> buffer mapping partition; I'm thinking that doesn't really do anything
> unless we lock them all, and we all seem to agree that's going too far.

I replaced the consistent method with the semiconsistent one (which locks buffer headers without taking the partition lock). I also made some additional fixes thanks to the reviewers.
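For illustration, the two things discussed here, per-buffer-header locking and skipping unused pages, can be put into a toy model. This is a Python sketch, not the real pg_buffercache code; all names are stand-ins (a threading.Lock plays the role of the buffer-header spinlock):

```python
import threading
from dataclasses import dataclass, field

INVALID_FORKNUM = -1              # forkNum of a buffer tag that is not in use
INVALID_BLOCK_NUMBER = 0xFFFFFFFF # stand-in for InvalidBlockNumber

@dataclass
class BufferDesc:
    forknum: int = INVALID_FORKNUM
    blocknum: int = 0
    lock: threading.Lock = field(default_factory=threading.Lock)

def scan_buffers(buffers):
    """Semiconsistent scan: lock one buffer header at a time (no
    partition lock), and record unused pages as InvalidBlockNumber."""
    snapshot = []
    for buf in buffers:
        with buf.lock:                     # only this header is locked
            if buf.forknum == INVALID_FORKNUM:
                snapshot.append(INVALID_BLOCK_NUMBER)  # skip unused page
                continue
            snapshot.append(buf.blocknum)
    return snapshot
```

An unused buffer still produces a row, but with InvalidBlockNumber, which is what lets the scan skip the rest of the header cheaply on a big, not-yet-warmed-up shared memory.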
-- Ivan Kartyshov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company diff --git a/contrib/pg_buffercache/Makefile b/contrib/pg_buffercache/Makefile index 497dbeb..9668d84 100644 --- a/contrib/pg_buffercache/Makefile +++ b/contrib/pg_buffercache/Makefile @@ -4,8 +4,9 @@ MODULE_big = pg_buffercache OBJS = pg_buffercache_pages.o $(WIN32RES) EXTENSION = pg_buffercache -DATA = pg_buffercache--1.2.sql pg_buffercache--1.1--1.2.sql \ - pg_buffercache--1.0--1.1.sql pg_buffercache--unpackaged--1.0.sql +DATA = pg_buffercache--1.3.sql pg_buffercache--1.2--1.3.sql \ + pg_buffercache--1.1--1.2.sql pg_buffercache--1.0--1.1.sql \ + pg_buffercache--unpackaged--1.0.sql PGFILEDESC = "pg_buffercache - monitoring of shared buffer cache in real-time" ifdef USE_PGXS diff --git a/contrib/pg_buffercache/pg_buffercache--1.2--1.3.sql b/contrib/pg_buffercache/pg_buffercache--1.2--1.3.sql new file mode 100644 index 000..5935326 --- /dev/null +++ b/contrib/pg_buffercache/pg_buffercache--1.2--1.3.sql @@ -0,0 +1,6 @@ +/* contrib/pg_buffercache/pg_buffercache--1.2--1.3.sql */ + +-- complain if script is sourced in psql, rather than via ALTER EXTENSION +\echo Use "ALTER EXTENSION pg_buffercache(text) UPDATE TO '1.3'" to load this file. \quit + +ALTER FUNCTION pg_buffercache_pages() PARALLEL SAFE; diff --git a/contrib/pg_buffercache/pg_buffercache--1.2.sql b/contrib/pg_buffercache/pg_buffercache--1.3.sql similarity index 88% rename from contrib/pg_buffercache/pg_buffercache--1.2.sql rename to contrib/pg_buffercache/pg_buffercache--1.3.sql index 6ee5d84..6b3f69e 100644 --- a/contrib/pg_buffercache/pg_buffercache--1.2.sql +++ b/contrib/pg_buffercache/pg_buffercache--1.3.sql @@ -1,4 +1,4 @@ -/* contrib/pg_buffercache/pg_buffercache--1.2.sql */ +/* contrib/pg_buffercache/pg_buffercache--1.3.sql */ -- complain if script is sourced in psql, rather than via CREATE EXTENSION \echo Use "CREATE EXTENSION pg_buffercache" to load this file. 
\quit @@ -18,4 +18,4 @@ CREATE VIEW pg_buffercache AS -- Don't want these to be available to public. REVOKE ALL ON FUNCTION pg_buffercache_pages() FROM PUBLIC; -REVOKE ALL ON pg_buffercache FROM PUBLIC; +REVOKE ALL ON pg_buffercache FROM PUBLIC; \ No newline at end of file diff --git a/contrib/pg_buffercache/pg_buffercache.control b/contrib/pg_buffercache/pg_buffercache.control index a4d664f..8c060ae 100644 --- a/contrib/pg_buffercache/pg_buffercache.control +++ b/contrib/pg_buffercache/pg_buffercache.control @@ -1,5 +1,5 @@ # pg_buffercache extension comment = 'examine the shared buffer cache' -default_version = '1.2' +default_version = '1.3' module_pathname = '$libdir/pg_buffercache' relocatable = true diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c index da13bde..b6d2efc 100644 --- a/contrib/pg_buffercache/pg_buffercache_pages.c +++ b/contrib/pg_buffercache/pg_buffercache_pages.c @@ -136,15 +136,6 @@ pg_buffercache_pages(PG_FUNCTION_ARGS) MemoryContextSwitchTo(oldcontext); /* - * To get a consistent picture of the buffer state, we must lock all - * partitions of the buffer map. Needless to say, this is horrible - * for concurrency. Must grab locks in increasing order to avoid - * possible deadlocks. - */ - for (i = 0; i < NUM_BUFFER_PARTITIONS; i++) - LWLockAcquire(BufMappingPartitionLockByIndex(i), LW_SHARED); - - /* * Scan through a
Re: [HACKERS] make async slave to wait for lsn to be replayed
On 08/31/2016 05:54 PM, Craig Ringer wrote:
> How do you get the commit LSN to watch for? Grab
> pg_current_xlog_insert_location() just after the commit and figure that
> replaying to that point guarantees you get the commit?

That's the point: it was created in order to provide a consistent view of the data between master and replica. You almost guessed it; I used GetXLogReplayRecPtr() right after the LSN was physically replayed on the downstream.

> Some time ago[1] I raised the idea of reporting commit LSN on the wire
> to clients. That didn't go anywhere due to compatibility and security
> concerns. I think those were resolvable, but it wasn't enough of a
> priority to push hard on at the time. A truly "right" solution has to
> wait for a protocol bump, but I think good-enough solutions are possible
> now. So you might want to read that thread.

Thank you for pointing to your thread, it was very informative! It seems that you have solved a very similar problem.

> It also mentions hesitations about exposing LSN to clients even more. I
> think we're *way* past that now - we have replication origins and
> replication slots relying on it, it's exposed in a pg_lsn datatype, a
> bunch of views expose it, etc. But it might be reasonable to ask "should
> the client instead be expected to wait for the confirmed commit of a
> 64-bit epoch-extended xid, like that returned by txid_current()?". One
> advantage of using xid is that you can get it while you're still in the
> xact, so there's no race between commit and checking the lsn after
> commit.

That sounds reasonable, but I don't think it will give us any considerable benefits. Still, I'll work out this variant.

> Are you specifically trying to ensure "this commit has replayed on the
> replica before we run queries on it"? Or something else?

Yes, you are right: I want to ensure data consistency on the downstream before running queries on it. Our clients would use it as part of a background worker, and maybe directly in apps too.
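To make the read-your-writes flow concrete, here is a toy model of the latch-based wait, in Python: a condition variable stands in for the shared-memory latches, and plain integers stand in for LSNs. Everything here is illustrative, not the patch's actual API:

```python
import threading

class ReplayState:
    """Stand-in for the shared state WAITLSN watches: the startup
    process advances last_replayed and wakes any waiting backends."""

    def __init__(self):
        self._cond = threading.Condition()
        self.last_replayed = 0

    def advance(self, lsn):
        """Called by the 'startup process' after replaying up to lsn."""
        with self._cond:
            self.last_replayed = lsn
            self._cond.notify_all()        # like setting the waiters' latches

    def wait_for_lsn(self, target, timeout_s=None):
        """Block until target is replayed; return False on timeout."""
        with self._cond:
            return self._cond.wait_for(
                lambda: self.last_replayed >= target, timeout=timeout_s)
```

The client-side flow discussed above is then: commit on the master, grab its current LSN, and call the equivalent of wait_for_lsn(target) on the standby before querying it.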
--
Ivan Kartyshov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: [HACKERS] less expensive pg_buffercache on big shmem
On 09/03/2016 05:04 AM, Tomas Vondra wrote:
> This patch needs a rebase, as 06d7fd6e bumped the version to 1.2.

Thank you for the valuable hint.

> > If we will replace consistent method, then we should replace it with the
> > partially consistent method (called "nonconsistent") because:
> > 1) it's based on fast spinlocks (it's not fully up to its name, though)
>
> In other words, it does exactly the thing proposed up-thread, i.e.
> locking only buffer headers.
>
> What do you mean by fast spinlocks? And why aren't they up to the name?

It is not the spinlocks themselves; the name "nonconsistent" is what is somewhat misleading. At the moment we can't implement concurrent shared memory access without locks in general, so the most inconsistent method that has been proposed was the "nonconsistent" one. But roughly speaking, *nonconsistent* does not fully live up to its name, because it still contains a locking mechanism (spinlocks).

> > 2) it's *almost* the fastest one (the less time needed for execution of
> > method, the less data core will change and as a consequence the more
> > consistent snapshot will be)
>
> I'm not sure I understand what you're saying here? What do you mean by
> "almost the fastest one"?

I mean the fastest of the methods that have been proposed ("consistent" | "semiconsistent" | "nonconsistent").

> I'm a bit confused by the results you reported before, i.e. that
>
> 1) nonconsistent is 10% faster than consistent method
> 2) semiconsistent is 5-times slower than nonconsistent method
>
> What does that mean? Are you referring to duration of the queries reading
> data from pg_buffercache, or to impact on the other queries?

Here I mean the duration of the queries ("working duration time").

> How can semiconsistent be 5x slower than nonconsistent? Wouldn't that
> make it significantly slower than the consistent method?
Firstly, when we want to measure the quality of pg_buffercache, we must measure several values:
1) Execution time (the duration of the queries reading data from pg_buffercache)
2) How it influences the system (and other queries) during its work

Secondly, the semiconsistent method is slower than the nonconsistent and consistent methods, but it has less influence on other queries than the consistent one.

Thirdly, it is slower because it locks the partitions of shared memory in a different way than the consistent or nonconsistent methods do. The semiconsistent strategy implies that only one buffer is locked at a time. Therefore it has no significant effect on the system, but it takes more time.

--
Ivan Kartyshov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: [HACKERS] less expensive pg_buffercache on big shmem
On 09/02/2016 06:01 AM, Robert Haas wrote:
> I wonder whether we ought to just switch from the consistent method to
> the semiconsistent method and call it good. I agree with you that taking
> every buffer partition lock simultaneously seems like too much locking.
> And in the future if we replace the buffer mapping hash with something
> that is lock-free or close to it, then we wouldn't even have buffer
> partition locks any more, and probably no way at all to get an entirely
> consistent snapshot. What do you think of this?

I fully agree with you that it would be preferable in the future to replace the buffer mapping hash with some lock-free algorithm. On the question of replacing the consistent method I agree with you, Andres Freund, and Peter Geoghegan: the consistent method does not bring any considerable benefits. You might be right regarding the three different modes, but our DBAs asked if we could preserve a legacy mode too, hence the choice.

On 09/02/2016 06:19 AM, Andres Freund wrote:
> +1. I think, before long, we're going to have to switch away from having
> locks & partitions in the first place. So I don't see a problem relaxing
> this. It's not like that consistency really buys you anything... I'd
> even consider not using any locks.

If we replace the consistent method, then we should replace it with the partially consistent method (called "nonconsistent"), because:
1) it's based on fast spinlocks (so it's not fully up to its name);
2) it's *almost* the fastest one (the less time the method needs to execute, the less the data can change underneath it, and as a consequence the more consistent the snapshot will be);
3) it has less influence on the entire system during query processing.

On 09/02/2016 06:30 AM, Peter Geoghegan wrote:
> I would like to be able to run pg_buffercache in production from time
> to time.

Yes, in our experience the usage of the fully consistent pg_buffercache in production is quite a courageous decision.
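The three methods under discussion can be sketched side by side in a toy Python model (NUM_BUFFER_PARTITIONS mirrors the role of the real constant; the scan bodies are placeholders, and the point is only which locks are held for how long):

```python
import threading

NUM_BUFFER_PARTITIONS = 128
partition_locks = [threading.Lock() for _ in range(NUM_BUFFER_PARTITIONS)]

def scan_consistent(buffers):
    # Take every partition lock up front, in increasing order
    # (the deadlock-avoidance rule the old code follows).
    for lk in partition_locks:
        lk.acquire()
    try:
        return list(buffers)
    finally:
        for lk in reversed(partition_locks):
            lk.release()

def scan_semiconsistent(buffers, partition_of):
    # Hold only the single partition lock covering the buffer being read.
    out = []
    for i, b in enumerate(buffers):
        with partition_locks[partition_of(i)]:
            out.append(b)
    return out

def scan_nonconsistent(buffers):
    # No partition locks at all; rely on per-buffer header locks only.
    return list(buffers)
```

The consistent scan blocks every concurrent mapping change for the whole scan; the semiconsistent one holds a single partition lock at a time; the nonconsistent one relies on per-buffer header locks alone, which is the relaxation being discussed.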
--
Ivan Kartyshov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
[HACKERS] less expensive pg_buffercache on big shmem
Hi hackers,

Recently I finished my work on a patch for the pg_buffercache contrib module, and I think it's time to share my results.

Introduction
I want to offer you an implementation that allows one to decrease the system workload by partially sacrificing data consistency (full snapshot consistency). Sometimes we do not need full data consistency, for example for quantitative rather than qualitative analysis of memory contents, or when we want to catch insufficient memory resources, or to see how often a relation is used.

Problem description
===
Currently, pg_buffercache v1.1 and prior locks all shared buffer partitions, which greatly affects system performance. Usually we use pg_buffercache to find out why the DB is working slower than expected, or to examine what occupies the entire memory. So when we run pg_buffercache on such a system, we make it even slower.

Implementation
==
The vanilla implementation contains a loop that collects statistics from the whole of shared memory, acquiring, reading, and releasing spinlocks one by one, page by page, while holding an LWLock. The v1.2 implementation contains a flexible loop that can collect shared memory statistics using three different methods:
1) holding an LWLock on only one partition of shared memory at a time (semiconsistent method),
2) without LWLocks (nonconsistent method),
3) or in the vanilla way (consistent method).
These allow us to launch pg_buffercache in three different ways. Each of them has some advantages and some disadvantages:

Consistent:
+ 100% consistency of the shared memory snapshot
- Slows down the system by locking the whole of shared memory

Semiconsistent:
+ Faster than the consistent method
+ Mostly doesn't affect the system load
- The speed of taking the snapshot is low

Nonconsistent:
+ The fastest
+ Doesn't noticeably affect the system
- <3% loss of snapshot consistency

What works
==
Actually, it works well even under heavy load, but of course there might be things I've overlooked.
VIEW pg_buffercache_cons
VIEW pg_buffercache_noncons
VIEW pg_buffercache_semicons

Examples from the docs, in the new implementation:

SELECT c.relname, count(*) AS buffers FROM pg_buffercache_noncons b INNER JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid) AND b.reldatabase IN (0, (SELECT oid FROM pg_database WHERE datname = current_database())) GROUP BY c.relname ORDER BY 2 DESC LIMIT 10;

SELECT c.relname, count(*) AS buffers FROM pg_buffercache_semicons b INNER JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid) AND b.reldatabase IN (0, (SELECT oid FROM pg_database WHERE datname = current_database())) GROUP BY c.relname ORDER BY 2 DESC LIMIT 10;

Testing the implementation
==
How the implementation was tested:
1) Start the server
2) Generate load with pgbench: pgbench -c 250 -s 1000 -T 200 -P1
3) Compare how TPS sags under load for:
SELECT count(*) FROM pg_buffercache_cons;
SELECT count(*) FROM pg_buffercache_semicons;
SELECT count(*) FROM pg_buffercache_noncons;

This test was made on a server with the following parameters:
Model name: Intel(R) Xeon(R) CPU E7-8890 v3 @ 2.50GHz
CPU(s): 144
Socket(s): 4
shared_buffers: 200GB

Results of testing
==
Our DBA team obtained the following results:

Nonconsistent:
* 10% faster than the consistent method
* doesn't noticeably affect the system
* the maximum loss of accuracy was less than 3%* (in most situations this is a permissible accuracy loss)

Semiconsistent:
* 5 times slower than the nonconsistent method
* affects the system less than the consistent method

Overall results:
Our clients were pleased with this implementation. The implementation is backward compatible, so old pg_buffercache v1.1 queries will keep working. The semiconsistent method shows results approaching the nonconsistent one on SELECT-only queries.

* These values were obtained from our DBA tests.

What can be better
===
It is unclear how to optimize the semiconsistent method to make it faster and to reduce the temporary effect that appears from time to time.

I will be glad to see your feedback!
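The reported trade-off (a faster scan loses less accuracy, cf. the <3% figure above) can be illustrated with a deterministic toy model. The model and all its numbers are made up for illustration; they are not measurements:

```python
def expected_staleness(cost_per_buffer, mutation_period, nbuffers):
    """Toy model: while the scanner spends cost_per_buffer time units per
    buffer, one buffer is mutated every mutation_period units. A mutation
    landing when i buffers are already scanned invalidates an
    already-scanned entry with probability i/nbuffers; sum those expected
    hits and normalize to a fraction of the snapshot."""
    stale = 0.0
    t = 0.0
    next_mutation = mutation_period
    for i in range(1, nbuffers + 1):
        t += cost_per_buffer
        while next_mutation <= t:
            stale += i / nbuffers       # expected already-scanned fraction
            next_mutation += mutation_period
    return stale / nbuffers
```

Doubling the per-buffer scan cost in this model roughly doubles the expected fraction of stale entries, which is the intuition behind preferring the faster methods.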
--- Ivan Kartyshov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company diff --git a/contrib/pg_buffercache/Makefile b/contrib/pg_buffercache/Makefile index 065d3d6..8813c50 100644 --- a/contrib/pg_buffercache/Makefile +++ b/contrib/pg_buffercache/Makefile @@ -4,7 +4,7 @@ MODULE_big = pg_buffercache OBJS = pg_buffercache_pages.o $(WIN32RES) EXTENSION = pg_buffercache -DATA = pg_buffercache--1.1.sql pg_buffercache--1.0--1.1.sql pg_buffercache--unpackaged--1.0.sql +DATA = pg_buffercache--1.2.sql pg_buffercache--1.1--1.2.sql pg_buffercache--1.0--1.1.sql pg_buffercache--unpackaged--1.0.sql PGFILEDESC = "pg_buffercache - monitoring of shared buffer cache in real-time" ifdef USE_PGXS diff --git a/contrib/pg_buffercache/README b/contrib/p
[HACKERS] make async slave to wait for lsn to be replayed
Hi hackers,

A few days ago I finished my work on the WAITLSN statement utility, so I'd like to share it.

Introduction
Our clients who deal with 9.5 and use asynchronous master-slave replication asked us to provide a wait mechanism on the slave side, to prevent the situation where the slave handles a query that needs data (an LSN) that has been received and flushed, but not yet replayed.

Problem description
===
The implementation:
Must handle waiting the way pg_sleep() does (sleeping rather than busy-waiting), in order not to load the system
Must avoid race conditions when different backends want to wait for different LSNs
Must not take a snapshot of the DB, to avoid trouble with sudden minXID changes
Must have an optional timeout parameter, in case LSN traffic has stalled
Must release on postmaster death or interrupts

Implementation
==
To avoid trouble with snapshots, WAITLSN was implemented as a utility statement; this allows us to circumvent the snapshot-taking mechanism. We tried different variants, and the most effective way was to use latches. To handle interprocess interaction, all latches are stored in shared memory, and to cope with race conditions each latch is protected by a spinlock. The timeout was made an optional parameter; it is specified in milliseconds.

What works
==
Actually, it works well even with significant timeout or wait period values, but of course there might be things I've overlooked.

How to use it
==
WAITLSN 'LSN' [, timeout in ms];

#Wait until LSN 0/303EC60 is replayed, or 10 seconds have passed.
WAITLSN '0/303EC60', 10000;
#Or the same without a timeout.
WAITLSN '0/303EC60';

Notice: WAITLSN will release on postmaster death or interruption events if they come earlier than the LSN or the timeout.

Testing the implementation
==
The implementation was tested with the testgres and unittest Python modules.
How to test this implementation:
1) Start the master server
2) Create table test, insert tuple 1
3) Set up asynchronous slave replication (9.5: wal_level = hot_standby; 9.6 or higher: wal_level = replica)
4) Slave: START TRANSACTION ISOLATION LEVEL REPEATABLE READ; SELECT * FROM test;
5) Master: delete the tuple + run vacuum + get the new LSN
6) Slave: WAITLSN 'newLSN', 6; finishes with FALSE ("LSN is not reached")
7) Slave: COMMIT; WAITLSN 'newLSN', 6; finishes with success (without a NOTICE message)

WAITLSN waits for the LSN as expected, and releases on postmaster death, interrupts, or timeout.

Your feedback is welcome!

--
Ivan Kartyshov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/doc/src/sgml/ref/allfiles.sgml b/doc/src/sgml/ref/allfiles.sgml index 77667bd..72c5390 100644 --- a/doc/src/sgml/ref/allfiles.sgml +++ b/doc/src/sgml/ref/allfiles.sgml @@ -172,6 +172,7 @@ Complete list of usable sgml source files in this directory. + diff --git a/doc/src/sgml/ref/waitlsn.sgml b/doc/src/sgml/ref/waitlsn.sgml new file mode 100644 index 000..6a8bdca --- /dev/null +++ b/doc/src/sgml/ref/waitlsn.sgml @@ -0,0 +1,108 @@ + + + + + WAITLSN + + + + WAITLSN + 7 + SQL - Language Statements + + + + WAITLSN + wait when target LSN been replayed + + + + +WAITLSN 'LSN' [ , delay ] + + + + + Description + + + The WAITLSN wait till target LSN will + be replayed with an optional delay (milliseconds by default + infinity) to be wait for LSN to replayed. + + + + WAITLSN provides a simple + interprocess LSN wait mechanism for a backends on slave + in master-slave replication scheme on PostgreSQL database. + + + + + Parameters + + + +LSN + + + Target log sequence number to be wait for. + + + + +delay + + + Time in miliseconds to waiting for LSN to be replayed. + + + + + + + + Notes + + + Delay time to waiting for LSN to be replayed must be integer. For + default it is infinity. Waiting can be interupped using Ctl+C, or + by Postmaster death.
+ + + + + Examples + + + Configure and execute a waitlsn from + psql: + + +WAITLSN '0/3F07A6B1', 1; +NOTICE: LSN is not reached. Try to make bigger delay. +WAITLSN + +WAITLSN '0/3F07A611'; +WAITLSN + +WAITLSN '0/3F0FF791', 50; +^CCancel request sent +NOTICE: LSN is not reached. Try to make bigger delay. +ERROR: canceling statement due to user request + + + + + + Compatibility + + + There is no WAITLSN statement in the SQL + standard. + + + diff --git a/doc/src/sgml/reference.sgml b/doc/src/sgml/reference.sgml index 8acdff1..3733ad9 100644 --- a/doc/src/sgml/reference.sgml +++ b/doc/src/sgml/reference.sgml @@ -200,6 +200,7 @@ + diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index f13f9c1..609c83e 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -39,6 +39,7 @@ #include "catalog/pg_cont
[HACKERS] UPSERT strange behavior
Suppose we have some table:

create table cnt( usr_id int primary key, usr_doc_ref text not null, cnt int, sum int );

And we are going to run an INSERT ... ON CONFLICT UPDATE on it (pgbench script):

\setrandom id 1 50
insert into cnt as c(usr_id, usr_doc_ref, cnt) values(:id, '#'||:id, 1) on conflict(usr_id) do update set cnt=c.cnt+1;

Run it:

pgbench -c 16 -j 2 -t 5 -n -h localhost -p 5432 -U postgres -f upsert2-ok.pgb work
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 16
number of threads: 2
number of transactions per client: 5
number of transactions actually processed: 80/80
latency average: 0.000 ms
tps = 36475.782816 (including connections establishing)
tps = 36483.759765 (excluding connections establishing)

All OK. Then add a unique constraint to the table:

alter table cnt add constraint usr_doc_ref_uq unique(usr_doc_ref);

Run pgbench again:

pgbench -c 16 -j 2 -t 5 -n -h localhost -p 5432 -U postgres -f upsert2-ok.pgb work
client 2 aborted in state 2: ERROR: duplicate key value violates unique constraint "usr_doc_ref_uq" DETAIL: Key (usr_doc_ref)=(#39) already exists.
client 6 aborted in state 2: ERROR: duplicate key value violates unique constraint "usr_doc_ref_uq" DETAIL: Key (usr_doc_ref)=(#16) already exists.
client 9 aborted in state 2: ERROR: duplicate key value violates unique constraint "usr_doc_ref_uq" DETAIL: Key (usr_doc_ref)=(#28) already exists.

So, if we have both a primary key and an additional unique constraint on a table, upsert will not work as one would expect.
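One common client-side mitigation for this kind of race is to simply retry the statement: once the competing transaction has committed, the retried INSERT ... ON CONFLICT finds the row via its arbiter index and takes the DO UPDATE path. A sketch of that retry loop (Python; `UniqueViolation` and the callback stand in for the driver's exception and the real statement, so this is illustrative only, not a claim about how the server should behave):

```python
class UniqueViolation(Exception):
    """Stand-in for the driver's unique_violation error (SQLSTATE 23505)."""

def upsert_with_retry(run_upsert, max_retries=3):
    """Retry the upsert when a secondary unique index raises; after the
    competing transaction commits, the retried statement succeeds."""
    last_error = None
    for _ in range(max_retries):
        try:
            return run_upsert()
        except UniqueViolation as e:
            last_error = e          # another backend won the race; retry
    raise last_error
```

This works for the workload above because usr_doc_ref is derived from usr_id, so the retry resolves to an update of the very row the other backend inserted.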
Re: [HACKERS] proposal: get oldest LSN - function
On 11.03.2016 16:33, Kartyshov Ivan wrote:
> On 28.02.2016 11:36, Michael Paquier wrote:
> > On Sun, Feb 28, 2016 at 4:40 PM, Kartyshov Ivan wrote:
> > > It will not satisfy our purposes and our administrators, for three
> > > reasons. 1) The DBA set me the task of getting the oldest number
> > > present in the WAL, not the last.
> > Yeah I got that.
> > > 2) Surely we can choose the oldest segment from the list
> > > "pg_ls_dir('pg_xlog')" of segments and calculate the first LSN by
> > > hand, but it is not what we want to do manually.
> > That's where I am not following. In any case you are just one SQL
> > query away from the result. And actually your patch is incorrect I
> > think. If you care about the oldest record available you should look
> > for the first LSN position of the oldest segment, no? What you are
> > calculating now is the oldest virtual LSN position in local pg_xlog.
> Sorry for not answering for so long, and thank you for the time you
> spent reviewing this patch. Please clarify what you mean when you say
> "virtual LSN position". On the whole you are right: my main goal is to
> get the oldest LSN position in local pg_xlog, and I agree with you that
> it isn't too hard to get it with an SQL query. But it would be more
> useful to have easy access from userspace.

You wrote: "If you care about the oldest record available you should look for the first LSN position of the oldest segment, no?" Yes, I do it exactly this way.

--
Ivan Kartyshov
Postgres Professional: www.postgrespro.com
Russian Postgres Company
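The arithmetic behind "the first LSN position of the oldest segment" is simple enough to sketch standalone. This is a Python model of the pg_lsn layout, assuming the default 16MB WAL segment size; the function names mirror the patch's C helpers but are otherwise illustrative:

```python
WAL_SEG_SIZE = 16 * 1024 * 1024      # default 16MB WAL segment

def seg_offset_to_recptr(segno, offset, seg_size=WAL_SEG_SIZE):
    """Mirror of XLogSegNoOffsetToRecPtr: a 64-bit LSN is just
    segno * seg_size + offset."""
    return segno * seg_size + offset

def oldest_lsn(last_removed_segno, seg_size=WAL_SEG_SIZE):
    """The oldest segment still present is last_removed_segno + 1;
    its first byte gives the oldest LSN available in pg_xlog."""
    return seg_offset_to_recptr(last_removed_segno + 1, 0, seg_size)

def lsn_text(lsn):
    """pg_lsn text form: high 32 bits, '/', low 32 bits, in hex."""
    return f"{lsn >> 32:X}/{lsn & 0xFFFFFFFF:X}"
```

So with no segments removed yet, the oldest LSN in this model is segment 1's first byte, i.e. "0/1000000".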
Re: [HACKERS] proposal: get oldest LSN - function
On 28.02.2016 11:36, Michael Paquier wrote:
> On Sun, Feb 28, 2016 at 4:40 PM, Kartyshov Ivan wrote:
> > It will not satisfy our purposes and our administrators, for three
> > reasons. 1) The DBA set me the task of getting the oldest number
> > present in the WAL, not the last.
> Yeah I got that.
> > 2) Surely we can choose the oldest segment from the list
> > "pg_ls_dir('pg_xlog')" of segments and calculate the first LSN by
> > hand, but it is not what we want to do manually.
> That's where I am not following. In any case you are just one SQL query
> away from the result. And actually your patch is incorrect I think. If
> you care about the oldest record available you should look for the
> first LSN position of the oldest segment, no? What you are calculating
> now is the oldest virtual LSN position in local pg_xlog.

Sorry for not answering for so long, and thank you for the time you spent reviewing this patch. Please clarify what you mean when you say "virtual LSN position". On the whole you are right: my main goal is to get the oldest LSN position in local pg_xlog, and I agree with you that it isn't too hard to get it with an SQL query. But it would be more useful to have easy access from userspace.

--
Ivan Kartyshov
Postgres Professional: www.postgrespro.com
Russian Postgres Company
Re: [HACKERS] proposal: get oldest LSN - function
On 27.02.2016 16:09, Michael Paquier wrote:
> On Sat, Feb 27, 2016 at 3:52 PM, Kartyshov Ivan
> <i.kartys...@postgrespro.ru> wrote:
> > Maybe I wasn't too accurate in terms, because I am a newbie, but: we
> > can get information about the xlog using a big number of support
> > functions (pg_current_xlog_location(), pg_current_xlog_insert_location(),
> > pg_xlogfile_name_offset(), pg_xlogfile_name(),
> > pg_last_xlog_receive_location(), pg_last_xlog_replay_location(), ...
> > etc.); they help to get useful information about xlog files and their
> > content. So this patch extends this set of functions. It is a useful
> > additional tool for the DBA (we can get replicationSlotMinLSN, so why
> > not on the master); it can show us whether xlog replication or the WAL
> > sender is working properly, or indicate whether replication on startup
> > can get up to date with the master, or after a long turnoff must be
> > recovered from the archive.
> What pg_ls_dir('pg_xlog') couldn't do here if you need to know the last
> WAL segment present on master?

Hi Michael,

It will not satisfy our purposes and our administrators, for three reasons:
1) The DBA set me the task of getting the oldest number present in the WAL, not the last.
2) Surely we can choose the oldest segment from the list "pg_ls_dir('pg_xlog')" of segments and calculate the first LSN by hand, but it is not what we want to do manually.
3) The commitfest is coming soon, and our administrators want to enhance their debugging toolbox with usable features.

Thank you for the comment.

--
Ivan Kartyshov
Postgres Professional: www.postgrespro.com
Russian Postgres Company
Re: [HACKERS] proposal: get oldest LSN - function
On 27.02.2016 22:36, Yury Zhuravlev wrote:
> Hello. You forgot to declare a GetXLogOldestLSNPtr function in a header
> file (in xlog.h I think). Some compilers refuse to compile this code
> (MSVC for example). Now I see a warning from GCC:
> xlog.c:10627:1: warning: no previous prototype for «GetXLogOldestLSNPtr» [-Wmissing-prototypes]
> GetXLogOldestLSNPtr(void)
> Thanks.

Hi Yury, thank you for the valuable remark. The changes have been applied.

--
Ivan Kartyshov
Postgres Professional: www.postgrespro.com
Russian Postgres Company

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index f9eea76..f774233 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -16908,6 +16908,8 @@ SELECT set_config('log_statement_stats', 'off', false); pg_current_xlog_location + +pg_oldest_xlog_location pg_start_backup @@ -16981,6 +16983,13 @@ SELECT set_config('log_statement_stats', 'off', false); +pg_oldest_xlog_location() + + pg_lsn + Get the oldest WAL LSN (log sequence number) + + + pg_start_backup(label text , fast boolean ) pg_lsn @@ -17096,6 +17105,7 @@ postgres=# select pg_start_backup('label_goes_here'); +pg_oldest_xlog_location displays the oldest WAL LSN. pg_current_xlog_location displays the current transaction log write location in the same format used by the above functions. Similarly, pg_current_xlog_insert_location displays the current transaction log diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml index 6cb690c..5a0e887 100644 --- a/doc/src/sgml/high-availability.sgml +++ b/doc/src/sgml/high-availability.sgml @@ -860,6 +860,8 @@ primary_conninfo = 'host=192.168.1.50 port=5432 user=foo password=foopass' The last WAL receive location in the standby is also displayed in the process status of the WAL receiver process, displayed using the ps command (see for details).
+ Also we can get the oldest WAL LSN (Log Sequence Number) + pg_oldest_xlog_location, it can give us a useful tool for DBA, additionally it can show us, if xlog replication or wal-sender is working properly or indicate if replication on startup can can get up to date with master, or after long turnoff must be recovered from archive. You can retrieve a list of WAL sender processes via the diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 94b79ac..067d51c 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -10669,6 +10669,18 @@ GetXLogWriteRecPtr(void) } /* + * Get oldest WAL write pointer + */ +XLogRecPtr +GetXLogOldestLSNPtr(void) +{ + XLogRecPtr result; + + XLogSegNoOffsetToRecPtr(XLogGetLastRemovedSegno()+1, 1, result); + return result; +} + +/* * Returns the redo pointer of the last checkpoint or restartpoint. This is * the oldest point in WAL that we still need, if we have to restart recovery. */ diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c index 31cbb01..44e01e1 100644 --- a/src/backend/access/transam/xlogfuncs.c +++ b/src/backend/access/transam/xlogfuncs.c @@ -195,6 +195,27 @@ pg_current_xlog_location(PG_FUNCTION_ARGS) } /* + * Report the oldest WAL write location (same format as pg_start_backup etc) + * + * This is useful for determining the first LSN in existing sequences + */ +Datum +pg_oldest_xlog_location(PG_FUNCTION_ARGS) +{ + XLogRecPtr oldest_recptr; + + if (RecoveryInProgress()) + ereport(ERROR, +(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("recovery is in progress"), + errhint("WAL control functions cannot be executed during recovery."))); + + oldest_recptr = GetXLogOldestLSNPtr(); + + PG_RETURN_LSN(oldest_recptr); +} + +/* * Report the current WAL insert location (same format as pg_start_backup etc) * * This function is mostly for debugging purposes. 
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h index ecd30ce..bc7ce6c 100644 --- a/src/include/access/xlog.h +++ b/src/include/access/xlog.h @@ -236,6 +236,7 @@ extern void GetXLogReceiptTime(TimestampTz *rtime, bool *fromStream); extern XLogRecPtr GetXLogReplayRecPtr(TimeLineID *replayTLI); extern XLogRecPtr GetXLogInsertRecPtr(void); extern XLogRecPtr GetXLogWriteRecPtr(void); +extern XLogRecPtr GetXLogOldestLSNPtr(void); extern bool RecoveryIsPaused(void); extern void SetRecoveryPause(bool recoveryPause); extern TimestampTz GetLatestXTime(void); diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h index b24e434..3c2cefb 100644 --- a/src/include/catalog/pg_proc.h +++ b/src/include/catalog/pg_proc.h @@ -5202,6 +5202,10 @@ DESCR("get an individual replication origin's replication progress"); DATA(insert OID = 6014 ( pg_show_replication_origin_status PGNSP PGUID 12 1 10
Re: [HACKERS] proposal: get oldest LSN - function
On 27.02.2016 03:07, Andres Freund wrote: How does it help with any of that? Hi, thank you for the fast answer. Maybe I wasn't precise in my terms, since I'm a newbie, but: we can already get information about the xlog through a large number of support functions (pg_current_xlog_location(), pg_current_xlog_insert_location(), pg_xlogfile_name_offset(), pg_xlogfile_name(), pg_last_xlog_receive_location(), pg_last_xlog_replay_location(), etc.); they help us get useful information about xlog files and their content. This patch extends that set of functions. The new function is mostly for debugging purposes. -- Ivan Kartyshov Postgres Professional: www.postgrespro.com Russian Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATH] Correct negative/zero year in to_date/to_timestamp
> Why does it say "tested, failed" for all points above there? ;-) Hi, I just used the web reviewer form on https://commitfest.postgresql.org to review the patch, but the form doesn't work properly, unlike the patch. :)
[HACKERS] proposal: get oldest LSN - function
Hello, I want to suggest a little client-side function, implemented in the attached patch. The function pg_oldest_xlog_location returns the oldest LSN (Log Sequence Number) in the xlog. It is a useful additional tool for a DBA (we can get replicationSlotMinLSN, so why not on the master): it can show whether xlog replication or the wal-sender is working properly, and indicate whether replication on startup can catch up with the master, or whether after a long shutdown it must be recovered from the archive. Anyway, does it look useful enough to be part of Postgres? I guess I should push this to the commitfest if that's the case. Best regards, -- Ivan Kartyshov Postgres Professional: www.postgrespro.com Russian Postgres Company diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index f9eea76..f774233 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -16908,6 +16908,8 @@ SELECT set_config('log_statement_stats', 'off', false); pg_current_xlog_location + +pg_oldest_xlog_location pg_start_backup @@ -16981,6 +16983,13 @@ SELECT set_config('log_statement_stats', 'off', false); +pg_oldest_xlog_location() + + pg_lsn + Get the oldest WAL LSN (log sequence number) + + + pg_start_backup(label text , fast boolean ) pg_lsn @@ -17096,6 +17105,7 @@ postgres=# select pg_start_backup('label_goes_here'); +pg_oldest_xlog_location displays the oldest WAL LSN. pg_current_xlog_location displays the current transaction log write location in the same format used by the above functions.
Similarly, pg_current_xlog_insert_location displays the current transaction log diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml index 6cb690c..5a0e887 100644 --- a/doc/src/sgml/high-availability.sgml +++ b/doc/src/sgml/high-availability.sgml @@ -860,6 +860,8 @@ primary_conninfo = 'host=192.168.1.50 port=5432 user=foo password=foopass' The last WAL receive location in the standby is also displayed in the process status of the WAL receiver process, displayed using the ps command (see for details). + Also we can get the oldest WAL LSN (Log Sequence Number) via + pg_oldest_xlog_location; it gives the DBA a useful tool: it can show whether xlog replication or the wal-sender is working properly, and indicate whether replication on startup can catch up with the master, or whether after a long shutdown it must be recovered from the archive. You can retrieve a list of WAL sender processes via the diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 94b79ac..067d51c 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -10669,6 +10669,18 @@ GetXLogWriteRecPtr(void) } /* + * Get oldest WAL write pointer + */ +XLogRecPtr +GetXLogOldestLSNPtr(void) +{ + XLogRecPtr result; + + XLogSegNoOffsetToRecPtr(XLogGetLastRemovedSegno()+1, 1, result); + return result; +} + +/* * Returns the redo pointer of the last checkpoint or restartpoint. This is * the oldest point in WAL that we still need, if we have to restart recovery.
*/ diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c index 31cbb01..44e01e1 100644 --- a/src/backend/access/transam/xlogfuncs.c +++ b/src/backend/access/transam/xlogfuncs.c @@ -195,6 +195,27 @@ pg_current_xlog_location(PG_FUNCTION_ARGS) } /* + * Report the oldest WAL write location (same format as pg_start_backup etc) + * + * This is useful for determining the first LSN in existing sequences + */ +Datum +pg_oldest_xlog_location(PG_FUNCTION_ARGS) +{ + XLogRecPtr oldest_recptr; + + if (RecoveryInProgress()) + ereport(ERROR, +(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("recovery is in progress"), + errhint("WAL control functions cannot be executed during recovery."))); + + oldest_recptr = GetXLogOldestLSNPtr(); + + PG_RETURN_LSN(oldest_recptr); +} + +/* * Report the current WAL insert location (same format as pg_start_backup etc) * * This function is mostly for debugging purposes. diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h index b24e434..3c2cefb 100644 --- a/src/include/catalog/pg_proc.h +++ b/src/include/catalog/pg_proc.h @@ -5202,6 +5202,10 @@ DESCR("get an individual replication origin's replication progress"); DATA(insert OID = 6014 ( pg_show_replication_origin_status PGNSP PGUID 12 1 100 0 0 f f f f f t v r 0 0 2249 "" "{26,25,3220,3220}" "{o,o,o,o}" "{local_id, external_id, remote_lsn, local_lsn}" _null_ _null_ pg_show_replication_origin_status _null_ _null_ _null_ )); DESCR("get progress for all replication origins"); + +DATA(insert OID = 6015 ( pg_oldest_xlog_location PGNSP PGUID 12 1 0 0 0 f f f f t f v s 0 0 3220 "" _null_ _nul
Re: [HACKERS] Sanity checking for ./configure options?
The following review has been posted through the commitfest application: make installcheck-world: tested, failed Implements feature: tested, failed Spec compliant: tested, failed Documentation: tested, failed Tested; I think it's rather important to do this cleanup work on that project. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATH] Correct negative/zero year in to_date/to_timestamp
The following review has been posted through the commitfest application: make installcheck-world: tested, failed Implements feature: tested, failed Spec compliant: tested, failed Documentation: tested, failed Applied this patch; it works well and does what is expected, and the code style is maintained. Regression tests passed OK. No documentation or comments. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Union All View execution
Hi, Suppose we have two queries, Query A and Query B, that both return the same columns, and we build a UNION ALL view over them. Will PostgreSQL execute Query A and then Query B sequentially? Does the work being discussed around parallel query have the potential to execute both parts of a UNION ALL view in parallel? Also, is there a pointer to the latest summary of goals and design for the parallel query work? Thanks! Cheers, Ivan
Re: [HACKERS] backup.sgml-cmd-v003.patch
On Sep 3, 2013, at 6:56 AM, Karl O. Pinc k...@meme.com wrote: On 07/31/2013 12:08:12 PM, Ivan Lezhnjov IV wrote: Patch filename: backup.sgml-cmd-v003.patch The third version of this patch takes into consideration feedback received after original submission (it can be read starting from this message http://www.postgresql.org/message-id/CA +Tgmoaq-9D_mst113TdW=ar8mpgbc+x6t61azk3emhww9g...@mail.gmail.com) Essentially, it addresses the points that were raised in community feedback and offers better worded statements that avoid implying that some features are being deprecated when it isn't the case. We also spent some more time polishing other details, like making adjustments to the tone of the text so that it sounds more like a manual, and less like a blog post. More importantly, this chapter now makes it clear that superuser privileges are not always required to perform a successful backup because in practice as long as the role used to make a backup has sufficient read privileges on all of the objects a user is interested in it's going to work just fine. We also mention and show examples of usage for pg_restore and pigz alongside with gzip, and probably something else too. Hi Ivan, I'm reviewing your patch. I did not read the entirety of the thread referenced above. I apologize if this causes problems. Attached is backup-sgml-cmnd-v003_1.patch to be applied on top of backup-sgml-cmnd-v003.patch and containing my edits. You will eventually want to produce a single patch (but see below). Meanwhile this might help you keep track of my changes. Attached also is your original v3 patch. --- Cleaned up and clarified here and there. The bit about OIDs being depreciated might properly belong in a separate patch. The same might be said about adding mention of pigz. If you submit these as separate patch file attachments they can always be applied in a single commit, but the reverse is more work for the committer. 
(Regardless, I see no reason to have separate commitfest entries or anything other than multiple attachments to the email that finalizes our discussion.) Hello, took me a while to get here, but a lot has been going on... Okay, I'm new and I don't know why a single patch like this is more work for a committer. Just so I understand and know. Minimally modified to note the existence of directory dumps. It may be that the utility/flexibility of directory dumps should also be mentioned. My thought is that the part beginning with The options in detail are: should not describe all the possibilities for the --format option, that being better left to the reference section. Likewise, this being prose, it might be best to describe all the options in-line, instead of presented as a list. I have left it as-is for you to improve as seen fit. Agreed, it probably looks better as a sentence. I have frobbed your programlisting to adjust the indentation and line-wrap style. I submit it here for consideration in case this style is attractive. This is nothing but conceit. We should use the same style used elsewhere in the documentation. (I can't think offhand of a place to look for a line-wrapped shell example. If you can't find one I'll take a look and if neither of us finds one we'll then have choices.) Looks good to me. I don't know that it's necessary to include pigz examples, because it sounds like pigz is a drop-in gzip replacement. I've left your examples in, in case you feel they are necessary. We do. We believe it can encourage more people to consider using it. The way we see it, most people seem to be running multicore systems these days, yet many simply are not aware of pigz. We have been routinely informing our customers of pigz as an alternative to tried and tested gzip when helping optimize their configurations, and all of them without exception went for it. I guess everybody just likes to squeeze some extra juice for free.
The existing text of the SQL Dump section could use some alteration to reduce redundancy and add clarity. I'm thinking specifically of mention of pg_restore as being required to restore custom format backups and of the default pg_dump output being not just plain text but being a collection of SQL commands. Yes, the latter is obvious upon reflection, psql being what it is, but I think it would be helpful to spell this out. Especially in the context of the current patch. There could well be other areas like this to be addressed. I don't quite follow you here. I mean, I kinda understand what you mean in general, but when I look at the text I fail to see what you had in mind specifically. For example, pg_restore is mentioned only 3 times in section 24.1. Each mention seems pretty essential to me. And the text flow is pretty natural. Also, about plain text format being a collection of SQL commands. The very first paragraph of the section 24.1 reads The idea
Re: [HACKERS] Millisecond-precision connect_timeout for libpq
Is there any hope to see it in libpq? If so, can anyone review the latest version of my patch? On 10 July 2013 11:49, ivan babrou ibob...@gmail.com wrote: On 9 July 2013 18:43, Andres Freund and...@2ndquadrant.com wrote: On 2013-07-05 21:28:59 +0400, ivan babrou wrote: Hi, guys! I made a quick patch to support floating number in connect_timeout param for libpq. This will treat floating number as seconds so this is backwards-compatible. I don't usually write in C, so there may be mistakes. Could you review it and give me some feedback? -- Regards, Ian Babrou http://bobrik.name http://twitter.com/ibobrik skype:i.babrou diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c index 18fcb0c..58c1a35 100644 --- a/src/interfaces/libpq/fe-connect.c +++ b/src/interfaces/libpq/fe-connect.c @@ -1452,7 +1452,7 @@ static int connectDBComplete(PGconn *conn) { PostgresPollingStatusType flag = PGRES_POLLING_WRITING; - time_t finish_time = ((time_t) -1); + struct timeval finish_time; if (conn == NULL || conn->status == CONNECTION_BAD) return 0; @@ -1462,17 +1462,14 @@ connectDBComplete(PGconn *conn) */ if (conn->connect_timeout != NULL) { - int timeout = atoi(conn->connect_timeout); + int timeout_usec = (int) (atof(conn->connect_timeout) * 1000000); I'd rather not use a plain int for storing usecs. An overflow is rather unlikely, but still. Also, I'd rather use something like USECS_PER_SEC instead of a plain 1000000 in multiple places. - if (timeout > 0) + if (timeout_usec > 0) { - /* - * Rounding could cause connection to fail; need at least 2 secs - */ - if (timeout < 2) - timeout = 2; - /* calculate the finish time based on start + timeout */ - finish_time = time(NULL) + timeout; + gettimeofday(&finish_time, NULL); + finish_time.tv_usec += (int) timeout_usec; Accordingly adjust this. Looks like a sensible thing to me. *Independent* from this patch, you might want to look into server-side connection pooling using transaction mode.
If that's applicable for your application it might reduce latency noticeably. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services I tried to make it more safe. Still not sure about constants, I haven't found any good examples in libpq. -- Regards, Ian Babrou http://bobrik.name http://twitter.com/ibobrik skype:i.babrou -- Regards, Ian Babrou http://bobrik.name http://twitter.com/ibobrik skype:i.babrou -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Millisecond-precision connect_timeout for libpq
On 9 July 2013 19:17, Dmitriy Igrishin dmit...@gmail.com wrote: 2013/7/9 Merlin Moncure mmonc...@gmail.com On Fri, Jul 5, 2013 at 12:28 PM, ivan babrou ibob...@gmail.com wrote: Hi, guys! I made a quick patch to support floating number in connect_timeout param for libpq. This will treat floating number as seconds so this is backwards-compatible. I don't usually write in C, so there may be mistakes. Could you review it and give me some feedback? First thing that jumps into my head: why not use asynchronous connection (PQconnectStart, etc) and code the timeout on top of that? +1. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers -- // Dmitriy. That doesn't look like a straightforward solution to me. In my case existing drivers will benefit from my patch; in the async case they would have to be rewritten. We don't use libpq directly, we use the native pgsql module from PHP. Even so, the kernel can wait with millisecond precision — why should we limit precision 1000x and reinvent milliseconds in userspace? -- Regards, Ian Babrou http://bobrik.name http://twitter.com/ibobrik skype:i.babrou -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Millisecond-precision connect_timeout for libpq
On 9 July 2013 18:43, Andres Freund and...@2ndquadrant.com wrote: On 2013-07-05 21:28:59 +0400, ivan babrou wrote: Hi, guys! I made a quick patch to support floating number in connect_timeout param for libpq. This will treat floating number as seconds so this is backwards-compatible. I don't usually write in C, so there may be mistakes. Could you review it and give me some feedback? -- Regards, Ian Babrou http://bobrik.name http://twitter.com/ibobrik skype:i.babrou diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c index 18fcb0c..58c1a35 100644 --- a/src/interfaces/libpq/fe-connect.c +++ b/src/interfaces/libpq/fe-connect.c @@ -1452,7 +1452,7 @@ static int connectDBComplete(PGconn *conn) { PostgresPollingStatusType flag = PGRES_POLLING_WRITING; - time_t finish_time = ((time_t) -1); + struct timeval finish_time; if (conn == NULL || conn->status == CONNECTION_BAD) return 0; @@ -1462,17 +1462,14 @@ connectDBComplete(PGconn *conn) */ if (conn->connect_timeout != NULL) { - int timeout = atoi(conn->connect_timeout); + int timeout_usec = (int) (atof(conn->connect_timeout) * 1000000); I'd rather not use a plain int for storing usecs. An overflow is rather unlikely, but still. Also, I'd rather use something like USECS_PER_SEC instead of a plain 1000000 in multiple places. - if (timeout > 0) + if (timeout_usec > 0) { - /* - * Rounding could cause connection to fail; need at least 2 secs - */ - if (timeout < 2) - timeout = 2; - /* calculate the finish time based on start + timeout */ - finish_time = time(NULL) + timeout; + gettimeofday(&finish_time, NULL); + finish_time.tv_usec += (int) timeout_usec; Accordingly adjust this. Looks like a sensible thing to me. *Independent* from this patch, you might want to look into server-side connection pooling using transaction mode. If that's applicable for your application it might reduce latency noticeably.
Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services I tried to make it more safe. Still not sure about constants, I haven't found any good examples in libpq. -- Regards, Ian Babrou http://bobrik.name http://twitter.com/ibobrik skype:i.babrou connect_timeout_in_ms.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Millisecond-precision connect_timeout for libpq
On 9 July 2013 11:05, Markus Wanner mar...@bluegap.ch wrote: On 07/08/2013 08:31 PM, ivan babrou wrote: Seriously, I don't get why running 150 poolers is easier. Did you consider running pgbouncer on the database servers? Regards Markus Wanner Database server lost network — boom, 2 seconds delay. What's the point then? -- Regards, Ian Babrou http://bobrik.name http://twitter.com/ibobrik skype:i.babrou -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Millisecond-precision connect_timeout for libpq
On 9 July 2013 12:20, Markus Wanner mar...@bluegap.ch wrote: On 07/09/2013 09:15 AM, ivan babrou wrote: Database server lost network — boom, 2 seconds delay. What's the point then? Oh, I see. Good point. It could still improve connection time during normal operation, though. Connection time during normal operation is 1.5 ms, which is fast enough for now. None the less, I now agree with you: we recommend a pooler, which may be capable of millisecond timeouts, but arguably is vastly more complex than the proposed patch. And it even brings its own set of gotchas (lots of connections). I guess I don't quite buy the complexity argument, yet. A pooler isn't capable of millisecond timeouts. At least I don't see how I could tell that the pooler is dead within 50 ms. Sure, gettimeofday() is subject to clock adjustments. But so is time(). And if you're setting timeouts that low, you probably know what you're doing (or at least care about latency a lot). Or is gettimeofday() still considerably slower on certain architectures or in certain scenarios? Where's the complexity? There's no complexity here :) Regards Markus Wanner -- Regards, Ian Babrou http://bobrik.name http://twitter.com/ibobrik skype:i.babrou -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Millisecond-precision connect_timeout for libpq
On 9 July 2013 17:59, Markus Wanner mar...@bluegap.ch wrote: Ian, On 07/05/2013 07:28 PM, ivan babrou wrote: - /* - * Rounding could cause connection to fail; need at least 2 secs - */ You removed this above comment... please check why it's there. The relevant revision seems to be: ### commit 2908a838ac2cf8cdccaa115249f8399eef8a731e Author: Tom Lane t...@sss.pgh.pa.us Date: Thu Oct 24 23:35:55 2002 +0000 That's not correct, facb72007 is the relevant revision. It seems that it's only applicable for small timeouts in seconds, but if you request a connect timeout of 1 ms you should be ready to fail. I may be wrong about this, Bruce Momjian introduced that change in 2002. Code review for connection timeout patch. Avoid unportable assumption that tv_sec is signed; return a useful error message on timeout failure; honor PGCONNECT_TIMEOUT environment variable in PQsetdbLogin; make code obey documentation statement that timeout=0 means no timeout. ### - if (timeout < 2) - timeout = 2; - /* calculate the finish time based on start + timeout */ - finish_time = time(NULL) + timeout; + gettimeofday(&finish_time, NULL); + finish_time.tv_usec += (int) timeout_usec; I vaguely recall tv_usec only being required to hold values up to 1000000 by some standard. A signed 32 bit value would qualify, but only hold up to a good half hour worth of microseconds. That doesn't quite seem enough to calculate finish_time the way you are proposing to do it. Agree, this should be fixed.
+ finish_time.tv_sec += finish_time.tv_usec / 1000000; + finish_time.tv_usec = finish_time.tv_usec % 1000000; } } @@ -1073,15 +1074,15 @@ pqSocketPoll(int sock, int forRead, int forWrite, time_t end_time) input_fd.events |= POLLOUT; /* Compute appropriate timeout interval */ - if (end_time == ((time_t) -1)) + if (end_time == NULL) timeout_ms = -1; else { - time_t now = time(NULL); + struct timeval now; + gettimeofday(&now, NULL); - if (end_time > now) - timeout_ms = (end_time - now) * 1000; - else + timeout_ms = (end_time->tv_sec - now.tv_sec) * 1000 + (end_time->tv_usec - now.tv_usec) / 1000; I think that's incorrect on a platform where tv_sec and/or tv_usec is unsigned. (And the cited commit above indicates there are such platforms.) I don't get it. timeout_ms is signed, and can hold unsigned - unsigned. Is it about anything else? On 07/09/2013 02:25 PM, ivan babrou wrote: There's no complexity here :) Not so fast, cowboy... :-) Regards Markus Wanner Is there anything else I should fix? -- Regards, Ian Babrou http://bobrik.name http://twitter.com/ibobrik skype:i.babrou -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Millisecond-precision connect_timeout for libpq
On 5 July 2013 23:47, ivan babrou ibob...@gmail.com wrote: On 5 July 2013 23:26, Tom Lane t...@sss.pgh.pa.us wrote: ivan babrou ibob...@gmail.com writes: If you can figure out that postgresql is overloaded then you may decide what to do faster. In our app we have very strict limit for connect time to mysql, redis and other services, but postgresql has minimum of 2 seconds. When processing time for request is under 100ms on average sub-second timeouts matter. If you are issuing a fresh connection for each sub-100ms query, you're doing it wrong anyway ... regards, tom lane In php you cannot persist connection between requests without worrying about transaction state. We don't use postgresql for every sub-100ms query because it can block the whole request for 2 seconds. Usually it takes 1.5ms to connect, btw. Can you tell me why having ability to specify more accurate connect timeout is a bad idea? -- Regards, Ian Babrou http://bobrik.name http://twitter.com/ibobrik skype:i.babrou Nobody answered my question yet. -- Regards, Ian Babrou http://bobrik.name http://twitter.com/ibobrik skype:i.babrou -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Millisecond-precision connect_timeout for libpq
On 8 July 2013 20:40, David E. Wheeler da...@justatheory.com wrote: On Jul 8, 2013, at 7:44 AM, ivan babrou ibob...@gmail.com wrote: Can you tell me why having ability to specify more accurate connect timeout is a bad idea? Nobody answered my question yet. From an earlier post by Tom: What exactly is the use case for that? It seems like extra complication for something with little if any real-world usefulness. So the answer is: extra complication. Best, David I don't see any extra complication in a backwards-compatible patch that removes more lines than it adds. Can you tell me what exactly is extra complicated? About pooling connections: we have 150 application servers and 10 postgresql servers. Each app connects to each server — 150 connections per server if I run a pooler on each application server. That's more than the default setting, and now we usually have no more than 10 connections per server. What would happen if we had 300 app servers? I thought connections consume some memory. Running a pooler on only some app servers gives no advantage — I still may get a network black hole and a 2 second delay. Moreover, right now I can guess that postgresql is overloaded if it does not accept connections; with a pooler I can simply blow up the disks with heavy IO. Seriously, I don't get why running 150 poolers is easier. And my problem is still here: the server (the pooler in this case) goes down — 2 seconds delay. 2000% slower. Where am I wrong? -- Regards, Ian Babrou http://bobrik.name http://twitter.com/ibobrik skype:i.babrou -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Millisecond-precision connect_timeout for libpq
Hi, guys! I made a quick patch to support floating number in connect_timeout param for libpq. This will treat floating number as seconds so this is backwards-compatible. I don't usually write in C, so there may be mistakes. Could you review it and give me some feedback? -- Regards, Ian Babrou http://bobrik.name http://twitter.com/ibobrik skype:i.babrou connect_timeout_in_ms.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Millisecond-precision connect_timeout for libpq
If you can figure out that postgresql is overloaded then you may decide what to do faster. In our app we have a very strict limit on connect time to mysql, redis and other services, but postgresql has a minimum of 2 seconds. When average processing time for a request is under 100 ms, sub-second timeouts matter. On 5 July 2013 22:20, Tom Lane t...@sss.pgh.pa.us wrote: ivan babrou ibob...@gmail.com writes: Hi, guys! I made a quick patch to support floating number in connect_timeout param for libpq. What exactly is the use case for that? It seems like extra complication for something with little if any real-world usefulness. regards, tom lane -- Regards, Ian Babrou http://bobrik.name http://twitter.com/ibobrik skype:i.babrou -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Millisecond-precision connect_timeout for libpq
On 5 July 2013 23:26, Tom Lane t...@sss.pgh.pa.us wrote: ivan babrou ibob...@gmail.com writes: If you can figure out that postgresql is overloaded then you may decide what to do faster. In our app we have very strict limit for connect time to mysql, redis and other services, but postgresql has minimum of 2 seconds. When processing time for request is under 100ms on average sub-second timeouts matter. If you are issuing a fresh connection for each sub-100ms query, you're doing it wrong anyway ... regards, tom lane In php you cannot persist connection between requests without worrying about transaction state. We don't use postgresql for every sub-100ms query because it can block the whole request for 2 seconds. Usually it takes 1.5ms to connect, btw. Can you tell me why having ability to specify more accurate connect timeout is a bad idea? -- Regards, Ian Babrou http://bobrik.name http://twitter.com/ibobrik skype:i.babrou -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] backup.sgml patch that adds information on custom format backups
Alright gentlemen, here we go, take two. backup.sgml-cmd-v002.patch addresses your feedback and offers better-worded statements that avoid implying that some features are being deprecated when that isn't the case. We also spent some more time polishing other details, like adjusting the tone of the text so that it sounds like a manual, not a blog post. More importantly, it makes clear that superuser privileges are not always required to perform a successful backup: in practice, as long as the role used to make the backup has sufficient read privileges on all of the objects a user is interested in, it will work. We also mention and show usage examples for pigz alongside gzip, and probably something else too :P Please see the diff file attached to this message for a complete and detailed log of the changes. backup.sgml-cmd-v002.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] backup.sgml patch that adds information on custom format backups
Hello everyone! I'd like to thank you for quick replies and for the thoughtful feedback. I'm working on improving the current wording and I'm going to follow up shortly with an updated version. Please, stay tuned. Ivan On Feb 11, 2013, at 12:15 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: I'm not sure what others think, but the proposed wording seems a bit hard on plain text dumps to me. I wasn't terribly thrilled with labeling pg_dumpall deprecated, either. It might be imperfect for some use cases, but that adjective suggests that we're trying to get rid of it, which surely isn't the case. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] backup.sgml patch that adds information on custom format backups
Hello, I'd like to submit the following patch, which extends backup.sgml with a bit of practical but important information. Project: postgresql Patch filename: backup.sgml-cmd-v001.patch The patch extends backup.sgml and adds practical information on the custom format backup approach. Basically, we believe that the plaintext backup format is suitable for only a limited range of use cases, and that in the real world people are usually better off with a custom format backup. This is what we want PostgreSQL users to be aware of, along with some hands-on examples of how to do backups using this approach. The patch is meant for application against the master branch, and it passes 'make check' and 'make html' successfully. PS: this is my first submission ever, so if I'm missing something or not doing it as expected, please do let me know. Thank you. Ivan backup.sgml-cmd-v001.patch Description: Binary data
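The patch itself is attached as binary data, so its text isn't reproduced in the archive. For context, the custom-format workflow it documents looks roughly like this (database, table and file names below are placeholders, not taken from the patch):

```shell
# Custom format (-Fc): compressed, and restorable selectively with pg_restore
pg_dump -Fc -f mydb.dump mydb

# Full restore into an (already created) empty database
pg_restore -d mydb_restored mydb.dump

# Custom format also permits partial restores, e.g. a single table
pg_restore -d mydb_restored -t my_table mydb.dump

# A plaintext dump, by contrast, can only be replayed wholesale with psql
psql -d mydb_restored -f mydb.sql
```

The selective-restore ability of pg_restore is the main practical advantage the patch argues for.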
Re: [HACKERS] dedebugging and a functions that just don't work on debian flavour
On Sun, 28 Feb 2010 23:02:39 -0500 Robert Haas robertmh...@gmail.com wrote: On Sun, Feb 28, 2010 at 8:43 AM, Ivan Sergio Borgonovo m...@webthatworks.it wrote: I've no idea what to do next other than asking if someone can give a look at the code and check my comprehension of what a tsquery should be in memory. Have you tried valgrind? It was something related to my misinterpretation of how a tsquery is built. operand.length is the length of the cstring *excluding* the \0. I was misled by the fact that tsvector entries have no terminating \0 while tsquery operands do. It still needs more extensive testing, but I think I got it working. Once I'm sure, I'll make the polished source available and write some docs for all the poor guys like me willing to start writing extensions ;) Thanks to everybody for the patience. -- Ivan Sergio Borgonovo http://www.webthatworks.it
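The off-by-one being described is easy to state in code. The sketch below uses plain C stand-ins (malloc instead of palloc, a simplified DemoOperand struct instead of PostgreSQL's QueryOperand) to illustrate the convention the thread settles on: a tsquery operand's length excludes the stored terminating NUL, so a faithful copy needs length + 1 bytes.

```c
#include <stdlib.h>
#include <string.h>

/* Illustrative stand-in for QueryOperand: `length` is the byte length of
 * the lexeme EXCLUDING the trailing '\0' that tsquery stores, while
 * tsvector entries carry no terminator at all. */
typedef struct
{
    unsigned length;    /* lexeme length, NUL not counted */
    unsigned distance;  /* byte offset of the lexeme in the operand area */
} DemoOperand;

/* Copy a lexeme out of the operand area into a fresh NUL-terminated string. */
char *
lexeme_copy(const DemoOperand *op, const char *operand_base)
{
    char *dst = malloc(op->length + 1);

    memcpy(dst, operand_base + op->distance, op->length);
    dst[op->length] = '\0';
    return dst;
}
```

Allocating only `length` bytes (forgetting the terminator) is exactly the kind of bug that only shows up for some queries and disappears at -O0, as seen earlier in the thread.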
[HACKERS] dedebugging and a functions that just don't work on debian flavour
I wrote a function that turns tsvectors into tsqueries. I can't see any evident error, and the function actually returns the expected results, but only for a strange set of queries. Otherwise it causes tsqueryout to exit with errors when the result is returned. It seems that tsqueryout is running beyond what it should, and most likely it is some dirty memory/allocation issue. (ERROR: unrecognized operator type: or ERROR: stack depth limit exceeded) The queries that succeed and the ones that fail form a pretty curious set:

SUCCEED:
select tsvector_to_tsquery(ft1idx, '|', 'ABCD', 22::smallint) from catalog_items limit 2; -- q1
select itemid, tsvector_to_tsquery(ft1idx, '|', 'ABCD', 22::smallint) from catalog_items; -- q2
select tsvector_to_tsquery(ft1idx, '|', 'ABCD', 22::smallint) from catalog_items where itemid=10 or itemid=15 or itemid=27; -- q3
select tsvector_to_tsquery(ft1idx, '|', 'ABCD', 22::smallint) from catalog_items order by random() limit 3; -- q4

FAIL:
select tsvector_to_tsquery(ft1idx, '|', 'ABCD', 22::smallint) from catalog_items limit 3; -- q5

If I compile my function with -O0, all queries succeed. Then I created a table create table tsvectors(i serial, tsv tsvector); in the same DB as catalog_items and filled it with some data. The length of the tsvectors is similar; catalog_items just contains more fields and records. All the queries above succeeded. I then compiled postgresql from scratch with default configure options, loaded the whole DB containing catalog_items, and got no errors. Then I looked into the configure options used by debian, copied nearly all of them with the exception of the tcl stuff and rpath, compiled once more... tested my function, and no error. I've no idea what to do next other than asking if someone can give a look at the code and check my comprehension of what a tsquery should be in memory.
http://www.webthatworks.it/d1/files/ts_utilities.tar.bz2 thanks -- Ivan Sergio Borgonovo http://www.webthatworks.it
[HACKERS] I still didn't get how tsquery is internally
I've been working on this http://pgsql.privatepaste.com/2a81432f4f for 8.3 (there should be some comments to port it to 8.4). tsvector_tsquery_size works as expected. But whatever I pass to the function, I get an empty tsquery. Yet no memory allocation problem or hang etc... just an empty tsquery. Still, that's not enough to say I'm not filling distance and left with the wrong values etc... anyway.

CREATE OR REPLACE FUNCTION tsvector_to_tsquery(IN tsv tsvector, op IN char(1), weights IN varchar(4), maxpos IN smallint) RETURNS tsquery AS 'MODULE_PATHNAME' LANGUAGE C STRICT;

The function should turn a tsvector into a tsquery. op is the operator (| or &). weights is a filter: lexemes will be picked up only if they have no position/weight or they have one of the passed weights. maxpos will filter out all lexemes that have a position > maxpos. So tsvector_to_tsquery('java:1A,2B tano:3C,4D', '&', 'ABC', 100) -> java:A & java:B & tano:C and tsvector_to_tsquery('java:1A,2B tano:3C,4D', '|', 'ABC', 100) -> java:AB | tano:C. There is also a small problem. The op argument always arrives as char = 0x08 (even in a much simpler function that just retrieves op and returns it as a masked int64). It seems that char op = PG_GETARG_CHAR(1); is not really what I thought it to be. -- Ivan Sergio Borgonovo http://www.webthatworks.it
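A likely explanation (my reading, not confirmed in the thread): an SQL char(1) argument is passed as a bpchar varlena, while PG_GETARG_CHAR() is meant for the single-byte internal "char" type, so reading it that way yields header bytes rather than the operator. A sketch of fetching the operator from a bpchar argument, using 8.4-era macros (on 8.3, PG_GETARG_BPCHAR_P and VARDATA play the same role; the function body is illustrative):

```c
#include "postgres.h"
#include "fmgr.h"
#include "utils/builtins.h"

Datum tsvector_to_tsquery(PG_FUNCTION_ARGS);

Datum
tsvector_to_tsquery(PG_FUNCTION_ARGS)
{
    /* argument 1 was declared char(1) in SQL, i.e. bpchar, not "char" */
    BpChar *op_arg = PG_GETARG_BPCHAR_PP(1);
    char    op = *VARDATA_ANY(op_arg);  /* first payload byte: '|' or '&' */

    /* ... build the tsquery using op ... */
    PG_RETURN_NULL();
}
```

Alternatively, declaring the SQL argument as the internal type "char" (with the quotes) would make PG_GETARG_CHAR(1) behave as expected.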
[HACKERS] building tsquery directly in memory (avoid makepol)
I know in advance the structure of a whole tsquery: it has already been reduced and the lexemes have already been computed. I'd like to write it directly in memory without having to pass through pushValue/makepol. Anyway, I'm not quite sure about the layout of a tsquery in memory and I still haven't been able to find the macros that could help me [1]. Before doing it the trial and error way, can somebody just give me an example? I'm not sure about my interpretation of the comments in the documentation. This is how I'd write X:AB | YY:C | ZZZ:D:

TSQuery
  vl_len_ (total # of bytes of the whole following structure: QueryItems*size + total lexeme length)
  size (# of QueryItems in the query)
  QueryItem: type QI_OPR, oper OP_OR, left = distance from QueryItem X:AB
  QueryItem: type QI_OPR, oper OP_OR, left = distance from QueryItem ZZZ:D
  QueryItem (X): type QI_VAL, weight 1100, valcrc ???, length 1, distance
  QueryItem (YY): type QI_VAL, weight 0010, valcrc ???, length 2, distance
  QueryItem (ZZZ): type QI_VAL, weight 0001, valcrc ???, length 3, distance
  X YY ZZZ

[1] the equivalent of POSDATALEN, WEP_GETWEIGHT, macros to compute the size of various parts of a TSQuery etc... I couldn't see any place in the code where a TSQuery is built in one shot instead of using pushValue. Another thing I'd like to know: what is going to be preferred during a scan between 'java:1A,2B'::tsvector @@ to_tsquery('java:A | java:B') and 'java:1A,2B'::tsvector @@ to_tsquery('java:AB')? They look equivalent. Are they? thanks -- Ivan Sergio Borgonovo http://www.webthatworks.it
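For reference, my understanding of the layout (condensed from src/include/tsearch/ts_type.h of that era; the field names are real, the commentary is mine):

```c
/* A TSQuery is a single varlena blob:
 *
 *   int32     vl_len_;      -- total size in bytes (header included)
 *   int32     size;         -- number of QueryItems that follow
 *   QueryItem items[size];  -- the query tree in prefix (Polish) notation
 *   char      operands[];   -- the lexemes, each stored WITH its '\0'
 *
 * Each QueryItem is a union.  An operator item looks like
 *   { type = QI_OPR; oper = OP_AND / OP_OR / OP_NOT;
 *     left = offset (in QueryItems) back to the left operand }
 * and an operand item looks like
 *   { type = QI_VAL; weight; valcrc;   -- CRC of the lexeme
 *     length;                          -- lexeme bytes, '\0' excluded
 *     distance }                       -- byte offset from GETOPERAND(query)
 */
```

So the sketch in the message is essentially right; valcrc is the piece pushValue computes that has to be reproduced when building the structure by hand.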
Re: [HACKERS] building tsquery directly in memory (avoid makepol)
On Thu, 04 Feb 2010 22:13:02 +0300 Teodor Sigaev teo...@sigaev.ru wrote: Before doing it the trial and error way, can somebody just give me an example? I'm not sure about my interpretation of the comments in the documentation. TSQuery [skipped] Right, valcrc is computed in pushValue. Anyway, the structure I posted is correct, isn't it? Is there any macro equivalent to POSDATALEN or WEP_GETWEIGHT, or a macro to know the memory size of a TSQuery? I think I've seen macros that could help me determine the size of a TSQuery... but I haven't noticed anything like POSDATALEN that would come in very handy to traverse a TSQuery. I was thinking to skip pushValue and directly build the TSQuery in memory, since my queries have a very simple structure and are easy to reduce... Still, it is not immediate to know the memory size in advance. For OR queries it is easy, but for AND queries I'll have to loop over a tsvector, filter the weights according to a passed parameter, and see how many times I have to duplicate a lexeme for each weight. E.g. tsvector_to_tsquery('pizza:1A,2B risotto:2C,4D barolo:5A,6C', '&', 'ACD') should be turned into pizza:A & risotto:C & risotto:D & barolo:A & barolo:C. I noticed you actually loop over the tsvector in tsvectorout to allocate the memory for the string buffer, and I was wondering if it is really worth it for my case as well. Any good recipe in Moscow? ;) thanks -- Ivan Sergio Borgonovo http://www.webthatworks.it
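Knowing the memory size in advance for the AND case boils down to counting the (lexeme, weight) pairs that survive the filter. A standalone sketch of that count (plain C; the weight encoding is illustrative, not PostgreSQL's actual one), matching the pizza/risotto/barolo example, which yields 5 operands:

```c
#include <stdint.h>

/* Illustrative weight flags, one bit per tsvector weight class. */
enum { W_A = 8, W_B = 4, W_C = 2, W_D = 1 };

/* One tsquery operand is emitted per (lexeme, weight) pair that survives
 * the filter, so the total operand count is the number of set bits in
 * (weights & filter), summed over all lexemes. */
int
count_operands(const uint8_t *weights, int n, uint8_t filter)
{
    int total = 0;
    int i;

    for (i = 0; i < n; i++)
    {
        uint8_t w = weights[i] & filter;

        while (w)
        {
            total += w & 1;
            w >>= 1;
        }
    }
    return total;
}
```

With that count (and the summed lexeme lengths, each counted once per emitted operand), the whole TSQuery can be sized in one preliminary pass, the same way tsvectorout pre-sizes its string buffer.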
Re: [HACKERS] development setup and libdir
On Sun, 31 Jan 2010 22:24:40 +0000 Mark Cave-Ayland mark.cave-ayl...@siriusit.co.uk wrote: Ivan Sergio Borgonovo wrote: Of course I can write a script that can work around this. It seems that the only thing missing is that pgxs 8.3 used to prefix .so files with lib and then rename them at install time, while pgxs 8.4 builds them directly without the prefix. I'm just speculating this is the issue and it is the only one I still have to solve... but... what's going to happen next release? Wouldn't it be better if make install could install stuff where I ask, so I could put modules in different places *even* for the same installation of postgres? FWIW the soon-to-be-released PostGIS 1.5 has an out of place regression testing feature that allows PostGIS to be built using PGXS and regression tested without putting anything in the PostgreSQL installation directory itself. Thanks, I'll give it a look as soon as possible; now I'll try to concentrate on C development. I used something surely simpler. I wrote this small script and it actually does what I need:

#!/bin/bash
export USE_PGXS=1
make
MODULE_big=$(sed -ne '/MODULE_big/s/^MODULE_big[ \t]*=[ \t]*\([^ \t]*\)/\1/gp' Makefile)
so=$(ls -1 *$MODULE_big*.so)
sed -e 's#\$libdir[^'\'']*#'`pwd -P`'/'$so'#g' $MODULE_big.sql > $MODULE_big.l.sql
sed -e 's#\$libdir[^'\'']*#'`pwd -P`'/'$so'#g' uninstall_$MODULE_big.sql > uninstall_$MODULE_big.l.sql
psql test < $MODULE_big.l.sql
# some more stuff to test functions

And finally I have my first function working. Without other functions the extension isn't that useful yet, but I think I'll be able to build something useful. Thanks for the help, especially to RhodiumToad and klando on #postgresql. I'll try to write some documentation shortly. -- Ivan Sergio Borgonovo http://www.webthatworks.it
Re: [HACKERS] development setup and libdir
On Sun, 31 Jan 2010 02:44:13 -0200 Euler Taveira de Oliveira eu...@timbira.com wrote: Ivan Sergio Borgonovo escreveu: I'm pretty sure that what you're pointing at is not going to work unless you specify a bunch of other parameters. Ugh? Are you saying there is something wrong in our *official* documentation? It is just a normal compilation command; if you're a C programmer you'll have no problem. Well, I think if I didn't have to compile for 8.3 I'd be more naturally guided to a solution, since when make install does more than just moving files, you wonder if it is doing even more. I still don't know if make install is doing something more than moving/renaming files. Still, I think the documentation doesn't provide a reasonable path to *try* to write Hello world. Actually there isn't even a suggested path that works unless you knock at pgsql-hackers and ask for details. Of course, explanations of how to compile an extension manually without pgxs and a template makefile aren't sufficient. When I try something new I'd like to put myself in the most diffused, standard environment; e.g. one thing I'd like to avoid is choosing my own flavour of compile options. So... what's better than providing a template makefile? And there is one, and it works. But the postgres makefile reads module.sql.in and outputs module.sql. I'd consider module.sql.in part of the source of the project and I'd think it has to be relocatable without change. Now you use pgxs and it works great... You've got your .so there, you look at the pattern used in the .sql, you adapt it and it still works. Oh look! 8.3 changes the .so name at make install. You adapt it once more, but that makes you wonder... is make install doing something else?
It would be better if: a) you document what make install is really doing, or b) provided that contrib make install just moves stuff around, you let people specify at make time where they would like to install the lib, so a sensible module.sql is produced. b) looks easier to maintain and easier to use. But as said, I may have a naïve idea of what it really means to provide a working build system for many architectures/OSes. No rocket science indeed. I'm not crying; I've already written my simple script to just post-process module.sql. I'm describing my experience so you may be aware of the problems that newcomers face. It is up to the people who will write actual code/docs to see if it is worth it for them to do so. As soon as I feel comfortable enough to write something worthwhile, I'll write some docs. Once you ask... people direct you to pgxs. pgxs seems a good choice... a newcomer like me thinks... well, that was made exactly to pick up all the information it needs from the environment and build my module. Again, PGXS was developed to make packagers' life easier. Why on earth would I want to install my library in a path different from $libdir? Packagers don't. Besides, PGXS won't work on Windows unless you're using an installation built using MinGW. This is something like you bought a car to go from Paris to Berlin but you can't use it to go to Marseilles just because you don't have a map. pgxs and debian packages did a nearly perfect job even for my needs. Being able to compile and install an extension without a full dev environment and without being root has a non-empty set of reasonable applications. Making it easier to automate out of the box doesn't look like a bad thing. Sorry if it seemed I was complaining; I was trying to give feedback while learning my way to write Hello world. -- Ivan Sergio Borgonovo http://www.webthatworks.it
[HACKERS] development setup and libdir
I finally put enough code together that it has some chance of working. It's time to test it, and I'm going to test it quite frequently. My code is written and compiled in my ~/. Then it is moved through svn to a test box where it is compiled under another version of postgres, but still it ends up far away from anything pg related. Now it would be nice to be able to test it without any modification to the source (namely the sql.in file). But the kosher way to write one, and then be able to eventually package it into a deb or even make it reach contrib, would be to use $libdir. I can't make install (provided it works as expected; I didn't try yet) since I sincerely hope that my user doesn't have write rights on whatever is pointed to by $libdir. My *nix user is also a postgres superuser. One way would be to let the postgres user (or anyone) read in the dev dir, then export USE_PGXS=1; export someunknownparam=`pwd`; make, so that the resulting module.sql will point to the dev dir. Does that someunknownparam exist? Second choice would be to write a bash script that just does:

make
sed -e 's/\$libdir/'`pwd`'/g' module.sql > module.l.sql
sed -e 's/\$libdir/'`pwd`'/g' uninstall_module.sql > uninstall_module.l.sql
psql < sometestcode.sql

Not that bad... but if that someunknownparam existed it would be nicer and would avoid hard-coding the module name in 2 places (Makefile and bash script)[1] [1] OK I know I can parse the Makefile with sed... ;) thanks -- Ivan Sergio Borgonovo http://www.webthatworks.it
Re: [HACKERS] development setup and libdir
On Sat, 30 Jan 2010 11:06:02 -0500 Tom Lane t...@sss.pgh.pa.us wrote: Ivan Sergio Borgonovo m...@webthatworks.it writes: ... I can't make install (provided it works as expected, I didn't try yet) since I sincerely hope that my user doesn't have write right on whatever is pointed by $libdir. Why not? In order to install a C function, you have to be superuser, which means you already have the keys to the kingdom database-wise. There is no reason to not give a developer of C functions write access on $libdir. He could do whatever he wanted inside the C code anyway. It's not just about security, it's about maintenance. I may have more than one instance of pg running, but all will point to the same lib directory, and if I'm going to use distribution packages I don't want things that mix (and may be overwritten). Still, I would have to give write access to myself as a developer to a system directory I really shouldn't have access to (and on Debian it is root:root if I'm remembering right). Anyway, I'm just getting acquainted with the pg build system and today I learnt a lot on #postgresql. Again... as soon as I have something working, I'll post what I've learnt from the experience somewhere. thanks -- Ivan Sergio Borgonovo http://www.webthatworks.it
Re: [HACKERS] development setup and libdir
On Sat, 30 Jan 2010 11:06:02 -0500 Tom Lane t...@sss.pgh.pa.us wrote: Ivan Sergio Borgonovo m...@webthatworks.it writes: ... I can't make install (provided it works as expected, I didn't try yet) since I sincerely hope that my user doesn't have write right on whatever is pointed by $libdir. Why not? In order to install a C function, you have to be superuser, which means you already have the keys to the kingdom database-wise. There is no reason to not give a developer of C functions write access on $libdir. He could do whatever he wanted inside the C code anyway. It is becoming a more serious issue than I thought... Debian installs everything in /usr/lib/postgresql/8.3/lib/ as -rw-r--r-- 1 root root, so it would definitely be hard to write there. Still, I'd like to keep a standard installation of Debian and just compile my extension somewhere else where the postgres user can read. I just discovered that depending on the postgres version, .so files are created with different names (module.so vs libmodule.so) and then renamed. Since I'd like to use as much as possible of all the magic that pgxs and the provided makefiles do... and I may discover that the renaming is not the only change between versions, I'd prefer all this magic to happen using make install. I've read I can use prefix. From my understanding you can set prefix from configure, but: maybe you've just installed the debian -dev package; and running configure for each extension you'd like to place in a different path is just a pain, especially if you would really like to keep the configuration identical with the exception of the install dir. I've tried export USE_PGXS=1; make install /mydir and export USE_PGXS=1; make install prefix=/mydir with no success: make still tries to copy stuff into /usr/lib/postgresql/... What am I supposed to do to install modules where I want?
thanks -- Ivan Sergio Borgonovo http://www.webthatworks.it
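For what it's worth, one thing the stock PostgreSQL makefiles do support (standard GNU-style behaviour, though not mentioned at this point in the thread) is staging the install under DESTDIR, which needs no root access and leaves the system directories alone. Paths below are illustrative:

```shell
export USE_PGXS=1
make
# DESTDIR is prepended to every install path
make install DESTDIR=$HOME/pgstage
# the module then lands under something like
#   $HOME/pgstage/usr/lib/postgresql/8.3/lib/module.so
find "$HOME/pgstage" -name '*.so'
```

The CREATE FUNCTION script can then reference the staged .so by absolute path, which is what the sed-based workaround later in the thread does by hand.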
Re: [HACKERS] development setup and libdir
On Sat, 30 Jan 2010 16:51:44 -0500 Andrew Dunstan and...@dunslane.net wrote: Ivan Sergio Borgonovo wrote: It is becoming a more serious issue than what I thought... Debian install everything in /usr/lib/postgresql/8.3/lib/ -rw-r--r-- 1 root root so definitively it would be hard to write there. Still I'd like to keep a standard installation of Debian and just compile my extension somewhere else where the postgres user can read. [snip] What am I supposed to do to install modules where I want? pgxs is about building something that will install in the configured locations. If you don't want to install in the configured locations, don't build/install with pgxs. For development purposes you would be far better off building a private version of postgres (with configure --prefix=/path) and using its pgxs to build, install and test your module. That's pretty expensive. I mean... I just would like my .so to end up with the expected name somewhere else. I can put up a working postgres installation just using aptitude, and have a semi-functional[2] dev environment by installing the postgres -dev package. Requiring that I compile a private version of postgres[1] increases the cost of development unreasonably, considering that what's missing is being able to provide the install dir as a parameter during make install. I'm absolutely aware I can't ask for features unless someone is willing to implement them or is paid to... but the easier/cheaper it is to build up a dev/test environment, the more people will try to build/test something for postgres. Of course I can write a script that can work around this. It seems that the only thing missing is that pgxs 8.3 used to prefix .so files with lib and then rename them at install time, while pgxs 8.4 builds them directly without the prefix. I'm just speculating this is the issue and it is the only one I still have to solve... but... what's going to happen next release?
Wouldn't it be better if make install could install stuff where I ask, so I could put modules in different places *even* for the same installation of postgres? I'll write my script. I can do my homework and I'm not expecting someone else to make using pgxs nicer, but still it doesn't look like an unreasonable feature. I mean... what's so weird about being able to develop postgres modules without requiring people to build the whole postgresql or switch back and forth from root? And... it's really a pity... I mean... I'm so near to being able to build and test modules just by doing aptitude install postgresql postgresql-server-dev-8.4 [1] that's not just compiling: you've got to set it up, make it start at boot, make it work on a different port... There is already a lot of work made by distribution packagers. Why should it be wasted? [2] semi-functional... because I can't actually install modules with the provided tools where I'd like to have them. -- Ivan Sergio Borgonovo http://www.webthatworks.it
Re: [HACKERS] development setup and libdir
On Sat, 30 Jan 2010 18:32:58 -0500 Robert Haas robertmh...@gmail.com wrote: On Sat, Jan 30, 2010 at 5:36 PM, Ivan Sergio Borgonovo m...@webthatworks.it wrote: For development purposes you would be far better off building a private version of postgres (with configure --prefix=/path) and using its pgxs to build, install and test your module. That's pretty expensive. How? I rebuild the backend five or ten times a day when doing PG work. Doesn't seem like a big deal. I'm not scared about compile time. But considering that what's really missing between what I need and what I get is renaming a file... it's just a pain that I have to set up a whole new instance of postgres, install the whole source etc... I think most of you are core developers or people who have invested a reasonably large % of their time in postgres development. It makes sense to have a complete development environment[1]. But well, again... considering I'm a rename away from being able to compile and test an extension with my user privileges... it's just a pity it doesn't work that way out of the box. Another scenario: I could just svn update on a staging box, compile there and then move everything to production more easily. Not every shop is a multimillion dollar enterprise that can afford a dev/staging/production box for every version of postgres it is using. If you don't need to squeeze out every cpu cycle on a production box, you may be happy with a pre-packaged postgres. I admit my approach may be naïve considering the build system has to work on many architectures/OSes... so the problems that have to be solved just to install somewhere else may be more complicated, but still I think my need isn't that weird and could lower the entry point for many developers quite a bit. I've spent some time grepping through the makefiles to see if I could propose a patch. From my point of view, passing a parameter at make install time would make things much better.
Another thing I had to tweak was the path in the .sql.in file (that one should be taken from a parameter too). Ideally, once you write the source code... where/how you're going to use it should just depend on parameters passed at make time. Now my main concern is making my C code work in a reasonably decent development environment. I hope that if I ever succeed in taking this project to an end before being forced to take care of other stuff, my code or my documented experience will come in useful to others. Trying to understand how pgxs works may be my next effort; right now I'll use a workaround, since just being able to build and load my modules wherever I want is going to make *my* development experience much more manageable. I still think that *my* need is not that weird. Now let's see if I can come up with a useful module. At least one other user of postgres has shown interest in what I was trying to do on pgsql-general. Next step in my postgres awareness may be using peg. Thanks Greg. [1] and yeah, I'll need dbg symbols, but that's going to happen later. -- Ivan Sergio Borgonovo http://www.webthatworks.it
Re: [HACKERS] development setup and libdir
On Sat, 30 Jan 2010 22:25:32 -0200 Euler Taveira de Oliveira eu...@timbira.com wrote: Ivan Sergio Borgonovo escreveu: That's pretty expensive. Ugh? I mean... I just would like my .so to end up with the expected name somewhere else. It's just two command lines [1]. Consider that I'm absolutely new to postgres C development. I'm pretty sure that what you're pointing at is not going to work unless you specify a bunch of other parameters. I have to count the time I'll spend learning what is needed, and whenever I stumble into a bug/unexpected behaviour I'll have to take into account the time I'll spend wondering if I missed passing some parameter that caused that behaviour. Then I'm not aware of all the magic that pgxs and the makefiles are doing. Once you ask... people direct you to pgxs. pgxs seems a good choice... a newcomer like me thinks... well, that was made exactly to pick up all the information it needs from the environment and build my module. And it works and it is simple. Wouldn't it be better if make install could install stuff where I ask, so I could put modules in different places *even* for the same installation of postgres? Why do you want to mess up your installation? Besides, you will need to add the install path to dynamic_library_path and/or use the absolute path in your CREATE FUNCTION script. Man, that's too much work for a packager! Of course, you can always build your own way; that's up to you. I mean... what's so weird about being able to develop postgres modules without requiring people to build the whole postgresql or switch back and forth from root? Ugh? You don't need root; I mean, you can always set an absolute path or even the dynamic_library_path [2] parameter. An absolute path requires that *the code* be different for every installation. dynamic_library_path looks more interesting if it spans just a connection, but still, different versions of pgxs give different names to the .so.
If the absolute path could be set as a parameter passed to make, that would be better. Anyway, pg 8.3 make install changes the names of libraries, so passing the absolute path depends on the version on which you're building. So it's not enough to write a post-processing script that seds the install.sql. As said, I'm able to write a script that checks whether libmodule.so or module.so was generated and then fixes the install.sql[1], but still that's a workaround that may break in future versions... maybe postgres 9.0 will build libraries in a subdirectory of the source and then install will rename and move them with a different pattern. I think the most compelling reason for building postgres from source, for people who would like to occasionally develop modules, is having debug symbols. Maybe my point of view really sounds alien to many on the hackers list, since the learning curve to develop in C with postgres seems pretty selective, so people here have already decided to invest pretty much of their time into postgres... but many people might just like to try. There are several shades of gray between screwing up my system postgres and testing using a private installation. That's exactly why I'd like to keep my modules out of the system dir but still use the system postgres. I'm developing my module to see if it can serve a need of another project. What should I do then? Make a private installation of apache and php as well? Develop my module on a private instance, then deploy it in my test env with the system postgres for the other project, then move it to the staging box... Costs add up, and right now I'd just like to see if I can come up with something useful. [1] that's exactly what I'm going to do. When I have a working module I'll see if that's enough for me, or whether I want to do more complex work and need a more advanced development environment.
-- Ivan Sergio Borgonovo http://www.webthatworks.it
[HACKERS] helpers to convert C types to postgres types (Array)
I'm still trying to collect all the bits to be able to read and return several types of data in C functions. I'm looking for quick ways to deal with ArrayType. I'd expect some helpers, because these kinds of operations should be frequent, and without any helper (function/macro) they really make the code awful. Generally you work with C types that you later have to convert to Postgres types. So for example you may have an array of int2 that you then have to place into an ArrayType. I think there are 3 kinds of operation you may have to do:

1. You may have an already formed C array that you'd just copy into an ArrayType.
2. You may know the # of elements of the C array but you're filling it an element at a time.
3. You don't know the number of elements in advance, so you'd like to append to the ArrayType one element at a time.

1 seems to require: creating an array of Datum, looping over the C array, assigning each Datum element the converted C value, and construct_array-ing the Postgres array. That's a pain. Any other way? A macro? 2 seems the easiest. 3??? Is there any function in postgres that lets you append elements to an ArrayType? thanks -- Ivan Sergio Borgonovo http://www.webthatworks.it
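For case 1 with a built-in fixed-width element type, the Datum loop plus construct_array is indeed the standard route, but the typlen/typbyval/typalign triple can be hard-coded rather than looked up, since it is fixed for a type like int4. A sketch (8.x-era API; the wrapper's name is made up):

```c
#include "postgres.h"
#include "utils/array.h"
#include "catalog/pg_type.h"

/* Wrap a C array of int32 into a one-dimensional int4[] (case 1).
 * For a built-in type like int4 the typlen/byval/align values are
 * fixed, so get_typlenbyvalalign() can be skipped. */
static ArrayType *
make_int4_array(const int32 *vals, int nvals)
{
    Datum *elems = (Datum *) palloc(nvals * sizeof(Datum));
    int    i;

    for (i = 0; i < nvals; i++)
        elems[i] = Int32GetDatum(vals[i]);

    /* int4: length 4, pass-by-value, 'i' (int) alignment */
    return construct_array(elems, nvals, INT4OID, 4, true, 'i');
}
```

For case 3, the array-aggregate support functions accumArrayResult() and makeArrayResult() (see src/backend/utils/adt/arrayfuncs.c) grow an array state one element at a time, which is the closest thing to an append helper in the backend.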
[HACKERS] returning array in a field together with other types
I'm trying to return a set of records as: (text, int[]), (text, int[]), ... rows from C, and I really would like to avoid using BuildTupleFromCStrings. So this is how I think my function should end...

char *curout; /* cstring */
int4 *pos;
...
get_typlenbyvalalign(INT4OID, &typlen, &typbyval, &typalign);
v_pos = construct_array(pos, npos, INT4OID, typlen, typbyval, typalign);
values[0] = GET_TEXT(curout); /* correct? */
values[1] = PointerGetDatum(v_pos);
resultTuple = heap_form_tuple(inter_call_data->tupd, values, nulls);
result = HeapTupleGetDatum(resultTuple);
SRF_RETURN_NEXT(fctx, result);
...

I've seen that GET_TEXT(curout) just calls textin. I wonder if the other GET_ macros aren't really converting to a text representation and then back to a Datum... so there would be no real difference from using BuildTupleFromCStrings. BTW, could I pass static values to construct_array, since these are basic-type arrays so the size etc. should be known at compile time, instead of having to call get_typlenbyvalalign? thanks -- Ivan Sergio Borgonovo http://www.webthatworks.it
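One catch worth noting in the fragment above: construct_array expects a Datum array, not a raw int4 pointer, so the pos values need wrapping first; and cstring_to_text (available since 8.3) is a more direct route than GET_TEXT. A hedged sketch reusing the poster's variable names (pos_datums is introduced here for illustration):

```c
/* Sketch: wrap the raw int4 values in Datums before construct_array. */
Datum *pos_datums = (Datum *) palloc(npos * sizeof(Datum));
int    i;

for (i = 0; i < npos; i++)
    pos_datums[i] = Int32GetDatum(pos[i]);

v_pos = construct_array(pos_datums, npos, INT4OID,
                        typlen, typbyval, typalign);

values[0] = PointerGetDatum(cstring_to_text(curout)); /* text column   */
values[1] = PointerGetDatum(v_pos);                   /* int4[] column */
```

And yes, for a built-in fixed type like int4 you can hard-code the parameters (typlen 4, byval true, align 'i') instead of calling get_typlenbyvalalign at run time.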
Re: [HACKERS] C function accepting/returning cstring vs. text
On Mon, 25 Jan 2010 16:36:46 -0600 Kevin Grittner kevin.gritt...@wicourts.gov wrote: Ivan Sergio Borgonovo m...@webthatworks.it wrote: The README files might be a good place to start, then browse code. Is there a book? The more I read the source and the little info about it, the more I have questions that should have been answered by documenting the function or data structure, instead of my looking for some code that uses it and seeing if I can infer what it expects, what the best context to use it in is, whether there are better candidates to do the same thing, etc... - example of returning text (especially if the output is larger than the input) - what should I take care of for encoding (what's really inside a tsvector, text, cstring)? src/backend/utils/fmgr/README I just learned there is a "return all rows" mode for set-returning functions: There are currently two modes in which a function can return a set result: value-per-call, or materialize. In value-per-call mode, the function returns one value each time it is called, and finally reports done when it has no more values to return. In materialize mode, the function's output set is instantiated in a Tuplestore object; all the values are returned in one call. Additional modes might be added in future. There is no example of a function working in this mode. I'd guess it should be suited for quick operations and small returns. But... what should be considered quick and small? When should one use a row-at-a-time function versus a return-all-rows function? I haven't been able to understand the difference between functions returning cstring and text, and whether there is any need to be careful about encoding and escaping when copying from the lexeme to a buffer that will return a cstring or text. Is there any difference between a function returning text and a function returning cstring? Can I make direct changes to input parameters and return pointers to internal parts of these structures?
Or should I always allocate my char*, make the modification there and then return my copy? Is there any operation that should take care of encoding when dealing with cstring or text? thanks -- Ivan Sergio Borgonovo http://www.webthatworks.it
Re: [HACKERS] C function accepting/returning cstring vs. text
On Wed, 27 Jan 2010 14:44:02 +0100 Martijn van Oosterhout klep...@svana.org wrote: On Wed, Jan 27, 2010 at 02:14:36PM +0100, Ivan Sergio Borgonovo wrote: I haven't been able to understand the difference between functions returning cstring and text, and whether there is any need to be careful about encoding and escaping when copying from the lexeme to a buffer that will return a cstring or text. Well, the difference is that one is a cstring and the other is text. Seriously though, text is more useful if you want people to be able to use the result in other functions, since on the SQL level almost everything is text. cstring is needed for some APIs but it is generally not used unless necessary. I didn't get it. Maybe I really chose the wrong function as an example (tsvectorout). What's not included in "on the SQL level almost everything is text"? There are a lot of functions in contrib taking cstring input and returning cstring output. Are they just in the same special class of [type]in, [type]out, [type]recv... functions? I have to re-read http://www.postgresql.org/docs/8.4/static/xfunc-c.html carefully, since I discovered there may be explanations about text buffers etc... I discover there is a cstring_to_text function... and a text_to_cstring_buffer too... let me see if I can find something else... thanks -- Ivan Sergio Borgonovo http://www.webthatworks.it
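To make the boundary concrete: cstring_to_text and text_to_cstring (both in src/backend/utils/adt/varlena.c since 8.3) are the usual bridges between the two worlds. A toy sketch of a text-in/text-out V1 function (the function name is hypothetical, and the byte-wise uppercasing is ASCII-only for simplicity; it is not multibyte-safe):

```c
/* Sketch: text in, text out, via a palloc'd cstring working copy. */
#include "postgres.h"
#include "fmgr.h"
#include "utils/builtins.h"

PG_FUNCTION_INFO_V1(shout);

Datum
shout(PG_FUNCTION_ARGS)
{
    text *in  = PG_GETARG_TEXT_PP(0);
    char *buf = text_to_cstring(in);   /* palloc'd, NUL-terminated copy */
    int   i;

    for (i = 0; buf[i]; i++)           /* ASCII-only transformation */
        buf[i] = pg_toupper((unsigned char) buf[i]);

    PG_RETURN_TEXT_P(cstring_to_text(buf));
}
```

Note that text_to_cstring hands you a modifiable copy, which also answers the question above about changing input parameters in place: don't modify the input datum; work on the copy.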
Re: [HACKERS] C function accepting/returning cstring vs. text
On Wed, 27 Jan 2010 21:41:02 +0800 Craig Ringer cr...@postnewspapers.com.au wrote: I don't code on PostgreSQL's guts, so I'm perhaps not in the best position to speak, but: - Documentation has a cost too, particularly a maintenance cost. Outdated docs become misleading or downright false and can be much more harm than good. So a reasonable balance must be struck. I'm not saying PostgreSQL is _at_ that reasonable balance re its internal documentation, but there is such a thing as over-documenting. Writing a small book on each function means you have to maintain that, and that gets painful if code is undergoing any sort of major change. I'd be willing to pay for a book. - It's easy to say "should" when you're not the one writing it. Personally, I try to say "hey, it's cool that I have access to this system, and isn't it great I even have the right to modify it to do what I want", even though the learning curve _can_ be pretty steep. Well... I tend to generally make available to others everything I learn. It'd be nice to have a more advanced use of doxygen, so it would be easier to have a map of the functions. Hey, you could contribute yourself - patch some documentation into those functions where you find that reading the source isn't clear enough, and they really need a "see also" or "called from" comment or the like. Right now I don't have enough knowledge to hope my notes get into the source code. Once I have a working piece of code I'll put the information I gathered in the process on my web site, and if someone finds it worthy of a better place I'll release it with a suitable license. -- Ivan Sergio Borgonovo http://www.webthatworks.it
Re: [HACKERS] C function accepting/returning cstring vs. text
On Wed, 27 Jan 2010 11:49:46 -0300 Alvaro Herrera alvhe...@commandprompt.com wrote: There are a lot of functions in contrib taking cstring input and returning cstring output. Are they just in the same special class of [type]in, [type]out, [type]recv... functions? Probably, but I didn't check. Does this roughly translate to: "nothing you should care about right now, and anyway just functions that won't return results to SQL"? thanks -- Ivan Sergio Borgonovo http://www.webthatworks.it
Re: [HACKERS] C function accepting/returning cstring vs. text
On Wed, 27 Jan 2010 10:10:01 -0500 Andrew Dunstan and...@dunslane.net wrote: There are quite a few SRF functions in the code. Look for example in contrib/hstore/hstore_op.c for some fairly simple examples. SRFs are quite capable of returning huge result sets, not just small ones. Example code for materialize mode can be found in the PLs among other places (e.g. plperl_return_next()). I'm more interested in understanding when I should use materialize mode, e.g. should I be more concerned about memory or CPU cycles, and what should be taken as the threshold for considering memory needs large? If for example I was going to split a large TEXT into a set of records (let's say I'm processing csv that has been loaded into a text field)... I'd consider the CPU use light but the memory needs large. Would this task be suited to materialize mode? Is there a rule of thumb to choose between one mode and the other? thanks -- Ivan Sergio Borgonovo http://www.webthatworks.it
Re: [HACKERS] C function accepting/returning cstring vs. text
On Wed, 27 Jan 2010 17:37:23 +0200 Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Currently, there's no difference in terms of memory needs. The backend always materializes the result of a SRF into a tuplestore anyway, if the function didn't do it itself. There has been discussion of optimizing away that materialization step, but no-one has come up with an acceptable patch for that yet. There probably isn't much difference in CPU usage either. On Wed, 27 Jan 2010 10:34:10 -0500 Andrew Dunstan and...@dunslane.net wrote: If you don't know your memory use will be light, use materialize mode. For small results the data will still be in memory anyway: the Tuplestore will only spill to disk if it grows beyond a certain size. I keep on missing something. Considering all the context-switching stuff, the first-call initialization etc., I'd expect that whatever collects the result would push it downstream a bit at a time. What's actually happening is... the result gets collected in a Tuplestore anyway. But then... why do we have all that logic to save the function context, if it is more convenient to process everything in one run anyway? It's a pain to save the context just to save a pointer inside a structure; it would be more convenient to just process the whole structure and return it as a Tuplestore in one pass. BTW thanks to Greg for pointing me once more to Joe Conway's tutorial. When I read it the first time I wasn't in a condition to take advantage of it. Now it looks more useful. -- Ivan Sergio Borgonovo http://www.webthatworks.it
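The materialize-mode pattern being discussed can be sketched as follows. This is a hedged outline of the idiom used by contrib modules and the PLs, not authoritative code; the function name is hypothetical, and it must be declared at the SQL level as returning a composite set (e.g. SETOF some row type):

```c
/* Sketch of a materialize-mode SRF: build the whole result in a
 * Tuplestore during a single call. Compiles only as a backend extension. */
#include "postgres.h"
#include "fmgr.h"
#include "funcapi.h"
#include "miscadmin.h"

PG_FUNCTION_INFO_V1(three_ints);

Datum
three_ints(PG_FUNCTION_ARGS)
{
    ReturnSetInfo   *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
    TupleDesc        tupdesc;
    Tuplestorestate *store;
    MemoryContext    oldcxt;
    int              i;

    if (rsinfo == NULL || !(rsinfo->allowedModes & SFRM_Materialize))
        elog(ERROR, "materialize mode required");

    /* build the result store in per-query memory so it outlives the call */
    oldcxt = MemoryContextSwitchTo(rsinfo->econtext->ecxt_per_query_memory);
    if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
        elog(ERROR, "return type must be a row type");
    store = tuplestore_begin_heap(true, false, work_mem);
    rsinfo->returnMode = SFRM_Materialize;
    rsinfo->setResult = store;
    rsinfo->setDesc = tupdesc;
    MemoryContextSwitchTo(oldcxt);

    for (i = 0; i < 3; i++)
    {
        Datum values[1];
        bool  nulls[1] = {false};

        values[0] = Int32GetDatum(i);
        tuplestore_putvalues(store, tupdesc, values, nulls);
    }
    return (Datum) 0;   /* the result travels via the tuplestore */
}
```

Note how this matches Heikki's point: in value-per-call mode the backend ends up building essentially the same tuplestore for you, so the practical difference is mostly how much per-call bookkeeping code you write.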
Re: [HACKERS] C function accepting/returning cstring vs. text
On Wed, 27 Jan 2010 22:06:43 +0200 Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Ivan Sergio Borgonovo wrote: But then... why do we have all that logic to save the function context, if it is more convenient to process everything in one run anyway? It's a pain to save the context just to save a pointer inside a structure; it would be more convenient to just process the whole structure and return it as a Tuplestore in one pass. When the set-returning-function feature was written originally, years ago, the tuple-at-a-time mode did really work tuple at a time. But it had issues and was axed out of the patch before it was committed, to keep it simple. The idea was to revisit it at some point, but it hasn't bothered anyone enough to fix it. It's basically not implemented yet. Summing it up: 1) tuple at a time should theoretically be better and future-proof once someone writes the missing code, but the code is still not there; 2) practically there is no substantial extra cost in returning tuple at a time. Is 2) really true? It seems that materialize mode is much simpler. It requires a lot less code and it doesn't force you to save local variables and then restore them every time. So does it still make sense to get an idea of when the returned data set and the complexity of the computation really fit value-per-call or materialize mode? What could happen between function calls in value-per-call mode? Would value-per-call still offer a chance for postgresql/the OS to better allocate CPU cycles/memory? thanks -- Ivan Sergio Borgonovo http://www.webthatworks.it
[HACKERS] C function accepting/returning cstring vs. text
I'm having a hard time trying to understand how everything should be done in C extensions. Now I'm trying to understand how and when I should accept/return cstring vs. text, and whether and how I should take care of any encoding. I'm trying to output the content of a tsvector as a set of records (pos, weight, text|cstring). I've found tsvectorout, which should be educational enough. But I still miss details about when/why to use cstring vs. text, and about encoding. Is there any part of the code from which I could learn about: - memory allocation for both cstring and text - examples of returning text (especially if the output is larger than the input) - what should I take care of for encoding (what's really inside a tsvector, text, cstring)? I don't know if pgsql-hackers is the right place to ask, and whether these kinds of questions are considered homework, but without any reference and knowledge of the code base it's pretty hard even to find prototype/sample code. thanks -- Ivan Sergio Borgonovo http://www.webthatworks.it
[HACKERS] Cstring vs. Datum values ( BuildTupleFromCStrings vs. BlessTupleDesc)
Hi, I'm trying to write a couple of C functions that will: - return a tsvector as a set of records - turn a tsvector into a tsquery. I'm looking at the example function for returning sets of records http://www.postgresql.org/docs/8.4/static/xfunc-c.html#AEN44970 and at this: "There are two ways you can build a composite data value (henceforth a tuple): you can build it from an array of Datum values, or from an array of C strings that can be passed to the input conversion functions of the tuple's column data types. In either case, you first need to obtain or construct a TupleDesc descriptor for the tuple structure. When working with Datums, you pass the TupleDesc to BlessTupleDesc, and then call heap_form_tuple for each row. When working with C strings, you pass the TupleDesc to TupleDescGetAttInMetadata, and then call BuildTupleFromCStrings for each row. In the case of a function returning a set of tuples, the setup steps can all be done once during the first call of the function." And I can't understand if I can avoid transforming everything into its text representation:

snprintf(values[0], 16, "%d", 1 * PG_GETARG_INT32(1));
snprintf(values[1], 16, "%d", 2 * PG_GETARG_INT32(1));
snprintf(values[2], 16, "%d", 3 * PG_GETARG_INT32(1));

And if I can... how, why and when... because I didn't find any clear example in the source tree that gives me a clue about when I'd use one form or the other. To my understanding, the way to go to use Datums is:

/* create a template tupledesc specifying the number of columns (e.g. 3) */
tupdesc = CreateTemplateTupleDesc(3, false); /* ??? bool hasoid */
/* configure the fields */
TupleDescInitEntry(tupdesc, (AttrNumber) 1, "word", TEXTOID, -1, 0);
/* ?? Oid oidtypeid, // where can I get a list of OIDs?
   int32 typmod, int attdim */
/* ??? */
funcctx->tuple_desc = BlessTupleDesc(tupdesc);
funcctx->attinmeta = TupleDescGetAttInMetadata(tupdesc);

-- Ivan Sergio Borgonovo http://www.webthatworks.it
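Filling in the missing pieces of the setup above (the type OIDs come from catalog/pg_type.h, which answers the "where can I get a list of OIDs" question), a Datum-based first-call setup might look like this. This is a sketch under the assumption of a (word text, pos int4[], weight text) result row; only one of BlessTupleDesc or TupleDescGetAttInMetadata is needed, depending on whether you build tuples from Datums or from C strings:

```c
/* Sketch: first-call setup for a Datum-based SRF (backend extension only). */
#include "postgres.h"
#include "funcapi.h"
#include "access/tupdesc.h"
#include "catalog/pg_type.h"
#include "utils/builtins.h"

/* ... inside the SRF_IS_FIRSTCALL() block: */
TupleDesc tupdesc;

tupdesc = CreateTemplateTupleDesc(3, false);    /* 3 columns, no OID column */
TupleDescInitEntry(tupdesc, (AttrNumber) 1, "word",   TEXTOID,      -1, 0);
TupleDescInitEntry(tupdesc, (AttrNumber) 2, "pos",    INT4ARRAYOID, -1, 0);
TupleDescInitEntry(tupdesc, (AttrNumber) 3, "weight", TEXTOID,      -1, 0);

funcctx->tuple_desc = BlessTupleDesc(tupdesc);  /* Datum route: no attinmeta */

/* ... then per row, in later calls: */
values[0] = PointerGetDatum(cstring_to_text(word));
tuple = heap_form_tuple(funcctx->tuple_desc, values, nulls);
SRF_RETURN_NEXT(funcctx, HeapTupleGetDatum(tuple));
```

The Datum route avoids the snprintf round trip through text entirely: each column value goes in as a native Datum (Int32GetDatum, a text pointer, an array pointer) rather than as a string fed back through the type's input function.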
Re: [HACKERS] Re: Cstring vs. Datum values ( BuildTupleFromCStrings vs. BlessTupleDesc)
On Sat, 23 Jan 2010 15:54:04 +0000 Greg Stark gsst...@mit.edu wrote: On Sat, Jan 23, 2010 at 12:56 PM, Ivan Sergio Borgonovo m...@webthatworks.it wrote: And if I can... how, why and when... because I didn't find any clear example in the source tree that gives me a clue about when I'd use one form or the other. There are a few contrib modules which make good examples; you could look at contrib/pageinspect/heapfuncs.c:heap_page_items() for example. That confused me further, since it seems to be a 3rd technique for returning sets of records. One way is given for example in ./src/backend/utils/adt/tsvector_op.c, which uses BlessTupleDesc; another, which seems more frequent, is given in the docs: http://www.postgresql.org/docs/8.4/static/xfunc-c.html#AEN44970; and finally there is the one you gave as an example. I can't understand when each one should be used, and I can't understand why there are so many ways to return a tuple... not to mention that I have to reverse-engineer many of the parameters of the functions involved. Could someone give me a clue about when each of the above techniques is most suitable? thanks -- Ivan Sergio Borgonovo http://www.webthatworks.it
Re: [HACKERS] Simple row serialization?
On 27/01/2008, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: On Sat, Jan 26, 2008 at 07:32:35PM +0100, Ivan Voras wrote: Andrew Dunstan wrote: [...] Or you could use an hstore (see contrib). Doesn't seem applicable. Have a closer look: it might :-) I found these documents on hstore: http://developer.postgresql.org/pgdocs/postgres/hstore.html http://www.sai.msu.su/~megera/wiki/Hstore From them, it seems hstore is another datatype (as in: 'a=>b'::hstore), which sounds good (though if it were me I'd have picked a different name for it, like dict or hashmap :) ) for storing both field names and their values, but I don't see a way to convert a row/record passed to a trigger to an hstore.
[HACKERS] Simple row serialization?
Hi, I'd like to implement some simple data logging via triggers on a small number of infrequently updated tables, and I'm wondering if there are some helpful functions, plugins or idioms that would serialize a row (received for example in an AFTER INSERT trigger) into a string that I'd store in the log table. There's a limited number of field types involved: varchars, integers and booleans. I'm not looking for anything fancy; a comma-separated string result will be just fine. Even better, something like a dictionary (field_name:field_value,...) would be nice. The reason for trying to do it this way is that I don't want to create separate log tables for every table I wish to log.
Re: [HACKERS] Simple row serialization?
Pavel Stehule wrote: Hello, use plperl. I'd like something more lightweight to reduce the complexity of deployment; something in plpgsql would be ideal. Is there a way to simply iterate over the fields of a row and retrieve field names and values? PostgreSQL doesn't have a simple tool for it. Maybe this link will be useful for you: http://www.ciselant.de/projects/pg_ci_diff/ Thanks, this is very interesting work! It's overkill for my current needs but I'll keep it in mind.
Re: [HACKERS] Simple row serialization?
Andrew Dunstan wrote: Why not create the audit tables with composite types rather than strings? create table audit_foo_table (who text, "when" timestamptz, old foo, new foo); Because this would lead to having a log/shadow/audit table for every table I wish to log. (Or is there an opaque generic row data type? record and any generate syntax errors.) Or you could use an hstore (see contrib). Doesn't seem applicable. Or you could possibly use some of the XML support in 8.3 for serialization. I need this for 8.1 :) This is a usage question, so it really doesn't belong on -hackers. Thank you - I'm reading the list through gmane and I didn't notice that its name, gmane.comp.db.postgresql.devel.general, is incorrect. I'll find a more suitable list.
Re: [HACKERS] Simple row serialization?
Tom Lane wrote: Pavel Stehule [EMAIL PROTECTED] writes: On 26/01/2008, Tom Lane [EMAIL PROTECTED] wrote: Although this won't work at the SQL level in 8.1, I think you might be able to accomplish the equivalent within plpgsql by assigning the rowtype value to a text variable. you lose the names :(. And datatype info too; but AFAICT that's exactly what the OP wants. Thanks! Having names would be great, but this is sufficient for now. (I've tested it and it works!)
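Tom's suggestion can be sketched as a trigger function like the following. This is a hedged illustration, not the OP's actual code: the audit_log table and its columns are hypothetical, and the serialized form is the row's text representation, (v1,v2,...), with names and types lost as discussed:

```sql
-- Sketch: serialize a row by assigning the rowtype to a text variable.
-- Assumes a pre-created table: audit_log(tab text, data text).
CREATE OR REPLACE FUNCTION log_row() RETURNS trigger AS $$
DECLARE
    txt text;
BEGIN
    txt := NEW;    -- rowtype flattened to its text form: (val1,val2,...)
    INSERT INTO audit_log(tab, data) VALUES (TG_RELNAME, txt);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER foo_audit AFTER INSERT ON foo
    FOR EACH ROW EXECUTE PROCEDURE log_row();
```

One generic trigger function then serves every logged table, which was the original goal of avoiding a separate log table per source table.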
[HACKERS] Google SoC idea: FTS support in GUI tools
Hello, There are many users (especially novices) asking for better and easier configuration of tsearch2, which is quite complicated currently. Even the new FTS features in 8.3 (e.g. the much clearer configuration with SQL commands) won't help much for users who want a simple several-clicks setup of a small system with full text indexing. Adding support for nice, user-friendly FTS configuration to popular PostgreSQL GUI administration tools would help a lot to spread PostgreSQL among a wide group of potential users. Also, certain GUI features can be helpful even for experienced users. That is why I would like to propose the idea of implementing FTS configuration functionality in GUI tools in the frame of Google SoC 2007. The basic plan of the work is to code all interfaces and debug ideas first in phpPgAdmin and then re-implement the ready solutions in pgadmin3. I consider the features below the most useful and would like to hear any comments on what to change/add in this list. - interfaces to add/modify FTS configurations, dictionaries and tokens with their processing rules (with drop-down menus to exclude typos, and other useful features) - user-friendly visualisation of ts_debug() to simplify understanding of what processing happened to the input text, and switches of the configuration against which we test the input - FTS operators in the usual phpPgAdmin table select interface - online query rewriting (a special table to store rewrite rules and test whether a given rule works as expected); this is especially helpful because it does not require reindexing - a possible interface to index statistics and heuristics to understand whether the index is healthy (to be discussed with Oleg) - user-friendly index creation (tsvector column + FTS index + trigger on the parent text field) - ability to check whether FTS is configured properly.
This includes: * some knowledge contained in the interface to perform tests * possibly several FTS core features, like a check that dictionary files exist and are accessible (will be proposed before the 8.3 feature freeze, to be discussed with Oleg) * ability to reset the configuration and re-parse all dictionary files - all user actions should generate on demand valid SQL suitable for copy-paste into other clients - possibly implement a simple PHP interface for text searches with themes support, to simplify building FTS-enabled web interfaces (just copy-paste this PHP code and one line of HTML form code into your site and switch on FTS on the given table in phpPgAdmin) Open questions: - what to change/add in this list? - should we support the old tsearch2 configuration for backward compatibility with versions prior to 8.3? This seriously increases the amount of work to be done. - should we analyze the FTS configuration experience and interfaces existing in $$$ databases? One of the advantages of this work would be independence from the PostgreSQL 8.3 release, which does not fit into the SoC schedule. It means that one does not need to wait till the next PostgreSQL release, since the SoC results can be released just after it finishes, with the nearest phpPgAdmin release. Any feedback is welcome. Best regards, Ivan Zolotukhin
Re: [HACKERS] Fwd: pg_dump VS alter database ... set search_path ...
Tom, Can you please suggest a good practice for propagating such DB settings into dumps? I also suffer from this: my DB currently has 5 schemas and the application strongly depends on the search_path. I cannot dump the whole cluster; I need only 1 specific database. At the moment I use an ugly solution and store the search_path setting as per-user settings in my secondary databases. Nikolay's solution, improved for backward compatibility (an additional switch for pg_dump to include the ALTER DATABASE statements with these settings in the generated SQL dump), would fit me perfectly. But unfortunately you're not being constructive in your criticism here and do not propose a way to solve the problem, only saying that this (very useful and awaited option!) is ugly. With an approach like this the community will wait for the solution for ages. :-( On 10/9/06, Tom Lane [EMAIL PROTECTED] wrote: Nikolay Samokhvalov [EMAIL PROTECTED] writes: What is the reason not to include database settings (like search_path) in a database dump created with pg_dump -C? Duplication of code and functionality with pg_dumpall. I'd want to see some thought about how to resolve that, not just a quick copy-some-code-from-pg_dumpall-into-pg_dump. You also need to explain why this issue should be treated differently from users and groups... a dump won't restore correctly without that supporting context either. I have no objection to rethinking the division of labor between the two programs, but let's end up with something that's cleaner, not uglier. regards, tom lane
[HACKERS] select function search_path problem/bug?
hi, i have a problem with selecting a function:

db=# select auxilium.exists('arx.mods', 'r');
 exists
--------
 t
(1 row)

db=# select exists('arx.mods', 'r');
ERROR: syntax error at or near 'arx.mods' at character 15

db=# show search_path;
 search_path
---------------------------
 auxilium, pg_catalog, arx
(1 row)

Why do I have to give the namespace name when I already have it in the search_path? (it's in pg 7.4.3 and 8.0) thanks
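The search_path is not at fault here: EXISTS is a reserved key word in SQL, so the unqualified call is parsed as the EXISTS subquery predicate, never as a function lookup. Schema-qualifying or double-quoting the name forces it to be treated as an identifier:

```sql
-- EXISTS is a reserved word, so bare exists(...) is parsed as the
-- SQL EXISTS predicate rather than resolved via search_path.
SELECT auxilium.exists('arx.mods', 'r');  -- works: explicit qualification
SELECT "exists"('arx.mods', 'r');         -- works: quoting makes it an identifier
```

Renaming the function to something that is not a key word avoids the collision altogether.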
[HACKERS] query INSERT OR REPLACE
Hi, In SQLite or MySQL there is an INSERT OR REPLACE statement; is there something like this in postgres, or could there be?
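There was no single-statement equivalent in PostgreSQL at the time; the documentation's classic answer is a plpgsql retry loop (table and function names here are the example's, not a built-in):

```sql
-- Emulating INSERT OR REPLACE with a retry loop (the "merge_db"
-- example from the PostgreSQL documentation, names hypothetical).
CREATE FUNCTION merge_db(k INT, v TEXT) RETURNS VOID AS $$
BEGIN
    LOOP
        UPDATE db SET b = v WHERE a = k;
        IF found THEN
            RETURN;
        END IF;
        BEGIN
            INSERT INTO db(a, b) VALUES (k, v);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            -- another session inserted concurrently; loop and retry the UPDATE
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;
```

Much later, PostgreSQL 9.5 added INSERT ... ON CONFLICT DO UPDATE, which covers this use case natively.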
[HACKERS] process num.
hi, where can I set the min/max number of processes that are waiting for connections from clients? bye
[HACKERS] create type with like
hi, we can create a table using the LIKE syntax, so I think we could also create a type with LIKE; it's quite the same, no? bye
[HACKERS] md5 function
Can you create a second md5 function, like text md5(text string, varchar salt)? This is used for shadow passwords and gives more combinations for the same password. thanks, bye ivan
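Salted password hashing is already covered by the pgcrypto contrib module's crypt() and gen_salt(), which may be a better fit than a two-argument md5 (in that era pgcrypto was installed from the contrib SQL script; CREATE EXTENSION is the modern route, and stored_hash below is a hypothetical column):

```sql
-- pgcrypto's crypt()/gen_salt() provide salted, verifiable password hashes.
CREATE EXTENSION pgcrypto;   -- modern installs; older ones ran pgcrypto.sql

SELECT crypt('secret', gen_salt('md5'));             -- hash with a random salt
SELECT crypt('secret', stored_hash) = stored_hash;   -- verify a password
```

Because the salt is embedded in the stored hash, verification just re-crypts the candidate password against the stored value.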
Re: [HACKERS] lib for clients
But on the postgres ftp mirrors there are only full-packet, -base, -docs, -opt, and -test; where can -devel or -libs be found? (use -base?) On Mon, 1 Mar 2004, Martin Marques wrote: On Mon 01 Mar 2004 11:11, ivan wrote: hi, are there some packages (or sources to compile) only for client systems (headers and libs like libpq, and so on)? On a normal Linux distribution, you would have packages like this: postgresql - PostgreSQL client programs and libraries. postgresql-contrib - Contributed source and binaries distributed with PostgreSQL. postgresql-devel - PostgreSQL development header files and libraries. postgresql-docs - Extra documentation for PostgreSQL. postgresql-jdbc - Files needed for Java programs to access a PostgreSQL database. postgresql-libs - The shared libraries required for any PostgreSQL clients. postgresql-odbc - The ODBC driver needed for accessing a PostgreSQL DB using ODBC. postgresql-perl - Development module needed for Perl code to access a PostgreSQL DB. postgresql-python - Development module for Python code to access a PostgreSQL DB. postgresql-server - The programs needed to create and run a PostgreSQL server. postgresql-tcl - A Tcl client library, and the PL/Tcl procedural language for PostgreSQL. postgresql-test - The test suite distributed with PostgreSQL. postgresql-tk - Tk shell and tk-based GUI for PostgreSQL. Hope this helps you. -- Martín Marqués | select 'mmarques' || '@' || 'unl.edu.ar' Centro de Telematica | DBA, Programador, Administrador Universidad Nacional del Litoral
[HACKERS] lib for clients
hi, are there some packages (or sources to compile) only for client systems (headers and libs like libpq, and so on)?
[HACKERS] operators +/- with oids
Why are there no +-*/ operators for the oid data type?
Re: [HACKERS] time format
There can be some problems with converting back, but that would only be when a select is going out to cstring form, and all of that depends on the user; but this is OK :) On Mon, 5 Jan 2004, Karel Zak wrote: On Sat, Jan 03, 2004 at 10:20:57PM +0100, ivan wrote: OK, but each time I want to do a select I'd have to use to_char; it should instead be in the timestamp_out function that converts time to a string, which would be easier and faster. I don't think it will be too much faster :-) BTW, for example Oracle allows defining the default date/time output format the same way as for to_char(). TODO (?): SET TIMESTAMP_FORMAT = 'MM/DD/YYYY HH24:MI:SS'; SELECT 'now'::timestamp; timestamp: 01/05/2004 10:25:01. But it requires a check (to be sure) that the defined format can be converted back from string to timestamp without problems. For this reason Thomas didn't like the idea. I think a dynamic timestamp format is the final solution to all the problems with PostgreSQL date/time formats. Comments? Karel On Sat, 3 Jan 2004, Kurt Roeckx wrote: On Sat, Jan 03, 2004 at 09:25:14AM +0100, ivan wrote: But what about a default style? The first time I saw DateStyle I thought I could use it like the C/C++ function strftime. It would be not a bad idea to have a custom date style :) Use the to_char() function to put it in any format you want. Kurt -- Karel Zak [EMAIL PROTECTED] http://home.zf.jcu.cz/~zakkr/
Re: [HACKERS] time format
I know, but I'm talking about the default time output; it would be for time, date and date-with-time, without formatting every time. On Sun, 4 Jan 2004, Tom Lane wrote: Martin Marques [EMAIL PROTECTED] writes: Look deeper into what Christopher said and use casting to get the right output: prueba= select now()::timestamp(0); There's also current_timestamp(0), which is a more standards-compliant way of doing the same thing. regards, tom lane
Re: [HACKERS] time format
But what about a default style? The first time I saw DateStyle I thought I could use it like the C/C++ function strftime. It would be not a bad idea to have a custom date style :) On Fri, 2 Jan 2004, Christopher Kings-Lynne wrote: Create the table with type TIMESTAMP(0). Chris ivan wrote: how can I change the default time format? Because now I have for example 2004-01-01 16:51:46.995927 but I want only 2004-01-01 16:51:46, without millisec. I tried with DateStyle but there isn't a custom style :/
Re: [HACKERS] time format
OK, but each time I want to do a SELECT I don't want to have to use to_char; it should be in the timestamp_out function that converts the timestamp to a string. That would be easier and faster.

On Sat, 3 Jan 2004, Kurt Roeckx wrote:

On Sat, Jan 03, 2004 at 09:25:14AM +0100, ivan wrote: but what about the default style? The first time I saw DateStyle I thought I could use it like the C/C++ function strftime. It would not be a bad idea to have a custom date style :)

Use the to_char() function to put it in any format you want.

Kurt
[HACKERS] time format
How can I change the default time format? Now I have, for example, 2004-01-01 16:51:46.995927, but I want only 2004-01-01 16:51:46, without milliseconds. I tried with DateStyle but there aren't custom styles :/
Re: [HACKERS] cache in plpgsql
Maybe postgres could use the same mechanism that triggers use: when a relation is dropped (which is equivalent to a delete from pg_class) there could be something like an AFTER trigger that waits for the CREATE or DROP command and then cleans up the cache, for each backend. It could just as well be a message instead of an actual trigger (I said trigger only to show the scheme of action). Heh, is this idea also wrong?

On Wed, 31 Dec 2003, Jan Wieck wrote:

ivan wrote: Why can't all backends use one cache, which would always

Variable-sized shared memory with garbage collection for SPI plans?

be in a current state ... or can I just clear only my own cache first (if I know that this relation could have another OID) and then use the relations normally?

As said, that is not sufficient. The user who does the DDL statement can as well reconnect to the database to recompile all saved plans. It is the 200 persistent PHP DB connections that suffer from not finding the index any more.

Or ... just turn off the cache, because it's strange to be able to use DROP, CREATE etc. in a function, but only via EXECUTE ..

Do you have any numbers about how that would affect performance?

Jan

There must be some solution .. no?

On Wed, 31 Dec 2003, Jan Wieck wrote:

ivan wrote: As we know, plpgsql has a special cache which remembers things for too long (even non-existing tables, i.e. old OIDs), so I suggest creating some function, perhaps only in plpgsql, which would clear this cache, or something like that. For example, where I drop a table I would call plpgsql_clear_cache(); then when I create a table with the same name again, plpgsql will not remember the wrong OID. Possible?

You obviously did not bother to search the archives on this. This will not solve the problem, since the cache you're talking about is per-backend local memory. So if one backend modifies the schema, how does it cause all the others to forget? Since the same problem exists in general for everything that uses SPI, the solution lies there.
Jan

--
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
# [EMAIL PROTECTED]                                                    #
Re: [HACKERS] cache in plpgsql
Why can't all backends use one cache, which would always be in a current state ... or can I just clear only my own cache first (if I know that this relation could have another OID) and then use the relations normally? Or ... just turn off the cache, because it's strange to be able to use DROP, CREATE etc. in a function, but only via EXECUTE .. There must be some solution .. no?

On Wed, 31 Dec 2003, Jan Wieck wrote:

ivan wrote: As we know, plpgsql has a special cache which remembers things for too long (even non-existing tables, i.e. old OIDs), so I suggest creating some function, perhaps only in plpgsql, which would clear this cache, or something like that. For example, where I drop a table I would call plpgsql_clear_cache(); then when I create a table with the same name again, plpgsql will not remember the wrong OID. Possible?

You obviously did not bother to search the archives on this. This will not solve the problem, since the cache you're talking about is per-backend local memory. So if one backend modifies the schema, how does it cause all the others to forget? Since the same problem exists in general for everything that uses SPI, the solution lies there.

Jan
[HACKERS] Happy New Year ! :) and about gethostbyaddr
I didn't find gethostbyaddr; could this be in the default implementation?
Re: [HACKERS] cache in plpgsql
But, all in all, do you think that it is OK now?

On Wed, 31 Dec 2003, Jan Wieck wrote:

Do you have any numbers about how that would affect performance?

Jan
[HACKERS] cache in plpgsql
As we know, plpgsql has a special cache which remembers things for too long (even non-existing tables, i.e. old OIDs), so I suggest creating some function, perhaps only in plpgsql, which would clear this cache, or something like that. For example, where I drop a table I would call plpgsql_clear_cache(); then when I create a table with the same name again, plpgsql will not remember the wrong OID. Possible?
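The workaround mentioned in this thread is to build the query at run time with EXECUTE, so plpgsql never caches a plan that pins the old table OID. A minimal sketch, assuming a hypothetical count_rows() function and the 7.4-era FOR-IN-EXECUTE idiom (doubled single quotes, since dollar-quoting did not exist yet):

```sql
CREATE OR REPLACE FUNCTION count_rows(text) RETURNS integer AS '
DECLARE
    r record;
BEGIN
    -- EXECUTE re-plans the statement on every call, so a dropped and
    -- re-created table with the same name is looked up fresh by name.
    FOR r IN EXECUTE ''SELECT count(*) AS n FROM '' || quote_ident($1) LOOP
        RETURN r.n;
    END LOOP;
END;
' LANGUAGE plpgsql;
```

This trades the cached-plan speedup for correctness across DROP/CREATE, which is exactly the performance question Jan raises above.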
[HACKERS] select from table with unique values
Hi, how do I select from a table so that I get only the unique values of some column(s)?
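For the archives, the standard answers, sketched against a hypothetical table t(a, b):

```sql
-- DISTINCT collapses duplicate values of the selected column:
SELECT DISTINCT a FROM t;

-- Over several columns it keeps one row per (a, b) combination:
SELECT DISTINCT a, b FROM t;

-- GROUP BY yields the same row set and also allows aggregates:
SELECT a, b, count(*) FROM t GROUP BY a, b;
```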
[HACKERS] windows new-line to linux new-line
Is there some easy way to turn on translation between client and server from the Linux new-line style (0x0A) to the Windows style (0x0D0A)? (Something like conversions?)
Re: [HACKERS] new-line
It must happen at run time: the server should store the text with Linux new-lines, and if the client is Windows, postgres would convert \n to \r\n, but when the client is also Linux, postgres would do nothing :) Any idea?

On Sat, 27 Dec 2003, Jürgen Cappel wrote:

Hi Ivan, maybe you're looking for something like this: http://rcamera.org/cgi-bin/man2html.cgi?1+unix2dos HTH, Jürgen Cappel

ivan [EMAIL PROTECTED] wrote on 27.12.2003, 14:19:10: Is there some easy way to turn on translation between client and server from the Linux new-line style (0x0A) to the Windows style (0x0D0A)? (Something like conversions?)
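There is no automatic per-client translation in the server, but a per-query workaround along these lines is possible (the table and column names, messages and body, are hypothetical):

```sql
-- Convert LF to CRLF on the way out for a Windows client:
SELECT replace(body, '\n', '\r\n') AS dos_body FROM messages;
```

This keeps the stored text in Unix form and does the conversion only in the SELECT, which matches the "server stores \n, convert for Windows clients" scheme above, at the cost of the client having to ask for it.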
[HACKERS] connections problem
Hi, I need to connect to my database with more than 100 connections, but after ~20-30 connections the postmaster tells me "Resource temporarily unavailable". What is this resource? I'm using Debian with kernel 2.4.23-pre7, on a P4 (postgres 7.4.1). What can be wrong?
Re: [HACKERS] connections problem
max_connections = 200, shared_buffers = 2000, RAM = 500M + 300M swap, hdd = infinite.

On Fri, 26 Dec 2003, Joshua D. Drake wrote:

Hi, I need to connect to my database with more than 100 connections, but after ~20-30 connections the postmaster tells me "Resource temporarily unavailable". What is this resource?

We need more info. What does your postgresql.conf say about max_connections? How much RAM do you have? How much shared memory have you allocated?

Sincerely, Joshua D. Drake

I'm using Debian with kernel 2.4.23-pre7, on a P4 (postgres 7.4.1). What can be wrong?

--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC - S/JDBC
Postgresql support, programming, shared hosting and dedicated hosting.
+1-503-222-2783 - [EMAIL PROTECTED] - http://www.commandprompt.com
Re: [HACKERS] connections problem
.. many other stuff ... The error is:

could not fork new process for connection: Resource temporarily unavailable

On Fri, 26 Dec 2003, Joshua D. Drake wrote:

Hello, perhaps you have too many open files? What else is running on this machine?

Sincerely, Joshua D. Drake
Re: [HACKERS] connections problem
ulimit -u = 256, is that OK?

On Fri, 26 Dec 2003, Tom Lane wrote:

ivan [EMAIL PROTECTED] writes: I need to connect to my database with more than 100 connections, but after ~20-30 connections the postmaster tells me "Resource temporarily unavailable". What is this resource?

First bet is that you are running into a maximum-number-of-processes-per-user limit imposed by ulimit -u. It's also possible that you need more RAM or swap space.

regards, tom lane
Re: [HACKERS] connections problem
From the log:

LOG: could not fork new process for connection: Resource temporarily unavailable
LOG: could not fork checkpoint process: Resource temporarily unavailable

ulimit -a:

core file size        (blocks, -c)     0
data seg size         (kbytes, -d)     unlimited
file size             (blocks, -f)     unlimited
max locked memory     (kbytes, -l)     unlimited
max memory size       (kbytes, -m)     unlimited
open files            (-n)             2048
pipe size             (512 bytes, -p)  8
stack size            (kbytes, -s)     15000
cpu time              (seconds, -t)    unlimited
max user processes    (-u)             400
virtual memory        (kbytes, -v)     unlimited

On Fri, 26 Dec 2003, Tom Lane wrote:

ivan [EMAIL PROTECTED] writes: The error is: could not fork new process for connection: Resource temporarily unavailable

We could have saved some time if you'd spelled out the full error message to begin with. This is undoubtedly a kernel-imposed restriction you are running into. I'm not sure if the failure is related to the sheer number of processes, the number of open files, or RAM/swap space, but one way or another you are trying to create more processes than your kernel will support. It seems moderately unlikely that twenty or thirty Postgres processes would by themselves be able to run the kernel out of resources (although if the problem is related to the number of open files, reducing max_files_per_process might help). What else do you have running on this machine?

regards, tom lane
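Following Tom's hint, if the kernel is running out of file descriptors rather than process slots, the relevant knob lives in postgresql.conf. A sketch of the settings mentioned in this thread (the values are illustrative, not recommendations):

```
# postgresql.conf (7.4-era settings)
max_connections = 200
shared_buffers = 2000          # in 8 kB pages, about 16 MB
max_files_per_process = 500    # lower this if the kernel runs out of FDs
```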
Re: [HACKERS] pg_user
You can also patch your kernel so that when you run cat /etc/passwd the system gives you only your own line, without any other users, so exactly what you need. In pgsql I think users don't need to know about other users, nor about their databases; I call it security :)

On Mon, 27 Oct 2003, Jan Wieck wrote:

ivan wrote: Hi, can we change initdb so that the pg_user view is created as:

CREATE VIEW pg_user AS
    SELECT usename, usesysid, usecreatedb, usesuper, usecatupd,
           ''::text AS passwd, valuntil, useconfig
    FROM pg_shadow
    WHERE usename = SESSION_USER;

No, at least not without a complete proposal for how to retain the current behaviour of pg_tables, pg_views, psql's \d, and other places that rely on pg_user being able to display all users. It's the same thing with your /etc/passwd: chmod o-rwx /etc/passwd will hide the usernames but break many utilities. If you don't want someone to know all the logins, don't give him one.

Jan
[HACKERS] Dump error
pg_dump: handler procedure for procedural language plpgsql not found
pg_dumpall: pg_dump failed on database db, exiting

Why? I created this language with the createlang script as superuser for this database, so...? :)