[HACKERS] Reading deleted records - PageHeader v3
Hi, So first I'm a pgsql hacker newbie and I've been reading up on the storage structure: http://www.postgresql.org/docs/8.2/interactive/storage-page-layout.html I'm trying to recover deleted records from a page file (postgresql 8.2) : i.e. base/dbId/20132 I am able to successfully read all the header data I need (PageHeaderData, ItemIdData , HeapTupleHeaderData) but I hit a wall when I try to start reading user data. This has helped: http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/include/postgres.h?rev=1. 77;content-type=text%2Fplain I've read and understood fairly well how varlena structures are stored (plain, compressed, external/toast) but so far I can't seem to read a plain inline value. I think part of my problem is I haven't really understood what 'Then make sure you have the right alignment' means. My approach currently is: After reading HeapTupleHeaderData (23 bytes), I advance another 4 bytes (hoff) and try to read a 32 bit integer (first attribute). I am expecting to get an integer value 1 but I get 512 . Am I doing this wrong? Could someone point me to the pgsql code pieces I should be looking at? If useful, this is the information I have before reading the 'user data': object(PostgreSQL_HeapTupleHeaderData)#14 (7) { [xmin]= string(5) 13824 [xmax]= string(1) 0 [cid]= string(1) 0 [ctid]= object(PostgreSQL_ItemPointerData)#16 (2) { [blockId]= string(1) 0 [posId]= int(0) } [infomask2]= int(0) [infomask]= int(2) [hoff]= int(4) } object(PostgreSQL_Attribute)#7 (6) { [name]= string(7) book_id [relid]= int(20132) [len]= int(4) [num]= int(1) [ndims]= int(0) [align]= string(1) i } array(1) { [book_id]= int(512) }
[HACKERS] Add on_plperl_init and on_plperlu_init to plperl UPDATE 3 [PATCH]
This is the third update to the fourth of the patches to be split out from the former 'plperl feature patch 1'. Changes in this patch: - Added plperl.on_plperl_init and plperl.on_plperlu_init GUCs Both are PGC_SUSET SPI functions are not available when the code is run. Errors are detected and reported as ereport(ERROR, ...) Corresponding documentation and tests for both. - Renamed plperl.on_perl_init to plperl.on_init - Improved state management of select_perl_context() An error during interpreter initialization will leave the state (interp_state etc) unchanged. - The utf8fix code has been greatly simplified. - More code comments re PGC_SUSET and no access to SPI functions. Tim. diff --git a/doc/src/sgml/plperl.sgml b/doc/src/sgml/plperl.sgml index 7018624..0999bd0 100644 *** a/doc/src/sgml/plperl.sgml --- b/doc/src/sgml/plperl.sgml *** $$ LANGUAGE plperl; *** 748,753 --- 748,758 literalreturn $_SHARED{myquote}-gt;($_[0]);/literal at the expense of readability.) /para + + para + The varname%_SHARED/varname variable, and other global state within + the language, should be considered insecure in a multi-user database. + /para /sect1 sect1 id=plperl-trusted *** CREATE TRIGGER test_valid_id_trig *** 1044,1057 variablelist ! varlistentry id=guc-plperl-on-perl-init xreflabel=plperl.on_perl_init ! termvarnameplperl.on_perl_init/varname (typestring/type)/term indexterm !primaryvarnameplperl.on_perl_init/ configuration parameter/primary /indexterm listitem para !Specifies perl code to be executed when a perl interpreter is first initialized. The SPI functions are not available when this code is executed. If the code fails with an error it will abort the initialization of the interpreter and propagate out to the calling query, causing the current transaction --- 1049,1063 variablelist ! varlistentry id=guc-plperl-on-init xreflabel=plperl.on_init ! termvarnameplperl.on_init/varname (typestring/type)/term indexterm !primaryvarnameplperl.on_init/ configuration parameter/primary /indexterm listitem para !Specifies perl code to be executed when a perl interpreter is first initialized !and before it is specialized for use by literalplperl/ or literalplperlu/. The SPI functions are not available when this code is executed. If the code fails with an error it will abort the initialization of the interpreter and propagate out to the calling query, causing the current transaction *** CREATE TRIGGER test_valid_id_trig *** 1059,1069 /para para The perl code is limited to a single string. Longer code can be placed !into a module and loaded by the literalon_perl_init/ string. Examples: programlisting ! plplerl.on_perl_init = '$ENV{NYTPROF}=start=no; require Devel::NYTProf::PgPLPerl' ! plplerl.on_perl_init = 'use lib /my/app; use MyApp::PgInit;' /programlisting /para para --- 1065,1075 /para para The perl code is limited to a single string. Longer code can be placed !into a module and loaded by the literalon_init/ string. Examples: programlisting ! plplerl.on_init = '$ENV{NYTPROF}=start=no; require Devel::NYTProf::PgPLPerl' ! plplerl.on_init = 'use lib /my/app; use MyApp::PgInit;' /programlisting /para para *** plplerl.on_perl_init = 'use lib /my/app *** 1077,1082 --- 1083,1134 /listitem /varlistentry + varlistentry id=guc-plperl-on-plperl-init xreflabel=plperl.on_plperl_init + termvarnameplperl.on_plperl_init/varname (typestring/type)/term + indexterm +primaryvarnameplperl.on_plperl_init/ configuration parameter/primary + /indexterm + listitem +para +Specifies perl code to be executed when the literalplperl/ language +is first used in a session. Changes made after the literalplperl/ +language has been used will have no effect. +The perl code can only perform trusted operations. +The SPI functions are not available when this code is executed. +/para +para +If the code fails with an error it will abort the initialization and +propagate out to the calling query, causing the current transaction or +subtransaction to be aborted. Any changes within perl won't be undone. +If the literalplperl/ language is used again the +initialization will be repeated. +/para + /listitem + /varlistentry + + varlistentry id=guc-plperl-on-untrusted-init xreflabel=plperl.on_untrusted_init + termvarnameplperl.on_untrusted_init/varname (typestring/type)/term + indexterm +
[HACKERS] Failed assertion during recovery of partial WAL file
Encountered the following FailedAssertion while testing database recovery (actually this would be HS) with partial WAL file: LOG: restored log file 00010003 from archive LOG: consistent recovery state reached at 0/3001EEC LOG: record with zero length at 0/3001EEC LOG: redo done at 0/3001D68 LOG: last completed transaction was at log time 2010-02-05 11:02:49.695544+02 LOG: database system is ready to accept read only connections LOG: selected new timeline ID: 3 TRAP: FailedAssertion(!(readFile = 0), File: xlog.c, Line: 5117) and the assorted backtrace: #0 0xb7f2e410 in __kernel_vsyscall () #1 0xb7dc9085 in raise () from /lib/tls/i686/cmov/libc.so.6 #2 0xb7dcaa01 in abort () from /lib/tls/i686/cmov/libc.so.6 #3 0x0834926e in ExceptionalCondition (conditionName=0x8389a82 !(readFile = 0), errorType=0x837d5b4 FailedAssertion, fileName=0x8390338 xlog.c, lineNumber=5117) at assert.c:57 #4 0x080dde34 in exitArchiveRecovery (endTLI=1, endLogId=0, endLogSeg=18) at xlog.c:5117 #5 0x080e3eab in StartupXLOG () at xlog.c:6029 #6 0x080e6055 in StartupProcessMain () at xlog.c:8666 #7 0x08106a30 in AuxiliaryProcessMain (argc=2, argv=0xbfebdd64) at bootstrap.c:412 #8 0x08253f0c in StartChildProcess (type=StartupProcess) at postmaster.c:4340 #9 0x0825669e in PostmasterMain (argc=3, argv=0x8516e08) at postmaster.c:1078 #10 0x081f7919 in main (argc=3, argv=0x8516e08) at main.c:188 The crash happened on a HS slave which was fed a partial WAL file. The partial was constructed by extracting data up to pg_current_xlog_location() and zero padding it up to 16MB. This only seems to be happening on HEAD - quick tests indicate that both 8.3 and 8.4 are not affected (or maybe I didn't try hard enough). regards, Martin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Confusion over Python drivers
My son has brought to my attention that our current crop of Python client libraries is inadequate/confusing. I took a look myself, and asked on our IRC channel, and am now convinced this area needs attention. First, the sheer number of libraries is confusing. This is the page from the Postgres wiki: http://wiki.postgresql.org/wiki/Python The first one listed, Psycopg, is noted as preferred libpq-based driver, but the license is GPL. Isn't that a problem for many client applications? The next one, PyGreSQL, is BSD licensed, but only has documentation for the classic interface. The DB-API module says about documentation: http://www.pygresql.org/pgdb.html This section of the documentation still needs to be written. The other three are pure Python drivers, which I guess can be good, but why three, and then there are three more listed as obsolete/stalled. Clearly something is wrong here. The Python-hosted PostgreSQL page has similar problems: http://wiki.python.org/moin/PostgreSQL Does Perl have a similar mess? While I realize experienced people can easily navigate this confusion, I am concerned about new Postgres adopters being very confused by this and it is hurting our database adoption in general. What is really needed is for someone to take charge of one of these projects and make a best-of-breed Python driver that can gain general acceptance as our preferred driver. I feel Python is too important a language to be left in this state. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Personal Copyright Notices
On Fri, Feb 5, 2010 at 3:24 AM, Bruce Momjian br...@momjian.us wrote: Bruce Momjian wrote: The intagg copyright is on a _Makefile_: # Makefile for integer aggregator # Copyright (C) 2001 Digital Music Network. # by Mark L. Woodward # $PostgreSQL: pgsql/contrib/intagg/Makefile,v 1.10 2008/11/14 19:58:45 tgl Exp $ Seems we either have to contact the author or rewrite the file. I have rewritten the above file from scratch to enable removal of the copyright mention. Uhm, you rewrote it and it turned out byte-for-byte identical except for the removed copyright comments? Actually the file is trivial enough -- it's just our standard pgxs template with the two filenames inserted -- that that's plausible. I think that indicates that it wasn't really something copyrightable to begin with. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Personal Copyright Notices
Greg Stark wrote: On Fri, Feb 5, 2010 at 3:24 AM, Bruce Momjian br...@momjian.us wrote: Bruce Momjian wrote: The intagg copyright is on a _Makefile_: ? ? ? # Makefile for integer aggregator ? ? ? # Copyright (C) 2001 Digital Music Network. ? ? ? # by Mark L. Woodward ? ? ? # $PostgreSQL: pgsql/contrib/intagg/Makefile,v 1.10 2008/11/14 19:58:45 tgl Exp $ Seems we either have to contact the author or rewrite the file. I have rewritten the above file from scratch to enable removal of the copyright mention. Uhm, you rewrote it and it turned out byte-for-byte identical except for the removed copyright comments? Actually the file is trivial enough -- it's just our standard pgxs template with the two filenames inserted -- that that's plausible. I think that indicates that it wasn't really something copyrightable to begin with. Right, I copied /contrib/intarray/Makefile and modified it. If it turns out identical, so be it. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Confusion over Python drivers
Bruce, http://wiki.postgresql.org/wiki/Python The first one listed, Psycopg, is noted as preferred libpq-based driver, but the license is GPL. Isn't that a problem for many client applications? The licence of psycopg2 is a little more complicated; the GPL in that summary just tries to sum it. The actual licence is: [from the LICENCE in the source distribution] - psycopg and the GPL === psycopg is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. See file COPYING for details. As a special exception, specific permission is granted for the GPLed code in this distribition to be linked to OpenSSL and PostgreSQL libpq without invoking GPL clause 2(b). Note that the GPL was chosen to avoid proprietary adapters based on psycopg code. Using psycopg in a proprietary product (even bundling psycopg with the proprietary product) is fine as long as: 1. psycopg is called from Python only using only the provided API (i.e., no linking with C code and no C modules based on it); and 2. all the other points of the GPL are respected (you offer a copy of psycopg's source code, and so on.) Alternative licenses If you prefer you can use the Zope Database Adapter ZPsycopgDA (i.e., every file inside the ZPsycopgDA directory) user the ZPL license as published on the Zope web site, http://www.zope.org/Resources/ZPL. Also, the following BSD-like license applies (at your option) to the files following the pattern psycopg/adapter*.{h,c} and psycopg/microprotocol*.{h,c}: Permission is granted to anyone to use this software for any purpose, including commercial applications, and to alter it and redistribute it freely, subject to the following restrictions: 1. The origin of this software must not be misrepresented; you must not claim that you wrote the original software. If you use this software in a product, an acknowledgment in the product documentation would be appreciated but is not required. 2. Altered source versions must be plainly marked as such, and must not be misrepresented as being the original software. 3. This notice may not be removed or altered from any source distribution. psycopg is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. Proprietary licenses A non-exclusive license is available for companies that want to include psycopg in their proprietary products without respecting the spirit of the GPL. The price of the license is one day of development done by the author, at the consulting fee he applies to his usual customers at the day of the request. - Harald -- GHUM Harald Massa persuadere et programmare Harald Armin Massa Spielberger Straße 49 70435 Stuttgart 0173/9409607 no fx, no carrier pigeon - %s is too gigantic of an industry to bend to the whims of reality
Re: [HACKERS] Confusion over Python drivers
Massa, Harald Armin wrote: Bruce, http://wiki.postgresql.org/wiki/Python The first one listed, Psycopg, is noted as preferred libpq-based driver, but the license is GPL. Isn't that a problem for many client applications? The licence of psycopg2 is a little more complicated; the GPL in that summary just tries to sum it. The actual licence is: [from the LICENCE in the source distribution] Wow, that is super-confusing. I am dealing with an issue now that some companies are concerned that individual names appear with our own BSD copyrights in small files. That pales in comparison to this copyright. I would never be able to recommend software with that complex a copyright. I think the community needs to look at other alternatives. (I am amazed that not only are there a confusing number of drivers, but there are a confusing number of copyright options for this single driver.) -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Confusion over Python drivers
On fre, 2010-02-05 at 09:19 -0500, Bruce Momjian wrote: While I realize experienced people can easily navigate this confusion, I am concerned about new Postgres adopters being very confused by this and it is hurting our database adoption in general. What is really needed is for someone to take charge of one of these projects and make a best-of-breed Python driver that can gain general acceptance as our preferred driver. I feel Python is too important a language to be left in this state. The situation is unfortunate, but you might as well argue that too many Linux desktops or Linux distributions confuse new users and hurt adoption. These alternatives all exist for a reason, and it will be difficult to get some of them to abandon their work with the aim of reducing the overall confusion. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Confusion over Python drivers
On Fri, Feb 05, 2010 at 09:19:26AM -0500, Bruce Momjian wrote: My son has brought to my attention that our current crop of Python client libraries is inadequate/confusing. I took a look myself, and asked on our IRC channel, and am now convinced this area needs attention. http://wiki.postgresql.org/wiki/Python The Python-hosted PostgreSQL page has similar problems: http://wiki.python.org/moin/PostgreSQL Does Perl have a similar mess? I don't think so. The primary database interface is DBI and as far as I can see there's only one DBI PostgreSQL driver: http://search.cpan.org/dist/DBD-Pg/ The only non-DBI interfaces I could find (by skimming the 384 results for postgresql on search.cpan.org) were: Postgres: http://search.cpan.org/dist/Postgres/last updated in 1998. Pg: http://search.cpan.org/dist/pgsql_perl5/ last updated in 2000. Tim. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Confusion over Python drivers
Peter Eisentraut wrote: On fre, 2010-02-05 at 09:19 -0500, Bruce Momjian wrote: While I realize experienced people can easily navigate this confusion, I am concerned about new Postgres adopters being very confused by this and it is hurting our database adoption in general. What is really needed is for someone to take charge of one of these projects and make a best-of-breed Python driver that can gain general acceptance as our preferred driver. I feel Python is too important a language to be left in this state. The situation is unfortunate, but you might as well argue that too many Linux desktops or Linux distributions confuse new users and hurt adoption. These alternatives all exist for a reason, and it will be difficult to get some of them to abandon their work with the aim of reducing the overall confusion. Agreed, but can't we do a better job of steering new users and improving the novice experience for the most popular of them, i.e. think Ubuntu. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Confusion over Python drivers
Tim Bunce wrote: On Fri, Feb 05, 2010 at 09:19:26AM -0500, Bruce Momjian wrote: My son has brought to my attention that our current crop of Python client libraries is inadequate/confusing. I took a look myself, and asked on our IRC channel, and am now convinced this area needs attention. http://wiki.postgresql.org/wiki/Python The Python-hosted PostgreSQL page has similar problems: http://wiki.python.org/moin/PostgreSQL Does Perl have a similar mess? I don't think so. The primary database interface is DBI and as far as I can see there's only one DBI PostgreSQL driver: http://search.cpan.org/dist/DBD-Pg/ The only non-DBI interfaces I could find (by skimming the 384 results for postgresql on search.cpan.org) were: Postgres: http://search.cpan.org/dist/Postgres/last updated in 1998. Pg: http://search.cpan.org/dist/pgsql_perl5/ last updated in 2000. Yes, that's what I thought, and am glad to here it. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Remove tabs in SGML.
Peter Eisentraut wrote: On tis, 2010-01-26 at 10:20 -0800, David Fetter wrote: On Tue, Jan 26, 2010 at 02:21:29PM +, Bruce Momjian wrote: Log Message: --- Remove tabs in SGML. Can we see about making a commit hook for CVS that disallows \t in SGML files? The process in git is pretty simple. Better write a check in the makefile. Done, and attached. The rule is more shell script than makefile, but I am not sure how else to do it. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + Index: doc/src/sgml/Makefile === RCS file: /cvsroot/pgsql/doc/src/sgml/Makefile,v retrieving revision 1.135 diff -c -c -r1.135 Makefile *** doc/src/sgml/Makefile 6 Jan 2010 19:07:05 - 1.135 --- doc/src/sgml/Makefile 5 Feb 2010 14:49:26 - *** *** 345,350 --- 345,352 endif # sqlmansectnum != 7 + check-tabs: + ( ! grep ' ' $(wildcard $(srcdir)/*.sgml $(srcdir)/ref/*.sgml) ) || (echo Tabs appear in SGML files; exit 1) ## ## Clean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Confusion over Python drivers
On fre, 2010-02-05 at 14:45 +, Tim Bunce wrote: Does Perl have a similar mess? I don't think so. The primary database interface is DBI and as far as I can see there's only one DBI PostgreSQL driver: http://search.cpan.org/dist/DBD-Pg/ I think another difference is that the Perl DBI interface is very rich, whereas the Python DB-API is quite minimal and almost forces people to write (incompatible) extensions. The DB-SIG at Python that ought to drive all this is also quite dead, possibly because everyone has moved on to SQLAlchemy. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Failed assertion during recovery of partial WAL file
On Fri, Feb 5, 2010 at 10:46 PM, Martin Pihlak martin.pih...@gmail.com wrote: Encountered the following FailedAssertion while testing database recovery (actually this would be HS) with partial WAL file: LOG: restored log file 00010003 from archive LOG: consistent recovery state reached at 0/3001EEC LOG: record with zero length at 0/3001EEC LOG: redo done at 0/3001D68 LOG: last completed transaction was at log time 2010-02-05 11:02:49.695544+02 LOG: database system is ready to accept read only connections LOG: selected new timeline ID: 3 TRAP: FailedAssertion(!(readFile = 0), File: xlog.c, Line: 5117) and the assorted backtrace: #0 0xb7f2e410 in __kernel_vsyscall () #1 0xb7dc9085 in raise () from /lib/tls/i686/cmov/libc.so.6 #2 0xb7dcaa01 in abort () from /lib/tls/i686/cmov/libc.so.6 #3 0x0834926e in ExceptionalCondition (conditionName=0x8389a82 !(readFile = 0), errorType=0x837d5b4 FailedAssertion, fileName=0x8390338 xlog.c, lineNumber=5117) at assert.c:57 #4 0x080dde34 in exitArchiveRecovery (endTLI=1, endLogId=0, endLogSeg=18) at xlog.c:5117 #5 0x080e3eab in StartupXLOG () at xlog.c:6029 #6 0x080e6055 in StartupProcessMain () at xlog.c:8666 #7 0x08106a30 in AuxiliaryProcessMain (argc=2, argv=0xbfebdd64) at bootstrap.c:412 #8 0x08253f0c in StartChildProcess (type=StartupProcess) at postmaster.c:4340 #9 0x0825669e in PostmasterMain (argc=3, argv=0x8516e08) at postmaster.c:1078 #10 0x081f7919 in main (argc=3, argv=0x8516e08) at main.c:188 The crash happened on a HS slave which was fed a partial WAL file. The partial was constructed by extracting data up to pg_current_xlog_location() and zero padding it up to 16MB. This only seems to be happening on HEAD - quick tests indicate that both 8.3 and 8.4 are not affected (or maybe I didn't try hard enough). Thanks for the report! This assertion failure derives from the recent refactoring of ReadRecord(). Which changed the startup process so as to re-fetch the last applied WAL record at the end of recovery from the buffer instead of the WAL file if it's stored in the buffer. In this case, the last applied WAL file remains closed. So readFile (which ought to have been the file descriptor of that WAL file) might be -1 in exitArchiveRecovery(). In the now, that assertion is obsolete. So I attached the patch that removes the assert() from exitArchiveRecovery(). Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center *** a/src/backend/access/transam/xlog.c --- b/src/backend/access/transam/xlog.c *** *** 5110,5125 exitArchiveRecovery(TimeLineID endTLI, uint32 endLogId, uint32 endLogSeg) UpdateMinRecoveryPoint(InvalidXLogRecPtr, true); /* ! * We should have the ending log segment currently open. Verify, and then ! * close it (to avoid problems on Windows with trying to rename or delete ! * an open file). */ ! Assert(readFile = 0); ! Assert(readId == endLogId); ! Assert(readSeg == endLogSeg); ! ! close(readFile); ! readFile = -1; /* * If the segment was fetched from archival storage, we want to replace --- 5110,5123 UpdateMinRecoveryPoint(InvalidXLogRecPtr, true); /* ! * If the ending log segment is currently open, close it (to avoid ! * problems on Windows with trying to rename or delete an open file). */ ! if (readFile = 0) ! { ! close(readFile); ! readFile = -1; ! } /* * If the segment was fetched from archival storage, we want to replace *** *** 5974,5979 StartupXLOG(void) --- 5972,5980 EndOfLog = EndRecPtr; XLByteToPrevSeg(EndOfLog, endLogId, endLogSeg); + Assert(readId == endLogId); + Assert(readSeg == endLogSeg); + /* * Complain if we did not roll forward far enough to render the backup * dump consistent. Note: it is indeed okay to look at the local variable -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] synchronized snapshots
Hello Joachim, a little daughter eats lots of spare cycles - among other things. Sorry it took that long to review. On Fri, 8 Jan 2010 20:36:44 +0100, Joachim Wieland j...@mcknight.de wrote: The attached patch implements the idea of Heikki / Simon published in http://archives.postgresql.org/pgsql-hackers/2009-11/msg00271.php I must admit I didn't read that up front, but thought your patch could be useful for implementing parallel querying. So, let's first concentrate on the intended use case: allowing parallel pg_dump. To me it seems like a pragmatic and quick solution, however, I'm not sure if requiring superuser privileges is acceptable. The patch currently compiles (modulo some OID changes in pg_proc.h to prevent duplicates) and the test suite runs through fine. I haven't tested the new functions, though. Reading the code, I'm missing the part that actually acquires the snapshot for the transaction(s). After setting up multiple transactions with pg_synchronize_snapshot and pg_synchronize_snapshot_taken, they still don't have a snapshot, do they? Also, you should probably ensure the calling transactions don't have a snapshot already (let alone a transaction id). In a similar vein, and answering your question in a comment: yes, I'd say you want to ensure your transactions are in SERIALIZABLE isolation mode. There's no other isolation level for which that kind of snapshot serialization makes sense, is there? Using the exposed functions in a more general sense, I think it's important to note that the patch only intents to synchronize snapshots at the start of the transaction, not contiguously. Thus, normal transaction isolation applies for concurrent writes and each of the transactions can commit or rollback independently. The timeout is nice, but is it really required? Isn't the normal query cancellation infrastructure sufficient? Hope that helps. Thanks for working on this issue. Regards Markus Wanner -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Confusion over Python drivers
Peter Eisentraut wrote: On fre, 2010-02-05 at 14:45 +, Tim Bunce wrote: Does Perl have a similar mess? I don't think so. The primary database interface is DBI and as far as I can see there's only one DBI PostgreSQL driver: http://search.cpan.org/dist/DBD-Pg/ I think another difference is that the Perl DBI interface is very rich, whereas the Python DB-API is quite minimal and almost forces people to write (incompatible) extensions. The DB-SIG at Python that ought to drive all this is also quite dead, possibly because everyone has moved on to SQLAlchemy. I assumed it would be Perl that had confusion because of the Perl 6, but obviously it is Python. The second PL/Python server-side language that was proposed recently is taking on a whole new appearance for me. :-( -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] MySQL-ism help patch for psql
Josh Berkus wrote: My problem with this whole idea is that it seems to be very MySQL-specific. Why aren't we providing help for users migrating from Oracle, Sybase, Informix, Ingres, DB2, SQLServer and Firebird, to name but a few? And if we turn all those on by default, we'll have a pretty horrible banner when starting psql. We should do all of those. However, we have to start somewhere. Therefore, I like the idea of having a switch, like: \advice mysql \advice db2 etc. ALSO, I'll point out that a page in our official documentation with extensive docs on commands and migration issues would be far more helpful than any amount of psql messages. Just sayin'. I don't think we came to any firm conclusion in this thread, so I have added a TODO item: Add option to print advice for people familiar with other databases * http://archives.postgresql.org/pgsql-hackers/2010-01/msg01845.php -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [CFReview] Red-Black Tree
Teodor Sigaev wrote: I would like to see point #2 of the following email addressed before commit. As things stand, it is not clear (at least to me) whether this is a win. Reimplementation of ginInsertRecordBA reduces difference of HEAD and HEAD+rbtree in regular case. Test suite is taken from http://www.sai.msu.su/~megera/wiki/2009-04-03: SEQ: SELECT array_to_string(ARRAY(select '' || a || '.' || b from generate_series(1,50) b), ' ')::tsvector AS i INTO foo FROM generate_series(1,10) a; RND: SELECT array_to_string(ARRAY(select '' || random() from generate_series(1,50) b), ' ')::tsvector AS i INTO foo FROM generate_series(1,10) a; Times in seconds: HEAD 0.9 0.11 SEQ 130 113111 RND11.4 12.6 11.5 The ides was to change order of insertion - now insertion order decreases number of rebalancing. Oleg's test (http://www.sai.msu.su/~megera/wiki/rbtree_test) are made with v0.10 which is differ from 0.11 only by comments around ginInsertRecordBA() Here is a quick comparison between the current 0.11 patch against my original 0.7 patch when building Oleg's simple data. (Note: due to time constraints, this is just a single run to get a feel for performance) 0.7 patch = rbtest=# CREATE INDEX idin_rbtree_idx ON links2 USING gin (idin); CREATE INDEX Time: 1910741.352 ms rbtest=# CREATE INDEX idout_rbtree_idx ON links2 USING gin (idout); CREATE INDEX Time: 1647609.300 ms 0.11 patch == rbtest=# CREATE INDEX idin_rbtree_idx ON links2 USING gin (idin); CREATE INDEX Time: 1864013.526 ms rbtest=# CREATE INDEX idout_rbtree_idx ON links2 USING gin (idout); CREATE INDEX Time: 1661200.454 ms HTH, Mark. -- Mark Cave-Ayland - Senior Technical Architect PostgreSQL - PostGIS Sirius Corporation plc - control through freedom http://www.siriusit.co.uk t: +44 870 608 0063 Sirius Labs: http://www.siriusit.co.uk/labs -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch: psql \whoami option
I have added this patch to the next commit-fest. Thanks: https://commitfest.postgresql.org/action/commitfest_view?id=6 --- David Christensen wrote: -hackers, In the spirit of small, but hopefully useful interface improvement patches, enclosed for your review is a patch for providing psql with a \whoami command (maybe a better name is \conninfo or similar). Its purpose is to print information about the current connection, by default in a human-readable format. There is also an optional format parameter which currently accepts 'dsn' as an option to output the current connection information as a DSN. Example output: $psql -d postgres -p 8555 psql (8.5devel) You are now connected to database postgres. [Tue Jan 26 17:17:31 CST 2010] machack:postgres:8555=# \whoami Connected to database: postgres, user: machack, port: 8555 via local domain socket [Tue Jan 26 17:17:34 CST 2010] machack:postgres:8555=# \c - - localhost 8555 psql (8.5devel) You are now connected to database postgres on host localhost. [Tue Jan 26 17:17:42 CST 2010] machack:postgres:8555=# \whoami Connected to database: postgres, user: machack, host: localhost, port: 8555 [Tue Jan 26 17:17:46 CST 2010] machack:postgres:8555=# \whoami dsn dbname=postgres;user=machack;host=localhost;port=8555 [Tue Jan 26 17:19:02 CST 2010] machack:postgres:8555=# \q Regards, David -- David Christensen End Point Corporation da...@endpoint.com diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql- ref.sgml index 3ce5996..b58b24d 100644 *** a/doc/src/sgml/ref/psql-ref.sgml --- b/doc/src/sgml/ref/psql-ref.sgml *** lo_import 152801 *** 2149,2154 --- 2149,2167 varlistentry + termliteral\whoami/literal [ replaceable class=parameterdefault/replaceable | replaceable class=parameterdsn/replaceable ] /term + listitem + para + Outputs connection information about the current database + connection. When passed parameter literaldsn/literal, + outputs as a DSN. If parameter is unspecified or + unrecognized, outputs in a human-readable format. + /para + /listitem + /varlistentry + + + varlistentry termliteral\x/literal/term listitem para diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c index 5188b18..21b2468 100644 *** a/src/bin/psql/command.c --- b/src/bin/psql/command.c *** exec_command(const char *cmd, *** 1106, --- 1106,1156 free(fname); } + /* \whoami -- display information about the current connection */ + else if (strcmp(cmd, whoami) == 0) + { + char *format = psql_scan_slash_option(scan_state, + OT_NORMAL, NULL, true); + char *host = PQhost(pset.db); + + if (format !pg_strcasecmp(format, dsn)) { + if (host) { + printf(dbname=%s;user=%s;host=%s;port=%s\n, +PQdb(pset.db), +PQuser(pset.db), +host, +PQport(pset.db) + ); + } + else { + printf(dbname=%s;user=%s;port=%s\n, +PQdb(pset.db), +PQuser(pset.db), +PQport(pset.db) + ); + } + } + else { + /* default case */ + if (host) { + printf(Connected to database: \%s\, user: \%s\, host: \%s \, port: \%s\\n, +PQdb(pset.db), +PQuser(pset.db), +host, +PQport(pset.db) + ); + } + else { + printf(Connected to database: \%s\, user: \%s\, port: \%s \ via local domain socket\n, +PQdb(pset.db), +PQuser(pset.db), +PQport(pset.db) + ); + } + } + free(format); + } + /* \x -- toggle expanded table representation */
Re: [HACKERS] remove contrib/xml2
On Thu, Feb 4, 2010 at 10:51 PM, M Z jm80...@gmail.com wrote: I did some tests followed Robert's test cases on both postgresql 8.4.2-0ubu and 8.3.8-1, OS: Ubuntu Karmic. 1) 1st test case, it doesn't crash on 8.3.8 but crash on 8.4.2; Interesting. So, that's a regression of some kind. 2) 2nd test case, both 8.3.8 and 8.4.2 are fine, and no warning (different from Robert's test?); I built with --enable-debug and --enable-cassert, which might be relevant. 3) 3rd test case (and modified test case for 8.3.8), both 8.3.8 and 8.4.2 are not correct, same with Robert's test (8.5 beta?); As I think about that further, it might not be a bug - how is the processor supposed to know what we expect to happen? But then, I don't really know how this is supposed to work. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] remove contrib/xml2
On Wed, Feb 3, 2010 at 8:49 AM, Alvaro Herrera alvhe...@commandprompt.com wrote: Robert Haas escribió: On Mon, Feb 1, 2010 at 5:23 PM, Andrew Dunstan and...@dunslane.net wrote: Robert Haas wrote: (2) add a very, very large warning that this will crash if you do almost anything with it. I think that's an exaggeration. Certain people are known to be using it quite successfully. Hmm. Well, all I know is that the first thing I tried crashed the server. CREATE TABLE xpath_test (id integer NOT NULL, t xml); INSERT INTO xpath_test VALUES (1, 'docint1/int/doc'); SELECT * FROM xpath_table('id', 't', 'xpath_test', '/doc/int', 'true') as t(id int4); This trivial patch lingering on my system fixes this crasher (this is for the 8.3 branch). It makes the problem in alloc set ExprContext warning show up instead. There are still lotsa other holes, but hey, this is a start ... Interestingly M Z found he couldn't reproduce this crash on 8.3. Can you? If so, +1 for applying this and backpatching it as far as make sense. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Confusion over Python drivers
I think another difference is that the Perl DBI interface is very rich, whereas the Python DB-API is quite minimal and almost forces people to write (incompatible) extensions. The DB-SIG at Python that ought to drive all this is also quite dead, possibly because everyone has moved on to SQLAlchemy. For people who use Python a lot, could I have a list of the deficiencies in DBAPI? I've got my horse and lance ready. Given that SQLAlchemy isn't for everyone, of course ... it couldn't be, or Django would use it, no? --Josh Berkus -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [CFReview] Red-Black Tree
That's all around 1% 0.7 patch = rbtest=# CREATE INDEX idin_rbtree_idx ON links2 USING gin (idin); CREATE INDEX Time: 1910741.352 ms rbtest=# CREATE INDEX idout_rbtree_idx ON links2 USING gin (idout); CREATE INDEX Time: 1647609.300 ms 0.11 patch == rbtest=# CREATE INDEX idin_rbtree_idx ON links2 USING gin (idin); CREATE INDEX Time: 1864013.526 ms rbtest=# CREATE INDEX idout_rbtree_idx ON links2 USING gin (idout); CREATE INDEX Time: 1661200.454 ms -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Remove tabs in SGML.
Alvaro Herrera wrote: Bruce Momjian wrote: Peter Eisentraut wrote: On tis, 2010-01-26 at 10:20 -0800, David Fetter wrote: On Tue, Jan 26, 2010 at 02:21:29PM +, Bruce Momjian wrote: Log Message: --- Remove tabs in SGML. Can we see about making a commit hook for CVS that disallows \t in SGML files? The process in git is pretty simple. Better write a check in the makefile. Done, and attached. The rule is more shell script than makefile, but I am not sure how else to do it. Maybe have the check-tabs rule as a dependency of the check, html and/or draft rules? + check-tabs: + ( ! grep ' ' $(wildcard $(srcdir)/*.sgml $(srcdir)/ref/*.sgml) ) || (echo Tabs appear in SGML files; exit 1) Uh, I didn't think people wanted to actually have any tab testing in their automatically. Do they? -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Remove tabs in SGML.
Bruce Momjian wrote: Alvaro Herrera wrote: Maybe have the check-tabs rule as a dependency of the check, html and/or draft rules? + check-tabs: + ( ! grep ' ' $(wildcard $(srcdir)/*.sgml $(srcdir)/ref/*.sgml) ) || (echo Tabs appear in SGML files; exit 1) Uh, I didn't think people wanted to actually have any tab testing in their automatically. Do they? IMO that's the point of the whole exercise. -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Remove tabs in SGML.
Bruce Momjian wrote: Peter Eisentraut wrote: On tis, 2010-01-26 at 10:20 -0800, David Fetter wrote: On Tue, Jan 26, 2010 at 02:21:29PM +, Bruce Momjian wrote: Log Message: --- Remove tabs in SGML. Can we see about making a commit hook for CVS that disallows \t in SGML files? The process in git is pretty simple. Better write a check in the makefile. Done, and attached. The rule is more shell script than makefile, but I am not sure how else to do it. Maybe have the check-tabs rule as a dependency of the check, html and/or draft rules? + check-tabs: + ( ! grep ' ' $(wildcard $(srcdir)/*.sgml $(srcdir)/ref/*.sgml) ) || (echo Tabs appear in SGML files; exit 1) -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Confusion over Python drivers
The situation is unfortunate, but you might as well argue that too many Linux desktops or Linux distributions confuse new users and hurt adoption. These alternatives all exist for a reason, and it will be difficult to get some of them to abandon their work with the aim of reducing the overall confusion. I'm not as concerned about confusion as the fact that *all* of the various Python drivers suck in different, and crippling, ways. I don't care how many drivers we have, as long as we have at least one 1st-class driver. --Josh Berkus -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Confusion over Python drivers
Josh Berkus wrote: The situation is unfortunate, but you might as well argue that too many Linux desktops or Linux distributions confuse new users and hurt adoption. These alternatives all exist for a reason, and it will be difficult to get some of them to abandon their work with the aim of reducing the overall confusion. I'm not as concerned about confusion as the fact that *all* of the various Python drivers suck in different, and crippling, ways. I don't care how many drivers we have, as long as we have at least one 1st-class driver. Agreed, and we can point newbies to that one great driver. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plperl db access documentation enhancement
Patch applied for 9.0. We don't normally backpatch such documentation improvements unless we receive multiple reports of confusion. --- Alexey Klyukin wrote: Hello, We were asked by Enova Financial to improve the documentation of PL/Perl database access functions. Alvaro and me worked on that and we produced the patch that is attached. It splits initial block of functions into the groups with the description directly following each of the group, corrects couple of mistakes and adds an example. One of the existing mistakes was confusion in definitions of spi_exec_prepared and spi_query_prepared. Another one is usage of INTEGER type to return the result of spi_prepare in the example for prepared queries. When trying to execute that function I've got the following error: postgres=# CREATE OR REPLACE FUNCTION init() RETURNS INTEGER AS $# $_SHARED{my_plan} = spi_prepare( 'SELECT (now() + $1)::date AS now', 'INTERVAL'); $$ LANGUAGE plperl; CREATE FUNCTION postgres=# select init(); ERROR: invalid input syntax for integer: 0x1007d6f40 CONTEXT: PL/Perl function init Since the return value is not used anyway, I've changed the return type of the function declaration in the example to VOID. I think this is a good reason to suggest backpatching these changes down to 8.2. [ Attachment, skipping... ] -- Alexey Klyukin http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Remove tabs in SGML.
Alvaro Herrera wrote: Bruce Momjian wrote: Alvaro Herrera wrote: Maybe have the check-tabs rule as a dependency of the check, html and/or draft rules? + check-tabs: + ( ! grep ' ' $(wildcard $(srcdir)/*.sgml $(srcdir)/ref/*.sgml) ) || (echo Tabs appear in SGML files; exit 1) Uh, I didn't think people wanted to actually have any tab testing in their automatically. Do they? IMO that's the point of the whole exercise. OK, done with attached patch, now done as part of 'gmake check'. If people want it as part of the build process, please let me know. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + Index: Makefile === RCS file: /cvsroot/pgsql/doc/src/sgml/Makefile,v retrieving revision 1.136 diff -c -c -r1.136 Makefile *** Makefile 5 Feb 2010 14:51:50 - 1.136 --- Makefile 5 Feb 2010 19:28:14 - *** *** 263,269 ## # Quick syntax check without style processing ! check: postgres.sgml $(ALMOSTALLSGML) $(NSGMLS) $(SPFLAGS) $(SGMLINCLUDE) -s $ --- 263,269 ## # Quick syntax check without style processing ! check: postgres.sgml $(ALMOSTALLSGML) check-tabs $(NSGMLS) $(SPFLAGS) $(SGMLINCLUDE) -s $ *** *** 346,352 endif # sqlmansectnum != 7 check-tabs: ! ( ! grep ' ' $(wildcard $(srcdir)/*.sgml $(srcdir)/ref/*.sgml) ) || (echo Tabs appear in SGML files; exit 1) ## ## Clean --- 346,352 endif # sqlmansectnum != 7 check-tabs: ! @( ! grep ' ' $(wildcard $(srcdir)/*.sgml $(srcdir)/ref/*.sgml) ) || (echo Tabs appear in SGML files; exit 1) ## ## Clean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plperl db access documentation enhancement
Bruce Momjian wrote: Patch applied for 9.0. We don't normally backpatch such documentation improvements unless we receive multiple reports of confusion. I think that's a mistake in this case. The documentation wasn't confusing -- it was bogus. (Actually, the bug fixing is a smaller change than the whole of this patch, so we could provide the smaller patch if desired to apply to 8.4. To be honest I think it is better to just apply the larger patch verbatim.) I could do the backpatch if you want. I just wanted some more peer review on the changes. -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Confusion over Python drivers
Marko Kreen wrote: Psycopg was the leader, especially in web-environments, but it has non-obvious license and with dead website it does not seem that attractive. Although it is well-maintained still. Best path forward would be to talk with Psycopg guys about license clarification/change. I suspect GPL does not extend anyway to Python code that imports it dynamically, and it does not seem to be their intention - they even try to tell it in their current clarification, but it is not as clear as it could be. Yes, this is a good analysis and has some good suggestions for moving forward. In summary, I don't think the current PG/Python driver situation is helping PG or Python, and I believe there are enough people who care about those two to get a group together to sort out a viable direction and start working toward the goal of improving things. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Confusion over Python drivers
Bruce Momjian wrote: While I realize experienced people can easily navigate this confusion... No, that's the worst part--the more you know and the deeper you dig into it, the more broken you realize the whole thing is. When one of the best drivers (in some respects) has a web page that looks like this: http://initd.org/ that doesn't seem so bad at first--but if you're experienced, you know that the page has been in that disturbing state since late 2006. You start digging into the driver mess, figure you just need to learn how things fit together, but the hole keeps getting bigger as you dig. The issues here have already been identified: the Perl DBI is an excellent spec, while the Python one is so weak everybody ends up needing their own extensions to it. And then portability *even among Python PostgreSQL drivers* goes out the window. If somebody built a BSD/MIT licensed driver that replaces every useful feature of all the forks, with no major problems, and a couple of major projects switched over to it (think Skype level big), maybe this mess could get resolved. I think it would take someone already familiar with the major issues involved a couple of months of regular work to make any serious progress on it. -- Greg Smith2ndQuadrant Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Strange heuristic in analyze.c
Greg Stark wrote: So I never realized the consequences of this little heuristic in analyze.c in the handling of very low cardinality columns where we want to just capture the complete list of values in the mcv and throw away the histogram: else if (toowide_cnt == 0 nmultiple == ndistinct) { /* * Every value in the sample appeared more than once. Assume the * column has just these values. */ stats-stadistinct = ndistinct; } The problem with this heuristic is that if the table is small enough you might expect you can set the statistics target high and sample the entire table and get a very accurate mcv covering all the values. However if any of the values in the table appears only once this heuristic will defeat you. The following code will then throw out of the mcv any value which isn't 25% more common than average. Leaving you with a histogram for those values which often does very poorly if the values don't fit any pattern and are just discrete arbitrary values. Do you want a C comment to document this problem? -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Confusion over Python drivers
Greg Smith wrote: Bruce Momjian wrote: While I realize experienced people can easily navigate this confusion... No, that's the worst part--the more you know and the deeper you dig into it, the more broken you realize the whole thing is. When one of the best drivers (in some respects) has a web page that looks like this: http://initd.org/ that doesn't seem so bad at first--but if you're experienced, you know that the page has been in that disturbing state since late 2006. You start digging into the driver mess, figure you just need to learn how things fit together, but the hole keeps getting bigger as you dig. The issues here have already been identified: the Perl DBI is an excellent spec, while the Python one is so weak everybody ends up needing their own extensions to it. And then portability *even among Python PostgreSQL drivers* goes out the window. If somebody built a BSD/MIT licensed driver that replaces every useful feature of all the forks, with no major problems, and a couple of major projects switched over to it (think Skype level big), maybe this mess could get resolved. I think it would take someone already familiar with the major issues involved a couple of months of regular work to make any serious progress on it. Yes, it is going to take serious time, and it is going to take more than one person to be involved, but I think there are enough people who care that something serious can be done to improve its current state. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Confusion over Python drivers
On Fri, 2010-02-05 at 09:38 -0500, Bruce Momjian wrote: Wow, that is super-confusing. Agreed. Standardization among licenses is useful, and I think it's important to have a driver with a license that people already understand. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] WIP patch for system-catalog vacuuming via a relation map
Attached is the current state of my work on letting system catalogs be processed by new-style VACUUM FULL (a/k/a CLUSTER). I haven't done the WAL support nor worried about interlocking concurrent updates of relation map files, but it passes the regression tests and can do VACUUM FULL of every system catalog. So I think this is proof that the approach is workable and there are no further gotchas to stumble over in dark recesses of the system. The patch adds about 1250 lines of code, mostly in a new file relmapper.c, and will be bigger once WAL support is added. However, I anticipate that we will be able to rip out probably three times that much code by eliminating VACUUM FULL INPLACE (commands/vacuum.c is alone almost 4000 lines, though of course it won't all go away). So it will be a substantial net reduction in code size, plus we will get the previously discussed benefits such as removing reindex-in-place for shared indexes. Barring objections I'm going to press ahead with completing and committing this; then in a separate patch remove VACUUM FULL INPLACE. regards, tom lane bin6wpmdlL4Yo.bin Description: relation-mapper-1.patch.gz -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for system-catalog vacuuming via a relation map
Tom Lane wrote: Barring objections I'm going to press ahead with completing and committing this; then in a separate patch remove VACUUM FULL INPLACE. With the second patch, we will continue to support reading XVAC_MOVED_OUT and IN hint bits, but never set them, correct? -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for system-catalog vacuuming via a relation map
Alvaro Herrera alvhe...@commandprompt.com writes: Tom Lane wrote: Barring objections I'm going to press ahead with completing and committing this; then in a separate patch remove VACUUM FULL INPLACE. With the second patch, we will continue to support reading XVAC_MOVED_OUT and IN hint bits, but never set them, correct? Right, the tqual support has to stay for as long as we support binary upgrade from previous releases. (Unless we want to work out some way of ensuring no MOVED bits remain set in a database.) There is one heck of a lot of other infrastructure and complexity that can go away, though. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Confusion over Python drivers
On 2/5/10, Josh Berkus j...@agliodbs.com wrote: I think another difference is that the Perl DBI interface is very rich, whereas the Python DB-API is quite minimal and almost forces people to write (incompatible) extensions. The DB-SIG at Python that ought to drive all this is also quite dead, possibly because everyone has moved on to SQLAlchemy. For people who use Python a lot, could I have a list of the deficiencies in DBAPI? I've got my horse and lance ready. I took quick glance on DBI and compared to that, DB-API does not define specific api for: - resultset cursors - prepared plans - db structure examination OTOH, my guess is that DB-API authors assumed the first two are used transparently by the driver, and I see no reason why they cannot be, with current DB-API. Last item is rich indeed, but seems slightly overengineered.. Now, none of the differences between drivers and current confusion seem to be related to above points. Hysterical Raisins is only good explanation about current situation. The pg8000 / bpgsql seem to be toy projects, and anyway you dont want to use pure-Python drivers in high-performance environments. We are not talking about C#/java here. py-postgresql seems to be more serious, but as it's python3 only which makes it irrelevant today. PyGreSQL is the oldest, older than DB-API, and so it's DB-API interface seems an afterthought and is untested/underused - eg. it does not support bytea. Psycopg was the leader, especially in web-environments, but it has non-obvious license and with dead website it does not seem that attractive. Although it is well-maintained still. Best path forward would be to talk with Psycopg guys about license clarification/change. I suspect GPL does not extend anyway to Python code that imports it dynamically, and it does not seem to be their intention - they even try to tell it in their current clarification, but it is not as clear as it could be. -- marko -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for system-catalog vacuuming via a relation map
Barring objections I'm going to press ahead with completing and committing this; then in a separate patch remove VACUUM FULL INPLACE. Was it our determination that we could remove VFI if we eliminated the system catalogs? I'm fine with it, I just thought some people had a marginal use case for VFI. --Josh Berkus -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for system-catalog vacuuming via a relation map
Josh Berkus j...@agliodbs.com writes: Barring objections I'm going to press ahead with completing and committing this; then in a separate patch remove VACUUM FULL INPLACE. Was it our determination that we could remove VFI if we eliminated the system catalogs? I'm fine with it, I just thought some people had a marginal use case for VFI. I thought the consensus was to remove it if possible. There may still be some marginal use cases, but they don't justify the work that'd be needed to make it play safely with HS; let alone fixing the other longstanding gotchas with it, like the double-commit risk. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add on_plperl_init and on_plperlu_init to plperl UPDATE 3 [PATCH]
On Fri, Feb 5, 2010 at 06:40, Tim Bunce tim.bu...@pobox.com wrote: This is the third update to the fourth of the patches to be split out from the former 'plperl feature patch 1'. Changes in this patch: - Added plperl.on_plperl_init and plperl.on_plperlu_init GUCs Both are PGC_SUSET SPI functions are not available when the code is run. Errors are detected and reported as ereport(ERROR, ...) Corresponding documentation and tests for both. *sniffle* OK I think I agree with everyone else on setting this as PGC_SUSET until we can either prove it can be USERSET or we can fix it so we can check permissions on SET properly. It seems if you really wanted a user to be able to set it you should be able to define a SECURITY DEFINER function that sets it to a string you pass in or something. Obviously not part of core postgres... - Renamed plperl.on_perl_init to plperl.on_init *shrug* OK (I think there was still some flack on this var, but I think its ok-- we can discuss that in a separate thread if people still disagree :) ) - Improved state management of select_perl_context() An error during interpreter initialization will leave the state (interp_state etc) unchanged. - The utf8fix code has been greatly simplified. - More code comments re PGC_SUSET and no access to SPI functions. I like the doc changes and think the new comment about %_SHARED being unsafe is good. All looks good to me. Ill mark it as Ready for Committer -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for system-catalog vacuuming via a relation map
Tom Lane wrote: Josh Berkus j...@agliodbs.com writes: Barring objections I'm going to press ahead with completing and committing this; then in a separate patch remove VACUUM FULL INPLACE. Was it our determination that we could remove VFI if we eliminated the system catalogs? I'm fine with it, I just thought some people had a marginal use case for VFI. I thought the consensus was to remove it if possible. There may still be some marginal use cases, but they don't justify the work that'd be needed to make it play safely with HS; let alone fixing the other longstanding gotchas with it, like the double-commit risk. Oh, yea, I remember, HS killed it because of the work required to make FULL work for 9.0. Thanks. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plperl db access documentation enhancement
Alvaro Herrera wrote: Bruce Momjian wrote: Patch applied for 9.0. We don't normally backpatch such documentation improvements unless we receive multiple reports of confusion. I think that's a mistake in this case. The documentation wasn't confusing -- it was bogus. (Actually, the bug fixing is a smaller change than the whole of this patch, so we could provide the smaller patch if desired to apply to 8.4. To be honest I think it is better to just apply the larger patch verbatim.) I could do the backpatch if you want. I just wanted some more peer review on the changes. Sure, go ahead if you are sure. I wasn't clear enough to risk it, and documentation churn in back branches has its own downsides, which is why I avoid it, especially for larger patches. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Database-Role settings behaviour and docs mismatch
Tom Lane wrote: Simon Riggs si...@2ndquadrant.com writes: On Mon, 2010-02-01 at 20:11 -0300, Alvaro Herrera wrote: It'd probably be worth changing the order of the ApplySetting calls so that it doesn't look suspicious. Just a comment would be enough I think Yeah. Changing the order would mean that we'd do extra work applying and then removing conflicting settings. But the general principle here is that GUC settings coming from different places are resolved by source priority, not order of execution. C comment patch attached and applied. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + Index: src/backend/utils/init/postinit.c === RCS file: /cvsroot/pgsql/src/backend/utils/init/postinit.c,v retrieving revision 1.201 diff -c -c -r1.201 postinit.c *** src/backend/utils/init/postinit.c 15 Jan 2010 09:19:04 - 1.201 --- src/backend/utils/init/postinit.c 5 Feb 2010 20:25:38 - *** *** 855,860 --- 855,861 relsetting = heap_open(DbRoleSettingRelationId, AccessShareLock); + /* Later settings are ignored if set earlier. */ ApplySetting(databaseid, roleid, relsetting, PGC_S_DATABASE_USER); ApplySetting(InvalidOid, roleid, relsetting, PGC_S_USER); ApplySetting(databaseid, InvalidOid, relsetting, PGC_S_DATABASE); -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Mammoth in Core?
Joshua D. Drake wrote: On Tue, 2010-01-19 at 21:55 +0100, Markus Wanner wrote: Hi, So, that's what I'd recommend the Mammoth developers to do as well: cherry-picking, sort of. Maybe that fulfills one or the other item on our wish-list (in one way or another)... I doubt we are going to spend the time to do that. Mammoth is BSD and open source. If people want to jump in and help, that would be interesting, but on our own... it isn't on our priority list. Our priority list with mammoth is simple. Finish 1.9 Fix things we know the community as hackers (not feature set) will grump about. Make it work on 8.4 ad 8.5 But again, as Tom says... none of this is relevant until 8.5 is released at which point, we can talk about all of this again. Hopefully 1.9 will be done by then. I wanted to comment on this, however late I am. I think the writing is on the wall and it is now questionable if there is much community value to continue development of Mammoth Replicator, though we might want to pull some ideas from the code at some point. Company-sponsored development is a valuable resource for this community and I would love to see new things worked on. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] archive_timeout behavior for no activity
Fujii Masao wrote: On Fri, Jan 15, 2010 at 12:50 AM, Bruce Momjian br...@momjian.us wrote: Looking at the archive_timeout documentation and CheckArchiveTimeout(), it appears we force a new xlog file and archive it even if no activity has been recorded in the xlog file. ?Is this correct? No. CheckArchiveTimeout() doesn't switch WAL files if there is no activity after the last switch. In fact, though it calls RequestXLogSwitch(), the switch is skipped in XLogInsert() because we are exactly at the start of a file in that case. But unfortunately checkpoint would be often recorded between each switches. So the archive_timeout appears to always force a new WAL file. I have documented that increasing checkpoint_timeout can avoid WAL writes on idle systems with archive_timeout. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] archive_timeout behavior for no activity
Kevin Grittner wrote: Bruce Momjian br...@momjian.us wrote: Looking at the archive_timeout documentation and CheckArchiveTimeout(), it appears we force a new xlog file and archive it even if no activity has been recorded in the xlog file. Is this correct? Should we document this or fix it so only xlog files with contents are archived? Er, you can probably blame me for that. Tom was going to fix it and I pointed out that it would break our monitoring of our warm standby processes. We have a one hour maximum and send alerts if we've gone 75 minutes or more without receiving a WAL file from one of our databases. Of course, if we had a nicer way to know that we were up-to-date with our WAL file copies, we wouldn't need this; but right now there aren't a lot of options for monitoring these things. I am dismayed that we are using a 16MB file for monitoring archive activity. Can't you use pg_current_xlog_location() and only check for an archive file when that location changes? Anyway, I have updated the documentation with the attached patch to mention this issue, and added a C comment as well. Is there a TODO here? -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + Index: doc/src/sgml/config.sgml === RCS file: /cvsroot/pgsql/doc/src/sgml/config.sgml,v retrieving revision 1.249 diff -c -c -r1.249 config.sgml *** doc/src/sgml/config.sgml 3 Feb 2010 17:25:05 - 1.249 --- doc/src/sgml/config.sgml 5 Feb 2010 23:17:17 - *** *** 1739,1745 server to switch to a new WAL segment file periodically. When this parameter is greater than zero, the server will switch to a new segment file whenever this many seconds have elapsed since the last ! segment file switch. Note that archived files that are closed early due to a forced switch are still the same length as completely full files. Therefore, it is unwise to use a very short varnamearchive_timeout/ mdash; it will bloat your archive --- 1739,1749 server to switch to a new WAL segment file periodically. When this parameter is greater than zero, the server will switch to a new segment file whenever this many seconds have elapsed since the last ! segment file switch, and there has been any database activity, ! including a single checkpoint. (Increasing ! varnamecheckpoint_timeout/ will reduce unnecessary ! checkpoints on an idle system.) ! Note that archived files that are closed early due to a forced switch are still the same length as completely full files. Therefore, it is unwise to use a very short varnamearchive_timeout/ mdash; it will bloat your archive Index: src/backend/postmaster/bgwriter.c === RCS file: /cvsroot/pgsql/src/backend/postmaster/bgwriter.c,v retrieving revision 1.66 diff -c -c -r1.66 bgwriter.c *** src/backend/postmaster/bgwriter.c 15 Jan 2010 09:19:02 - 1.66 --- src/backend/postmaster/bgwriter.c 5 Feb 2010 23:17:21 - *** *** 543,549 /* * CheckArchiveTimeout -- check for archive_timeout and switch xlog files ! * if needed */ static void CheckArchiveTimeout(void) --- 543,552 /* * CheckArchiveTimeout -- check for archive_timeout and switch xlog files ! * ! * This will switch to a new WAL file and force an archive file write ! * if any activity is recorded in the current WAL file, including just ! * a single checkpoint record. */ static void CheckArchiveTimeout(void) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot standby documentation
Joshua Tolley wrote: -- Start of PGP signed section. Having concluded I really need to start playing with hot standby, I started looking for documentation on the subject. I found what I was looking for; I also found this page[1], which, it seems, ought to mention hot standby. Comments? [1] http://developer.postgresql.org/pgdocs/postgres/high-availability.html Ah, I now realize it only mentions warm standby, not hot, so I just updated the documentation to reflect that; you can see it here: http://momjian.us/tmp/pgsql/high-availability.html Warm and Hot Standby Using Point-In-Time Recovery (PITR) Do we want to call the feature hot standby? Is a read-only standby a standby or a slave? -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Confusion over Python drivers
Imho a big problem is that it does way too much itself - i.e. it does not use things like PQExecParams but does escaping/parsing itself... Other people may think thats a good idea - I definitely do not think so. It also has issues with transaction control which cause idle transactions if the Django front-end times out due to load. --Josh Berkus -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] VAC FULL/CLUSTER on system catalogs is prone to deadlock
I discovered $subject while continuing to test my patch. The problem occurs because many parts of the backend are in the habit of releasing lock on system catalogs as soon as they've updated whatever they wanted to update. This means that we can encounter INSERT_IN_PROGRESS or DELETE_IN_PROGRESS tuples, even though we have exclusive lock on the table. That's no big problem as far as copy_heap_data goes; it can just transfer the rows anyway. However, when we get to the point of reindexing the catalog, IndexBuildHeapScan will wait for commit of whichever transaction did the deed. Since we're still holding exclusive lock on the catalog, it's entirely likely for that transaction to need to read or write the catalog again, whereupon deadlock. There are a couple of things we could try to do about this, other than just live with it: 1. Try to avoid releasing locks early in system catalog operations. The trouble with this is that it would inevitably create deadlock failures of its own, due to adding lock-upgrade situations where there were none before. That is, instead of patterns like Acquire(AccessShareLock), Release(AccessShareLock), Acquire(RowExclusiveLock), then Release(RowExclusiveLock), we'd have Acquire(AccessShareLock), Acquire(RowExclusiveLock), then release; which could deadlock against a parallel acquisition of higher-level locks. Maybe we could make it work reasonably well if we release AccessShareLock quickly after a read but always hold RowExclusiveLock once acquired. Not sure. 2. In VAC FULL and CLUSTER, tell index rebuild not to wait around for INSERT_IN_PROGRESS or DELETE_IN_PROGRESS tuples. I believe that the price of this would be not re-verifying the integrity of unique indexes. Which is kind of annoying, but then again rechecking data consistency is not the purpose of these commands. It would not be too unreasonable to tell people to use REINDEX if they suspect the validity of an index --- that takes a significantly weaker lock and wouldn't be nearly as big a problem to use on system catalogs. 3. Try to ameliorate the problem by making sure it's the VACUUM FULL that fails, and not the other process, when there's a deadlock. BTW, the reason we don't see this type of problem with the existing VAC FULL INPLACE logic is that it essentially silently adopts #2 --- it suppresses uniqueness checks even when it's making new index insertions. At the moment #2 is seeming like the most attractive alternative; both of the others look like research projects. Comments? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] VAC FULL/CLUSTER on system catalogs is prone to deadlock
Tom Lane wrote: 2. In VAC FULL and CLUSTER, tell index rebuild not to wait around for INSERT_IN_PROGRESS or DELETE_IN_PROGRESS tuples. I believe that the price of this would be not re-verifying the integrity of unique indexes. Which is kind of annoying, but then again rechecking data consistency is not the purpose of these commands. It would not be too unreasonable to tell people to use REINDEX if they suspect the validity of an index --- that takes a significantly weaker lock and wouldn't be nearly as big a problem to use on system catalogs. 3. Try to ameliorate the problem by making sure it's the VACUUM FULL that fails, and not the other process, when there's a deadlock. BTW, the reason we don't see this type of problem with the existing VAC FULL INPLACE logic is that it essentially silently adopts #2 --- it suppresses uniqueness checks even when it's making new index insertions. At the moment #2 is seeming like the most attractive alternative; both of the others look like research projects. Comments? Yea, #2. No one has complained about this behavior in the past, as I remember. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for system-catalog vacuuming via a relation map
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: Tom Lane wrote: I thought the consensus was to remove it if possible. There may still be some marginal use cases, but they don't justify the work that'd be needed to make it play safely with HS; let alone fixing the other longstanding gotchas with it, like the double-commit risk. I think part of the plan was to also provide an online reorg tool that works by doing dummy UPDATEs, which means that you can get serialization errors in serializable mode, but doesn't need to lock the table. Yeah. There's a good deal of interest in incremental/partial vacuuming. But that wouldn't make use of the existing VFI infrastructure either. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Confusion over Python drivers
On Friday 05 February 2010 21:34:53 Marko Kreen wrote: On 2/5/10, Josh Berkus j...@agliodbs.com wrote: I think another difference is that the Perl DBI interface is very rich, whereas the Python DB-API is quite minimal and almost forces people to write (incompatible) extensions. The DB-SIG at Python that ought to drive all this is also quite dead, possibly because everyone has moved on to SQLAlchemy. For people who use Python a lot, could I have a list of the deficiencies in DBAPI? I've got my horse and lance ready. Psycopg was the leader, especially in web-environments, but it has non-obvious license and with dead website it does not seem that attractive. Although it is well-maintained still. Imho a big problem is that it does way too much itself - i.e. it does not use things like PQExecParams but does escaping/parsing itself... Other people may think thats a good idea - I definitely do not think so. Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Confusion over Python drivers
On 2/5/10, Greg Smith g...@2ndquadrant.com wrote: The issues here have already been identified: the Perl DBI is an excellent spec, while the Python one is so weak everybody ends up needing their own extensions to it. And then portability *even among Python PostgreSQL drivers* goes out the window. Well, no. You are overplaying on the DBI angle. If the driver can offer via the minimal DB-API the same features the DBI driver can offer, then the DBI API is bloated, not the minimal API weak... Not that DB-API is perfect, a way to toggle prepare and db cursors on/off in per-query basis would be good to have, but thats mostly it. Rest of extra features (COPY) need to be done as extensions anyway. But all that is off-topic to the current driver suckiness. The problems with Python drivers are that they fail to implement even the minimal DB-API 2.0: - correctly, across all interesting data types. PygreSQL (bytea). pg8000 (does rollback after query fails - WTF). None of the drivers seem to support both array and tuple. - robustly. Psycopg2 - their misguided attempt to turn DB-API calls optionally async cost their stability big-time as it complexified their codebase. It got stable around 2.0.[678]. I'm not holding my breath on the other ones. Fear the fancy API-s. - with threadsafety level 1. (I don't care, but seems in some circles it's necessary.) Note how many ways a driver can suck without stepping above basic DB-API 2.0. If somebody built a BSD/MIT licensed driver that replaces every useful feature of all the forks, with no major problems, and a couple of major projects switched over to it (think Skype level big), maybe this mess could get resolved. I think it would take someone already familiar with the major issues involved a couple of months of regular work to make any serious progress on it. Yeah, but the problem here is that there actually is not a problem. Psycopg license is not bad, just confusing on first sight. And website problem can be fix by advertising http://wiki.postgresql.org/wiki/Psycopg as new website... If your requirement is BSD license, then probably getting PyGreSQL into shape would be faster than writing from scratch. Main dev effort would be writing Postgres-specific DB-API 2.0 testsuite anyway. It's likely that actual improvements needed would be small. Also I would suggest to match Psycopg extensions API, that would make it usable to much broader audience. (Eg: PygreSql does not offer COPY extensions via the DB-API 2.0 API. It would be good to match Psycopg here, instead inventing new API.) -- marko -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Hot Standby and DROP DATABASE
Hi, While testing Hot Standby, I have encountered strange behavior with DROP DATABASE command. 1) connect to test database at standby via psql 2) issue DROP DATABASE test command to primary 3) session #1 works fine 4) close session #1 5) test database dropped on standby Fromt the manual: Running DROP DATABASE, ALTER DATABASE ... SET TABLESPACE, or ALTER DATABASE ... RENAME on primary will generate a log message that will cause all users connected to that database on the standby to be forcibly disconnected. This action occurs immediately, whatever the setting of max_standby_delay. So it seems at least the behavior is quite different from what the docs stats. Am I missing something here? -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Confusion over Python drivers
On Feb 5, 2010, at 1:34 PM, Marko Kreen wrote: py-postgresql seems to be more serious, but as it's python3 only which makes it irrelevant today. Furthermore, if it did work on python2, it's *not* something that's going to appeal to mainstream users (Python heavy web frameworks) as it *partially* suffers from the same problem that pg8000 does. It's mostly pure-Python, but it has some C optimizations(notably, PQ message buffer). I have done some profiling, and *with a few tweaks* it's about 2x-3x *slower than psycopg2* for the retrieval of a single int column. I think it could go faster, but I don't think it's worth the work. ISTM that the target audience are folk who are married to PG, and are generally unhappy with DB-API, but do not want to buy into a big abstraction layer like SQLAlchemy. Sure, it supports DB-API like other drivers so it *would be* usable with frameworks, but why take the 3x *or greater* hit over a properly implemented libpq version? Finally, I just don't see the existing (often PG specific) goals that I have in mind for it appealing to the majority of [web framework/abstraction] users. Psycopg was the leader, especially in web-environments, but it has non-obvious license and with dead website it does not seem that attractive. Although it is well-maintained still. Best path forward would be to talk with Psycopg guys about license clarification/change. Yep. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Confusion over Python drivers
On Feb 5, 2010, at 11:34 AM, Josh Berkus wrote: For people who use Python a lot, could I have a list of the deficiencies in DBAPI? I've got my horse and lance ready. Given that SQLAlchemy isn't for everyone, of course ... it couldn't be, or Django would use it, no? Here are some to start with: - paramstyle - No direct support of prepared statements [they *tried* to compensate for this with cached statements, but it's inadequate] - Too many *optional* extensions (Cursor Method .__iter__() being rather notable) http://www.python.org/dev/peps/pep-0249/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Confusion over Python drivers
On Feb 5, 2010, at 8:00 AM, Peter Eisentraut wrote: I think another difference is that the Perl DBI interface is very rich, whereas the Python DB-API is quite minimal and almost forces people to write (incompatible) extensions. Yep. The DB-SIG at Python that ought to drive all this is also quite dead, From reading messages that come into that list(not-so-lately), I've gotten the impression that they are content with DB-API 2. Aside from the TPC interfaces, the last round of activity that I know of was dbapi3[1 (which was a long while ago)]. possibly because everyone has moved on to SQLAlchemy. Yeah. AFAICT, that is the general direction encouraged by DB-SIG. Write an abstraction on top of DB-API. SQLAlchemy, SQLObject, anysql, and I'm sure there are others. [1] http://wiki.python.org/moin/Aug2001DbApi3Strawman http://wiki.python.org/moin/DbApi3 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: Faster CREATE DATABASE by delaying fsync (was 8.4.1 ubuntu karmic slow createdb)
Andres Freund wrote: On 02/03/10 14:42, Robert Haas wrote: Well, maybe we should start with a discussion of what kernel calls you're aware of on different platforms and then we could try to put an API around it. In linux there is sync_file_range. On newer Posixish systems one can emulate that with mmap() and msync() (in batches obviously). No idea about windows. There's a series of parameters you can pass into CreateFile: http://msdn.microsoft.com/en-us/library/aa363858(VS.85).aspx A lot of these are already mapped inside of src/port/open.c in a pretty straightforward way from the POSIX-oriented interface: O_RDWR,O_WRONLY - GENERIC_WRITE, GENERIC_READ O_RANDOM - FILE_FLAG_RANDOM_ACCESS O_SEQUENTIAL - FILE_FLAG_SEQUENTIAL_SCAN O_SHORT_LIVED - FILE_ATTRIBUTE_TEMPORARY O_TEMPORARY - FILE_FLAG_DELETE_ON_CLOSE O_DIRECT - FILE_FLAG_NO_BUFFERING O_DSYNC - FILE_FLAG_WRITE_THROUGH You have to read the whole Caching Behavior section to see exactly how all of those interact, and even then notes like http://support.microsoft.com/kb/99794 are needed to follow the fine points of things like FILE_FLAG_NO_BUFFERING vs. FILE_FLAG_WRITE_THROUGH. So anything that's setting those POSIX open flags better than before is getting the benefit of that improvement on Windows, too. But that's not quite the same as the changes using fadvise to provide better targeted cache control hints. I'm getting the impression that doing much better on Windows might fall into the same sort of category as Solaris, where the primary interface for this sort of thing is to use an AIO implementation instead: http://msdn.microsoft.com/en-us/library/aa365683(VS.85).aspx The effective_io_concurrency feature had proof of concept test programs that worked using AIO, but actually following through on that implementation would require a major restructuring of how the database interacts with the OS in terms of reads and writes of blocks. It looks to me like doing something similar to sync_file_range on Windows would be similarly difficult. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Reading deleted records - PageHeader v3
Jonathan Bond-Caron jbo...@gmail.com writes: I think part of my problem is I haven't really understood what 'Then make sure you have the right alignment' means. My approach currently is: After reading HeapTupleHeaderData (23 bytes), I advance another 4 bytes (hoff) and try to read a 32 bit integer (first attribute). No. First you start at the tuple beginning plus the number of bytes indicated by hoff (which should be at least 24). The first field will always be right there, because this position is always maximally aligned. For subsequent fields you have to advance to a multiple of the alignment requirement of the datatype. For example, assume the table's first column is of type bool (1 byte) and the second column is of type integer. The bool will be at offset hoff, but the integer will be at offset hoff + 4 ... it can't immediately follow the bool, at offset hoff + 1, because that position isn't correctly aligned. It has to start at the next offset that's a multiple of 4. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [CFReview] Red-Black Tree
2010/2/4 Teodor Sigaev teo...@sigaev.ru: Oleg's test (http://www.sai.msu.su/~megera/wiki/rbtree_test) are made with v0.10 which is differ from 0.11 only by comments around ginInsertRecordBA() That looks pretty good. I confess I don't fully understand why it works. If we're inserting a bunch of equal-key entries, why does it matter what order we insert them in? Is there some code in here (where?) that breaks ties on the basis of where they are in the input data? I think that the code in ginInsertRecordBA() is needlessly complex. As far as I can see, nNodesOnCurrentLevel is always exactly one more than nNodesOnPreviousLevel, and I think step is also basically redundant with both of these although the relationship is a little more complex. What I would suggest is something like: - initialize step to the largest power of 2 s.t. step nentry - while step 0 -- for (i = step; true; i += 2 * step) --- insert entry #i-1 --- if i nentry - (2 * step) /* must test before incrementing i, to guard against overflow */ break -- step = step / 2 Typos: bunary - binary This insertion order decreases number of rebalancing for tree - should be number of rebalancings castomized - customized ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers