[HACKERS] Reading deleted records - PageHeader v3

2010-02-05 Thread Jonathan Bond-Caron
Hi, 

 

So first I'm a pgsql hacker newbie and I've been reading up on the storage
structure: 

http://www.postgresql.org/docs/8.2/interactive/storage-page-layout.html

 

I'm trying to recover deleted records from a page file (postgresql 8.2) :
i.e. base/dbId/20132

 

I am able to successfully read all the header data I need (PageHeaderData,
ItemIdData , HeapTupleHeaderData) 

but I hit a wall when I try to start reading user data. 

 

This has helped:

http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/include/postgres.h?rev=1.
77;content-type=text%2Fplain

 

I've read and understood fairly well how varlena structures are stored
(plain, compressed, external/toast) but so far I can't seem to read a plain
inline value.

I think part of my problem is I haven't really understood what 'Then make
sure you have the right alignment' means. 

 

My approach currently is:

After reading HeapTupleHeaderData (23 bytes), I advance another 4 bytes
(hoff) and try to read a 32 bit integer (first attribute).

 

I am expecting to get an integer value 1 but I get 512 .

 

Am I doing this wrong? 

Could someone point me to the pgsql code pieces I should be looking at? 

 

If useful, this is the information I have before reading the 'user data':

 

object(PostgreSQL_HeapTupleHeaderData)#14 (7) {

  [xmin]=

  string(5) 13824

  [xmax]=

  string(1) 0

  [cid]=

  string(1) 0

  [ctid]=

  object(PostgreSQL_ItemPointerData)#16 (2) {

[blockId]=

string(1) 0

[posId]=

int(0)

  }

  [infomask2]=

  int(0)

  [infomask]=

  int(2)

  [hoff]=

  int(4)

}

object(PostgreSQL_Attribute)#7 (6) {

  [name]=

  string(7) book_id

  [relid]=

  int(20132)

  [len]=

  int(4)

  [num]=

  int(1)

  [ndims]=

  int(0)

  [align]=

  string(1) i

}

array(1) {

  [book_id]=

  int(512)

}



[HACKERS] Add on_plperl_init and on_plperlu_init to plperl UPDATE 3 [PATCH]

2010-02-05 Thread Tim Bunce
This is the third update to the fourth of the patches to be split out
from the former 'plperl feature patch 1'.

Changes in this patch:

- Added plperl.on_plperl_init and plperl.on_plperlu_init GUCs
Both are PGC_SUSET
SPI functions are not available when the code is run.
Errors are detected and reported as ereport(ERROR, ...)
Corresponding documentation and tests for both.

- Renamed plperl.on_perl_init to plperl.on_init

- Improved state management of select_perl_context()
An error during interpreter initialization will leave
the state (interp_state etc) unchanged.

- The utf8fix code has been greatly simplified.

- More code comments re PGC_SUSET and no access to SPI functions.

Tim.

diff --git a/doc/src/sgml/plperl.sgml b/doc/src/sgml/plperl.sgml
index 7018624..0999bd0 100644
*** a/doc/src/sgml/plperl.sgml
--- b/doc/src/sgml/plperl.sgml
*** $$ LANGUAGE plperl;
*** 748,753 
--- 748,758 
 literalreturn $_SHARED{myquote}-gt;($_[0]);/literal
 at the expense of readability.)
/para
+ 
+   para
+ 	The varname%_SHARED/varname variable, and other global state within
+ 	the language, should be considered insecure in a multi-user database.
+   /para
   /sect1
  
   sect1 id=plperl-trusted
*** CREATE TRIGGER test_valid_id_trig
*** 1044,1057 
  
variablelist
  
!  varlistentry id=guc-plperl-on-perl-init xreflabel=plperl.on_perl_init
!   termvarnameplperl.on_perl_init/varname (typestring/type)/term
indexterm
!primaryvarnameplperl.on_perl_init/ configuration parameter/primary
/indexterm
listitem
 para
!Specifies perl code to be executed when a perl interpreter is first initialized.
 The SPI functions are not available when this code is executed.
 If the code fails with an error it will abort the initialization of the interpreter
 and propagate out to the calling query, causing the current transaction
--- 1049,1063 
  
variablelist
  
!  varlistentry id=guc-plperl-on-init xreflabel=plperl.on_init
!   termvarnameplperl.on_init/varname (typestring/type)/term
indexterm
!primaryvarnameplperl.on_init/ configuration parameter/primary
/indexterm
listitem
 para
!Specifies perl code to be executed when a perl interpreter is first initialized
!and before it is specialized for use by literalplperl/ or literalplperlu/.
 The SPI functions are not available when this code is executed.
 If the code fails with an error it will abort the initialization of the interpreter
 and propagate out to the calling query, causing the current transaction
*** CREATE TRIGGER test_valid_id_trig
*** 1059,1069 
 /para
 para
 The perl code is limited to a single string. Longer code can be placed
!into a module and loaded by the literalon_perl_init/ string.
 Examples:
  programlisting
! plplerl.on_perl_init = '$ENV{NYTPROF}=start=no; require Devel::NYTProf::PgPLPerl'
! plplerl.on_perl_init = 'use lib /my/app; use MyApp::PgInit;'
  /programlisting
 /para
 para
--- 1065,1075 
 /para
 para
 The perl code is limited to a single string. Longer code can be placed
!into a module and loaded by the literalon_init/ string.
 Examples:
  programlisting
! plplerl.on_init = '$ENV{NYTPROF}=start=no; require Devel::NYTProf::PgPLPerl'
! plplerl.on_init = 'use lib /my/app; use MyApp::PgInit;'
  /programlisting
 /para
 para
*** plplerl.on_perl_init = 'use lib /my/app
*** 1077,1082 
--- 1083,1134 
/listitem
   /varlistentry
  
+  varlistentry id=guc-plperl-on-plperl-init xreflabel=plperl.on_plperl_init
+   termvarnameplperl.on_plperl_init/varname (typestring/type)/term
+   indexterm
+primaryvarnameplperl.on_plperl_init/ configuration parameter/primary
+   /indexterm
+   listitem
+para
+Specifies perl code to be executed when the literalplperl/ language
+is first used in a session.  Changes made after the literalplperl/
+language has been used will have no effect.
+The perl code can only perform trusted operations.
+The SPI functions are not available when this code is executed.
+/para
+para
+If the code fails with an error it will abort the initialization and
+propagate out to the calling query, causing the current transaction or
+subtransaction to be aborted. Any changes within perl won't be undone.
+If the literalplperl/ language is used again the
+initialization will be repeated.
+/para
+   /listitem
+  /varlistentry
+ 
+  varlistentry id=guc-plperl-on-untrusted-init xreflabel=plperl.on_untrusted_init
+   termvarnameplperl.on_untrusted_init/varname (typestring/type)/term
+   indexterm
+

[HACKERS] Failed assertion during recovery of partial WAL file

2010-02-05 Thread Martin Pihlak
Encountered the following FailedAssertion while testing database
recovery (actually this would be HS) with partial WAL file:

LOG:  restored log file 00010003 from archive
LOG:  consistent recovery state reached at 0/3001EEC
LOG:  record with zero length at 0/3001EEC
LOG:  redo done at 0/3001D68
LOG:  last completed transaction was at log time 2010-02-05 11:02:49.695544+02
LOG:  database system is ready to accept read only connections
LOG:  selected new timeline ID: 3
TRAP: FailedAssertion(!(readFile = 0), File: xlog.c, Line: 5117)

and the assorted backtrace:
#0  0xb7f2e410 in __kernel_vsyscall ()
#1  0xb7dc9085 in raise () from /lib/tls/i686/cmov/libc.so.6
#2  0xb7dcaa01 in abort () from /lib/tls/i686/cmov/libc.so.6
#3  0x0834926e in ExceptionalCondition (conditionName=0x8389a82 !(readFile = 
0),
errorType=0x837d5b4 FailedAssertion,
fileName=0x8390338 xlog.c, lineNumber=5117) at assert.c:57
#4  0x080dde34 in exitArchiveRecovery (endTLI=1, endLogId=0, endLogSeg=18) at 
xlog.c:5117
#5  0x080e3eab in StartupXLOG () at xlog.c:6029
#6  0x080e6055 in StartupProcessMain () at xlog.c:8666
#7  0x08106a30 in AuxiliaryProcessMain (argc=2, argv=0xbfebdd64) at 
bootstrap.c:412
#8  0x08253f0c in StartChildProcess (type=StartupProcess) at postmaster.c:4340
#9  0x0825669e in PostmasterMain (argc=3, argv=0x8516e08) at postmaster.c:1078
#10 0x081f7919 in main (argc=3, argv=0x8516e08) at main.c:188

The crash happened on a HS slave which was fed a partial WAL file. The
partial was constructed by extracting data up to pg_current_xlog_location()
and zero padding it up to 16MB.

This only seems to be happening on HEAD - quick tests indicate that both
8.3 and 8.4 are not affected (or maybe I didn't try hard enough).

regards,
Martin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Confusion over Python drivers

2010-02-05 Thread Bruce Momjian
My son has brought to my attention that our current crop of Python
client libraries is inadequate/confusing.  I took a look myself, and
asked on our IRC channel, and am now convinced this area needs
attention.

First, the sheer number of libraries is confusing.  This is the page
from the Postgres wiki:

http://wiki.postgresql.org/wiki/Python

The first one listed, Psycopg, is noted as preferred libpq-based
driver, but the license is GPL.  Isn't that a problem for many client
applications?

The next one, PyGreSQL, is BSD licensed, but only has documentation for
the classic interface.  The DB-API module says about documentation:

http://www.pygresql.org/pgdb.html

This section of the documentation still needs to be written.

The other three are pure Python drivers, which I guess can be good, but
why three, and then there are three more listed as obsolete/stalled.

Clearly something is wrong here.

The Python-hosted PostgreSQL page has similar problems:

http://wiki.python.org/moin/PostgreSQL

Does Perl have a similar mess?

While I realize experienced people can easily navigate this confusion, I
am concerned about new Postgres adopters being very confused by this and
it is hurting our database adoption in general.  

What is really needed is for someone to take charge of one of these
projects and make a best-of-breed Python driver that can gain general
acceptance as our preferred driver.  I feel Python is too important a
language to be left in this state.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Personal Copyright Notices

2010-02-05 Thread Greg Stark
On Fri, Feb 5, 2010 at 3:24 AM, Bruce Momjian br...@momjian.us wrote:
 Bruce Momjian wrote:
 The intagg copyright is on a _Makefile_:

       # Makefile for integer aggregator
       # Copyright (C) 2001 Digital Music Network.
       # by Mark L. Woodward
       # $PostgreSQL: pgsql/contrib/intagg/Makefile,v 1.10 2008/11/14 
 19:58:45 tgl Exp $

 Seems we either have to contact the author or rewrite the file.

 I have rewritten the above file from scratch to enable removal of the
 copyright mention.


Uhm, you rewrote it and it turned out byte-for-byte identical except
for the removed copyright comments? Actually the file is trivial
enough -- it's just our standard pgxs template with the two filenames
inserted -- that that's plausible. I think that indicates that it
wasn't really something copyrightable to begin with.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Personal Copyright Notices

2010-02-05 Thread Bruce Momjian
Greg Stark wrote:
 On Fri, Feb 5, 2010 at 3:24 AM, Bruce Momjian br...@momjian.us wrote:
  Bruce Momjian wrote:
  The intagg copyright is on a _Makefile_:
 
  ? ? ? # Makefile for integer aggregator
  ? ? ? # Copyright (C) 2001 Digital Music Network.
  ? ? ? # by Mark L. Woodward
  ? ? ? # $PostgreSQL: pgsql/contrib/intagg/Makefile,v 1.10 2008/11/14 
  19:58:45 tgl Exp $
 
  Seems we either have to contact the author or rewrite the file.
 
  I have rewritten the above file from scratch to enable removal of the
  copyright mention.
 
 
 Uhm, you rewrote it and it turned out byte-for-byte identical except
 for the removed copyright comments? Actually the file is trivial
 enough -- it's just our standard pgxs template with the two filenames
 inserted -- that that's plausible. I think that indicates that it
 wasn't really something copyrightable to begin with.

Right, I copied /contrib/intarray/Makefile and modified it.  If it turns
out identical, so be it.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread Massa, Harald Armin
Bruce,

   http://wiki.postgresql.org/wiki/Python

 The first one listed, Psycopg, is noted as preferred libpq-based
 driver, but the license is GPL.  Isn't that a problem for many client
 applications?

 The licence of psycopg2 is a little more complicated; the GPL in that
summary just tries to sum it. The actual licence is: [from the LICENCE in
the source distribution]

-
psycopg and the GPL
===

psycopg is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version. See file COPYING for details.

As a special exception, specific permission is granted for the GPLed
code in this distribition to be linked to OpenSSL and PostgreSQL libpq
without invoking GPL clause 2(b).

Note that the GPL was chosen to avoid proprietary adapters based on
psycopg code. Using psycopg in a proprietary product (even bundling
psycopg with the proprietary product) is fine as long as:

 1. psycopg is called from Python only using only the provided API
(i.e., no linking with C code and no C modules based on it); and

 2. all the other points of the GPL are respected (you offer a copy
of psycopg's source code, and so on.)

Alternative licenses


If you prefer you can use the Zope Database Adapter ZPsycopgDA (i.e.,
every file inside the ZPsycopgDA directory) user the ZPL license as
published on the Zope web site, http://www.zope.org/Resources/ZPL.

Also, the following BSD-like license applies (at your option) to the
files following the pattern psycopg/adapter*.{h,c} and
psycopg/microprotocol*.{h,c}:

 Permission is granted to anyone to use this software for any purpose,
 including commercial applications, and to alter it and redistribute it
 freely, subject to the following restrictions:

 1. The origin of this software must not be misrepresented; you must not
claim that you wrote the original software. If you use this
software in a product, an acknowledgment in the product documentation
would be appreciated but is not required.

 2. Altered source versions must be plainly marked as such, and must not
be misrepresented as being the original software.

 3. This notice may not be removed or altered from any source distribution.

psycopg is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

Proprietary licenses


A non-exclusive license is available for companies that want to include
psycopg in their proprietary products without respecting the spirit of the
GPL. The price of the license is one day of development done by the author,
at the consulting fee he applies to his usual customers at the day of the
request.

-

Harald

-- 
GHUM Harald Massa
persuadere et programmare
Harald Armin Massa
Spielberger Straße 49
70435 Stuttgart
0173/9409607
no fx, no carrier pigeon
-
%s is too gigantic of an industry to bend to the whims of reality


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread Bruce Momjian
Massa, Harald Armin wrote:
 Bruce,
 
http://wiki.postgresql.org/wiki/Python
 
  The first one listed, Psycopg, is noted as preferred libpq-based
  driver, but the license is GPL.  Isn't that a problem for many client
  applications?
 
  The licence of psycopg2 is a little more complicated; the GPL in that
 summary just tries to sum it. The actual licence is: [from the LICENCE in
 the source distribution]

Wow, that is super-confusing.  I am dealing with an issue now that some
companies are concerned that individual names appear with our own BSD
copyrights in small files.   That pales in comparison to this copyright.
I would never be able to recommend software with that complex a
copyright.

I think the community needs to look at other alternatives.

(I am amazed that not only are there a confusing number of drivers, but
there are a confusing number of copyright options for this single driver.)

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread Peter Eisentraut
On fre, 2010-02-05 at 09:19 -0500, Bruce Momjian wrote:
 While I realize experienced people can easily navigate this confusion,
 I
 am concerned about new Postgres adopters being very confused by this
 and
 it is hurting our database adoption in general.  
 
 What is really needed is for someone to take charge of one of these
 projects and make a best-of-breed Python driver that can gain general
 acceptance as our preferred driver.  I feel Python is too important a
 language to be left in this state.

The situation is unfortunate, but you might as well argue that too many
Linux desktops or Linux distributions confuse new users and hurt
adoption.  These alternatives all exist for a reason, and it will be
difficult to get some of them to abandon their work with the aim of
reducing the overall confusion.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread Tim Bunce
On Fri, Feb 05, 2010 at 09:19:26AM -0500, Bruce Momjian wrote:
 My son has brought to my attention that our current crop of Python
 client libraries is inadequate/confusing.  I took a look myself, and
 asked on our IRC channel, and am now convinced this area needs
 attention.
 
   http://wiki.postgresql.org/wiki/Python
 The Python-hosted PostgreSQL page has similar problems:
   http://wiki.python.org/moin/PostgreSQL
 
 Does Perl have a similar mess?

I don't think so.

The primary database interface is DBI and as far as I can see there's
only one DBI PostgreSQL driver: http://search.cpan.org/dist/DBD-Pg/

The only non-DBI interfaces I could find (by skimming the 384 results
for postgresql on search.cpan.org) were:
Postgres: http://search.cpan.org/dist/Postgres/last updated in 1998.
Pg:   http://search.cpan.org/dist/pgsql_perl5/ last updated in 2000.

Tim.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread Bruce Momjian
Peter Eisentraut wrote:
 On fre, 2010-02-05 at 09:19 -0500, Bruce Momjian wrote:
  While I realize experienced people can easily navigate this confusion,
  I
  am concerned about new Postgres adopters being very confused by this
  and
  it is hurting our database adoption in general.  
  
  What is really needed is for someone to take charge of one of these
  projects and make a best-of-breed Python driver that can gain general
  acceptance as our preferred driver.  I feel Python is too important a
  language to be left in this state.
 
 The situation is unfortunate, but you might as well argue that too many
 Linux desktops or Linux distributions confuse new users and hurt
 adoption.  These alternatives all exist for a reason, and it will be
 difficult to get some of them to abandon their work with the aim of
 reducing the overall confusion.

Agreed, but can't we do a better job of steering new users and improving
the novice experience for the most popular of them, i.e. think Ubuntu.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread Bruce Momjian
Tim Bunce wrote:
 On Fri, Feb 05, 2010 at 09:19:26AM -0500, Bruce Momjian wrote:
  My son has brought to my attention that our current crop of Python
  client libraries is inadequate/confusing.  I took a look myself, and
  asked on our IRC channel, and am now convinced this area needs
  attention.
  
  http://wiki.postgresql.org/wiki/Python
  The Python-hosted PostgreSQL page has similar problems:
  http://wiki.python.org/moin/PostgreSQL
  
  Does Perl have a similar mess?
 
 I don't think so.
 
 The primary database interface is DBI and as far as I can see there's
 only one DBI PostgreSQL driver: http://search.cpan.org/dist/DBD-Pg/
 
 The only non-DBI interfaces I could find (by skimming the 384 results
 for postgresql on search.cpan.org) were:
 Postgres: http://search.cpan.org/dist/Postgres/last updated in 1998.
 Pg:   http://search.cpan.org/dist/pgsql_perl5/ last updated in 2000.

Yes, that's what I thought, and am glad to here it.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Remove tabs in SGML.

2010-02-05 Thread Bruce Momjian
Peter Eisentraut wrote:
 On tis, 2010-01-26 at 10:20 -0800, David Fetter wrote:
  On Tue, Jan 26, 2010 at 02:21:29PM +, Bruce Momjian wrote:
   Log Message:
   ---
   Remove tabs in SGML.
  
  Can we see about making a commit hook for CVS that disallows \t in
  SGML files?  The process in git is pretty simple.
 
 Better write a check in the makefile.

Done, and attached.  The rule is more shell script than makefile, but I
am not sure how else to do it.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: doc/src/sgml/Makefile
===
RCS file: /cvsroot/pgsql/doc/src/sgml/Makefile,v
retrieving revision 1.135
diff -c -c -r1.135 Makefile
*** doc/src/sgml/Makefile	6 Jan 2010 19:07:05 -	1.135
--- doc/src/sgml/Makefile	5 Feb 2010 14:49:26 -
***
*** 345,350 
--- 345,352 
  
  endif # sqlmansectnum != 7
  
+ check-tabs:
+ 	( ! grep '	' $(wildcard $(srcdir)/*.sgml $(srcdir)/ref/*.sgml) ) || (echo Tabs appear in SGML files;  exit 1)
  
  ##
  ## Clean

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread Peter Eisentraut
On fre, 2010-02-05 at 14:45 +, Tim Bunce wrote:
  Does Perl have a similar mess?
 
 I don't think so.
 
 The primary database interface is DBI and as far as I can see there's
 only one DBI PostgreSQL driver: http://search.cpan.org/dist/DBD-Pg/

I think another difference is that the Perl DBI interface is very rich,
whereas the Python DB-API is quite minimal and almost forces people to
write (incompatible) extensions.  The DB-SIG at Python that ought to
drive all this is also quite dead, possibly because everyone has moved
on to SQLAlchemy.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Failed assertion during recovery of partial WAL file

2010-02-05 Thread Fujii Masao
On Fri, Feb 5, 2010 at 10:46 PM, Martin Pihlak martin.pih...@gmail.com wrote:
 Encountered the following FailedAssertion while testing database
 recovery (actually this would be HS) with partial WAL file:

 LOG:  restored log file 00010003 from archive
 LOG:  consistent recovery state reached at 0/3001EEC
 LOG:  record with zero length at 0/3001EEC
 LOG:  redo done at 0/3001D68
 LOG:  last completed transaction was at log time 2010-02-05 11:02:49.695544+02
 LOG:  database system is ready to accept read only connections
 LOG:  selected new timeline ID: 3
 TRAP: FailedAssertion(!(readFile = 0), File: xlog.c, Line: 5117)

 and the assorted backtrace:
 #0  0xb7f2e410 in __kernel_vsyscall ()
 #1  0xb7dc9085 in raise () from /lib/tls/i686/cmov/libc.so.6
 #2  0xb7dcaa01 in abort () from /lib/tls/i686/cmov/libc.so.6
 #3  0x0834926e in ExceptionalCondition (conditionName=0x8389a82 !(readFile 
 = 0),
 errorType=0x837d5b4 FailedAssertion,
    fileName=0x8390338 xlog.c, lineNumber=5117) at assert.c:57
 #4  0x080dde34 in exitArchiveRecovery (endTLI=1, endLogId=0, endLogSeg=18) at 
 xlog.c:5117
 #5  0x080e3eab in StartupXLOG () at xlog.c:6029
 #6  0x080e6055 in StartupProcessMain () at xlog.c:8666
 #7  0x08106a30 in AuxiliaryProcessMain (argc=2, argv=0xbfebdd64) at 
 bootstrap.c:412
 #8  0x08253f0c in StartChildProcess (type=StartupProcess) at postmaster.c:4340
 #9  0x0825669e in PostmasterMain (argc=3, argv=0x8516e08) at postmaster.c:1078
 #10 0x081f7919 in main (argc=3, argv=0x8516e08) at main.c:188

 The crash happened on a HS slave which was fed a partial WAL file. The
 partial was constructed by extracting data up to pg_current_xlog_location()
 and zero padding it up to 16MB.

 This only seems to be happening on HEAD - quick tests indicate that both
 8.3 and 8.4 are not affected (or maybe I didn't try hard enough).

Thanks for the report!

This assertion failure derives from the recent refactoring of ReadRecord().
Which changed the startup process so as to re-fetch the last applied WAL
record at the end of recovery from the buffer instead of the WAL file if it's
stored in the buffer. In this case, the last applied WAL file remains closed.
So readFile (which ought to have been the file descriptor of that WAL file)
might be -1 in exitArchiveRecovery().

In the now, that assertion is obsolete. So I attached the patch that removes
the assert() from exitArchiveRecovery().

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
***
*** 5110,5125  exitArchiveRecovery(TimeLineID endTLI, uint32 endLogId, uint32 endLogSeg)
  	UpdateMinRecoveryPoint(InvalidXLogRecPtr, true);
  
  	/*
! 	 * We should have the ending log segment currently open.  Verify, and then
! 	 * close it (to avoid problems on Windows with trying to rename or delete
! 	 * an open file).
  	 */
! 	Assert(readFile = 0);
! 	Assert(readId == endLogId);
! 	Assert(readSeg == endLogSeg);
! 
! 	close(readFile);
! 	readFile = -1;
  
  	/*
  	 * If the segment was fetched from archival storage, we want to replace
--- 5110,5123 
  	UpdateMinRecoveryPoint(InvalidXLogRecPtr, true);
  
  	/*
! 	 * If the ending log segment is currently open, close it (to avoid
! 	 * problems on Windows with trying to rename or delete an open file).
  	 */
! 	if (readFile = 0)
! 	{
! 		close(readFile);
! 		readFile = -1;
! 	}
  
  	/*
  	 * If the segment was fetched from archival storage, we want to replace
***
*** 5974,5979  StartupXLOG(void)
--- 5972,5980 
  	EndOfLog = EndRecPtr;
  	XLByteToPrevSeg(EndOfLog, endLogId, endLogSeg);
  
+ 	Assert(readId == endLogId);
+ 	Assert(readSeg == endLogSeg);
+ 
  	/*
  	 * Complain if we did not roll forward far enough to render the backup
  	 * dump consistent.  Note: it is indeed okay to look at the local variable

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] synchronized snapshots

2010-02-05 Thread Markus Wanner

Hello Joachim,

a little daughter eats lots of spare cycles - among other things. Sorry 
it took that long to review.


On Fri, 8 Jan 2010 20:36:44 +0100, Joachim Wieland j...@mcknight.de
wrote:

The attached patch implements the idea of Heikki / Simon published in

http://archives.postgresql.org/pgsql-hackers/2009-11/msg00271.php


I must admit I didn't read that up front, but thought your patch could 
be useful for implementing parallel querying.


So, let's first concentrate on the intended use case: allowing parallel 
pg_dump. To me it seems like a pragmatic and quick solution, however, 
I'm not sure if requiring superuser privileges is acceptable.


The patch currently compiles (modulo some OID changes in pg_proc.h to 
prevent duplicates) and the test suite runs through fine. I haven't 
tested the new functions, though.



Reading the code, I'm missing the part that actually acquires the 
snapshot for the transaction(s). After setting up multiple transactions 
with pg_synchronize_snapshot and pg_synchronize_snapshot_taken, they 
still don't have a snapshot, do they?


Also, you should probably ensure the calling transactions don't have a 
snapshot already (let alone a transaction id).


In a similar vein, and answering your question in a comment: yes, I'd 
say you want to ensure your transactions are in SERIALIZABLE isolation 
mode. There's no other isolation level for which that kind of snapshot 
serialization makes sense, is there?



Using the exposed functions in a more general sense, I think it's 
important to note that the patch only intents to synchronize snapshots 
at the start of the transaction, not contiguously. Thus, normal 
transaction isolation applies for concurrent writes and each of the 
transactions can commit or rollback independently.


The timeout is nice, but is it really required? Isn't the normal query
cancellation infrastructure sufficient?

Hope that helps. Thanks for working on this issue.

Regards

Markus Wanner

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread Bruce Momjian
Peter Eisentraut wrote:
 On fre, 2010-02-05 at 14:45 +, Tim Bunce wrote:
   Does Perl have a similar mess?
  
  I don't think so.
  
  The primary database interface is DBI and as far as I can see there's
  only one DBI PostgreSQL driver: http://search.cpan.org/dist/DBD-Pg/
 
 I think another difference is that the Perl DBI interface is very rich,
 whereas the Python DB-API is quite minimal and almost forces people to
 write (incompatible) extensions.  The DB-SIG at Python that ought to
 drive all this is also quite dead, possibly because everyone has moved
 on to SQLAlchemy.

I assumed it would be Perl that had confusion because of the Perl 6, but
obviously it is Python.  The second PL/Python server-side language that
was proposed recently is taking on a whole new appearance for me.  :-(

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MySQL-ism help patch for psql

2010-02-05 Thread Bruce Momjian
Josh Berkus wrote:
 
  My problem with this whole idea is that it seems to be very MySQL-specific.
  Why aren't we providing help for users migrating from Oracle, Sybase,
  Informix, Ingres, DB2, SQLServer and Firebird, to name but a few? And if we
  turn all those on by default, we'll have a pretty horrible banner when
  starting psql.
 
 We should do all of those.  However, we have to start somewhere.
 
 Therefore, I like the idea of having a switch, like:
 \advice mysql
 \advice db2
 etc.
 
 ALSO, I'll point out that a page in our official documentation with
 extensive docs on commands and migration issues would be far more
 helpful than any amount of psql messages.  Just sayin'.

I don't think we came to any firm conclusion in this thread, so I have
added a TODO item:

Add option to print advice for people familiar with other databases

* http://archives.postgresql.org/pgsql-hackers/2010-01/msg01845.php 

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [CFReview] Red-Black Tree

2010-02-05 Thread Mark Cave-Ayland

Teodor Sigaev wrote:


I would like to see point #2 of the following email addressed before
commit.  As things stand, it is not clear (at least to me) whether
this is a win.


Reimplementation of ginInsertRecordBA reduces difference of HEAD and 
HEAD+rbtree in regular case.

Test suite is taken from http://www.sai.msu.su/~megera/wiki/2009-04-03:

SEQ: SELECT array_to_string(ARRAY(select '' || a || '.' || b from
generate_series(1,50) b), ' ')::tsvector AS i INTO foo FROM
generate_series(1,10) a;
RND: SELECT array_to_string(ARRAY(select '' || random() from
generate_series(1,50) b), ' ')::tsvector AS i INTO foo FROM
generate_series(1,10) a;

Times in seconds:
 HEAD  0.9   0.11
SEQ   130  113111
RND11.4 12.6   11.5

The ides was to change order of insertion - now insertion order 
decreases number of rebalancing.


Oleg's test (http://www.sai.msu.su/~megera/wiki/rbtree_test) are made 
with v0.10 which is differ from 0.11 only by comments around 
ginInsertRecordBA()


Here is a quick comparison between the current 0.11 patch against my 
original 0.7 patch when building Oleg's simple data. (Note: due to time 
constraints, this is just a single run to get a feel for performance)



0.7 patch
=

rbtest=# CREATE INDEX idin_rbtree_idx ON links2 USING gin (idin);
CREATE INDEX
Time: 1910741.352 ms

rbtest=# CREATE INDEX idout_rbtree_idx ON links2 USING gin (idout);
CREATE INDEX
Time: 1647609.300 ms


0.11 patch
==

rbtest=# CREATE INDEX idin_rbtree_idx ON links2 USING gin (idin);
CREATE INDEX
Time: 1864013.526 ms

rbtest=# CREATE INDEX idout_rbtree_idx ON links2 USING gin (idout);
CREATE INDEX
Time: 1661200.454 ms


HTH,

Mark.

--
Mark Cave-Ayland - Senior Technical Architect
PostgreSQL - PostGIS
Sirius Corporation plc - control through freedom
http://www.siriusit.co.uk
t: +44 870 608 0063

Sirius Labs: http://www.siriusit.co.uk/labs

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: psql \whoami option

2010-02-05 Thread Bruce Momjian

I have added this patch to the next commit-fest.  Thanks:

https://commitfest.postgresql.org/action/commitfest_view?id=6

---

David Christensen wrote:
 -hackers,
 
 In the spirit of small, but hopefully useful interface improvement  
 patches, enclosed for your review is a patch for providing psql with a  
 \whoami command (maybe a better name is \conninfo or similar).  Its  
 purpose is to print information about the current connection, by  
 default in a human-readable format.  There is also an optional format  
 parameter which currently accepts 'dsn' as an option to output the  
 current connection information as a DSN.
 
 Example output:
 
$psql -d postgres -p 8555
psql (8.5devel)
You are now connected to database postgres.
 
[Tue Jan 26 17:17:31 CST 2010]
machack:postgres:8555=# \whoami
Connected to database: postgres, user: machack, port: 8555  
 via local domain socket
 
[Tue Jan 26 17:17:34 CST 2010]
machack:postgres:8555=# \c - - localhost 8555
psql (8.5devel)
You are now connected to database postgres on host localhost.
 
[Tue Jan 26 17:17:42 CST 2010]
machack:postgres:8555=# \whoami
Connected to database: postgres, user: machack, host:  
 localhost, port: 8555
 
[Tue Jan 26 17:17:46 CST 2010]
machack:postgres:8555=# \whoami dsn
dbname=postgres;user=machack;host=localhost;port=8555
 
[Tue Jan 26 17:19:02 CST 2010]
machack:postgres:8555=# \q
 
 Regards,
 
 David
 --
 David Christensen
 End Point Corporation
 da...@endpoint.com
 
 
 
 diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql- 
 ref.sgml
 index 3ce5996..b58b24d 100644
 *** a/doc/src/sgml/ref/psql-ref.sgml
 --- b/doc/src/sgml/ref/psql-ref.sgml
 *** lo_import 152801
 *** 2149,2154 
 --- 2149,2167 
 
 
  varlistentry
 + termliteral\whoami/literal [ replaceable  
 class=parameterdefault/replaceable | replaceable  
 class=parameterdsn/replaceable ] /term
 + listitem
 + para
 + Outputs connection information about the current database
 + connection.  When passed parameter literaldsn/literal,
 + outputs as a DSN.  If parameter is unspecified or
 + unrecognized, outputs in a human-readable format.
 + /para
 + /listitem
 +   /varlistentry
 +
 +
 +   varlistentry
termliteral\x/literal/term
listitem
para
 diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
 index 5188b18..21b2468 100644
 *** a/src/bin/psql/command.c
 --- b/src/bin/psql/command.c
 *** exec_command(const char *cmd,
 *** 1106, 
 --- 1106,1156 
   free(fname);
   }
 
 + /* \whoami -- display information about the current connection  */
 + else if (strcmp(cmd, whoami) == 0)
 + {
 + char   *format = psql_scan_slash_option(scan_state,
 + 
 OT_NORMAL, NULL, true);
 + char   *host = PQhost(pset.db);
 +
 + if (format  !pg_strcasecmp(format, dsn)) {
 + if (host) {
 + printf(dbname=%s;user=%s;host=%s;port=%s\n,
 +PQdb(pset.db),
 +PQuser(pset.db),
 +host,
 +PQport(pset.db)
 + );
 + }
 + else {
 + printf(dbname=%s;user=%s;port=%s\n,
 +PQdb(pset.db),
 +PQuser(pset.db),
 +PQport(pset.db)
 + );
 + }
 + }
 + else {
 + /* default case */
 + if (host) {
 + printf(Connected to database: \%s\, user: 
 \%s\, host: \%s 
 \, port: \%s\\n,
 +PQdb(pset.db),
 +PQuser(pset.db),
 +host,
 +PQport(pset.db)
 + );
 + }
 + else {
 + printf(Connected to database: \%s\, user: 
 \%s\, port: \%s 
 \ via local domain socket\n,
 +PQdb(pset.db),
 +PQuser(pset.db),
 +PQport(pset.db)
 + );
 + }
 + }
 + free(format);
 + }
 +
   /* \x -- toggle expanded table representation */
   

Re: [HACKERS] remove contrib/xml2

2010-02-05 Thread Robert Haas
On Thu, Feb 4, 2010 at 10:51 PM, M Z jm80...@gmail.com wrote:
 I did some tests followed Robert's test cases on both postgresql 8.4.2-0ubu
 and 8.3.8-1, OS: Ubuntu Karmic.

 1) 1st test case, it doesn't crash on 8.3.8 but crash on 8.4.2;

Interesting.  So, that's a regression of some kind.

 2) 2nd test case, both 8.3.8 and 8.4.2 are fine, and no warning (different
 from Robert's test?);

I built with --enable-debug and --enable-cassert, which might be relevant.

 3) 3rd test case (and modified test case for 8.3.8), both 8.3.8 and 8.4.2
 are not correct, same with Robert's test (8.5 beta?);

As I think about that further, it might not be a bug - how is the
processor supposed to know what we expect to happen?  But then, I
don't really know how this is supposed to work.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] remove contrib/xml2

2010-02-05 Thread Robert Haas
On Wed, Feb 3, 2010 at 8:49 AM, Alvaro Herrera
alvhe...@commandprompt.com wrote:
 Robert Haas escribió:
 On Mon, Feb 1, 2010 at 5:23 PM, Andrew Dunstan and...@dunslane.net wrote:
  Robert Haas wrote:
  (2) add a very, very large warning that this will crash if you do
  almost anything with it.
 
  I think that's an exaggeration. Certain people are known to be using it
  quite successfully.

 Hmm.  Well, all I know is that the first thing I tried crashed the server.

 CREATE TABLE xpath_test (id integer NOT NULL, t xml);
 INSERT INTO xpath_test VALUES (1, 'docint1/int/doc');
 SELECT * FROM xpath_table('id', 't', 'xpath_test', '/doc/int', 'true')
 as t(id int4);

 This trivial patch lingering on my system fixes this crasher (this is
 for the 8.3 branch).  It makes the problem in alloc set ExprContext
 warning show up instead.

 There are still lotsa other holes, but hey, this is a start ...

Interestingly M Z found he couldn't reproduce this crash on 8.3.  Can
you?  If so, +1 for applying this and backpatching it as far as make
sense.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread Josh Berkus

 I think another difference is that the Perl DBI interface is very rich,
 whereas the Python DB-API is quite minimal and almost forces people to
 write (incompatible) extensions.  The DB-SIG at Python that ought to
 drive all this is also quite dead, possibly because everyone has moved
 on to SQLAlchemy.

For people who use Python a lot, could I have a list of the deficiencies
in DBAPI?  I've got my horse and lance ready.

Given that SQLAlchemy isn't for everyone, of course ... it couldn't be,
or Django would use it, no?

--Josh Berkus


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [CFReview] Red-Black Tree

2010-02-05 Thread Teodor Sigaev

That's all around 1%

0.7 patch
=

rbtest=# CREATE INDEX idin_rbtree_idx ON links2 USING gin (idin);
CREATE INDEX
Time: 1910741.352 ms

rbtest=# CREATE INDEX idout_rbtree_idx ON links2 USING gin (idout);
CREATE INDEX
Time: 1647609.300 ms


0.11 patch
==

rbtest=# CREATE INDEX idin_rbtree_idx ON links2 USING gin (idin);
CREATE INDEX
Time: 1864013.526 ms

rbtest=# CREATE INDEX idout_rbtree_idx ON links2 USING gin (idout);
CREATE INDEX
Time: 1661200.454 ms


--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Remove tabs in SGML.

2010-02-05 Thread Bruce Momjian
Alvaro Herrera wrote:
 Bruce Momjian wrote:
  Peter Eisentraut wrote:
   On tis, 2010-01-26 at 10:20 -0800, David Fetter wrote:
On Tue, Jan 26, 2010 at 02:21:29PM +, Bruce Momjian wrote:
 Log Message:
 ---
 Remove tabs in SGML.

Can we see about making a commit hook for CVS that disallows \t in
SGML files?  The process in git is pretty simple.
   
   Better write a check in the makefile.
  
  Done, and attached.  The rule is more shell script than makefile, but I
  am not sure how else to do it.
 
 Maybe have the check-tabs rule as a dependency of the check, html
 and/or draft rules?
 
  + check-tabs:
  +   ( ! grep '  ' $(wildcard $(srcdir)/*.sgml $(srcdir)/ref/*.sgml) ) 
  || (echo Tabs appear in SGML files;  exit 1)

Uh, I didn't think people wanted to actually have any tab testing in
their automatically.  Do they?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Remove tabs in SGML.

2010-02-05 Thread Alvaro Herrera
Bruce Momjian wrote:
 Alvaro Herrera wrote:

  Maybe have the check-tabs rule as a dependency of the check, html
  and/or draft rules?
  
   + check-tabs:
   + ( ! grep '  ' $(wildcard $(srcdir)/*.sgml 
   $(srcdir)/ref/*.sgml) ) || (echo Tabs appear in SGML files;  exit 1)
 
 Uh, I didn't think people wanted to actually have any tab testing in
 their automatically.  Do they?

IMO that's the point of the whole exercise.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Remove tabs in SGML.

2010-02-05 Thread Alvaro Herrera
Bruce Momjian wrote:
 Peter Eisentraut wrote:
  On tis, 2010-01-26 at 10:20 -0800, David Fetter wrote:
   On Tue, Jan 26, 2010 at 02:21:29PM +, Bruce Momjian wrote:
Log Message:
---
Remove tabs in SGML.
   
   Can we see about making a commit hook for CVS that disallows \t in
   SGML files?  The process in git is pretty simple.
  
  Better write a check in the makefile.
 
 Done, and attached.  The rule is more shell script than makefile, but I
 am not sure how else to do it.

Maybe have the check-tabs rule as a dependency of the check, html
and/or draft rules?

 + check-tabs:
 + ( ! grep '  ' $(wildcard $(srcdir)/*.sgml $(srcdir)/ref/*.sgml) ) 
 || (echo Tabs appear in SGML files;  exit 1)



-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread Josh Berkus

 The situation is unfortunate, but you might as well argue that too many
 Linux desktops or Linux distributions confuse new users and hurt
 adoption.  These alternatives all exist for a reason, and it will be
 difficult to get some of them to abandon their work with the aim of
 reducing the overall confusion.

I'm not as concerned about confusion as the fact that *all* of the
various Python drivers suck in different, and crippling, ways.  I don't
care how many drivers we have, as long as we have at least one 1st-class
driver.

--Josh Berkus


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread Bruce Momjian
Josh Berkus wrote:
 
  The situation is unfortunate, but you might as well argue that too many
  Linux desktops or Linux distributions confuse new users and hurt
  adoption.  These alternatives all exist for a reason, and it will be
  difficult to get some of them to abandon their work with the aim of
  reducing the overall confusion.
 
 I'm not as concerned about confusion as the fact that *all* of the
 various Python drivers suck in different, and crippling, ways.  I don't
 care how many drivers we have, as long as we have at least one 1st-class
 driver.

Agreed, and we can point newbies to that one great driver.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plperl db access documentation enhancement

2010-02-05 Thread Bruce Momjian

Patch applied for 9.0.  We don't normally backpatch such documentation
improvements unless we receive multiple reports of confusion.

---

Alexey Klyukin wrote:
 Hello,
 
 We were asked by Enova Financial to improve the documentation of PL/Perl 
 database access functions.
 Alvaro and me worked on that and we produced the patch that is attached. It 
 splits initial block of functions
 into the groups with the description directly following each of the group, 
 corrects couple of mistakes and
 adds an example.
 
 One of the existing mistakes was confusion in definitions of 
 spi_exec_prepared and spi_query_prepared.
 Another one is usage of INTEGER type to return the result of spi_prepare in 
 the example for prepared queries.
 When trying to execute that function I've got the following error:
 
 postgres=# CREATE OR REPLACE FUNCTION init() RETURNS INTEGER AS $#
   $_SHARED{my_plan} = spi_prepare( 'SELECT (now() + $1)::date AS now', 
 'INTERVAL');
 $$ LANGUAGE plperl;
 
 CREATE FUNCTION
 
 postgres=# select init();
 ERROR:  invalid input syntax for integer: 0x1007d6f40
 CONTEXT:  PL/Perl function init
 
 Since the return value is not used anyway, I've changed the return type of 
 the function declaration in the example to VOID.
 
 I think this is a good reason to suggest backpatching these changes  down to 
 8.2. 
 
 

[ Attachment, skipping... ]

 
 
 --
 Alexey Klyukin
 http://www.CommandPrompt.com/
 The PostgreSQL Company - Command Prompt, Inc
 
 
 -- 
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Remove tabs in SGML.

2010-02-05 Thread Bruce Momjian
Alvaro Herrera wrote:
 Bruce Momjian wrote:
  Alvaro Herrera wrote:
 
   Maybe have the check-tabs rule as a dependency of the check, html
   and/or draft rules?
   
+ check-tabs:
+   ( ! grep '  ' $(wildcard $(srcdir)/*.sgml 
$(srcdir)/ref/*.sgml) ) || (echo Tabs appear in SGML files;  exit 1)
  
  Uh, I didn't think people wanted to actually have any tab testing in
  their automatically.  Do they?
 
 IMO that's the point of the whole exercise.

OK, done with attached patch, now done as part of 'gmake check'.  If
people want it as part of the build process, please let me know.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: Makefile
===
RCS file: /cvsroot/pgsql/doc/src/sgml/Makefile,v
retrieving revision 1.136
diff -c -c -r1.136 Makefile
*** Makefile	5 Feb 2010 14:51:50 -	1.136
--- Makefile	5 Feb 2010 19:28:14 -
***
*** 263,269 
  ##
  
  # Quick syntax check without style processing
! check: postgres.sgml $(ALMOSTALLSGML)
  	$(NSGMLS) $(SPFLAGS) $(SGMLINCLUDE) -s $
  
  
--- 263,269 
  ##
  
  # Quick syntax check without style processing
! check: postgres.sgml $(ALMOSTALLSGML) check-tabs
  	$(NSGMLS) $(SPFLAGS) $(SGMLINCLUDE) -s $
  
  
***
*** 346,352 
  endif # sqlmansectnum != 7
  
  check-tabs:
! 	( ! grep '	' $(wildcard $(srcdir)/*.sgml $(srcdir)/ref/*.sgml) ) || (echo Tabs appear in SGML files;  exit 1)
  
  ##
  ## Clean
--- 346,352 
  endif # sqlmansectnum != 7
  
  check-tabs:
! 	@( ! grep '	' $(wildcard $(srcdir)/*.sgml $(srcdir)/ref/*.sgml) ) || (echo Tabs appear in SGML files;  exit 1)
  
  ##
  ## Clean

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plperl db access documentation enhancement

2010-02-05 Thread Alvaro Herrera
Bruce Momjian wrote:
 
 Patch applied for 9.0.  We don't normally backpatch such documentation
 improvements unless we receive multiple reports of confusion.

I think that's a mistake in this case.  The documentation wasn't
confusing -- it was bogus.  (Actually, the bug fixing is a smaller
change than the whole of this patch, so we could provide the smaller
patch if desired to apply to 8.4.  To be honest I think it is better to
just apply the larger patch verbatim.)

I could do the backpatch if you want.  I just wanted some more peer
review on the changes.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread Bruce Momjian
Marko Kreen wrote:
 Psycopg was the leader, especially in web-environments,
 but it has non-obvious license and with dead website it does not
 seem that attractive.  Although it is well-maintained still.
 
 Best path forward would be to talk with Psycopg guys about
 license clarification/change.
 
 I suspect GPL does not extend anyway to Python code that imports
 it dynamically, and it does not seem to be their intention - they
 even try to tell it in their current clarification, but it
 is not as clear as it could be.

Yes, this is a good analysis and has some good suggestions for moving
forward.  In summary, I don't think the current PG/Python driver
situation is helping PG or Python, and I believe there are enough people
who care about those two to get a group together to sort out a viable
direction and start working toward the goal of improving things.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread Greg Smith

Bruce Momjian wrote:

While I realize experienced people can easily navigate this confusion...


No, that's the worst part--the more you know and the deeper you dig into 
it, the more broken you realize the whole thing is.  When one of the 
best drivers (in some respects) has a web page that looks like this:  
http://initd.org/ that doesn't seem so bad at first--but if you're 
experienced, you know that the page has been in that disturbing state 
since late 2006.  You start digging into the driver mess, figure you 
just need to learn how things fit together, but the hole keeps getting 
bigger as you dig.


The issues here have already been identified:  the Perl DBI is an 
excellent spec, while the Python one is so weak everybody ends up 
needing their own extensions to it.  And then portability *even among 
Python PostgreSQL drivers* goes out the window.  If somebody built a 
BSD/MIT licensed driver that replaces every useful feature of all the 
forks, with no major problems, and a couple of major projects switched 
over to it (think Skype level big), maybe this mess could get 
resolved.  I think it would take someone already familiar with the major 
issues involved a couple of months of regular work to make any serious 
progress on it.


--
Greg Smith2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Strange heuristic in analyze.c

2010-02-05 Thread Bruce Momjian
Greg Stark wrote:
 So I never realized the consequences of this little heuristic in
 analyze.c in the handling of very low cardinality columns where we
 want to just capture the complete list of values in the mcv and throw
 away the histogram:
 
   else if (toowide_cnt == 0  nmultiple == ndistinct)
   {
   /*
* Every value in the sample appeared more than once.  
 Assume the
* column has just these values.
*/
   stats-stadistinct = ndistinct;
   }
 
 The problem with this heuristic is that if the table is small enough
 you might expect you can set the statistics target high and sample
 the entire table and get a very accurate mcv covering all the values.
 However if any of the values in the table appears only once this
 heuristic will defeat you. The following code will then throw out of
 the mcv any value which isn't 25% more common than average. Leaving
 you with a histogram for those values which often does very poorly if
 the values don't fit any pattern and are just discrete arbitrary
 values.

Do you want a C comment to document this problem?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread Bruce Momjian
Greg Smith wrote:
 Bruce Momjian wrote:
  While I realize experienced people can easily navigate this confusion...
 
 No, that's the worst part--the more you know and the deeper you dig into 
 it, the more broken you realize the whole thing is.  When one of the 
 best drivers (in some respects) has a web page that looks like this:  
 http://initd.org/ that doesn't seem so bad at first--but if you're 
 experienced, you know that the page has been in that disturbing state 
 since late 2006.  You start digging into the driver mess, figure you 
 just need to learn how things fit together, but the hole keeps getting 
 bigger as you dig.
 
 The issues here have already been identified:  the Perl DBI is an 
 excellent spec, while the Python one is so weak everybody ends up 
 needing their own extensions to it.  And then portability *even among 
 Python PostgreSQL drivers* goes out the window.  If somebody built a 
 BSD/MIT licensed driver that replaces every useful feature of all the 
 forks, with no major problems, and a couple of major projects switched 
 over to it (think Skype level big), maybe this mess could get 
 resolved.  I think it would take someone already familiar with the major 
 issues involved a couple of months of regular work to make any serious 
 progress on it.

Yes, it is going to take serious time, and it is going to take more than
one person to be involved, but I think there are enough people who care
that something serious can be done to improve its current state.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread Jeff Davis
On Fri, 2010-02-05 at 09:38 -0500, Bruce Momjian wrote:
 Wow, that is super-confusing.

Agreed. Standardization among licenses is useful, and I think it's
important to have a driver with a license that people already
understand.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] WIP patch for system-catalog vacuuming via a relation map

2010-02-05 Thread Tom Lane
Attached is the current state of my work on letting system catalogs
be processed by new-style VACUUM FULL (a/k/a CLUSTER).  I haven't done
the WAL support nor worried about interlocking concurrent updates of
relation map files, but it passes the regression tests and can do
VACUUM FULL of every system catalog.  So I think this is proof that
the approach is workable and there are no further gotchas to stumble
over in dark recesses of the system.   The patch adds about 1250 lines
of code, mostly in a new file relmapper.c, and will be bigger once
WAL support is added.  However, I anticipate that we will be able to
rip out probably three times that much code by eliminating VACUUM FULL
INPLACE (commands/vacuum.c is alone almost 4000 lines, though of course
it won't all go away).  So it will be a substantial net reduction in
code size, plus we will get the previously discussed benefits such as
removing reindex-in-place for shared indexes.

Barring objections I'm going to press ahead with completing and
committing this; then in a separate patch remove VACUUM FULL INPLACE.

regards, tom lane



bin6wpmdlL4Yo.bin
Description: relation-mapper-1.patch.gz

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for system-catalog vacuuming via a relation map

2010-02-05 Thread Alvaro Herrera
Tom Lane wrote:

 Barring objections I'm going to press ahead with completing and
 committing this; then in a separate patch remove VACUUM FULL INPLACE.

With the second patch, we will continue to support reading
XVAC_MOVED_OUT and IN hint bits, but never set them, correct?

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for system-catalog vacuuming via a relation map

2010-02-05 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 Tom Lane wrote:
 Barring objections I'm going to press ahead with completing and
 committing this; then in a separate patch remove VACUUM FULL INPLACE.

 With the second patch, we will continue to support reading
 XVAC_MOVED_OUT and IN hint bits, but never set them, correct?

Right, the tqual support has to stay for as long as we support binary
upgrade from previous releases.  (Unless we want to work out some way
of ensuring no MOVED bits remain set in a database.)  There is one
heck of a lot of other infrastructure and complexity that can go away,
though.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread Marko Kreen
On 2/5/10, Josh Berkus j...@agliodbs.com wrote:
   I think another difference is that the Perl DBI interface is very rich,
   whereas the Python DB-API is quite minimal and almost forces people to
   write (incompatible) extensions.  The DB-SIG at Python that ought to
   drive all this is also quite dead, possibly because everyone has moved
   on to SQLAlchemy.


 For people who use Python a lot, could I have a list of the deficiencies
  in DBAPI?  I've got my horse and lance ready.

I took quick glance on DBI and compared to that, DB-API does
not define specific api for:
- resultset cursors
- prepared plans
- db structure examination

OTOH, my guess is that DB-API authors assumed the first two
are used transparently by the driver, and I see no reason why
they cannot be, with current DB-API.

Last item is rich indeed, but seems slightly overengineered..

Now, none of the differences between drivers and current confusion
seem to be related to above points.  Hysterical Raisins is only
good explanation about current situation.

The pg8000 / bpgsql seem to be toy projects, and anyway you dont
want to use pure-Python drivers in high-performance environments.
We are not talking about C#/java here.

py-postgresql seems to be more serious, but as it's python3 only
which makes it irrelevant today.

PyGreSQL is the oldest, older than DB-API, and so it's DB-API
interface seems an afterthought and is untested/underused - eg.
it does not support bytea.

Psycopg was the leader, especially in web-environments,
but it has non-obvious license and with dead website it does not
seem that attractive.  Although it is well-maintained still.

Best path forward would be to talk with Psycopg guys about
license clarification/change.

I suspect GPL does not extend anyway to Python code that imports
it dynamically, and it does not seem to be their intention - they
even try to tell it in their current clarification, but it
is not as clear as it could be.

-- 
marko

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for system-catalog vacuuming via a relation map

2010-02-05 Thread Josh Berkus

 Barring objections I'm going to press ahead with completing and
 committing this; then in a separate patch remove VACUUM FULL INPLACE.

Was it our determination that we could remove VFI if we eliminated the
system catalogs?  I'm fine with it, I just thought some people had a
marginal use case for VFI.

--Josh Berkus

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for system-catalog vacuuming via a relation map

2010-02-05 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 Barring objections I'm going to press ahead with completing and
 committing this; then in a separate patch remove VACUUM FULL INPLACE.

 Was it our determination that we could remove VFI if we eliminated the
 system catalogs?  I'm fine with it, I just thought some people had a
 marginal use case for VFI.

I thought the consensus was to remove it if possible.  There may still
be some marginal use cases, but they don't justify the work that'd
be needed to make it play safely with HS; let alone fixing the other
longstanding gotchas with it, like the double-commit risk.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add on_plperl_init and on_plperlu_init to plperl UPDATE 3 [PATCH]

2010-02-05 Thread Alex Hunsaker
On Fri, Feb 5, 2010 at 06:40, Tim Bunce tim.bu...@pobox.com wrote:
 This is the third update to the fourth of the patches to be split out
 from the former 'plperl feature patch 1'.

 Changes in this patch:

 - Added plperl.on_plperl_init and plperl.on_plperlu_init GUCs
    Both are PGC_SUSET
    SPI functions are not available when the code is run.
    Errors are detected and reported as ereport(ERROR, ...)
    Corresponding documentation and tests for both.

*sniffle* OK I think I agree with everyone else on setting this as
PGC_SUSET until we can either prove it can be USERSET or we can fix it
so we can check permissions on SET properly.  It seems if you really
wanted a user to be able to set it you should be able to define a
SECURITY DEFINER function that sets it to a string you pass in or
something.  Obviously not part of core postgres...

 - Renamed plperl.on_perl_init to plperl.on_init

*shrug* OK (I think there was still some flack on this var, but I
think its ok-- we can discuss that in a separate thread if people
still disagree :) )

 - Improved state management of select_perl_context()
    An error during interpreter initialization will leave
    the state (interp_state etc) unchanged.

 - The utf8fix code has been greatly simplified.

 - More code comments re PGC_SUSET and no access to SPI functions.

I like the doc changes and think the new comment about %_SHARED being
unsafe is good.

All looks good to me.  Ill mark it as Ready for Committer

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for system-catalog vacuuming via a relation map

2010-02-05 Thread Bruce Momjian
Tom Lane wrote:
 Josh Berkus j...@agliodbs.com writes:
  Barring objections I'm going to press ahead with completing and
  committing this; then in a separate patch remove VACUUM FULL INPLACE.
 
  Was it our determination that we could remove VFI if we eliminated the
  system catalogs?  I'm fine with it, I just thought some people had a
  marginal use case for VFI.
 
 I thought the consensus was to remove it if possible.  There may still
 be some marginal use cases, but they don't justify the work that'd
 be needed to make it play safely with HS; let alone fixing the other
 longstanding gotchas with it, like the double-commit risk.

Oh, yea, I remember, HS killed it because of the work required to make
FULL work for 9.0.  Thanks.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plperl db access documentation enhancement

2010-02-05 Thread Bruce Momjian
Alvaro Herrera wrote:
 Bruce Momjian wrote:
  
  Patch applied for 9.0.  We don't normally backpatch such documentation
  improvements unless we receive multiple reports of confusion.
 
 I think that's a mistake in this case.  The documentation wasn't
 confusing -- it was bogus.  (Actually, the bug fixing is a smaller
 change than the whole of this patch, so we could provide the smaller
 patch if desired to apply to 8.4.  To be honest I think it is better to
 just apply the larger patch verbatim.)
 
 I could do the backpatch if you want.  I just wanted some more peer
 review on the changes.

Sure, go ahead if you are sure. I wasn't clear enough to risk it, and
documentation churn in back branches has its own downsides, which is why
I avoid it, especially for larger patches.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Database-Role settings behaviour and docs mismatch

2010-02-05 Thread Bruce Momjian
Tom Lane wrote:
 Simon Riggs si...@2ndquadrant.com writes:
  On Mon, 2010-02-01 at 20:11 -0300, Alvaro Herrera wrote:
  It'd probably be worth changing the order of the ApplySetting calls so
  that it doesn't look suspicious.
 
  Just a comment would be enough I think
 
 Yeah.  Changing the order would mean that we'd do extra work applying
 and then removing conflicting settings.  But the general principle here
 is that GUC settings coming from different places are resolved by
 source priority, not order of execution.

C comment patch attached and applied.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: src/backend/utils/init/postinit.c
===
RCS file: /cvsroot/pgsql/src/backend/utils/init/postinit.c,v
retrieving revision 1.201
diff -c -c -r1.201 postinit.c
*** src/backend/utils/init/postinit.c	15 Jan 2010 09:19:04 -	1.201
--- src/backend/utils/init/postinit.c	5 Feb 2010 20:25:38 -
***
*** 855,860 
--- 855,861 
  
  	relsetting = heap_open(DbRoleSettingRelationId, AccessShareLock);
  
+ 	/* Later settings are ignored if set earlier. */
  	ApplySetting(databaseid, roleid, relsetting, PGC_S_DATABASE_USER);
  	ApplySetting(InvalidOid, roleid, relsetting, PGC_S_USER);
  	ApplySetting(databaseid, InvalidOid, relsetting, PGC_S_DATABASE);

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Mammoth in Core?

2010-02-05 Thread Bruce Momjian
Joshua D. Drake wrote:
 On Tue, 2010-01-19 at 21:55 +0100, Markus Wanner wrote:
  Hi,
 
  So, that's what I'd recommend the Mammoth developers to do as well:
  cherry-picking, sort of. Maybe that fulfills one or the other item on
  our wish-list (in one way or another)...
  
 
 I doubt we are going to spend the time to do that. Mammoth is BSD and
 open source. If people want to jump in and help, that would be
 interesting, but on our own... it isn't on our priority list.
 
 Our priority list with mammoth is simple.
 
 Finish 1.9
 
 Fix things we know the community as hackers (not feature set) will grump
 about.
 
 Make it work on 8.4 ad 8.5
 
 But again, as Tom says... none of this is relevant until 8.5 is released
 at which point, we can talk about all of this again. Hopefully 1.9 will
 be done by then.

I wanted to comment on this, however late I am.

I think the writing is on the wall and it is now questionable if there
is much community value to continue development of Mammoth Replicator,
though we might want to pull some ideas from the code at some point. 
Company-sponsored development is a valuable resource for this community
and I would love to see new things worked on.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] archive_timeout behavior for no activity

2010-02-05 Thread Bruce Momjian
Fujii Masao wrote:
 On Fri, Jan 15, 2010 at 12:50 AM, Bruce Momjian br...@momjian.us wrote:
  Looking at the archive_timeout documentation and CheckArchiveTimeout(),
  it appears we force a new xlog file and archive it even if no activity
  has been recorded in the xlog file. ?Is this correct?
 
 No. CheckArchiveTimeout() doesn't switch WAL files if there is no activity
 after the last switch. In fact, though it calls RequestXLogSwitch(),
 the switch is skipped in XLogInsert() because we are exactly at the start
 of a file in that case.
 
 But unfortunately checkpoint would be often recorded between each
 switches. So the archive_timeout appears to always force a new WAL file.

I have documented that increasing checkpoint_timeout can avoid WAL
writes on idle systems with archive_timeout.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] archive_timeout behavior for no activity

2010-02-05 Thread Bruce Momjian
Kevin Grittner wrote:
 Bruce Momjian br...@momjian.us wrote:
  
  Looking at the archive_timeout documentation and
  CheckArchiveTimeout(), it appears we force a new xlog file and
  archive it even if no activity has been recorded in the xlog file.
  Is this correct?  Should we document this or fix it so only xlog
  files with contents are archived?
  
 Er, you can probably blame me for that.  Tom was going to fix it and
 I pointed out that it would break our monitoring of our warm standby
 processes.  We have a one hour maximum and send alerts if we've gone
 75 minutes or more without receiving a WAL file from one of our
 databases.  Of course, if we had a nicer way to know that we were
 up-to-date with our WAL file copies, we wouldn't need this; but
 right now there aren't a lot of options for monitoring these things.

I am dismayed that we are using a 16MB file for monitoring archive
activity.  Can't you use pg_current_xlog_location() and only check for
an archive file when that location changes?

Anyway, I have updated the documentation with the attached patch to
mention this issue, and added a C comment as well.

Is there a TODO here?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: doc/src/sgml/config.sgml
===
RCS file: /cvsroot/pgsql/doc/src/sgml/config.sgml,v
retrieving revision 1.249
diff -c -c -r1.249 config.sgml
*** doc/src/sgml/config.sgml	3 Feb 2010 17:25:05 -	1.249
--- doc/src/sgml/config.sgml	5 Feb 2010 23:17:17 -
***
*** 1739,1745 
  server to switch to a new WAL segment file periodically.  When this
  parameter is greater than zero, the server will switch to a new
  segment file whenever this many seconds have elapsed since the last
! segment file switch.  Note that archived files that are closed early
  due to a forced switch are still the same length as completely full
  files.  Therefore, it is unwise to use a very short
  varnamearchive_timeout/ mdash; it will bloat your archive
--- 1739,1749 
  server to switch to a new WAL segment file periodically.  When this
  parameter is greater than zero, the server will switch to a new
  segment file whenever this many seconds have elapsed since the last
! segment file switch, and there has been any database activity,
! including a single checkpoint.  (Increasing
! varnamecheckpoint_timeout/ will reduce unnecessary
! checkpoints on an idle system.)
! Note that archived files that are closed early
  due to a forced switch are still the same length as completely full
  files.  Therefore, it is unwise to use a very short
  varnamearchive_timeout/ mdash; it will bloat your archive
Index: src/backend/postmaster/bgwriter.c
===
RCS file: /cvsroot/pgsql/src/backend/postmaster/bgwriter.c,v
retrieving revision 1.66
diff -c -c -r1.66 bgwriter.c
*** src/backend/postmaster/bgwriter.c	15 Jan 2010 09:19:02 -	1.66
--- src/backend/postmaster/bgwriter.c	5 Feb 2010 23:17:21 -
***
*** 543,549 
  
  /*
   * CheckArchiveTimeout -- check for archive_timeout and switch xlog files
!  *		if needed
   */
  static void
  CheckArchiveTimeout(void)
--- 543,552 
  
  /*
   * CheckArchiveTimeout -- check for archive_timeout and switch xlog files
!  *
!  * This will switch to a new WAL file and force an archive file write
!  * if any activity is recorded in the current WAL file, including just
!  * a single checkpoint record.
   */
  static void
  CheckArchiveTimeout(void)

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot standby documentation

2010-02-05 Thread Bruce Momjian
Joshua Tolley wrote:
-- Start of PGP signed section.
 Having concluded I really need to start playing with hot standby, I started
 looking for documentation on the subject. I found what I was looking for; I
 also found this page[1], which, it seems, ought to mention hot standby.
 Comments?
 
 [1] http://developer.postgresql.org/pgdocs/postgres/high-availability.html

Ah, I now realize it only mentions warm standby, not hot, so I just
updated the documentation to reflect that;  you can see it here:

http://momjian.us/tmp/pgsql/high-availability.html

Warm and Hot Standby Using Point-In-Time Recovery (PITR)

Do we want to call the feature hot standby?  Is a read-only standby a
standby or a slave?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread Josh Berkus

 Imho a big problem is that it does way too much itself - i.e. it does not use 
 things like PQExecParams but does escaping/parsing itself...
 Other people may think thats a good idea - I definitely do not think so.

It also has issues with transaction control which cause idle
transactions if the Django front-end times out due to load.

--Josh Berkus


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] VAC FULL/CLUSTER on system catalogs is prone to deadlock

2010-02-05 Thread Tom Lane
I discovered $subject while continuing to test my patch.  The problem
occurs because many parts of the backend are in the habit of releasing
lock on system catalogs as soon as they've updated whatever they wanted
to update.  This means that we can encounter INSERT_IN_PROGRESS or
DELETE_IN_PROGRESS tuples, even though we have exclusive lock on the
table.  That's no big problem as far as copy_heap_data goes; it can
just transfer the rows anyway.  However, when we get to the point
of reindexing the catalog, IndexBuildHeapScan will wait for commit of
whichever transaction did the deed.  Since we're still holding exclusive
lock on the catalog, it's entirely likely for that transaction to need
to read or write the catalog again, whereupon deadlock.

There are a couple of things we could try to do about this, other than
just live with it:

1. Try to avoid releasing locks early in system catalog operations.
The trouble with this is that it would inevitably create deadlock
failures of its own, due to adding lock-upgrade situations where there
were none before.  That is, instead of patterns like
Acquire(AccessShareLock), Release(AccessShareLock),
Acquire(RowExclusiveLock), then Release(RowExclusiveLock), we'd have
Acquire(AccessShareLock), Acquire(RowExclusiveLock), then release; which
could deadlock against a parallel acquisition of higher-level locks.
Maybe we could make it work reasonably well if we release
AccessShareLock quickly after a read but always hold RowExclusiveLock
once acquired.  Not sure.

2. In VAC FULL and CLUSTER, tell index rebuild not to wait around for 
INSERT_IN_PROGRESS or DELETE_IN_PROGRESS tuples.  I believe that the
price of this would be not re-verifying the integrity of unique indexes.
Which is kind of annoying, but then again rechecking data consistency
is not the purpose of these commands.  It would not be too unreasonable
to tell people to use REINDEX if they suspect the validity of an index
--- that takes a significantly weaker lock and wouldn't be nearly as
big a problem to use on system catalogs.

3. Try to ameliorate the problem by making sure it's the VACUUM FULL
that fails, and not the other process, when there's a deadlock.

BTW, the reason we don't see this type of problem with the existing
VAC FULL INPLACE logic is that it essentially silently adopts #2
--- it suppresses uniqueness checks even when it's making new index
insertions.

At the moment #2 is seeming like the most attractive alternative;
both of the others look like research projects.  Comments?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] VAC FULL/CLUSTER on system catalogs is prone to deadlock

2010-02-05 Thread Bruce Momjian
Tom Lane wrote:
 2. In VAC FULL and CLUSTER, tell index rebuild not to wait around for 
 INSERT_IN_PROGRESS or DELETE_IN_PROGRESS tuples.  I believe that the
 price of this would be not re-verifying the integrity of unique indexes.
 Which is kind of annoying, but then again rechecking data consistency
 is not the purpose of these commands.  It would not be too unreasonable
 to tell people to use REINDEX if they suspect the validity of an index
 --- that takes a significantly weaker lock and wouldn't be nearly as
 big a problem to use on system catalogs.
 
 3. Try to ameliorate the problem by making sure it's the VACUUM FULL
 that fails, and not the other process, when there's a deadlock.
 
 BTW, the reason we don't see this type of problem with the existing
 VAC FULL INPLACE logic is that it essentially silently adopts #2
 --- it suppresses uniqueness checks even when it's making new index
 insertions.
 
 At the moment #2 is seeming like the most attractive alternative;
 both of the others look like research projects.  Comments?

Yea, #2.  No one has complained about this behavior in the past, as I
remember.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for system-catalog vacuuming via a relation map

2010-02-05 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 Tom Lane wrote:
 I thought the consensus was to remove it if possible.  There may still
 be some marginal use cases, but they don't justify the work that'd
 be needed to make it play safely with HS; let alone fixing the other
 longstanding gotchas with it, like the double-commit risk.

 I think part of the plan was to also provide an online reorg tool that
 works by doing dummy UPDATEs, which means that you can get serialization
 errors in serializable mode, but doesn't need to lock the table.

Yeah.  There's a good deal of interest in incremental/partial vacuuming.
But that wouldn't make use of the existing VFI infrastructure either.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread Andres Freund
On Friday 05 February 2010 21:34:53 Marko Kreen wrote:
 On 2/5/10, Josh Berkus j...@agliodbs.com wrote:
I think another difference is that the Perl DBI interface is very
rich, whereas the Python DB-API is quite minimal and almost forces
people to write (incompatible) extensions.  The DB-SIG at Python that
ought to drive all this is also quite dead, possibly because everyone
has moved on to SQLAlchemy.
  
  For people who use Python a lot, could I have a list of the deficiencies
  
   in DBAPI?  I've got my horse and lance ready.
 Psycopg was the leader, especially in web-environments,
 but it has non-obvious license and with dead website it does not
 seem that attractive.  Although it is well-maintained still.
Imho a big problem is that it does way too much itself - i.e. it does not use 
things like PQExecParams but does escaping/parsing itself...
Other people may think thats a good idea - I definitely do not think so.

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread Marko Kreen
On 2/5/10, Greg Smith g...@2ndquadrant.com wrote:
  The issues here have already been identified:  the Perl DBI is an excellent
 spec, while the Python one is so weak everybody ends up needing their own
 extensions to it.  And then portability *even among Python PostgreSQL
 drivers* goes out the window.

Well, no.   You are overplaying on the DBI angle.  If the driver can
offer via the minimal DB-API the same features the DBI driver
can offer, then the DBI API is bloated, not the minimal API weak...

Not that DB-API is perfect, a way to toggle prepare and db cursors
on/off in per-query basis would be good to have, but thats mostly it.
Rest of extra features (COPY) need to be done as extensions anyway.

But all that is off-topic to the current driver suckiness.

The problems with Python drivers are that they fail to implement
even the minimal DB-API 2.0:

- correctly, across all interesting data types.
  PygreSQL (bytea).
  pg8000 (does rollback after query fails - WTF).
  None of the drivers seem to support both array and tuple.

- robustly.  Psycopg2 - their misguided attempt to turn DB-API
  calls optionally async cost their stability big-time as it
  complexified their codebase.  It got stable around 2.0.[678].
  I'm not holding my breath on the other ones.
  Fear the fancy API-s.

- with threadsafety level  1.  (I don't care, but seems in some
  circles it's necessary.)

Note how many ways a driver can suck without stepping above
basic DB-API 2.0.

 If somebody built a BSD/MIT licensed driver
 that replaces every useful feature of all the forks, with no major problems,
 and a couple of major projects switched over to it (think Skype level
 big), maybe this mess could get resolved.  I think it would take someone
 already familiar with the major issues involved a couple of months of
 regular work to make any serious progress on it.

Yeah, but the problem here is that there actually is not a problem.
Psycopg license is not bad, just confusing on first
sight.  And website problem can be fix by advertising
 http://wiki.postgresql.org/wiki/Psycopg as new website...

If your requirement is BSD license, then probably getting PyGreSQL
into shape would be faster than writing from scratch.
Main dev effort would be writing Postgres-specific DB-API 2.0
testsuite anyway.  It's likely that actual improvements needed
would be small.

Also I would suggest to match Psycopg extensions API, that would
make it usable to much broader audience.  (Eg: PygreSql does
not offer COPY extensions via the DB-API 2.0 API.  It would be
good to match Psycopg here, instead inventing new API.)

-- 
marko

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Hot Standby and DROP DATABASE

2010-02-05 Thread Tatsuo Ishii
Hi,

While testing Hot Standby, I have encountered strange behavior with
DROP DATABASE command.

1) connect to test database at standby via psql
2) issue DROP DATABASE test command to primary
3) session #1 works fine
4) close session #1
5) test database dropped on standby

Fromt the manual:

 Running DROP DATABASE, ALTER DATABASE ... SET TABLESPACE, or ALTER
 DATABASE ... RENAME on primary will generate a log message that will
 cause all users connected to that database on the standby to be
 forcibly disconnected. This action occurs immediately, whatever the
 setting of max_standby_delay.

So it seems at least the behavior is quite different from what the
docs stats. Am I missing something here?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread James William Pye
On Feb 5, 2010, at 1:34 PM, Marko Kreen wrote:
 py-postgresql seems to be more serious, but as it's python3 only
 which makes it irrelevant today.

Furthermore, if it did work on python2, it's *not* something that's going to 
appeal to mainstream users (Python heavy web frameworks) as it *partially* 
suffers from the same problem that pg8000 does. It's mostly pure-Python, but it 
has some C optimizations(notably, PQ message buffer). I have done some 
profiling, and *with a few tweaks* it's about 2x-3x *slower than psycopg2* for 
the retrieval of a single int column. I think it could go faster, but I don't 
think it's worth the work.

ISTM that the target audience are folk who are married to PG, and are generally 
unhappy with DB-API, but do not want to buy into a big abstraction layer like 
SQLAlchemy. Sure, it supports DB-API like other drivers so it *would be* usable 
with frameworks, but why take the 3x *or greater* hit over a properly 
implemented libpq version?

Finally, I just don't see the existing (often PG specific) goals that I have in 
mind for it appealing to the majority of [web framework/abstraction] users.

 Psycopg was the leader, especially in web-environments,
 but it has non-obvious license and with dead website it does not
 seem that attractive.  Although it is well-maintained still.
 
 Best path forward would be to talk with Psycopg guys about
 license clarification/change.

Yep.
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread James William Pye
On Feb 5, 2010, at 11:34 AM, Josh Berkus wrote:
 For people who use Python a lot, could I have a list of the deficiencies
 in DBAPI?  I've got my horse and lance ready.
 
 Given that SQLAlchemy isn't for everyone, of course ... it couldn't be,
 or Django would use it, no?


Here are some to start with:

 - paramstyle
 - No direct support of prepared statements
   [they *tried* to compensate for this with cached statements, but it's 
inadequate]
 - Too many *optional* extensions (Cursor Method .__iter__() being rather 
notable)

http://www.python.org/dev/peps/pep-0249/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread James William Pye
On Feb 5, 2010, at 8:00 AM, Peter Eisentraut wrote:
 I think another difference is that the Perl DBI interface is very rich,
 whereas the Python DB-API is quite minimal and almost forces people to
 write (incompatible) extensions.  

Yep.

 The DB-SIG at Python that ought to drive all this is also quite dead,

From reading messages that come into that list(not-so-lately), I've gotten the 
impression that they are content with DB-API 2. Aside from the TPC interfaces, 
the last round of activity that I know of was dbapi3[1 (which was a long while 
ago)].

 possibly because everyone has moved on to SQLAlchemy.

Yeah. AFAICT, that is the general direction encouraged by DB-SIG. Write an 
abstraction on top of DB-API. SQLAlchemy, SQLObject, anysql, and I'm sure there 
are others.


[1] http://wiki.python.org/moin/Aug2001DbApi3Strawman
http://wiki.python.org/moin/DbApi3
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Faster CREATE DATABASE by delaying fsync (was 8.4.1 ubuntu karmic slow createdb)

2010-02-05 Thread Greg Smith

Andres Freund wrote:

On 02/03/10 14:42, Robert Haas wrote:

Well, maybe we should start with a discussion of what kernel calls
you're aware of on different platforms and then we could try to put an
API around it.
In linux there is sync_file_range. On newer Posixish systems one can 
emulate that with mmap() and msync() (in batches obviously).


No idea about windows.


There's a series of parameters you can pass into CreateFile:  
http://msdn.microsoft.com/en-us/library/aa363858(VS.85).aspx


A lot of these are already mapped inside of src/port/open.c in a pretty 
straightforward way from the POSIX-oriented interface:


O_RDWR,O_WRONLY - GENERIC_WRITE, GENERIC_READ
O_RANDOM - FILE_FLAG_RANDOM_ACCESS
O_SEQUENTIAL - FILE_FLAG_SEQUENTIAL_SCAN
O_SHORT_LIVED - FILE_ATTRIBUTE_TEMPORARY
O_TEMPORARY - FILE_FLAG_DELETE_ON_CLOSE
O_DIRECT - FILE_FLAG_NO_BUFFERING
O_DSYNC - FILE_FLAG_WRITE_THROUGH

You have to read the whole Caching Behavior section to see exactly how 
all of those interact, and even then notes like 
http://support.microsoft.com/kb/99794 are needed to follow the fine 
points of things like FILE_FLAG_NO_BUFFERING vs. FILE_FLAG_WRITE_THROUGH.


So anything that's setting those POSIX open flags better than before is 
getting the benefit of that improvement on Windows, too.  But that's not 
quite the same as the changes using fadvise to provide better targeted 
cache control hints.


I'm getting the impression that doing much better on Windows might fall 
into the same sort of category as Solaris, where the primary interface 
for this sort of thing is to use an AIO implementation instead:  
http://msdn.microsoft.com/en-us/library/aa365683(VS.85).aspx


The effective_io_concurrency feature had proof of concept test programs 
that worked using AIO, but actually following through on that 
implementation would require a major restructuring of how the database 
interacts with the OS in terms of reads and writes of blocks.  It looks 
to me like doing something similar to sync_file_range on Windows would 
be similarly difficult.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reading deleted records - PageHeader v3

2010-02-05 Thread Tom Lane
Jonathan Bond-Caron jbo...@gmail.com writes:
 I think part of my problem is I haven't really understood what 'Then make
 sure you have the right alignment' means. 

 My approach currently is:

 After reading HeapTupleHeaderData (23 bytes), I advance another 4 bytes
 (hoff) and try to read a 32 bit integer (first attribute).

No.  First you start at the tuple beginning plus the number of bytes
indicated by hoff (which should be at least 24).  The first field
will always be right there, because this position is always maximally
aligned.  For subsequent fields you have to advance to a multiple of
the alignment requirement of the datatype.  For example, assume the
table's first column is of type bool (1 byte) and the second column
is of type integer.  The bool will be at offset hoff, but the integer
will be at offset hoff + 4 ... it can't immediately follow the bool,
at offset hoff + 1, because that position isn't correctly aligned.
It has to start at the next offset that's a multiple of 4.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [CFReview] Red-Black Tree

2010-02-05 Thread Robert Haas
2010/2/4 Teodor Sigaev teo...@sigaev.ru:
 Oleg's test (http://www.sai.msu.su/~megera/wiki/rbtree_test) are made with
 v0.10 which is differ from 0.11 only by comments around ginInsertRecordBA()

That looks pretty good.  I confess I don't fully understand why it
works.  If we're inserting a bunch of equal-key entries, why does it
matter what order we insert them in?  Is there some code in here
(where?) that breaks ties on the basis of where they are in the input
data?

I think that the code in ginInsertRecordBA() is needlessly complex.
As far as I can see, nNodesOnCurrentLevel is always exactly one more
than nNodesOnPreviousLevel, and I think step is also basically
redundant with both of these although the relationship is a little
more complex.  What I would suggest is something like:

- initialize step to the largest power of 2 s.t. step  nentry
- while step  0
-- for (i = step; true; i += 2 * step)
--- insert entry #i-1
--- if i  nentry - (2 * step)  /* must test before incrementing i, to
guard against overflow */
 break
-- step = step / 2

Typos:

bunary - binary
This insertion order decreases number of rebalancing for tree -
should be number of rebalancings
castomized - customized

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers