Re: [HACKERS] libpq WSACleanup is not needed

2009-01-21 Thread Magnus Hagander
James Mansion wrote:
 Andrew Chernow wrote:
 The only problem is how to detect the first connection.  In a threaded
 environment you'd have to perform locking in connectdb, which is
 probably not going to fly.
 Well, if you do an atomic test for a flag being zero, and if so then
 enter a critsec, do

This is not a problem, we do this in other places in libpq already.

 My understanding is that if you do WSAStartup and WSACleanup scoped to
 each connection
 then:
 - the internal counting means that only the 0 - 1 and  1 - 0
 transitions are expensive
 - libpq will only incur the cost if the application didn't do it already

Yes.


 So it seems that the cost is incurred by an application that:
 - makes no other use of winsock (or also does startup/cleanup often)
 - does not retain a connection (or pool) but creates and closes
  a single connection often

Correct.


 How many applications are there that match this pattern?  Isn't it
 enough just to tell
 the user to do WSAStartup and WSACleanup in main() if they find they
 have a performance problem?  Surely most Windows programs effectively do
 that
 anyway, often as a side effect of using a framework.

Yeah, I think an important point here is: If you are willing to call a
special PQinitWinsock() or whatever, then you can just call WSAStartup()
yourself, and the problem goes away...

I guess adding a connection parameter might help a little bit in that
you don't need an extra API call, but I'm unsure if it's worth it given
that the workaround is so simple.

In which case, we should perhaps just document the workaround using
WSAStartup() yourself, and not bother with either API or connection
parameter...

//Magnus

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] about truncate

2009-01-21 Thread Peter Eisentraut

Tom Lane wrote:

Peter Eisentraut pete...@gmx.net writes:
The SQL standard uses a recursive-by-default language.  For example, the 
rules for the DELETE command state:


Actually, I'm not convinced.  Take a look at the SELECT WITH HIERARCHY
OPTION stuff in SQL99 and later, in particular this from SQL99
12.2 grant privilege statement:


Ah, the mysterious HIERARCHY OPTION comes into play.  That appears to be 
the ticket.



 7) Let SWH be the set of privilege descriptors in CPD whose action
is SELECT WITH HIERARCHY OPTION, and let ST be the set of
subtables of O, then for every grantee G in SWH and for every
table T in ST, the following grant statement is effectively
executed without further Access Rule checking:

  GRANT SELECT ON T TO G GRANTED BY A

It's difficult to read that any other way than that privileges are *not*
auto-recursive, and they have chosen to spell * in GRANT as WITH
HIERARCHY OPTION (gackk).


Er, well, I see this piece from SQL:2008 on table reference:


1) Case:
[...]
B) [...], the current privileges shall include SELECT on at least one 
column of T.


2) If TP simply contains only spec and TN identifies a typed table, then
Case:
[...]
B) [...], the current privileges shall include SELECT WITH HIERARCHY 
OPTION on at least one supertable of T.



(The omitted phrases deal with SECURITY INVOKER situations.)

I read that as that privileges are auto-recursive, and that you need the 
hierarchy option to be permitted to use ONLY.  (So the hierarchy option 
is an additional privilege on top of SELECT that allows you to break the 
encapsulation of the inheritance setup.)



On the other hand, it's hard to square that reading with the lack of any
UPDATE or DELETE WITH HIERARCHY OPTION syntax.  What am I missing here?


You need SELECT with or without HIERARCHY, as the case may be, to locate 
the row.  Once you have located it, you can UPDATE or DELETE it 
depending on privilege, but then it doesn't matter anymore how you got it.



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] about truncate

2009-01-21 Thread Peter Eisentraut

Andrew Dunstan wrote:

It's just occurred to me that if TRUNCATE no longer means TRUNCATE ONLY,
parallel restore will need to detect which server version is being used 
so that for version  8.3 it issues TRUNCATE ONLY.


The pg_dump output was never backward compatible.  (The input is.)  So 
the output of parallel restore need not be backward compatible either. 
(Unless this mandate has changed dramatically while I was not looking?) 
 So always issue TRUNCATE ONLY, if that is what the logic requires. 
The additional benefit is that this will fail safely on older versions.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fixes for compiler warnings

2009-01-21 Thread Magnus Hagander
Alvaro Herrera wrote:
 Magnus Hagander escribió:
 
 For a change like
 http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/misc/guc.c?r1=1.480r2=1.481

 Will it work to stick _(hintmsg) around it there?
 
 Assuming that there is a gettext_noop() call in the literal that's
 assigned to hintmsg, yes, it should work.

Ok, I've applied a fix for this. Hope I got it right ;)

//Magnus

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] BUG #4186: set lc_messages does not work

2009-01-21 Thread Magnus Hagander
Hiroshi Inoue wrote:
 Magnus Hagander wrote:
 Hiroshi Inoue wrote:
 Magnus Hagander wrote:
 There still needs to be some error checking added in IsoLocaleName(),
 but this is a start.

 Can someone please test this? :-)
 OK I would check it tonight.

 Thanks.
 
 OK seems to works here.
 The attached is a test case using lc_messages C, fr, de, es and sv.

I have applied this version of the patch with some more error checking
added to the ISO locale function.

Along with the separate commit for the .mo file naming, this should
hopefully take care of this issue?

//Magnus


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.

2009-01-21 Thread Magnus Hagander
Hiroshi Inoue wrote:
 Hiroshi Inoue wrote:
 Magnus Hagander wrote:
 Hiroshi Inoue wrote:
 Hiroshi Inoue wrote:
 Bruce Momjian wrote:
 Hiroshi, is this patch still needed?
 Yes though it should be slightly changed now.

 In what way should it be changed?

 One is already committed by you.
  [COMMITTERS] pgsql: Use the new text domain names

 Another is to bind the codeset EUC-JP for
  PG_EUC_JIS_2004 server encoding.
 
 The attached is an updated patch.

Thanks.

Looking at it, the comment clearly needs updating - I'll do that.

However, one question: The comment currently says it's harmless to do
this on non-windows platforms. Does this still hold true? In that case,
this whole thing shouldn't be #ifdef:ed to WIN32 and can be simplified.
Or does the middle part of the comment come into play, in that the
codeset names can be different on different platforms?

Peter, can you comment on that?

If we do keep the thing win32 only, I think we should just wrap the
whole thing in #ifdef WIN32 and no longer do the codeset stuff at all on
Unix - that'll make for cleaner code.

//Magnus


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] reducing statistics write overhead

2009-01-21 Thread Alvaro Herrera
Martin Pihlak escribió:
 I wrote:
  I was thinking that the launcher should only request fresh stats at wakeup,
  the workers could then reuse that file. This could be implemented by calling
  pgstat_clear_snapshot only at launcher wakeup and setting max stats age to
  to autovacuum_naptime for the workers.
 
 Attached is a patch that increases the autovacuum stats age tolerance to
 autovacuum_naptime. This is handled by autovac_refresh_stats() by not clearing
 the stats snapshot unless nap time elapsed or explicitly forced by an error
 or SIGHUP.

You missed putting back the BUG comment that used to be there about
this.

In other words I think this is a bad idea, because there is a very wide
window for a table to be vacuumed twice.  Since naptime can be
arbitrarily large, this is an arbitrarily large bug.  I'm sure there are
other ways to fix this, but please propose those before this patch.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Pluggable Indexes (was Re: [HACKERS] rmgr hooks (v2))

2009-01-21 Thread Simon Riggs

On Wed, 2009-01-21 at 14:05 +0200, Heikki Linnakangas wrote:
 Simon Riggs wrote:
  Latest version of rmgr hooks patch for later review in current
  commitfest.
 
 I'd like to reject this patch.
...
 The external indexam use case doesn't impress me either, and Tom seems to 
 agree 
 (http://archives.postgresql.org/message-id/24006.1221483...@sss.pgh.pa.us).
 
 Plus there's the version incompatibility dangers. Although I think we 
 could put in some safeguards and live with it, it does open new 
 opportunities for confusion, so I'd rather not go there without a very 
 convincing use case.

The original design of Postgres allowed pluggable index access methods,
but that capability has not been brought forward to allow for WAL. This
patch would bridge that gap.

Right now we've got a variety of index types that are *not* flourishing
(hash, bitmap, grouped). If we allow them to develop as separate
projects, then whenever they are ready they can be used with particular
releases. You may doubt the worth of those index types but preventing
other people from building them seems strange. 

Why do we have 12+ pluggable languages, but we're not allowed to write
pluggable indexes? Whatever argument you put against it being too hard
or dangerous or whatever *also* applies to languages. Yet experience
shows pluggability has resulted in a variety of robust and useful
language types, some that might not have been predicted (PL/Proxy, PL/R
etc). They cover a variety of users and situations. Personally, I'd like
to enable people to come up with audio, video, bioinformatics datatypes
and indexes and I definitely don't want to limit the possibilities
there.

There is no danger here for Core, only opportunity. There *is* danger in
forcing new index designers to fit them into Core - look how unusable
hash indexes are. How can we allow that functionality to continue to
exist in Core and yet block the path by which we might reasonably
correct that?

You don't want pluggable indexes, don't use 'em. But that isn't an
argument against allowing the capability for others. That line of
thought would have led us to banning pluggable languages. We should
respect the roots of this project and look for ways to enable the
advancement of database technology, not limit it to only how far we can
currently see ahead through the fog.

 Plus there's the version incompatibility dangers. Although I think we 
 could put in some safeguards and live with it, it does open new 
 opportunities for confusion, so I'd rather not go there without a
 very 
 convincing use case.

There is danger in every plugin, so not a reasonable objection. Any
badly written external module can kill the database or smash data.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.

2009-01-21 Thread Peter Eisentraut

Magnus Hagander wrote:

However, one question: The comment currently says it's harmless to do
this on non-windows platforms. Does this still hold true?


Yes, the non-WIN32 code path appears to be the same, still.  But the 
ifdef WIN32 part we don't want, because that presumes something about 
the spelling of encoding names in the local iconv library.



If we do keep the thing win32 only, I think we should just wrap the
whole thing in #ifdef WIN32 and no longer do the codeset stuff at all on
Unix - that'll make for cleaner code.


Yes, that would be much better.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Pluggable Indexes (was Re: [HACKERS] rmgr hooks (v2))

2009-01-21 Thread Teodor Sigaev
The external indexam use case doesn't impress me either, and Tom seems to agree 
(http://archives.postgresql.org/message-id/24006.1221483...@sss.pgh.pa.us).

Just for correctness - there is one external index 
http://www.cs.purdue.edu/spgist/


--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.

2009-01-21 Thread Magnus Hagander
Peter Eisentraut wrote:
 Magnus Hagander wrote:
 However, one question: The comment currently says it's harmless to do
 this on non-windows platforms. Does this still hold true?
 
 Yes, the non-WIN32 code path appears to be the same, still.  But the
 ifdef WIN32 part we don't want, because that presumes something about
 the spelling of encoding names in the local iconv library.
 
 If we do keep the thing win32 only, I think we should just wrap the
 whole thing in #ifdef WIN32 and no longer do the codeset stuff at all on
 Unix - that'll make for cleaner code.
 
 Yes, that would be much better.

Something like this then?

//Magnus

*** a/src/backend/utils/mb/mbutils.c
--- b/src/backend/utils/mb/mbutils.c
***
*** 849,854  cliplen(const char *str, int len, int limit)
--- 849,894 
  	return l;
  }
  
+ #if defined(ENABLE_NLS)  defined(WIN32)
+ static const struct codeset_map {
+ 	int	encoding;
+ 	const char *codeset;
+ } codeset_map_array[] = {
+ 	{PG_UTF8, UTF-8},
+ 	{PG_LATIN1, LATIN1},
+ 	{PG_LATIN2, LATIN2},
+ 	{PG_LATIN3, LATIN3},
+ 	{PG_LATIN4, LATIN4},
+ 	{PG_ISO_8859_5, ISO-8859-5},
+ 	{PG_ISO_8859_6, ISO_8859-6},
+ 	{PG_ISO_8859_7, ISO-8859-7},
+ 	{PG_ISO_8859_8, ISO-8859-8},
+ 	{PG_LATIN5, LATIN5},
+ 	{PG_LATIN6, LATIN6},
+ 	{PG_LATIN7, LATIN7},
+ 	{PG_LATIN8, LATIN8},
+ 	{PG_LATIN9, LATIN-9},
+ 	{PG_LATIN10, LATIN10},
+ 	{PG_KOI8R, KOI8-R},
+ 	{PG_WIN1250, CP1250},
+ 	{PG_WIN1251, CP1251},
+ 	{PG_WIN1252, CP1252},
+ 	{PG_WIN1253, CP1253},
+ 	{PG_WIN1254, CP1254},
+ 	{PG_WIN1255, CP1255},
+ 	{PG_WIN1256, CP1256},
+ 	{PG_WIN1257, CP1257},
+ 	{PG_WIN1258, CP1258},
+ 	{PG_WIN866, CP866},
+ 	{PG_WIN874, CP874},
+ 	{PG_EUC_CN, EUC-CN},
+ 	{PG_EUC_JP, EUC-JP},
+ 	{PG_EUC_KR, EUC-KR},
+ 	{PG_EUC_TW, EUC-TW},
+ 	{PG_EUC_JIS_2004, EUC-JP}
+ };
+ #endif /* WIN32 */
+ 
  void
  SetDatabaseEncoding(int encoding)
  {
***
*** 859,880  SetDatabaseEncoding(int encoding)
  	Assert(DatabaseEncoding-encoding == encoding);
  
  	/*
! 	 * On Windows, we allow UTF-8 database encoding to be used with any
! 	 * locale setting, because UTF-8 requires special handling anyway.
! 	 * But this means that gettext() might be misled about what output
! 	 * encoding it should use, so we have to tell it explicitly.
! 	 *
! 	 * In future we might want to call bind_textdomain_codeset
! 	 * unconditionally, but that requires knowing how to spell the codeset
! 	 * name properly for all encodings on all platforms, which might be
! 	 * problematic.
! 	 *
! 	 * This is presently unnecessary, but harmless, on non-Windows platforms.
  	 */
! #ifdef ENABLE_NLS
! 	if (encoding == PG_UTF8)
! 		if (bind_textdomain_codeset(textdomain(NULL), UTF-8) == NULL)
! 			elog(LOG, bind_textdomain_codeset failed);
  #endif
  }
  
--- 899,921 
  	Assert(DatabaseEncoding-encoding == encoding);
  
  	/*
! 	 * On Windows, we need to explicitly bind gettext to the correct
! 	 * encoding, because gettext() tends to get confused.
  	 */
! #if defined(ENABLE_NLS)  defined(WIN32)
! 	{
! 		int	i;
! 
! 		for (i = 0; i  sizeof(codeset_map_array) / sizeof(codeset_map_array[0]); i++)
! 		{
! 			if (codeset_map_array[i].encoding == encoding)
! 			{
! if (bind_textdomain_codeset(textdomain(NULL), codeset_map_array[i].codeset) == NULL)
! 	elog(LOG, bind_textdomain_codeset failed);
! break;
! 			}
! 		}
! 	}
  #endif
  }
  

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] rmgr hooks (v2)

2009-01-21 Thread Simon Riggs

On Wed, 2009-01-21 at 14:05 +0200, Heikki Linnakangas wrote:
 Simon Riggs wrote:
  Latest version of rmgr hooks patch for later review in current
  commitfest.
 
 I'd like to reject this patch.
...
 I've read through all the related threads again, and I just still don't 
 see a convincing use case for it. I think that tools that let you 
 introspect and modify WAL files should be written as an external 
 toolkit, like pglesslog. 

The only reasonable way to examine the contents of WAL files is with
reference to a copy of the catalog that wrote them, timed *exactly* in
synchronisation with the WAL stream.

If somebody issued
CREATE TABLE x
INSERT INTO x
DROP TABLE
then the only time you can reasonably look at the data from the insert
is while replaying that record. At no other time does the data have
certain meaning. 

So you *must* replay catalog entries and recreate the original catalog
in exact synchronisation with reading WAL files. Recreating the catalog
can only be done by Postgres itself. It simply isn't practical to do
this all with an external tool, or even link in to replay somehow to
keep replay and the reading of the external file synchronised. If it
*was*, somebody would have done it already - some have already tried and
failed.

(I haven't suggested modifying WAL files, BTW, not sure where that came
from).

 Regarding the example plugin included, for debugging purposes you could 
 just compile with WAL_DEBUG, and the plugin to suppress actions for all 
 but one database is clearly not ready for any real work. It only 
 suppresses heapam records, replaying index updates and full-page-images 
 as usual, and it requires that you know the Oid of the database,

They're minor examples, so don't reject the plugin patch because the
example of usage isn't as useful as it could be. I'm hardly likely to
invest lots of time in a plugin while the approach has not been agreed,
am I?

It is viable for us to filter WAL records in this way, and not very
viable any other way. It doesn't require you to know the Oid of the
database, cos you can look that up in the catalog (with hot standby).
The example plugin doesn't do that, but it could.

So two use cases: inspecting WAL and filtering records before applying
them are covered here. Pluggable indexes is another, and there are
others also, as discussed on the original patch.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] about truncate

2009-01-21 Thread Andrew Dunstan



Peter Eisentraut wrote:

Andrew Dunstan wrote:

It's just occurred to me that if TRUNCATE no longer means TRUNCATE ONLY,
parallel restore will need to detect which server version is being 
used so that for version  8.3 it issues TRUNCATE ONLY.


The pg_dump output was never backward compatible.  (The input is.)  So 
the output of parallel restore need not be backward compatible either. 
(Unless this mandate has changed dramatically while I was not 
looking?)  So always issue TRUNCATE ONLY, if that is what the logic 
requires. The additional benefit is that this will fail safely on 
older versions.




No it won't fail safely on older versions, because the truncate is part 
of a transaction, and thus the data member(s) will all fail. I'd like to 
be able to use 8.4 pg_restore to run parallel restores on older servers, 
and the fix for this is utterly trivial. I'll be posting a new patch 
with it in today.


(If we can't or don't want to make it work with older servers, I will 
create an out-of-tree patch for 8.3 that does, and put it on pgFoundry. 
But that would be a pity.)


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] libpq WSACleanup is not needed

2009-01-21 Thread Andrew Chernow

Magnus Hagander wrote:


In which case, we should perhaps just document the workaround using
WSAStartup() yourself, and not bother with either API or connection
parameter...




I didn't originally agree with this but now I do.  Any libpq init function for 
wsa, would only be replacing an app calling WSAStartup themselves.  So, why have 
it at all.


--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] rmgr hooks (v2)

2009-01-21 Thread Heikki Linnakangas

Simon Riggs wrote:

Latest version of rmgr hooks patch for later review in current
commitfest.


I'd like to reject this patch.

I've read through all the related threads again, and I just still don't 
see a convincing use case for it. I think that tools that let you 
introspect and modify WAL files should be written as an external 
toolkit, like pglesslog. The external indexam use case doesn't impress 
me either, and Tom seems to agree 
(http://archives.postgresql.org/message-id/24006.1221483...@sss.pgh.pa.us).


Plus there's the version incompatibility dangers. Although I think we 
could put in some safeguards and live with it, it does open new 
opportunities for confusion, so I'd rather not go there without a very 
convincing use case.


Regarding the example plugin included, for debugging purposes you could 
just compile with WAL_DEBUG, and the plugin to suppress actions for all 
but one database is clearly not ready for any real work. It only 
suppresses heapam records, replaying index updates and full-page-images 
as usual, and it requires that you know the Oid of the database,


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Column-Level Privileges

2009-01-21 Thread KaiGai Kohei
Stephen Frost wrote:
 * Tom Lane (t...@sss.pgh.pa.us) wrote:
 On looking closer, though, it's *still* messy and unobvious :-(.
 There is no single place in the parser where we have the complete
 multi-level query tree available in a convenient form for this sort of
 postprocessing.
 
 That's unfortunate. :/
 
 I've thought of a less painful variant of my third option: instead of
 making a permanent addition to RangeTblEntry, we can have a transient
 data structure attached to ParseState that lets us find the JoinExpr
 nodes for already-parsed joins.  I'm going to try that next.
 
 Sounds reasonable.  I'd be happy to help if there's anything useful I
 can do at this point.

I also think it can be a reasonable approach.

However, as an aside, it will not be a help for SE-PostgreSQL, because
it checks Query tree *after* it passed through the rewriter stage, so
ParseState is already released. :-(

 
http://code.google.com/p/sepgsql/source/browse/trunk/sepgsql/src/backend/security/sepgsql/proxy.c#395

 QueryRewrite()
 - pgacePostQueryRewrite()
- sepgsqlPostQueryRewrite()
  - walkQueryHelper()
- walkVarHelper()
  - wholeRefJoinWalker()

Yes, it is an optional facility and we assume performance is not first
priority for SE-PostgreSQL users. However, if its duration of life has
been expanded to the tail of rewriter, I would be also happy.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Pluggable Indexes (was Re: [HACKERS] rmgr hooks (v2))

2009-01-21 Thread Simon Riggs

On Wed, 2009-01-21 at 16:25 +0300, Teodor Sigaev wrote:
  The external indexam use case doesn't impress me either, and Tom seems to 
  agree 
  (http://archives.postgresql.org/message-id/24006.1221483...@sss.pgh.pa.us).

 Just for correctness - there is one external index 
 http://www.cs.purdue.edu/spgist/

If there is one even when we don't allow them (!), just think how many
there will be if we did allow them...

The docs for the SP-GiST describe PostgreSQL as highly extensible. I'd
like that to extend to allowing recoverable extensions also.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] rmgr hooks (v2)

2009-01-21 Thread Greg Stark
On Wed, Jan 21, 2009 at 1:25 PM, Simon Riggs si...@2ndquadrant.com wrote:

 The only reasonable way to examine the contents of WAL files is with
 reference to a copy of the catalog that wrote them, timed *exactly* in
 synchronisation with the WAL stream.

This is a good point.


 Regarding the example plugin included, for debugging purposes you could
 just compile with WAL_DEBUG, and the plugin to suppress actions for all
 but one database is clearly not ready for any real work. It only
 suppresses heapam records, replaying index updates and full-page-images
 as usual, and it requires that you know the Oid of the database,

 They're minor examples, so don't reject the plugin patch because the
 example of usage isn't as useful as it could be. I'm hardly likely to
 invest lots of time in a plugin while the approach has not been agreed,
 am I?

Well for these two cases I think the question is would the be better
done from within the core instead of a plugin? And if they are better
done as a plugin are the advantages strong enough to outweigh the
downsides of a plugin. (This actually reinforces the point that doing
these things externally is not very realistic.)

I don't see much of an advantage for plugins instead of core features
for either of these two cases. And given how tightly bound to a
specific version and the WAL record formats of that version a plugin
will have are there any advantages? If a plugin will only work with a
particular version of Postgres and it needs access to internal include
files then what separation does it give? From a code structure point
of view it may as well be integrated, in which case anyone who
modifies the wal structures is more likely to keep the other features
up to date.

Moreover, for things like restoring a single database I think there
are further disadvantages. You would have to ensure that the records
you're skipping don't result in an incoherent database. That means
either doing a cold restore of just a single database. That could be
really cool, you could, for instance allow rolling back a single
database to a hot backup + PITR without even shutting down the rest of
the cluster. However for anything like this to work properly you have
to know what version of the data files were restored and what version
the rest of the database is at, etc. If it's a plugin I think you
don't have enough information or control of the overall state to
handle it.

The only advantage that remains, I think, is the real-world concern
that you can have proprietary plugins that add features to the
database for dealing with emergency situations. It also means people
can experiment with features without maintaining a fork. That's not a
trivial advantage at all. I could see that being quite useful. But on
balance, considering how critical backups and restores are I would
personally avoid experimenting in this area anyways.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] about truncate

2009-01-21 Thread Peter Eisentraut

Andrew Dunstan wrote:
The pg_dump output was never backward compatible.  (The input is.)  So 
the output of parallel restore need not be backward compatible either. 
(Unless this mandate has changed dramatically while I was not 
looking?)  So always issue TRUNCATE ONLY, if that is what the logic 
requires. The additional benefit is that this will fail safely on 
older versions.


No it won't fail safely on older versions, because the truncate is part 
of a transaction, and thus the data member(s) will all fail.


I meant safe as in, it won't randomly delete more data than you 
intended.  I didn't mean in as in do-what-I-mean. :-)


I'd like to 
be able to use 8.4 pg_restore to run parallel restores on older servers, 
and the fix for this is utterly trivial. I'll be posting a new patch 
with it in today.


Works for me.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCHES] GIN improvements

2009-01-21 Thread Teodor Sigaev
- after limit is reached, force cleanup of pending list by calling 
gininsertcleanup. Not very good, because users sometimes will see a huge 
execution time of simple insert. Although users who runs a huge update should be 
satisfied.


I have difficulties in a choice of way. Seems to me, the better will be second 
way: if user gets very long time of insertion then (auto)vacuum of his 
installation should tweaked.

I agree that the second solution sounds better to me.



Done. Now GIN counts number of pending tuples and pages and stores they on 
metapage. Index cleanup could start during normal insertion in two cases:

- number of pending tuples is too high to keep guaranteed non-lossy tidbitmap
- pending page's content doesn't fit into work_mem.

BTW, gincostestimate could use that information for cost estimation, but is 
index opening and metapge reading in amcostestimate acceptable?




--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


fast_insert_gin-0.23.gz
Description: Unix tar archive

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] EnableDisableTrigger Cleanup Questions

2009-01-21 Thread Jonah H. Harris
On Thu, Nov 6, 2008 at 12:03 AM, Jonah H. Harris jonah.har...@gmail.comwrote:

 As I wasn't sure whether anyone agrees with my distaste for
 repurposing tgenabled as mentioned above, I have attached is a patch
 which minimally corrects the function comment for EnableDisableTrigger
 where fires_when is concerned.


Was there a reason that this cleanup patch wasn't applied?

-- 
Jonah H. Harris, Senior DBA
myYearbook.com


Re: [HACKERS] FWD: Re: Updated backslash consistency patch

2009-01-21 Thread Martin Pihlak
Bruce Momjian wrote:
 Bruce Momjian wrote:
 I know we don't like the current behavior, but I think we need to make
 them consistent first for easy testing and so when we change it, it will
 remain consistent.

 I will work on a consensus patch soon for the new behavior.
 

The \dXU *.* commands also display objects from information_schema. IMHO
these should also be classified as system objects. It is most annoying to
run '\dfU *.*' and see a list of information_schema internal functions show up.
Whereas the intent was to see the user defined functions in all schemas.

regards,
Martin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pluggable Indexes

2009-01-21 Thread Gregory Stark
Simon Riggs si...@2ndquadrant.com writes:

 The original design of Postgres allowed pluggable index access methods,
 but that capability has not been brought forward to allow for WAL. This
 patch would bridge that gap.

Well I think what people do is what GIST did early on -- they just don't
support recoverability until they get merged into core.

Nonetheless this *would* be a worthwhile problem to put effort into solving. I
agree that there are lots of exotic index methods out there that it would be
good to be able to develop externally.

But to do that we need an abstract interface that doesn't depend on internal
data structures, not a generic plugin facility that allows the plugin to
hijack the whole system.

We need something more like indexams which provides a set of call points which
do specific functions, only get called when they're needed, and are expected
to only do the one thing they've been asked to do.

This could be a bit tricky since the catalog isn't available to the wal replay
system. We can't just store the info needed in the pg_indexam table. And it
has to span all the databases in the cluster in any case.

Perhaps this should be solved along with the plugins thread. Binary modules
could have some way to register their rmgr id so you could guarantee that
there aren't two plugins with conflicting rmgr ids or version mismatches.


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FWD: Re: Updated backslash consistency patch

2009-01-21 Thread Gregory Stark
Martin Pihlak martin.pih...@gmail.com writes:

 Bruce Momjian wrote:
 Bruce Momjian wrote:
 I know we don't like the current behavior, but I think we need to make
 them consistent first for easy testing and so when we change it, it will
 remain consistent.

 I will work on a consensus patch soon for the new behavior.
 

 The \dXU *.* commands also display objects from information_schema. IMHO
 these should also be classified as system objects. It is most annoying to
 run '\dfU *.*' and see a list of information_schema internal functions show 
 up.
 Whereas the intent was to see the user defined functions in all schemas.


You know I think I've come around to agreeing with one of Tom's proposals.

I think we should do the following:

\dX   : list user objects
\dXS  : list system objects
\dX pat : list all matching objects based on search_path
\dX *.*   : list all objects in all schemas

I've basically come to the conclusion that having the output agree with
behaviour at run-time is a big advantage and anything else would actually be
too dangerous.

If you do something like \dt p* or \df a* and are annoyed by the output
you just have to make your pattern something more specific. For tables we
already prefix them all with pg_ so one more letter ought to be enough. 

For functions it would be nice if we could trim the output quite a bit. I
wonder if we could rename all our internal functions which implement operators
and indexam methods without introducing any backwards compatibility issues. We
don't document things like int4gt after all.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] reducing statistics write overhead

2009-01-21 Thread Martin Pihlak
I wrote:
 I was thinking that the launcher should only request fresh stats at wakeup,
 the workers could then reuse that file. This could be implemented by calling
 pgstat_clear_snapshot only at launcher wakeup and setting max stats age to
 to autovacuum_naptime for the workers.
 

Attached is a patch that increases the autovacuum stats age tolerance to
autovacuum_naptime. This is handled by autovac_refresh_stats() by not clearing
the stats snapshot unless nap time elapsed or explicitly forced by an error
or SIGHUP.

For the time being, I left the table vacuum recheck in place. Removing the
table_recheck_autovac function requires some further work. I have started on
this, but decided to defer until it is clear whether the whole approach is
acceptable or not.

regards,
Martin

*** a/src/backend/postmaster/autovacuum.c
--- b/src/backend/postmaster/autovacuum.c
***
*** 44,54 
   * Note that there can be more than one worker in a database concurrently.
   * They will store the table they are currently vacuuming in shared memory, so
   * that other workers avoid being blocked waiting for the vacuum lock for that
!  * table.  They will also reload the pgstats data just before vacuuming each
!  * table, to avoid vacuuming a table that was just finished being vacuumed by
!  * another worker and thus is no longer noted in shared memory.  However,
!  * there is a window (caused by pgstat delay) on which a worker may choose a
!  * table that was already vacuumed; this is a bug in the current design.
   *
   * Portions Copyright (c) 1996-2009, PostgreSQL Global Development Group
   * Portions Copyright (c) 1994, Regents of the University of California
--- 44,53 
   * Note that there can be more than one worker in a database concurrently.
   * They will store the table they are currently vacuuming in shared memory, so
   * that other workers avoid being blocked waiting for the vacuum lock for that
!  * table. There is a possibility that a worker might pick up a table that was
!  * already vacuumed by another process. This isn't really a problem, as the
!  * odds of this happening are low and the revacuum is made cheap by the use of
!  * visibility map.
   *
   * Portions Copyright (c) 1996-2009, PostgreSQL Global Development Group
   * Portions Copyright (c) 1994, Regents of the University of California
***
*** 120,128  int			autovacuum_vac_cost_limit;
  
  int			Log_autovacuum_min_duration = -1;
  
- /* how long to keep pgstat data in the launcher, in milliseconds */
- #define STATS_READ_DELAY 1000
- 
  
  /* Flags to tell if we are in an autovacuum process */
  static bool am_autovacuum_launcher = false;
--- 119,124 
***
*** 298,304  static void avl_sighup_handler(SIGNAL_ARGS);
  static void avl_sigusr1_handler(SIGNAL_ARGS);
  static void avl_sigterm_handler(SIGNAL_ARGS);
  static void avl_quickdie(SIGNAL_ARGS);
! static void autovac_refresh_stats(void);
  
  
  
--- 294,300 
  static void avl_sigusr1_handler(SIGNAL_ARGS);
  static void avl_sigterm_handler(SIGNAL_ARGS);
  static void avl_quickdie(SIGNAL_ARGS);
! static void autovac_refresh_stats(bool force);
  
  
  
***
*** 500,509  AutoVacLauncherMain(int argc, char *argv[])
  		DatabaseList = NULL;
  
  		/*
! 		 * Make sure pgstat also considers our stat data as gone.  Note: we
! 		 * mustn't use autovac_refresh_stats here.
  		 */
! 		pgstat_clear_snapshot();
  
  		/* Now we can allow interrupts again */
  		RESUME_INTERRUPTS();
--- 496,504 
  		DatabaseList = NULL;
  
  		/*
! 		 * Make sure pgstat also considers our stat data as gone.
  		 */
! 		autovac_refresh_stats(true);
  
  		/* Now we can allow interrupts again */
  		RESUME_INTERRUPTS();
***
*** 598,603  AutoVacLauncherMain(int argc, char *argv[])
--- 593,601 
  		if (got_SIGTERM)
  			break;
  
+ 		/* Refresh stats. Force it, if reloaded via SIGHUP */
+ 		autovac_refresh_stats(got_SIGHUP);
+ 
  		if (got_SIGHUP)
  		{
  			got_SIGHUP = false;
***
*** 851,859  rebuild_database_list(Oid newdb)
  	int			nelems;
  	HTAB	   *dbhash;
  
- 	/* use fresh stats */
- 	autovac_refresh_stats();
- 
  	newcxt = AllocSetContextCreate(AutovacMemCxt,
     AV dblist,
     ALLOCSET_DEFAULT_MINSIZE,
--- 849,854 
***
*** 1078,1086  do_start_worker(void)
     ALLOCSET_DEFAULT_MAXSIZE);
  	oldcxt = MemoryContextSwitchTo(tmpcxt);
  
- 	/* use fresh stats */
- 	autovac_refresh_stats();
- 
  	/* Get a list of databases */
  	dblist = get_database_list();
  
--- 1073,1078 
***
*** 2145,2158  do_autovacuum(void)
  		}
  
  		/*
! 		 * Check whether pgstat data still says we need to vacuum this table.
! 		 * It could have changed if something else processed the table while
! 		 * we weren't looking.
! 		 *
! 		 * Note: we have a special case in pgstat code to ensure that the stats
! 		 * we read are as up-to-date as possible, to avoid the problem that

Re: [HACKERS] rmgr hooks (v2)

2009-01-21 Thread Simon Riggs

On Wed, 2009-01-21 at 14:28 +, Greg Stark wrote:

 The only advantage that remains, I think, is the real-world concern
 that you can have proprietary plugins 

How exactly is this plugin more likely to result in a proprietary plugin
than all of the other plugin types we have? Because I suggest it??

I find it quite amazing that anybody would think I proposed a patch
whose only advantage lay in commercial exploitation, implying that I
intend that. But at least you had the courage to write it, allowing me
to answer, so actually I'll say thank you for raising that point:

** I have no plans for selling software that has been enabled by this
patch. **

The plugin approach was suggested because it brings together so many use
cases in one and adds missing robustness to a case where we already have
extensibility. Extensibility is about doing things for specific
implementations *without* needing to patch Postgres, not just allowing
external projects to exist alongside.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pluggable Indexes

2009-01-21 Thread Simon Riggs

On Wed, 2009-01-21 at 14:57 +, Gregory Stark wrote:
 But to do that we need an abstract interface that doesn't depend on
 internal data structures, not a generic plugin facility that allows
 the plugin to hijack the whole system.
 
 We need something more like indexams which provides a set of call
 points which do specific functions, only get called when they're
 needed, and are expected to only do the one thing they've been asked
 to do.

Really this is just ridiculous scare-mongering. Hijack the whole system?

The patch takes special care to allow calls to the rmgr functions only
from the startup process. The APIs are exactly like the indexams and
*are* called only in specific ways, at specific times. At your earlier
request I put in filters to prevent WAL inserts for plugins that didn't
exist, ensuring that all WAL writes were crash recoverable.

You can already do all the weird stuff you like with index AMs, like
send emails to the Pope on every row insert. I can already create an
in-memory index for example. How exactly do the rmgr interface give more
power? The structure of the function pointers is identical to the
indexAM code...

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.

2009-01-21 Thread Peter Eisentraut

Magnus Hagander wrote:

Peter Eisentraut wrote:

Magnus Hagander wrote:

However, one question: The comment currently says it's harmless to do
this on non-windows platforms. Does this still hold true?

Yes, the non-WIN32 code path appears to be the same, still.  But the
ifdef WIN32 part we don't want, because that presumes something about
the spelling of encoding names in the local iconv library.


If we do keep the thing win32 only, I think we should just wrap the
whole thing in #ifdef WIN32 and no longer do the codeset stuff at all on
Unix - that'll make for cleaner code.

Yes, that would be much better.


Something like this then?


Looks OK to me.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] is 8.4 array_agg() supposed to work with array values?

2009-01-21 Thread Todd A. Cook

Peter Eisentraut wrote:

On Monday 19 January 2009 23:22:21 Todd A. Cook wrote:

The docs at
http://developer.postgresql.org/pgdocs/postgres/functions-aggregate.html
don't prohibit using array values with array_arg(), so I assumed that it
would work.



test= select array_agg(v.a) from (values (array[1,2]), (array[3,4])) as
v(a) ; ERROR:  could not find array type for data type integer[]


Yeah ... This is one of the weirdnesses of the PostgreSQL array 
implementation.  integer[] and integer[][] etc. are actually the same type, 
just using a different number of dimensions internally.  This would work much 
better if integer[][] where array of integer[], in the same way as 
integer[] is array of integer, in the way C deals with arrays.  This is 
also a main reason why composite types and arrays don't mix orthogonally; 
there is no way to represent that in the system catalogs.  To get back to 
your question, as far as array_agg() itself is concerned, it would probably 
work, but the rest of the sytem doesn't deal with it very well.  You will 
probably be able to find a number of other places that break when trying to 
derive the array type from something that is already an array.


Thanks for the clarification.

-- todd

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_restore -1 vs -C and -c

2009-01-21 Thread Magnus Hagander
Tom Lane wrote:
 Magnus Hagander mag...@hagander.net writes:
 As for -c, the solution would be to issue DROP IF EXISTS  
 statements. Is there any particular reason why we don't?
 I think we did that to avoid damaging portability and backwards
 compatibility of the dump files.  The backwards compatibility argument
 is pretty weak by now, but the it's not standard SQL argument still
 has force.
 
 IIRC the drop statements are generated by pg_restore and not stored in  
 the archive. So we could do the if exists by default and have a switch  
 to turn it off for a compatible dump, perhaps?
 
 No, the text of the statements is in the archive; though it might not be
 too painful to have pg_restore edit them to insert IF EXISTS.  You
 don't need an extra switch, just do this if -1 is in use (and document
 that that switch reduces the standard-ness of the output...)

Something along the line of this?

(This is for the actual injection, I still haven't implemented
switch/decided when to actually include it, so this is not for
application yet - just for a comment on the general method..)

//Magnus

*** a/src/bin/pg_dump/pg_backup_archiver.c
--- b/src/bin/pg_dump/pg_backup_archiver.c
***
*** 123,128  CloseArchive(Archive *AHX)
--- 123,144 
  	 strerror(errno));
  }
  
+ /*
+  * List all objects that can be DROPped that are made up of more
+  * than a single word.
+  */
+ static const char *multiword_drops[] = {
+ 	FOREIGN DATA WRAPPER,
+ 	OPERATOR CLASS,
+ 	OPERATOR FAMILY,
+ 	TEXT SEARCH CONFIGURATION,
+ 	TEXT SEARCH DICTIONARY,
+ 	TEXT SEARCH PARSER,
+ 	TEXT SEARCH TEMPLATE,
+ 	USER MAPPING,
+ 	NULL
+ };
+ 
  /* Public */
  void
  RestoreArchive(Archive *AHX, RestoreOptions *ropt)
***
*** 249,256  RestoreArchive(Archive *AHX, RestoreOptions *ropt)
  /* Select owner and schema as necessary */
  _becomeOwner(AH, te);
  _selectOutputSchema(AH, te-namespace);
! /* Drop it */
! ahprintf(AH, %s, te-dropStmt);
  			}
  		}
  
--- 265,308 
  /* Select owner and schema as necessary */
  _becomeOwner(AH, te);
  _selectOutputSchema(AH, te-namespace);
! /*
!  * Figure out if it's something we can do DROP IF EXISTS on.
!  * Check for DROP  just to be sure.
!  */
! if (strncmp(te-dropStmt, DROP , 5) == 0)
! {
! 	char   *cp = te-dropStmt + 5;
! 	char   *insertpoint = NULL;
! 	char   *newstr = NULL;
! 	int		i;
! 
! 	/*
! 	 * Assume that all objects can be DROP IF EXISTS:ed. However,
! 	 * some have more than one word in them, so we need to figure
! 	 * out exactly where to insert the IF EXISTS.
! 	 */
! 	for (i = 0; multiword_drops[i] != NULL; i++)
! 	{
! 		if (strncmp(cp, multiword_drops[i], strlen(multiword_drops[i])) == 0)
! 		{
! 			insertpoint = cp + strlen(multiword_drops[i]);
! 			break;
! 		}
! 	}
! 	if (insertpoint == NULL)
! 		insertpoint = strchr(cp, ' ');
! 	if (insertpoint == NULL)
! 		die_horribly(AH,modulename,malformatted DROP statement: %s, te-dropStmt);
! 
! 	newstr = calloc(strlen(te-dropStmt) + 11, 1); /* IF EXISTS  + terminator */
! 	strncpy(newstr, te-dropStmt, insertpoint - te-dropStmt);
! 	strcat(newstr,  IF EXISTS);
! 	strcat(newstr, insertpoint);
! 	ahprintf(AH, %s, newstr);
! 	free(newstr);
! }
! else
! 	ahprintf(AH, %s, te-dropStmt);
  			}
  		}
  

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] rmgr hooks (v2)

2009-01-21 Thread Gregory Stark

Simon Riggs si...@2ndquadrant.com writes:

 On Wed, 2009-01-21 at 14:28 +, Greg Stark wrote:

 The only advantage that remains, I think, is the real-world concern
 that you can have proprietary plugins 

 ** I have no plans for selling software that has been enabled by this
 patch. **

Hm, I didn't specifically mean this. However I'm not sure why this would be
considered so prejudicial. The Postgres project isn't generally hostile to
commercial use and extensions. If there was something you *did* want to sell
based on this and you needed a clean, generally useful interface to do it then
I think it would be an argument in *favour* of providing it, not against.

But I meant more generally, that the real-world use case for a generic rmgr
plugin function is for providing interfaces for things which cannot -- for
whatever non-code-related reason -- be integrated in core. That is, from a
code point of view they would be best integrated in core. So either they're
not generally useful, not production quality, not license compatible, or
whatever.

 The plugin approach was suggested because it brings together so many use
 cases in one and adds missing robustness to a case where we already have
 extensibility. Extensibility is about doing things for specific
 implementations *without* needing to patch Postgres, not just allowing
 external projects to exist alongside.

I think a generic plugin architecture is *too* many use cases. That is it's
too flexible and doesn't make any promises at all of what its intended to do.
As a result the system can't be sure it's calling the right method, can't
detect conflicts or catch errors. There's a sweet spot of abstraction where
the interface has to be specific enough to be useful but general enough to
cover all the use cases.

I'm not sure though, your comments in the other email make me think there
might be more to the patch that I had the impression was there. Will now go
read the patch and see if I was mistaken.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Automatic view update rules

2009-01-21 Thread Peter Eisentraut

Here is my latest reworked patch that fixes all outstanding issues.


view_update-petere-20090121.patch.bz2
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Status Report on Hot Standby

2009-01-21 Thread Mark Kirkwood

Robert Treat wrote:

On Friday 16 January 2009 19:16:42 Simon Riggs wrote:
  

Bruce asked for 2 more weeks to get patches into shape for commit.

Current patch v8e is attached here. Ready for commit? Up to you.

My overall opinion is that it's in very good shape. Worth the community
including it in this release and spending further time on it. I'm happy
to stand by this going forwards.




+1

  

+1

I've been testing several versions of this patch, and overall it looks 
very good.


regards

Mark

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pluggable Indexes

2009-01-21 Thread Heikki Linnakangas

Gregory Stark wrote:

But to do that we need an abstract interface that doesn't depend on internal
data structures, not a generic plugin facility that allows the plugin to
hijack the whole system.

We need something more like indexams which provides a set of call points which
do specific functions, only get called when they're needed, and are expected
to only do the one thing they've been asked to do.


That's called GiST. ;-)

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] reducing statistics write overhead

2009-01-21 Thread Martin Pihlak
Alvaro Herrera wrote:
 You missed putting back the BUG comment that used to be there about
 this.
 

This was deliberate, I did mention the condition in the comment at
the beginning of the file. This actually makes it a feature :)

Seriously though, do you think that this is still a problem? Given
the rare occurrence of the revacuum and the fact that it is made
cheap by visibility map? In my initial testing, I couldn't reproduce
the revacuum. But I'll keep at it.

 In other words I think this is a bad idea, because there is a very wide
 window for a table to be vacuumed twice.  Since naptime can be
 arbitrarily large, this is an arbitrarily large bug.  I'm sure there are
 other ways to fix this, but please propose those before this patch.
 

I was wondering that maybe the stats subsystem shouldn't be used for
vacuum tracking at all. It maybe convenient to use, but has several
deficiencies (pobig file, lossy, no crash safety, etc). Could we move
vacuum tracking to pg_class instead?

regards,
Martin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Pluggable Indexes (was Re: [HACKERS] rmgr hooks (v2))

2009-01-21 Thread Heikki Linnakangas

Simon Riggs wrote:

Right now we've got a variety of index types that are *not* flourishing
(hash, bitmap, grouped).


Hash indexam has been in core for ages, and yet no-one has bothered to 
implement WAL logging. If I've understood correctly, it has been now 
been revamped in 8.4 so that there's a performance use case to use it. I 
wouldn't be surprised if someone (GSoC?) implements WAL logging for it 
for 8.5.


Bitmap indexes required significant changes to the rest of the system, 
the indexam API in particular.


By grouped, I presume you mean my grouped index tuples patch, aka 
clustered indexes. That too required changes to the indexam API, and 
even if it didn't, I can guarantee that I wouldn't spend any more time 
on it than I do now (= 0) if it was on pgfoundry.



If we allow them to develop as separate
projects, then whenever they are ready they can be used with particular
releases.


Developing a new indexam is not something you do over the weekend. It's 
a long way from design to an implementation robust enough that anyone 
cares about crash recovery. Short-circuiting the release cycle with a 
plugin won't get you a production-ready indexam much sooner.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] reducing statistics write overhead

2009-01-21 Thread Alvaro Herrera
Martin Pihlak escribió:
 Alvaro Herrera wrote:
  You missed putting back the BUG comment that used to be there about
  this.
 
 This was deliberate, I did mention the condition in the comment at
 the beginning of the file. This actually makes it a feature :)
 
 Seriously though, do you think that this is still a problem? Given
 the rare occurrence of the revacuum and the fact that it is made
 cheap by visibility map?

Hmm, maybe it's no longer an issue with the visibility map, yes.

 I was wondering that maybe the stats subsystem shouldn't be used for
 vacuum tracking at all. It maybe convenient to use, but has several
 deficiencies (pobig file, lossy, no crash safety, etc). Could we move
 vacuum tracking to pg_class instead?

I agree that pgstats is not ideal (we've said this from the very
beginning), but I doubt that updating pg_class is the answer; you'd be
generating thousands of dead tuples there.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] rmgr hooks (v2)

2009-01-21 Thread Simon Riggs

On Wed, 2009-01-21 at 16:07 +, Gregory Stark wrote:
 
  The plugin approach was suggested because it brings together so many
 use cases in one and adds missing robustness to a case where we
 already have extensibility. Extensibility is about doing things for
 specific implementations *without* needing to patch Postgres, not just
 allowing external projects to exist alongside.
 
 I think a generic plugin architecture is *too* many use cases. That is
 it's too flexible and doesn't make any promises at all of what its
 intended to do. 

I agree. I don't see providing the plugin capability should prevent
provision of further features in this area. Indeed, I see it as a way of
encouraging people to write stuff for Postgres, which we then reel
slowly back into core, if it is robust enough and general purpose
enough. My model is PL/Proxy: the capability we will eventually gain in
Core will be because we gave solution designers a free hand to invent
and a free hand to overcome obstacles in months, not years. Solutions
now, better solutions later. 

 I'm not sure though, your comments in the other email make me think
 there might be more to the patch that I had the impression was there.
 Will now go read the patch and see if I was mistaken.

Thank you.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] rmgr hooks (v2)

2009-01-21 Thread Heikki Linnakangas

Simon Riggs wrote:

So you *must* replay catalog entries and recreate the original catalog
in exact synchronisation with reading WAL files. Recreating the catalog
can only be done by Postgres itself.


The startup process doesn't have a relcache, so this rmgr patch is 
nowhere near enough to enable that. If I understood correctly, the hot 
standby patch doesn't change that either.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Pluggable Indexes (was Re: [HACKERS] rmgr hooks (v2))

2009-01-21 Thread Bruce Momjian
Heikki Linnakangas wrote:
 Simon Riggs wrote:
  Right now we've got a variety of index types that are *not* flourishing
  (hash, bitmap, grouped).
 
 Hash indexam has been in core for ages, and yet no-one has bothered to 
 implement WAL logging. If I've understood correctly, it has been now 
 been revamped in 8.4 so that there's a performance use case to use it. I 
 wouldn't be surprised if someone (GSoC?) implements WAL logging for it 
 for 8.5.
 
 Bitmap indexes required significant changes to the rest of the system, 
 the indexam API in particular.
 
 By grouped, I presume you mean my grouped index tuples patch, aka 
 clustered indexes. That too required changes to the indexam API, and 
 even if it didn't, I can guarantee that I wouldn't spend any more time 
 on it than I do now (= 0) if it was on pgfoundry.
 
  If we allow them to develop as separate
  projects, then whenever they are ready they can be used with particular
  releases.
 
 Developing a new indexam is not something you do over the weekend. It's 
 a long way from design to an implementation robust enough that anyone 
 cares about crash recovery. Short-circuiting the release cycle with a 
 plugin won't get you a production-ready indexam much sooner.

Agreed.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Pluggable Indexes (was Re: [HACKERS] rmgr hooks (v2))

2009-01-21 Thread Heikki Linnakangas

Simon Riggs wrote:

Why do we have 12+ pluggable languages, but we're not allowed to write
pluggable indexes? Whatever argument you put against it being too hard
or dangerous or whatever *also* applies to languages. Yet experience
shows pluggability has resulted in a variety of robust and useful
language types, some that might not have been predicted (PL/Proxy, PL/R
etc). They cover a variety of users and situations.


Languages are quite different. People already know language X, so they 
want to use it for stored procedures too. Or they want to interface 
other libraries or functionality available in language X. There's no 
such argument with indexams. Also, PL handlers are not as tightly 
integrated into the rest of the system, no need for low-level page 
access, for example, which is why it's easier to have a generic 
interface for them. There's also less issues with concurrency and 
version-compatibility.



Personally, I'd like
to enable people to come up with audio, video, bioinformatics datatypes
and indexes and I definitely don't want to limit the possibilities
there.


Yeah, I'd like to see all those datatypes too. But I'd presume that 
audio, video and bioinformatics indexing could all be implemented using 
GiST. You don't want to write an indexam from scratch for every data type.



...  - look how unusable
hash indexes are. How can we allow that functionality to continue to
exist in Core and yet block the path by which we might reasonably
correct that?


I don't see how ripping out hash indexes from core and pushing it into 
an external module where it could use the rmgr plugin mechanism would 
help to add WAL-logging to it. If someone wants to implement WAL-logging 
for hash indexes, just do it, and send a patch.



You don't want pluggable indexes, don't use 'em. But that isn't an
argument against allowing the capability for others. That line of
thought would have led us to banning pluggable languages. We should
respect the roots of this project and look for ways to enable the
advancement of database technology, not limit it to only how far we can
currently see ahead through the fog.


This is an open source project. There's already a lot of people writing 
their thesis and whatnot using PostgreSQL, having no problem modifying 
the code as they see fit to try completely novel things. We're not 
banning or blocking that. On the contrary, that's great! Anyone can 
download the source code, modify it, and publish a patch. Others will 
find the patch interesting and embrace it, or not. *That's* how this 
project moves forward.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: FWD: Re: [HACKERS] Updated backslash consistency patch

2009-01-21 Thread Peter Eisentraut
On Friday 16 January 2009 04:09:11 Robert Haas wrote:
 I really wonder what is so terrible about the behavrior as implemented
 in CVS HEAD.  AFAICS, no one except maybe Tom has really specified WHY
 they don't like it, just that they don't like it.  I'm not sure
 whether that's because (1) it's different, and they're used to the old
 way; (2) it involves typing an extra character to get the behavior
 they want; or (3) there's no way to search user and system functions
 simultaneously.

Btw., the reason why I didn't like it is that hiding system tables is OK, 
because you never really interact with them, but hiding system functions is 
not, because you use those all the time.  It has emerged, however, that some 
people appear to think of \df in terms of what's available and some in 
terms of what have I done.  And those call for different behaviors.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pluggable Indexes

2009-01-21 Thread Andrew Sullivan
None of this is Any of My Business any more, but

On Wed, Jan 21, 2009 at 03:44:15PM +, Simon Riggs wrote:

 The patch takes special care to allow calls to the rmgr functions only
 from the startup process. The APIs are exactly like the indexams and
 *are* called only in specific ways, at specific times. At your earlier
 request I put in filters to prevent WAL inserts for plugins that didn't
 exist, ensuring that all WAL writes were crash recoverable.

I haven't even started to think about looking at the code, but I buy
Simon's argument here.  The Pg project is at big pains to point out
how the extensible PL support and custom datatypes are such big
deals.  So why is pluggable index support not also a good thing?

I take no position on the merits of the proposed patch, which I do not
pretend to understand.  But it'd be nice to see opponents distinguish
beteween  bad idea in principle and bad idea in this case.  If
you're arguing the former, clarifying why the analogies aren't
relevant would be helpful.

A

-- 
Andrew Sullivan
a...@crankycanuck.ca

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Pluggable Indexes (was Re: [HACKERS] rmgr hooks (v2))

2009-01-21 Thread Simon Riggs

On Wed, 2009-01-21 at 18:24 +0200, Heikki Linnakangas wrote:

 If we allow them to develop as separate projects, then whenever they
 are ready they can be used with particular releases.

 Developing a new indexam is not something you do over the weekend.
 It's a long way from design to an implementation robust enough that
 anyone cares about crash recovery. Short-circuiting the release cycle
 with a plugin won't get you a production-ready indexam much sooner.

You're assuming that somebody is starting from scratch and that they
don't have access to index and/or Postgres experts.

There are already research projects in various forms of new index. This
would further encourage that. There are also companies such as CopperEye
that sell indexes for use in other RDBMS, that would be easily able to
adapt their technology to Postgres.

They could also be adapting one of the existing index types for use in a
particular application. Various ideas present themselves.

I'm not trying to persuade you to personally work on indexes. I'm trying
to persuade you to let others work on indexes without your approval.
They already can, though they cannot make them production ready without
this and I see no reason to prevent them. We're not talking about
including their code in Postgres, we're talking about allowing them not
to.

Bruce Lindsay, IBM Fellow and long term DB guru was interviewed in 2005:
Q: If you magically had enough extra time to do one additional thing at
work that you're not doing now, what would it be?

I think I would work on indexing a little harder.

(He mentions XML indexing, multi-dimensional indexing etc)
[Taken from SIGMOD Record, June 2005]

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: FWD: Re: [HACKERS] Updated backslash consistency patch

2009-01-21 Thread Bruce Momjian
Peter Eisentraut wrote:
 On Friday 16 January 2009 04:09:11 Robert Haas wrote:
  I really wonder what is so terrible about the behavrior as implemented
  in CVS HEAD. ?AFAICS, no one except maybe Tom has really specified WHY
  they don't like it, just that they don't like it. ?I'm not sure
  whether that's because (1) it's different, and they're used to the old
  way; (2) it involves typing an extra character to get the behavior
  they want; or (3) there's no way to search user and system functions
  simultaneously.
 
 Btw., the reason why I didn't like it is that hiding system tables is OK, 
 because you never really interact with them, but hiding system functions is 
 not, because you use those all the time.  It has emerged, however, that some 
 people appear to think of \df in terms of what's available and some in 
 terms of what have I done.  And those call for different behaviors.

Yep.  I thought about what would be the ideal behavior of each backslash
command that supports 'S'.  Some \d commands are better with just user
objects (\dt), while others are better including system objects (\dT),
and some are unclear (\df).  (You have to love that.)

My feeling was to focus on the most used commands (\dt) and figure out
how they would behave best, and let the others follow.  Arguably the
\dtisv are good in current CVS, while it is unclear if \df is better or
worse than 8.3, and \dT is certainly worse than 8.3.

But frankly, with a very complex backslash API that is already
overloaded, I figured having a consistent 'S' to include system objects
was the best we are going to be able to do.  Once this is out in the
field we might get new ideas.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] rmgr hooks (v2)

2009-01-21 Thread Simon Riggs

On Wed, 2009-01-21 at 18:38 +0200, Heikki Linnakangas wrote:
 Simon Riggs wrote:
  So you *must* replay catalog entries and recreate the original catalog
  in exact synchronisation with reading WAL files. Recreating the catalog
  can only be done by Postgres itself.
 
 The startup process doesn't have a relcache, 

Yes

 so this rmgr patch is 
 nowhere near enough to enable 

You are way too smart not to overcome such a minor hurdle...

 that. If I understood correctly, the hot 
 standby patch doesn't change that either.

No it doesn't.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [SQL] array_to_string(anyarray, text) that was working in 8.1 is not working in 8.3

2009-01-21 Thread Bruce Momjian

FYI, I tested your query in 8.3.X CVS and it worked so this fix will in
the next 8.3 minor release.

---

Corey Horton wrote:
 Is there any known workaround to get this the elements of the 
 histogram_bounds anyarray in 8.3.5.  If not, when might I expect a fix?
 
 Just trying to plan our testing/release schedule of rolling out to 8.3 
 around this problem.
 
 Thanks,
 Corey
 
 Tom Lane wrote:
  I wrote:

  While we could probably revert just enough of the changes to
  enforce_generic_type_consistency to allow this case again, I wonder
  just how safe that'd really be.  It would amount to expecting that
  functions that take anyarray but don't take or return anyelement to
  not only work on any array type, but to be always prepared for the
  input element type to change on-the-fly (since that's exactly what
  would happen when scanning pg_statistic).  Quite a lot of the built-in
  anyarray functions are prepared to do that, but I'm not sure they all
  are.
  
 
  I went and looked, and found that none of the thirty or so built-in
  functions that accept ANYARRAY are coded to make unsafe assumptions
  about the input array type remaining the same across calls.  So at least
  as of CVS HEAD, it seems safe to relax this back to the way it was
  pre-8.3.
 
  I'm still worried about the possibility of extension functions or future
  core functions failing to follow this coding rule; but as long as people
  are lazy and copy-and-paste from the existing models, it should be okay.
 
  regards, tom lane
 
 


-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Pluggable Indexes (was Re: [HACKERS] rmgr hooks (v2))

2009-01-21 Thread Simon Riggs

On Wed, 2009-01-21 at 19:13 +0200, Heikki Linnakangas wrote:
 Simon Riggs wrote:
  Why do we have 12+ pluggable languages, but we're not allowed to write
  pluggable indexes? Whatever argument you put against it being too hard
  or dangerous or whatever *also* applies to languages. Yet experience
  shows pluggability has resulted in a variety of robust and useful
  language types, some that might not have been predicted (PL/Proxy, PL/R
  etc). They cover a variety of users and situations.
 
 Languages are quite different. People already know language X, so they 
 want to use it for stored procedures too. Or they want to interface 
 other libraries or functionality available in language X. There's no 
 such argument with indexams. Also, PL handlers are not as tightly 
 integrated into the rest of the system, no need for low-level page 
 access, for example, which is why it's easier to have a generic 
 interface for them. There's also less issues with concurrency and 
 version-compatibility.

Yes, they allow people's external experience to be brought to Postgres.
Which includes index experience.

You're assuming that indexes must have concurrency and are therefore
difficult to design. Concurrency isn't a requirement in many cases. You
just need to store tids and feed them back. Indexes don't have to use
database pages even. Robustness is a much more certain requirement,
since rebuilding indexes from scratch may not even be practical in some
cases.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] EnableDisableTrigger Cleanup Questions

2009-01-21 Thread Robert Haas
On Wed, Jan 21, 2009 at 6:17 AM, Jonah H. Harris jonah.har...@gmail.com wrote:
 On Thu, Nov 6, 2008 at 12:03 AM, Jonah H. Harris jonah.har...@gmail.com
 wrote:

 As I wasn't sure whether anyone agrees with my distaste for
 repurposing tgenabled as mentioned above, I have attached is a patch
 which minimally corrects the function comment for EnableDisableTrigger
 where fires_when is concerned.

 Was there a reason that this cleanup patch wasn't applied?

1. It was submitted after the deadline for CommitFest:November.

2. It sounded like you had given up:

 Oh well, it was just a thought.

3. Tom Lane objected to it.

http://archives.postgresql.org/message-id/20096.1225984...@sss.pgh.pa.us

If you want it to be considered further, you might add it here:

http://wiki.postgresql.org/wiki/CommitFest_2009-First

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCHES] GIN improvements

2009-01-21 Thread Jeff Davis
On Wed, 2009-01-21 at 15:06 +0300, Teodor Sigaev wrote:
 Done. Now GIN counts number of pending tuples and pages and stores they on 
 metapage. Index cleanup could start during normal insertion in two cases:
 - number of pending tuples is too high to keep guaranteed non-lossy tidbitmap
 - pending page's content doesn't fit into work_mem.

Great, thanks. I will take a look at this version tonight.

Because time is short, I will mark it as Ready for committer review
now. I think all of the major issues have been addressed, and I'll just
be looking at the code and testing it.

 BTW, gincostestimate could use that information for cost estimation, but is 
 index opening and metapge reading in amcostestimate acceptable?

That sounds reasonable to me. I think that's what the index-specific
cost estimators are for. Do you expect a performance impact?

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: FWD: Re: [HACKERS] Updated backslash consistency patch

2009-01-21 Thread Alvaro Herrera
Bruce Momjian escribió:

 But frankly, with a very complex backslash API that is already
 overloaded, I figured having a consistent 'S' to include system objects
 was the best we are going to be able to do.  Once this is out in the
 field we might get new ideas.

I don't buy this argument.  If we're going to break backwards
compatibility we should only do so to get a better UI.  Not because we
might get new ideas.

After all this discussion, I'm not sure I understand why is it so
important that all \ commands behave consistently.  Since psql is
primarily a user-convenience tool, it seems that it needs to be usable
first, consistent second.  In most cases, usable means consistent (think
having the OK button at the same side of the Cancel button in all dialog
boxes), but this is one of the other cases, because the requirements for
some situations are clearly at odds in other situations (or as Peter
puts it: it is seldom useful to display pg_catalog tables, but it is
very often useful to display pg_catalog types).

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] reducing statistics write overhead

2009-01-21 Thread Heikki Linnakangas

Alvaro Herrera wrote:

Martin Pihlak escribió:

Alvaro Herrera wrote:

You missed putting back the BUG comment that used to be there about
this.

This was deliberate, I did mention the condition in the comment at
the beginning of the file. This actually makes it a feature :)

Seriously though, do you think that this is still a problem? Given
the rare occurrence of the revacuum and the fact that it is made
cheap by visibility map?


Hmm, maybe it's no longer an issue with the visibility map, yes.


You still have to scan all indexes, so it's still not free by any means.

(I haven't been paying attention to what kind of a risk we're talking 
about..)


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] EnableDisableTrigger Cleanup Questions

2009-01-21 Thread Heikki Linnakangas

Robert Haas wrote:

On Wed, Jan 21, 2009 at 6:17 AM, Jonah H. Harris jonah.har...@gmail.com wrote:

On Thu, Nov 6, 2008 at 12:03 AM, Jonah H. Harris jonah.har...@gmail.com
wrote:

As I wasn't sure whether anyone agrees with my distaste for
repurposing tgenabled as mentioned above, I have attached is a patch
which minimally corrects the function comment for EnableDisableTrigger
where fires_when is concerned.

Was there a reason that this cleanup patch wasn't applied?


1. It was submitted after the deadline for CommitFest:November.


Well, it's just comment changes...


2. It sounded like you had given up:


That's the impression I had, until I just went and read the thread in 
detail.



Oh well, it was just a thought.


3. Tom Lane objected to it.

http://archives.postgresql.org/message-id/20096.1225984...@sss.pgh.pa.us


If I understood the discussion correctly, Tom objected to the more 
drastic change of renaming the catalog column. But the patch Jonah 
posted didn't do that, it only changed the comments, precisely because 
he felt that others might not want the more drastic change,


(I haven't checked whether the comment changes are a good idea. But they 
probably are..)


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] EnableDisableTrigger Cleanup Questions

2009-01-21 Thread Alvaro Herrera
Heikki Linnakangas escribió:

 (I haven't checked whether the comment changes are a good idea. But they  
 probably are..)

The original comments are broken, the new ones seem good.  I think this
patch should just be applied.  The only possible gripe I have is that
the grammar in the second hunk seems strange or broken, but maybe it's
just that I don't know the language enough.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] EnableDisableTrigger Cleanup Questions

2009-01-21 Thread Alvaro Herrera
Alvaro Herrera escribió:

 The only possible gripe I have is that the grammar in the second hunk
 seems strange or broken, but maybe it's just that I don't know the
 language enough.

Oh, it makes sense if you consider states as a noun rather than a verb.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: FWD: Re: [HACKERS] Updated backslash consistency patch

2009-01-21 Thread Bruce Momjian
Alvaro Herrera wrote:
 Bruce Momjian escribi?:
 
  But frankly, with a very complex backslash API that is already
  overloaded, I figured having a consistent 'S' to include system objects
  was the best we are going to be able to do.  Once this is out in the
  field we might get new ideas.
 
 I don't buy this argument.  If we're going to break backwards
 compatibility we should only do so to get a better UI.  Not because we
 might get new ideas.
 
 After all this discussion, I'm not sure I understand why is it so
 important that all \ commands behave consistently.  Since psql is
 primarily a user-convenience tool, it seems that it needs to be usable
 first, consistent second.  In most cases, usable means consistent (think
 having the OK button at the same side of the Cancel button in all dialog
 boxes), but this is one of the other cases, because the requirements for
 some situations are clearly at odds in other situations (or as Peter
 puts it: it is seldom useful to display pg_catalog tables, but it is
 very often useful to display pg_catalog types).

Well, to do this you are going to need 'U' and 'S' modifiers, and then
we have to decide how \df is supposed to behave.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: FWD: Re: [HACKERS] Updated backslash consistency patch

2009-01-21 Thread Alvaro Herrera
Bruce Momjian escribió:

 Well, to do this you are going to need 'U' and 'S' modifiers, and then
 we have to decide how \df is supposed to behave.

I think we should have first decided how it was supposed to behave, and
later applied any patches.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: FWD: Re: [HACKERS] Updated backslash consistency patch

2009-01-21 Thread Bruce Momjian
Alvaro Herrera wrote:
 Bruce Momjian escribi?:
 
  Well, to do this you are going to need 'U' and 'S' modifiers, and then
  we have to decide how \df is supposed to behave.
 
 I think we should have first decided how it was supposed to behave, and
 later applied any patches.

Well, there was a lot of discussion in the Spring that the backslash
commands should be consistent, Greg Sabino Mullane went away to work on
the patch, but didn't finish until my prodding in October, so I then
applied it.  \df seemed to be the hot item, and the rest just fell into
place.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] EnableDisableTrigger Cleanup Questions

2009-01-21 Thread Robert Haas
 Was there a reason that this cleanup patch wasn't applied?

 1. It was submitted after the deadline for CommitFest:November.

 Well, it's just comment changes...

Oh, didn't realize that.  That's what I get for replying without
reading the patch...

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: FWD: Re: [HACKERS] Updated backslash consistency patch

2009-01-21 Thread Alvaro Herrera
Bruce Momjian escribió:
 Alvaro Herrera wrote:
  Bruce Momjian escribi?:
  
   Well, to do this you are going to need 'U' and 'S' modifiers, and then
   we have to decide how \df is supposed to behave.
  
  I think we should have first decided how it was supposed to behave, and
  later applied any patches.
 
 Well, there was a lot of discussion in the Spring that the backslash
 commands should be consistent,

Yeah, apparently the idea that consistency is not necessarily the best
guiding principle did not emerge until after the patch was applied :-(

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: FWD: Re: [HACKERS] Updated backslash consistency patch

2009-01-21 Thread Bruce Momjian
Alvaro Herrera wrote:
 Bruce Momjian escribi?:
  Alvaro Herrera wrote:
   Bruce Momjian escribi?:
   
Well, to do this you are going to need 'U' and 'S' modifiers, and then
we have to decide how \df is supposed to behave.
   
   I think we should have first decided how it was supposed to behave, and
   later applied any patches.
  
  Well, there was a lot of discussion in the Spring that the backslash
  commands should be consistent,
 
 Yeah, apparently the idea that consistency is not necessarily the best
 guiding principle did not emerge until after the patch was applied :-(

You are seeing that too?  ;-)

We certainly needed to get this into CVS so folks could test it, and
\d,\dt was so mangled in their behavior that only fixing it in CVS
allowed people to see a clear picture.

Here is the list we care about, from CVS HEAD:

Informational
  Modifiers: S = show system objects  + = Additional detail
  \l[+] list all databases
  \d[S+]list tables, views, and sequences
  \d[S+]   NAME describe table, view, sequence, or index
  \da[S]   [PATTERN]list aggregate functions
  \db[+]   [PATTERN]list tablespaces
  \dc[S]   [PATTERN]list conversions
  \dC  [PATTERN]list casts
  \dd[S]   [PATTERN]show comments on objects
  \dD[S]   [PATTERN]list domains
  \des[+]  [PATTERN]list foreign servers
  \deu[+]  [PATTERN]list user mappings
  \dew[+]  [PATTERN]list foreign-data wrappers
  \df[S+]  [PATTERN]list functions
  \dF[+]   [PATTERN]list text search configurations
  \dFd[+]  [PATTERN]list text search dictionaries
  \dFp[+]  [PATTERN]list text search parsers
  \dFt[+]  [PATTERN]list text search templates
  \dg  [PATTERN]list roles (groups)
  \di[S+]  [PATTERN]list indexes
  \dl   list large objects, same as \lo_list
  \dn[+]   [PATTERN]list schemas
  \do[S]   [PATTERN]list operators
  \dp  [PATTERN]list table, view, and sequence access privileges
   \z  [PATTERN]same as \dp
  \ds[S+]  [PATTERN]list sequences
  \dt[S+]  [PATTERN]list tables
  \dT[S+]  [PATTERN]list data types
  \du  [PATTERN]list roles (users)
  \dv[S+]  [PATTERN]list views

Here are the items I think are best to default to user-only:

  \d[S+]list tables, views, and sequences
  \d[S+]   NAME describe table, view, sequence, or index
  \df[S+]  [PATTERN]list functions
  \di[S+]  [PATTERN]list indexes
  \ds[S+]  [PATTERN]list sequences
  \dt[S+]  [PATTERN]list tables
  \dv[S+]  [PATTERN]list views
  \dD[S]   [PATTERN]list domains

Here are the ones that should include system objects by default:

  \da[S]   [PATTERN]list aggregate functions
  \dc[S]   [PATTERN]list conversions
  \dd[S]   [PATTERN]show comments on objects
  \do[S]   [PATTERN]list operators
  \dT[S+]  [PATTERN]list data types

The lists are pretty close to being the same size, especially since \d
is listed twice.

You will notice some commands, like \dF, are not listed at all because
they don't support 'S'.  One new idea would be to remove 'S' support
from the include system group and have them default to showing system
objects.

We could add a 'U' flag but that introduces confusion over whether the
command without 'S' or 'U' shows system objects, and would 'S' show only
system objects, or include system objects?   And what about 'U'?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pluggable Indexes

2009-01-21 Thread Ron Mayer
Gregory Stark wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 
 The original design of Postgres allowed pluggable index access methods,
 but that capability has not been brought forward to allow for WAL. This
 patch would bridge that gap.
 
 Well I think what people do is what GIST did early on -- they just don't
 support recoverability until they get merged into core.

What other constraints are there on such non-in-core indexex?  Early (2005)
GIST indexes were very painful in production environments because vacuuming
them held locks for a *long* time (IIRC, an hour or so on my database) on
the indexes locking out queries.  Was that just a shortcoming of the
implementation, or was it a side-effect of them not supporting recoverability.
If the latter, I think that's a good reason to try to avoid developing new
index types the same way the GIST guys did.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] reducing statistics write overhead

2009-01-21 Thread Martin Pihlak
Alvaro Herrera wrote:
 I agree that pgstats is not ideal (we've said this from the very
 beginning), but I doubt that updating pg_class is the answer; you'd be
 generating thousands of dead tuples there.
 

But we already do update pg_class after vacuum -- in vac_update_relstats().
Hmm, that performs a heap_inplace_update() ... I assume that this is cheap,
but have no idea as if it is suitable for the purpouse.

regards,
Martin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pluggable Indexes

2009-01-21 Thread Heikki Linnakangas

Ron Mayer wrote:

Early (2005)
GIST indexes were very painful in production environments because vacuuming
them held locks for a *long* time (IIRC, an hour or so on my database) on
the indexes locking out queries.  Was that just a shortcoming of the
implementation, or was it a side-effect of them not supporting recoverability.


The former.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: FWD: Re: [HACKERS] Updated backslash consistency patch

2009-01-21 Thread Robert Haas
 Here are the items I think are best to default to user-only:
[...]
 Here are the ones that should include system objects by default:

Well, at a minimum, I think it's important for any type of object to
have an easy way to exclude system objects, because show me all of
the stuff that didn't come with the database is a valid use case for
any type of item.

It's certainly true that the more obscure the item is, the more likely
you are to be looking for the system object rather than a user object.
 But it's really subjective where to put the line: some people might
put it between table and function (Tom), others might put it between
function and aggregate (Bruce), and still others might say there's no
object type so exotic that I don't want system objects excluded by
default (me).  And there isn't any right or wrong answer here: it's
just opinions.

So maybe we should provide U, S, and A modifiers for every type of
object (user, system, all).  That doesn't solve the problem of which
should be the default for each object type, but at least it guarantees
that you can type an extra character to get the behavior you want in
any particular case.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] reducing statistics write overhead

2009-01-21 Thread Alvaro Herrera
Martin Pihlak escribió:
 Alvaro Herrera wrote:
  I agree that pgstats is not ideal (we've said this from the very
  beginning), but I doubt that updating pg_class is the answer; you'd be
  generating thousands of dead tuples there.
 
 But we already do update pg_class after vacuum -- in vac_update_relstats().
 Hmm, that performs a heap_inplace_update() ... I assume that this is cheap,
 but have no idea as if it is suitable for the purpouse.

Oh, sorry, I thought you were suggesting to use pg_class to store number
of tuples dead/alive/etc.

I had a patch to introduce a new type of table, which would only be used
for non-transactional updates.  That would allow what you're proposing.
I think we discussed something similar to what you propose and rejected
it for some reason I can't recall offhand.  Search the archives for
pg_class_nt and pg_ntclass, that might give you some ideas.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Cancelling idle in transaction state

2009-01-21 Thread Bruce Momjian

Added to TODO:

Allow administrators to cancel multi-statement idle
transactions

This allows locks to be released, but it is complex to report the
cancellation back to the client.

* 
http://archives.postgresql.org/pgsql-hackers/2008-12/msg01340.php 

---

Simon Riggs wrote:
 Currently SIGINT is ignored during IDLE in transaction, but we have
 recently agreed to allow this to cancel the transaction. We said we
 would do this in all cases, so this is a separate feature/patch (though
 Hot Standby requires it).
 
 A simple change allows the transaction to be cancelled, but there are
 some loose ends that I wish to discuss.
 
 If we are running a statement and a cancel is received, then we return
 the ERROR to the client, who is expecting it. If we cancel a transaction
 while the connection is idle, we have no way of signalling to the client
 program this has occurred. So the client finds out about this much
 later, not in fact until the next message is sent.
 
 Is there a mechanism for communicating the state back to the client?
 Will this be handled correctly with existing code? psql appears to be
 confused by a cancelled backend.
 
 I'm not familiar with these aspects of the code, so some clear
 suggestions are needed to allow me to work this out. I'm worried that
 this will delay things further otherwise.
 
 -- 
  Simon Riggs   www.2ndQuadrant.com
  PostgreSQL Training, Services and Support
 
 
 -- 
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Pluggable Indexes (was Re: [HACKERS] rmgr hooks (v2))

2009-01-21 Thread Simon Riggs

On Wed, 2009-01-21 at 19:13 +0200, Heikki Linnakangas wrote:

  You don't want pluggable indexes, don't use 'em. But that isn't an
  argument against allowing the capability for others. That line of
  thought would have led us to banning pluggable languages. We should
  respect the roots of this project and look for ways to enable the
  advancement of database technology, not limit it to only how far we
 can  currently see ahead through the fog.
 
 This is an open source project. 

That's a whole different discussion.

Extensibility is what gives options in production. Yes, the academics
can do whatever they like. We know the reality is people don't fiddle
with core code for a range of reasons but are happy to use extensions.
I'm in favour of allowing people that use Postgres to get access to
advanced technology without asking my permission or paying me a licence
fee for a modified version.

We support extensible everything, but not indexes. Why? 

PostgreSQL is supposed to be The World's Most Advanced Open Source
Database. There is no good technical reason to hold back this patch.

The arguments against this patch seem to revolve around fears of
commercial exploitation or subverting the release process. Or telling
people that we know better than them and they can't possibly write an
index worthy of actual use. They might not be able to, its true, but I
see no reason to prevent them either.

 *That's* how this project moves forward.

We've got one committer working almost exclusively on new indexes.
Preventing work on new indexes by non-committers has meant that Bitmap
indexes, which first came out in 2005 have not been usable with
Postgres. That forced people *away* from Postgres towards Bizgres. Lack
of Bitmap indexes is a huge issue for many people. It's 2009 now and it
seems probable that without this patch it will be 2010 at least before
they see BMIs, and later still before they see other index types.

Many people can see the blockage there. I agree it is right to have
prevented BMIs from being committed to core, but they have been usable
and beneficial for many years now for read only workloads. In the
current way of thinking early GIST would never have been allowed in and
there would be no PostGIS.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Cancelling idle in transaction state

2009-01-21 Thread Bruce Momjian
Simon Riggs wrote:
 
 On Wed, 2009-01-21 at 15:22 -0500, Bruce Momjian wrote:
  Added to TODO:
  
  Allow administrators to cancel multi-statement idle
  transactions
  
  This allows locks to be released, but it is complex to report the
  cancellation back to the client.
  
  * 
  http://archives.postgresql.org/pgsql-hackers/2008-12/msg01340.php 
 
 This is part of Hot Standby.
 
 The bug is on the TODO list.

Well, if it gets done for 8.4 then we can mark it completed;  it not it
will be there for 8.5.  The behavior is useful independent of hot
standby.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] EnableDisableTrigger Cleanup Questions

2009-01-21 Thread Jonah H. Harris
On Wed, Jan 21, 2009 at 2:02 PM, Robert Haas robertmh...@gmail.com wrote:

  Was there a reason that this cleanup patch wasn't applied?
 
  1. It was submitted after the deadline for CommitFest:November.
 
  Well, it's just comment changes...

 Oh, didn't realize that.  That's what I get for replying without
 reading the patch...


Yes :)


-- 
Jonah H. Harris, Senior DBA
myYearbook.com


[HACKERS] Help with Join Performance Testing

2009-01-21 Thread Lawrence, Ramon
A hash join modification patch is under review for 8.4 that needs
performance testing.   We would appreciate help with this testing. 

 

A testing version of the patch is attached in addition to testing
instructions and where to retrieve a sample data set.   The basic idea
of the patch is that it reduces disk operations for large multi-batch
hash joins where there is skew in the probe relation.  The patch
collects statistics on performance benefits when using the optimization.

 

--

Ramon Lawrence and Bryce Cutt

Overview


This document provides an overview of how to test the histojoin patch.  The 
patch performs skew optimization for large, multi-batch hash joins.

Installation

The patch should compile cleanly against CVS head.


Execution
-
The skew optimization can be turned on by:

set enable_hashjoin_usestatmcvs = on;

and off by:

set enable_hashjoin_usestatmcvs = off;

If a hash join has detectable skew in the larger probe relation, then the skew 
optimization will output the amount of skew it sees and the number of tuples it 
will buffer in memory to exploit that skew.  When the hash join completes, it 
will output statistics on the number of tuples actually matched by the 
in-memory (IM) skew partition and the number of tuples in partition 0. The 
improvements in join I/Os is also given.

Sample (from LI-P TPCH 10G 1Z):

Values: 100 Skew: 0.27  Est. tuples: 59986052.00 Batches: 512  Est. Save: 
16114709.99
Total Inner Tuples: 200
IM Inner Tuples: 83
Batch Zero Inner Tuples: 3941
Batch Zero Potential Inner Tuples: 3941
Total Outer Tuples: 59986052
IM Outer Tuples: 16074146
Batch Zero Outer Tuples: 98778
Batch Zero Potential Outer Tuples: 98778 
Total Output Tuples: 59986052 
IM Output Tuples: 16074146
Batch Zero Output Tuples: 98778
Batch Zero Potential Output Tuples: 98778
Percentage less tuple IOs than HHJ: 25.98


Data Set

A sample test data set is TPC-H scale factor 1 GB.  A pg_dump can be downloaded 
from:

http://people.ok.ubc.ca/rlawrenc/tpch1g1z.zip

The larger 10 GB data sets are available on request.  You can also download the 
generator itself (works only on Windows) at:

http://people.ok.ubc.ca/rlawrenc/TPCHSkew.zip

The only joins with significant skew in the database are Part-LineItem and 
Supplier-LineItem.


Result Notes


1) The percentage benefit increases with the amount of skew.  Relations with no 
skew are not affected.  Relations with minimal skew show no noticeable 
improvement or negative impact.

2) Since disk I/Os in the join is only one part of the query execution time, 
overall execution times do not improve the same amount as the reduction in disk 
I/Os.  For CPU-bound queries, the disk I/O improvement may not have a 
significant effect on the overall time.

3) The relations are quite large.  Thus, queries with SELECT * that join 
several relations are very costly and the generation of the tuples dominates 
the execution time (especially if executing the query through a client such as 
pgAdmin).


Previous Results


The join with LineItem-Part on TPCH 1G 1Z shows about a 26% improvement in I/Os 
performed during the join and about 5-10% improvement in overall time.  The 
join with LineItem-Supplier is similar.  Data sets with higher skew show even 
better performance.  For example, Lineitem-Part on TPCH 10G 2Z has 90% of probe 
relation tuples matching 100 most common values.  The improvement in I/Os is 
about 90% and time about 50%.

Some sample test queries:

Query #1a:
SELECT * FROM Part, Lineitem WHERE p_partkey = l_partkey;

Query #1b:
SELECT count(*) FROM Part, Lineitem WHERE p_partkey = l_partkey;

Query #2a:
SELECT * FROM Supplier, Lineitem WHERE s_suppkey = l_suppkey;

Query #2b:
SELECT count(*) FROM Supplier, Lineitem WHERE s_suppkey = l_suppkey;

Query #3a:
SELECT * FROM Part, Lineitem, Supplier WHERE p_partkey = l_partkey and 
s_suppkey = l_suppkey;

Query #3b:
SELECT count(*) FROM Part, Lineitem, Supplier WHERE p_partkey = l_partkey and 
s_suppkey = l_suppkey;



histojoin_testing.patch
Description: histojoin_testing.patch

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Cancelling idle in transaction state

2009-01-21 Thread Joshua D. Drake
On Wed, 2009-01-21 at 15:46 -0500, Bruce Momjian wrote:
 Simon Riggs wrote:
  
  On Wed, 2009-01-21 at 15:22 -0500, Bruce Momjian wrote:
   Added to TODO:
 
 Allow administrators to cancel multi-statement idle
 transactions
 
 This allows locks to be released, but it is complex to report the
 cancellation back to the client.
 
 * 
   http://archives.postgresql.org/pgsql-hackers/2008-12/msg01340.php 
  
  This is part of Hot Standby.
  
  The bug is on the TODO list.
 
 Well, if it gets done for 8.4 then we can mark it completed;  it not it
 will be there for 8.5.  The behavior is useful independent of hot
 standby.

At one time there was also a positive discussion on having something
like:

idle_in_transaction_timeout

Does this play along with that?

Joshua D.D rake


 
 -- 
   Bruce Momjian  br...@momjian.ushttp://momjian.us
   EnterpriseDB http://enterprisedb.com
 
   + If your life is a hard drive, Christ can be your backup. +
 
-- 
PostgreSQL - XMPP: jdr...@jabber.postgresql.org
   Consulting, Development, Support, Training
   503-667-4564 - http://www.commandprompt.com/
   The PostgreSQL Company, serving since 1997


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [Fwd: Re: [HACKERS] Transactions and temp tables]

2009-01-21 Thread Bruce Momjian
Heikki Linnakangas wrote:
 Emmanuel Cecchet wrote:
  I just saw that this new patch was not considered because the previous 
  version ended being rejected.
  Note that this version of the patch aims at supporting ONLY temp tables 
  that are created AND dropped in the same transaction. We need to be able 
  to use temp tables in transactions that are doing 2PC, but the temp 
  table lifespan does not need to cross transaction boundaries.
  
  Please let me know if this patch could be integrated in 8.4.
 
 IMHO, this is just getting too kludgey. We came up with pretty good 
 ideas on how to handle temp tables properly, by treating the same as 
 non-temp tables. That should eliminate all the problems the latest patch 
 did, and also the issues with sequences, and allow all access to temp 
 tables, not just a limited subset. I don't think it's worthwhile to 
 apply the kludge as a stopgap measure, let's do it properly in 8.5.
 
 As a workaround, you can use a regular table instead of a temporary one. 
 If you create and drop the regular table in the same transaction (that's 
 the same limitation that latest patch has), you won't end up with a 
 bogus table in your database if the connection is dropped unexpectedly. 
 If your application uses multiple connections simultaenously, you'll 
 need a little bit of code in the application so that you don't try to 
 create a table with the same name in all backends. You could also create 
 a different schema for each connection, and do set 
 search_path='semitempschemaX, public', so that you can use the same 
 table name and still have separate tables for each connections.

Can someone tell me how this should be worded as a TODO item?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Cancelling idle in transaction state

2009-01-21 Thread Bruce Momjian
Joshua D. Drake wrote:
 On Wed, 2009-01-21 at 15:46 -0500, Bruce Momjian wrote:
  Simon Riggs wrote:
   
   On Wed, 2009-01-21 at 15:22 -0500, Bruce Momjian wrote:
Added to TODO:

Allow administrators to cancel multi-statement idle
transactions

This allows locks to be released, but it is complex to 
report the
cancellation back to the client.

* 
http://archives.postgresql.org/pgsql-hackers/2008-12/msg01340.php 
   
   This is part of Hot Standby.
   
   The bug is on the TODO list.
  
  Well, if it gets done for 8.4 then we can mark it completed;  it not it
  will be there for 8.5.  The behavior is useful independent of hot
  standby.
 
 At one time there was also a positive discussion on having something
 like:
 
 idle_in_transaction_timeout

Yep, and already a TODO:

Add idle_in_transaction_timeout GUC so locks are not held for
long periods of time

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pluggable Indexes

2009-01-21 Thread Oleg Bartunov

On Wed, 21 Jan 2009, Simon Riggs wrote:



On Wed, 2009-01-21 at 21:45 +0200, Heikki Linnakangas wrote:

Ron Mayer wrote:

Early (2005)
GIST indexes were very painful in production environments because vacuuming
them held locks for a *long* time (IIRC, an hour or so on my database) on
the indexes locking out queries.  Was that just a shortcoming of the
implementation, or was it a side-effect of them not supporting recoverability.


The former.


In the current way of thinking early-GIST would never have been
committed and as a result we would not have PostGIS. Yes, early index
implementations can be bad and they scare the hell out of me. That's
exactly why I want to keep them out of core, so they don't need to be
perfect, they can come with all sorts of health warnings.


I'm rather keen on Pg extendability, which allowed me and Teodor to 
work on many extensions. Yes, first GiST we inherited from early 
academic research and was more like a toy. We still have several TODO 
items about GiST interface (incorporate SP-GiST). 
I'm not sure about specific patch Simon advocate, but as soon as it
doesnot introduces any threat to the whole  database cluster health 
(for example, WAL spamming) I think we can apply it. 
Other question, why don't improve GiST to allow support of  more indexes ?

bitmap indexes could be implemented usin g GiST.






Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Lock conflict behavior?

2009-01-21 Thread Bruce Momjian
Jeff Davis wrote:
 On Tue, 2008-12-23 at 08:48 -0500, Tom Lane wrote:
  I've always thought that it was extremely shaky for LOCK to try to work
  that way.  With no lock, you have no confidence that the table isn't
  changing or disappearing under you.  In the worst case, the permissions
  check might fail outright (likely with a cache lookup failed message
  about a catalog row that disappeared as we attempted to fetch it); or it
  might give an answer that's obsolete by the time we do acquire the lock.
 
 It looks like it would be easy enough to throw a better error message
 than that, e.g. with a try/catch. The information could be obsolete, but
 if it succeeds, it would at least mean they had permissions at some time
 in the past.
 
 Or, we could just remove the ACL checks from LOCK TABLE, so that it's at
 least consistent. Mostly it's the inconsistency that bothers me.

Is this a TODO?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pluggable Indexes

2009-01-21 Thread Simon Riggs

On Thu, 2009-01-22 at 00:29 +0300, Oleg Bartunov wrote:

 I'm rather keen on Pg extendability, which allowed me and Teodor to 
 work on many extensions. Yes, first GiST we inherited from early 
 academic research and was more like a toy. We still have several TODO 
 items about GiST interface (incorporate SP-GiST).

Sounds good.

 I'm not sure about specific patch Simon advocate, but as soon as it
 doesnot introduces any threat to the whole  database cluster health 
 (for example, WAL spamming) I think we can apply it. 

Currently you can write any crap you want to WAL from any plugin, as
long as it looks a lot like an existing WAL message type. If you crash
then we'll read that crap and (probably) crash again. That is already a
risk.

The rmgr plugin provides a way to handle user-defined WAL messages. The
patch is recovery-side only and is designed to complement the indexAM
APIs, which are normal-running-side only. Best way to think of it is as
another 5 functions on index access method interface that allow you to
implement recoverable index plugins. (Remembering that dynamic index
plugins are already allowed by Postgres).

So the patch does not provide any additional way of *writing* WAL, it
just provides a way of reading it and then taking action.

Rmgr plugins would allow you to simply ignore certain kinds of WAL,
apply data in a user defined manner or filter it etc.. So if you come
across a buggy index, you can turn off the WAL for that index type and
then recover the database without those indexes. Or dynamically patch
the code for that index type and recover. You'll get Postgres back up
faster with this patch than without it, in many cases.

 Other question, why don't improve GiST to allow support of  more indexes ?
 bitmap indexes could be implemented usin g GiST.

I'm not advocating any particular type of index here, just the ability
to make index plugins robust. There is no other way of doing this, i.e.
it can't be done by an external module etc..

I'll avoid discussing index design with you :-)

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Pluggable Indexes (was Re: [HACKERS] rmgr hooks (v2))

2009-01-21 Thread Josh Berkus

All,

I really don't see why we would object to making *anything* pluggable if 
someone was willing to write the code to do so.  For example, making 
storage pluggable would allow PostgreSQL to achieve great new things on 
new types of hardware. (yes, I have some idea how difficult this would be)


For that matter, our pluggable languages, operators, aggregates, and 
UDFs are the mainsteam of PostgreSQL adoption -- and as hardware and 
technology changes in the future, I believe that our database's 
programmability will become the *entire* use case for PostgreSQL.


So I really can't see any plausible reason to be opposed to pluggable 
indexes *in principle*.  We should be promoting pluggability whereever 
we can reasonably add it.


Now, like always, that says nothing about the quality of this particular 
patch or whether it *really* moves us closer to pluggable indexes.


--Josh Berkus

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Pluggable Indexes (was Re: [HACKERS] rmgr hooks (v2))

2009-01-21 Thread Bruce Momjian
Josh Berkus wrote:
 All,
 
 I really don't see why we would object to making *anything* pluggable if 
 someone was willing to write the code to do so.  For example, making 
 storage pluggable would allow PostgreSQL to achieve great new things on 
 new types of hardware. (yes, I have some idea how difficult this would be)
 
 For that matter, our pluggable languages, operators, aggregates, and 
 UDFs are the mainsteam of PostgreSQL adoption -- and as hardware and 
 technology changes in the future, I believe that our database's 
 programmability will become the *entire* use case for PostgreSQL.
 
 So I really can't see any plausible reason to be opposed to pluggable 
 indexes *in principle*.  We should be promoting pluggability whereever 
 we can reasonably add it.
 
 Now, like always, that says nothing about the quality of this particular 
 patch or whether it *really* moves us closer to pluggable indexes.

Plugability adds complexity. Heikki's comment is that adding this patch
make the job of creating pluggable indexes 5% easier, while no one is
actually working on plugable indexes, and it hard to say that making it
5% easier really advances anything, especially since many of our
existing index types aren't WAL-logged.  Plugability is not a zero-cost
feature.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Lock conflict behavior?

2009-01-21 Thread Jeff Davis
On Wed, 2009-01-21 at 17:39 -0500, Bruce Momjian wrote:
  It looks like it would be easy enough to throw a better error message
  than that, e.g. with a try/catch. The information could be obsolete, but
  if it succeeds, it would at least mean they had permissions at some time
  in the past.
  
  Or, we could just remove the ACL checks from LOCK TABLE, so that it's at
  least consistent. Mostly it's the inconsistency that bothers me.
 
 Is this a TODO?

I don't feel too strongly about it. I would feel better if we were
consistent about the permissions checks, because there's less of a
chance for confusion or a false sense of security.

If we keep the permission check in LockTableCommand(), I can make a
patch that produces a more useful error message when the table is
removed right before the pg_class_aclcheck().

Right now it does:
ERROR:  relation with OID 16542 does not exist

which is undesirable.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Pluggable Indexes (was Re: [HACKERS] rmgr hooks (v2))

2009-01-21 Thread Josh Berkus

Bruce,


Plugability adds complexity. Heikki's comment is that adding this patch
make the job of creating pluggable indexes 5% easier, while no one is
actually working on plugable indexes, and it hard to say that making it
5% easier really advances anything, especially since many of our
existing index types aren't WAL-logged.  Plugability is not a zero-cost
feature.


Right.  And I'm saying that pluggability is PostgreSQL's main reason for 
existence, if you look at our place in the future of databases.  So it's 
worth paying *some* cost, provided that the cost/benefit ratio works for 
the particular patch.


To rephrase: I can't judge the rmgr patch one way or the other.  I'm 
only objecting to the idea expressed by Heikki and others that pluggable 
indexes are stupid and unnecessary.


--Josh


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Pluggable Indexes (was Re: [HACKERS] rmgr hooks (v2))

2009-01-21 Thread Bruce Momjian
Josh Berkus wrote:
 Bruce,
 
  Plugability adds complexity. Heikki's comment is that adding this patch
  make the job of creating pluggable indexes 5% easier, while no one is
  actually working on plugable indexes, and it hard to say that making it
  5% easier really advances anything, especially since many of our
  existing index types aren't WAL-logged.  Plugability is not a zero-cost
  feature.
 
 Right.  And I'm saying that pluggability is PostgreSQL's main reason for 
 existence, if you look at our place in the future of databases.  So it's 
 worth paying *some* cost, provided that the cost/benefit ratio works for 
 the particular patch.
 
 To rephrase: I can't judge the rmgr patch one way or the other.  I'm 
 only objecting to the idea expressed by Heikki and others that pluggable 
 indexes are stupid and unnecessary.

It is cost vs. benefit.  No one is saying plugabiity is bad, only that
in this case it is more costly than beneficial;  of course, that might
change some day.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Pluggable Indexes (was Re: [HACKERS] rmgr hooks (v2))

2009-01-21 Thread Kevin Grittner
 Bruce Momjian br...@momjian.us wrote:
 It is cost vs. benefit.  No one is saying plugabiity is bad, only
that
 in this case it is more costly than beneficial
 
Just curious -- are we talking execution time costs or programming
costs because of increased code complexity?
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Pluggable Indexes (was Re: [HACKERS] rmgr hooks (v2))

2009-01-21 Thread Bruce Momjian
Kevin Grittner wrote:
  Bruce Momjian br...@momjian.us wrote:
  It is cost vs. benefit.  No one is saying plugabiity is bad, only
 that
  in this case it is more costly than beneficial
  
 Just curious -- are we talking execution time costs or programming
 costs because of increased code complexity?

Programming, I assume, and the chance of bugs.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Frames vs partitions: is SQL2008 completely insane?

2009-01-21 Thread Bruce Momjian
Tom Lane wrote:
 According to SQL2008 section 7.11 window clause, general rule 5, the
 default definition of window framing in a window that has an ordering
 clause but no framing (RANGE/ROWS) clause is that the window frame for
 a given row R runs from the first row of its partition through the last
 peer of R.
 
 Section 6.10's general rules define the results of LEAD, LAG,
 FIRST_VALUE, LAST_VALUE, NTH_VALUE in terms of the rows available in the
 window frame of the current window, not its partition.
 
 Meanwhile, section 6.10 window function syntax rule 6 says that
 LEAD/LAG must use a window that has an ordering clause and no
 framing clause.
 
 This means that without an explicit framing clause, none of these
 functions can look beyond the last peer of the current row; and
 what's worse, LEAD/LAG seem to be explicitly forbidden from looking
 further than that even if we had an implementation of framing clauses.
 
 This seems to be less than sane.  I would certainly expect that LEAD(x)
 gives you the next value of x regardless of peer-row status, since
 LAG(x) gives you the prior value of x regardless of peer row status.
 It is also simply bizarre for FIRST_VALUE to give you the partition's
 first row when LAST_VALUE doesn't give you the partition's last row.
 
 Are there any errata for SQL2008 yet?  Can anyone check the actual
 behavior of DB2 or other DBMS's that claim to implement these functions?
 
 I notice that the current patch code seems to implement
 first/last/nth_value using the frame, but lead/lag using the partition,
 which doesn't conform to spec AFAICS ... but lead/lag on the frame
 doesn't actually appear to be a useful definition so I'd rather go
 with that than with what the letter of the spec seems to say.
 
 Lastly, for a simple aggregate used with an OVER clause, the current
 patch seems to define the aggregate as being taken over the frame
 rather than the partition, but I cannot find anything in SQL2008 that
 lends any support to *either* definition.
 
 Comments?  This all seems rather badly broken.

Was this dealt with?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plpgsql: numeric assignment to an integer variable errors out

2009-01-21 Thread Bruce Momjian
Nikhil Sontakke wrote:
   PFA, patch which uses find_coercion_pathway to find a direct
   COERCION_PATH_FUNC function and uses that if it is available. Or is there
  a
   better approach? Seems to handle the above issue with this patch.
 
  +1
 
  I thing, so some values should by cached, current patch could by slow.
 
 
 Agreed, it can slow things down a bit especially since we are only
 interested in the COERCION_PATH_FUNC case. What we need is a much simpler
 pathway function which searches in the SysCache and returns back with the
 valid/invalid castfunc immediately.
 
 PFA, version 2.0 of this patch with these changes in place. I could have
 added a generic function in parse_coerce.c, but thought the use case was
 restricted to plpgsql and hence I have kept it within pl_exec.c for now.

Where are we on this?  8.5?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pluggable Indexes

2009-01-21 Thread Gregory Stark
Josh Berkus j...@agliodbs.com writes:

 Right.  And I'm saying that pluggability is PostgreSQL's main reason for
 existence, if you look at our place in the future of databases.  So it's worth
 paying *some* cost, provided that the cost/benefit ratio works for the
 particular patch.

I agree that pluggability is a huge deal for Postgres. But note that the
interface is critical. If we provided a plugin architecture for functions and
operators which was simply a hook where you replaced part of the
infrastructure of the parser and executor it would be pointless. 

Instead we provide an interface where your function has to know as little as
possible about the rest of the system. And the parser and executor get enough
information about your function that they can do most of the work. That you
can create a new operator in Postgres *without* knowing how operators actually
are implemented and without worrying about what other operators exist is what
makes the feature so useful.

This is made a lot harder with WAL because a) it spans the entire cluster, not
just a database so any meta-information has to be stored somewhere global and
b) the consequences for getting something wrong are so much more dire. The
entire cluster is dead and can't even be restored from backup.

 To rephrase: I can't judge the rmgr patch one way or the other.  I'm only
 objecting to the idea expressed by Heikki and others that pluggable indexes 
 are
 stupid and unnecessary.

Well we support pluggable indexes -- they just can't be recoverable right now.
Presumably if they're merged into the core database they would have
recoverability added like how GIST progressed.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TODO items for window functions

2009-01-21 Thread Bruce Momjian
Robert Haas wrote:
  I am not thrilled about inventing a new column for this, but how about
  a display like so:
 
  regression=# \df nth_value
 List of functions
Schema   |   Name| Result data type |   Argument data types
  +---+--+-
   pg_catalog | nth_value | anyelement   | anyelement, integer OVER window
 
  or some other addition that only shows up when needed.
 
 I think this whole idea is a bad one.  In the current release, you can do
 
 DROP FUNCTION Name ( Argument data types )
 
 ...and it will work.  Maybe you will say that no one is doing this via
 a script (which I wouldn't bet on, but it's possible) but I'm sure
 people are doing it via cut and paste, because I have done exactly
 this thing.  Any of the various proposals for hacking up Argument data
 types will make this no longer true, and somebody will get confused.
 I think you should bite the bullet and add a type column (f for
 regular function and w for window?  could there be others in the
 future?).

I assume this is still an open issue.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Pluggable Indexes (was Re: [HACKERS] rmgr hooks (v2))

2009-01-21 Thread Simon Riggs

On Wed, 2009-01-21 at 18:06 -0500, Bruce Momjian wrote:

 Plugability adds complexity. Heikki's comment is that adding this
 patch make the job of creating pluggable indexes 5% easier, while no
 one is actually working on plugable indexes, and it hard to say that
 making it 5% easier really advances anything, especially since many of
 our existing index types aren't WAL-logged.  Plugability is not a
 zero-cost feature.

Sorry Bruce, but that misses the key point.

Without the patch it is completely *impossible* to write an index plugin
that is *recoverable*. Yes, we have pluggable indexes now, but unless
they are recoverable we certainly can't ever use them in production.

With the patch, you still have to write the index code. I agree it is
hard code to write, but not impossible. I would go so far as to say that
the patch helps you 0% with the task of actually writing the plugin. But
the patch enables you to start and that is all its intended as: an
enabler.

So its not a slightly easier thing, its a can/cannot thing.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Pluggable Indexes (was Re: [HACKERS] rmgr hooks (v2))

2009-01-21 Thread Bruce Momjian
Simon Riggs wrote:
 
 On Wed, 2009-01-21 at 18:06 -0500, Bruce Momjian wrote:
 
  Plugability adds complexity. Heikki's comment is that adding this
  patch make the job of creating pluggable indexes 5% easier, while no
  one is actually working on plugable indexes, and it hard to say that
  making it 5% easier really advances anything, especially since many of
  our existing index types aren't WAL-logged.  Plugability is not a
  zero-cost feature.
 
 Sorry Bruce, but that misses the key point.

I understood the key point.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SET TRANSACTION and SQL Standard

2009-01-21 Thread Bruce Momjian
Heikki Linnakangas wrote:
 Simon Riggs wrote:
  On Fri, 2009-01-09 at 16:14 +0200, Peter Eisentraut wrote:
  Simon Riggs wrote:
  I notice that we allow commands such as
 
  SET TRANSACTION read only read write read only;
 
  BEGIN TRANSACTION read only read only read only;
 
  Unsurprisingly, these violate the SQL Standard:
  * p.977 section 19.1 syntax (1)
  * p.957 section 17.3 syntax (2)
  Well, we allow a lot of things.  Violations of the SQL standard happen 
  when a command that appears in the standard doesn't do what the standard 
  says.  Allowing commands that are not in the standard is not a violation.
  
  Except when the standard explicitly forbids it, as with the above.
 
 No, it just means that the statement SET TRANSACTION read only read 
 write read only; doesn't conform to the standard, and it's therefore 
 implementation-dependent what it does. See the meaning of shall in 
 Syntax Rules, section 6.3.3.2 Terms denoting rule requirements.
 
 I agree with Tom that the 2nd form is harmless, but we should throw an 
 error for the first.

Added to TODO:

Prevent the specification of conflicting transaction read/write
options

* http://archives.postgresql.org/pgsql-hackers/2009-01/msg00684.php 

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pluggable Indexes

2009-01-21 Thread Simon Riggs

On Thu, 2009-01-22 at 00:00 +, Gregory Stark wrote:

 But note that the interface is critical.

Yes, it is.

The existing rmgr code provides for 5 separate calls that a module needs
to implement to make an access method recoverable. btree, hash, gist and
gin already implement that API.

I haven't invented a new interface at all. All the patch does is expose
the existing API for plugins, allowing them to act in exactly the same
ways that the existing index types do.

If you have patch review comments about additional requirements for that
API, that is fine. But saying the API is wrong is not a reason to reject
the patch. Its a reason to change the patch.

 the consequences for getting something wrong are so much more dire.
 The entire cluster is dead and can't even be restored from backup.

Not true. If you decide to use a pluggable index and the plugin breaks,
you can turn off that index type and continue recovering the database.
If GIN breaks for example, you can simply bypass it and continue. So the
rmgr patch provides you a mechanism for recovering an existing system in
a way that is not currently possible - no data loss, just loss of
damaged indexes. And it provides an escape hatch if you use a pluggable
index and it breaks.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Pluggable Indexes (was Re: [HACKERS] rmgr hooks (v2))

2009-01-21 Thread Simon Riggs

On Wed, 2009-01-21 at 17:46 -0600, Kevin Grittner wrote:
  Bruce Momjian br...@momjian.us wrote:
  It is cost vs. benefit.  No one is saying plugabiity is bad, only
 that
  in this case it is more costly than beneficial
  
 Just curious -- are we talking execution time costs or programming
 costs because of increased code complexity?

The execution time of a pluggable index would be identical to a
non-pluggable index. There is zero overhead in having the capability,
since we already use a function pointer mechanism in the existing code. 

There is not really any overhead in having 10 or 50 plugins; the
recovery processing time is determined by the efficiency of the plugin
and how many WAL message need to be processed. For example, if you have
more TypeX indexes then recovery will spend more time recovering TypeX
indexes. If you have no TypeX indexes, that module would only be asked
to startup() and cleanup(), but nothing else.

The code complexity is exactly the same whether you write it as a plugin
or a patch against core. The API is identical. The key difference is
that users get to choose whether they use a plugin, or not, whereas
without the plugin you are limited to index types that have been
included with core Postgres.

Just as with some PL languages, some pluggable indexes may gain a
reputation as buggy and fall into disuse. Others may become popular and
be invited to join core, where they will gain further trust.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Auto-updated fields

2009-01-21 Thread Bruce Momjian
Alvaro Herrera wrote:
 Robert Treat wrote:
  On Thursday 08 May 2008 00:27:10 Tino Wildenhain wrote:
   David Fetter wrote:
 
 Ref: http://archives.postgresql.org/pgsql-hackers/2008-05/msg00198.php
 
1.  Create a generic (possibly overloaded) trigger function, bundled
with PostgreSQL, which sets a field to some value.  For example, a
timestamptz version might set the field to now().
 
   Having the pre defined triggers at hand could be useful, especially
   for people not writing triggers so often to get used to it but I'm
   really not happy with the idea of magic preprocessing.
 
  I have a generic version of this in pagila. 
 
 Now that we have a specific file in core for generic triggers (right now with 
 a
 single one), how about adding this one to it?

Any progress on this?  TODO?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_stats queries versus per-database encodings

2009-01-21 Thread Bruce Momjian
Tom Lane wrote:
 Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
  Tom Lane wrote:
  We could attack this by including source database's encoding in the
  shared-memory entries, and performing a conversion on the fly when
  reading out the data.  However, what happens if the conversion fails?
 
  The most useful behavior would be to replace the untranslatable 
  characters with ?. I'm not sure how invasive the changes to the 
  conversion functions would be to support that.
 
 I agree, but it looks like fairly massive changes would be needed,
 starting with redefining the API for conversion functions to add
 an error/noerror boolean.  Not something that I care to tackle
 right now.  Maybe we shall just have to live with it for another
 release.

Added to TODO:

Have pg_stat_activity display query strings in the correct client
encoding

* http://archives.postgresql.org/pgsql-hackers/2009-01/msg00131.php 

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: FWD: Re: [HACKERS] Updated backslash consistency patch

2009-01-21 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
  Here are the items I think are best to default to user-only:
 [...]
  Here are the ones that should include system objects by default:
[...]
 So maybe we should provide U, S, and A modifiers for every type of
 object (user, system, all).  That doesn't solve the problem of which
 should be the default for each object type, but at least it guarantees
 that you can type an extra character to get the behavior you want in
 any particular case.

You know, there's an easy answer to what to set the defaults to, though
it might be a bit cheesy- how about the current behavior?  Adding
functionality (+U or +S or whatever) without breaking backwards
compatibility (much).

I'm not a huge fan of having \df still list system functions, but it's
less of an issue if I can just do \dfU when I want.  Of course, for a
year or two I'll probably be cursing this thread whenver I pick the
wrong one, but that's life. ;)

In response to a comment earlier- I definitely like the idea of
pre-pending system calls that aren't really 'published' with a 'pg_'.
That probably raises some backwords compatability problems, but I still
think it's a good idea to try and do at some point.

Stephen


signature.asc
Description: Digital signature


Re: FWD: Re: [HACKERS] Updated backslash consistency patch

2009-01-21 Thread Stephen Frost
Bruce, et al,

* Bruce Momjian (br...@momjian.us) wrote:
 \dg  [PATTERN]list roles (groups)
 \du  [PATTERN]list roles (users)

Seeing this list reminded me of a pet-peeve..  \du and \dg actually show
the same info, that's fine, but neither of them show the rolcanlogin
value.  As someone who has a bad habit of doing 'create role blah;' for
new users this can get very annoying.  I also have people complain to me
that they can't figure out why a new user can't log in after they did
the same thing.  Could we get that changed?  Or maybe have 'connections'
say something different when you don't have rolcanlogin besides 'no
limit'?

 \dp  [PATTERN]list table, view, and sequence access privileges

erp, I don't think I changed this in my column-level privleges patch..
Should we explicitly mention column in this list?

Thanks,

Stephen


signature.asc
Description: Digital signature


[HACKERS] parallel restore

2009-01-21 Thread Andrew Dunstan


Latest patch is attached. Changed as discussed to issue TRUNCATE ... 
ONLY when talking to servers = 8.4 instead of plain TRUNCATE.


cheers

andrew


parallel_restore_15.patch.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] incoherent view of serializable transactions

2009-01-21 Thread Kevin Grittner
Here's a shot at a more radical revision, to try to address concerns
raised over my failure in the previous (very minimal) suggested patch
to address PostgreSQL behavior close to where the spec's behavior is
described, and my dragging in of language directly from the spec in a
confusing context.  I'd appreciate any corrections or suggestions
before I massage it into sgml.
 
Also, I don't know if I should leave it with the one example or
whether there should be more.  I could leave in the old example,
although the popular example of reversing updates (one transaction
updates all card rows to 'face-up' where they are 'face-down' and vice
versa) seems easier to understand.  One or both of these?  Other
suggestions?
 
Also, I tried using SELECT FOR SHARE and SELECT FOR HOLD as the
complete solution or instead of one of the table locks, but was able
to generate anomalies in all such cases.  If someone has a less
extreme technique for blocking the anomalies in the receipt example
(even when the SELECT of the deposit date for a receipt is in a
separate statement earlier in the transaction), please let me know, so
that I can include it.
 
If time permits I might take a stab at expanding the section on data
consistency checks at the application level; however, that seems less
urgent than correcting the obsolescent discussions of the SQL standard
and describing some of the anomalies not covered by a discussion of
consistency checks.
 
Thanks,
 
-Kevin
 
 
13.2. Transaction Isolation
 
The SQL standard defines four levels of transaction isolation in terms
of three phenomena that must be prevented between concurrent
transactions, with additional constraints on Serializable
transactions. These undesirable phenomena are: 
 
dirty read 
 
A transaction reads data written by a concurrent uncommitted
transaction. 
 
nonrepeatable read 
 
A transaction re-reads data it has previously read and finds that data
has been modified by another transaction (that committed since the
initial read). 
 
phantom read 
 
A transaction re-executes a query returning a set of rows that satisfy
a search condition and finds that the set of rows satisfying the
condition has changed due to another recently-committed transaction.
 
The four transaction isolation levels and the corresponding behaviors
are described in Table 13-1.
 
Table 13-1. SQL Transaction Isolation Levels
 
table here
 
The standard also requires that serializable transactions behave as
though they were run one at a time, even though their execution may
actually overlap. Since the phenomena described above relate to the
visibility of the effects of concurrent transactions, and each
serializable transaction must behave as though it were run in its
entirety either before or after every other transaction, none of the
above phenomena can occur within a serializable transaction.
 
In practice there is another popular transaction isolation level, not
mentioned in the standard, generally known as Snapshot isolation
level.  A transaction executed at this transaction isolation level
sees a consistent view of the data; changes made by other transactions
are not visible to it.  Because of this, none of the phenomena
described above are possible.  Additionally, when concurrent
transactions running at this level attempt to modify the same data,
the update conflict causes causes transaction rollback to prevent many
forms of update anomalies.  Still, data may be viewed or stored in a
state which is not consistent with any serial execution of
transactions run at this level, so although it is more strict than
required for Repeatable Read, it does not meet the standard's
definition of the Serializable transaction isolation level.
 
In PostgreSQL you can request any of the four standard transaction
isolation levels, but internally there are only two distinct isolation
levels, which correspond to the levels Read Committed and Snapshot.
When you select the level Read Uncommitted you really get Read
Committed, and when you select Repeatable Read or Serializable you
really get Snapshot.  Since the standard does not provide for the
Snapshot isolation level, PostgreSQL reports it as Serializable.  The
behavior of the available isolation levels is detailed in the
following subsections. 
 
To set the transaction isolation level of a transaction, use the
command SET TRANSACTION. or specify the desired transaction isolation
level on a BEGIN TRANSACTION or START TRANSACTION statement.
 
 
13.2.1. Read Committed Isolation Level
 
unchanged
 
 
13.2.2. Snapshot Isolation Level
 
The Snapshot level (reported as Serializable) provides the strictest
transaction isolation available in PostgreSQL. This level approximates
serial transaction execution, as if transactions had been executed one
after another, serially, rather than concurrently. However,
applications using this level must be prepared to retry transactions
due to serialization failures.
 
When a transaction is on the this level, a SELECT query sees 

Re: Pluggable Indexes (was Re: [HACKERS] rmgr hooks (v2))

2009-01-21 Thread Joshua D. Drake
On Wed, 2009-01-21 at 17:49 +, Simon Riggs wrote:
 On Wed, 2009-01-21 at 18:24 +0200, Heikki Linnakangas wrote:

 Bruce Lindsay, IBM Fellow and long term DB guru was interviewed in 2005:
 Q: If you magically had enough extra time to do one additional thing at
 work that you're not doing now, what would it be?
 
 I think I would work on indexing a little harder.
 
 (He mentions XML indexing, multi-dimensional indexing etc)
 [Taken from SIGMOD Record, June 2005]

I am curious. I read this whole current thread. What is wrong with the
patch? As I understand it it does not increase complexity. It appears to
only expose (or perhaps abstract?) existing functionality into a usable
API that is not dependent on something being in core.

Imagine if at some point to develop new index types or perhaps single
purpose modified index types all you needed was knowhow, pgxs and too
much time.

Unless there is something wrong with this patch I say we need to stop
arguing semantics and apply it.

Sincerely,

Joshua D. Drake

-- 
PostgreSQL - XMPP: jdr...@jabber.postgresql.org
   Consulting, Development, Support, Training
   503-667-4564 - http://www.commandprompt.com/
   The PostgreSQL Company, serving since 1997


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


  1   2   >