Re: [HACKERS] estimating # of distinct values

2011-01-10 Thread tv
 On Fri, 2011-01-07 at 12:32 +0100, t...@fuzzy.cz wrote:
 the problem is you will eventually need to drop the results and rebuild
 it, as the algorithms do not handle deletes (ok, Florian mentioned an
 algorithm L_0 described in one of the papers, but I'm not sure we can
 use
 it).

 Yes, but even then you can start with much better cards if you already
 have an estimate of what it looks like, based on the fact that you did
 continuous updating of it. For example you'll have a pretty good
 estimate of the bounds of the number of distinct values, while if you
 really start from scratch you have nothing to go on, and must assume
 anything between "all the values are distinct" and "there are only a few
 of them".

Sure, using the previous estimate is a good idea. I just wanted to point
out there is no reasonable way to handle deletes, so you have to drop
the stats and rebuild them from scratch.

The biggest problem is not choosing reasonable parameters (some parameter
choices can handle a few billion ndistinct values with something like
128kB of memory and less than 5% error). The really serious concern is the
I/O generated by rebuilding the stats.
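
Just to illustrate what these estimators look like, here is a minimal
standalone sketch of the simplest one, linear counting (plain C, not
backend code -- the algorithms from the papers that reach billions of
ndistinct within ~128kB are fancier, but the shape is the same, and the
inability to handle deletes is visible right away):

/*
 * Linear counting sketch: hash each value into a fixed-size bitmap and
 * estimate ndistinct from the fraction of bits still zero.  There is no
 * way to process a delete -- once a bit is set we cannot know when to
 * clear it -- which is why the stats eventually must be rebuilt.
 */
#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define BITMAP_BYTES (128 * 1024)		/* the 128kB figure mentioned above */
#define BITMAP_BITS  ((uint64_t) BITMAP_BYTES * 8)

typedef struct DistinctSketch
{
	uint8_t		bits[BITMAP_BYTES];
} DistinctSketch;

static uint64_t
hash64(uint64_t x)
{
	/* any reasonable 64-bit mixing function will do */
	x ^= x >> 33;	x *= 0xff51afd7ed558ccdULL;
	x ^= x >> 33;	x *= 0xc4ceb9fe1a85ec53ULL;
	x ^= x >> 33;
	return x;
}

static void
sketch_add(DistinctSketch *s, uint64_t value)
{
	uint64_t	bit = hash64(value) % BITMAP_BITS;

	s->bits[bit / 8] |= (uint8_t) (1 << (bit % 8));
}

static double
sketch_estimate(const DistinctSketch *s)
{
	uint64_t	zero = 0;

	for (size_t i = 0; i < BITMAP_BYTES; i++)
		for (int b = 0; b < 8; b++)
			if (!(s->bits[i] & (1 << b)))
				zero++;

	if (zero == 0)
		return (double) BITMAP_BITS;	/* bitmap saturated, estimate is off */

	/* linear counting: ndistinct ~= -m * ln(V), V = fraction of zero bits */
	return -(double) BITMAP_BITS * log((double) zero / BITMAP_BITS);
}

int
main(void)
{
	DistinctSketch *s = calloc(1, sizeof(DistinctSketch));

	for (uint64_t i = 0; i < 1000000; i++)
		sketch_add(s, i % 50000);		/* 50k distinct values, many repeats */

	printf("estimated ndistinct: %.0f\n", sketch_estimate(s));
	free(s);
	return 0;
}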

 Another thing I'm not sure about is where to store those intermediate
 stats (used to get the current estimate, updated incrementally).

 The forks implementation proposed in other responses is probably the
 best idea if usable. It will also solve the problem of memory
 limitations, at the expense of more resources used if the structure
 grows too big, but it will probably still be fast enough if it stays
 small and in cache.

Hm, the forks seem to be an interesting option. It's probably much better
than keeping that directly in memory (not a good idea if there is a lot
of data). And you don't really need the data to get the estimate; it's
only needed when updating the stats.

So the last thing I'm not sure about is how to store the changed rows, so
that the update process can get a list of new values. Someone already
offered LISTEN/NOTIFY, but I guess we could just write the ctids into a
file (maybe another fork)?
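
To be concrete, what I mean by writing the ctids into a file is nothing
more than an append-only log of (block, offset) pairs that the background
update process later replays. A rough standalone sketch, with all names
hypothetical:

#include <stdint.h>
#include <stdio.h>

/* Hypothetical on-disk record: the ctid of one newly inserted tuple. */
typedef struct ChangedTupleRecord
{
	uint32_t	blockNumber;	/* heap block of the new tuple */
	uint16_t	offsetNumber;	/* line pointer within that block */
} ChangedTupleRecord;

/* Append one ctid to the per-relation change log (another fork, or a file). */
static int
changelog_append(FILE *log, uint32_t block, uint16_t offset)
{
	ChangedTupleRecord rec = {block, offset};

	return fwrite(&rec, sizeof(rec), 1, log) == 1 ? 0 : -1;
}

/*
 * The background process reads the log back; for each ctid it would fetch
 * the tuple, feed the column values into the ndistinct sketch, and then
 * truncate the log.
 */
static void
changelog_replay(FILE *log, void (*process_ctid) (uint32_t block, uint16_t offset))
{
	ChangedTupleRecord rec;

	rewind(log);
	while (fread(&rec, sizeof(rec), 1, log) == 1)
		process_ctid(rec.blockNumber, rec.offsetNumber);
}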

regards
Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Streaming base backups

2011-01-10 Thread Magnus Hagander
On Sun, Jan 9, 2011 at 23:33, Cédric Villemain
cedric.villemain.deb...@gmail.com wrote:
 2011/1/7 Magnus Hagander mag...@hagander.net:
 On Fri, Jan 7, 2011 at 01:47, Cédric Villemain
 cedric.villemain.deb...@gmail.com wrote:
 2011/1/5 Magnus Hagander mag...@hagander.net:
 On Wed, Jan 5, 2011 at 22:58, Dimitri Fontaine dimi...@2ndquadrant.fr 
 wrote:
 Magnus Hagander mag...@hagander.net writes:
 * Stefan mentiond it might be useful to put some
 posix_fadvise(POSIX_FADV_DONTNEED)
   in the process that streams all the files out. Seems useful, as long 
 as that
   doesn't kick them out of the cache *completely*, for other backends as 
 well.
   Do we know if that is the case?

 Maybe have a look at pgfincore to only tag DONTNEED for blocks that are
 not already in SHM?

 I think that's way more complex than we want to go here.


 DONTNEED will remove the block from the OS buffer cache every time.

 Then we definitely don't want to use it - because some other backend
 might well want the file. Better leave it up to the standard logic in
 the kernel.

 Looking at the patch, it is (very) easy to add the support for that in
 basebackup.c
 That supposes allowing mincore(), and thus mmap(), and so probably
 switching the fopen() to an open() (or adding an open() just for the
 mmap requirement...)

 Let's go ?

Per above, I still don't think we *should* do this. We don't want to
kick things out of the cache underneath other backends, and we can't
control that. Either way, it shouldn't happen in the beginning, and if
it does, it should be backed with proper benchmarks.
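
For the record, the idea that was left out would have looked roughly like
this -- stream the file, then tell the kernel the backup process is done
with it (a sketch only, not part of the committed code; send_chunk() is a
stand-in for the real protocol code). POSIX_FADV_DONTNEED evicts the
pages for everyone, which is exactly the objection above:

#define _XOPEN_SOURCE 600
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Send one file to the client, then hint the kernel that this process no
 * longer needs its pages.  Other backends reading the same file afterwards
 * would have to go back to disk.
 */
static int
send_file_and_drop_cache(const char *path,
						 int (*send_chunk) (const char *data, size_t len))
{
	char		buf[65536];
	ssize_t		n;
	int			fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;

	while ((n = read(fd, buf, sizeof(buf))) > 0)
		if (send_chunk(buf, (size_t) n) < 0)
			break;

	/* We're done with this file; len = 0 means "to the end of the file". */
	(void) posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);

	close(fd);
	return (n < 0) ? -1 : 0;
}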

I've committed the backend side of this, without that. Still working
on the client, and on cleaning up Heikki's patch for grammar/parser
support.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] system views for walsender activity

2011-01-10 Thread Magnus Hagander
On Sun, Jan 9, 2011 at 15:53, Simon Riggs si...@2ndquadrant.com wrote:
 On Sun, 2011-01-09 at 12:52 +0100, Magnus Hagander wrote:

 One thing I noticed is that it gives an interesting property to my
 patch for streaming base backups - they now show up in
 pg_stat_replication, with a streaming location of 0/0.

 If the view is named pg_stat_replication, we probably want to filter
 that out somehow. But then, do we want a separate view listing the
 walsenders that are busy sending base backups?

 For that matter, do we want an indication that separates a walsender
 not sending data from one sending that happens to be at location 0/0?
 Most will leave 0/0 really quickly, but a walsender can be idle (not
 received a command yet), or it can be running IDENTIFY_SYSTEM for
 example.

 I think we need a status enum. ('BACKUP', 'CATCHUP', 'STREAM') for the 3
 phases of replication.

That seems reasonable. But if we keep BACKUP in there, should we
really have it called pg_stat_replication? (yeah, I know, I'm not
giving up :P)

(You'd need a 4th mode for WAITING or so, to indicate it's waiting for
a command)


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSI and 2PC

2011-01-10 Thread Kevin Grittner
Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 
 In going back through old emails to see what issues might have
 been raised but not yet addressed for the SSI patch, I found the
 subject issue described in a review by Jeff Davis here:
  
 http://archives.postgresql.org/pgsql-hackers/2010-10/msg01159.php
 
After reviewing the docs and testing things, I'm convinced that more
work is needed.  Because the transaction's writes aren't visible
until COMMIT PREPARED is run, and write-write conflicts are still
causing serialization failures after PREPARE TRANSACTION, some of
the work being done for SSI on PREPARE TRANSACTION needs to be moved
to COMMIT PREPARED.
 
It seems likely that shops who use prepared transactions are more
likely than most to care about truly serializable transactions, so I
don't think I should write this off as a limitation for the 9.1
implementation.  Unless someone sees some dire problem with the
patch which I've missed, this seems like my top priority to fix
before cutting a patch.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] system views for walsender activity

2011-01-10 Thread Simon Riggs
On Mon, 2011-01-10 at 15:20 +0100, Magnus Hagander wrote:
 On Sun, Jan 9, 2011 at 15:53, Simon Riggs si...@2ndquadrant.com wrote:
  On Sun, 2011-01-09 at 12:52 +0100, Magnus Hagander wrote:
 
  One thing I noticed is that it gives an interesting property to my
  patch for streaming base backups - they now show up in
  pg_stat_replication, with a streaming location of 0/0.
 
  If the view is named pg_stat_replication, we probably want to filter
  that out somehow. But then, do we want a separate view listing the
  walsenders that are busy sending base backups?
 
  For that matter, do we want an indication that separates a walsender
  not sending data from one sending that happens to be at location 0/0?
  Most will leave 0/0 really quickly, but a walsender can be idle (not
  received a command yet), or it can be running IDENTIFY_SYSTEM for
  example.
 
  I think we need a status enum. ('BACKUP', 'CATCHUP', 'STREAM') for the 3
  phases of replication.
 
 That seems reasonable. But if we keep BACKUP in there, should we
 really have it called pg_stat_replication? (yeah, I know, I'm not
 giving up :P)
 
 (You'd need a 4th mode for WAITING or so, to indicate it's waiting for
 a command)

That's something different.

The 3 phases are more concrete.

BACKUP -> CATCHUP -> STREAM

When you connect you either do BACKUP or CATCHUP. Once in CATCHUP mode
you never issue a BACKUP. Once we have caught up we move to STREAM. That
has nothing to do with idle/active.
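
In code terms, something like this minimal sketch (the names are
illustrative only, not a committed enum; whatever we add for the
waiting-for-a-command case discussed below would be an extra value):

#include <stdbool.h>

typedef enum WalSndState
{
	WALSNDSTATE_BACKUP,			/* sending a base backup */
	WALSNDSTATE_CATCHUP,		/* sending WAL, standby not yet caught up */
	WALSNDSTATE_STREAMING		/* standby caught up, streaming in real time */
} WalSndState;

/*
 * Legal transitions: a connection starts in BACKUP or CATCHUP; it never
 * goes (back) to BACKUP, and moves to STREAMING once it has caught up.
 */
static bool
walsnd_transition_ok(WalSndState from, WalSndState to)
{
	return from == WALSNDSTATE_CATCHUP && to == WALSNDSTATE_STREAMING;
}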

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSI and 2PC

2011-01-10 Thread David Fetter
On Mon, Jan 10, 2011 at 08:49:12AM -0600, Kevin Grittner wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov wrote:
  
  In going back through old emails to see what issues might have
  been raised but not yet addressed for the SSI patch, I found the
  subject issue described in a review by Jeff Davis here:
   
  http://archives.postgresql.org/pgsql-hackers/2010-10/msg01159.php
  
 After reviewing the docs and testing things, I'm convinced that more
 work is needed.  Because the transaction's writes aren't visible
 until COMMIT PREPARED is run, and write-write conflicts are still
 causing serialization failures after PREPARE TRANSACTION, some of
 the work being done for SSI on PREPARE TRANSACTION needs to be moved
 to COMMIT PREPARED.
  
 It seems likely that shops who use prepared transactions are more
 likely than most to care about truly serializable transactions, so I
 don't think I should write this off as a limitation for the 9.1
 implementation.  Unless someone sees some dire problem with the
 patch which I've missed, this seems like my top priority to fix
 before cutting a patch.

Could people fix it after the patch?  ISTM that a great way to test it
is to make very sure it's available ASAP to a wide range of people via
the next alpha (or beta, if that's where we're going next).

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSI and 2PC

2011-01-10 Thread Kevin Grittner
David Fetter da...@fetter.org wrote:
 
 Could people fix it after the patch?  ISTM that a great way to
 test it is to make very sure it's available ASAP to a wide range
 of people via the next alpha (or beta, if that's where we're going
 next).
 
People can always pull from the git repo:
 
git://git.postgresql.org/git/users/kgrittn/postgres.git
 
Also, I can post a patch against HEAD at any time.  Should I post
one now, and then again after this is solved?
 
Full disclosure requires that I mention that while Dan has completed
code to fix the page split/combine issues Heikki raised, I don't
think he's done testing it.  (It's hard to test because you don't
hit the problem unless you have a page split or combine right at the
point where the hash table for predicate lock becomes full.)  So,
anyway, there could possibly be some wet paint there.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSI and 2PC

2011-01-10 Thread David Fetter
On Mon, Jan 10, 2011 at 08:59:45AM -0600, Kevin Grittner wrote:
 David Fetter da...@fetter.org wrote:
  Could people fix it after the patch?  ISTM that a great way to
  test it is to make very sure it's available ASAP to a wide range
  of people via the next alpha (or beta, if that's where we're going
  next).
  
 People can always pull from the git repo:
  
 git://git.postgresql.org/git/users/kgrittn/postgres.git
  
 Also, I can post a patch against HEAD at any time.  Should I post
 one now, and then again after this is solved?
  
 Full disclosure requires that I mention that while Dan has completed
 code to fix the page split/combine issues Heikki raised, I don't
 think he's done testing it.  (It's hard to test because you don't
 hit the problem unless you have a page split or combine right at the
 point where the hash table for predicate lock becomes full.)  So,
 anyway, there could possibly be some wet paint there.

Short of a test suite that can inject faults at the exact kinds of
places where this occurs and a way to enumerate all those faults,
there's only so much testing that's possible to do /in vitro/.  Oh,
and such enumerations tend to be combinatorial explosions anyhow. :P

At some point, and that point is rapidly approaching if it's not
already here, you've done what you can to shake out bugs and
infelicities, and the next steps are up to people testing alphas,
betas, and to be completely frank, 9.1.0 and possibly later versions.

This is way, way too big a feature to expect you can get a perfect
handle on it by theory alone.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN indexscans versus equality selectivity estimation

2011-01-10 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Sun, Jan 9, 2011 at 6:38 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 or we could hack eqsel() to bound the no-stats estimate to a bit less
 than 1.

 This seems like a pretty sensible thing to do.  I can't immediately
 imagine a situation in which 1.0 is a sensible selectivity estimate in
 the no-stats case and 0.90 (say) is a major regression.

After sleeping on it, that seems like my least favorite option.  It's
basically a kluge, as is obvious because there's no principled way to
choose what the bound is (or the minimum result from
get_variable_numdistinct, if we were to hack it there).  I'm currently
leaning to the idea of tweaking the logic in indxpath.c; in particular,
why wouldn't it be a good idea to force consideration of the bitmap path
if the index type hasn't got amgettuple?  If we don't, then we've
completely wasted the effort spent up to that point inside
find_usable_indexes.

Or we could just ignore the issue; as Josh says, that's not an
unreasonable option.  The particular case I ran into is certainly not
too compelling.  I'm a bit worried though that there might be other
cases where the estimator comes up with 1.0 selectivity but it'd still
be worth considering a bitmap scan.
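
The kluge, for reference, would amount to little more than this, and the
arbitrary cutoff is precisely the objection (a sketch with a made-up
constant, not a proposed patch):

/*
 * Cap whatever eqsel() returns when no statistics are available, so a
 * selectivity of 1.0 can never make the index/bitmap path look pointless.
 */
#define NO_STATS_SELECTIVITY_CAP	0.9		/* hypothetical, unprincipled */

static double
clamp_no_stats_selectivity(double selec)
{
	if (selec > NO_STATS_SELECTIVITY_CAP)
		selec = NO_STATS_SELECTIVITY_CAP;
	if (selec < 0.0)
		selec = 0.0;
	return selec;
}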

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] system views for walsender activity

2011-01-10 Thread Heikki Linnakangas

On 10.01.2011 16:49, Simon Riggs wrote:

On Mon, 2011-01-10 at 15:20 +0100, Magnus Hagander wrote:

On Sun, Jan 9, 2011 at 15:53, Simon Riggssi...@2ndquadrant.com  wrote:

On Sun, 2011-01-09 at 12:52 +0100, Magnus Hagander wrote:


One thing I noticed is that it gives an interesting property to my
patch for streaming base backups - they now show up in
pg_stat_replication, with a streaming location of 0/0.

If the view is named pg_stat_replication, we probably want to filter
that out somehow. But then, do we want a separate view listing the
walsenders that are busy sending base backups?

For that matter, do we want an indication that separates a walsender
not sending data from one sending that happens to be at location 0/0?
Most will leave 0/0 really quickly, but a walsender can be idle (not
received a command yet), or it can be running IDENTIFY_SYSTEM for
example.


I think we need a status enum. ('BACKUP', 'CATCHUP', 'STREAM') for the 3
phases of replication.


That seems reasonable. But if we keep BACKUP in there, should we
really have it called pg_stat_replication? (yeah, I know, I'm not
giving up :P)

(You'd need a 4th mode for WAITING or so, to indicate it's waiting for
a command)


That's something different.

The 3 phases are more concrete.

BACKUP -> CATCHUP -> STREAM

When you connect you either do BACKUP or CATCHUP. Once in CATCHUP mode
you never issue a BACKUP. Once we have caught up we move to STREAM. That
has nothing to do with idle/active.


So how does a walsender that's waiting for a command from the client 
show up? Surely it's not in catchup mode yet?


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] system views for walsender activity

2011-01-10 Thread Simon Riggs
On Mon, 2011-01-10 at 17:05 +0200, Heikki Linnakangas wrote:
 On 10.01.2011 16:49, Simon Riggs wrote:
  On Mon, 2011-01-10 at 15:20 +0100, Magnus Hagander wrote:
  On Sun, Jan 9, 2011 at 15:53, Simon Riggssi...@2ndquadrant.com  wrote:
  On Sun, 2011-01-09 at 12:52 +0100, Magnus Hagander wrote:
 
  One thing I noticed is that it gives an interesting property to my
  patch for streaming base backups - they now show up in
  pg_stat_replication, with a streaming location of 0/0.
 
  If the view is named pg_stat_replication, we probably want to filter
  that out somehow. But then, do we want a separate view listing the
  walsenders that are busy sending base backups?
 
  For that matter, do we want an indication that separates a walsender
  not sending data from one sending that happens to be at location 0/0?
  Most will leave 0/0 really quickly, but a walsender can be idle (not
  received a command yet), or it can be running IDENTIFY_SYSTEM for
  example.
 
  I think we need a status enum. ('BACKUP', 'CATCHUP', 'STREAM') for the 3
  phases of replication.
 
  That seems reasonable. But if we keep BACKUP in there, should we
  really have it called pg_stat_replication? (yeah, I know, I'm not
  giving up :P)
 
  (You'd need a 4th mode for WAITING or so, to indicate it's waiting for
  a command)
 
  That's something different.
 
  The 3 phases are more concrete.
 
  BACKUP -> CATCHUP -> STREAM
 
  When you connect you either do BACKUP or CATCHUP. Once in CATCHUP mode
  you never issue a BACKUP. Once we have caught up we move to STREAM. That
  has nothing to do with idle/active.
 
 So how does a walsender that's waiting for a command from the client 
 show up? Surely it's not in catchup mode yet?

There is a trivial state between connect and first command. If you think
that is worth publishing, feel free. STARTING?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] system views for walsender activity

2011-01-10 Thread Euler Taveira de Oliveira

Em 10-01-2011 12:05, Heikki Linnakangas escreveu:

So how does a walsender that's waiting for a command from the client
show up? Surely it's not in catchup mode yet?

It is a kind of "initializing catchup" state. I think it is not worth
representing those short-lifespan states (in normal scenarios).



--
  Euler Taveira de Oliveira
  http://www.timbira.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] system views for walsender activity

2011-01-10 Thread Magnus Hagander
On Mon, Jan 10, 2011 at 16:41, Simon Riggs si...@2ndquadrant.com wrote:
 On Mon, 2011-01-10 at 17:05 +0200, Heikki Linnakangas wrote:
 On 10.01.2011 16:49, Simon Riggs wrote:
  On Mon, 2011-01-10 at 15:20 +0100, Magnus Hagander wrote:
  On Sun, Jan 9, 2011 at 15:53, Simon Riggssi...@2ndquadrant.com  wrote:
  On Sun, 2011-01-09 at 12:52 +0100, Magnus Hagander wrote:
 
  One thing I noticed is that it gives an interesting property to my
  patch for streaming base backups - they now show up in
  pg_stat_replication, with a streaming location of 0/0.
 
  If the view is named pg_stat_replication, we probably want to filter
  that out somehow. But then, do we want a separate view listing the
  walsenders that are busy sending base backups?
 
  For that matter, do we want an indication that separates a walsender
  not sending data from one sending that happens to be at location 0/0?
  Most will leave 0/0 really quickly, but a walsender can be idle (not
  received a command yet), or it can be running IDENTIFY_SYSTEM for
  example.
 
  I think we need a status enum. ('BACKUP', 'CATCHUP', 'STREAM') for the 3
  phases of replication.
 
  That seems reasonable. But if we keep BACKUP in there, should we
  really have it called pg_stat_replication? (yeah, I know, I'm not
  giving up :P)
 
  (You'd need a 4th mode for WAITING or so, to indicate it's waiting for
  a command)
 
  That's something different.
 
  The 3 phases are more concrete.
 
  BACKUP -> CATCHUP -> STREAM
 
  When you connect you either do BACKUP or CATCHUP. Once in CATCHUP mode
  you never issue a BACKUP. Once we have caught up we move to STREAM. That
  has nothing to do with idle/active.

 So how does a walsender that's waiting for a command from the client
 show up? Surely it's not in catchup mode yet?

 There is a trivial state between connect and first command. If you think
 that is worth publishing, feel free. STARTING?

If we don't publish it, it'll implicitly be in one of the others...
Unless we say NULL, of course, but I definitely prefer STARTING then.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pl/python custom exceptions for SPI

2011-01-10 Thread Jan Urbański
On 23/12/10 15:40, Jan Urbański wrote:
 Here's a patch implementing custom Python exceptions for SPI errors
 mentioned in
 http://archives.postgresql.org/pgsql-hackers/2010-12/msg01991.php. It's
 an incremental patch on top of the explicit-subxacts patch sent eariler.

I changed that patch to use Perl instead of sed to generate the
exceptions, which should be more portable. It's still not nice, and I
think the way forward is to have a common format for SQLSTATE
conditions, as proposed in
http://archives.postgresql.org/message-id/4d19c93c.5000...@wulczer.org.
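
The interesting part is not the generation itself but what gets done with
it at module init time: one Python class per SQLSTATE condition, all
inheriting from plpy.SPIError. Roughly like this (a standalone sketch with
hard-coded entries and made-up names, not the patch itself):

#include <string.h>
#include <Python.h>

typedef struct ExceptionEntry
{
	const char *name;			/* dotted Python name of the exception class */
	const char *sqlstate;		/* the SQLSTATE it corresponds to */
} ExceptionEntry;

/* In the patch these pairs are generated from utils/errcodes.h by Perl. */
static const ExceptionEntry entries[] = {
	{"plpy.spiexceptions.SyntaxError", "42601"},
	{"plpy.spiexceptions.UndefinedFunction", "42883"},
	{"plpy.spiexceptions.NotNullViolation", "23502"},
	{NULL, NULL}
};

static void
register_spi_exceptions(PyObject *module, PyObject *spi_error_base)
{
	const ExceptionEntry *e;

	for (e = entries; e->name != NULL; e++)
	{
		/* subclass of plpy.SPIError, so "except plpy.SPIError" still works */
		PyObject   *exc = PyErr_NewException((char *) e->name,
											 spi_error_base, NULL);

		/* the module attribute name is the part after the last dot */
		PyModule_AddObject(module, strrchr(e->name, '.') + 1, exc);
	}
}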

I failed to follow up on that patch because I couldn't figure out how
to persuade the buildsystem to generate errcodes.h early enough for the
rest of the system to compile, not to mention doing it for the MSVC
build system.

Cheers,
Jan
diff --git a/src/pl/plpython/Makefile b/src/pl/plpython/Makefile
index 33dddc6..0d7ddee 100644
*** a/src/pl/plpython/Makefile
--- b/src/pl/plpython/Makefile
*** rpathdir = $(python_libdir)
*** 38,44 
  
  NAME = plpython$(python_majorversion)
  OBJS = plpython.o
! 
  
  # Python on win32 ships with import libraries only for Microsoft Visual C++,
  # which are not compatible with mingw gcc. Therefore we need to build a
--- 38,44 
  
  NAME = plpython$(python_majorversion)
  OBJS = plpython.o
! SPIEXCEPTIONS = spiexceptions.h
  
  # Python on win32 ships with import libraries only for Microsoft Visual C++,
  # which are not compatible with mingw gcc. Therefore we need to build a
*** PSQLDIR = $(bindir)
*** 86,93 
  
  include $(top_srcdir)/src/Makefile.shlib
  
  
! all: all-lib
  
  install: all installdirs install-lib
  ifeq ($(python_majorversion),2)
--- 86,102 
  
  include $(top_srcdir)/src/Makefile.shlib
  
+ .PHONY: gen-spiexceptions
  
! # Generate spiexceptions.h from utils/errcodes.h
! spiexceptions.h: $(top_srcdir)/src/include/utils/errcodes.h
! 	$(PERL) $(srcdir)/generate-spiexceptions.pl $^  $(SPIEXCEPTIONS)
! 
! gen-spiexceptions: $(SPIEXCEPTIONS)
! 
! all: gen-spiexceptions all-lib
! 
! distprep: gen-spiexceptions
  
  install: all installdirs install-lib
  ifeq ($(python_majorversion),2)
*** clean distclean maintainer-clean: clean-
*** 138,143 
--- 147,153 
  	rm -f $(OBJS)
  	rm -rf results
  	rm -f regression.diffs regression.out
+ 	rm -f $(SPIEXCEPTIONS)
  ifeq ($(PORTNAME), win32)
  	rm -f python${pytverstr}.def
  endif
diff --git a/src/pl/plpython/expected/plpython_error.out b/src/pl/plpython/expected/plpython_error.out
index 7fc8337..718ebce 100644
*** a/src/pl/plpython/expected/plpython_error.out
--- b/src/pl/plpython/expected/plpython_error.out
*** CREATE FUNCTION sql_syntax_error() RETUR
*** 8,14 
  'plpy.execute(syntax error)'
  LANGUAGE plpythonu;
  SELECT sql_syntax_error();
! ERROR:  plpy.SPIError: syntax error at or near syntax
  CONTEXT:  PL/Python function sql_syntax_error
  /* check the handling of uncaught python exceptions
   */
--- 8,14 
  'plpy.execute(syntax error)'
  LANGUAGE plpythonu;
  SELECT sql_syntax_error();
! ERROR:  spiexceptions.SyntaxError: syntax error at or near syntax
  CONTEXT:  PL/Python function sql_syntax_error
  /* check the handling of uncaught python exceptions
   */
*** CREATE FUNCTION exception_index_invalid_
*** 27,33 
  return rv[0]'
  	LANGUAGE plpythonu;
  SELECT exception_index_invalid_nested();
! ERROR:  plpy.SPIError: function test5(unknown) does not exist
  CONTEXT:  PL/Python function exception_index_invalid_nested
  /* a typo
   */
--- 27,33 
  return rv[0]'
  	LANGUAGE plpythonu;
  SELECT exception_index_invalid_nested();
! ERROR:  spiexceptions.UndefinedFunction: function test5(unknown) does not exist
  CONTEXT:  PL/Python function exception_index_invalid_nested
  /* a typo
   */
*** return None
*** 43,49 
  '
  	LANGUAGE plpythonu;
  SELECT invalid_type_uncaught('rick');
! ERROR:  plpy.SPIError: type test does not exist
  CONTEXT:  PL/Python function invalid_type_uncaught
  /* for what it's worth catch the exception generated by
   * the typo, and return None
--- 43,49 
  '
  	LANGUAGE plpythonu;
  SELECT invalid_type_uncaught('rick');
! ERROR:  spiexceptions.UndefinedObject: type test does not exist
  CONTEXT:  PL/Python function invalid_type_uncaught
  /* for what it's worth catch the exception generated by
   * the typo, and return None
*** SELECT valid_type('rick');
*** 109,111 
--- 109,149 
   
  (1 row)
  
+ /* Check catching specific types of exceptions
+  */
+ CREATE TABLE specific (
+ i integer PRIMARY KEY
+ );
+ NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index specific_pkey for table specific
+ CREATE FUNCTION specific_exception(i integer) RETURNS void AS
+ $$
+ from plpy import spiexceptions
+ try:
+ plpy.execute(insert into specific values (%s) % (i or NULL));
+ except spiexceptions.NotNullViolation, e:
+ plpy.notice(Violated the NOT NULL constraint, 

[HACKERS] READ ONLY fixes

2011-01-10 Thread Kevin Grittner
Attached is a rebased roll-up of the 3 and 3a patches from last month.
 
-Kevin

--- a/src/backend/commands/variable.c
+++ b/src/backend/commands/variable.c
@@ -544,29 +544,72 @@ show_log_timezone(void)
 
 
 /*
+ * SET TRANSACTION READ ONLY and SET TRANSACTION READ WRITE
+ *
+ * These should be transaction properties which can be set in exactly the
+ * same points in time that transaction isolation may be set.
+ */
+bool
+assign_transaction_read_only(bool newval, bool doit, GucSource source)
+{
+	/* source == PGC_S_OVERRIDE means do it anyway, eg at xact abort */
+	if (source != PGC_S_OVERRIDE)
+	{
+		/* Can't go to r/w mode inside a r/o transaction */
+		if (newval == false && XactReadOnly && IsSubTransaction())
+		{
+			ereport(GUC_complaint_elevel(source),
+					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+					 errmsg("cannot set transaction read-write mode inside a read-only transaction")));
+			return false;
+		}
+		/* Top level transaction can't change this after first snapshot. */
+		if (FirstSnapshotSet && !IsSubTransaction())
+		{
+			ereport(GUC_complaint_elevel(source),
+					(errcode(ERRCODE_ACTIVE_SQL_TRANSACTION),
+					 errmsg("read-only property must be set before any query")));
+			return false;
+		}
+		/* Can't go to r/w mode while recovery is still active */
+		if (newval == false && XactReadOnly && RecoveryInProgress())
+		{
+			ereport(GUC_complaint_elevel(source),
+					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+					 errmsg("cannot set transaction read-write mode during recovery")));
+			return false;
+		}
+	}
+
+	return true;
+}
+
+/*
  * SET TRANSACTION ISOLATION LEVEL
  */
+extern char *XactIsoLevel_string;  /* in guc.c */
 
 const char *
 assign_XactIsoLevel(const char *value, bool doit, GucSource source)
 {
-	if (FirstSnapshotSet)
+	/* source == PGC_S_OVERRIDE means do it anyway, eg at xact abort */
+	if (source != PGC_S_OVERRIDE)
 	{
-		ereport(GUC_complaint_elevel(source),
-				(errcode(ERRCODE_ACTIVE_SQL_TRANSACTION),
-				 errmsg("SET TRANSACTION ISOLATION LEVEL must be called before any query")));
-		/* source == PGC_S_OVERRIDE means do it anyway, eg at xact abort */
-		if (source != PGC_S_OVERRIDE)
+		if (FirstSnapshotSet)
+		{
+			ereport(GUC_complaint_elevel(source),
+					(errcode(ERRCODE_ACTIVE_SQL_TRANSACTION),
+					 errmsg("SET TRANSACTION ISOLATION LEVEL must be called before any query")));
 			return NULL;
-	}
-	else if (IsSubTransaction())
-	{
-		ereport(GUC_complaint_elevel(source),
-				(errcode(ERRCODE_ACTIVE_SQL_TRANSACTION),
-				 errmsg("SET TRANSACTION ISOLATION LEVEL must not be called in a subtransaction")));
-		/* source == PGC_S_OVERRIDE means do it anyway, eg at xact abort */
-		if (source != PGC_S_OVERRIDE)
+		}
+		/* We ignore a subtransaction setting it to the existing value. */
+		if (IsSubTransaction() && strcmp(value, XactIsoLevel_string) != 0)
+		{
+			ereport(GUC_complaint_elevel(source),
+					(errcode(ERRCODE_ACTIVE_SQL_TRANSACTION),
+					 errmsg("SET TRANSACTION ISOLATION LEVEL must not be called in a subtransaction")));
 			return NULL;
+		}
 	}
 
 	if (strcmp(value, "serializable") == 0)
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -168,7 +168,6 @@ static bool assign_bonjour(bool newval, bool doit, 
GucSource source);
 static bool assign_ssl(bool newval, bool doit, GucSource source);
 static bool assign_stage_log_stats(bool newval, bool doit, GucSource source);
 static bool assign_log_stats(bool newval, bool doit, GucSource source);
-static bool assign_transaction_read_only(bool newval, bool doit, GucSource 
source);
 static const char *assign_canonical_path(const char *newval, bool doit, 
GucSource source);
 static const char *assign_timezone_abbreviations(const char *newval, bool 
doit, GucSource source);
 static const char *show_archive_command(void);
@@ -425,7 +424,6 @@ static int  server_version_num;
 static char *timezone_string;
 static char *log_timezone_string;
 

[HACKERS] Remove toast relid tracking from pg_upgrade

2011-01-10 Thread Bruce Momjian
The attached, applied patch removes the toast relid from the relation array
as it is no longer needed.  Some other renaming was also done.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
commit 0a5f11993195d74f23b63cc5c2d7024c6d27d7e2
Author: Bruce Momjian br...@momjian.us
Date:   Mon Jan 10 11:45:22 2011 -0500

The toast relid field is no longer needed in pg_upgrade's rel arrays, so
remove it.  Also some other renaming.

diff --git a/contrib/pg_upgrade/check.c b/contrib/pg_upgrade/check.c
index 74449df..fc82be4 100644
*** /tmp/6A0Xec_check.c Mon Jan 10 11:47:29 2011
--- /tmp/ahdAoa_check.c Mon Jan 10 11:47:29 2011
*** check_new_db_is_empty(void)
*** 362,368 
}
}
  
!   dbarr_free(new_cluster.dbarr);
  
if (found)
pg_log(PG_FATAL, New cluster is not empty; exiting\n);
--- 362,368 
}
}
  
!   free_db_and_rel_infos(new_cluster.dbarr);
  
if (found)
pg_log(PG_FATAL, New cluster is not empty; exiting\n);
diff --git a/contrib/pg_upgrade/info.c b/contrib/pg_upgrade/info.c
index c805a04..c041231 100644
*** /tmp/E8qJIb_info.c  Mon Jan 10 11:47:29 2011
--- /tmp/yt5zIb_info.c  Mon Jan 10 11:47:29 2011
***
*** 12,26 
  #include access/transam.h
  
  
- static void get_db_infos(ClusterInfo *cluster);
- static void print_db_arr(ClusterInfo *cluster);
- static void print_rel_arr(RelInfoArr *arr);
- static void get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo);
- static void free_rel_arr(RelInfoArr *rel_arr);
  static void create_rel_filename_map(const char *old_data, const char 
*new_data,
  const DbInfo *old_db, const DbInfo *new_db,
  const RelInfo *old_rel, const RelInfo *new_rel,
  FileNameMap *map);
  
  
  /*
--- 12,26 
  #include access/transam.h
  
  
  static void create_rel_filename_map(const char *old_data, const char 
*new_data,
  const DbInfo *old_db, const DbInfo *new_db,
  const RelInfo *old_rel, const RelInfo *new_rel,
  FileNameMap *map);
+ static void get_db_infos(ClusterInfo *cluster);
+ static void get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo);
+ static void free_rel_infos(RelInfoArr *rel_arr);
+ static void print_db_infos(DbInfoArr *dbinfo);
+ static void print_rel_infos(RelInfoArr *arr);
  
  
  /*
*** create_rel_filename_map(const char *old_
*** 111,125 
  
  
  void
! print_maps(FileNameMap *maps, int n, const char *dbName)
  {
if (log_opts.debug)
{
int mapnum;
  
!   pg_log(PG_DEBUG, mappings for db %s:\n, dbName);
  
! 	for (mapnum = 0; mapnum < n; mapnum++)
pg_log(PG_DEBUG, %s.%s: %u to %u\n,
   maps[mapnum].nspname, maps[mapnum].relname,
   maps[mapnum].old_relfilenode,
--- 111,125 
  
  
  void
! print_maps(FileNameMap *maps, int n_maps, const char *db_name)
  {
if (log_opts.debug)
{
int mapnum;
  
!   pg_log(PG_DEBUG, mappings for db %s:\n, db_name);
  
! 	for (mapnum = 0; mapnum < n_maps; mapnum++)
pg_log(PG_DEBUG, %s.%s: %u to %u\n,
   maps[mapnum].nspname, maps[mapnum].relname,
   maps[mapnum].old_relfilenode,
*** print_maps(FileNameMap *maps, int n, con
*** 131,136 
--- 131,160 
  
  
  /*
+  * get_db_and_rel_infos()
+  *
+  * higher level routine to generate dbinfos for the database running
+  * on the given port. Assumes that server is already running.
+  */
+ void
+ get_db_and_rel_infos(ClusterInfo *cluster)
+ {
+ 	int			dbnum;
+ 
+ 	get_db_infos(cluster);
+ 
+ 	for (dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
+ 		get_rel_infos(cluster, &cluster->dbarr.dbs[dbnum]);
+ 
+ 	if (log_opts.debug)
+ 	{
+ 		pg_log(PG_DEBUG, "%s databases\n", CLUSTER_NAME(cluster));
+ 		print_db_infos(&cluster->dbarr);
+ 	}
+ }
+ 
+ 
+ /*
   * get_db_infos()
   *
   * Scans pg_database system catalog and populates all user
*** get_db_infos(ClusterInfo *cluster)
*** 144,152 
int ntups;
int tupnum;
DbInfo *dbinfos;
!   int i_datname;
!   int i_oid;
!   int i_spclocation;
  
res = executeQueryOrDie(conn,
SELECT d.oid, 
d.datname, t.spclocation 
--- 168,174 
int ntups;
int tupnum;
 

[HACKERS] Using mingw

2011-01-10 Thread pasman pasmański
Hi. I'm trying to compile postgres with mingw32. When configure runs, it
reports that it found perl 5.6, which is too old. I installed perl 5.10
from ActiveState, but configure can't find it. How do I point it at the
newer perl?

-- 
Sent from my mobile device


pasman

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Feature Request: Groups in SSPI for the pg_ident.conf file mapping

2011-01-10 Thread Hotchkiss, Christopher A
To All,
I am attempting to set up a server to use SSPI for mapping operating system 
users/groups to various postgres roles. In the process I found that everything is 
driven off of the username in the mapping, with no group information, although the 
mapping file supports regular expressions to do some mapping.

As detailed in:
http://serverfault.com/questions/219596/is-it-possbile-to-restrict-who-can-connect-to-postgres-using-active-directory-gro
http://www.postgresql.org/docs/9.0/interactive/auth-methods.html#SSPI-AUTH
http://www.postgresql.org/docs/9.0/interactive/auth-username-maps.html

Would it be possible to include either the primary group or a list of groups in 
the username string for mapping in the pg_ident.conf file?

For example:
User Tom is a member of the sales primary group in the DEV domain with a 
secondary group of users. When he attempts to log in, postgres builds the 
following username for matching purposes: Tom:sales@DEV or 
Tom:sales,users@DEV. At that point we could map the user to a specific 
postgres role based on the group(s) instead of using username prefixes or hard 
coding each name.
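
With something like that in place, the mapping could be a single
regular-expression line in pg_ident.conf instead of one entry per user --
for example (hypothetical, since the group suffix does not exist today,
and sales_role is just an example role name):

# MAPNAME    SYSTEM-USERNAME                  PG-USERNAME
# anyone carrying the sales group in the DEV domain maps to sales_role
sspimap      /^[^:]+:(.*,)?sales(,.*)?@DEV$   sales_role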


Christopher A Hotchkiss
JPMorgan Chase & Co.
Email christopher.a.hotchk...@jpmchase.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Using mingw

2011-01-10 Thread Andrew Dunstan



On 01/10/2011 11:51 AM, pasman pasmański wrote:

Hi. I'm trying to compile postgres with mingw32. When configure runs, it
reports that it found perl 5.6, which is too old. I installed perl 5.10
from ActiveState, but configure can't find it. How do I point it at the
newer perl?



Assuming you installed into the usual place:

   PATH=/c/perl/bin:$PATH ./configure 



cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSI patch version 8

2011-01-10 Thread David Fetter
On Mon, Jan 10, 2011 at 10:03:18AM -0600, Kevin Grittner wrote:
 Due to popular request (Hey, David's popular, right?),

Well, I'm a person, and popular originally refers to something like
that ;)

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSI and 2PC

2011-01-10 Thread Kevin Grittner
Kevin Grittner kevin.gritt...@wicourts.gov wrote: 
 Kevin Grittner kevin.gritt...@wicourts.gov wrote:
  
 In going back through old emails to see what issues might have
 been raised but not yet addressed for the SSI patch, I found the
 subject issue described in a review by Jeff Davis here:
  
 http://archives.postgresql.org/pgsql-hackers/2010-10/msg01159.php
  
 After reviewing the docs and testing things, I'm convinced that
 more work is needed.  Because the transaction's writes aren't
 visible until COMMIT PREPARED is run, and write-write conflicts
 are still causing serialization failures after PREPARE
 TRANSACTION, some of the work being done for SSI on PREPARE
 TRANSACTION needs to be moved to COMMIT PREPARED.
 
I'm now also convinced that Jeff is right in his assessment that
when a transaction is prepared, information about predicate locks
and conflicts with other prepared transactions must be persisted
somewhere.  (Jeff referred to a 2PC state file.)
 
I'm trying not to panic here, but I haven't looked at 2PC before
yesterday and am just dipping into the code to support it, and time
is short.  Can anyone give me a pointer to anything I should read
before I dig through the 2PC code, which might accelerate this?
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] system views for walsender activity

2011-01-10 Thread Magnus Hagander
On Mon, Jan 10, 2011 at 16:48, Euler Taveira de Oliveira
eu...@timbira.com wrote:
 Em 10-01-2011 12:05, Heikki Linnakangas escreveu:

 So how does a walsender that's waiting for a command from the client
 show up? Surely it's not in catchup mode yet?

 It is kind of initializing catchup. I think it is not worth representing
 those short lifespan states (in normal scenarios).

True, but it's quite important to detect and diagnose the abnormal ones...


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN indexscans versus equality selectivity estimation

2011-01-10 Thread Josh Berkus
On 1/10/11 7:25 AM, Tom Lane wrote:
 I'm a bit worried though that there might be other
 cases where the estimator comes up with 1.0 selectivity but it'd still
 be worth considering a bitmap scan.

Well, I think the answer is to apply the other fixes, and test.  If
there are other cases of selectivity=1.0, they'll show up.  People are
pretty fast to complain if indexes aren't used, and we have a good
production test case available once you implement the other operators.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Compatibility GUC for serializable

2011-01-10 Thread Josh Berkus
On 1/9/11 5:27 PM, Robert Haas wrote:
 I agree.  I think we should assume that existing code which asks for
 serializable behavior wants serializable behavior, not broken
 serializable behavior.  There certainly could be cases where the
 opposite is true (the code wants, specifically, our traditional
 definition of serializability rather than actual serializability) but
 I bet there's not a whole lot of them, and changing such code to ask
 for REPEATABLE READ probably isn't extremely difficult.

I'm going to disagree here. For a large, sprawling, legacy application
changing SERIALIZABLE to REPEATABLE READ in every place in the code
which might call it can be prohibitively difficult.  Further, many such
applications would be written with workarounds for broken serializable
behavior, workarounds which would behave unpredictably after an upgrade.

As such, I'd tend to say that like other major behavior changes, we
ought to have a LEGACY_SERIALIZABLE GUC for a couple of versions,
defaulting to FALSE.  Otherwise SSI becomes an anti-feature for some
users and prevents them from upgrading.

On the other hand, I'm not sure how many users ever use SERIALIZABLE
mode.  That would be the main counter-argument.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Compatibility GUC for serializable

2011-01-10 Thread Kevin Grittner
Josh Berkus j...@agliodbs.com wrote:
 
 many such applications would be written with workarounds for
 broken serializable behavior, workarounds which would behave
 unpredictably after an upgrade.
 
Can you elaborate?
 
The techniques we use in our shop wouldn't interact badly with SSI,
and I'm having trouble picturing what would.  Sure, some of these
techniques would no longer be needed, and would only add overhead if
SSI was there.  They would generally tend to prevent code from
getting to the point where a serialization failure from SSI would
occur.  In spite of that there would probably be at least some
additional serialization failures.  What other interactions or
problems do you see?
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Compatibility GUC for serializable

2011-01-10 Thread Josh Berkus
On 1/10/11 10:28 AM, Kevin Grittner wrote:
 The techniques we use in our shop wouldn't interact badly with SSI,
 and I'm having trouble picturing what would.  Sure, some of these
 techniques would no longer be needed, and would only add overhead if
 SSI was there.

Yeah?  Well, you have more experience than I do in this; my clients have
tended to use SELECT FOR UPDATE instead of SERIALIZABLE.  I'll defer to
you if you feel reasonably confident that breakage won't result.

And as I said, I'm unsure of how many people are using SERIALIZABLE in
any mission-critical context right now.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Compatibility GUC for serializable

2011-01-10 Thread Kevin Grittner
Josh Berkus j...@agliodbs.com wrote:
 
 my clients have tended to use SELECT FOR UPDATE instead of
 SERIALIZABLE.
 
If they're not using SERIALIZABLE, this patch will have no impact on
them at all.  If they are using SELECT FOR UPDATE *with*
SERIALIZABLE, everything will function exactly as it is except that
there may be some serialization failures which they weren't getting
before, either from the inevitable (but hopefully minimal) false
positives inherent in the technique or because they missed covering
something.
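
For what it's worth, the way an application copes with those failures is
the same as it has always been for serialization failures under
REPEATABLE READ: retry on SQLSTATE 40001. A rough libpq sketch, where
run_transaction() is a stand-in for the application's work and returns
the failed PGresult (or NULL on success):

#include <string.h>
#include <libpq-fe.h>

static int
run_serializable(PGconn *conn, PGresult *(*run_transaction) (PGconn *))
{
	for (;;)
	{
		PGresult   *err;
		PGresult   *res;
		const char *sqlstate;
		int			retry;

		res = PQexec(conn, "BEGIN ISOLATION LEVEL SERIALIZABLE");
		PQclear(res);

		err = run_transaction(conn);	/* NULL on success */
		if (err == NULL)
		{
			res = PQexec(conn, "COMMIT");
			if (PQresultStatus(res) == PGRES_COMMAND_OK)
			{
				PQclear(res);
				return 0;				/* committed */
			}
			err = res;					/* COMMIT itself can fail with 40001 */
		}

		/* retry only on serialization_failure, give up on anything else */
		sqlstate = PQresultErrorField(err, PG_DIAG_SQLSTATE);
		retry = (sqlstate != NULL && strcmp(sqlstate, "40001") == 0);
		PQclear(err);

		res = PQexec(conn, "ROLLBACK");	/* harmless if already aborted */
		PQclear(res);

		if (!retry)
			return -1;
	}
}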
 
Since SSI doesn't introduce any blocking, and causes no behavior
changes beyond triggering serialization failures when it seems that
an anomaly may otherwise result, there's really nothing else to go
wrong.
 
Well, if there are no bugs we've missed in these few thousand lines
of code, that is.  Given the size and complexity of the patch, it'd
be surprising if we've squashed them all just yet.  We've tried
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Streaming base backups

2011-01-10 Thread Cédric Villemain
2011/1/10 Magnus Hagander mag...@hagander.net:
 On Sun, Jan 9, 2011 at 23:33, Cédric Villemain
 cedric.villemain.deb...@gmail.com wrote:
 2011/1/7 Magnus Hagander mag...@hagander.net:
 On Fri, Jan 7, 2011 at 01:47, Cédric Villemain
 cedric.villemain.deb...@gmail.com wrote:
 2011/1/5 Magnus Hagander mag...@hagander.net:
 On Wed, Jan 5, 2011 at 22:58, Dimitri Fontaine dimi...@2ndquadrant.fr 
 wrote:
 Magnus Hagander mag...@hagander.net writes:
 * Stefan mentiond it might be useful to put some
 posix_fadvise(POSIX_FADV_DONTNEED)
   in the process that streams all the files out. Seems useful, as long 
 as that
   doesn't kick them out of the cache *completely*, for other backends 
 as well.
   Do we know if that is the case?

 Maybe have a look at pgfincore to only tag DONTNEED for blocks that are
 not already in SHM?

 I think that's way more complex than we want to go here.


 DONTNEED will remove the block from the OS buffer cache every time.

 Then we definitely don't want to use it - because some other backend
 might well want the file. Better leave it up to the standard logic in
 the kernel.

 Looking at the patch, it is (very) easy to add the support for that in
 basebackup.c
 That supposed allowing mincore(), so mmap(), and so probably switch
 the fopen() to an open() (or add an open() just for mmap
 requirement...)

 Let's go ?

 Per above, I still don't think we *should* do this. We don't want to
 kick things out of the cache underneath other backends, and since we

we are dropping stuff underneath other backends anyway, but I
understand your point.

 can't control that. Either way, it shouldn't happen in the beginning,
 and if it does, should be backed with proper benchmarks.

I agree.


 I've committed the backend side of this, without that. Still working
 on the client, and on cleaning up Heikki's patch for grammar/parser
 support.

-- 
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Compatibility GUC for serializable

2011-01-10 Thread Josh Berkus
On 1/10/11 10:47 AM, Kevin Grittner wrote:
 If they're not using SERIALIZABLE, this patch will have no impact on
 them at all.  If they are using SELECT FOR UPDATE *with*
 SERIALIZABLE, everything will function exactly as it is except that
 there may be some serialization failures which they weren't getting
 before, either from the inevitable (but hopefully minimal) false
 positives inherent in the technique or because they missed covering
 something.

Right, that's what I'm worried about.  That's the sort of thing which is
very hard for a user to hunt down and troubleshoot, and could become a
blocker to upgrading.  Especially if the user has a vendor application
where they *can't* fix the code.  The only reason I'm ambivalent about
this is I'm unsure that there are more than a handful of people using
SERIALIZABLE in production applications, precisely because it's been so
unintuitive in the past.

Lemme start a survey on whether people use SERIALIZABLE.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Bug in pg_describe_object (was: Re: [HACKERS] obj_unique_identifier(oid))

2011-01-10 Thread Andreas Karlsson
Here is the bug-fix patch again with a description of the context so I
can add it to the commit fest.

Joel Jacobson discovered a bug in the function pg_describe_object where
it does not produce unique identifiers for some entries in pg_amproc.

This patch fixes the bug where, when two entries in pg_amproc differ only
in amproclefttype or amprocrighttype, the same description is produced
by pg_describe_object. The fix simply adds the two fields
(amproclefttype, amprocrighttype) to the description.

== Before patch

SELECT pg_describe_object('pg_amproc'::regclass,oid,0)
FROM pg_amproc WHERE oid IN (10608,10612);
                                 pg_describe_object                                  
--------------------------------------------------------------------------------------
 function 1 bttextcmp(text,text) of operator family array_ops for access method gin
 function 1 bttextcmp(text,text) of operator family array_ops for access method gin
(2 rows)

== After patch

SELECT pg_describe_object('pg_amproc'::regclass,oid,0)
FROM pg_amproc WHERE oid IN (10608,10612);
                                              pg_describe_object                                               
-----------------------------------------------------------------------------------------------------------------
 function 1 bttextcmp(text,text) of operator family array_ops for access method gin for (text[],text[])
 function 1 bttextcmp(text,text) of operator family array_ops for access method gin for (character varying[],character varying[])
(2 rows)

Regards,
Andreas
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index ec8eb74..795051e 100644
*** a/src/backend/catalog/dependency.c
--- b/src/backend/catalog/dependency.c
*** getObjectDescription(const ObjectAddress
*** 2389,2398 
   * textual form of the function with arguments, and the second
   * %s is the description of the operator family.
   */
! 				appendStringInfo(buffer, _("function %d %s of %s"),
  								 amprocForm->amprocnum,
  								 format_procedure(amprocForm->amproc),
! 								 opfam.data);
  pfree(opfam.data);
  
  systable_endscan(amscan);
--- 2389,2400 
   * textual form of the function with arguments, and the second
   * %s is the description of the operator family.
   */
! 				appendStringInfo(buffer, _("function %d %s of %s for (%s,%s)"),
  								 amprocForm->amprocnum,
  								 format_procedure(amprocForm->amproc),
! 								 opfam.data,
! 								 format_type_be(amprocForm->amproclefttype),
! 								 format_type_be(amprocForm->amprocrighttype));
  pfree(opfam.data);
  
  systable_endscan(amscan);

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] walsender parser patch

2011-01-10 Thread Magnus Hagander
Attached is an updated version of Heikki's patch to use a parser for
the walsender commands, instead of parsing things manually. It also
does some minor refactoring in walsender.c to break out
IdentifySystem() and StartReplication() to their own functions to make
it more readable.

While having an actual parser here isn't *necessary* at this point, it
makes things easier. And it will become increasingly useful as we add
new features (for example, the "include all wal files" option for
streaming base backup, and I'm sure that sync rep will require some
additional commands or changes to commands).

Any objections to doing this?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/
*** a/doc/src/sgml/protocol.sgml
--- b/doc/src/sgml/protocol.sgml
***
*** 1460,1475  The commands accepted in walsender mode are:
/varlistentry
  
varlistentry
! termBASE_BACKUP replaceableoptions/literal;/replaceablelabel//term
  listitem
   para
Instructs the server to start streaming a base backup.
!   The system will automatically be put in backup mode with the label
!   specified in replaceablelabel/ before the backup is started, and
!   taken out of it when the backup is complete. The following options
!   are accepted:
variablelist
 varlistentry
  termliteralPROGRESS//term
  listitem
   para
--- 1460,1486 
/varlistentry
  
varlistentry
! termBASE_BACKUP [literalLABEL/literal replaceable'label'/replaceable] [literalPROGRESS/literal]/term
  listitem
   para
Instructs the server to start streaming a base backup.
!   The system will automatically be put in backup mode before the backup
!   is started, and taken out of it when the backup is complete. The
!   following options are accepted:
variablelist
 varlistentry
+ termliteralLABEL/literal replaceable'label'/replaceable/term
+ listitem
+  para
+   Sets the label of the backup. If none is specified, a backup label
+   of literalbase backup/literal will be used. The quoting rules
+   for the label are the same as a standard SQL string with
+   xref linkend=guc-standard-conforming-strings turned on.
+  /para
+ /listitem
+/varlistentry
+ 
+varlistentry
  termliteralPROGRESS//term
  listitem
   para
*** a/src/backend/replication/Makefile
--- b/src/backend/replication/Makefile
***
*** 12,17  subdir = src/backend/replication
  top_builddir = ../../..
  include $(top_builddir)/src/Makefile.global
  
! OBJS = walsender.o walreceiverfuncs.o walreceiver.o basebackup.o
  
  include $(top_srcdir)/src/backend/common.mk
--- 12,40 
  top_builddir = ../../..
  include $(top_builddir)/src/Makefile.global
  
! OBJS = walsender.o walreceiverfuncs.o walreceiver.o basebackup.o \
! 	repl_gram.o
  
  include $(top_srcdir)/src/backend/common.mk
+ 
+ # repl_scanner is compiled as part of repl_gram
+ repl_gram.o: repl_scanner.c
+ 
+ # See notes in src/backend/parser/Makefile about the following two rules
+ 
+ repl_gram.c: repl_gram.y
+ ifdef BISON
+ 	$(BISON) -d $(BISONFLAGS) -o $@ $<
+ else
+ 	@$(missing) bison $< $@
+ endif
+ 
+ repl_scanner.c: repl_scanner.l
+ ifdef FLEX
+ 	$(FLEX) $(FLEXFLAGS) -o'$@' $<
+ else
+ 	@$(missing) flex $< $@
+ endif
+ 
+ # repl_gram.c and repl_scanner.c are in the distribution tarball, so
+ # they are not cleaned here.
*** a/src/backend/replication/basebackup.c
--- b/src/backend/replication/basebackup.c
***
*** 64,75  base_backup_cleanup(int code, Datum arg)
   * pg_stop_backup() for the user.
   */
  void
! SendBaseBackup(const char *options)
  {
  	DIR		   *dir;
  	struct dirent *de;
- 	char	   *backup_label = strchr(options, ';');
- 	bool		progress = false;
  	List	   *tablespaces = NIL;
  	tablespaceinfo *ti;
  	MemoryContext backup_context;
--- 64,73 
   * pg_stop_backup() for the user.
   */
  void
! SendBaseBackup(const char *backup_label, bool progress)
  {
  	DIR		   *dir;
  	struct dirent *de;
  	List	   *tablespaces = NIL;
  	tablespaceinfo *ti;
  	MemoryContext backup_context;
***
*** 83,100  SendBaseBackup(const char *options)
  	old_context = MemoryContextSwitchTo(backup_context);
  
  	if (backup_label == NULL)
! 		ereport(FATAL,
! 				(errcode(ERRCODE_PROTOCOL_VIOLATION),
! 				 errmsg("invalid base backup options: %s", options)));
! 	backup_label++;			/* Walk past the semicolon */
! 
! 	/* Currently the only option string supported is PROGRESS */
! 	if (strncmp(options, "PROGRESS", 8) == 0)
! 		progress = true;
! 	else if (options[0] != ';')
! 		ereport(FATAL,
! 				(errcode(ERRCODE_PROTOCOL_VIOLATION),
! 				 errmsg("invalid base backup options: %s", options)));
  
  	/* Make sure we can open the directory with tablespaces in it */
  	dir = AllocateDir("pg_tblspc");
--- 81,87 
  	old_context = 

Re: [HACKERS] Bug in pg_describe_object (was: Re: [HACKERS] obj_unique_identifier(oid))

2011-01-10 Thread Joel Jacobson
2011/1/10 Andreas Karlsson andr...@proxel.se:
 Here is the bug-fix patch again with a description of the context so I
 can add it to the commit fest.

Many thanks for fixing the bug!

I also implemented pg_describe_object in pure SQL, for those of us
who have not yet switched to PostgreSQL 9 in production. A very
helpful function indeed!

https://github.com/gluefinance/pov/blob/master/sql/schema/pov/functions/pg_describe_object.sql

-- 
Best regards,

Joel Jacobson
Glue Finance

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] walsender parser patch

2011-01-10 Thread Dimitri Fontaine
Magnus Hagander mag...@hagander.net writes:
 Attached is an updated version of Heikki's patch to use a parser for
 the walsender commands, instead of parsing things manually. It also
 does some minor refactoring in walsender.c to break out
 IdentifySystem() and StartReplication() to their own functions to make
 it more readable.

Nice work.

 While having an actual parser here isn't *necessary* at this point, it
 makes things easier. And it will become increasingly useful as we add
 new features (for example, the include all wal files option for
 streaming base backup, and I'm sure that sync rep will require some
 additional commands or changes to commands).

Is that option on the roadmap for 9.1? That's huge! Go Magnus!

 Any objections to doing this?

None here :)
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Streaming base backups

2011-01-10 Thread Stefan Kaltenbrunner

On 01/10/2011 08:13 PM, Cédric Villemain wrote:

2011/1/10 Magnus Hagandermag...@hagander.net:

On Sun, Jan 9, 2011 at 23:33, Cédric Villemain
cedric.villemain.deb...@gmail.com  wrote:

2011/1/7 Magnus Hagandermag...@hagander.net:

On Fri, Jan 7, 2011 at 01:47, Cédric Villemain
cedric.villemain.deb...@gmail.com  wrote:

2011/1/5 Magnus Hagandermag...@hagander.net:

On Wed, Jan 5, 2011 at 22:58, Dimitri Fontainedimi...@2ndquadrant.fr  wrote:

Magnus Hagandermag...@hagander.net  writes:

* Stefan mentiond it might be useful to put some
posix_fadvise(POSIX_FADV_DONTNEED)
   in the process that streams all the files out. Seems useful, as long as that
   doesn't kick them out of the cache *completely*, for other backends as well.
   Do we know if that is the case?


Maybe have a look at pgfincore to only tag DONTNEED for blocks that are
not already in SHM?


I think that's way more complex than we want to go here.



DONTNEED will remove the block from OS buffer everytime.


Then we definitely don't want to use it - because some other backend
might well want the file. Better leave it up to the standard logic in
the kernel.


Looking at the patch, it is (very) easy to add the support for that in
basebackup.c
That supposed allowing mincore(), so mmap(), and so probably switch
the fopen() to an open() (or add an open() just for mmap
requirement...)

Let's go ?


Per above, I still don't think we *should* do this. We don't want to
kick things out of the cache underneath other backends, and since we


we are dropping stuff underneath other backends  anyway but I
understand your point.


can't control that. Either way, it shouldn't happen in the beginning,
and if it does, should be backed with proper benchmarks.


I agree.


well I want to point out that the link I provided upthread actually 
provides a (linux centric) way to get the property of interest for this:


* if the datablocks are in the OS buffercache just leave them alone, if
they are NOT, tell the OS that this current user is not interested in
having them there


I would like to see something like that implemented in the backend
sometime, maybe even as a GUC of some sort; that way we could actually
use it for, say, a pg_dump run as well. I have seen the response times
of big boxes tank not because of the CPU and lock load pg_dump imposes,
but because of the way it can cause the OS buffer cache to get spoiled
with not-really-important data.




anyway I agree that the (positive and/or negative) effect of something 
like that needs to be measured but this effect is not too easy to see in 
very simple setups...



Stefan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Compatibility GUC for serializable

2011-01-10 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 
 I think we've learned over the years that GUCs that significantly
 change semantics can be foot-guns.  I'm not sure exactly how
 dangerous this one would be
 
I didn't respond to this at first because the idea seemed DOA, but
with Josh's concerns I guess I should answer this question.
 
With the patch, SERIALIZABLE transactions run exactly as they did
before, and as REPEATABLE READ continue to run, except that they are
monitored for read-write conflict patterns which can cause
serialization anomalies.  This monitoring doesn't introduce any new
blocking.  The only behavior change is that there are additional
serialization failures when the monitoring detects dangerous
structures in the rw-conflicts among transactions.  The proposed GUC
would suppress the monitoring in SERIALIZABLE mode and avoid the new
serialization failures, thereby providing legacy behavior --
anomalies and all.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] estimating # of distinct values

2011-01-10 Thread Csaba Nagy
On Fri, 2011-01-07 at 12:32 +0100, t...@fuzzy.cz wrote:
 the problem is you will eventually need to drop the results and rebuild
 it, as the algorithms do not handle deletes (ok, Florian mentioned an
 algorithm L_0 described in one of the papers, but I'm not sure we can use
 it).

Yes, but even then you can start with much better cards if you already
have an estimate of what it looks like, based on the fact that you did
continuous updating of it. For example you'll have a pretty good
estimate of the bounds of the number of distinct values, while if you
really start from scratch you have nothing to start with but assume that
you must cope with the complete range between all values are distinct -
there's only a few of them.

 I'm not sure a constantly running background process is a good idea. I'd
 prefer storing an info about the modified tuples somewhere, and starting
 analyze only when a given threshold is reached. I'm not sure how to do
 that, though.
 
 Another thing I'm not sure about is where to store those intermediate
 stats (used to get the current estimate, updated incrementally).

The forks implementation proposed in other responses is probably the
best idea if usable. It will also solve you the problem of memory
limitations, at the expense of more resources used if the structure
grows too big, but it will still be probably fast enough if it stays
small and in cache.

Cheers,
Csaba.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in pg_describe_object (was: Re: [HACKERS] obj_unique_identifier(oid))

2011-01-10 Thread Tom Lane
Andreas Karlsson andr...@proxel.se writes:
 Here is the bug-fix patch again with a description of the context so I
 can add it to the commit fest.

 Joel Jacobson discovered a bug in the function pg_describe_object where
 it does not produce unique identifiers for some entries in pg_amproc.

There was never any intention that that code produce a guaranteed-unique
identifier; it's only meant to be a humanly useful identifier, and this
patch seems to me to mostly add noise.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] pl/python quoting functions

2011-01-10 Thread Jan Urbański
Here's a patch that adds a few PL/Python functions for quoting strings.
It's an incremental patch on top of the plpython-refactor patch sent in
http://archives.postgresql.org/message-id/4d135170.3080...@wulczer.org.

Git branch for this patch:
https://github.com/wulczer/postgres/tree/functions

The new functions are plpy.quote_literal, plpy.quote_nullable and
plpy.quote_ident, and work just like their sql or plperl equivalents.

Cheers,
Jan
diff --git a/src/pl/plpython/Makefile b/src/pl/plpython/Makefile
index 16d78ae..292e360 100644
*** a/src/pl/plpython/Makefile
--- b/src/pl/plpython/Makefile
*** REGRESS = \
*** 79,84 
--- 79,85 
  	plpython_types \
  	plpython_error \
  	plpython_unicode \
+ 	plpython_quote \
  	plpython_drop
  # where to find psql for running the tests
  PSQLDIR = $(bindir)
diff --git a/src/pl/plpython/expected/plpython_quote.out b/src/pl/plpython/expected/plpython_quote.out
index ...b33ee3f .
*** a/src/pl/plpython/expected/plpython_quote.out
--- b/src/pl/plpython/expected/plpython_quote.out
***
*** 0 
--- 1,87 
+ -- test quoting functions
+ CREATE FUNCTION quote(t text, how text) RETURNS text AS $$
+ if how == "literal":
+     return plpy.quote_literal(t)
+ elif how == "nullable":
+     return plpy.quote_nullable(t)
+ elif how == "ident":
+     return plpy.quote_ident(t)
+ else:
+     raise plpy.Error("unrecognized quote type %s" % how)
+ $$ LANGUAGE plpythonu;
+ SELECT quote(t, 'literal') FROM (VALUES
+('abc'),
+('a''bc'),
+('''abc'''),
+(''),
+(),
+('xyzv')) AS v(t);
+quote   
+ ---
+  'abc'
+  'a''bc'
+  '''abc'''
+  ''
+  
+  'xyzv'
+ (6 rows)
+ 
+ SELECT quote(t, 'nullable') FROM (VALUES
+('abc'),
+('a''bc'),
+('''abc'''),
+(''),
+(),
+(NULL)) AS v(t);
+quote   
+ ---
+  'abc'
+  'a''bc'
+  '''abc'''
+  ''
+  
+  NULL
+ (6 rows)
+ 
+ SELECT quote(t, 'ident') FROM (VALUES
+('abc'),
+('a b c'),
+('a  ''abc''')) AS v(t);
+ quote 
+ --
+  abc
+  a b c
+  a  'abc'
+ (3 rows)
+ 
+ -- test errors
+ SELECT quote(NULL::text, 'literal');
+ ERROR:  TypeError: argument 1 must be string, not None
+ CONTEXT:  PL/Python function quote
+ SELECT quote(NULL::text, 'ident');
+ ERROR:  TypeError: argument 1 must be string, not None
+ CONTEXT:  PL/Python function quote
+ SELECT quote('abc', 'random');
+ ERROR:  plpy.Error: unrecognized quote type random
+ CONTEXT:  PL/Python function quote
+ DO $$ plpy.quote_literal(3) $$ LANGUAGE plpythonu;
+ ERROR:  TypeError: argument 1 must be string, not int
+ CONTEXT:  PL/Python anonymous code block
+ DO $$ plpy.quote_literal(None) $$ LANGUAGE plpythonu;
+ ERROR:  TypeError: argument 1 must be string, not None
+ CONTEXT:  PL/Python anonymous code block
+ DO $$ plpy.quote_literal({'a': 'b'}) $$ LANGUAGE plpythonu;
+ ERROR:  TypeError: argument 1 must be string, not dict
+ CONTEXT:  PL/Python anonymous code block
+ DO $$ plpy.quote_literal() $$ LANGUAGE plpythonu;
+ ERROR:  TypeError: function takes exactly 1 argument (0 given)
+ CONTEXT:  PL/Python anonymous code block
+ DO $$ plpy.quote_literal(1, 2) $$ LANGUAGE plpythonu;
+ ERROR:  TypeError: function takes exactly 1 argument (2 given)
+ CONTEXT:  PL/Python anonymous code block
+ DO $$ plpy.quote_nullable([1, 2]) $$ LANGUAGE plpythonu;
+ ERROR:  TypeError: argument 1 must be string or None, not list
+ CONTEXT:  PL/Python anonymous code block
+ DO $$ plpy.quote_ident(1.5) $$ LANGUAGE plpythonu;
+ ERROR:  TypeError: argument 1 must be string, not float
+ CONTEXT:  PL/Python anonymous code block
diff --git a/src/pl/plpython/expected/plpython_test.out b/src/pl/plpython/expected/plpython_test.out
index d92c987..961f6c0 100644
*** a/src/pl/plpython/expected/plpython_test.out
--- b/src/pl/plpython/expected/plpython_test.out
*** contents.sort()
*** 43,51 
  return ", ".join(contents)
  $$ LANGUAGE plpythonu;
  select module_contents();
!   module_contents  
! ---
!  Error, Fatal, SPIError, debug, error, execute, fatal, info, log, notice, prepare, warning
  (1 row)
  
  CREATE FUNCTION elog_test() RETURNS void
--- 43,51 
  return ", ".join(contents)
  $$ LANGUAGE plpythonu;
  select module_contents();
! module_contents
! ---
!  Error, Fatal, SPIError, debug, error, execute, fatal, info, log, notice, prepare, quote_ident, quote_literal, quote_nullable, warning
  (1 row)
  
  CREATE FUNCTION elog_test() RETURNS void
diff --git a/src/pl/plpython/plpython.c 

Re: [HACKERS] Bug in pg_describe_object (was: Re: [HACKERS] obj_unique_identifier(oid))

2011-01-10 Thread Joel Jacobson
2011/1/10 Tom Lane t...@sss.pgh.pa.us:
 There was never any intention that that code produce a guaranteed-unique
 identifier; it's only meant to be a humanly useful identifier, and this
 patch seems to me to mostly add noise.

For all objects, except for these pg_amproc entries, the function
already generates unique strings. They are guaranteed to be unique
because every component of the unique constraints in all pg_*
tables is included in the unique text identifier.

It makes a lot more sense to fix the function to return a unique
string also for pg_amproc than to introduce an entirely new function
which returns a unique string identifier. It would hardly break
anything and I think you exaggerate the noise factor.

I can think of numerous reasons why it is absolutely necessary to
provide a function generating unique identifiers for objects:

a) To allow comparing all objects in two different databases, by
comparing objects with the same identifier. This cannot be done using
the oids, since they naturally differ between databases.

b) To draw nice human readable digraphs in the .dot format , instead
of drawing relations digraphs of classid.objid.subobjid.

c) OIDs are probably misused in a lot of applications, due to
misunderstandings of what they are and are not. I for one didn't know
they are not necessarily unique, but only within their regclass. It
would be better to encourage users to use a text string if they need
to refer to a unique object in their application, rather than force them
to use OIDs (or OIDs in combination with the regclass, almost as bad),
for lack of something better.
While you could build your own query to generate a unique string,
based on all the columns defining the unique constraint for each
class, doing so is very cumbersome and requires a lot of
postgres-guru-knowledge.

I think it would be a big improvement and increase the number of
possible use cases of the existing pg_describe_object function if the
documentation said the returned text is guaranteed to be unique
for each object.
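
To make use cases (a) and (b) concrete, the kind of query I have in
mind is roughly the sketch below; whether the description columns are
truly unique is exactly the point at issue:

-- Dump the dependency graph with textual identifiers, so that two
-- databases can be diffed (or a .dot digraph drawn) without relying on OIDs.
SELECT pg_describe_object(d.classid, d.objid, d.objsubid)          AS node,
       pg_describe_object(d.refclassid, d.refobjid, d.refobjsubid) AS referenced,
       d.deptype
FROM pg_depend d
WHERE d.deptype <> 'p';   -- skip pinned system entries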

-- 
Best regards,

Joel Jacobson
Glue Finance

E: j...@gluefinance.com
T: +46 70 360 38 01

Postal address:
Glue Finance AB
Box  549
114 11  Stockholm
Sweden

Visiting address:
Glue Finance AB
Birger Jarlsgatan 14
114 34 Stockholm
Sweden

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Compatibility GUC for serializable

2011-01-10 Thread Kevin Grittner
I wrote:
 
 The proposed GUC would suppress the monitoring in SERIALIZABLE
 mode and avoid the new serialization failures, thereby providing
 legacy behavior -- anomalies and all.
 
After posting that I realized that there's no technical reason that
such a GUC couldn't be set within each session as desired, as long
as we disallowed changes after the first snapshot of a transaction
was acquired.  The IsolationIsSerializable() macro could be modified
to use that along with XactIsoLevel.
 
Really, the biggest risk of such a GUC is the confusion factor when
supporting people.  If we're told that the transactions involved in
some scenario were all run at the SERIALIZABLE isolation level, we
would need to wonder how many *really* were, and how many were (as
David put it) at the NOTREALLYSERIALIZABLEBUTLABELEDASSERIALIZABLE
isolation level?
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Add function dependencies

2011-01-10 Thread Joel Jacobson
When a function is created, the system validates the syntax and
complains if any function the created function attempts to call is
missing.
I think this is really good, since it traps typos and warns you if you
have forgotten to install any functions your function depends on.

It would be equally useful if it warned you when trying to drop a
function other functions might depend on.

Currently, I do something like SELECT * FROM pg_proc WHERE prosrc LIKE
'%myfunc%' to verify nothing is using the function I'm about to drop.
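
Spelled out, that check looks something like this ("myfunc" is just an
example name; it is crude and prone to false positives on substring
matches, which is exactly why real dependency tracking would be nicer):

SELECT n.nspname, p.proname
FROM pg_proc p
JOIN pg_namespace n ON n.oid = p.pronamespace
WHERE p.prosrc LIKE '%myfunc%';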

Just like you can disable the creation-time check by setting
check_function_bodies to false, I would suggest a similar option
to turn off this dependency check when dropping functions.

Additionally, if pg_depend would reveal function dependencies, it
would be trivial to automatically generate function call digraphs in
.dot format, showing a nice call tree of your entire system.

-- 
Best regards,

Joel Jacobson
Glue Finance

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SQL/MED - file_fdw

2011-01-10 Thread Shigeru HANADA
On Fri, 7 Jan 2011 10:57:17 +0900
Itagaki Takahiro itagaki.takah...@gmail.com wrote:
 I updated the COPY FROM API patch.
 - GetCopyExecutorState() is removed because FDWs will use their own context.
 
 The patch just rearranges codes for COPY FROM to export those functions.
 It also modifies some of COPY TO codes internally for code readability.
 - BeginCopyFrom(rel, filename, attnamelist, options)
 - EndCopyFrom(cstate)
 - NextCopyFrom(cstate, OUT values, OUT nulls, OUT tupleOid)
 - CopyFromErrorCallback(arg)

For the purpose of file_fdw, additional ResetCopyFrom() would be
necessary. I'm planning to include such changes in file_fdw patch. 
Please find attached partial patch for ResetCopyFrom(). Is there
anything else which should be done at reset?

 Some items to be considered:
 - BeginCopyFrom() could receive filename as an option instead of a separated
 argument. If do so, file_fdw would be more simple, but it's a change only for
 file_fdw. COPY commands in the core won't be improved at all.

 - NextCopyFrom() returns values/nulls arrays rather than a HeapTuple. I expect
 the caller store the result into tupletableslot with ExecStoreVirtualTuple().
 It is designed for performance, but if the caller always needs an materialized
 HeapTuple, HeapTuple is better for the result type.

IIUC, materializing is for the tableoid system column. If we could add
tts_tableoid into TupleTableSlot, a virtual tuple would be enough.  In
this design, the caller can receive results in the tts_values/tts_isnull
arrays.

Regards,
--
Shigeru Hanada


20110110-ResetCopyFrom.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pl/python custom exceptions for SPI

2011-01-10 Thread Hannu Krosing

On 10.1.2011 17:20, Jan Urbański wrote:

On 23/12/10 15:40, Jan Urbański wrote:

Here's a patch implementing custom Python exceptions for SPI errors
mentioned in
http://archives.postgresql.org/pgsql-hackers/2010-12/msg01991.php. It's
an incremental patch on top of the explicit-subxacts patch sent eariler.

I changed that patch to use Perl instead of sed to generate the
exceptions, which should be more portable.

Why not python ?

  It's still not nice, and I
think the way forward is to have a common format for SQLSTATE
conditions, as proposed in
http://archives.postgresql.org/message-id/4d19c93c.5000...@wulczer.org.

I failed to follow on with that patch because I couldn't figure out how
to persuade the buildsystem to generate errcodes.h early enough for the
rest of the system to compile, not to mention doing it for the MSVC
build system.

Cheers,
Jan






--

Hannu Krosing
Senior Consultant,
Infinite Scalability & Performance
http://www.2ndQuadrant.com/books/



Re: [HACKERS] Streaming base backups

2011-01-10 Thread Cédric Villemain
2011/1/10 Stefan Kaltenbrunner ste...@kaltenbrunner.cc:
 On 01/10/2011 08:13 PM, Cédric Villemain wrote:

 2011/1/10 Magnus Hagandermag...@hagander.net:

 On Sun, Jan 9, 2011 at 23:33, Cédric Villemain
 cedric.villemain.deb...@gmail.com  wrote:

 2011/1/7 Magnus Hagandermag...@hagander.net:

 On Fri, Jan 7, 2011 at 01:47, Cédric Villemain
 cedric.villemain.deb...@gmail.com  wrote:

 2011/1/5 Magnus Hagandermag...@hagander.net:

 On Wed, Jan 5, 2011 at 22:58, Dimitri
 Fontainedimi...@2ndquadrant.fr  wrote:

 Magnus Hagandermag...@hagander.net  writes:

 * Stefan mentiond it might be useful to put some
 posix_fadvise(POSIX_FADV_DONTNEED)
   in the process that streams all the files out. Seems useful, as
 long as that
   doesn't kick them out of the cache *completely*, for other
 backends as well.
   Do we know if that is the case?

 Maybe have a look at pgfincore to only tag DONTNEED for blocks that
 are
 not already in SHM?

 I think that's way more complex than we want to go here.


 DONTNEED will remove the block from OS buffer everytime.

 Then we definitely don't want to use it - because some other backend
 might well want the file. Better leave it up to the standard logic in
 the kernel.

 Looking at the patch, it is (very) easy to add the support for that in
 basebackup.c
 That supposed allowing mincore(), so mmap(), and so probably switch
 the fopen() to an open() (or add an open() just for mmap
 requirement...)

 Let's go ?

 Per above, I still don't think we *should* do this. We don't want to
 kick things out of the cache underneath other backends, and since we

 we are dropping stuff underneath other backends  anyway but I
 understand your point.

 can't control that. Either way, it shouldn't happen in the beginning,
 and if it does, should be backed with proper benchmarks.

 I agree.

 well I want to point out that the link I provided upthread actually provides
 a (linux centric) way to get the property of interest for this:

yes, it is exactly what we are talking about here.
mincore and posix_fadvise.

freeBSD should allow that later, at least it is in the todo list
Windows may allow that too with different API.


 * if the datablocks are in the OS buffercache just leave them alone, if they
 are NOT, tell the OS that this current user is not interested in having them
 there

my experience is that posix_fadvise on a specific block behaves more
brutally than flagging a whole file. In the latter case it may not do
what you want if the kernel estimates it is not welcome (because of
other IO requests).

What Magnus points out is that other backends execute queries and
request blocks (and load them into PostgreSQL's shared buffers), and it
is *hard* to be sure we don't remove blocks just loaded by another
backend (the worst case being flushing prefetched blocks not yet in
shared buffers, cf. effective_io_concurrency).


 I would like to see something like that implemented in the backend sometime,
 maybe even as a GUC of some sort; that way we could actually use it for, say,
 a pg_dump run as well. I have seen the response times of big boxes tank not
 because of the CPU and lock load pg_dump imposes, but because of the way it
 can cause the OS buffer cache to get spoiled with not-really-important data.

Glad to hear that; pgfincore is also a POC about those topics.
The best solution is to mmap in postgres, but it is not possible, so we
have to take a snapshot of the objects and restore them afterwards (again,
*it is* what Tobias does with his rsync). Side note: because of readahead,
inspecting block by block while you read the file provides bad results (or
you need to fadvise POSIX_FADV_RANDOM to remove the readahead behavior,
which is not good at all).


 anyway I agree that the (positive and/or negative) effect of something like
 that needs to be measured but this effect is not too easy to see in very
 simple setups...

yes. And with pg_basebackup, copying 1GB over the network takes longer
than 2 seconds, so we will probably need a specific strategy.


-- 
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in pg_describe_object

2011-01-10 Thread Josh Berkus

 There was never any intention that that code produce a guaranteed-unique
 identifier; it's only meant to be a humanly useful identifier, and this
 patch seems to me to mostly add noise.

Would making the identifier unique do any *harm*?

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SQL/MED - file_fdw

2011-01-10 Thread Tom Lane
Shigeru HANADA han...@metrosystems.co.jp writes:
 For the purpose of file_fdw, additional ResetCopyFrom() would be
 necessary. I'm planning to include such changes in file_fdw patch. 
 Please find attached partial patch for ResetCopyFrom(). Is there
 anything else which should be done at reset?

Seems like it would be smarter to close and re-open the copy operation.
Adding a reset function is just creating an additional maintenance
burden and point of failure, for what seems likely to be a negligible
performance benefit.

If you think it's not negligible, please show some proof of that before
asking us to support such code.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pl/python custom exceptions for SPI

2011-01-10 Thread Tom Lane
Hannu Krosing ha...@2ndquadrant.com writes:
 On 10.1.2011 17:20, Jan Urbański wrote:
 I changed that patch to use Perl instead of sed to generate the
 exceptions, which should be a more portable.

 Why not python ?

Because we're not adding even more different tool requirements to the
build process.  Perl is what we've chosen to depend on, and there is no
reason to use a different tool here.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Streaming base backups

2011-01-10 Thread Cédric Villemain
2011/1/10 Magnus Hagander mag...@hagander.net:
 On Sun, Jan 9, 2011 at 23:33, Cédric Villemain
 cedric.villemain.deb...@gmail.com wrote:
 2011/1/7 Magnus Hagander mag...@hagander.net:
 On Fri, Jan 7, 2011 at 01:47, Cédric Villemain
 cedric.villemain.deb...@gmail.com wrote:
 2011/1/5 Magnus Hagander mag...@hagander.net:
 On Wed, Jan 5, 2011 at 22:58, Dimitri Fontaine dimi...@2ndquadrant.fr 
 wrote:
 Magnus Hagander mag...@hagander.net writes:
 * Stefan mentiond it might be useful to put some
 posix_fadvise(POSIX_FADV_DONTNEED)
   in the process that streams all the files out. Seems useful, as long 
 as that
   doesn't kick them out of the cache *completely*, for other backends 
 as well.
   Do we know if that is the case?

 Maybe have a look at pgfincore to only tag DONTNEED for blocks that are
 not already in SHM?

 I think that's way more complex than we want to go here.


 DONTNEED will remove the block from OS buffer everytime.

 Then we definitely don't want to use it - because some other backend
 might well want the file. Better leave it up to the standard logic in
 the kernel.

 Looking at the patch, it is (very) easy to add the support for that in
 basebackup.c
 That supposed allowing mincore(), so mmap(), and so probably switch
 the fopen() to an open() (or add an open() just for mmap
 requirement...)

 Let's go ?

 Per above, I still don't think we *should* do this. We don't want to
 kick things out of the cache underneath other backends, and since we
 can't control that. Either way, it shouldn't happen in the beginning,
 and if it does, should be backed with proper benchmarks.

 I've committed the backend side of this, without that. Still working
 on the client, and on cleaning up Heikki's patch for grammar/parser
 support.

Attached is a small patch fixing -d basedir when it's called with an
absolute path.
Maybe we can use pg_mkdir_p() instead of mkdir?


 --
  Magnus Hagander
  Me: http://www.hagander.net/
  Work: http://www.redpill-linpro.com/




-- 
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 098f330..149a2ff 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -257,11 +257,6 @@ ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum)
 	 */
 	verify_dir_is_empty_or_create(current_path);
 
-	if (current_path[0] == '/')
-	{
-		current_path[0] = '_';
-	}
-
 	/*
 	 * Get the COPY data
 	 */

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Compatibility GUC for serializable

2011-01-10 Thread Jeff Davis
On Mon, 2011-01-10 at 11:29 -0800, Josh Berkus wrote:
 On 1/10/11 10:47 AM, Kevin Grittner wrote:
  If they're not using SERIALIZABLE, this patch will have no impact on
  them at all.  If they are using SELECT FOR UPDATE *with*
  SERIALIZABLE, everything will function exactly as it is except that
  there may be some serialization failures which they weren't getting
  before, either from the inevitable (but hopefully minimal) false
  positives inherent in the technique or because they missed covering
  something.
 
 Right, that's what I'm worried about.

If we must have a GUC, perhaps we could publish a sunset one release in
the future.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Compatibility GUC for serializable

2011-01-10 Thread Josh Berkus

 If we must have a GUC, perhaps we could publish a sunset one release in
 the future.

I was thinking default to false/off in 9.1, and disappear in 9.3.

 Really, the biggest risk of such a GUC is the confusion factor when
 supporting people.  If we're told that the transactions involved in
 some scenario were all run at the SERIALIZABLE isolation level, we
 would need to wonder how many *really* were, and how many were (as
 David put it) at the NOTREALLYSERIALIZABLEBUTLABELEDASSERIALIZABLE
 isolation level?

How is this different from our other backwards-compatibility GUCs?

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in pg_describe_object

2011-01-10 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 There was never any intention that that code produce a guaranteed-unique
 identifier; it's only meant to be a humanly useful identifier, and this
 patch seems to me to mostly add noise.

 Would making the identifier unique do any *harm*?

It would make dependency error messages significantly longer and less
readable.  Quite aside from the point at hand here, we elide schema
names in many cases (and it looks like there are some code paths where
getObjectDescription never bothers to print them at all).  Another issue
that might make it interesting to try to use the output for purposes
other than human-readable descriptions is that we localize all the
phrases involved.

My point is that this isn't a bug fix, it's more like moving the
goalposts on what getObjectDescription is supposed to do.  And I'm not
even very sure where they're being moved to.  I haven't seen a
specification for an intended use of pg_describe_object for which its
existing behavior would be unsatisfactory.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Compatibility GUC for serializable

2011-01-10 Thread Robert Haas
On Mon, Jan 10, 2011 at 1:17 PM, Josh Berkus j...@agliodbs.com wrote:
 I'm going to disagree here. For a large, sprawling, legacy application
 changing SERIALIZABLE to REPEATABLE READ in every place in the code
 which might call it can be prohibitively difficult.

What makes you think that would be necessary?  That'd require someone
(a) using serializable, and (b) wanting it to be broken?  I think the
most common reaction would be "thank goodness, this thing actually
works now".

 Further, many such
 applications would be written with workarounds for broken serializable
 behavior, workarounds which would behave unpredictably after an upgrade.

Uh...  you want to support that with an example?  Because my first
reaction is "that's FUD".

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Compatibility GUC for serializable

2011-01-10 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 How is this different from our other backwards-compatibility GUCs?

Mainly, that it's not clear we need it.  Nobody's pointed to a concrete
failure mechanism that makes it necessary for an existing app to run
under fake-SERIALIZABLE mode.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN indexscans versus equality selectivity estimation

2011-01-10 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 On 1/10/11 7:25 AM, Tom Lane wrote:
 I'm a bit worried though that there might be other
 cases where the estimator comes up with 1.0 selectivity but it'd still
 be worth considering a bitmap scan.

 Well, I think the answer is to apply the other fixes, and test.  If
 there are other cases of selectivity=1.0, they'll show up.  People are
 pretty fast to complain if indexes aren't used, and we have a good
 production test case available once you implement the other operators.

Implement the other operators?  I don't think we're on the same page
here.  What I'm talking about is a one-line change in indxpath.c to not
short-circuit consideration of a bitmap indexscan.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Compatibility GUC for serializable

2011-01-10 Thread Kevin Grittner
Josh Berkus  wrote:
 
 Really, the biggest risk of such a GUC is the confusion factor
 when supporting people.
 
 How is this different from our other backwards-compatibility GUCs?
 
I thought Tom might be concerned about such a GUC destabilizing
things in other ways.  I just wanted to make clear how unlikely that
was in this case.  I agree that the risk of confusion in support is
always there with a backwards-compatibility GUC.
 
I'm still not taking a position either way on this, since I can see
the merit of both arguments and it has little impact on me,
personally.  I'm just trying to be up-front about things so people
can make an informed decision.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN indexscans versus equality selectivity estimation

2011-01-10 Thread Robert Haas
On Mon, Jan 10, 2011 at 10:25 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Sun, Jan 9, 2011 at 6:38 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 or we could hack eqsel() to bound the no-stats estimate to a bit less
 than 1.

 This seems like a pretty sensible thing to do.  I can't immediately
 imagine a situation in which 1.0 is a sensible selectivity estimate in
 the no-stats case and 0.90 (say) is a major regression.

 After sleeping on it, that seems like my least favorite option.  It's
 basically a kluge, as is obvious because there's no principled way to
 choose what the bound is (or the minimum result from
 get_variable_numdistinct, if we were to hack it there).

Well, the general problem is that we have no reasonable way of
handling planning uncertainty.  We have no way of throwing our hands
up in the air and saying "I really have no clue how many rows are
going to come out of that node"; as far as the rest of the planning
process is concerned, a selectivity estimate of 0.005 based on
column = some MCV with a frequency of 0.005 is exactly identical
to one that results from a completely inscrutable equality condition.
So while I agree with you that there's no particular principled way to
choose the exact value, that doesn't strike me as a compelling
argument against fixing some value.  ISTM that selectivity estimates
of exactly 0 and exactly 1 ought to be viewed with a healthy dose of
suspicion.

 I'm currently
 leaning to the idea of tweaking the logic in indxpath.c; in particular,
 why wouldn't it be a good idea to force consideration of the bitmap path
 if the index type hasn't got amgettuple?  If we don't, then we've
 completely wasted the effort spent up to that point inside
 find_usable_indexes.

I guess the obvious question is: why wouldn't it be a good idea to
force consideration of the bitmap path even if the index type DOES
have amgettuple?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] system views for walsender activity

2011-01-10 Thread Robert Haas
On Mon, Jan 10, 2011 at 10:41 AM, Simon Riggs si...@2ndquadrant.com wrote:
  I think we need a status enum. ('BACKUP', 'CATCHUP', 'STREAM') for the 3
  phases of replication.
 
  That seems reasonable. But if we keep BACKUP in there, should we
  really have it called pg_stat_replication? (yeah, I know, I'm not
  giving up :P)
 
  (You'd need a 4th mode for WAITING or so, to indicate it's waiting for
  a command)
 
  That's something different.
 
  The 3 phases are more concrete.
 
  BACKUP --  CATCHUP---  STREAM
 
  When you connect you either do BACKUP or CATCHUP. Once in CATCHUP mode
  you never issue a BACKUP. Once we have caught up we move to STREAM. That
  has nothing to do with idle/active.

 So how does a walsender that's waiting for a command from the client
 show up? Surely it's not in catchup mode yet?

 There is a trivial state between connect and first command. If you think
 that is worth publishing, feel free. STARTING?

I think it's worth publishing.  STARTING would be OK, or maybe STARTUP
to parallel the other two -UP states.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN indexscans versus equality selectivity estimation

2011-01-10 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, Jan 10, 2011 at 10:25 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 I'm currently
 leaning to the idea of tweaking the logic in indxpath.c; in particular,
 why wouldn't it be a good idea to force consideration of the bitmap path
 if the index type hasn't got amgettuple?  If we don't, then we've
 completely wasted the effort spent up to that point inside
 find_usable_indexes.

 I guess the obvious question is: why wouldn't it be a good idea to
 force consideration of the bitmap path even if the index type DOES
 have amgettuple?

Well, the motivation is what the code comment said: not to waste time
uselessly considering the bitmap form of an indexscan whose only reason
to live was to produce output sorted in a particular way.  That's
irrelevant for GIN of course, but it's entirely relevant for btree.

It might be just useless over-optimization, but I don't think so --
choose_bitmap_and is O(N^2) in the number of paths submitted to it,
so adding a lot of uninteresting paths doesn't seem smart.

A small variant of the approach would be to only reject paths that have
non-empty pathkeys.  That's not a *sufficient* condition, because a path
could have both pathkeys and good selectivity --- but it could be added
onto the selectivity test.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Compatibility GUC for serializable

2011-01-10 Thread Josh Berkus

 Mainly, that it's not clear we need it.  Nobody's pointed to a concrete
 failure mechanism that makes it necessary for an existing app to run
 under fake-SERIALIZABLE mode.

I think it's quite possible that you're right, and nobody depends on
current SERIALIZABLE behavior because it's undependable.  However, we
don't *know* that -- most of our users aren't on the mailing lists,
especially those who use packaged vendor software.

That being said, the case for a backwards-compatiblity GUC is weak, and
I'd be ok with not having one barring someone complaining during beta,
or survey data showing that there's more SERIALIZABLE users than we think.

Oh, survey:
http://www.postgresql.org/community/

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSI patch(es)

2011-01-10 Thread Dan Ports
On Sat, Jan 08, 2011 at 10:20:22PM -0600, Kevin Grittner wrote:
 One thing that would help a lot besides code review is performance
 testing.  I did some months ago and I know Dan booked time on MIT
 benchmarking systems and got good numbers, but with the refactoring
 it would be good to redo that, and benchmarking properly can be very
 time consuming.  Existing benchmark software might need to be tweaked
 to retry transactions which fail with SQLSTATE 40001, or at least
 continue on with out counting those in TPS figures, since
 applications using this feature will generally have frameworks which
 automatically do retries for that SQLSTATE.

I can certainly try to get a more complete set of DBT-2 results -- and
possibly even do that in a timely manner :-) -- but I doubt I'll have
time in the near future to do anything more comprehensive.

It would be great to have some more results beyond DBT-2/TPC-C.
Although it's certainly an interesting benchmark, it's known not to
exhibit any serialization anomalies under snapshot isolation. (And, of
course, it's seek-bound, so results may not be representative of
workloads that aren't.)

Dan

-- 
Dan R. K. Ports  MIT CSAILhttp://drkp.net/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in pg_describe_object

2011-01-10 Thread Robert Haas
On Mon, Jan 10, 2011 at 7:52 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 It would make dependency error messages significantly longer and less
 readable.  Quite aside from the point at hand here, we elide schema
 names in many cases (and it looks like there are some code paths where
 getObjectDescription never bothers to print them at all).  Another issue
 that might make it interesting to try to use the output for purposes
 other than human-readable descriptions is that we localize all the
 phrases involved.

 My point is that this isn't a bug fix, it's more like moving the
 goalposts on what getObjectDescription is supposed to do.  And I'm not
 even very sure where they're being moved to.  I haven't seen a
 specification for an intended use of pg_describe_object for which its
 existing behavior would be unsatisfactory.

I think that adding the types to the description string is a pretty
sensible thing to do.  Yeah, it makes the error messages longer, but
it also tells you which objects you're actually operating on, a
non-negligible advantage.  It's fairly confusing that pg_amproc has a
four part key, two members of which reference objects which in turn
have compound names.  But leaving out two out of the four parts in the
key is not an improvement.  People aren't going to hit dependencies on
pg_amproc entries every day, but when they do they presumably want to
uniquely identify the objects in question.

Now, I agree that this is probably not quite adequate to the purpose
to which the OP proposed to put it, but that's really another
question.

One gripe I do have is that we should put the operator types in the
same place ALTER OPERATOR FAMILY puts them - immediately after the
support number, and without the word "for" - rather than all the way
at the end.
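
For reference, ALTER OPERATOR FAMILY writes it like this (example
adapted from the documentation; it would fail against the built-in
family, where this entry already exists, but it shows where the
cross-types go):

ALTER OPERATOR FAMILY integer_ops USING btree ADD
    FUNCTION 1 (int4, int2) btint42cmp(int4, int2);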

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Fwd: [TESTERS] [TEST REPORT] 9.1Alpha3 Feature E.1.4.7.2 in release notes.

2011-01-10 Thread Itagaki Takahiro
A tester reported that we don't have casts of money from/to integer
types, even though we have them from/to the numeric type.

http://archives.postgresql.org/pgsql-testers/2011-01/msg0.php

Did we have any discussions about the behavior?
I think we should have them for consistency.

-- Forwarded message --
From: Itagaki Takahiro itagaki.takah...@gmail.com
Date: Fri, Jan 7, 2011 at 16:34
Subject: Re: [TESTERS] [TEST REPORT] 9.1Alpha3 Feature E.1.4.7.2 in
release notes.
To: Ramanujam innomot...@gmail.com
Cc: pgsql-test...@postgresql.org

On Fri, Jan 7, 2011 at 15:54, Ramanujam innomot...@gmail.com wrote:
 [Release]: 9.1Alpha3. Binaries compiled with mingw-32 (gcc 4.4.0) on
 i686 without zlib support.

 [Test]: a) Check feature E.1.4.7.2 in 9.1Alpha3 release notes
 (Monetary data type). b) Documentation mistake(?)

 [Results]: Documentation states that integer literals are allowed
 values for input. I am getting the following error:

The docs say:
http://developer.postgresql.org/pgdocs/postgres/datatype-money.html
| Input is accepted in a variety of formats,
| including integer and floating-point literals

The reported issue doesn't depend on lc_monetary.
It comes from missing cast support from integer to money.

Should we have casts to/from integer for money, as we do for numeric?
It is inconsistent that 1::numeric::money is accepted but 1::money is not.


postgres=# SHOW lc_monetary;
 lc_monetary
-
 C
(1 row)

postgres=# SELECT 1::numeric::money;
 money
---
 $1.00
(1 row)

postgres=# SELECT 1::integer::money;
ERROR:  cannot cast type integer to money
LINE 1: SELECT 1::integer::money;
                        ^
postgres=# SELECT castsource::regtype, casttarget::regtype,
castfunc::regproc, castcontext FROM pg_cast WHERE casttarget =
'money'::regtype;
 castsource | casttarget | castfunc | castcontext
++--+-
 numeric    | money      | money    | a
(1 row)

postgres=# \df money
                         List of functions
  Schema   | Name  | Result data type | Argument data types |  Type
+---+--+-+
 pg_catalog | money | money            | numeric             | normal
(1 row)
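
Until built-in casts exist, a user-level workaround is to route the
cast through numeric; a sketch (the function and cast below are
invented here, not part of core):

CREATE FUNCTION int4_to_money(integer) RETURNS money
    AS $$ SELECT $1::numeric::money $$ LANGUAGE sql;

CREATE CAST (integer AS money)
    WITH FUNCTION int4_to_money(integer) AS ASSIGNMENT;

SELECT 1::integer::money;   -- should now give $1.00 under lc_monetary = 'C'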

-- 
Itagaki Takahiro

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in pg_describe_object

2011-01-10 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, Jan 10, 2011 at 7:52 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 My point is that this isn't a bug fix, it's more like moving the
 goalposts on what getObjectDescription is supposed to do.

 I think that adding the types to the description string is a pretty
 sensible thing to do.

Not really.  AFAIR, there are two cases that exist in practice,
depending on which AM you're talking about:

1. The recorded types match the input types of the operator/function
   (btree  hash).
2. The recorded types are always the same as the opclass's input type
   (gist  gin).

In neither case does printing those types really add much information.
That's why it's not there now.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: [TESTERS] [TEST REPORT] 9.1Alpha3 Feature E.1.4.7.2 in release notes.

2011-01-10 Thread Tom Lane
Itagaki Takahiro itagaki.takah...@gmail.com writes:
 It was reported from a tester that we don't have casts of money from/to 
 integer
 types even though we have from/to numeric type.

In most locales, the idea isn't sensible.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: [TESTERS] [TEST REPORT] 9.1Alpha3 Feature E.1.4.7.2 in release notes.

2011-01-10 Thread Itagaki Takahiro
On Tue, Jan 11, 2011 at 11:10, Tom Lane t...@sss.pgh.pa.us wrote:
 Itagaki Takahiro itagaki.takah...@gmail.com writes:
 It was reported from a tester that we don't have casts of money from/to 
 integer
 types even though we have from/to numeric type.

 In most locales, the idea isn't sensible.

The documentation says:
| Input is accepted in a variety of formats,
| including integer and floating-point literals

If we are not going to accept integers for money, we should fix the docs:
| integer and floating-point string literals
|                            ~~~~~~
Would that be better?

-- 
Itagaki Takahiro

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSI patch version 9

2011-01-10 Thread Kevin Grittner
I wrote:
 
 Last time we did stress tests, it uncovered some race conditions.
 Those were fixed at the time, and hopefully we haven't introduced
 any new ones; but it's a Very Good Thing that Dan is able to run
 some more DBT-2 tests, even if that test isn't ideal for
 highlighting SERIALIZABLE issues.
 
Dan's DBT-2 run triggered an Assert that looks like it would be
caused by a race condition in the new code that uses SLRU for
graceful degradation.  I will probably have another patch some time
tomorrow morning.
 
Sorry about this; I guess maybe I should have waited those extra few
days before posting the patch.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Compatibility GUC for serializable

2011-01-10 Thread Pavel Stehule
2011/1/11 Robert Haas robertmh...@gmail.com:
 On Mon, Jan 10, 2011 at 1:17 PM, Josh Berkus j...@agliodbs.com wrote:
 I'm going to disagree here. For a large, sprawling, legacy application
 changing SERIALIZABLE to REPEATABLE READ in every place in the code
 which might call it can be prohibitively difficult.

 What makes you think that would be necessary?  That'd require someone
 (a) using serializable, and (b) wanting it to be broken?  I think the
 most common reaction would be thank goodness, this thing actually
 works now.

it works, but not perfectly. Some important toolkits, like performance
benchmarks, don't work with PostgreSQL without failures. It's one
reason why PostgreSQL scores lower than MySQL in some enterprise
ratings. It works for current users, but it does not work well for
users who have to make a decision about migrating to PostgreSQL. I
don't see a problem with a GUC, but the GUC isn't the real issue - the
more significant problem is PostgreSQL's current serializable
implementation in general, for applications that have to work on
multiple SQL servers. It's a deal-breaker for one class of customers.

Regards

Pavel Stehule


 Further, many such
 applications would be written with workarounds for broken serializable
 behavior, workarounds which would behave unpredictably after an upgrade.

 Uh...  you want to support that with an example?  Because my first
 reaction is that's FUD.

 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company

 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Error code for terminating connection due to conflict with recovery

2011-01-10 Thread Tatsuo Ishii
 On Sat, Jan 8, 2011 at 9:52 AM, Tatsuo Ishii is...@postgresql.org wrote:
 While looking at the backend code, I realized that error code for
 terminating connection due to conflict with recovery is
 ERRCODE_ADMIN_SHUTDOWN.

 I thought the error code was for something like a human interruption,
 such as a shutdown command issued by pg_ctl. Is the usage of the error
 code appropriate?
 
 That doesn't sound right to me.  I'd have thought something in class 40.

What about:

40004 CONFLICT WITH RECOVERY conflict_with_recovery
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] casts: max double precision text double precision fails with out or range error

2011-01-10 Thread Maciej Sakrejda
Tried asking this in pgsql-general but I got no response, so I thought
I'd give hackers a shot:

postgres=# select (((1.7976931348623157081e+308)::double
precision)::text)::double precision;
ERROR:  1.79769313486232e+308 is out of range for type double precision

I'm working on a pg driver and in my float data decoder functional
tests, I ran into some errors that I eventually traced back to this
behavior. Essentially, postgres seems to cast the max normal double
(i.e., the bits of ~(1ULL<<52 | 1ULL<<63)) to text in such a manner
that it's rounded up, and the reverse cast, text-to-double-precision,
does not recognize it as being in range. Curiously, pg_dump seems to
print doubles with more precision (in both COPY and INSERT modes),
avoiding this issue. Of course I'm not expecting perfect precision in
round-tripping doubles like this (this is always dicey with IEEE
floating point anyway), but failing outright is a little ugly. Any
thoughts? Version is PostgreSQL 8.4.6 on i486-pc-linux-gnu, compiled
by GCC gcc-4.4.real (Ubuntu 4.4.3-4ubuntu5) 4.4.3, 32-bit.
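
If it helps, the extra precision from pg_dump presumably comes from
extra_float_digits; a sketch of the difference (the outputs in the
comments are what I expect, not re-verified here):

SELECT 1.7976931348623157081e+308::double precision::text;
    -- default (extra_float_digits = 0): 1.79769313486232e+308,
    -- which is out of range when cast back

SET extra_float_digits = 2;    -- what pg_dump effectively does
SELECT 1.7976931348623157081e+308::double precision::text;
    -- 1.7976931348623157e+308
SELECT ((1.7976931348623157081e+308::double precision)::text)::double precision;
    -- round-trips without the range error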

Also, although the simplest way to illustrate this problem is with
this round-trip set of casts, that's obviously a contrived use case.
However, given that the same behavior is seen in the TEXT mode output
for doubles of the FEBE protocol, I think it's a little more
noteworthy.

Thanks,
Maciek Sakrejda

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in pg_describe_object

2011-01-10 Thread Joel Jacobson
2011/1/11 Tom Lane t...@sss.pgh.pa.us:
 It would make dependency error messages significantly longer and less
 readable.  Quite aside from the point at hand here, we elide schema
 names in many cases (and it looks like there are some code paths where
 getObjectDescription never bothers to print them at all).  Another issue
 that might make it interesting to try to use the output for purposes
 other than human-readable descriptions is that we localize all the
 phrases involved.

 My point is that this isn't a bug fix, it's more like moving the
 goalposts on what getObjectDescription is supposed to do.  And I'm not
 even very sure where they're being moved to.  I haven't seen a
 specification for an intended use of pg_describe_object for which its
 existing behavior would be unsatisfactory.

Thanks for some good arguments. I now agree with you that it would be a
bit counterproductive to change the existing pg_describe_object.
Due to the localization of the phrases and the lack of mandatory
namespace inclusion, you lose the comparison ability anyway.

I instead propose we introduce a new function named
pg_get_object_unique_identifier( classid oid, objid oid, objsubid
integer ) returns text.

The name would make sense since we already have a
pg_get_function_identity_arguments( func_oid ), for a similar purpose
but solely for functions.
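
Usage would mirror pg_describe_object; hypothetically (the function is
only a proposal at this point, nothing is implemented yet):

SELECT pg_get_object_unique_identifier(classid, objid, objsubid)
FROM pg_depend
WHERE deptype = 'n';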

-- 
Best regards,

Joel Jacobson
Glue Finance

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers