Re: [HACKERS] GSSAPI doesn't play nice with non-canonical host names

2008-01-28 Thread Magnus Hagander
On Sun, Jan 27, 2008 at 09:51:48PM -0500, Tom Lane wrote:
 Stephen Frost [EMAIL PROTECTED] writes:
  * Tom Lane ([EMAIL PROTECTED]) wrote:
  Whilst trying to reproduce bug #3902 I noticed that the code doesn't
  work with an abbreviated host name:
 
  Testing w/ 8.3RC2, everything seems to be working fine here:
 
 Okay, that probably means there's something wacko about my Kerberos
 setup.  It's quite likely got something to do with the fact that I
 set up the KDC on the same machine where I'm doing the PG testing,
 which is surely a case that would never be sane in practice.
 
 [ thinks for a bit... ]  In this context there's some ambiguity as to
 whether 'rh2' should resolve as 127.0.0.1 or the machine's real IP
 address, and no doubt something is making the wrong choice someplace.
 That's probably how the localdomain lookups got into it.

Sounds likely. FWIW, DNS issues are by far the most common problem with
Kerberos installations - at least they are on Windows.

//Magnus

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] SSL connections don't cope with server crash very well at all

2008-01-28 Thread Magnus Hagander
On Sun, Jan 27, 2008 at 08:09:10PM -0500, Tom Lane wrote:
 If you do a manual kill -9 (for testing purposes) on its connected
 server process, psql normally recovers nicely:
 
 regression=# select 1;
  ?column? 
 ----------
         1
 (1 row)
 
 -- issue kill here in another window
 regression=# select 1;
 server closed the connection unexpectedly
 This probably means the server terminated abnormally
 before or while processing the request.
 The connection to the server was lost. Attempting reset: Succeeded.
 regression=# 
 
 But try it with an SSL-enabled connection, and psql just dies rudely.
 Investigation shows that it's being killed by SIGPIPE while attempting
 to clean up the failed connection:
 
 Program received signal SIGPIPE, Broken pipe.
 0x0030f7ec6e80 in __write_nocancel () from /lib64/libc.so.6
 (gdb) bt
 #0  0x0030f7ec6e80 in __write_nocancel () from /lib64/libc.so.6
 #1  0x003102497a27 in rl_filename_completion_function ()
from /lib64/libcrypto.so.6
 #2  0x003102495e5e in BIO_write () from /lib64/libcrypto.so.6
 #3  0x003877a1f449 in ssl3_write_pending () from /lib64/libssl.so.6
 #4  0x003877a1f8b6 in ssl3_dispatch_alert () from /lib64/libssl.so.6
 #5  0x003877a1d602 in ssl3_shutdown () from /lib64/libssl.so.6
 #6  0x2aac2675 in close_SSL (conn=0x642d60) at fe-secure.c:1095
 #7  0x2aabb483 in pqReadData (conn=0x642d60) at fe-misc.c:719
 #8  0x2aaba9b8 in PQgetResult (conn=0x642d60) at fe-exec.c:1223
 #9  0x2aabaa8e in PQexecFinish (conn=0x642d60) at fe-exec.c:1452
 #10 0x004075b7 in SendQuery (query=value optimized out)
 at common.c:853
 #11 0x00409cf3 in MainLoop (source=0x30f8151680) at mainloop.c:225
 #12 0x0040c3dc in main (argc=value optimized out, argv=0x100)
 at startup.c:352
 
 Apparently we need to do the SIGPIPE disable/enable dance around
 SSL_shutdown() as well as SSL_write().  I wonder whether we don't need
 it around SSL_read() as well --- I seem to recall that OpenSSL might
 either read or write the socket within SSL_read(), due to various corner
 cases in the SSL protocol.
 
 Comments?

Yes, AFAIK it can end up doing writes - for example if it has to
re-negotiate the encryption key.

So yes, I think we need it around both.

//Magnus

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] RFC: array_agg() per SQL:200n

2008-01-28 Thread Gregory Stark
Neil Conway [EMAIL PROTECTED] writes:

 AFAIK the conclusion reached by the previous thread was that to be type
 safe, you'd need one distinct pseudotype per aggregate function, along
 with some way to let the planner distinguish this class of pseudotypes
 from other types (in order to apply the heuristic that functions like
 these are likely to consume more memory). You could identify this class
 by an additional column in pg_type, but I think we'd need a lot of
 machinery to do this properly (e.g. to allow these types to be created
 via SQL). I wonder if this isn't over-engineering: the simple approach
 originally followed by Stephen Frost was to declare the transition value
 as, say, int8, and just disallow the transition and final functions from
 being called outside an aggregate context. AFAIK this would be safe,
 although of course it is ugly.

The alternative is to use the regular array type and give its
implementation some magic behind the scenes.

I was already thinking we might need some magic like this for read-only cases
like:

select * where i in array[1,3,5,...]

or 

for i in 1..n loop
  var := arrayvar[i];
  ...
end loop;

Both of these are O(n^2) (assuming the size of the array and the number of
loop iterations are both n). Each array IN scan or index lookup is O(n).

These cases might be easier to deal with; my idea was to memoize the array
contents in a hash data structure referenced by the parse tree's fn_extra
pointer. The array functions would check their function call site's fn_extra
pointer to see if the array has previously been cached in the more efficient
form and is the same array, and then use either hash probes for the IN case
or a C datum array for the loop case.

Could the same be done by the aggregate call site where the aggregate's type
is a plain anyarray like normal, but the array_accum call would look at the
call site and stash the actual contents there in a linked list or tuplesort?
The actual anyarray data type would just have a flag saying the data's over
there.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] plperl: Documentation on BYTEA decoding is wrong

2008-01-28 Thread Florian Weimer
* Robert Treat:

 Note we've been using Theo's plperl bytea patch on one of our
 production servers for some time; if anyone wants access to that
 lmk.

I'm interested.  Could you post a pointer to this code, please?

-- 
Florian Weimer[EMAIL PROTECTED]
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable

2008-01-28 Thread Zeugswetter Andreas ADI SD

  I liked the synchronized_sequential_scans idea myself.
 
  I think that's a bit too long. How about synchronized_scans, or
  synchronized_seqscans?
 
 We have enable_seqscan already, so that last choice seems to fit in.

Yes, looks good. How about synchronized_seqscan, without the plural?

Andreas


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Proposal: Integrity check

2008-01-28 Thread Zdenek Kotala

Simon Riggs wrote:

On Fri, 2008-01-25 at 17:56 +0100, Zdenek Kotala wrote:
Regarding Robert Mach's work on data integrity checking during the Google
Summer of Code: I would like to improve the storage module and bring some
of Robert's code into the core.


I would like to make the following modifications:

1) Add a ReadBuffer_noerror function (suggest a better name) which will
accept a damaged page without raising an error. The page will be marked as
corrupted, and when ReadBuffer later touches it, it will be handled in the
standard way.


This is important so that check and repair functions can process the whole
table without interruption.


We shouldn't let duff data into shared buffers at all.


As Tom mentioned before, I agree; it could cause a lot of problems.


I think you could mix the two methods of reading buffers

- start a subtransaction
- read blocks into shared buffers
- if error, then re-read block into private memory and examine
- carry on thru table in a new subtransaction


It seems like a good idea.


OK with other points, except I don't want a new command. Let's do it as
a function that can accept block ranges to check, not just whole tables.
e.g. pg_check_blocks(17, 43) would check blocks 17 - 43


It makes sense. I think the following functions should cover all cases:

pg_check_blocks() - whole database
pg_check_blocks(relno) - whole relation
pg_check_blocks(relno, start, stop) - selected block interval
pg_check_blocks(relno, array of blocks) - selected blocks
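
For illustration, possible invocations if the functions were defined roughly
as proposed (the names and the regclass-style argument are assumptions;
nothing here exists yet):

    SELECT pg_check_blocks();                                   -- whole database
    SELECT pg_check_blocks('my_table'::regclass);               -- whole relation
    SELECT pg_check_blocks('my_table'::regclass, 17, 43);       -- blocks 17..43
    SELECT pg_check_blocks('my_table'::regclass, ARRAY[7, 99]); -- selected blocks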


 Zdenek




---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] CLUSTER and synchronized scans and pg_dump et al

2008-01-28 Thread Kevin Grittner
 On Sun, Jan 27, 2008 at  9:02 AM, in message
[EMAIL PROTECTED], Gregory Stark [EMAIL PROTECTED]
wrote: 
 
 Perhaps we should have some form of escape hatch for pg_dump to request real
 physical order when dumping clustered tables.
 
It would seem reasonable to me for pg_dump to use ORDER BY to select
data from clustered tables.
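
A rough sketch of what that could look like - pg_dump normally emits a plain
COPY, and the table and column names here are invented:

    COPY (SELECT * FROM my_table ORDER BY id) TO STDOUT;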
 
I don't see a general case for worrying about the order of rows
returned by queries which lack an ORDER BY clause.
 
-Kevin
 



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] CLUSTER and synchronized scans and pg_dump et al

2008-01-28 Thread Florian G. Pflug

Guillaume Smet wrote:

On Jan 27, 2008 9:07 PM, Markus Bertheau
[EMAIL PROTECTED] wrote:

2008/1/28, Tom Lane [EMAIL PROTECTED]:

Do we have nominations for a name?  The first idea that comes to
mind is synchronized_scanning (defaulting to ON).
synchronized_sequential_scans is a bit long, but contains the keywords
"sequential scans", which will ring a bell with many, more so than
synchronized_scanning.


synchronize_seqscans?


How about enable_syncscan, or enable_seqscan_sync? It's not strictly
something that influences the planner, but maybe it's similar enough to
justify a similar naming?

regards, Florian Pflug

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] GSSAPI doesn't play nice with non-canonical host names

2008-01-28 Thread Magnus Hagander
On Sun, Jan 27, 2008 at 09:32:54PM -0500, Stephen Frost wrote:
 
  While I'm complaining: that's got to be one of the least useful error
  messages I've ever seen, and it's for a case that's surely going to be
  fairly common in practice.  Can't we persuade GSSAPI to produce
  something more user-friendly?  At least convert "7" to "Server not
  found in Kerberos database"?
 
 I agree, and have found it to be very frustrating while working w/
 Kerberos in general.  I *think* there's a library which can convert
 those error-codes (libcomm-err?), but I've not really looked into it
 yet.

AFAIK, that one is for Kerberos only. For GSSAPI, we already use the
gss_display_status function to get the error messages. I think the problem
here is in the Kerberos library?

//Magnus

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] [HACKERS] Including Snapshot Info with Indexes

2008-01-28 Thread Jonah H. Harris
On Jan 28, 2008 8:21 AM, Gokulakannan Somasundaram [EMAIL PROTECTED] wrote:
 I am not seeing my mail getting listed in the archives, so I am just
 resending it in case the earlier one got missed.

It was sent.  Archive processing is delayed.

-- 
Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324
EnterpriseDB Corporation| fax: 732.331.1301
499 Thornall Street, 2nd Floor  | [EMAIL PROTECTED]
Edison, NJ 08837| http://www.enterprisedb.com/

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] CLUSTER and synchronized scans and pg_dump et al

2008-01-28 Thread Kevin Grittner
 On Mon, Jan 28, 2008 at  9:00 AM, in message [EMAIL PROTECTED],
Andrew Dunstan [EMAIL PROTECTED] wrote: 
 Kevin Grittner wrote:
 On Sun, Jan 27, 2008 at  9:02 AM, in message
 [EMAIL PROTECTED], Gregory Stark [EMAIL PROTECTED]
 wrote: 
  
 Perhaps we should have some form of escape hatch for pg_dump to request real
 physical order when dumping clustered tables.
  
 It would seem reasonable to me for pg_dump to use ORDER BY to select
 data from clustered tables.
 
 What will be the performance hit from doing that?
 
If the rows actually are in order of the clustered index, it
shouldn't add much more than the time needed to sequentially pass
the clustered index, should it?  Even so, perhaps there should be a
command-line option on pg_dump to control whether it does this.
 
-Kevin
 



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable

2008-01-28 Thread Simon Riggs
On Sun, 2008-01-27 at 21:04 -0500, Tom Lane wrote:
 [ redirecting thread to -hackers ]
 
 Neil Conway [EMAIL PROTECTED] writes:
  On Sun, 2008-01-27 at 21:54 +, Gregory Stark wrote:
  I liked the synchronized_sequential_scans idea myself.
 
  I think that's a bit too long. How about synchronized_scans, or
  synchronized_seqscans?
 
 We have enable_seqscan already, so that last choice seems to fit in.

If we're going to have a GUC, we may as well make it as useful as
possible.

Currently we set synch scan on when the table is larger than 25% of
shared_buffers. So increasing shared_buffers can actually turn this
feature off.

Rather than having a boolean GUC, we should have a number and make the
parameter synchronised_scan_threshold. This would then be the size of
a table above which we would perform synch scans. If it's set to -1, then
this would be the same as off in all cases. The default value would be
25% of shared_buffers. (I think we can only do that at initdb time
currently.)

If we do that, it's clearly different from the enable_* parameters, so
the name is easier to decide ;-)

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


---(end of broadcast)---
TIP 6: explain analyze is your friend


[HACKERS] system catalog constraints question

2008-01-28 Thread Gevik Babakhani
Hi,

Is there a way to query the column constraints between the tables of the
system catalog?
For example, pg_attribute.atttypid -> pg_type.oid. This is described in the
docs of course, but does the system use something like pg_constraint,
or are the system catalog constraints enforced only in the code?

Regards,
Gevik.



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] CLUSTER and synchronized scans and pg_dump et al

2008-01-28 Thread Kevin Grittner
 On Mon, Jan 28, 2008 at 10:36 AM, in message [EMAIL PROTECTED],
Tom Lane [EMAIL PROTECTED] wrote: 
 in general pg_dump's charter is to reproduce
 the state of the database as best it can, not to improve it.
 
Seems that I've often seen it recommended as a way to eliminate bloat.
 
It seems like there are some practical use cases where it would be
a pain to have to do a CLUSTER right on the heels of having restored
a pg_dump through psql.
 
This does seem like the right way to do it where a user really wants
to maintain the physical sequence; my biggest concern is that
CLUSTER is sometimes used to eliminate bloat, and there is no real
interest in maintaining that sequence later.  I'd bet that people
generally do not alter the table to remove the clustered index
choice, so this option could be rather painful somewhere
downstream, when the sequence has become pretty random.
 
Maybe it would make sense if it was not the default, and the issues
were properly documented under the description of the option?
 
-Kevin
 



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] CLUSTER and synchronized scans and pg_dump et al

2008-01-28 Thread Guillaume Smet
Hi Florian,

Glad to see you back!

On Jan 28, 2008 3:25 PM, Florian G. Pflug [EMAIL PROTECTED] wrote:
 How about enable_syncscan, or enable_seqscan_sync? It's not strictly
 something the influences the planner, but maybe it's similar enough to
 justify a similar naming?

It was my first idea but I didn't propose it as it's really a
different thing IMHO. enable_* variables don't change the way
PostgreSQL actually does the job, whereas synchronize_scans (or whatever
the name will be) does.
And it's not very consistent with the other GUC variables (most of
them could have "enable" in their name), but we limited the usage of
enable_* to planner variables. I don't know if it's on purpose though.

--
Guillaume

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] system catalog constraints question

2008-01-28 Thread Gevik Babakhani
Thank you :) 

 -Original Message-
 From: Heikki Linnakangas [mailto:[EMAIL PROTECTED] 
 Sent: Monday, January 28, 2008 5:35 PM
 To: Gevik Babakhani
 Cc: pgsql-hackers@postgresql.org
 Subject: Re: [HACKERS] system catalog constraints question
 
 Gevik Babakhani wrote:
  Is there a way to query the column constraints between the tables of
  the system catalog? For example, pg_attribute.atttypid -> pg_type.oid.
  This is described in the docs of course, but does the system use
  something like pg_constraint, or are the system catalog constraints
  enforced only in the code?
 
 There is pg_depend, which is kind of like constraints, but 
 enforced in code.
 
 -- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
 


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] CLUSTER and synchronized scans and pg_dump et al

2008-01-28 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
 Kevin Grittner wrote:
 It would seem reasonable to me for pg_dump to use ORDER BY to select
 data from clustered tables.

 What will be the performance hit from doing that?

That worries me too.  Also, in general pg_dump's charter is to reproduce
the state of the database as best it can, not to improve it.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] CLUSTER and synchronized scans and pg_dump et al

2008-01-28 Thread Tom Lane
Guillaume Smet [EMAIL PROTECTED] writes:
 It was my first idea but I didn't propose it as it's really a
 different thing IMHO. enable_* variables don't change the way
 PostgreSQL actually does the job, whereas synchronize_scans (or whatever
 the name will be) does.
 And it's not very consistent with the other GUC variables (most of
 them could have "enable" in their name), but we limited the usage of
 enable_* to planner variables. I don't know if it's on purpose though.

Yeah, it is a more or less deliberate policy to use enable_ only for
planner control variables, which this one certainly isn't.  I seem
to recall an argument also that prefixing enable_ is just noise; it
doesn't add anything to your understanding of what the variable does.

So far I think synchronize_seqscans is the best proposal.
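
Assuming the GUC lands under that name as a plain boolean (a sketch, not a
committed interface), usage would be along these lines:

    SET synchronize_seqscans = off;  -- e.g. for a dump that must preserve order
    SELECT * FROM big_table;         -- this scan now always starts at block 0
    SET synchronize_seqscans = on;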

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] CLUSTER and synchronized scans and pg_dump et al

2008-01-28 Thread Andrew Dunstan



Kevin Grittner wrote:

 On Sun, Jan 27, 2008 at 9:02 AM, in message
 [EMAIL PROTECTED], Gregory Stark [EMAIL PROTECTED] wrote:

 Perhaps we should have some form of escape hatch for pg_dump to request real
 physical order when dumping clustered tables.

 It would seem reasonable to me for pg_dump to use ORDER BY to select
 data from clustered tables.

 I don't see a general case for worrying about the order of rows
 returned by queries which lack an ORDER BY clause.

What will be the performance hit from doing that?

cheers

andrew

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] [GENERAL] SHA1 on postgres 8.3

2008-01-28 Thread Bruce Momjian

I am not thrilled about moving _some_ of pgcrypto into the backend ---
pgcrypto right now seems well designed, and if we pull part of it out, it
seems it will be less clear than what we have now.  Perhaps we just need
to document that md5() isn't for general use and that some function in
pgcrypto should be used instead?

---

Marko Kreen wrote:
 On 1/21/08, Tom Lane [EMAIL PROTECTED] wrote:
   MD5 is broken in the sense that you can create two or more meaningful
   documents with the same hash.
 
  Note that this isn't actually very interesting for the purpose for
  which the md5() function was put into core: namely, hashing passwords
  before they are stored in pg_authid.
 
 Note: this was a bad idea.  The function that should have been
 added to core would be pg_password_hash(username, password).
 
 Adding md5() lessens the incentive to install pgcrypto or to push/accept
 digest() into core, and gives the impression there will be sha1(), etc.
 in the future.
 
 Now users who want to store passwords in the database (the most
 popular usage) will probably go with md5() without bothering
 with pgcrypto.  They probably see that Postgres itself uses MD5 too,
 without realizing their situation is totally different from the
 pg_authid one.
 
 It's like having a solution in core that is ACID-compliant 99% of the
 time, so why bother with the 100% one.
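
For reference, the server stores role passwords in pg_authid as 'md5'
followed by md5(password || username), so the pg_password_hash() suggested
above could be sketched as a simple SQL function (hypothetical; not part of
core):

    CREATE FUNCTION pg_password_hash(username text, password text)
    RETURNS text AS $$
        SELECT 'md5' || md5($2 || $1);
    $$ LANGUAGE sql IMMUTABLE;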
 
 -- 
 marko
 

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] autonomous transactions

2008-01-28 Thread Decibel!
On Wed, Jan 23, 2008 at 05:50:02PM -0500, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  From looking at how Oracle does them, autonomous transactions are
  completely independent of the transaction that originates them -- they
  take a new database snapshot. This means that uncommitted changes in the
  originating transaction are not visible to the autonomous transaction.
 
  Oh! Recursion depth would need to be tested for as well. Nasty.
 
 Seems like the cloning-a-session idea would be a possible implementation
 path for these too.

Oracle has a feature where you can effectively save a session and return
to it. For example, if filling out a multi-page web form, you could save
state in the database between those calls. I'm assuming that they use
that capability for their autonomous transactions; save the current
session to the stack, clone it, run the autonomous transaction, then
restore the saved one.
-- 
Decibel!, aka Jim C. Nasby, Database Architect  [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828


pgpKD3eTOJmEA.pgp
Description: PGP signature


Re: [HACKERS] [PATCHES] Friendly help for psql

2008-01-28 Thread Greg Sabino Mullane

-BEGIN PGP SIGNED MESSAGE-
Hash: RIPEMD160


 Uh, imagine:

   test=> SELECT * from pg_class
   test-> help

 Technically 'help' is now an alias for 'pg_class'.  Are you suggesting
 supporting 'help' in this usage?  People were saying they forget
 semicolons, so this 'help' usage is quite possible.  We don't want to
 hear "Why doesn't 'help' work sometimes?"

I don't think we have to worry about supporting that case. We only
support it when the command begins with "help". No ambiguity with
SQL to worry about. Don't forget that ctrl-c resets a query as well, so
the above situation is not as dire as it may appear to the uninitiated
user.

 Supporting 'help' in psql seems like a very slippery slope.  We are very
 tight in defining when an entry is psql and when it is SQL and this
 weakens that.

Does not seem slippery to me, seems a simple, one-time fix that replaces
a syntax error with a helpful message. As the instigator of this discussion,
I can assure you I have no plans for anything other than help. It's simple,
standard, and points you to anything else you may need or want to do.

 What would be interesting would be if the _server_ could send back some
 message about "Use the help facility of your client application" but it
 would have to have a trailing semicolon;  unlikely.  :-(

Now THAT would be a bad mangling of SQL and non-SQL. We'll pretend you
didn't suggest that one. :)

- --
Greg Sabino Mullane [EMAIL PROTECTED]
PGP Key: 0x14964AC8 200801281451
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-BEGIN PGP SIGNATURE-

iD8DBQFHnjJgvJuQZxSWSsgRA8mMAJ4jzlrwYDVguUNt2yAwMTXMvV1S3QCg+SaF
n8ybxb/KbGBxW9aouZGpzaY=
=igJU
-END PGP SIGNATURE-



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Strange locking choices in pg_shdepend.c

2008-01-28 Thread Decibel!
On Mon, Jan 21, 2008 at 04:54:06PM -0500, Tom Lane wrote:
 It's probably not a good idea to have shdepReassignOwned() take only
 AccessShareLock on pg_shdepend.  Even though the function itself
 merely reads the table, it is going to call functions that will take
 RowExclusiveLock, meaning that we're setting ourselves up for potential
 deadlock failures due to lock-upgrade.  It'd be safer (and faster too)
 to just hold RowExclusiveLock through the whole operation.

Just a thought...

Would it be worthwhile to allow for logging when a lock gets upgraded?
That would make it easier to protect against deadlocks...
-- 
Decibel!, aka Jim C. Nasby, Database Architect  [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828


pgpEq5RUInVoR.pgp
Description: PGP signature


Re: [HACKERS] Spoofing as the postmaster

2008-01-28 Thread Decibel!
On Sat, Dec 22, 2007 at 09:25:05AM -0500, Bruce Momjian wrote:
 So, what solutions exist?  We could require the use of port numbers less
 than 1024 which typically require root and then become a non-root user,
 but that requires root to start the server.  We could put the unix

I don't know about *requiring* this, but it would certainly be a nice
option to have. Right now there's absolutely no way that you could get
Postgres to use a port < 1024.
-- 
Decibel!, aka Jim C. Nasby, Database Architect  [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828


pgp5kYYo0x2d8.pgp
Description: PGP signature


Re: [HACKERS] There's random access and then there's random access

2008-01-28 Thread Decibel!
On Wed, Dec 05, 2007 at 01:49:20AM +, Gregory Stark wrote:
 Regardless of what mechanism is used and who is responsible for doing it
 someone is going to have to figure out which blocks are specifically
 interesting to prefetch. Bitmap index scans happen to be the easiest since
 we've already built up a list of blocks we plan to read. Somehow that
 information has to be pushed to the storage manager to be acted upon. 
 
 Normal index scans are an even more interesting case but I'm not sure how hard
 it would be to get that information. It may only be convenient to get the
 blocks from the last leaf page we looked at, for example.

I guess it depends on how you're looking at things... I'm thinking more
in terms of telling the OS to fetch stuff we're pretty sure we're going
to need while we get on with other work. There are a lot of cases besides
just a bitmap scan where you know that (though perhaps code-wise the
bitmap scan is easier to implement...)

For a seqscan, we'd want to be reading some number of blocks ahead of
where we're at right now.

Ditto for index pages on an index scan. In addition, when we're scanning
the index, we'd definitely want to issue heap page requests
asynchronously, since that gives the filesystem, etc a better shot at
re-ordering the reads to improve performance.
-- 
Decibel!, aka Jim C. Nasby, Database Architect  [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828


pgpSjafYi0G3G.pgp
Description: PGP signature


Re: [HACKERS] Truncate Triggers

2008-01-28 Thread Decibel!
On Fri, Jan 25, 2008 at 11:40:19AM +, Simon Riggs wrote:
 (for 8.4 ...)
 I'd like to introduce triggers that fire when we issue a truncate:

Rather than focusing exclusively on TRUNCATE, how about "triggers" that
fire whenever any kind of DDL operation is performed? (OK, truncate is
more DML than DDL, but still.)

The reason I put "triggers" in quotes is because I'm not suggesting that
we actually put triggers on the catalog tables, since we all know that's
hard/impossible. Instead this would have to tie into command processing,
similar to what you're proposing for truncate.
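
A purely hypothetical sketch of what a statement-level truncate trigger
might look like if the proposal were implemented (the syntax and the
audit_log table are invented for illustration):

    CREATE FUNCTION log_truncate() RETURNS trigger AS $$
    BEGIN
        INSERT INTO audit_log (event, table_name, happened_at)
        VALUES ('TRUNCATE', TG_TABLE_NAME, now());
        RETURN NULL;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER my_table_truncate
        BEFORE TRUNCATE ON my_table
        FOR EACH STATEMENT EXECUTE PROCEDURE log_truncate();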
-- 
Decibel!, aka Jim C. Nasby, Database Architect  [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828


pgpGSGEk92QWF.pgp
Description: PGP signature


Re: [HACKERS] [PATCH] Add size/acl information when listing databases

2008-01-28 Thread Bruce Momjian

This has been saved for the 8.4 release:

http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---

Andrew Gilligan wrote:
 
 On 20 Jan 2008, at 04:34, Tom Lane wrote:
  Andrew Gilligan [EMAIL PROTECTED] writes:
  Showing the size on \l+ probably makes more sense, but I imagine that
  would require a very different approach due to the permissions  
  changes?
 
  I haven't experimented, but perhaps something along the lines of
 
  case when has_database_privilege(current_user, db.oid, 'connect')
  then pg_database_size...
  else null
  end
 
  would fix it.
 
 
 Yep, that seems to do the trick.
 
 I've attached an updated patch (based on 8.3RC2) that adds the ACL
 information to \l and the size to \l+ if available.
 
 Best regards,
 -Andy
 
 

[ Attachment, skipping... ]
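
Putting Tom's suggested guard together, the size column in psql's database
listing might be computed roughly like this (a sketch; psql's actual query
differs):

    SELECT d.datname,
           CASE WHEN has_database_privilege(current_user, d.oid, 'CONNECT')
                THEN pg_size_pretty(pg_database_size(d.oid))
                ELSE NULL
           END AS size
    FROM pg_database d
    ORDER BY 1;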

 
 

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] [PATCHES] Better default_statistics_target

2008-01-28 Thread Decibel!
On Wed, Dec 05, 2007 at 06:49:00PM +0100, Guillaume Smet wrote:
 On Dec 5, 2007 3:26 PM, Greg Sabino Mullane [EMAIL PROTECTED] wrote:
  Agreed, this would be a nice 8.4 thing. But what about 8.3 and 8.2? Is
  there a reason not to make this change? I know I've been lazy and not run
  any absolute figures, but rough tests show that raising it (from 10 to
  100) results in a very minor increase in analyze time, even for large
  databases. I think the burden of a slightly slower analyze time, which
  can be easily adjusted, both in postgresql.conf and right before running
  an analyze, is very small compared to the pain of some queries - which 
  worked
  before - suddenly running much, much slower for no apparent reason at all.
 
 As Tom stated it earlier, the ANALYZE slow down is far from being the
 only consequence. The planner will also have more work to do and
 that's the hard point IMHO.

How much more? Doesn't it now use a binary search? If so, ISTM that
going from 10 to 100 would at worst double the time spent finding the
bucket we need. Considering that we're talking something that takes
microseconds, and that there's a huge penalty to be paid if you have bad
stats estimates, that doesn't seem that big a deal. And on modern
machines it's not like the additional space in the catalogs is going to
kill us.

FWIW, I've never seen anything but a performance increase or no change
when going from 10 to 100. In most cases there's a noticeable
improvement since it's common to have over 100k rows in a table, and
there's just no way to capture any kind of a real picture of that with
only 10 buckets.
-- 
Decibel!, aka Jim C. Nasby, Database Architect  [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828


pgpTmTmwysHI5.pgp
Description: PGP signature


Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable

2008-01-28 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 Rather than having a boolean GUC, we should have a number and make the
 parameter synchronised_scan_threshold.

This would open up a can of worms I'd prefer not to touch, having to do
with whether the buffer-access-strategy behavior should track that or
not.  As the note in heapam.c says,

 * If the table is large relative to NBuffers, use a bulk-read access
 * strategy and enable synchronized scanning (see syncscan.c).  Although
 * the thresholds for these features could be different, we make them the
 * same so that there are only two behaviors to tune rather than four.

It's a bit late in the cycle to be revisiting that choice.  Now we do
already have three behaviors to worry about (BAS on and syncscan off)
but throwing in a randomly settable knob will take it back to four,
and we have no idea how that fourth case will behave.  The other tack we
could take (having the one GUC variable control both thresholds) is
not good since it will result in pg_dump trashing the buffer cache.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Strange locking choices in pg_shdepend.c

2008-01-28 Thread Tom Lane
Decibel! [EMAIL PROTECTED] writes:
 Would it be worthwhile to allow for logging when a lock gets upgraded?
 That would make it easier to protect against deadlocks...

There is some debug code for that in the backend, but my experience
is that it's too noisy to have on by default.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] [GENERAL] SHA1 on postgres 8.3

2008-01-28 Thread Greg Sabino Mullane

-BEGIN PGP SIGNED MESSAGE-
Hash: RIPEMD160


 I am not thrilled about moving _some_ of pgcrypto into the backend ---
 pgcrypto right now seems well designed and if we pull part of it out it
 seems it will be less clear than what we have now.  Perhaps we just need
 to document that md5() isn't for general use and some function in
 pgcrypto should be used instead?

I think looking at this as putting some of pgcrypto into core is looking
at it the wrong way. We are never going to put the whole thing into
core given the current state of cryptography laws, but obviously the
current status of giving users md5() and nothing else is not ideal. What
we're looking for is a middle ground. It seems to me we've narrowed
it down to two questions:

1) Do sha1(), or other hashing algorithms, risk running afoul of
cryptography regulations?

I'm 100% sure that sha1() itself is not a problem (it's even a PHP builtin,
and good luck finding a box these days without that monstrosity installed).
I highly doubt any of the rest (SHA*, HMAC, etc.) are a problem either:
we're doing a one-way hash, not encrypting data. But common sense and
cryptography have seldom been seen together since the start of the cold war,
so I'll hold my final judgement.

2) Which ones do we include?

Putting in sha1() seems a no-brainer, but as Joe points out, why not add all
the rest in at the same time?

- --
Greg Sabino Mullane [EMAIL PROTECTED]
PGP Key: 0x14964AC8 200801281506
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-BEGIN PGP SIGNATURE-

iD8DBQFHnjeJvJuQZxSWSsgRA2fWAKCljvbj5BVaFQ5mEDvckNGhVz6rDgCg0DRc
zaIu/rT1vdDrL61JTsXdIZ8=
=7DKm
-END PGP SIGNATURE-



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] GSSAPI doesn't play nice with non-canonical host names

2008-01-28 Thread Tom Lane
Magnus Hagander [EMAIL PROTECTED] writes:
 While I'm complaining: that's got to be one of the least useful error
 messages I've ever seen, and it's for a case that's surely going to be
 fairly common in practice.

 AFAIK, that one is for Kerberos only. For GSSAPI, we already use the
 gss_display_status function to get the error messages. I think the problem
 here is in the Kerberos library?

Yeah, I had verified by tracing through it that the text was just what
gss_display_status gave us.  It could be that it's just plain broken,
but I was sort of hoping that we were using it incorrectly or that
there's some magic flag to set to get better messages out of it.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] system catalog constraints question

2008-01-28 Thread Heikki Linnakangas

Gevik Babakhani wrote:

Is there a way to query the column constraints between the tables of the
system catalog?
For example, pg_attribute.atttypid -> pg_type.oid. This is described in the
docs of course, but does the system use something like pg_constraint,
or are the system catalog constraints enforced only in the code?


There is pg_depend, which is kind of like constraints, but enforced in 
code.
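
For example, the dependencies recorded against a given table can be
inspected directly (my_table is a placeholder):

    SELECT classid::regclass     AS dependent_catalog,
           objid,
           refclassid::regclass  AS referenced_catalog,
           refobjid,
           deptype
    FROM pg_depend
    WHERE refclassid = 'pg_class'::regclass
      AND refobjid = 'my_table'::regclass;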


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] [PATCHES] Friendly help for psql

2008-01-28 Thread Bruce Momjian
Alvaro Herrera wrote:
 To avoid the usage of unadorned "help" (which I don't think is going to
 ever cause conflicts with a SQL command but perhaps it's better to be
 prepared), one idea would be to respond with "please execute \help
 instead", and then \help would emit the verbose output.  Perhaps
 eventually we could adorn it with \help category, etc.

Uh, imagine:

test=> SELECT * from pg_class
test-> help

Technically 'help' is now an alias for 'pg_class'.  Are you suggesting
supporting 'help' in this usage?  People were saying they forget
semicolons, so this 'help' usage is quite possible.  We don't want to
hear "Why doesn't 'help' work sometimes?"

I think the fundamental problem is that most programs, like ftp, have a
predefined set of single-line commands, while we have an SQL 'language'
that can be multi-line and has no special markings in psql.  In fact the
special marking is for help and psql-commands using backslash.

Supporting 'help' in psql seems like a very slippery slope.  We are very
tight in defining when an entry is psql and when it is SQL and this
weakens that.

What would be interesting would be if the _server_ could send back some
message about "Use the help facility of your client application" but it
would have to have a trailing semicolon;  unlikely.  :-(

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable

2008-01-28 Thread Hans-Juergen Schoenig


On Jan 28, 2008, at 6:14 PM, Simon Riggs wrote:


On Sun, 2008-01-27 at 21:04 -0500, Tom Lane wrote:

[ redirecting thread to -hackers ]

Neil Conway [EMAIL PROTECTED] writes:

On Sun, 2008-01-27 at 21:54 +, Gregory Stark wrote:

I liked the synchronized_sequential_scans idea myself.



I think that's a bit too long. How about synchronized_scans, or
synchronized_seqscans?


We have enable_seqscan already, so that last choice seems to fit in.


If we're going to have a GUC, we may as well make it as useful as
possible.

Currently we set synch scan on when the table is larger than 25% of
shared_buffers. So increasing shared_buffers can actually turn this
feature off.

Rather than having a boolean GUC, we should have a number and make the
parameter synchronised_scan_threshold. This would then be the  
size of
a table above which we would perform synch scans. If its set to -1,  
then
this would be the same as off in all cases. The default value  
would be

25% of shared_buffers. (Think we can only do that at initdb time
currently).

If we do that, its clearly different from the enable_* parameters, so
the name is easier to decide ;-)



+1
This is in fact a lot more flexible and transparent.
It gives us a lot more control over the process and it is easy to  
explain / understand.


best regards,

hans

--
Cybertec Schönig  Schönig GmbH
PostgreSQL Solutions and Support
Gröhrmühlgasse 26, 2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at




Re: [HACKERS] [PATCHES] Friendly help for psql

2008-01-28 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Alvaro Herrera wrote:
 To avoid the usage of unadorned "help" (which I don't think is going to
 ever cause conflicts with a SQL command but perhaps it's better to be
 prepared), one idea would be to respond with "please execute \help
 instead", and then \help would emit the verbose output.  Perhaps
 eventually we could adorn it with \help category, etc.

 Uh, imagine:

   test=> SELECT * from pg_class
   test-> help

You didn't read the patch: it would recognize "help" only when the input
buffer is empty.  The only thing it's assuming is that no SQL command
will ever *begin* with the word "help".

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] autonomous transactions

2008-01-28 Thread Roberts, Jon
 On Wed, Jan 23, 2008 at 05:50:02PM -0500, Tom Lane wrote:
  Simon Riggs [EMAIL PROTECTED] writes:
   From looking at how Oracle does them, autonomous transactions are
   completely independent of the transaction that originates them -- they
   take a new database snapshot. This means that uncommitted changes in the
   originating transaction are not visible to the autonomous transaction.
 
   Oh! Recursion depth would need to be tested for as well. Nasty.
 
  Seems like the cloning-a-session idea would be a possible implementation
  path for these too.
 
 Oracle has a feature where you can effectively save a session and return
 to it. For example, if filling out a multi-page web form, you could save
 state in the database between those calls. I'm assuming that they use
 that capability for their autonomous transactions; save the current
 session to the stack, clone it, run the autonomous transaction, then
 restore the saved one.
 --

You are describing an uncommitted transaction and not an autonomous
transaction.  Transactions in Oracle are not automatically committed
like they are in PostgreSQL.

Here is a basic example of an autonomous transaction: 

create or replace procedure pr_log_error (p_error_message errorlog.message%type) is
  pragma autonomous_transaction;
begin
  insert into errorlog
         (log_user,
          log_time,
          error_message)
  values (user,
          sysdate,
          p_error_message);
  commit;
exception
  when others then
    rollback;
    raise;
end;

And then you can call it from a procedure like this:

create or replace procedure pr_example is
begin
  null;    -- do some work
  commit;  -- commit the work
exception
  when others then
    pr_log_error(p_error_message => sqlerrm);
    rollback;
    raise;
end;

The autonomous transaction allows me to insert and commit a record in a
different transaction than the calling procedure, so the calling
procedure can roll back or commit.

You can also remove the commit/rollback from pr_example and instead do
it from the anonymous block that calls it.  I just added it to make it
clear that it is a different transaction than the error logging
transaction.
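
PostgreSQL has no direct equivalent, but a similar effect can be
approximated with the contrib dblink module, since the statement runs over
a separate connection and commits independently of the caller's transaction
(the connection string and errorlog table are assumptions):

    CREATE FUNCTION pr_log_error(p_error_message text) RETURNS void AS $$
    BEGIN
        PERFORM dblink_exec('dbname=mydb',
            'INSERT INTO errorlog (log_user, log_time, error_message) '
            || 'VALUES (current_user, now(), '
            || quote_literal(p_error_message) || ')');
    END;
    $$ LANGUAGE plpgsql;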



Jon

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [PATCHES] Better default_statistics_target

2008-01-28 Thread Christopher Browne
On Dec 6, 2007 6:28 PM, Decibel! [EMAIL PROTECTED] wrote:
 FWIW, I've never seen anything but a performance increase or no change
 when going from 10 to 100. In most cases there's a noticeable
 improvement since it's common to have over 100k rows in a table, and
 there's just no way to capture any kind of a real picture of that with
 only 10 buckets.

I'd be more inclined to try to do something that was at least somewhat
data aware.

The interesting theory that I'd like to verify, if I had the chance,
would be to run through a by-column tuning pass using a set of heuristics.
My first-order approximation would be:

- If a column defines a unique key, then we know there will be no
clustering of values, so no need to increase the count...

- If a column contains a datestamp, then the distribution of values is
likely to be temporal, so no need to increase the count...

- If a column has a highly constricted set of values (e.g. - boolean),
then we might *decrease* the count.

- We might run a query that runs across the table, looking at
frequencies of values, and if it finds a lot of repeated values, we'd
increase the count.

That's a bit hand-wavy, but it could lead to both increases and
decreases in the histogram sizes.  Given that, we can expect the
overall stats sizes not to need to grow *enormously*, because
we can hope for there to be cases of shrinkage.
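
Note that per-column targets can already be set by hand today, so such
heuristics could simply emit ordinary DDL (table and column names are
examples):

    ALTER TABLE orders ALTER COLUMN customer_id SET STATISTICS 100;
    ANALYZE orders;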


-- 
http://linuxfinances.info/info/linuxdistributions.html
The definition of insanity is doing the same thing over and over and
expecting different results.  -- assortedly attributed to Albert
Einstein, Benjamin Franklin, Rita Mae Brown, and Rudyard Kipling

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] CLUSTER and synchronized scans and pg_dump et al

2008-01-28 Thread Steve Atkins


On Jan 28, 2008, at 8:36 AM, Tom Lane wrote:


Andrew Dunstan [EMAIL PROTECTED] writes:

Kevin Grittner wrote:

It would seem reasonable to me for pg_dump to use ORDER BY to select
data from clustered tables.



What will be the performance hit from doing that?


That worries me too.  Also, in general pg_dump's charter is to reproduce
the state of the database as best it can, not to improve it.


One common use of CLUSTER around here is to act as a faster version
of VACUUM FULL when there are a lot of dead rows in a table. There's no
intent to keep the table clustered on that index, and the cluster flag
isn't removed with ALTER TABLE (why bother, the only thing it affects is
the CLUSTER command).

I'm guessing that's not unusual, and it'd lead to sorting tables as part
of pg_dump.
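
If pg_dump wanted to limit any ordering behavior to genuinely clustered
tables, the flag is easy to query (a sketch):

    SELECT i.indrelid::regclass   AS table_name,
           i.indexrelid::regclass AS index_name
    FROM pg_index i
    WHERE i.indisclustered;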

Cheers,
  Steve


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUCvariable

2008-01-28 Thread Heikki Linnakangas

Simon Riggs wrote:

On Mon, 2008-01-28 at 16:21 -0500, Tom Lane wrote:

Simon Riggs [EMAIL PROTECTED] writes:

Rather than having a boolean GUC, we should have a number and make the
parameter synchronised_scan_threshold.

This would open up a can of worms I'd prefer not to touch, having to do
with whether the buffer-access-strategy behavior should track that or
not.  As the note in heapam.c says,

 * If the table is large relative to NBuffers, use a bulk-read access
 * strategy and enable synchronized scanning (see syncscan.c).  Although
 * the thresholds for these features could be different, we make them the
 * same so that there are only two behaviors to tune rather than four.

It's a bit late in the cycle to be revisiting that choice.  Now we do
already have three behaviors to worry about (BAS on and syncscan off)
but throwing in a randomly settable knob will take it back to four,
and we have no idea how that fourth case will behave.  The other tack we
could take (having the one GUC variable control both thresholds) is
not good since it will result in pg_dump trashing the buffer cache.


OK, good points. 


I'm still concerned that the threshold gets higher as we increase
shared_buffers. We may be removing performance features as fast as we
gain performance when we set shared_buffers higher.

Might we agree that the threshold should be fixed at 8MB, rather than
varying upwards as we try to tune? 


Synchronized scans, and the bulk-read strategy, don't help if the table 
fits in cache. If it fits in shared buffers, you're better off keeping 
it there than swapping pages between the OS cache and shared buffers, or 
spending any effort synchronizing scans. That's why we agreed back then 
that the threshold should be X% of shared_buffers.


It's a good point that we don't want pg_dump to screw up the cluster 
order, but that's the only use case I've seen this far for disabling 
sync scans. Even that wouldn't matter much if our estimate for 
clusteredness didn't get screwed up by a table that looks like this: 
5 6 7 8 9 1 2 3 4


Now, maybe there's more use cases where you'd want to tune the 
threshold, but I'd like to see some before we add more knobs.


To benefit from a lower threshold, you'd need to have a table large 
enough that its cache footprint matters, but is still smaller than 25% 
of shared_buffers, and have seq scans on it. In that scenario, you might 
benefit from a lower threshold, because that would leave some 
shared_buffers free for other use. Even that is quite hand-wavey; the 
buffer cache LRU algorithm handles that kind of scenario reasonably 
well already.


To benefit from a larger threshold, you'd need to have a table larger 
than 25% of shared_buffers, but still smaller than shared_buffers, and 
seq scan it often enough that you want to keep it in shared buffers. If 
you're frequently seq scanning a table of that size, you're most likely 
suffering from a bad plan. Even then, the performance difference 
shouldn't be that great, the table surely fits in OS cache anyway, with 
typical shared_buffers settings.


Tables that are seq scanned are typically very small, like a summary 
table with just a few rows, or huge tables in a data warehousing 
system. Between the extremes, I don't think the threshold actually has a 
very big impact.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] CLUSTER and synchronized scans and pg_dump et al

2008-01-28 Thread Jeff Davis
On Sun, 2008-01-27 at 13:37 -0500, Tom Lane wrote:
 Also, does anyone object to making pg_dump just disable it
 unconditionally?  Greg's original gripe only mentioned the case of
 clustered tables, but it'd be kind of a pain to make pg_dump turn it
 on and off again for different tables.  And I could see people
 complaining about pg_dump failing to preserve row order even in
 unclustered tables.
 

If you are running pg_dump, that increases the likelihood that multiple
sequential scans will be reading the same large table at the same time.
Sync scans prevent that additional scan from bringing your active
database to a halt during your dump (due to horrible seeking and poor
cache efficiency).

I think that pg_dump is a good use case for synchronized scans. Assuming
it doesn't hold up 8.3, I think it's worthwhile to consider only
disabling it for tables that have been clustered.

That being said, this isn't a strong objection. Having a painless 8.3
release is the top priority, of course.

Regards,
Jeff Davis


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] CLUSTER and synchronized scans and pg_dump et al

2008-01-28 Thread Jeff Davis
On Sun, 2008-01-27 at 15:02 +, Gregory Stark wrote:
 It occurred to me the other day that synchronized scans could play havoc with
 clustered tables. When you dump and reload a table, even if it was recently
 clustered, the dump could shuffle the records out of order if any other
 sequential scans are happening in the system at the time you dump it.
 
 Now the records would still be effectively ordered for most purposes, but our
 statistics can't detect that. Since the correlation would be poor, the restored
 database would have markedly different statistics, showing virtually no
 correlation on the clustered column.
 
 Perhaps we should have some form of escape hatch for pg_dump to request real
 physical order when dumping clustered tables.
 

Thank you for bringing this up, it's an interesting point.

Keep in mind that this only matters if you are actually running pg_dump
concurrently with another scan, because a scan will reset the starting
point after completing.

Regards,
Jeff Davis


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] CLUSTER and synchronized scans and pg_dump et al

2008-01-28 Thread Jeff Davis
On Sun, 2008-01-27 at 12:45 -0500, Tom Lane wrote:
 Maybe a GUC variable to enable/disable syncscan?

The first iterations of the patch included a GUC. 

I don't have any objection to re-introducing a GUC to enable/disable it.
However, I would suggest that it defaults to on, because:

1. There aren't many cases where you'd want it to be off, and this
particular case with pg_dump is the best one that I've heard of (thanks
Greg). We want people who install 8.3 to see a boost without lots of
tuning, if possible.
2. It only turns on for tables over 25% of shared buffers anyway.

Introducing GUCs reintroduces the same questions that were discussed
before. 

1. Should the 25% figure be tunable as well? 
2. Remember that the 25% figure is also tied to Simon and Heikki's
buffer recycling patch (buffer ring patch). Should they be configurable
independently? Should they be tied together, but configurable?

The simplest solution, in my opinion, is something like:

large_scan_threshold = 0.25 # set to -1 to disable

Where a scan of any table larger than (large_scan_threshold *
shared_buffers) employs both synchronized scans and buffer recycling. We
may implement other large scan strategies in the future.

Regards,
Jeff Davis


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable

2008-01-28 Thread Simon Riggs
On Mon, 2008-01-28 at 16:21 -0500, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  Rather than having a boolean GUC, we should have a number and make the
  parameter synchronised_scan_threshold.
 
 This would open up a can of worms I'd prefer not to touch, having to do
 with whether the buffer-access-strategy behavior should track that or
 not.  As the note in heapam.c says,
 
  * If the table is large relative to NBuffers, use a bulk-read access
  * strategy and enable synchronized scanning (see syncscan.c).  Although
  * the thresholds for these features could be different, we make them the
  * same so that there are only two behaviors to tune rather than four.
 
 It's a bit late in the cycle to be revisiting that choice.  Now we do
 already have three behaviors to worry about (BAS on and syncscan off)
 but throwing in a randomly settable knob will take it back to four,
 and we have no idea how that fourth case will behave.  The other tack we
 could take (having the one GUC variable control both thresholds) is
 not good since it will result in pg_dump trashing the buffer cache.

OK, good points. 

I'm still concerned that the threshold gets higher as we increase
shared_buffers. We may be removing performance features as fast as we
gain performance when we set shared_buffers higher.

Might we agree that the threshold should be fixed at 8MB, rather than
varying upwards as we try to tune? 

The objective of having a tuning hook would have been simply to
normalise the effects of increasing shared_buffers anyway, so fixing it
would solve the problem I see.

(8MB is the default, based upon 25% of 32MB).

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com 


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] RFC: array_agg() per SQL:200n

2008-01-28 Thread Jeff Davis
On Sun, 2008-01-27 at 22:11 -0800, Neil Conway wrote:
 p. 564 discusses the required behavior. The result of array_agg() is an
 array with one element per input value, sorted according to the optional
 ORDER BY clause. NULL input values are included in the array, and the
 result for an empty group is NULL, not an empty array. Note that per
 page 66, I'd expect array values in the input to array_agg() not to be
 flattened.

Should there be an inverse operator (a SRF, in this case) that returns a
set from an array?

Regards,
Jeff Davis




Re: [HACKERS] CLUSTER and synchronized scans and pg_dump et al

2008-01-28 Thread Florian G. Pflug

Steve Atkins wrote:
 On Jan 28, 2008, at 8:36 AM, Tom Lane wrote:
  Andrew Dunstan [EMAIL PROTECTED] writes:
   Kevin Grittner wrote:
    It would seem reasonable to me for pg_dump to use ORDER BY to select
    data from clustered tables.
  
   What will be the performance hit from doing that?
 
  That worries me too.  Also, in general pg_dump's charter is to reproduce
  the state of the database as best it can, not to improve it.
 
 One common use of cluster around here is to act as a faster version
 of vacuum full when there are a lot of dead rows in a table. There's no
 intent to keep the table clustered on that index, and the cluster flag
 isn't removed with alter table (why bother, the only thing it affects is
 the cluster command).
 
 I'm guessing that's not unusual, and it'd lead to sorting tables as part
 of pg_dump.

I've done that too - and every time I typed that CLUSTER ... I thought
"why, oh why isn't there something like REWRITE TABLE <table>, which
would work just like CLUSTER, but without the sorting" ;-) Maybe
something to put on the TODO list...

We might even call it VACUUM REWRITE ;-)

regards, Florian Pflug



[HACKERS] Mail issue

2008-01-28 Thread Decibel!
I had a mail issue on my end which resulted in a number of outbound
emails getting stuck in a queue. They all just went out; sorry for the
flood.
-- 
Decibel!, aka Jim C. Nasby, Database Architect  [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828




Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable

2008-01-28 Thread Simon Riggs
On Mon, 2008-01-28 at 23:13 +, Heikki Linnakangas wrote:
 Tables that are seq scanned are typically very small, like a summary 
 table with just a few rows, or huge tables in a data warehousing 
 system. Between the extremes, I don't think the threshold actually has
 a very big impact.

And if you have a partitioned table with partitions inconveniently
sized? You'd need to *reduce* shared_buffers specifically to get synch
scans and BAS to kick in. Or increase partition size. Both of which
reduce the impact of the benefits we've added.

I don't think the argument that "a table smaller than shared buffers is
therefore already in shared buffers" holds true in all cases. I/O does
matter.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com 




Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable

2008-01-28 Thread Jeff Davis
On Mon, 2008-01-28 at 23:13 +, Heikki Linnakangas wrote:
 It's a good point that we don't want pg_dump to screw up the cluster 
 order, but that's the only use case I've seen this far for disabling 
 sync scans. Even that wouldn't matter much if our estimate for 
 clusteredness didn't get screwed up by a table that looks like this: 
 5 6 7 8 9 1 2 3 4

It doesn't seem like there is any reason for the estimate to get
confused, but it apparently does. I loaded a test table with a
distribution similar to your example, and it shows a correlation of
about -0.5, even though for index-scan purposes it should rate as well
as something near -1 or +1.
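
A minimal reproduction sketch (names are mine, and the actual test may
have differed; note too that UNION ALL output order is not formally
guaranteed, though in practice it appends):

    CREATE TABLE corr_test AS
        SELECT g AS i FROM generate_series(50001, 100000) g
        UNION ALL
        SELECT g FROM generate_series(1, 50000) g;
    ANALYZE corr_test;
    SELECT correlation FROM pg_stats
     WHERE tablename = 'corr_test' AND attname = 'i';
    -- reports roughly -0.5 for two ascending runs stored in swapped
    -- order, although an index scan in i order would read the heap
    -- almost perfectly sequentially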

I am not a statistics expert, but it seems like a better measurement
would be: "what is the chance that, if the tuples are close together in
index order, the corresponding heap tuples are also close together?"

The answer to that question in your example is "very likely", so there
would be no problem.

Is there a reason we don't do this?

Regards,
Jeff Davis




Re: [HACKERS] Strange locking choices in pg_shdepend.c

2008-01-28 Thread Alvaro Herrera
Tom Lane wrote:
 Decibel! [EMAIL PROTECTED] writes:
  Would it be worthwhile to allow for logging when a lock gets upgraded?
  That would make it easier to protect against deadlocks...
 
 There is some debug code for that in the backend, but my experience
 is that it's too noisy to have on by default.

Seems a good idea to separate the upgrade bits from the other stuff and
enable them under --enable-cassert or something similar.  TODO for 8.4?

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] Truncate Triggers

2008-01-28 Thread Alvaro Herrera
Decibel! wrote:
 On Fri, Jan 25, 2008 at 11:40:19AM +, Simon Riggs wrote:
  (for 8.4 ...)
  I'd like to introduce triggers that fire when we issue a truncate:
 
 Rather than focusing exclusively on TRUNCATE, how about triggers that
 fire whenever any kind of DDL operation is performed? (Ok, truncate is
 more DML than DDL, but still).

I don't think it makes sense in general.  For example, would we fire
triggers on CLUSTER?  Or on ALTER TABLE / SET STATISTICS?

TRUNCATE seems a special case that needs it.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable

2008-01-28 Thread Ron Mayer
Jeff Davis wrote:
 On Mon, 2008-01-28 at 23:13 +, Heikki Linnakangas wrote:
  clusteredness didn't get screwed up by a table that looks like this:
  5 6 7 8 9 1 2 3 4

 ...test table with a similar
 distribution to your example, and it shows a correlation of about -0.5,
 but it should be as good as something near -1 or +1.

 I am not a statistics expert, but it seems like a better measurement
 would be: "what is the chance that, if the tuples are close together in
 index order, the corresponding heap tuples are close together?"

Same applies for data clustered by zip code: all rows for any State or
City or County or SchoolZone are close together on the same pages, yet
postgres's stats think they're totally unclustered.

 The answer to that question in your example is "very likely", so there
 would be no problem.
 Is there a reason we don't do this?
I've been tempted to do things like

   update pg_statistic set stanumbers3='{1.0}' where starelid=2617 and
staattnum=7;

after every analyze when I have data like this from tables clustered
by zip.  Seems it'd help more plans than it hurts, but haven't been
brave enough to try in production.
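
One caveat with that hack: the correlation lives in stanumbers3 only
when the third slot happens to hold it, which ANALYZE does not formally
guarantee. A slightly safer version (an untested sketch; the table name
is illustrative) would check the slot kind, where 3 is
STATISTIC_KIND_CORRELATION:

   update pg_statistic set stanumbers3 = '{1.0}'
    where starelid = 'myziptable'::regclass
      and staattnum = 7
      and stakind3 = 3;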




Re: [HACKERS] find_typedef alternative that works on mainstream systems

2008-01-28 Thread Bruce Momjian
Alvaro Herrera wrote:
 Alvaro Herrera wrote:
 
  objdump  -W $object_file | \
  awk '/DW_TAG_/ { grab=0 } /DW_TAG_typedef/ { grab=1 } /DW_AT_name/ { if 
  (grab) { print $0 } }' | \
  sed -e 's/^.*: \([^ ]*\)/\1/' | \
  sort | \
  uniq
 
 I oversimplified the awk line, causing some garbage to appear at the end
 :-(  The full awk line I am using is 
 
 awk '
 /^Contents of / { if (read) exit }
 /^The section / { if (read) exit }
 /^The section .debug_info contains:/ { read=1 }
 /DW_TAG_/ { grab=0 }
 /DW_TAG_typedef/ { grab=1 }
 /DW_AT_name/ { if (read && grab) { print $0 } }
 '

objdump errors out on the -W option here (returns -1), so we'll put
some conditional logic in tools/find_typedef and we should be fine.
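
Something along these lines, perhaps (untested; a sketch of the
fallback, not the final script):

   if objdump -W "$object_file" >/dev/null 2>&1
   then
       : # objdump understands -W: use the DWARF pipeline above
   else
       : # no -W support: fall back to find_typedef's existing method
   fi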

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] RFC: array_agg() per SQL:200n

2008-01-28 Thread Joe Conway

Jeff Davis wrote:

On Sun, 2008-01-27 at 22:11 -0800, Neil Conway wrote:

p. 564 discusses the required behavior. The result of array_agg() is an
array with one element per input value, sorted according to the optional
ORDER BY clause. NULL input values are included in the array, and the
result for an empty group is NULL, not an empty array. Note that per
page 66, I'd expect array values in the input to array_agg() not to be
flattened.


Should there be an inverse operator (a SRF, in this case) that returns a
set from an array?


Yes -- see UNNEST
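
For the archives, the SQL:200n draft spells the inverse as a table
function, along these lines (PostgreSQL does not implement it as of
8.3, so this is illustration only):

   SELECT x FROM UNNEST(ARRAY[1, 2, 3]) AS t(x);
   -- returns three rows: 1, 2, 3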

Joe



[HACKERS] Bogus cleanup code in GSSAPI/SSPI patch

2008-01-28 Thread Tom Lane
Hi Magnus,
  Would you take a look at the patch I just committed in fe-connect.c?

I found out today that PQreset() wasn't working on a GSSAPI connection,
because closePGconn hadn't been patched to clear out the GSSAPI state
(resulting in a "duplicate GSS authentication request" failure).  I think
I fixed that but it wouldn't be bad for someone who knows that code
better to double-check.  Also, I can't test the ENABLE_SSPI case,
so that needs to be looked at.

The test case I've been using is to "select 1;" in psql, then kill -9
the backend from another window, then "select 1;" again.  psql should
recover and reconnect successfully.
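
(To find the pid to kill, "select pg_backend_pid();" in the psql
session works; then kill -9 that pid from the other window.)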

regards, tom lane



Re: [HACKERS] RFC: array_agg() per SQL:200n

2008-01-28 Thread Hitoshi Harada
Yet another inverse function I wrote before, though it applies only to
one-dimensional arrays.

typedef struct _enuminfo
{
    ArrayType  *data;
    char       *ptr;
    int16       typlen;
    bool        typbyval;
    char        typalign;
} EnumInfo;

Datum
array_enum(PG_FUNCTION_ARGS)
{
    FuncCallContext *funcctx;
    MemoryContext    oldcontext;
    ArrayType       *input;
    EnumInfo        *info;
    Datum            result;

    if (SRF_IS_FIRSTCALL())
    {
        funcctx = SRF_FIRSTCALL_INIT();
        oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);

        input = PG_GETARG_ARRAYTYPE_P(0);
        if (ARR_NDIM(input) != 1)
            elog(ERROR, "array_enum() accepts only one dimension array.");

        funcctx->max_calls = ArrayGetNItems(ARR_NDIM(input),
                                            ARR_DIMS(input));

        info = (EnumInfo *) palloc0(sizeof(EnumInfo));
        info->data = (ArrayType *) PG_DETOAST_DATUM_COPY(input);
        info->ptr = ARR_DATA_PTR(info->data);
        get_typlenbyvalalign(info->data->elemtype,
                             &(info->typlen),
                             &(info->typbyval),
                             &(info->typalign));

        funcctx->user_fctx = info;

        MemoryContextSwitchTo(oldcontext);
    }

    funcctx = SRF_PERCALL_SETUP();
    info = funcctx->user_fctx;

    if (funcctx->call_cntr < funcctx->max_calls)
    {
        /* Get source element */
        result = fetch_att(info->ptr, info->typbyval, info->typlen);

        /* Advance to the next element, honoring length and alignment */
        info->ptr = att_addlength(info->ptr, info->typlen,
                                  PointerGetDatum(info->ptr));
        info->ptr = (char *) att_align(info->ptr, info->typalign);

        SRF_RETURN_NEXT(funcctx, result);
    }
    else
    {
        SRF_RETURN_DONE(funcctx);
    }
}
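
For reference, a possible catalog declaration and usage for the
function above -- the signature is my assumption, and it presumes the
usual PG_FUNCTION_INFO_V1(array_enum) declaration plus a compiled
loadable module:

    CREATE FUNCTION array_enum(anyarray)
        RETURNS SETOF anyelement
        AS '/path/to/array_enum', 'array_enum'  -- illustrative path
        LANGUAGE C IMMUTABLE STRICT;

    SELECT * FROM array_enum(ARRAY[1, 2, 3]);   -- three rows: 1, 2, 3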

Hitoshi Harada

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Joe Conway
 Sent: Tuesday, January 29, 2008 11:00 AM
 To: Jeff Davis
 Cc: Neil Conway; pgsql-hackers
 Subject: Re: [HACKERS] RFC: array_agg() per SQL:200n
 
 Jeff Davis wrote:
  On Sun, 2008-01-27 at 22:11 -0800, Neil Conway wrote:
  p. 564 discusses the required behavior. The result of array_agg() is an
  array with one element per input value, sorted according to the optional
  ORDER BY clause. NULL input values are included in the array, and the
  result for an empty group is NULL, not an empty array. Note that per
  page 66, I'd expect array values in the input to array_agg() not to be
  flattened.
 
  Should there be an inverse operator (a SRF, in this case) that returns a
  set from an array?
 
 Yes -- see UNNEST
 
 Joe
 




[HACKERS] Transition functions for SUM(::int2), SUM(::int4), SUM(::int8)

2008-01-28 Thread Caleb Welton
Is there any reason that int2_sum, int4_sum, and int8_sum are not marked as
being strict?  All the other transition functions for sum, and every other
built-in aggregate function, are marked as strict, as demonstrated with:

select x.proname, t.proname, t.proisstrict
from ((pg_aggregate a left join
   pg_proc x on (a.aggfnoid = x.oid)) left join
   pg_proc t on (a.aggtransfn = t.oid))
where not t.proisstrict;

 proname | proname  | proisstrict
-+--+-
 sum | int2_sum | f
 sum | int4_sum | f
 sum | int8_sum | f


select x.proname, t.proname, t.proisstrict
from ((pg_aggregate a left join
   pg_proc x on (a.aggfnoid = x.oid)) left join
   pg_proc t on (a.aggtransfn = t.oid))
where x.proname = 'sum';

 proname |   proname   | proisstrict
-+-+-
 sum | int8_sum| f
 sum | int4_sum| f
 sum | int2_sum| f
 sum | float4pl| t
 sum | float8pl| t
 sum | cash_pl | t
 sum | interval_pl | t
 sum | numeric_add | t
(8 rows)

Thanks,
  Caleb




Re: [HACKERS] Transition functions for SUM(::int2), SUM(::int4), SUM(::int8)

2008-01-28 Thread Tom Lane
Caleb Welton [EMAIL PROTECTED] writes:
 Is there any reason that int2_sum, int4_sum, and int8_sum are not marked as
 being strict?

They wouldn't work otherwise, because the transition datatypes aren't
the same as the inputs.
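
To spell out the reasoning: those three aggregates accumulate into a
transition state of a wider type (int8 for the int2/int4 cases, numeric
for int8_sum) and start from a NULL initval. With a strict transition
function and a NULL initval, the executor handles the first non-NULL
input by installing it directly as the new state, which is only legal
when the input type matches the transition type. An int4 datum is not a
valid int8 state, so these functions must be non-strict and handle a
NULL state themselves. The mismatch is visible in the catalog (a quick
sketch):

    select aggtransfn, aggtranstype::regtype, agginitval
      from pg_aggregate
     where aggfnoid = 'sum(int4)'::regprocedure;
    -- shows int4_sum with transition type bigint and a null agginitval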

regards, tom lane



Re: [HACKERS] autonomous transactions

2008-01-28 Thread Hans-Juergen Schoenig


On Jan 25, 2008, at 7:27 AM, Decibel! wrote:


On Wed, Jan 23, 2008 at 05:50:02PM -0500, Tom Lane wrote:

Simon Riggs [EMAIL PROTECTED] writes:
 From looking at how Oracle does them, autonomous transactions are
 completely independent of the transaction that originates them -- they
 take a new database snapshot. This means that uncommitted changes in
 the originating transaction are not visible to the autonomous
 transaction.



Oh! Recursion depth would need to be tested for as well. Nasty.


Seems like the cloning-a-session idea would be a possible implementation
path for these too.


Oracle has a feature where you can effectively save a session and return
to it. For example, if filling out a multi-page web form, you could save
state in the database between those calls. I'm assuming that they use
that capability for their autonomous transactions; save the current
session to the stack, clone it, run the autonomous transaction, then
restore the saved one.



If you want to use it for webforms you cannot just put it on the
stack - you'd have to put it in shared memory, because you don't know
if you will ever get the same database connection back from the pool.

Personally, I like Marko's idea. If a snapshot was identified by a key,
it would be perfect. We could present the saved snapshots as a nice
superuser-readable system view (similar to what we do for 2PC).

The only thing I would do is to give those snapshots some sort of
timeout (configurable); otherwise we will get countless VACUUM-related
reports.

This sounds like a very cool feature - definitely useful.

Many thanks,

hans

--
Cybertec Schönig & Schönig GmbH
PostgreSQL Solutions and Support
Gröhrmühlgasse 26, 2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at