Re: [HACKERS] Assertion failure on hot standby

2010-11-27 Thread Simon Riggs
On Fri, 2010-11-26 at 18:35 -0500, Tom Lane wrote:
 Speaking of which, is there any code in there to ensure that a
 deadlock
 in the standby is resolved by killing HS queries and not the replay
 process?  Because deadlocks are certainly going to be possible no
 matter
 what. 

Locks cause query conflicts, so any attempt to take a lock that is
blocked will be resolved before a deadlock takes place. So that
situation does not arise in the startup process. 

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Patch BUG #5103: pg_ctl -w (re)start fails with custom unix_socket_directory

2010-11-27 Thread Quan Zongliang
Hi, all

I created a pg_ctl patch to fix:
* BUG #5103: pg_ctl -w (re)start fails with custom unix_socket_directory 
Allow pg_ctl to work properly with configuration files located outside the 
PGDATA directory

I tested it under Windows XP sp3.
All of the configuration files (postgresql.conf, pg_hba.conf, pg_ident.conf) are in
c:\data,
and the data directory is C:\Program Files\PostgreSQL\9.0\data

Please check the attachment.

Another question: after cloning the source with git I cannot compile it:
Bad format filename 'src\bin\scripts\submake-libpq'
The makefile rule used to look like: createdb: createdb.o ... keywords.o
Now it is: createdb: createdb.o ... keywords.o | submake-libpq
How do I handle this?
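
For reference, the "|" in that rule is GNU make's order-only prerequisite syntax
(GNU make 3.80 and later); a make that does not understand it can fail in this
way. A minimal sketch of the syntax, reusing the names from the quoted rule
(the recipe line is purely illustrative):

# GNU make: everything after "|" is an order-only prerequisite.
# submake-libpq must be built first, but createdb is not considered
# out of date just because submake-libpq ran more recently.
createdb: createdb.o keywords.o | submake-libpq
	$(CC) $(CFLAGS) createdb.o keywords.o $(LIBS) -o $@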

--
SEARCHING JOB. I can work on C/C++.

Quan Zongliang quanzongli...@gmail.com
diff --git a/src/bin/pg_ctl/pg_ctl.c b/src/bin/pg_ctl/pg_ctl.c
old mode 100644
new mode 100755
index 14d36b5..c708ba8
--- a/src/bin/pg_ctl/pg_ctl.c
+++ b/src/bin/pg_ctl/pg_ctl.c
@@ -89,6 +89,8 @@ static char *register_username = NULL;
 static char *register_password = NULL;
 static char *argv0 = NULL;
 static bool allow_core_files = false;
+static char *pgconf_portstr = NULL;
+static char *pgconf_datadir = NULL;
 
 static void
 write_stderr(const char *fmt,...)
@@ -455,41 +457,7 @@ test_postmaster_connection(bool do_checkpoint)
 * for valid port settings.
 */
if (!*portstr)
-   {
-   char  **optlines;
-
-   optlines = readfile(conf_file);
-   if (optlines != NULL)
-   {
-   for (; *optlines != NULL; optlines++)
-   {
-   p = *optlines;
-
-   while (isspace((unsigned char) *p))
-   p++;
-   if (strncmp(p, "port", 4) != 0)
-   continue;
-   p += 4;
-   while (isspace((unsigned char) *p))
-   p++;
-   if (*p != '=')
-   continue;
-   p++;
-   /* advance past any whitespace/quoting */
-   while (isspace((unsigned char) *p) || *p == '\'' || *p == '"')
-   p++;
-   /* find end of value (not including any ending quote/comment!) */
-   q = p;
-   while (*q &&
-  !(isspace((unsigned char) *q) ||
-*q == '\'' || *q == '"' || *q == '#'))
-   q++;
-   /* and save the argument value */
-   strlcpy(portstr, p, Min((q - p) + 1, sizeof(portstr)));
-   /* keep looking, maybe there is another */
-   }
-   }
-   }
+   strlcpy(portstr, pgconf_portstr, Min(sizeof(pgconf_portstr)+1, sizeof(portstr)));
 
/* Check environment */
if (!*portstr && getenv("PGPORT") != NULL)
@@ -547,6 +515,75 @@ test_postmaster_connection(bool do_checkpoint)
 }
 
 
+static void
+read_conf_file(void)
+{
+   char**optlines;
+   char *p, *q;
+   bool isportnum, isdatadir, isquoted;
+
+   optlines = readfile(conf_file);
+   if (optlines == NULL)
+   return;
+
+   for (; *optlines != NULL; optlines++)
+   {
+   p = *optlines;
+
+   while (isspace((unsigned char) *p))
+   p++;
+
+   if (strncmp(p, "port", 4) == 0)
+   {
+   isportnum = true;
+   p += 4;
+   }
+   else if (strncmp(p, "data_directory", 14) == 0)
+   {
+   isdatadir = true;
+   p += 14;
+   }
+   else
+   continue;
+
+   while (isspace((unsigned char) *p))
+   p++;
+   if (*p != '=')
+   continue;
+   p++;
+   isquoted = false;
+   /* advance past any whitespace/quoting */
+   while (isspace((unsigned char) *p) || *p == '\'' || *p == '"')
+   {
+   if (*p == '\'' || *p == '"')
+   isquoted = true;
+   p++;
+   }
+   /* find end of value (not including any ending quote/comment!) */
+   q = p;
+   while (*q &&
+  !((!isquoted && isspace((unsigned char) *q)) ||
+*q == '\'' || *q == '"' || *q == '#'))
+   q++;
+   /* and save the argument value 

Re: [HACKERS] Assertion failure on hot standby

2010-11-27 Thread Robert Haas
On Fri, Nov 26, 2010 at 7:27 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 In fact, now that I think about it, what I'm proposing would actually
 substantially REDUCE the risk of deadlock on the standby, because the
 master would only ever need to lock a backing file long enough to drop
 or truncate it, whereas under the present system the startup process
 might need to hold many locks at once.

 Now you're the one spouting nonsense.  Consider a master transaction
 that does

        begin;
        lock table foo;
        alter table foo --- some rewriting table operation;
        alter table foo --- some rewriting table operation;
        alter table foo --- some rewriting table operation;
        alter table foo --- some rewriting table operation;
        alter table foo --- some rewriting table operation;
        alter table foo --- some rewriting table operation;
        alter table foo --- some rewriting table operation;
        commit;

 On the master, no other transaction can see the intermediate states.
 We don't want that property to disappear on the slave.

And why would it?  Which states are visible is something that gets
controlled by MVCC, not by the order in which we remove backing files.

It's possible that to make what I'm describing work correctly, you
need to acquire the locks and then drop all the backing files, rather
than interleaving those operations.  I'm not sure about that.  But
acquiring them at the same point in WAL that master did seems quite
pointless - AFAICS, you lose nothing by postponing until commit time.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] function(contants) evaluated for every row

2010-11-27 Thread Robert Haas
2010/11/25 pasman pasmański pasma...@gmail.com:
I've seen this as well be a performance issue, in particular with partitioned 
tables.
Out of habit I now write functions that always cache the value of the 
function in
a variable and use the variable in the actual query to avoid this particular 
gotcha.

 subquery may be used to cache constants:

 SELECT a_date
  FROM test
  WHERE a_date=(SELECT current_date);


 Seq Scan on test1  (cost=0.01..10.76 rows=5 width=4)
   Filter: (a_date = $0)
   InitPlan 1 (returns $0)
     ->  Result  (cost=0.00..0.01 rows=1 width=0)

Interesting.  So we pull a subquery out and treat it as an initplan,
but not a stable function?  Hmm...
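
For reference, the "cache it in a variable" habit mentioned at the top of the
quote looks roughly like this in PL/pgSQL; the function name is made up for
illustration, while the table and column come from the example above:

-- Hypothetical sketch: evaluate the stable function once, then reuse it.
CREATE OR REPLACE FUNCTION rows_for_today() RETURNS SETOF test AS $$
DECLARE
    today date := current_date;        -- evaluated once per call
BEGIN
    RETURN QUERY SELECT * FROM test WHERE a_date = today;
END;
$$ LANGUAGE plpgsql;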

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] contrib: auth_delay module

2010-11-27 Thread Robert Haas
On Thu, Nov 25, 2010 at 1:18 AM, KaiGai Kohei kai...@ak.jp.nec.com wrote:
 The attached patch is revised version.

 - Logging part within auth_delay was removed. This module now focuses on
  injection of a few seconds delay on authentication failed.
 - Documentation parts were added like any other contrib modules.

Committed, with a few adjustments.  Per Fujii Masao's suggestion, I
changed sleep() to pg_usleep(); I also changed the GUC to be reckoned
in milliseconds rather than seconds.
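
For anyone trying it out, enabling the committed module looks roughly like this
in postgresql.conf (a sketch; it assumes the GUC ended up being named
auth_delay.milliseconds):

# load the module at server start (assumed module name)
shared_preload_libraries = 'auth_delay'
# pause for half a second after each authentication failure
auth_delay.milliseconds = 500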

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Favorable i--)

2010-11-27 Thread Nikhil Sontakke
Hi, a very good business company, I suggest you go to see:
mobiles3gs.com, and now all of products enjoy great discount, saving
time and money.

 h--)

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Favorable i--)

2010-11-27 Thread Manuel Sugawara
On Saturday, November 27, 2010, Nikhil Sontakke nikkh...@gmail.com wrote:
 Hi, a very good business company, I suggest you go to see:
 mobiles3gs.com, and now all of products enjoy great discount, saving
 time and money.

  h--)

 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [BUGS] BUG #5650: Postgres service showing as stopped when in fact it is running

2010-11-27 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  Tom Lane wrote:
  PQping is supposed to be smarter about classifying errors
  than this.
 
  I was not aware this was discussed last week because I am behind on
  email.  I was fixing a report from a month ago.  I did explain how I was
  doing the tests.
 
 Um, you did respond in that thread, several times even:
 http://archives.postgresql.org/pgsql-hackers/2010-11/msg01102.php
 so I kind of assumed that the patch you presented this week did
 what was agreed to last week.

Yes, I do remember that, but I remember this:

http://archives.postgresql.org/pgsql-hackers/2010-11/msg01095.php

What we want here is to check the result of postmaster.c's
canAcceptConnections(),

and this:

http://archives.postgresql.org/pgsql-hackers/2010-11/msg01106.php

You do have to distinguish connection failures (ie connection refused)
from errors that came back from the postmaster, and the easiest place to
be doing that is inside libpq.

which I thought meant it had to be done in libpq and we didn't have
access to the postmaster return codes in libpq.

Your changes look very good, and not something I would have been able to
code.

 I have committed a patch to make PQping do what was agreed to.

Thanks.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] changing MyDatabaseId

2010-11-27 Thread Bruce Momjian
Alvaro Herrera wrote:
 Excerpts from Tom Lane's message of lun nov 15 02:41:40 -0300 2010:
 
  I believe also that there are probably race conditions in several of
  the steps you listed; in particular there is certainly a risk involved
  in changing the database-we-advertise-being-connected-to versus a
  concurrent DROP DATABASE.  Maybe that's easily soluble, but I'm not sure
  about it.
 
 Perhaps it would help to first disconnect from the original database and
 not reconnect to any one in particular, i.e. stay in a state like the
 autovac launcher is in.  Then cleanup as per Robert's list, then switch
 to a valid database ID.

One question is whether we keep our caches in the hope we will continue
with the same database, or clear our caches so we don't have to incur
the overhead of clearing them when we connect a new session.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Horizontal Write Scaling

2010-11-27 Thread Eliot Gable
Thanks, everyone, for all the feedback! I am nowhere near a database expert
yet, but you guys have been very helpful in clearing up some of my
confusion. I have checked out Postgres-XC and it looks like the version 1.0
that will be released soon probably covers everything I have been looking
for in terms of Postgres capabilities. The big ones are write scaling, read
scaling, consistent view of data between each server, and HA capabilities.
Last time I looked at Postgres-XC was probably a year ago, and it was
nowhere close to what I was looking for at the time, and I forgot all about
it. Now, it looks like a real contender.

I was aware of Postgres-R and was actually thinking I might be able to get
away with using that, but the project I am working on does a substantial
amount of writing, as well as being CPU intensive. Each query executes a
stored procedure which is about 2,500 lines long and pulls data from about
80 tables to compute a final result set. That final result set is returned
to the requester, and is also written into 3 tables (while still inside the
original transaction). One of those tables gets one row while the other two
get 6 - 15 rows per query. I execute hundreds of these queries per second.
So, I need to be able to spread the load across multiple boxes due to CPU
usage, but still have a consistent view of the written data. Using
conventional drives, I would saturate the disk I/O pretty quickly on
commodity hardware. With normal multi-master replication, the cost of making
sure I have enough disk I/O on each server is way more than I have the
budget for. With a write scaling solution, it suddenly looks affordable. I
was looking at maybe getting a single shared RAID array with some
enterprise-class SSDs that could guarantee writes even during a power
failure. I was hoping I could find something that would let multiple
Postgres instances share that disk array as it would be more cost effective
to get both the CPU power and Disk I/O I needed than sticking such a RAID
array in each and every server I was going to spread load across.
Postgres-XC actually makes it look even more affordable, as I now probably
no longer need to consider SSDs, or at least I don't need to consider a RAID
10 array of 4 or more SSDs per box. I can probably do RAID 1 with 2 drives
per box and have plenty of Disk I/O available for the amount of CPU power I
would have in the boxes.

So, thanks again for the feedback.


Re: [HACKERS] Amazon now supporting GPU focused EC2 instances

2010-11-27 Thread Bruce Momjian
Jeroen Vermeulen wrote:
 On 2010-11-15 18:49, Greg Stark wrote:
 
  I've seen papers on doing relational joins using GPUs and I'm sure
 there are other algorithms and wonderful stuff we could do. But if it
  comes at the cost of being able to handle arbitrary join clauses it'll
  be a tough sacrifice to make.
 
 Perhaps the coolest use of all is as an intermediate filtering stage for 
 spatial joins, using collision detection.  Draw your data and your 
 search region (slightly enlarged) as objects and ask the graphics card 
if the search region collides with anything.  Much like it might ask "is
this player character bumping into any walls?"

Netezza has FPGAs (Field-Programmable Gate Arrays) for each disk drive
that performs a similar function:

http://www.dbms2.com/2009/08/08/netezza-fpga/

The longer answer is:

* Projections
* Restrictions/selections
* Visibility, which for now seems to mean recognizing which rows are
and aren't valid under Netezza's form of MVCC (MultiVersion Concurrency
Control).
* Compression and/or decompression (I'm a little confused as to
which, but I imagine it's both)
* Netezza's form of UDFs (User-Defined Functions)

More details:


http://www.theregister.co.uk/2006/10/03/netezza_annual_conference_roundup/

Another area in which Netezza has been hiding its light under a bushel
is in the matter of FPGAs (field programmable gate arrays). FPGAs are
used to process data as it is streamed off disk. Note that this is
important to understand. Most data warehouse appliances (and, indeed,
conventional products) use a caching architecture whereby data is read
from disk and then held in cache for processing. Netezza, on the other
hand, uses an approach that queries the data as it comes off disk before
passing the results on to memory. In other words it uses a streaming
architecture in which the data is streamed through the queries (whose
programs have been loaded into the FPGA) rather than being stored (even
if in memory) and then queried.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] autovacuum maintenance_work_mem

2010-11-27 Thread Bruce Momjian
Josh Berkus wrote:
 
  I think the difficulty is figuring out how to get the existing
  workers to give us some memory when a new one comes along.  You want
  the first worker to potentially use ALL the memory... until worker #2
  arrives.
 
 Yeah, doing this would mean that you couldn't give worker #1 all the
 memory, because on most OSes it can't release the memory even if it
 wants to.

FYI, what normally happens in this case is that the memory is pushed to
swap by the kernel and never paged in from swap.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER OBJECT any_name SET SCHEMA name

2010-11-27 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 Committed, after various changes and corrections.  One noteworthy one
 is that I removed the _oid variants, since those would be dead code at
 the moment.

Thanks!

The _oid variants will have to re-appear in the alter extension set
schema patch, which is the last of the series. Meanwhile, I will have
to merge head with the current extension patch (already overdue for a
new version, waiting on purpose) so that it no longer includes the
cfparser and execute from file patches too (which have changed a lot
underneath it already).

I'm not sure there's lots of precedent for dealing with in-commitfest
patches dependencies, so here's a summary of what I think would ideally
happen next (ordering counts):

 1. cfparser
https://commitfest.postgresql.org/action/patch_view?id=413
ready for committer

 2. pg_execute_from_file
https://commitfest.postgresql.org/action/patch_view?id=414
needs another review and maybe some more documentation

 3. extensions
https://commitfest.postgresql.org/action/patch_view?id=404
needs review and minor updating
will need another version after merging the two previous patches

 4. alter extension set schema
https://commitfest.postgresql.org/action/patch_view?id=416
needs review and a reviewer
will need bitrot fix (and adding in the _oid variants)
would be better to adjust once the 3 previous are in

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Report: Linux huge pages with Postgres

2010-11-27 Thread Tom Lane
We've gotten a few inquiries about whether Postgres can use huge pages
under Linux.  In principle that should be more efficient for large shmem
regions, since fewer TLB entries are needed to support the address
space.  I spent a bit of time today looking into what that would take.
My testing was done with current Fedora 13, kernel version
2.6.34.7-61.fc13.x86_64 --- it's possible some of these details vary
across other kernel versions.

You can test this with fairly minimal code changes, as illustrated in
the attached not-production-grade patch.  To select huge pages we have
to include SHM_HUGETLB in the flags for shmget(), and we have to be
prepared for failure (due to permissions or lack of allocated
hugepages).  I made the code just fall back to a normal shmget on
failure.  A bigger problem is that the shmem request size must be a
multiple of the system's hugepage size, which is *not* a constant
even though the test patch just uses 2MB as the assumed value.  For a
production-grade patch we'd have to scrounge the active value out of
someplace in the /proc filesystem (ick).
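
For what it's worth, a rough sketch of that scrounging (parsing Hugepagesize
out of /proc/meminfo; this is only an illustration, not part of the attached
patch):

#include <stdio.h>
#include <string.h>

/* Return the kernel's huge page size in bytes, or 0 if it can't be found.
 * Parses the "Hugepagesize:    2048 kB" line from /proc/meminfo. */
static size_t
get_huge_page_size(void)
{
	FILE	   *f = fopen("/proc/meminfo", "r");
	char		line[128];
	size_t		result = 0;

	if (f == NULL)
		return 0;
	while (fgets(line, sizeof(line), f) != NULL)
	{
		unsigned long kb;

		if (sscanf(line, "Hugepagesize: %lu kB", &kb) == 1)
		{
			result = (size_t) kb * 1024;
			break;
		}
	}
	fclose(f);
	return result;
}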

In addition to the code changes there are a couple of sysadmin
requirements to make huge pages available to Postgres:

1. You have to configure the Postgres user as a member of the group
that's permitted to allocate hugepage shared memory.  I did this:
sudo sh -c "id -g postgres >/proc/sys/vm/hugetlb_shm_group"
For production use you'd need to put this in the PG initscript,
probably, to ensure it gets re-set after every reboot and before PG
is started.

2. You have to manually allocate some huge pages --- there doesn't
seem to be any setting that says "just give them out on demand".
I did this:
sudo sh -c "echo 600 >/proc/sys/vm/nr_hugepages"
which gave me a bit over 1GB of space reserved as huge pages.
Again, this'd have to be done over again at each system boot.

For testing purposes, I figured that what I wanted to stress was
postgres process swapping and shmem access.  I built current git HEAD
with --enable-debug and no other options, and tested with these
non-default settings:
	shared_buffers          1GB
	checkpoint_segments     50
	fsync                   off
(fsync intentionally off since I'm not trying to measure disk speed).
The test machine has two dual-core Nehalem CPUs.  Test case is pgbench
at -s 25; I ran several iterations of "pgbench -c 10 -T 60 bench"
in each configuration.

And the bottom line is: if there's any performance benefit at all,
it's on the order of 1%.  The best result I got was about 3200 TPS
with hugepages, and about 3160 without.  The noise in these numbers
is more than 1% though.

This is discouraging; it certainly doesn't make me want to expend the
effort to develop a production patch.  However, perhaps someone else
can try to show a greater benefit under some other test conditions.

regards, tom lane

*** src/backend/port/sysv_shmem.c.orig	Wed Sep 22 18:57:31 2010
--- src/backend/port/sysv_shmem.c	Sat Nov 27 13:39:46 2010
***
*** 33,38 
--- 33,39 
  #include "miscadmin.h"
  #include "storage/ipc.h"
  #include "storage/pg_shmem.h"
+ #include "storage/shmem.h"
  
  
  typedef key_t IpcMemoryKey;		/* shared memory key passed to shmget(2) */
***
*** 75,80 
--- 76,92 
  	IpcMemoryId shmid;
  	void	   *memAddress;
  
+ #ifdef SHM_HUGETLB
+ 	/* request must be multiple of page size, else shmat() will fail */
+ #define HUGE_PAGE_SIZE (2 * 1024 * 1024)
+ 	size = add_size(size, HUGE_PAGE_SIZE - (size % HUGE_PAGE_SIZE));
+ 
+ 	shmid = shmget(memKey, size,
+    SHM_HUGETLB | IPC_CREAT | IPC_EXCL | IPCProtection);
+ 	if (shmid >= 0)
+ 		elog(LOG, "shmget with SHM_HUGETLB succeeded");
+ 	else
+ #endif
  	shmid = shmget(memKey, size, IPC_CREAT | IPC_EXCL | IPCProtection);
  
  	if (shmid < 0)

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST insert algorithm rewrite

2010-11-27 Thread Bruce Momjian
Heikki Linnakangas wrote:
 There's no on-disk format changes, except for the additional flag in the 
 page headers, so this does not affect pg_upgrade. However, if there's 
 any invalid keys in the old index because of an incomplete insertion, 
 the new code will not understand that. So you should run vacuum to 
 ensure that there's no such invalid keys in the index before upgrading. 
 Vacuum will print a message in the log if it finds any, and you will 
 have to reindex. But that's what it suggests you do anyway.

OK, pg_upgrade has code to report invalid gin and hash indexes because
of changes between PG 8.3 and 8.4.  Is this something we would do for
9.0 to 9.1?

You are saying it would have to be run before the upgrade.  Can it not
be run after?

I can output a script to VACUUM all such indexes, and tell users to
manually REINDEX any index that generates a warning message.  I don't
have any way to automate an optional REINDEX step.


-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] contrib: auth_delay module

2010-11-27 Thread Jeff Janes
On Thu, Nov 4, 2010 at 6:35 AM, Stephen Frost sfr...@snowman.net wrote:
 * Jan Urbański (wulc...@wulczer.org) wrote:
 On 04/11/10 14:09, Robert Haas wrote:
  Hmm, I wonder how useful this is given that restriction.

 As KaiGai mentioned, it's more to make brute-forcing difficult (read: time
 consuming), right?

 Which it would still do, since the attacker would be bumping up against
 max_connections.  max_connections would be a DOS point, but that's no
 different from today.

I haven't thought of a way to test this, so I guess I'll just ask.
If the attacking client just waits a few milliseconds for a response
and then drops the socket, opening a new one, will the server-side
walking-dead process continue to be charged against max_connections
until its sleep expires?

Cheers,

Jeff

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Remove outdated comments from the regression test files.

2010-11-27 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 Robert Haas wrote:
 Remove outdated comments from the regression test files.
 
 Since 2004, int2 and int4 operators do detect overflow; this was fixed by
 commit 4171bb869f234281a13bb862d3b1e577bf336242.
 
 Extracted from a larger patch by Andres Freund.

 I noticed with this commit that we are referencing pre-git-conversion
 git branches, basically adding a dependency on git to our commit
 messages.  I don't see a problem with this, but did we ever reference
 CVS details in CVS commits?  I don't remember any.

I've usually preferred to use a date, eg, "my patch of 2009-10-07",
when referring to previous patches in commit messages.  I think people
have occasionally mentioned CVS revision IDs, but the folly of that
should now be obvious.  I agree that reference to a git hash is way
way way too fragile and git-centric.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [COMMITTERS] pgsql: Remove outdated comments from the regression test files.

2010-11-27 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  Robert Haas wrote:
  Remove outdated comments from the regression test files.
  
  Since 2004, int2 and int4 operators do detect overflow; this was fixed by
  commit 4171bb869f234281a13bb862d3b1e577bf336242.
  
  Extracted from a larger patch by Andres Freund.
 
  I noticed with this commit that we are referencing pre-git-conversion
  git branches, basically adding a dependency on git to our commit
  messages.  I don't see a problem with this, but did we ever reference
  CVS details in CVS commits?  I don't remember any.
 
  I've usually preferred to use a date, eg, "my patch of 2009-10-07",
 when referring to previous patches in commit messages.  I think people
 have occasionally mentioned CVS revision IDs, but the folly of that
 should now be obvious.  I agree that reference to a git hash is way
 way way too fragile and git-centric.

Who's going to be the first to say that being git-centric can't ever be
a bad thing?  ;-)

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Remove outdated comments from the regression test files.

2010-11-27 Thread Robert Haas
On Nov 27, 2010, at 2:49 PM, Bruce Momjian br...@momjian.us wrote:
 Who's going to be the first to say that being git-centric can't ever be
 a bad thing?  ;-)

At least for me, referring to it that way makes finding the original patch an 
order of magnitude faster (git show <hash>).  YMMV.
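
To make the comparison concrete (reusing the hash and the date already
mentioned in these threads):

# with a hash in the commit message, it's one command:
git show 4171bb869f234281a13bb862d3b1e577bf336242

# with only a date and author, you end up searching the log instead:
git log --since=2009-10-06 --until=2009-10-08 --author='Tom Lane'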

...Robert
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Remove outdated comments from the regression test files.

2010-11-27 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Nov 27, 2010, at 2:49 PM, Bruce Momjian br...@momjian.us wrote:
 Who's going to be the first to say that being git-centric can't ever be
 a bad thing?  ;-)

 At least for me, referring to it that way makes finding the original patch an 
 order of magnitude faster (git show hash).  YMMV.

[ shrug... ]  You need to take the long view here.  We're not working on
the assumption that git is the last SCM this project will ever use.
Even granting that it is, I don't think git hashes are adequately stable;
loading the code into a different repository would likely result in new
hashes.  And AFAIK there is no mechanism that would fix hash references
embedded in commit log messages (or the code).

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Small documentation enhancement for default_tablespace

2010-11-27 Thread Andreas 'ads' Scherbaum


Hello,

last week someone raised in a German web forum the question why 
default_tablespace is not used for CREATE DATABASE. After a brief 
discussion on irc (thanks RhodiumToad) I added a note to the 
documentation mentioning this point.


Patch attached.


Bye

--
Andreas 'ads' Scherbaum
German PostgreSQL User Group
European PostgreSQL User Group - Board of Directors
Volunteer Regional Contact, Germany - PostgreSQL Project
*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
***
*** 4357,4362  COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
--- 4357,4368 
 </para>
  
 <para>
+ This variable is also not used for creating databases. Tablespace
+ information are copied from the template database (usually
+ <literal>template1</literal>) instead.
+</para>
+ 
+<para>
  For more information on tablespaces,
  see <xref linkend="manage-ag-tablespaces">.
 </para>

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Small documentation enhancement for default_tablespace

2010-11-27 Thread Tom Lane
Andreas 'ads' Scherbaum adsm...@wars-nicht.de writes:
 last week someone raised in a German web forum the question why 
 default_tablespace is not used for CREATE DATABASE. After a brief 
 discussion on irc (thanks RhodiumToad) I added a note to the 
 documentation mentioning this point.

 Patch attached.

Personally I'd have said the description of default_tablespace was
clear enough about this already, but I added the following somewhat
revised text.

   <para>
This variable is also not used when creating databases.
By default, a new database inherits its tablespace setting from
the template database it is copied from.
   </para>


regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Performance tests with parallel gmake

2010-11-27 Thread Bruce Momjian
I ran some tests on my dual-Xeon machine to see what speedup using
parallel gmake.  After two builds to warm up the cache, I saw when
alternating gmake and gmake -j2:

gmake   0m57.207s
gmake -j2   0m44.305s
gmake   0m58.691s
gmake -j2   0m33.979s

As you can see, gmake -j2 is ~35% faster than gmake without -j. I will
be using -j from now on.  -j3 shaves another second off.

FYI, I also use ccache for compiles.
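
The runs above amount to something like this (a sketch; ccache is assumed to be
installed and on the PATH):

./configure CC='ccache gcc'
time gmake -s            # serial build
gmake -s clean
time gmake -s -j2        # two jobs in parallel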

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] What do these terms mean in the SOURCE CODE?

2010-11-27 Thread Bruce Momjian
Vaibhav Kaushal wrote:
 I am going through the Executor code and come across the following terms
 quite often. Can someone tell me what do they mean (in a few (may be a
 couple of) sentences)?
 
 1. Scan State
 2. Plan State
 3. Tuple Projection
 4. EState
 5. Qual
 6. Expression
 
 They sound quite ambiguous in the source code, especially when some of them
 already have terms which have multiple meanings.

Have you read src/backend/executor/README?  It is hard to explain what
these mean without understanding the executor.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] PLy_malloc and plperl mallocs

2010-11-27 Thread Jan Urbański
I noticed that PL/Python uses a simple wrapper around malloc that does
ereport(FATAL) if malloc returns NULL. I find it a bit harsh, don't we
normally do ERROR if we run out of memory?

And while looking at how PL/Perl does these things I find that one
failed malloc (in compile_plperl_function) throws an ERROR, and the rest
(in plperl_spi_prepare) are simply unguarded...

I guess PL/Python should stop throwing FATAL errors and PL/Perl should
get its own malloc_or_ERROR helper and start using that.

Cheers,
Jan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] s/LABEL/VALUE/ for ENUMs

2010-11-27 Thread Bruce Momjian
Josh Berkus wrote:
 All,
 
 Whatever we pick, someone will be confused by it, and in about equal numbers
 regardless.  Let's just stick with the current patch.
 
 Or we could call it extraint conclusions.  ;-)

I vote for extraint confusions.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Rethinking representation of sort/hash semantics in queries and plans

2010-11-27 Thread Tom Lane
In recent discussions of the plan-tree representation for KNNGIST index
scans, there seemed to be agreement that it'd be a good idea to explicitly
represent the expected sort ordering of the output.  While thinking about
how best to do that, it struck me that there's some pretty horrendous
impedance mismatches in the way we do things now.  Different parts of the
system use two different representations of sort ordering:

* A sort operator (which can have < or > semantics) plus nulls-first flag

* A btree opfamily plus direction and nulls-first flags

Getting from one of these representations to the other is not exactly
cheap, as it requires one or more syscache lookups.  But consider what
happens when we process a simple SELECT ... ORDER BY query to produce
a Sort plan:

1. The parser generates a SortGroupClause, which contains the
sort-operator representation.  This involves looking up the default btree
opclass for the ORDER BY column's datatype, then finding the  or  member
of the opclass.  (These lookups are buffered by the typcache, but we'll
still have to do them at least once per session.)  If you use ORDER BY
USING then you might think it's cheaper ... but we still do a lookup to
verify that the operator is in some btree family.

2. The planner generates a PathKey, which is based on the opfamily
representation, so we have to do get_ordering_op_properties to go back the
other way.

3. If a sort plan is chosen, createplan.c uses get_opfamily_member to go
from the PathKey representation back to sort-operator representation,
because the Sort plan node type stores sort operators.

4. At runtime, tuplesort_begin_heap needs the comparison function for the
sort operator, so it calls get_ordering_op_properties *again* to re-derive
the opfamily, from which it can extract the right support function using
get_opfamily_proc.
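
To make the round trips concrete, here is roughly what those lookups amount to
at the catalog level (a hand-written SQL sketch of the sortop -> opfamily ->
support function chain, not the actual C code path):

-- sort operator "<" on int4  ->  btree opfamily  ->  comparison support proc
SELECT o.amopfamily              AS opfamily,
       o.amoplefttype::regtype   AS input_type,
       o.amopstrategy            AS strategy,       -- 1 = "<", 5 = ">"
       p.amproc                  AS compare_proc
FROM   pg_amop o
JOIN   pg_amproc p ON p.amprocfamily    = o.amopfamily
                  AND p.amproclefttype  = o.amoplefttype
                  AND p.amprocrighttype = o.amoprighttype
                  AND p.amprocnum = 1               -- btree comparison function
WHERE  o.amopopr    = '<(int4,int4)'::regoperator
AND    o.amopmethod = (SELECT oid FROM pg_am WHERE amname = 'btree');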

Things are only marginally less comical if an indexscan plan is chosen,
and that's only because the IndexScan plan node doesn't contain any
explicit representation of the output sort order.  If we were to solve the
KNNGIST issue by installing sort operator lists in IndexScan, it'd be
about as ugly as this.  (For some extra amusement, trace through where
build_index_pathkeys' data comes from...)

I have not traced through the behavior for hash-based plans as carefully,
but I believe that there are a similar number of conversions between
operator OID and hash opfamily OID representations.

We got into this mess by revising the planner to use opfamily OIDs to
define sort/hash semantics without changing the structures that are its
input and output APIs.  I think it's probably time to fix this.  I don't
have any evidence at the moment about what fraction of SearchSysCache load
is coming from these repeated conversions, but it can't be trivial.  And
quite aside from any performance benefits, it would be conceptually
cleaner to have only one representation of sort semantics not two.

If you look closely at what we're doing with sort operators
(get_ordering_op_properties pretty much encapsulates this), it turns out
that a sort operator is shorthand for three pieces of information:

1. btree opfamily OID
2. specific input datatype for the opfamily
3. ascending or descending direction

So to fix these problems we'd need to replace sort operator OIDs in
SortGroupClause and plan nodes with those three items.  Obviously, this
would be slightly bulkier, but the extra cost added to copying parse and
plan trees should be tiny compared to the avoided syscache lookups.
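
Spelled out as a data structure, the replacement for a single sortop OID might
look roughly like this (a hypothetical sketch; the struct and field names are
invented here, purely to illustrate the three items above plus the existing
nulls-first flag):

/* Hypothetical sketch -- not an actual PostgreSQL struct. */
typedef struct SortSemantics
{
	Oid			sortFamily;		/* btree opfamily OID defining the ordering */
	Oid			sortType;		/* specific input datatype for the opfamily */
	bool		sortDesc;		/* ascending or descending direction */
	bool		nullsFirst;		/* do NULLs sort before non-NULLs? */
} SortSemantics;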

A possible compromise is to avoid storing the specific input datatype.
In most cases it's the same as the column datatype or expression datatype
that we're feeding to the sort operation, which I believe we can always
get from somewhere else in the plan.  However, it can be different in
binary-compatibility cases (such as feeding a specific array type to
anyarray_ops, or varchar to text_ops).  If we don't store it then we'd
need to add extra complexity in get_opfamily_proc and similar places
to search for a binary-compatible member function or operator.  This
extra cost would be paid only when actually using binary-compatible
cases, but it still seems like it'd be better to pay a little extra
storage to avoid it.

In a similar fashion, I think that the eqop member of SortGroupClause
should be replaced by a hash opfamily OID and specific input datatype
(no direction needed here, of course), and likewise for hash operator
OIDs in hash-based plan node types.

A possible objection to this idea is that instead of having stored rules
and views depending on specific operators, they'd now be shown as
depending on specific opfamilies.  If you were to drop the relevant
operators then you'd get failures at runtime because no suitable member
operator or function could be found.  However, with the current scheme
it's possible to drop the opfamily that makes the use of a particular operator
valid in a given query context; so you can still get a failure, though 

Re: [HACKERS] Report: Linux huge pages with Postgres

2010-11-27 Thread Robert Haas
On Sat, Nov 27, 2010 at 2:27 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 For testing purposes, I figured that what I wanted to stress was
 postgres process swapping and shmem access.  I built current git HEAD
 with --enable-debug and no other options, and tested with these
 non-default settings:
  shared_buffers         1GB
  checkpoint_segments    50
  fsync                  off
 (fsync intentionally off since I'm not trying to measure disk speed).
 The test machine has two dual-core Nehalem CPUs.  Test case is pgbench
 at -s 25; I ran several iterations of "pgbench -c 10 -T 60 bench"
 in each configuration.

 And the bottom line is: if there's any performance benefit at all,
 it's on the order of 1%.  The best result I got was about 3200 TPS
 with hugepages, and about 3160 without.  The noise in these numbers
 is more than 1% though.

 This is discouraging; it certainly doesn't make me want to expend the
 effort to develop a production patch.  However, perhaps someone else
 can try to show a greater benefit under some other test conditions.

Hmm.  Presumably in order to see a large benefit, you would need to
have shared_buffers set large enough to thrash the TLB.  I have no
idea how big TLBs on modern systems are, but it'd be interesting to
test this on a big machine with 8GB of shared buffers.
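
For a sense of scale (simple arithmetic, not a measurement), the number of page
mappings needed to cover 8GB of shared buffers:

echo $(( 8 * 1024 * 1024 / 4 ))   # 4kB pages:     2097152 mappings
echo $(( 8 * 1024 / 2 ))          # 2MB hugepages:    4096 mappings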

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PLy_malloc and plperl mallocs

2010-11-27 Thread Tom Lane
Jan Urbański wulc...@wulczer.org writes:
 I noticed that PL/Python uses a simple wrapper around malloc that does
 ereport(FATAL) if malloc returns NULL. I find it a bit harsh, don't we
 normally do ERROR if we run out of memory?

 And while looking at how PL/Perl does these things I find that one
 failed malloc (in compile_plperl_function) throws an ERROR, and the rest
 (in plperl_spi_prepare) are simply unguarded...

 I guess PL/Python should stop throwing FATAL errors and PL/Perl should
 get its own malloc_or_ERROR helper and start using that.

The real question is why they're not using palloc instead.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] profiling connection overhead

2010-11-27 Thread Bruce Momjian
Robert Haas wrote:
  In fact, it wouldn't be that hard to relax the "known at compile time"
  constraint either.  We could just declare:
 
  char lotsa_zero_bytes[NUM_ZERO_BYTES_WE_NEED];
 
  ...and then peel off chunks.
  Won't this just cause loads of additional pagefaults after fork() when those
  pages are used the first time and then a second time when first written to
  (to copy it)?
 
 Aren't we incurring those page faults anyway, for whatever memory
 palloc is handing out?  The heap is no different from bss; we just
 move the pointer with sbrk().

Here is perhaps more detail than you wanted, but ...

Basically in a forked process, the text/program is fixed, and the
initialized data and stack are copy on write (COW).  Allocating a big
block of zero memory in data is uninitialized data, and the behavior there
differs depending on whether the parent process faulted in those pages. 
If it did, then they are COW, but if it did not, odds are the OS just
gives them to you clean and not shared.  The pages have to be empty
because if it gave you anything else it could be giving you data from
another process.  For details, see
http://docs.hp.com/en/5965-4641/ch01s11.html, "Faulting In a Page of
Stack or Uninitialized Data."

As far as sbrk(), those pages are zero-filled also, again for security
reasons.  You have to clear malloc()'ed memory (or call calloc()) not
because the OS gave you dirty pages but because you might be using
memory that you previously freed.  If you have never freed memory (and
the postmaster/parent has not either), I bet all malloc'ed memory would
be zero-filled.
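
A toy program to poke at this (a sketch for experimentation only; the array
name echoes the snippet quoted above):

#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Large uninitialized array: lives in BSS, costs nothing until touched. */
static char lotsa_zero_bytes[64 * 1024 * 1024];

int
main(void)
{
	pid_t		pid = fork();

	if (pid == 0)
	{
		/* Child: the first touch faults the pages in; per the explanation
		 * above, pages the parent never touched arrive zero-filled rather
		 * than being copy-on-write copies of parent pages. */
		memset(lotsa_zero_bytes, 0x7f, sizeof(lotsa_zero_bytes));
		printf("child touched %zu bytes\n", sizeof(lotsa_zero_bytes));
		return 0;
	}
	waitpid(pid, NULL, 0);
	return 0;
}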

Not sure that information moves us forward.  If the postmaster cleared
the memory, we would have COW in the child and probably be even slower.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PLy_malloc and plperl mallocs

2010-11-27 Thread Andrew Dunstan



On 11/27/2010 10:28 PM, Tom Lane wrote:

Jan Urbański wulc...@wulczer.org  writes:

I noticed that PL/Python uses a simple wrapper around malloc that does
ereport(FATAL) if malloc returns NULL. I find it a bit harsh, don't we
normally do ERROR if we run out of memory?
And while looking at how PL/Perl does these things I find that one
failed malloc (in compile_plperl_function) throws an ERROR, and the rest
(in plperl_spi_prepare) are simply unguarded...
I guess PL/Python should stop throwing FATAL errors and PL/Perl should
get its own malloc_or_ERROR helper and start using that.

The real question is why they're not using palloc instead.




Well, the stuff in plperl_spi_prepare needs to be allocated in a 
long-lived context. We could use palloc in TopMemoryContext instead, I 
guess.
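
A sketch of what that could look like, using the existing backend
memory-context API (the helper name here is made up):

#include "postgres.h"
#include "utils/memutils.h"

/* Allocate a long-lived copy in TopMemoryContext instead of calling
 * malloc(); the result survives until explicitly pfree'd. */
static char *
longlived_copy(const char *src, Size len)
{
	char	   *dst = MemoryContextAlloc(TopMemoryContext, len);

	memcpy(dst, src, len);
	return dst;
}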


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [COMMITTERS] pgsql: Remove outdated comments from the regression test files.

2010-11-27 Thread Robert Haas
On Sat, Nov 27, 2010 at 3:46 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Nov 27, 2010, at 2:49 PM, Bruce Momjian br...@momjian.us wrote:
 Who's going to be the first to say that being git-centric can't ever be
 a bad thing?  ;-)

 At least for me, referring to it that way makes finding the original patch 
 an order of magnitude faster (git show <hash>).  YMMV.

 [ shrug... ]  You need to take the long view here.  We're not working on
 the assumption that git is the last SCM this project will ever use.
 Even granting that it is, I don't think git hashes are adequately stable;
 loading the code into a different repository would likely result in new
 hashes.  And AFAIK there is no mechanism that would fix hash references
 embedded in commit log messages (or the code).

Well, if we ever did want to rewrite the entire development history
(why?) I suppose we could rewrite SHA hashes in the commit messages at
the same time.  But I think one big advantage of git (or svn, or
probably any other post-CVS VCS) is that it has unique IDs for
commits.  Referring to them as "the commit by so-and-so on
such-and-such a date" just on the off chance that we might someday
decide to replace those unique IDs with another set of unique IDs
doesn't make much sense to me.  It makes things more difficult now in
the hope that, ten years from now when we switch systems again, it'll
be easier to use unstructured text to construct a search command to
root through the development history than it will be to map a git
commit id onto a commit id in the new system.  That's possible, but
it's far from obvious.  We are database professionals; we ought to
believe in the value of unique keys.

In fact, I'd go a bit further and say that moving in the direction of
MORE unstructured text is the last thing that our commit messages
need.  Right now, I follow your practice of writing the author of a
patch separated from the last paragraph of the commit message by a
blank line, additionally noting the reviewers and others who have
contributed (or who reported the problem).  The syntax falls short of
machine-parseable, though.  Other committers do different things,
listing the author at the end of the paragraph of commit text, or
preceding it with "Author:", or burying it somewhere in the middle, or
even writing "Commit so-and-so's patch to do something-or-other."  It
is impossible to construct a meaningful history of code contributions
without grepping the logs for each person's name individually; that's
a lot less helpful than it could be, especially since there's no
comprehensive list of names to grep for.   Perhaps it's too late to fix
up the history (though we could annotate it with git notes if someone
were willing to do the legwork), but we could certainly do better
going forward if we were so inclined.  We ought to be looking for ways
to include MORE structured information, not less.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] profiling connection overhead

2010-11-27 Thread Robert Haas
On Sat, Nov 27, 2010 at 11:18 PM, Bruce Momjian br...@momjian.us wrote:
 Not sure that information moves us forward.  If the postmaster cleared
 the memory, we would have COW in the child and probably be even slower.

Well, we can determine the answers to these questions empirically.  I
think some more scrutiny of the code with the points you and Andres
and Tom have raised is probably in order, and probably some more
benchmarking, too.  I haven't had a chance to do that yet, however.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers