[HACKERS] Raise a WARNING if a REVOKE affects nothing?

2012-08-21 Thread Craig Ringer

Hi all

I'm seeing lots of confusion from people about why:

REVOKE CONNECT ON DATABASE foo FROM someuser;

doesn't stop them connecting. Users seem to struggle to understand that:

- There's a default GRANT to public; and
- REVOKE removes existing permissions, it doesn't add deny rules

It'd really help if REVOKE consistently raised warnings when it didn't 
actually revoke anything.
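
For illustration (role and database names here are placeholders; by
default, CONNECT on a database is granted to PUBLIC, so the per-user
REVOKE finds nothing to remove):

-- someuser was never granted CONNECT directly, so this removes nothing
-- and the user can still connect through the PUBLIC grant:
REVOKE CONNECT ON DATABASE foo FROM someuser;

-- What actually locks users out: revoke the default PUBLIC grant, then
-- grant CONNECT back only to the roles that should keep it:
REVOKE CONNECT ON DATABASE foo FROM PUBLIC;
GRANT CONNECT ON DATABASE foo TO allowed_role;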


Even better, a special case for REVOKEs on objects that only have owner 
and public permissions could say:


WARNING: REVOKE didn't remove any permissions for user blah. This
table/db/whatever has default permissions, so there were no GRANTs for
user blah to revoke. See the documentation for REVOKE for more
information.


Opinions?


--
Craig Ringer


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Raise a WARNING if a REVOKE affects nothing?

2012-08-21 Thread Darren Duncan

That sounds like a good change to me. -- Darren Duncan

Craig Ringer wrote:

Hi all

I'm seeing lots of confusion from people about why:

REVOKE CONNECT ON DATABASE foo FROM someuser;

doesn't stop them connecting. Users seem to struggle to understand that:

- There's a default GRANT to public; and
- REVOKE removes existing permissions, it doesn't add deny rules

It'd really help if REVOKE consistently raised warnings when it didn't 
actually revoke anything.


Even better, a special case for REVOKEs on objects that only have owner 
and public permissions could say:


WARNING: REVOKE didn't remove any permissions for user blah. This
table/db/whatever has default permissions, so there were no GRANTs for
user blah to revoke. See the documentation for REVOKE for more
information.


Opinions?


--
Craig Ringer








Re: [HACKERS] Unexpected plperl difference between 8.4 and 9.1

2012-08-21 Thread Kaare Rasmussen

On 2012-08-20 18:36, Tom Lane wrote:

Alvaro Herrera alvhe...@2ndquadrant.com writes:

Excerpts from Alex Hunsaker's message of lun ago 20 12:03:11 -0400 2012:

Hrm seems to work for me. What version of perl is this?
$ perl -V
Summary of my perl5 (revision 5 version 16 subversion 0) configuration:

I can reproduce the failure with 5.14.2

Smells like a Perl bug to me.  Has anybody attempted to reproduce this
just in Perl itself, not PL/Perl?

regards, tom lane


I can't reproduce it in perl, but it's in PL/perl from 9.2 beta built 
with perl 5.14.2. Currently I don't have another perl with libperl 
installed.


This produces the error:

CREATE OR REPLACE FUNCTION test1() RETURNS SETOF NUMERIC AS $BODY$
use strict;
use warnings;
for (0..9) {
    my $rand = rand();
    $rand =~ m/(.*)/;
    return_next($1);
}
return;
$BODY$ LANGUAGE plperl;

Adding 'elog(NOTICE, "rand:$rand, :$1");' after the capture made the
error go away. So does changing the return_next statement to 'my
$retvalue = $1; return_next($retvalue);', as spelled out below.
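
Spelled out, the variant reported to work here (the same function, with
$1 copied into a local variable before return_next; the comment is mine):

CREATE OR REPLACE FUNCTION test1() RETURNS SETOF NUMERIC AS $BODY$
use strict;
use warnings;
for (0..9) {
    my $rand = rand();
    $rand =~ m/(.*)/;
    my $retvalue = $1;       # copy $1 before handing it to return_next
    return_next($retvalue);
}
return;
$BODY$ LANGUAGE plperl;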






Re: [HACKERS] temporal support patch

2012-08-21 Thread Vlad Arkhipov

On 08/21/2012 01:52 PM, Jeff Davis wrote:

On Mon, 2012-08-20 at 16:32 -0700, Josh Berkus wrote:

Personally, I would prefer a tool which just made it simpler to build my
own triggers, and made it automatic for the history table to track
changes in the live table.  I think anything we build which controls
what goes into the history table, etc., will only narrow the user base.

That sounds like a good way to start. Actually, even before the tool,
how about just some really good examples of triggers for specific kinds
of audit logs, and some ways to run queries on them? I think that might
settle a lot of these details.


Here is an example of the triggers we use in our applications. This is the
test implementation; the production one uses similar triggers written in C.

http://softus.org/?page_id=63

1. There are 3 tables: test contains only current data, test_history
contains only historical data, and test_audit contains all data. (A
minimal sketch of this layout follows the list.)
2. There must be a system_time field in each audited table, holding the
period of validity of the row.
3. Optional fields are: txid_modified and user_modified for the txid/user
that inserted or updated the row, and txid_deleted and user_deleted for
the txid/user that deleted the row. There may be other information in
the audit table that was omitted in the example (client IP, host name,
etc.).
4. We do not use txid_current() as the transaction ID because
backup/restore resets it.
5. The user is set by the application (audit.current_user() is just a
dummy).
6. There is no exclusion constraint on (primary key, system_time) in the
history table; integrity is maintained by triggers (however, the user can
damage the historical data by modifying the test_history table).
7. It's important to understand that when audit triggers are enabled,
some modifications can fail because the same row may be concurrently
modified by another transaction whose CURRENT_TIMESTAMP is lower than
or equal to the current one.
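
A minimal sketch of the 3-table layout described above (the column names
and types here are assumptions for illustration; the real definitions are
at the link):

CREATE TABLE test (
    id          integer PRIMARY KEY,
    payload     text,
    system_time tstzrange NOT NULL   -- period of validity of the row
);

CREATE TABLE test_history (
    LIKE test,                       -- closed historical versions only
    txid_deleted bigint,             -- optional, see item 3
    user_deleted text
);

-- test_audit (all data) combines both and is maintained by the triggers.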


[HACKERS] multi-master pgbench?

2012-08-21 Thread Tatsuo Ishii
Hi,

I am thinking about implementing a multi-master option for pgbench.
Suppose we have multiple PostgreSQL servers running on host1 and host2.
Something like pgbench -c 10 -h host1,host2... will create 5
connections each to host1 and host2 and send queries to both hosts.
The point of this functionality is to test cluster software which
has the capability to create a multi-master configuration.

Comments?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp




Re: [HACKERS] temporal support patch

2012-08-21 Thread Anssi Kääriäinen
I have written one approach to audit tables, available from 
https://github.com/akaariai/pgsql_shadow_tables


The approach is that every table is backed by a similar audit table + 
some meta information. The tables and triggers to update the audit 
tables are managed by plpgsql procedures.


While the approach isn't likely that interesting in itself, there is one
interesting aspect. Views similar to the original tables are created
automatically in the shadow schema. The views use a session variable for
the wanted snapshot time, so one can use this to query the database as of
a wanted time:


set search_path = shadow_public, public;
set test_session_variable.view_time = 'wanted view timestamp';
-- for example: '2012-05-06 22:08:00'


And now you can use exactly the same queries you use normally to
retrieve data as of the wanted view timestamp. This is very useful if you
happen to use an ORM.
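
A minimal sketch of what such a shadow view could look like (all names
and the valid_from/valid_until columns are assumptions; the real
definitions are in the repository linked above):

CREATE VIEW shadow_public.test AS
SELECT id, payload
FROM shadow_public.test_shadow
WHERE valid_from <= current_setting('test_session_variable.view_time')::timestamptz
  AND (valid_until IS NULL
       OR valid_until > current_setting('test_session_variable.view_time')::timestamptz);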


In addition, the known limitations mentioned in the README are likely
something the temporal support patch needs to tackle.


 - Anssi




Re: [HACKERS] multi-master pgbench?

2012-08-21 Thread Michael Paquier
On Tue, Aug 21, 2012 at 6:04 PM, Tatsuo Ishii is...@postgresql.org wrote:

 Hi,

 I am thinking about implementing a multi-master option for pgbench.
 Suppose we have multiple PostgreSQL servers running on host1 and host2.
 Something like pgbench -c 10 -h host1,host2... will create 5
 connections each to host1 and host2 and send queries to both hosts.
 The point of this functionality is to test cluster software which
 has the capability to create a multi-master configuration.

Perhaps a read-only option would be of interest for PostgreSQL, to check
simultaneous load on a cluster of multiple Postgres servers with read
operations. But I do not see any immediate use for write operations only.
Have you thought about the possibility of defining a different set of
transactions depending on the node targeted? For example you could target
a master with read-write and slaves with read-only.

Btw, this could have some use not only for Postgres, but also for other
projects based on it, with which you could really do some multi-master
benchmarking of writes.
Do you have some thoughts about the possible option specifications?
Configuration files would be too heavy for the sole purpose of pgbench. So,
specifying all the info in a single command? It is of course possible, but
the command will easily become unreadable, and it might be the cause of
many mistakes.

However, here are some ideas you might use:
1) pgbench -h host1:port1,host2:port2 ...
2) pgbench -h host1,host2 -p port1:port2

Regards,
-- 
Michael Paquier
http://michael.otacoo.com


Re: [HACKERS] Statistics and selectivity estimation for ranges

2012-08-21 Thread Heikki Linnakangas

On 20.08.2012 00:31, Alexander Korotkov wrote:

On Thu, Aug 16, 2012 at 4:40 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com  wrote:


On 15.08.2012 11:34, Alexander Korotkov wrote:


Ok, we have to decide if we need the standard histogram. In some cases
it can be used for more accurate estimation of the < and > operators.
But I think it is not so important. So, can we replace the standard
histogram with histograms of lower and upper bounds?


Yeah, I think that makes more sense. The lower bound histogram is still
useful for the < and > operators, just not as accurate if there are lots
of values with the same lower bound but different upper bounds.


New version of patch.
* Collect new stakind STATISTIC_KIND_BOUNDS_HISTOGRAM, which is lower and
upper bounds histograms combined into a single ranges array, instead
of STATISTIC_KIND_HISTOGRAM.


Ah, that's an interesting approach. So essentially, the histogram looks 
just like a normal STATISTIC_KIND_HISTOGRAM histogram, but the values 
stored in it are not picked the usual way. The usual way would be to 
pick N evenly-spaced values from the column, and store those. Instead, 
you pick N evenly-spaced lower bounds, and N evenly-spaced upper bounds, 
and construct N range values from those. Looking at a single value in 
the histogram, its lower bound comes from a different row than its upper 
bound.


That's pretty clever - the histogram has a shape and order that's 
compatible with a histogram you'd get with the standard scalar 
typanalyze function. In fact, I think you could just let the standard 
scalar estimators for < and > use that histogram as is. Perhaps we
should use STATISTIC_KIND_HISTOGRAM for this after all...


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com




[HACKERS] PostgreSQL 9.2beta4 (git HEAD) server crash on creating extension plpython3u

2012-08-21 Thread Sachin Srivastava
Hi all,

On my Mac 10.6.8 using ActiveState Python 3.2.2, I am getting a crash when
I try to execute CREATE EXTENSION plpython3u.
This is the backtrace:

Program received signal SIGABRT, Aborted.
0x7fff899a40b6 in __kill ()
(gdb) bt
#0  0x7fff899a40b6 in __kill ()
#1  0x7fff89a449f6 in abort ()
#2  0x00010a5da84c in Py_InitializeEx ()
#3  0x000100bd7467 in _PG_init ()
#4  0x0001003d8892 in internal_load_library ()
#5  0x0001003d835e in load_external_function ()
#6  0x0001000c0509 in fmgr_c_validator ()
#7  0x0001003dc808 in OidFunctionCall1Coll ()
#8  0x0001000c01e0 in ProcedureCreate ()
#9  0x0001001563ca in CreateProceduralLanguage ()
#10 0x0001002c60c8 in standard_ProcessUtility ()
#11 0x0001002c49a9 in ProcessUtility ()
#12 0x00010014005d in execute_sql_string ()
#13 0x0001001404f8 in execute_extension_script ()
#14 0x000100141459 in CreateExtension ()
#15 0x0001002c51b0 in standard_ProcessUtility ()
#16 0x0001002c49a9 in ProcessUtility ()
#17 0x0001002c38f6 in PortalRunUtility ()
#18 0x0001002c3a9b in PortalRunMulti ()
#19 0x0001002c3085 in PortalRun ()
#20 0x0001002bcb16 in exec_simple_query ()
#21 0x0001002c1416 in PostgresMain ()
#22 0x00010026d1d9 in BackendRun ()
#23 0x00010026c898 in BackendStartup ()
#24 0x000100269684 in ServerLoop ()
#25 0x000100268fa3 in PostmasterMain ()
#26 0x0001001dcd4d in main ()
(gdb)

CREATE EXTENSION plpython2u works when I compile the same code using
ActiveState Python 2.7.

So is it an issue with ActiveState Python 3.2?



-- 
Regards,
Sachin Srivastava
EnterpriseDB, India


Re: [HACKERS] Large number of open(2) calls with bulk INSERT into empty table

2012-08-21 Thread Robert Haas
On Mon, Aug 20, 2012 at 6:44 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Mon, Aug 20, 2012 at 4:27 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Surely we could just prevent creation of the FSM until the table has
 reached at least, say, 10 blocks.

 Any threshold beyond one block would mean potential space wastage,
 but it's hard to get excited about that until you're into the dozens
 of pages.

 I dunno, I think one-row tables are pretty common.

 Sure, and for that you don't need an FSM, because any row allocation
 attempt will default to trying the last existing block before it extends
 (see RelationGetBufferForTuple).  It's only once you've got more than
 one block in the table that it becomes interesting.

 If we had a convention that FSM is only created for rels of more than
 N blocks, perhaps it'd be worthwhile to teach RelationGetBufferForTuple
 to try all existing blocks when relation size <= N.  Or equivalently,
 hack the FSM code to return all N pages when it has no info.

Now that's an idea I could get behind.  I'd pick a smaller value of N
than what you suggested (10), perhaps 5.  But I like it otherwise.
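
(For context, the per-table overhead under discussion is easy to see
today; the sizes below assume a default 8 kB-block build:)

CREATE TABLE one_row (x int);
INSERT INTO one_row VALUES (1);
VACUUM one_row;                      -- VACUUM creates the FSM fork
SELECT pg_relation_size('one_row', 'main') AS main_bytes,
       pg_relation_size('one_row', 'fsm')  AS fsm_bytes;
-- main_bytes = 8192 (one heap block), fsm_bytes = 24576 (three FSM pages)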

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] multi-master pgbench?

2012-08-21 Thread Tom Lane
Tatsuo Ishii is...@postgresql.org writes:
 I am thinking about implementing a multi-master option for pgbench.
 Suppose we have multiple PostgreSQL servers running on host1 and host2.
 Something like pgbench -c 10 -h host1,host2... will create 5
 connections each to host1 and host2 and send queries to both hosts.
 The point of this functionality is to test cluster software which
 has the capability to create a multi-master configuration.

Why wouldn't you just fire up several copies of pgbench, one per host?

The main reason I'm dubious about this is that it's demonstrable that
pgbench itself is the bottleneck in many test scenarios.  That problem
gets worse the more backends you try to have it control.  You can of
course solve this with multiple threads in pgbench, but as soon as you
do that there's no functional benefit over just running several copies.

regards, tom lane




Re: [HACKERS] temporal support patch

2012-08-21 Thread Kevin Grittner
Jeff Davis pg...@j-davis.com wrote:
 On Mon, 2012-08-20 at 19:32 -0500, Kevin Grittner wrote:
 Josh Berkus j...@agliodbs.com wrote:
  
 This is sounding like a completely runaway spec on what should
 be a simple feature.
  
 I hate to contribute to scope creep (or in this case scope
 screaming down the tracks at full steam), but I've been watching
 this with a queasy feeling about interaction with Serializable
 Snapshot Isolation (SSI).
 
 There are all kinds of challenges here, and I'm glad you're
 thinking about them. I alluded to some problems here:
 

http://archives.postgresql.org/message-id/1345415312.20987.56.camel@jdavis
 
 But those might be a subset of the problems you're talking about.
 
 It sounds like, at a high level, there are two problems:
 
 1. capturing the apparent order of execution in the audit log
 2. assigning meaningful times to the changes that are consistent
 with the apparent order of execution
 
As far as I can see, transactions which execute DML at any
transaction isolation level other than serializable can be
considered to have occurred in commit order.  Transactions which
don't write to the database don't need to be considered as part of
the history, at least in terms of viewing prior state.  Same with
transactions which roll back.  (Now, failed transactions and reads
might be of interest for some audit reports, but that seems to me
like a different issue than a temporal database.)
 
The funny bit is for a serializable transaction (TN) which commits
after writing to the database -- you can't know the apparent order
of execution as long as there are any serializable transactions
active which can't see the work of TN (i.e., the transactions
overlap).  If such a transaction (TX) executes a read which
conflicts with a TN write, TX appears to have executed first, since
it doesn't see the work of TN, so I think the sequence number or
timestamp for TN has to follow that for TX even though TN committed
first.  On the other hand, TX might write something that conflicts
with a TN read, in which case TN will appear to have executed first
and must get a sequence number or timestamp before TX.
If there is a cycle, SSI will cancel one of the transactions
involved, so that can't occur anywhere in the time line.
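 
To illustrate the TX/TN case concretely (using a hypothetical table
t(k int, v int); both sessions run at SERIALIZABLE isolation):
 
-- TX: BEGIN ISOLATION LEVEL SERIALIZABLE;
-- TX: SELECT v FROM t WHERE k = 1;        -- read
-- TN: BEGIN ISOLATION LEVEL SERIALIZABLE;
-- TN: UPDATE t SET v = v + 1 WHERE k = 1; -- write conflicting with TX's read
-- TN: COMMIT;                             -- TN commits first ...
-- TX: COMMIT;                             -- ... but TX never saw TN's update
 
In the apparent order of execution TX ran before TN, so any sequence
number or timestamp assigned to TN must follow the one assigned to TX,
even though TN committed first.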
 
So, if you want to allow serializable temporal queries, the timing
of a read-write serializable transaction can't be locked down until
all overlapping read-write serializable transactions complete; and
the apparent order of execution must be based on read-write
conflicts, which are tracked within SSI.  I think that if we can
generate a list of committed transactions in order based on this
logic, it could feed into replication system -- hot standby as well
as trigger-based systems.  I think we could generate snapshots which
exclude the transactions for which the order of execution has not
yet been determined, and avoid the delays involved in other possible
solutions.
 
There's a lot of detail missing here in terms of what the API would
be, and how we handle the summarization that can occur within SSI so
that it can continue to function within bounded memory even in
pessimal circumstances, but that's the general outline of my
concerns and suggested solution.
 
-Kevin




Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-08-21 Thread Boszormenyi Zoltan

Hi,

new version with a lot more cleanup is attached.

On 2012-07-22 22:03, Boszormenyi Zoltan wrote:

Attached is the revised (and a lot leaner, more generic) lock timeout patch,
which introduces new functionality for the timeout registration framework.
The new functionality is called extra timeouts; better naming is welcome.
Instead of only the previously defined (deadlock and statement) timeouts,
the extra timeouts can also be activated from within ProcSleep() in a linked
way.


This mini-framework is now called lock manager timeouts and
both deadlock timeout and the new lock timeout belong to it.
The little piece of standalone code managing these is in
storage/lmgr/lmgrtimeout.c.

There is no PGSemaphoreTimedLock() any more. Instead,
PGSemaphoreLock() gained a new function argument for
checking timeouts. This has three advantages:
- There is only one PGSemaphoreLock() implementation and bug fixes
  like ada8fa08fc6cf5f199b6df935b4d0a730aaa4fec don't need to
  touch several places.
- There is no layering violation between pg_sema.c and proc.c.
- The extra function can check other types of conditions from different
  callers, should the need arise.

Best regards,
Zoltán Böszörményi

--
--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/

diff -durpN postgresql/src/backend/port/ipc_test.c postgresql.1/src/backend/port/ipc_test.c
--- postgresql/src/backend/port/ipc_test.c	2012-04-16 19:57:22.437915477 +0200
+++ postgresql.1/src/backend/port/ipc_test.c	2012-08-21 15:53:50.059329927 +0200
@@ -240,7 +240,7 @@ main(int argc, char **argv)
 	printf("Testing Lock ... ");
 	fflush(stdout);
 
-	PGSemaphoreLock(&storage->sem, false);
+	PGSemaphoreLock(&storage->sem, false, NULL);
 
 	printf("OK\n");
 
@@ -262,8 +262,8 @@ main(int argc, char **argv)
 	PGSemaphoreUnlock(&storage->sem);
 	PGSemaphoreUnlock(&storage->sem);
 
-	PGSemaphoreLock(&storage->sem, false);
-	PGSemaphoreLock(&storage->sem, false);
+	PGSemaphoreLock(&storage->sem, false, NULL);
+	PGSemaphoreLock(&storage->sem, false, NULL);
 
 	printf("OK\n");
 
@@ -311,7 +311,7 @@ main(int argc, char **argv)
 	printf("Waiting for child (should wait 3 sec here) ... ");
 	fflush(stdout);
 
-	PGSemaphoreLock(&storage->sem, false);
+	PGSemaphoreLock(&storage->sem, false, NULL);
 
 	printf("OK\n");
 
diff -durpN postgresql/src/backend/port/posix_sema.c postgresql.1/src/backend/port/posix_sema.c
--- postgresql/src/backend/port/posix_sema.c	2012-04-16 19:57:22.438915489 +0200
+++ postgresql.1/src/backend/port/posix_sema.c	2012-08-21 15:49:26.215579665 +0200
@@ -236,9 +236,11 @@ PGSemaphoreReset(PGSemaphore sema)
  * Lock a semaphore (decrement count), blocking if count would be < 0
  */
 void
-PGSemaphoreLock(PGSemaphore sema, bool interruptOK)
+PGSemaphoreLock(PGSemaphore sema, bool interruptOK,
+	PGSemaphoreCondition condition_checker)
 {
 	int			errStatus;
+	bool			condition = false;
 
 	/*
 	 * See notes in sysv_sema.c's implementation of PGSemaphoreLock. Just as
@@ -252,8 +254,12 @@ PGSemaphoreLock(PGSemaphore sema, bool i
 		CHECK_FOR_INTERRUPTS();
 		errStatus = sem_wait(PG_SEM_REF(sema));
 		ImmediateInterruptOK = false;
-	} while (errStatus < 0 && errno == EINTR);
+		if (condition_checker)
+			condition = condition_checker();
+	} while (errStatus < 0 && errno == EINTR && !condition);
 
+	if (condition)
+		return;
 	if (errStatus < 0)
 		elog(FATAL, "sem_wait failed: %m");
 }
diff -durpN postgresql/src/backend/port/sysv_sema.c postgresql.1/src/backend/port/sysv_sema.c
--- postgresql/src/backend/port/sysv_sema.c	2012-05-14 08:20:56.284830580 +0200
+++ postgresql.1/src/backend/port/sysv_sema.c	2012-08-21 15:49:26.991584804 +0200
@@ -358,9 +358,11 @@ PGSemaphoreReset(PGSemaphore sema)
  * Lock a semaphore (decrement count), blocking if count would be < 0
  */
 void
-PGSemaphoreLock(PGSemaphore sema, bool interruptOK)
+PGSemaphoreLock(PGSemaphore sema, bool interruptOK,
+	PGSemaphoreCondition condition_checker)
 {
 	int			errStatus;
+	bool			condition = false;
 	struct sembuf sops;
 
 	sops.sem_op = -1;			/* decrement */
@@ -414,8 +416,12 @@ PGSemaphoreLock(PGSemaphore sema, bool i
 		CHECK_FOR_INTERRUPTS();
 		errStatus = semop(sema->semId, &sops, 1);
 		ImmediateInterruptOK = false;
-	} while (errStatus < 0 && errno == EINTR);
+		if (condition_checker)
+			condition = condition_checker();
+	} while (errStatus < 0 && errno == EINTR && !condition);
 
+	if (condition)
+		return;
 	if (errStatus < 0)
 		elog(FATAL, "semop(id=%d) failed: %m", sema->semId);
 }
diff -durpN postgresql/src/backend/port/win32_sema.c postgresql.1/src/backend/port/win32_sema.c
--- postgresql/src/backend/port/win32_sema.c	2012-06-11 06:22:48.137921483 +0200
+++ postgresql.1/src/backend/port/win32_sema.c	2012-08-21 15:49:24.921571070 +0200
@@ -116,10 +116,12 @@ PGSemaphoreReset(PGSemaphore sema)
  * Serve the interrupt if interruptOK is true.
  */
 void
-PGSemaphoreLock(PGSemaphore 

Re: [HACKERS] NOT NULL constraints in foreign tables

2012-08-21 Thread Robert Haas
On Mon, Aug 20, 2012 at 5:14 PM, Jeff Davis pg...@j-davis.com wrote:
 On Mon, 2012-08-20 at 16:50 -0400, Robert Haas wrote:
 #3 for foreign tables.

 I'm skeptical of that approach for two reasons:

 (1) It will be hard to inform users which constraints are enforced and
 which aren't.

The thing to keep in mind here is that EVERY property of a foreign
table is subject to change at any arbitrary point in time, without our
knowledge.  The very existence of the table, the names of its columns,
the types of those columns, and any additional constraints on that
columns are all things that can vary between the declaration and
what's actually present, and can be subsequently changed on the remote
side at any time.  Why should CHECK constraints be any different than,
say, column types?

 (2) It will be hard for users to understand the planner benefits or the
 consequences when the constraint is not enforced.

Why should that be any worse with foreign tables than anything else?
I mean, lots of people, as things stand today, manage to set up
partitioned tables using CHECK constraints.  There are undoubtedly
people who don't understand the planner benefit of having an
appropriate CHECK constraint on each partition, but it's not exactly a
common cause of confusion.

 That being said, I can imagine good use cases (like when the foreign
 table is in postgres, and already has that constraint declared), so I'm
 not outright opposed to it.

 #1 is not a reasonable alternative for foreign
 tables because we lack enforcement power in that case,

 Right.

  and #2 is also
 not reasonable, because the only point of allowing declarative
 constraints is to get better performance, and if we go with #2 then
 we've pretty much thrown that out the window.

 Declared constraints can improve the plans, while runtime-enforced
 constraints slow down execution of a given plan. I'm not really sure
 whether runtime enforcement is a good trade-off, but it doesn't seem
 like an obviously bad one.

It seems to me that the term runtime enforcement is a misnomer; you
can't really enforce anything about a foreign table.  You can throw
an error if the data doesn't meet expectations, but by that time the
cat is already out of the bag.  The purpose of a CHECK constraint on a
plain table is to prevent bad data from going into a table, not to
throw an error when retrieving previously-inserted bad data.  If we
were to propose changing the semantics from the former to the latter,
we'd be laughed out of town, and rightly so.

 Also, what did you mean by the only point of allowing declarative
 constraints is to get better performance? Maybe the user wants to get
 an error if some important assumption about the remote data source is
 not as true as when they declared the constraint.

I think that need is going to be best served by issuing a query to
validate whatever constraint you're interested in, i.e. SELECT * FROM
foreign_table WHERE NOT (whatever_i_am_assuming).  I mean, let's
suppose that we were to allow unique constraints on foreign tables.
This is clearly useful, because, for example, it would allow join
removal in a case like local_table LEFT JOIN foreign_table ON
local_table.id = foreign_table.id; and it would also allow a
selectivity estimate of -1 for that column.  But are you going to
validate that the column in question is still unique every time you
access the table?  Presumably not; you'd have to read and sort the
entire table to do that.  Now, if the query is something like SELECT
* FROM foreign_table WHERE id = 1, you could fairly cheaply validate
that there is only one row with id = 1, but that's not the same thing
as validating that the assumption (namely, that foreign_table (id) is
unique) is still true.  And if the query is SELECT
max(some_other_column) FROM foreign_table, you can't really validate
anything at all, or at least not without a lot of extra overhead.
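
For instance, a one-off spot-check of the assumed uniqueness could look
like this (the query shape is mine, assuming the unique column is id):

SELECT id FROM foreign_table GROUP BY id HAVING count(*) > 1;
-- should return no rows if the assumption still holds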

Now, it's more feasible to think that you could validate a CHECK
constraint on each row that the query actually touches, but that's
still going to be pretty expensive, and it's still not the same thing
as validating that the assumption holds true for all rows in the
table.  I think if we go down this road of trying to validate
remote-side CHECK constraints, we're going to end up with a mishmash
of cases where constraints are checked and other cases where
constraints are not checked, and then that really is going to be
confusing.  We're definitely not going to be able to check everything
all the time, so checking nothing ever seems like the principled
alternative.

Also, for tightly-coupled databases under common control, there's
really no need to be constantly checking and rechecking that the other
guy hasn't pulled the rug out from under you.  We certainly need to
have that as an option.  For loosely-coupled databases under different
administrative control there might be some point, but it seems sort of
hopeless: if the other DBA doesn't care about breaking your stuff, 

Re: [HACKERS] [PATCH]Tablesample Submission

2012-08-21 Thread Robert Haas
On Mon, Aug 20, 2012 at 9:52 PM, Qi Huang huangq...@outlook.com wrote:
 Hi, hackers
 I made the final version of the tablesample patch. It implements the
 SYSTEM and BERNOULLI sample methods, and is basically feature-complete.
 The regression test is also included in this patch.
 There is wiki documentation at
 https://wiki.postgresql.org/wiki/TABLESAMPLE_Implementation. The details
 about this patch and this project are all included in that documentation.

Please add your patch here:

https://commitfest.postgresql.org/action/commitfest_view/open

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] 9.2RC1 wraps this Thursday ...

2012-08-21 Thread Tom Lane
... or at least, that's what the schedule says.  I don't think we can
honestly produce a release candidate when there are still open issues
listed as blockers at
http://wiki.postgresql.org/wiki/PostgreSQL_9.2_Open_Items
We need to either get something done about those, conclude that they're
not blockers, or postpone RC1.

The items currently listed as blockers are:

* GiST indexes vs fuzzy comparisons used by geometric types
** Alexander proposed a patch that would support the current behavior, but 
should we change the behavior instead?

I put this in the blocker list because I was hoping to get some
conversation going about the whole issue of fuzzy comparisons in the
geometric stuff.  However, the time for making any basic semantic
revisions in 9.2 is long past.  We could perhaps look at applying
Alexander's more restricted patch, but maybe even that is too
destabilizing at this point.  I'm inclined to move the whole thing onto
the long term issues list.  Comments?

* Should we fix tuple limit handling, or redefine 9.x behavior as correct?
** The consensus seems to be to change the documentation to match the current 
behavior.

At this point this is just a pre-existing documentation bug.  Somebody
ought to do something about it at some point, but it hardly seems like
a release blocker.

* keepalives

I don't know who put this item in, or what it refers to, since it has
no supporting link.  Unless somebody steps forward with an explanation
of what the blocker issue is here, this entry is going to disappear.

* pg_ctl crashes on Win32 when neither PGDATA nor -D specified

I'm not sure that this qualifies as a release blocker either --- isn't
it a plain-vanilla pre-existing bug?  And what does the proposed patch
have to do with the stated problem?  (Even if you define the problem
as "make sure we're restricted" rather than the stated symptom, the
patch looks rather fragile and Rube Goldbergian ... isn't there a way
to actually test if we're in a restricted process?)

* Checkpointer process split broke fsync'ing
** bug is fixed, but now we had better recheck earlier performance claims

Is anyone actually going to do any performance testing on this?

* View options are problematic for pg_dump

I had hoped those who created this problem were going to fix it, but
given the lack of response I guess I'll have to.

regards, tom lane




Re: [HACKERS] [PATCH]Tablesample Submission

2012-08-21 Thread Qi Huang
 Please add your patch here:
 
 https://commitfest.postgresql.org/action/commitfest_view/open
 
 -- 
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company

Hi, Robert
I added it under Miscellaneous.
https://commitfest.postgresql.org/action/patch_view?id=918

Best Regards
Huang Qi Victor
Computer Science, National University of Singapore

Re: [HACKERS] temporal support patch

2012-08-21 Thread David Fetter
On Mon, Aug 20, 2012 at 09:33:45PM -0700, Jeff Davis wrote:
 On Mon, 2012-08-20 at 19:17 -0400, David Johnston wrote:
  Ideally the decision of whether to do so could be a client
  decision.  Not storing intra-transaction changes is easier than
  storing all changes.  At worst you could stage up all changes and
  then simply fail to store the intermediate results within a given
  relation.  In that case you gain nothing in execution performance
  but save both storage and interpretive resources.  So the
  question becomes: is it worth doing without the ability to store
  intermediate results?  If you were to ponder both, which setup
  would be the default?  If the default is the harder one (all
  statements) to implement, then to avoid upgrade issues the syntax
  should specify that it is logging transactions only.
 
 I think the biggest question here is what guarantees can be offered?
 What if the transaction aborts after having written some data, does
 the audit log still get updated?

There are definitely use cases for this, but until we have autonomous
transactions, a totally separate project, I don't think we should
attempt them in the first version.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] [PATCH] Docs: Make notes on sequences and rollback more obvious

2012-08-21 Thread Robert Haas
On Mon, Aug 20, 2012 at 4:45 AM, Craig Ringer ring...@ringerc.id.au wrote:
 Trying again with the attachments; the archiver only seemed to see the first
 patch despite all three being attached. Including patches inline; if you
 want 'em prettier, see:

   https://github.com/ringerc/postgres/tree/sequence_documentation_fixes


 Subject: [PATCH 1/3] Make sure you can't read through mvcc.sgml without
  realising that not everything is MVCC.


The first of these three patches looks good to me, so I committed it.
I am not convinced that the others are ready to go in.  AFAICS, there
hasn't been any discussion of whether a list of non-transactional
features would be a useful thing to have, or if so where it should be
located in the docs and what should go into it.  I'm not necessarily
opposed to adding something, but I think it needs some actual
discussion before we commit anything.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] 9.2RC1 wraps this Thursday ...

2012-08-21 Thread Alvaro Herrera
Excerpts from Tom Lane's message of mar ago 21 10:47:41 -0400 2012:

 * pg_ctl crashes on Win32 when neither PGDATA nor -D specified
 
 I'm not sure that this qualifies as a release blocker either --- isn't
 it a plain-vanilla pre-existing bug?  And what does the proposed patch
 have to do with the stated problem?  (Even if you define the problem
 as "make sure we're restricted" rather than the stated symptom, the
 patch looks rather fragile and Rube Goldbergian ... isn't there a way
 to actually test if we're in a restricted process?)

You mean, test if we're in a restricted process, and then refuse to run
unless that is so?  That would be a simple way out of the problem, but
I'm not really sure that it fixes the issue because Win32 people
normally expect stuff to run by dropping privs internally.

Maybe that's something we should leave for later, though, and fix 9.2 by
doing what you propose (which is presumably going to be a much simpler
patch).  Clearly having pg_ctl just crash is not a good situation.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] multi-master pgbench?

2012-08-21 Thread Tatsuo Ishii
 I am thinking about implementing a multi-master option for pgbench.
 Suppose we have multiple PostgreSQL servers running on host1 and host2.
 Something like pgbench -c 10 -h host1,host2... will create 5
 connections each to host1 and host2 and send queries to both hosts.
 The point of this functionality is to test cluster software which
 has the capability to create a multi-master configuration.

 Perhaps a read-only option would be of interest for PostgreSQL, to check
 simultaneous load on a cluster of multiple Postgres servers with read
 operations. But I do not see any immediate use for write operations only.
 Have you thought about the possibility of defining a different set of
 transactions depending on the node targeted? For example you could target
 a master with read-write and slaves with read-only.

I think that kind of intelligence is beyond the scope of pgbench. I
would prefer to leave such work to another tool.

 Btw, this could have some use not only for Postgres, but also for other
 projects based on it, with which you could really do some multi-master
 benchmarking of writes.

Right. If pgbench could have such a functionality, we could compare
those projects by using pgbench. Currently those projects use
different benchmarking tools. That means, the comparison is something
like apple-to-orange. With enhanced pgbench we could do apple-to-apple
comparison.

 Do you have some thoughts about the possible option specifications?
 Configuration files would be too heavy for the sole purpose of pgbench. So,
 specifying all the info in a single command? It is of course possible, but
 the command will easily become unreadable, and it might be the cause of
 many mistakes.

Agreed.

 However, here are some ideas you might use:
 1) pgbench -h host1:port1,host2:port2 ...
 2) pgbench -h host1,host2 -p port1:port2

Looks good.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp




Re: [HACKERS] 9.2RC1 wraps this Thursday ...

2012-08-21 Thread Robert Haas
On Tue, Aug 21, 2012 at 10:47 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 ... or at least, that's what the schedule says.  I don't think we can
 honestly produce a release candidate when there are still open issues
 listed as blockers at
 http://wiki.postgresql.org/wiki/PostgreSQL_9.2_Open_Items
 We need to either get something done about those, conclude that they're
 not blockers, or postpone RC1.

 The items currently listed as blockers are:

 * GiST indexes vs fuzzy comparisons used by geometric types
 ** Alexander proposed a patch that would support the current behavior, but 
 should we change the behavior instead?

 I put this in the blocker list because I was hoping to get some
 conversation going about the whole issue of fuzzy comparisons in the
 geometric stuff.  However, the time for making any basic semantic
 revisions in 9.2 is long past.  We could perhaps look at applying
 Alexander's more restricted patch, but maybe even that is too
 destabilizing at this point.  I'm inclined to move the whole thing onto
 the long term issues list.  Comments?

Agree.

 * Should we fix tuple limit handling, or redefine 9.x behavior as correct?
 ** The consensus seems to be to change the documentation to match the current 
 behavior.

 At this point this is just a pre-existing documentation bug.  Somebody
 ought to do something about it at some point, but it hardly seems like
 a release blocker.

Agree.

 * keepalives

 I don't know who put this item in, or what it refers to, since it has
 no supporting link.  Unless somebody steps forward with an explanation
 of what the blocker issue is here, this entry is going to disappear.

I don't know who added this either, but Simon addressed it, so it can
be moved to resolved.  It referred to some changes to the
walsender/walreceiver protocol that were made for 9.2 but were still a
bit half-baked.

 * pg_ctl crashes on Win32 when neither PGDATA nor -D specified

 I'm not sure that this qualifies as a release blocker either --- isn't
 it a plain-vanilla pre-existing bug?  And what does the proposed patch
 have to do with the stated problem?  (Even if you define the problem
 as "make sure we're restricted" rather than the stated symptom, the
 patch looks rather fragile and Rube Goldbergian ... isn't there a way
 to actually test if we're in a restricted process?)

If this isn't a regression, it's not a release blocker.

 * Checkpointer process split broke fsync'ing
 ** bug is fixed, but now we had better recheck earlier performance claims

 Is anyone actually going to do any performance testing on this?

I am unlikely to have time between now and release.

 * View options are problematic for pg_dump

 I had hoped those who created this problem were going to fix it, but
 given the lack of response I guess I'll have to.

This is my fault, but my hackers inbox got flooded and this got lost
in the shuffle.  Sorry.  I can probably devote some time to it today
if you don't want to be bothered with it.  Do you have a sense of what
the right fix is?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Slow tab completion w/ lots of tables

2012-08-21 Thread Robert Haas
On Thu, Aug 16, 2012 at 10:37 PM, Stephen Frost sfr...@snowman.net wrote:
 Greetings,

   When doing tab-completion under 9.1, pg_table_is_visible(oid) is slow
   and is ending up as the first thing tested against all the rows
   in pg_class.  Increasing the cost of pg_table_is_visible() up to
   10 causes it to move to the end of the tests, which improves things
   greatly - I thought there was a plan to make that the default...?

   This is with 9.1.4.

Is this a regression versus earlier releases, or just a bad thing in general?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] huge tlb support

2012-08-21 Thread Robert Haas
On Thu, Aug 16, 2012 at 10:53 PM, David Gould da...@sonic.net wrote:
 A warning, on RHEL 6.1 (2.6.32-131.4.1.el6.x86_64 #1 SMP) we have had
 horrible problems caused by transparent_hugepages running postgres on
 largish systems (128GB to 512GB memory, 32 cores). The system sometimes
 goes 99% system time and is very slow and unresponsive to the point of
 not successfully completing new tcp connections. Turning off
 transparent_hugepages fixes it.

Yikes!  Any idea WHY that happens?

I'm inclined to think this torpedos any idea we might have of enabling
hugepages automatically whenever possible.  I think we should just add
a GUC for this and call it good.  If the state of the world improves
sufficiently in the future, we can adjust, but I think for right now
we should just do this in the simplest way possible and move on.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] 9.2RC1 wraps this Thursday ...

2012-08-21 Thread Amit Kapila
From: pgsql-hackers-ow...@postgresql.org
[mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Tom Lane
 * pg_ctl crashes on Win32 when neither PGDATA nor -D specified

 I'm not sure that this qualifies as a release blocker either --- isn't
 it a plain-vanilla pre-existing bug?  And what does the proposed patch
 have to do with the stated problem?  (Even if you define the problem
 as "make sure we're restricted" rather than the stated symptom, the
 patch looks rather fragile and Rube Goldbergian ... 


This is to handle one part of the overall problem. Below is text from a
previous mail discussion, due to which the new handling was introduced:

 I note that postgres -C data_directory will refuse to run on the
 command line because I've got admin privileges in Windows, and that
 pg_ctl normally starts postgres.exe using CreateRestrictedProcess.
 But it does not do so for the popen call in adjust_data_dir.

-- By you
if that actually is a third bug, as seems likely, somebody with access
to a windows environment will need to deal with it.


I have tried to define handling similar to initdb's, where for
administrative users it re-forks itself in a restricted mode, as it has
to start postgres.

 isn't there a way to actually test if we're in a restricted process?

Do you mean to say that it should check whether pg_ctl is being run by an
administrative user, and then do the re-fork in restricted mode?
If something else, could you please give more detail about the exact
expectation for handling the above issue.


With Regards,
Amit Kapila.








Re: [HACKERS] huge tlb support

2012-08-21 Thread Andres Freund
On Tuesday, August 21, 2012 05:30:28 PM Robert Haas wrote:
 On Thu, Aug 16, 2012 at 10:53 PM, David Gould da...@sonic.net wrote:
  A warning, on RHEL 6.1 (2.6.32-131.4.1.el6.x86_64 #1 SMP) we have had
  horrible problems caused by transparent_hugepages running postgres on
  largish systems (128GB to 512GB memory, 32 cores). The system sometimes
  goes 99% system time and is very slow and unresponsive to the point of
  not successfully completing new tcp connections. Turning off
  transparent_hugepages fixes it.
 
 Yikes!  Any idea WHY that happens?
 
 I'm inclined to think this torpedos any idea we might have of enabling
 hugepages automatically whenever possible.  I think we should just add
 a GUC for this and call it good.  If the state of the world improves
 sufficiently in the future, we can adjust, but I think for right now
 we should just do this in the simplest way possible and move on.
He is talking about transparent hugepages, not hugepages, afaics.

Andres
-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] New statistics for WAL buffer dirty writes

2012-08-21 Thread Robert Haas
On Sat, Aug 11, 2012 at 6:11 PM, Jeff Janes jeff.ja...@gmail.com wrote:
 However, I do think we will want to add a way to query for the time of
 the last reset, as other monitoring features are going that way.

That should be easy to add.

 Is it OK that the count is reset upon a server restart?

I think it's OK.  The reason why many of our stats are kept in the
stats file is because we have a limited amount of shared memory and
therefore can't guarantee (for example) that there's enough to keep
stats about EVERY table, since the number of tables is unlimited.
However, in cases where the data to be stored is fixed-size, and
especially when it's fixed-size and small, there's a lot of sense to
keeping the data in shared memory rather than sending stats collector
messages.  It's a lot less overhead, for one thing.  Maybe at some
point someone will want to devise a way to hibernate such stats to
disk at shutdown (or periodically) and reload them on startup, but it
doesn't seem like a must-have to me.

Other opinions may vary, of course.

 I'll marked it as waiting on author, for the documentation and reset
 time.

Yeah, we definitely need some documentation.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Slow tab completion w/ lots of tables

2012-08-21 Thread Stephen Frost
Robert,

* Robert Haas (robertmh...@gmail.com) wrote:
 On Thu, Aug 16, 2012 at 10:37 PM, Stephen Frost sfr...@snowman.net wrote:
When doing tab-completion under 9.1, pg_table_is_visible(oid) is slow
and is ending up as the first thing tested against all the rows
in pg_class.  Increasing the cost of pg_table_is_visible() up to
10 causes it to move to the end of the tests, which improves things
greatly - I thought there was a plan to make that the default...?
 
This is with 9.1.4.
 
 Is this a regression versus earlier releases, or just a bad thing in general?

It's really a regression - in prior releases, we had the
pg_table_is_visible() test later in the WHERE clause, so that call
wasn't invoked as often.  This all happened when Tom reworked the psql
SQL calls to be defined in an array instead of in-line'd (which was a
good change, but moved pg_table_is_visible() up to the front of the
WHERE clause, slowing things down).
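
(For reference, the workaround described in the quoted report is a plain
ALTER FUNCTION, with the cost value taken from that report:)

ALTER FUNCTION pg_catalog.pg_table_is_visible(oid) COST 10;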

Thanks,

Stephen




Re: [HACKERS] GetSnapshotData() comments

2012-08-21 Thread Robert Haas
On Tue, Aug 14, 2012 at 5:41 PM, Bruce Momjian br...@momjian.us wrote:
 Did these comment updates ever get addressed?

Partially.

I just made a commit to clean up the rest of it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Slow tab completion w/ lots of tables

2012-08-21 Thread Stephen Frost
* Stephen Frost (sfr...@snowman.net) wrote:
  Is this a regression versus earlier releases, or just a bad thing in 
  general?
 
 It's really a regression- in prior releases

Sorry, to clarify (after reading through my -hackers inbox a bit more
and realizing you were probably asking about 9.2) - it's a regression,
but it was also in earlier releases.  I'd have to go back to the git
blame that I ran earlier to find it, but I think the change was made
in 8.4 or 9.0, so I don't think it's a regression as far as
9.0 -> 9.1 or 9.1 -> 9.2 is concerned.

I'm slightly ashamed to admit that we discovered it during our migration
from 8.2 -> 9.1...

Thanks,

Stephen




Re: [HACKERS] B-tree parent pointer and checkpoints

2012-08-21 Thread Robert Haas
On Wed, Aug 15, 2012 at 6:23 PM, Bruce Momjian br...@momjian.us wrote:
 Has this been addressed?  A TODO?

I don't think anything's been done about it.  According to your email
of October 11, 2011, you already did add a TODO for this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] huge tlb support

2012-08-21 Thread Robert Haas
On Tue, Aug 21, 2012 at 11:31 AM, Andres Freund and...@2ndquadrant.com wrote:
 On Tuesday, August 21, 2012 05:30:28 PM Robert Haas wrote:
 On Thu, Aug 16, 2012 at 10:53 PM, David Gould da...@sonic.net wrote:
  A warning, on RHEL 6.1 (2.6.32-131.4.1.el6.x86_64 #1 SMP) we have had
  horrible problems caused by transparent_hugepages running postgres on
  largish systems (128GB to 512GB memory, 32 cores). The system sometimes
  goes 99% system time and is very slow and unresponsive to the point of
  not successfully completing new tcp connections. Turning off
  transparent_hugepages fixes it.

 Yikes!  Any idea WHY that happens?

 I'm inclined to think this torpedos any idea we might have of enabling
 hugepages automatically whenever possible.  I think we should just add
 a GUC for this and call it good.  If the state of the world improves
 sufficiently in the future, we can adjust, but I think for right now
 we should just do this in the simplest way possible and move on.
 He is talking about transparent hugepages not hugepages afaics.

Hmm.  I guess you're right.  But why would it be different?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] A caveat of partitioning tables in the document

2012-08-21 Thread Kasahara Tatsuhito
Hi.

The latest document (doc/src/sgml/ddl.sgml) says
===
2974 <itemizedlist>
2975  <listitem>
2976   <para>
2977    Constraint exclusion only works when the query's <literal>WHERE</>
2978    clause contains constants.  A parameterized query will not be
2979    optimized, since the planner cannot know which partitions the
2980    parameter value might select at run time.  For the same reason,
2981    <quote>stable</> functions such as <function>CURRENT_DATE</function>
2982    must be avoided.
2983   </para>
2984  </listitem>
===
but in my understanding, this problem will be solved in 9.2 (with
parameterized plans).

Or do some issues still remain?
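
(For concreteness, the kind of query the caveat is about, borrowing the
measurement/logdate partitioning example from the same chapter:)

-- Constraint exclusion can prune partitions here: the WHERE clause
-- compares against a constant.
SELECT count(*) FROM measurement WHERE logdate >= DATE '2012-01-01';

-- A parameterized equivalent could not be pruned at plan time before
-- 9.2's parameterized plans, because the value is unknown when the
-- generic plan is built:
PREPARE q(date) AS
  SELECT count(*) FROM measurement WHERE logdate >= $1;
EXECUTE q(DATE '2012-01-01');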

Best regards,

-- 
Tatsuhito Kasahara
kasahara.tatsuhito _at_ gmail.com




Re: [HACKERS] xlog file naming

2012-08-21 Thread Robert Haas
On Wed, Aug 15, 2012 at 8:43 PM, Bruce Momjian br...@momjian.us wrote:
 Are there any TODO items here?

It's possible there's something we want to change here, but it's far
from obvious what that thing is.  Our WAL file handling is
ridiculously hard to understand, but the problem with changing it is
that there will then be two things people have to understand, and a
lot of tools that have to be revamped.  It isn't clear that it's worth
going through that kind of pain for a minor improvement in clarity.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] huge tlb support

2012-08-21 Thread Andres Freund
On Tuesday, August 21, 2012 05:56:58 PM Robert Haas wrote:
 On Tue, Aug 21, 2012 at 11:31 AM, Andres Freund and...@2ndquadrant.com 
wrote:
  On Tuesday, August 21, 2012 05:30:28 PM Robert Haas wrote:
  On Thu, Aug 16, 2012 at 10:53 PM, David Gould da...@sonic.net wrote:
   A warning, on RHEL 6.1 (2.6.32-131.4.1.el6.x86_64 #1 SMP) we have had
   horrible problems caused by transparent_hugepages running postgres on
   largish systems (128GB to 512GB memory, 32 cores). The system
   sometimes goes 99% system time and is very slow and unresponsive to
   the point of not successfully completing new tcp connections. Turning
   off
   transparent_hugepages fixes it.
  
  Yikes!  Any idea WHY that happens?
Afair there were several bugs that could cause that in earlier versions of
the hugepage feature. The prominent one was something around never really
stopping the search for mergeable pages even though the probability was
small, or some such.

I am not a rhel person, so I cannot directly interpret that kernel version;
is that the latest kernel?

  I'm inclined to think this torpedos any idea we might have of enabling
  hugepages automatically whenever possible.  I think we should just add
  a GUC for this and call it good.  If the state of the world improves
  sufficiently in the future, we can adjust, but I think for right now
  we should just do this in the simplest way possible and move on.
  
  He is talking about transparent hugepages not hugepages afaics.
 
 Hmm.  I guess you're right.  But why would it be different?
Because in this case explicit hugepage usage reduces the pain instead of 
increasing it. And we cannot do much against transparent hugepages being 
enabled by default.
Unless I misremember how things work, the problem is/was independent of 
anonymous mmap or sysv shmem.


Greetings,

Andres
-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] multi-master pgbench?

2012-08-21 Thread David Fetter
On Tue, Aug 21, 2012 at 06:04:42PM +0900, Tatsuo Ishii wrote:
 Hi,
 
 I am thinking about implementing a multi-master option for pgbench.
 Suppose we have multiple PostgreSQL servers running on host1 and host2.
 Something like pgbench -c 10 -h host1,host2... will create 5
 connections to host1 and host2 and send queries to host1 and host2.
 The point of this functionality is to test some cluster software which
 have a capability to create multi-master configuration.
 
 Comments?

To distinguish it from simply running separate pgbench tests for each
host, would this somehow test propagation of the writes?  Such a thing
would be quite useful, but it seems at first glance like a large
project.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2RC1 wraps this Thursday ...

2012-08-21 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Aug 21, 2012 at 10:47 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 * View options are problematic for pg_dump
 
 I had hoped those who created this problem were going to fix it, but
 given the lack of response I guess I'll have to.

 This is my fault, but my hackers inbox got flooded and this got lost
 in the shuffle.  Sorry.  I can probably devote some time to it today
 if you don't want to be bothered with it.  Do you have a sense of what
 the right fix is?

I can work on it if you're still swamped.  I think it is probably
fixable by treating the view options as attached to the _RETURN rule
instead of the base table in pg_dump's objects.  (There is an ALTER VIEW
command to set the security option, no?)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] reviewing the Reduce sinval synchronization overhead patch / b4fbe392f8ff6ff1a66b488eb7197eef9e1770a4

2012-08-21 Thread Nils Goroll

Hi,

I am reviewing this one year old change again before backporting it to 9.1.3 for 
production use.


ATM, I believe the code is correct, but I don't want to miss the change to spot 
possible errors, so please let me dump my brain on some points:


- IIUC, SIGetDataEntries() can return 0 when in fact there _are_ messages
  because stateP->hasMessages could come from a stale cache (iow there is no
  read-membar used and because we return before acquiring SInvalReadLock (which
  the patch is all about in the first place), we don't get an implicit
  read-membar from a lock op any more).

  What I can't judge on: Would this cause any harm? What are the consequences
  of SIGetDataEntries returning 0 after another process has posted a message
  (looking at global temporal ordering)?

  I don't quite understand the significance of the respective comment in the
  code that the incoherence should be acceptable because the cached read can't
  migrate to before a previous lock acquisition (which itself is clear).

  AcceptInvalidationMessages has a comment that it should be the first thing
  to do in a transaction, and I am not sure if all the consumers have a
  read-membar equivalent operation in place.

  How bad would a missed cache invalidation be? Should we have a read-membar
  in SIGetDataEntries just to be safe?

Other notes on points which appear correct to me (really more a note to myself):

- stateP->hasMessages = false in SIGetDataEntries is membar'ed by
  SpinLockAcquire(vsegP->msgnumLock), so it shouldn't happen that
  clearing hasMessages moves behind reading msgnumLock

  (in which case we could loose the hasMessages flag)

- but it can happen that hasMessages gets set when in fact there is
  nothing to read (which is fine because we then check maxMsgNum)

Nils


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Slow tab completion w/ lots of tables

2012-08-21 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 On Thu, Aug 16, 2012 at 10:37 PM, Stephen Frost sfr...@snowman.net wrote:
When doing tab-completion under 9.1, pg_table_is_visible(oid) is slow
and is ending up as the first thing tested against all the rows
in pg_class.  Increasing the cost of pg_table_is_visible() up to
10 causes it to move to the end of the tests, which improves things
greatly- I thought there was a plan to make that the default..?
 
This is with 9.1.4.
 
 Is this a regression versus earlier releases, or just a bad thing in general?

Alright, so, yea, the commit I was referring to is this one:

e84487f67a0d216f3db87b2558f1edd322a09e48

Which was apparently in the 8.3 dev cycle, so this would be an
8.2 -> 8.3 regression (as I mentioned in my last email, on this
particular database, we're going from 8.2 -> 9.1).

My vote is that everyone else needs to have databases with more tables,
or they need to care about tab-completion speed more. :)  There are a
few complaints in the archives though, so I'm not alone..

Would changing the cost of pg_table_is_visible() require a catversion
bump..?  Or maybe just do it w/o the bump for 9.2 and tell beta testers
that they might want to make the change by hand?  Or bump it as part of
early 9.3?

I do also want to change tab-complete around a bit to make it so that we
can actually index the query based on the name, which would clearly be a
9.3 thing.  I was expecting some push-back on this idea, but haven't
heard any yet.  Would people accept adding an index on pg_class.relname
to support fast tab-completion?  Or is this going to expand into
figuring out how to support index-based partial lookups for the 'name'
type, so we could use the existing index (if that's even possible to
do...)?

Thanks,

Stephen




Re: [HACKERS] 9.2RC1 wraps this Thursday ...

2012-08-21 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Excerpts from Tom Lane's message of mar ago 21 10:47:41 -0400 2012:
 * pg_ctl crashes on Win32 when neither PGDATA nor -D specified
 
 I'm not sure that this qualifies as a release blocker either --- isn't
 it a plain-vanilla pre-existing bug?  And what does the proposed patch
 have to do with the stated problem?  (Even if you define the problem
 as make sure we're restricted rather than the stated symptom, the
 patch looks rather fragile and Rube Goldbergian ... isn't there a way
 to actually test if we're in a restricted process?)

 You mean, test if we're in a restricted process, and then refuse to run
 unless that is so?  That would be a simple way out of the problem, but
 I'm not really sure that it fixes the issue because Win32 people
 normally expect stuff to run by dropping privs internally.

Well, what the proposed patch does is fix the permissions problem by
re-executing pg_ctl in a restricted process.  What I was unhappy about
was the mechanism for deciding it needs to do that: I think it should
be something less easily breakable than looking for an environment
variable.

And I still don't see what that has to do with failing if the data
directory isn't specified.  Surely that should just lead to

pg_ctl: no database directory specified and environment variable PGDATA 
unset
Try "pg_ctl --help" for more information.

If that doesn't work on Windows, isn't there something else wrong
altogether?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2RC1 wraps this Thursday ...

2012-08-21 Thread Tom Lane
Amit Kapila amit.kap...@huawei.com writes:
 [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Tom Lane
 * pg_ctl crashes on Win32 when neither PGDATA nor -D specified

 I'm not sure that this qualifies as a release blocker either --- isn't
 it a plain-vanilla pre-existing bug?

 This is to handle one part of the overall problem. Below is text from
 previous mail discussion due to which new handling is introduced:
 
 I note that postgres -C data_directory will refuse to run on the
 command line because I've got admin privileges in Windows, and that
 pg_ctl normally starts postgres.exe using CreateRestrictedProcess.
 But it does not do so for the popen call in adjust_data_dir.

Ah, okay, so that is a new bug in 9.2.  I've adjusted the description
on the open-items page to reflect what still needs to be fixed.

 isn't there a way to actually test if we're in a restricted process?

 Do you mean to say that it should check if pg_ctl runs as an administrative
 user then do the re-fork in restricted mode. 

Something like that.  The proposed patch depends on there not being a
conflicting environment variable, which seems rather fragile to me.
Can't we test the same condition that postgres.exe itself would test?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Slow tab completion w/ lots of tables

2012-08-21 Thread Kevin Grittner
Stephen Frost sfr...@snowman.net wrote:
 
 Would people accept adding an index on pg_class.relname to support
 fast tab-completion? Or is this going to expand into figuring out
 how to support index-based partial lookups for the 'name' type, so
 we could use the existing index (if that's even possible 
 to do...)?
 
That already seems to work for me.
 
explain analyze
select relname from pg_class where relname like 'Party%';
 
 Index Scan using pg_class_relname_nsp_index on pg_class
 (cost=0.00..0.47 rows=1 width=64)
 (actual time=0.027..0.060 rows=19 loops=1)
   Index Cond: ((relname >= 'Party'::name)
   AND (relname < 'Partz'::name))
   Filter: (relname ~~ 'Party%'::text)
 Total runtime: 0.084 ms
 
-Kevin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL 9.2beta4 ( git HEAD) server crash on creating extension plpython3u

2012-08-21 Thread Josh Berkus
On 8/21/12 6:34 AM, Sachin Srivastava wrote:
 Hi all,
 
 On my Mac 10.6.8 using ActiveState Python 3.2.2, I am getting a crash when
 I try to execute CREATE EXTENSION plpython3u
 This is the backtrace:

Does it work in Postgres 9.1?


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Slow tab completion w/ lots of tables

2012-08-21 Thread Tom Lane
Stephen Frost sfr...@snowman.net writes:
 * Stephen Frost (sfr...@snowman.net) wrote:
 Is this a regression versus earlier releases, or just a bad thing in 
 general?

 It's really a regression- in prior releases

 Sorry, to clarify (after reading through my -hackers inbox a bit more
 and realizing you were probably asking about 9.2)- it's a regression,
 but it was also in earlier releases..  I'd have to go back to the git
 blame that I ran earlier to find it, but I think the change was made
 in 8.4 or 9.0, so I don't think it's a regression as far as
 9.0 -> 9.1 or 9.1 -> 9.2 is concerned.

 I'm slightly ashamed to admit that we discovered it during our migration
 from 8.2 -> 9.1...

Um ... I don't see any difference in the clause ordering from 8.2
forward.  SELECT * FROM baz<TAB> produces a query like this in 8.2:

LOG:  statement: SELECT pg_catalog.quote_ident(c.relname) FROM 
pg_catalog.pg_class c WHERE c.relkind IN ('r', 'S', 'v') AND 
substring(pg_catalog.quote_ident(c.relname),1,3)='baz' AND 
pg_catalog.pg_table_is_visible(c.oid) AND c.relnamespace <> (SELECT oid FROM 
pg_catalog.pg_namespace WHERE nspname = 'pg_catalog')
UNION
SELECT pg_catalog.quote_ident(n.nspname) || '.' FROM 
pg_catalog.pg_namespace n WHERE substring(pg_catalog.quote_ident(n.nspname) || 
'.',1,3)='baz' AND (SELECT pg_catalog.count(*) FROM pg_catalog.pg_namespace 
WHERE substring(pg_catalog.quote_ident(nspname) || '.',1,3) = 
substring('baz',1,pg_catalog.length(pg_catalog.quote_ident(nspname))+1)) > 1
UNION
SELECT pg_catalog.quote_ident(n.nspname) || '.' || 
pg_catalog.quote_ident(c.relname) FROM pg_catalog.pg_class c, 
pg_catalog.pg_namespace n WHERE c.relnamespace = n.oid AND c.relkind IN ('r', 
'S', 'v') AND substring(pg_catalog.quote_ident(n.nspname) || '.' || 
pg_catalog.quote_ident(c.relname),1,3)='baz' AND 
substring(pg_catalog.quote_ident(n.nspname) || '.',1,3) = 
substring('baz',1,pg_catalog.length(pg_catalog.quote_ident(n.nspname))+1) AND 
(SELECT pg_catalog.count(*) FROM pg_catalog.pg_namespace WHERE 
substring(pg_catalog.quote_ident(nspname) || '.',1,3) = 
substring('baz',1,pg_catalog.length(pg_catalog.quote_ident(nspname))+1)) = 1
LIMIT 1000

and the only difference in HEAD is another relkind in the IN clauses.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Slow tab completion w/ lots of tables

2012-08-21 Thread Stephen Frost
* Kevin Grittner (kevin.gritt...@wicourts.gov) wrote:
 That already seems to work for me.

Ah, yes, that does..  I was using the query from tab-complete.c, which
wraps it in quote_ident(c.relname), which ends up preventing us from
using the index.
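
For contrast with your plain-LIKE query, this is (simplified) the shape
tab-complete.c sends today; the quote_ident() wrapper is what keeps the
index from being used:

EXPLAIN SELECT pg_catalog.quote_ident(c.relname)
  FROM pg_catalog.pg_class c
 WHERE substring(pg_catalog.quote_ident(c.relname),1,5)='Party'
   AND pg_catalog.pg_table_is_visible(c.oid);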

There's a couple of other interesting corner cases, such as:

select * from "spa<tab>

Will return nothing, but:

select * from spa<tab>

Works just fine.

Similar, if you have a table called "TEST", then:

select * from "TE<tab>

works, but:

select * from TE<tab>
(or any other combo w/o a quote)

doesn't work.

Clearly, I'd be happier if we could use the index.  To be honest, I
think it'd also be nice if we could auto-quote tables (I've got quite a
few people who do things like create tables through ODBC, or migrate
from another DB where everything is all-uppercase, or mixed-case, and it
sure would be nice if those just worked).  I'm not sure that there's
much use-case for asking people to prefix their table with a - seems
more like a surprise that it's required than expected.

That said, I'm sure there are points to be made about how there'd be
problems with not requiring the , or with things more complex than just
mixed-case situations.  Perhaps we need to go through and try to
enumerate what we want to happen in each situation and try to reach a
consensus (and a plan for how to implement it..).

Thanks,

Stephen




Re: [HACKERS] Slow tab completion w/ lots of tables

2012-08-21 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
 Um ... I don't see any difference in the clause ordering from 8.2
 forward.  SELECT * FROM baz<TAB> produces a query like this in 8.2:

Odd..  I could have sworn I saw a difference in the query generated,
but perhaps I just assumed it was reordered, since it ends up ordered
differently in the actual plans:

8.2: http://explain.depesz.com/s/449
9.1: http://explain.depesz.com/s/03O

Increasing the cost of pg_table_is_visible() caused it to move that test
to the end again (under 9.1), which improved the performance (though I'm
sure having a usable index would help a great deal more..).
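
For anyone else bitten by this, bumping the cost is a one-liner (this is
simply the standard way to raise a function's cost; the value is the one
I mentioned earlier):

ALTER FUNCTION pg_catalog.pg_table_is_visible(oid) COST 10;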

Thanks,

Stephen




Re: [HACKERS] Slow tab completion w/ lots of tables

2012-08-21 Thread Tom Lane
Stephen Frost sfr...@snowman.net writes:
 * Robert Haas (robertmh...@gmail.com) wrote:
 Is this a regression versus earlier releases, or just a bad thing in general?

 Alright, so, yea, the commit I was referring to is this one:
 e84487f67a0d216f3db87b2558f1edd322a09e48
 Which was apparently in the 8.3 dev cycle, so this would be an
 8.2 -> 8.3 regression (as I mentioned in my last email, on this
 particular database, we're going from 8.2 -> 9.1).

Uh, no, that was in the 7.4 cycle.

 My vote is that everyone else needs to have databases with more tables,
 or they need to care about tab-completion speed more. :)

Agreed, but we're not doing anything about that in 9.2.

 I do also want to change tab-complete around a bit to make it so that we
 can actually index the query based on the name, which would clearly be a
  9.3 thing.  I was expecting some push-back on this idea, but haven't
 heard any yet.  Would people accept adding an index on pg_class.relname
 to support fast tab-completion?  Or is this going to expand into
 figuring out how to support index-based partial lookups for the 'name'
 type, so we could use the existing index (if that's even possible to
 do...)?

The LIKE idea is interesting.  What you'd still need is to suppress the
quote_ident function call so that it becomes just relname LIKE 'foo%'.
Which seems do-able if possibly rather ugly.  That would leave us with
SELECT ... FROM foo<TAB> being fast but SELECT ... FROM "foo<TAB> being
slow; not sure if that's good enough.  I do *not* want to add an index
on quote_ident(relname) to pg_class though.  Quite aside from speed
issues, I'm pretty sure that functional indexes on core system catalogs
would be a huge can of worms.
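
Something like this, as a sketch of the generated query rather than a
worked-out patch ('foo%' standing in for the word being completed):

SELECT pg_catalog.quote_ident(c.relname)
  FROM pg_catalog.pg_class c
 WHERE c.relname LIKE 'foo%'
   AND c.relkind IN ('r', 'S', 'v')
   AND pg_catalog.pg_table_is_visible(c.oid);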

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Slow tab completion w/ lots of tables

2012-08-21 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
 The LIKE idea is interesting.  What you'd still need is to suppress the
 quote_ident function call so that it becomes just relname LIKE 'foo%'.
 Which seems do-able if possibly rather ugly.  That would leave us with
 SELECT ... FROM foo<TAB> being fast but SELECT ... FROM "foo<TAB> being
 slow; not sure if that's good enough.  I do *not* want to add an index
 on quote_ident(relname) to pg_class though.  Quite aside from speed
 issues, I'm pretty sure that functional indexes on core system catalogs
 would be a huge can of worms.

That's the kind of concern that I was expecting, to be honest. :)  As
Kevin's pointed out, it's not likely to be needed anyway..  There's a
bit of an open question still regarding case-insensitive searching, but
perhaps we let that be slow and only done if we don't get any answers
back from a case-sensitive search?

For example, given these tables:

"TEST"
test

Doing:

select * from TE<tab>   -> "TEST"
select * from "TE<tab>  -> "TEST"
select * from te<tab>   -> test
select * from "te<tab>  -> test
select * from Te<tab>   -> test (but slow)
select * from "Te<tab>  -> test (but slow)

We'd essentially do: LIKE 'xx%', and then run quote_ident() on the
result (I assume we can replace the whole word, right?).  I'd also
strip off any ", for the purposes of searching with tab-completion.  I'm
not sure how easy it'd be to have a fall-back setup.  I do wonder if we
should do what I often recommend my devs do though, which is to have a
SQL or pl/pgsql function defined on the database-side, rather than
sending large/complex queries to the database from the application..

Thanks,

Stephen




Re: [HACKERS] 9.2RC1 wraps this Thursday ...

2012-08-21 Thread Amit kapila
From: Tom Lane [t...@sss.pgh.pa.us]
Sent: Tuesday, August 21, 2012 10:31 PM
Amit Kapila amit.kap...@huawei.com writes:
 [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Tom Lane
 * pg_ctl crashes on Win32 when neither PGDATA nor -D specified

 I'm not sure that this qualifies as a release blocker either --- isn't
 it a plain-vanilla pre-existing bug?

 This is to handle one part of the overall problem. Below is text from
 previous mail discussion due to which new handling is introduced:
 
 I note that postgres -C data_directory will refuse to run on the
 command line because I've got admin privileges in Windows, and that
 pg_ctl normally starts postgres.exe using CreateRestrictedProcess.
 But it does not do so for the popen call in adjust_data_dir.

Ah, okay, so that is a new bug in 9.2.  I've adjusted the description
on the open-items page to reflect what still needs to be fixed.

 isn't there a way to actually test if we're in a restricted process?

 Do you mean to say that it should check if pg_ctl runs as an administrative
 user then do the re-fork in restricted mode.

Something like that.  The proposed patch depends on there not being a
conflicting environment variable, which seems rather fragile to me.

Can't we test the same condition that postgres.exe itself would test?
  Yes, it should be possible. I will update the patch tomorrow and will post it 
here.
  And if there is any problem in having the same check as postgres.exe 
itself does, I shall find an alternative and discuss the same.
  
With Regards,
Amit Kapila.

   

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Slow tab completion w/ lots of tables

2012-08-21 Thread Tom Lane
Stephen Frost sfr...@snowman.net writes:
 There's a couple of other interesting corner cases, such as:

Yeah.  I had been thinking of this as purely a performance issue, but
if we want to consider adjusting the visible behavior as well, that
makes it a completely different thing.

 select * from "spa<tab>
 Will return nothing, but:
 select * from spa<tab>
 Works just fine.

Of course this is because quote_ident() only quotes if necessary.

 Similar, if you have a table called "TEST", then:
 select * from "TE<tab>
 works, but:
 select * from TE<tab>
 (or any other combo w/o a quote)
 doesn't work.

And that's because we're comparing to quote_ident() output, which
will have quotes in this case.

I wonder whether it would work to do something like this: if the word to
be tab-completed starts with a quote, compare to quote_ident output;
otherwise, compare to relname (or other name column) directly.

We'd still emit quote_ident output, which means that if you did

select * from TE<TAB>

it would change that to

select * from "TEST

(assuming you had say "TEST1" and "TEST2" so it couldn't complete further).
That seems like it would be a step forward in usability.  I'm not sure
that this covers all the corner cases usability-wise though.

From a performance point of view, this would be fast for the unquoted
case (if combined with Stephen's idea to use LIKE).  It'd still be slow
for quoted input though.

But then take another step: if the word-so-far has a leading quote and
no embedded quotes, we can clearly strip the leading quote and compare
the rest directly to the name column.  So that can be fast.  The only
cases that need be slow are names with embedded quotes, which surely
isn't a case that too many people care about.

In short, I think we might be able to make this fast, and more usable,
just with hacking on psql's query generation rules.  There's no need for
server-side changes.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Slow tab completion w/ lots of tables

2012-08-21 Thread Tom Lane
Stephen Frost sfr...@snowman.net writes:
 That's the kind of concern that I was expecting, to be honest. :)  As
 Kevin's pointed out, it's not likely to be needed anyway..  There's a
 bit of an open question still regarding case-insensitive searching, but
 perhaps we let that be slow and only done if we don't get any answers
 back from a case-sensitive search?

Um, I don't believe we do any case-insensitive search now, do we?

 We'd essentially do: LIKE 'xx%', and then run quote_ident() on the
 result (I assume we can replace the whole word, right?).  I'd also
  strip off any ", for the purposes of searching with tab-completion.

I think you might be saying the same thing I said in my prior message,
but not quite sure.

 I'm
 not sure how easy it'd be to have a fall-back setup.  I do wonder if we
 should do what I often recommend my dev do though, which is to have a
 SQL or pl/pgsql function defined on the database-side, rather than
 sending large/complex queries to the database from the application..

The nice thing about keeping this knowledge on the psql side is it would
still work with older servers.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Slow tab completion w/ lots of tables

2012-08-21 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 
 We'd still emit quote_ident output, which means that if you did
 
    select * from TE<TAB>
 
 it would change that to
 
    select * from "TEST
 
 (assuming you had say TEST1 and TEST2 so it couldn't complete
 further).
 
 if the word-so-far has a leading quote and no embedded quotes, we
 can clearly strip the leading quote and compare the rest directly
 to the name column.  So that can be fast.  The only cases that
 need be slow are names with embedded quotes, which surely isn't a
 case that too many people care about.
 
 In short, I think we might be able to make this fast, and more
 usable, just with hacking on psql's query generation rules. 
 There's no need for server-side changes.
 
I like it.  A lot.  With our camel-case naming convention, it would
make life a lot easier.
 
-Kevin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2RC1 wraps this Thursday ...

2012-08-21 Thread Robert Haas
On Tue, Aug 21, 2012 at 12:13 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Tue, Aug 21, 2012 at 10:47 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 * View options are problematic for pg_dump

 I had hoped those who created this problem were going to fix it, but
 given the lack of response I guess I'll have to.

 This is my fault, but my hackers inbox got flooded and this got lost
 in the shuffle.  Sorry.  I can probably devote some time to it today
 if you don't want to be bothered with it.  Do you have a sense of what
 the right fix is?

 I can work on it if you're still swamped.  I think it is probably
 fixable by treating the view options as attached to the _RETURN rule
 instead of the base table in pg_dump's objects.  (There is an ALTER VIEW
 command to set the security option, no?)

Yep, we need to emit:

ALTER VIEW whatever SET (security_barrier = true);

...after creating the rule that transforms it into a view.  I spent a
little time looking at this before lunch and it seems pretty
straightforward to exclude the options from the dump of the table:  I
think we can just have repairViewRuleMultiLoop() to clear ((TableInfo
*) table)->reloptions.

However, that by itself would result in them not getting dumped
anywhere, so then I guess we need to add a reloptions field to
RuleInfo.  repairViewRuleMultiLoop() can then detach the options from the
TableInfo object and attach them to the RuleInfo object.  Then we can
adjust dumpRule() to print an ALTER VIEW command for any attached
reloptions.  That seems pretty grotty because it kind of flies in the
face of the idea that the table and the rule are separate objects, but
I don't have a better idea.
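
So for a security_barrier view the dump would come out in roughly this
order (a sketch, with a made-up view name):

CREATE TABLE v (a integer);                  -- placeholder table for the view
CREATE RULE "_RETURN" AS ON SELECT TO v
    DO INSTEAD SELECT 1 AS a;                -- converts the table into a view
ALTER VIEW v SET (security_barrier = true);  -- reloptions emitted with the rule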

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL 9.2beta4 ( git HEAD) server crash on creating extension plpython3u

2012-08-21 Thread Sachin Srivastava
On Tue, Aug 21, 2012 at 10:37 PM, Josh Berkus j...@agliodbs.com wrote:

 On 8/21/12 6:34 AM, Sachin Srivastava wrote:
  Hi all,
 
  On my Mac 10.6.8 using ActiveState Python 3.2.2, I am getting a crash
 when
  I try to execute CREATE EXTENSION plpython3u
  This is the backtrace:

 Does it work in Postgres 9.1?

No. I get the same backtrace when I try against the 9.1.5 (REL9_1_STABLE)
branch.



 --
 Josh Berkus
 PostgreSQL Experts Inc.
 http://pgexperts.com


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers




-- 
Regards,
Sachin Srivastava
EnterpriseDB, India


Re: [HACKERS] PostgreSQL 9.2beta4 ( git HEAD) server crash on creating extension plpython3u

2012-08-21 Thread Josh Berkus

 No. I get the same backtrace when I try against the 9.1.5 (REL9_1_STABLE)
 branch.

OK, not a regression then.

Can you install plpython3u using non-ActiveState Python?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2RC1 wraps this Thursday ...

2012-08-21 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Aug 21, 2012 at 12:13 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I can work on it if you're still swamped.  I think it is probably
 fixable by treating the view options as attached to the _RETURN rule
 instead of the base table in pg_dump's objects.  (There is an ALTER VIEW
 command to set the security option, no?)

 Yep, we need to emit:

 ALTER VIEW whatever SET (security_barrier = true);

 ...after creating the rule that transforms it into a view.  I spent a
 little time looking at this before lunch and it seems pretty
 straightforward to exclude the options from the dump of the table:  I
 think we can just have repairViewRuleMultiLoop() to clear ((TableInfo
  *) table)->reloptions.

 However, that by itself would result in them not getting dumped
 anywhere, so then I guess we need to add a reloptions field to
  RuleInfo.  repairViewRuleMultiLoop() can then detach the options from the
 TableInfo object and attach them to the RuleInfo object.  Then we can
 adjust dumpRule() to print an ALTER VIEW command for any attached
 reloptions.  That seems pretty grotty because it kind of flies in the
 face of the idea that the table and the rule are separate objects, but
 I don't have a better idea.

Yeah, that sounds about right.  You want to do it, or shall I?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2RC1 wraps this Thursday ...

2012-08-21 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Aug 21, 2012 at 2:14 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Yeah, that sounds about right.  You want to do it, or shall I?

 If you don't mind dealing with it, that's great.  If you'd prefer that
 I cleaned up my own mess, I'll take care of it.

I can do it.  I have nothing on my plate today except get RC1 ready.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Slow tab completion w/ lots of tables

2012-08-21 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 So, I think that hacking on psql's query generation rules may well be
 a good idea, but shouldn't we also be bumping procost for the
 pg_whatever_is_visible functions?  I mean, Stephen's information
 suggests that those values are pretty clearly wrong, regardless of
 anything else.

Yeah, I think we discussed that once before.  I have no particular
objection to doing that in HEAD, just think it's a bit late for 9.2.
In any case, it will only help for medium-size numbers of entries;
once you get to enough tables/functions/whatever that a seqscan of the
catalog is bad news, only fixing the name matching is going to help.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Slow tab completion w/ lots of tables

2012-08-21 Thread Robert Haas
On Tue, Aug 21, 2012 at 1:52 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 In short, I think we might be able to make this fast, and more usable,
 just with hacking on psql's query generation rules.  There's no need for
 server-side changes.

So, I think that hacking on psql's query generation rules may well be
a good idea, but shouldn't we also be bumping procost for the
pg_whatever_is_visible functions?  I mean, Stephen's information
suggests that those values are pretty clearly wrong, regardless of
anything else.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2RC1 wraps this Thursday ...

2012-08-21 Thread Robert Haas
On Tue, Aug 21, 2012 at 2:14 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Tue, Aug 21, 2012 at 12:13 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I can work on it if you're still swamped.  I think it is probably
 fixable by treating the view options as attached to the _RETURN rule
 instead of the base table in pg_dump's objects.  (There is an ALTER VIEW
 command to set the security option, no?)

 Yep, we need to emit:

 ALTER VIEW whatever SET (security_barrier = true);

 ...after creating the rule that transforms it into a view.  I spent a
 little time looking at this before lunch and it seems pretty
 straightforward to exclude the options from the dump of the table:  I
 think we can just have repairViewRuleMultiLoop() to clear ((TableInfo
  *) table)->reloptions.

 However, that by itself would result in them not getting dumped
 anywhere, so then I guess we need to add a reloptions field to
  RuleInfo.  repairViewRuleMultiLoop() can then detach the options from the
 TableInfo object and attach them to the RuleInfo object.  Then we can
 adjust dumpRule() to print an ALTER VIEW command for any attached
 reloptions.  That seems pretty grotty because it kind of flies in the
 face of the idea that the table and the rule are separate objects, but
 I don't have a better idea.

 Yeah, that sounds about right.  You want to do it, or shall I?

If you don't mind dealing with it, that's great.  If you'd prefer that
I cleaned up my own mess, I'll take care of it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Slow tab completion w/ lots of tables

2012-08-21 Thread Stephen Frost
Tom,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
 Um, I don't believe we do any case-insensitive search now, do we?

No, I don't suppose we do..  I was thinking we ran quote_ident() on the
search-string side, but apparently we don't, meaning:

select * from TE<tab>

doesn't find 'test'.  I suppose it's alright to keep it that way.

  We'd essentially do: LIKE 'xx%', and then run quote_ident() on the
  result (I assume we can replace the whole word, right?).  I'd also
  strip off any , for the purposes of searching with tab-completion.
 
 I think you might be saying the same thing I said in my prior message,
 but not quite sure.

You added the distinction that we have to watch out for embedded quotes.
Otherwise, I believe we had the same approach, which is to strip off a
leading quote, if there is one, and then compare the raw string directly
to relname using: LIKE 'xxx%';  If there is an embedded quote, go back
to using quote_ident and matching the whole string.

Thanks,

Stephen




Re: [HACKERS] sha1, sha2 functions into core?

2012-08-21 Thread Merlin Moncure
On Mon, Aug 20, 2012 at 5:54 PM, Greg Sabino Mullane g...@turnstep.com wrote:
 3) use a purposefully slow hashing function like bcrypt.

 but I disagree: I don't like any scheme that encourages use of low
 entropy passwords.

 Perhaps off-topic, but how to do you figure that?

Yeah -- bcrypt's main claim to fame is that it's slow.  A *lot* of
people argue you're better off with a slow hash, and that's reasonable,
but I just don't like the speed/convenience tradeoff.  I suppose I'm
impatient.  My take on this is that relying on hash speed to protect
you if the attacker has the hash, the salt, and knows the algorithm is
pretty weak sauce.  At best it lowers the entropy requirements
somewhat: an 80 bit entropy password is not brute forcible no matter
how many server farmed GPUs you have.  The mechanics of how the hash
is calculated (see Joe C's excellent comments upthread) are much more
important considerations than algorithm choice.
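
For scale (assuming, very generously, 10^12 guesses/second across a GPU
farm): 2^80 is about 1.2 x 10^24 guesses, and 1.2 x 10^24 / 10^12 =
1.2 x 10^12 seconds, which is roughly 38,000 years.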

If you have high security requirements and your users refuse to use
high entropy passwords, I think you're better off going 2-factor than
hoisting slowness on everything that needs to authenticate.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2RC1 wraps this Thursday ...

2012-08-21 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Aug 21, 2012 at 10:47 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 * Checkpointer process split broke fsync'ing
 ** bug is fixed, but now we had better recheck earlier performance claims
 
 Is anyone actually going to do any performance testing on this?

 I am unlikely to have time between now and release.

Me either, and I didn't hear any other volunteers.

Even if testing showed that there was some performance regression,
I doubt that we would either revert the checkpointer process split or
hold up the release to look for another solution.  So realistically this
is not a blocker issue.  I'll move it to the not blockers section.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] huge tlb support

2012-08-21 Thread David Gould
On Tue, 21 Aug 2012 18:06:38 +0200
Andres Freund and...@2ndquadrant.com wrote:

 On Tuesday, August 21, 2012 05:56:58 PM Robert Haas wrote:
  On Tue, Aug 21, 2012 at 11:31 AM, Andres Freund
  and...@2ndquadrant.com 
 wrote:
   On Tuesday, August 21, 2012 05:30:28 PM Robert Haas wrote:
   On Thu, Aug 16, 2012 at 10:53 PM, David Gould da...@sonic.net
   wrote:
A warning, on RHEL 6.1 (2.6.32-131.4.1.el6.x86_64 #1 SMP) we
have had horrible problems caused by transparent_hugepages
running postgres on largish systems (128GB to 512GB memory, 32
cores). The system sometimes goes 99% system time and is very
slow and unresponsive to the point of not successfully
completing new tcp connections. Turning off
transparent_hugepages fixes it.
   
   Yikes!  Any idea WHY that happens?
 Afair there were several bugs that could cause that in earlier versions
 of the hugepage feature. The most prominent was something around never
 really stopping the search for mergeable pages even though the
 probability was small, or some such.

This is what I think was going on. We did see a lot (99%) of time in some
routine in the VM (I forget exactly which), and my interpretation was
that it was trying to create hugepages from scattered fragments.

   I'm inclined to think this torpedos any idea we might have of
   enabling hugepages automatically whenever possible.  I think we
   should just add a GUC for this and call it good.  If the state of
   the world improves sufficiently in the future, we can adjust, but
   I think for right now we should just do this in the simplest way
   possible and move on.
   
   He is talking about transparent hugepages not hugepages afaics.
  
  Hmm.  I guess you're right.  But why would it be different?
 Because in this case explicit hugepage usage reduces the pain instead
 of increasing it. And we cannot do much against transparent hugepages
 being enabled by default.
 Unless I misremember how things work the problem is/was independent of 
 anonymous mmap or sysv shmem.

Explicit hugepages work because the pages can be created early, before all
of memory is fragmented, and you either succeed or fail. Transparent
hugepages use a daemon that looks for processes that might benefit from
hugepages and tries to create hugepages on the fly. On a system that has
been up for some time, memory may be so fragmented that this is just a
waste of time.

Real as opposed to transparent hugepages would be a huge win for
applications that try to use high connection counts. Each backend
attached to the postgresql shared memory uses its own set of page table
entries at the rate of 2KB per MB of mapped shared memory. At 8GB of
shared buffers and 1000 connections this uses 16GB just for page tables.
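
Spelling that arithmetic out (x86-64 with 4KB pages and 8-byte PTEs, hence
the 2KB of page table entries per mapped MB):

    8GB mapped = 8192MB x 2KB = 16MB of page tables per backend
    16MB x 1000 backends = 16GB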

-dg

-- 
David Gould  510 282 0869 da...@sonic.net
If simplicity worked, the world would be overrun with insects.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] reviewing the Reduce sinval synchronization overhead patch / b4fbe392f8ff6ff1a66b488eb7197eef9e1770a4

2012-08-21 Thread Robert Haas
On Tue, Aug 21, 2012 at 12:14 PM, Nils Goroll sl...@schokola.de wrote:
 I am reviewing this one year old change again before backporting it to 9.1.3
 for production use.

 ATM, I believe the code is correct, but I don't want to miss the change to
 spot possible errors, so please let me dump my brain on some points:

 - IIUC, SIGetDataEntries() can return 0 when in fact there _are_ messages
    because stateP->hasMessages could come from a stale cache (iow there is
 no
   read-membar used and because we return before acquiring SInvalReadLock
 (which
   the patch is all about in the first place), we don't get an implicit
   read-membar from a lock op any more).

Right.

   What I can't judge on: Would this cause any harm? What are the
 consequences
   of SIGetDataEntries returning 0 after another process has posted a message
   (looking at global temporal ordering)?

   I don't quite understand the significance of the respective comment in the
   code that the incoherence should be acceptable because the cached read
 can't
   migrate to before a previous lock acquisition (which itself is clear).

Our sinval synchronization mechanism has a somewhat weird design that
makes this OK.  Sinval basically exists to support backend-local
caching, and any given piece of data that's being cached is
conceptually protected by some heavyweight lock L, taken normally in
access-share mode.  That means that, before relying on a backend-local
cache to be up to date, you must take that heavyweight lock, which
will call AcceptInvalidationMessages().  The fact that you've
successfully taken that heavyweight lock means that nobody else is
changing the data you care about, because to do that they would have
needed a conflicting lock i.e. access-exclusive mode.  So the guy
modifying the data logically does this:

T0. take lock in access-exclusive mode
T1. change stuff
T2. send invalidation messages
T3. release lock

While the other guy does this:

U0. take lock in access-share mode
U1. receive invalidation messages
U2. rebuild cache if necessary
U3. release lock

Step U1 cannot occur before step U0 (because lock acquisition is a
memory barrier).   Step T2 cannot occur after step T3 (because lock
release is a memory barrier).  And step U0 cannot occur before step T3
(because the locks conflict).  So the ordering is necessarily
T2->T3->U0->U1; thus, T2 must happen before U1 and we're OK.

Now, it is still true that if the lock taken U0 is *a different lock*
than the one release in T3 then there's no ordering between T2 and U1,
so U1 could miss invalidation messages that wipe out *some cache other
than the one it's about to examine*.  But it can't miss the ones for
the cache that it actually cares about.  Woohoo!

   AcceptInvalidationMessages has a comment that it should be the first thing
   to do in a transaction, and I am not sure if all the consumers have a
   read-membar equivalent operation in place.

The really important call site for this purpose is the one in LockRelationOid().

   How bad would a missed cache invalidation be? Should we have a read-membar
   in SIGetDataEntries just to be safe?

Not needed, per the above.  We should not add memory barriers anywhere
without a precise definition of what problem we're fixing.  They are
not free, and we don't want to get into the habit of inserting them as
ill-considered insurance against problems we don't fully understand.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] restartpoints stop generating on streaming-replication slave

2012-08-21 Thread Mathieu Fenniak
Hi all,

I've been investigating an issue with our PostgreSQL 9.1.1 (Linux x86-64
CentOS 5.8) database where restartpoints suddenly stop being generated on
the streaming-replication slave after working correctly for a week or two.
 The symptom of the problem is that the pg_xlog directory on the slave
doesn't get cleaned up, and the log_checkpoints output (e.g. "restartpoint
starting: time") stops appearing.

I was able to extract a core dump of the bgwriter process while it was in
BgWriterNap.  I inspected ckpt_start_time and last_checkpoint_time;
ckpt_start_time was 1345578533 (... 19:48:53 GMT) and last_checkpoint_time
was 1345578248 (... 19:44:08 GMT).  Based upon these values, I concluded
that it's performing checkpoints but missing the if (ckpt_performed)
condition (ie. CreateRestartPoint returns false); it's then setting
last_checkpoint_time to now - 5 minutes + 15 seconds.

There seem to be two causes of a false retval in CreateRestartPoint; the
first is if !RecoveryInProgress(), and the second is if the last
checkpoint record we've replayed is already our last restartpoint.  The
first condition doesn't seem likely; does anyone know how we might be
hitting the second condition?  We have continuous traffic on the master
server in the range of 1000 txn/s, and the slave seems to be completely
up-to-date, so I don't understand how we could be hitting this condition.
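
For reference, a quick way to confirm the slave is still receiving and
replaying WAL (9.1 monitoring functions, run on the slave):

SELECT pg_last_xlog_receive_location(),
       pg_last_xlog_replay_location(),
       pg_last_xlog_replay_timestamp();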

Mathieu


Re: [HACKERS] multi-master pgbench?

2012-08-21 Thread Tatsuo Ishii
 I am thinking about implementing a multi-master option for pgbench.
 Suppose we have multiple PostgreSQL servers running on host1 and host2.
 Something like pgbench -c 10 -h host1,host2... will create 5
 connections to host1 and host2 and send queries to host1 and host2.
 The point of this functionality is to test some cluster software which
 have a capability to create multi-master configuration.
 
 Comments?
 
 To distinguish it from simply running separate pgbench tests for each
 host, would this somehow test propagation of the writes?  Such a thing
 would be quite useful, but it seems at first glance like a large
 project.

What does "propagation of the writes" mean?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] temporal support patch

2012-08-21 Thread Gavin Flower

On 22/08/12 02:16, Kevin Grittner wrote:

Jeff Davis pg...@j-davis.com wrote:

 On Mon, 2012-08-20 at 19:32 -0500, Kevin Grittner wrote:

 Josh Berkus j...@agliodbs.com wrote:

 This is sounding like a completely runaway spec on what should
 be a simple feature.

 I hate to contribute to scope creep (or in this case scope
 screaming down the tracks at full steam), but I've been watching
 this with a queasy feeling about interaction with Serializable
 Snapshot Isolation (SSI).

 There are all kinds of challenges here, and I'm glad you're
 thinking about them. I alluded to some problems here:

 http://archives.postgresql.org/message-id/1345415312.20987.56.camel@jdavis

 But those might be a subset of the problems you're talking about.

 It sounds like, at a high level, there are two problems:

 1. capturing the apparent order of execution in the audit log
 2. assigning meaningful times to the changes that are consistent
 with the apparent order of execution

 As far as I can see, transactions which execute DML at any
 transaction isolation level other than serializable can be
 considered to have occurred in commit order.  Transactions which
 don't write to the database don't need to be considered as part of
 the history, at least in terms of viewing prior state.  Same with
 transactions which roll back.  (Now, failed transactions and reads
 might be of interest for some audit reports, but that seems to me
 like a different issue than a temporal database.)

 The funny bit is for a serializable transaction (TN) which commits
 after writing to the database -- you can't know the apparent order
 of execution as long as there are any serializable transactions
 active which can't see the work of TN (i.e., the transactions
 overlap).  If such a transaction (TX) executes a read which
 conflicts with a TN write, TX appears to have executed first, since
 it doesn't see the work of TN, so I think the sequence number or
 timestamp for TN has to follow that for TX even though TN committed
 first.  On the other hand, TX might write something that conflicts
 with a TN read, in which case TN will appear to have executed first
 and must get a sequence number or timestamp before TX.
 If there is a cycle, SSI will cancel one of the transactions
 involved, so that can't occur anywhere in the time line.

 So, if you want to allow serializable temporal queries, the timing
 of a read-write serializable transaction can't be locked down until
 all overlapping read-write serializable transactions complete; and
 the apparent order of execution must be based on read-write
 conflicts, which are tracked within SSI.  I think that if we can
 generate a list of committed transactions in order based on this
 logic, it could feed into replication system -- hot standby as well
 as trigger-based systems.  I think we could generate snapshots which
 exclude the transactions for which the order of execution has not
 yet been determined, and avoid the delays involved in other possible
 solutions.

 There's a lot of detail missing here in terms of what the API would
 be, and how we handle the summarization that can occur within SSI so
 that it can continue to function within bounded memory even in
 pessimal circumstances, but that's the general outline of my
 concerns and suggested solution.

 -Kevin




So if I understand correctly...

If there is a very long running transaction, say 1 hour, then all (or 
just some? - depending) transactions that nominally start and finish 
within that time, can not have definitive start times until the very 
long running transaction finishes, even if they are successfully committed?


So if someone looks at the audit log they might not see all the 
transactions they expect to see.


So, if I had an automatic query A which updated statistics based on 
transactions committed over the last 10 minutes, then many (all?) 
transactions starting and successfully completing during the time of the 
very long running transaction will never show up! Here I am envisioning 
a query fired off every ten minutes looking for audit records with 
timestamps within the previous ten minutes. However, if I ran a query B 
looking at audit record numbers with in 10 minute intervals for a week, 
but kicked off 24 hours after the week finished -- then I would see the 
records I did not see in query A.


Hmm... if I am at all right, then probably best to have some suitably 
worded 'government health warning' prominent in the documentation!



Cheers,
Gavin


Re: [HACKERS] temporal support patch

2012-08-21 Thread Alvaro Herrera
Excerpts from Gavin Flower's message of mar ago 21 16:51:57 -0400 2012:
 On 22/08/12 02:16, Kevin Grittner wrote:

  So, if you want to allow serializable temporal queries, the timing
  of a read-write serializable transaction can't be locked down until
  all overlapping read-write serializable transactions complete; and
  the apparent order of execution must be based on read-write
  conflicts, which are tracked within SSI.  I think that if we can
  generate a list of committed transactions in order based on this
  logic, it could feed into replication system -- hot standby as well
  as trigger-based systems.  I think we could generate snapshots which
  exclude the transactions for which the order of execution has not
  yet been determined, and avoid the delays involved in other possible
  solutions.

 If there is a very long running transaction, say 1 hour, then all (or 
 just some? - depending) transactions that nominally start and finish 
 within that time, can not have definitive start times until the very 
 long running transaction finishes, even if they are successfully committed?
 
 So if someone looks at the audit log they might not see all the 
 transactions they expect to see.

I think there would need to be a way to also list transactions which are
in progress -- this would include not only live transactions, but also
all those transactions that have actually committed but are not yet
listed as committed because their position in the commit sequence has
not been determined.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] temporal support patch

2012-08-21 Thread Kevin Grittner
Gavin Flower gavinflo...@archidevsys.co.nz wrote:
 
 So if I understand correctly...
 
 If there is a very long running transaction, say 1 hour, then all
 (or just some? - depending) transactions that nominally start and
 finish within that time, can not have definitive start times until
 the very long running transaction finishes, even if they are
 successfully committed?
 
That's not correct.  Any transaction which started can certainly
have a start time. Any transaction which completed can certainly
have a commit or rollback time.  What they *can't* have is a known
position in the apparent order of execution for serializable
transactions, which might be different from the order of start and
commit times.
 
The fact that it has an unknown sequence number or timestamp for
purposes of ordering visibility of transactions doesn't mean you
can't show that it completed in an audit log.  In other words, I
think the needs for a temporal database are significantly different
from the needs of an auditing system.
 
And keep in mind, we are only talking about seeing read-write
serializable transactions which might yet conflict with other
read-write serializable transactions when choosing to look at the
prior state within a temporal serializable transaction.  That's easy
enough to avoid if you want to do so.
 
 So if someone looks at the audit log they might not see all the 
 transactions they expect to see.
 
I would assume an audit log would have very different needs from
tracking changes for a temporal database view.  It even seems
possible that you might want to see what people *looked* at, versus
just changes.  You might want to see transactions which were rolled
back, which are irrelevant for a temporal view.  If we're talking
about an auditing system, we're talking about an almost completely
different animal from a temporal view of the database.
 
 So, if I had an automatic query A which updated statistics based
 on transactions committed over the last 10 minutes, then many
 (all?) transactions starting and successfully completing during
 the time of the very long running transaction will never show up!
 
A statistics capture process like that doesn't seem like a place
where you care about the apparent order of execution of serializable
transactions.
 
 Here I am envisioning a query fired off every ten minutes looking
 for audit records with timestamps within the previous ten minutes.
 
Which timestamp would make sense for that?
 
 However, if I ran a query B looking at audit record numbers within
 10-minute intervals for a week, but kicked off 24 hours after
 the week finished -- then I would see the records I did not see in
 query A.
 
 Hmm... if I am at all right, then probably best to have some
 suitably worded 'government health warning' prominent in the
 documentation!
 
We're clearly talking at cross purposes.  I'm discussing what is
needed to be able to see a past state of the database in a
transaction which would only see states of the database which are
consistent with some serial execution of serializable transactions
which modified the database, and you're talking about an audit
table.  If we're actually talking about an audit system, I have a
whole different set of concerns, and I would not be bringing this
one up.
 
The whole point of my concern is that if you have a business rule
enforced by database triggers that rows in a certain table contain
some batch identifier and no rows can be added to a batch after some
database change flags that batch as closed, then *without* what I
suggest, you could view a closed batch and see one set of rows, and
view the batch at a later point in time and magically see rows
appear in violation of the enforced business rules.  I'm talking
about *preventing* surprising appearances of data after the fact. 
You would need a big warning if you *don't* have what I suggest.
 
Now, it is quite possible that one table (or set of tables) could do
double-duty for both temporal queries and auditing, but the fact
that something is not needed there for one purpose doesn't mean it
isn't needed for the other.
 
There are still some fuzzy areas around how things would look with a
*mix* of serializable and other transactions updating the database;
but I think in the long run we will find that people either want to
do all of their modifications through SSI, or none of them.  The
guarantees get pretty weak if you don't know that all transactions
were part of the review of dangerous structures which can cause
anomalies.
 
If anyone is still unclear about serializable transactions in
PostgreSQL and wants to invest the time needed to read 12 pages
(including footnotes and pretty colored performance graphs) to
learn about them: the paper which is going to be presented at the
VLDB conference next week approaches the topic from a different
angle than I usually take, and it may click with many people where
my explanations have fallen short.
 

Re: [HACKERS] temporal support patch

2012-08-21 Thread Kevin Grittner
Alvaro Herrera alvhe...@2ndquadrant.com wrote:
 
 I think there would need to be a way to also list transactions
 which are in progress -- this would include not only live
 transactions, but also all those transactions that have actually
 committed but are not yet listed as committed because their
 position in the commit sequence has not been determined.
 
That might allow the same data to serve both needs with one or two
fewer timestamp (or similar) columns than what my post of a few
minutes ago suggested.
 
-Kevin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] 64-bit API for large object

2012-08-21 Thread Tatsuo Ishii
Hi,

I found this in the TODO list:

  Add API for 64-bit large object access 

If this is still a valid TODO item and nobody is working on it, I
would like to work on it.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Audit Logs WAS: temporal support patch

2012-08-21 Thread Josh Berkus
First, note the change in topic.

This whole discussion has gone rather far afield from Miroslav's
original submission, which was for temporal tables, which is NOT the
same thing as audit logs, although the use cases overlap significantly.

Miroslav, I know this has been hard to follow, but you're getting a lot
of feedback because people are really interested in the feature and
related features.

 That sounds like a good way to start. Actually, even before the tool,
 how about just some really good examples of triggers for specific kinds
 of audit logs, and some ways to run queries on them? I think that might
 settle a lot of these details.

Well, I'm not averse to solving some problems in the core:

1) That it's difficult/impossible to write a completely generic audit
trigger which works with any table without utilizing an external SP
language like Python.

2) That there's no obvious way to handle audit triggers and FK
relationships intelligently.

3) That audit tables don't automatically track schema changes in the
live table.

4) Checking which columns have changed (see Craig Ringer's email)

These seem like difficult enough challenges without getting more
complicated.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] multi-master pgbench?

2012-08-21 Thread Tatsuo Ishii
 Why wouldn't you just fire up several copies of pgbench, one per host?

Well, it would be more convenient. Aside from the bottleneck discussion
below, a simple tool to generate load is important IMO. It will help
developers enhance multi-master configurations by finding bugs and
problems, if any. IMO we have seen a similar relationship between
pgbench and PostgreSQL.

 The main reason I'm dubious about this is that it's demonstrable that
 pgbench itself is the bottleneck in many test scenarios.  That problem
 gets worse the more backends you try to have it control.  You can of
 course solve this with multiple threads in pgbench, but as soon as you
 do that there's no functional benefit over just running several copies.

Are you sure that running several copies of pgbench could produce more
TPS than a single pgbench? I thought that was just a limitation of the
resources of the machine pgbench is running on, so to avoid the
bottleneck in pgbench I would have to use several pgbench client
machines simultaneously anyway.

Another point is what kind of transactions you want. A pgbench -S
type transaction produces a trivial load and could easily reveal a
bottleneck in pgbench. However, this type of transaction is a pretty
extreme one and very different from transactions in the real world. So
even if your argument is true, I guess it only applies to the pgbench
-S case.
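
To make the proposed behavior concrete, here is a hypothetical sketch of
how a comma-separated -h host list might be split round-robin across
clients; nothing like this exists in pgbench today, and all names are
invented for illustration:

    #include <stdio.h>

    int main(void)
    {
        const char *hosts[] = {"host1", "host2"};  /* from -h host1,host2 */
        int         nhosts = 2;
        int         nclients = 10;                 /* from -c 10 */
        int         i;

        for (i = 0; i < nclients; i++)
        {
            /* client i would open its connection to hosts[i % nhosts],
             * giving five connections per host in this example */
            printf("client %d -> %s\n", i, hosts[i % nhosts]);
        }
        return 0;
    }
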
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Audit Logs WAS: temporal support patch

2012-08-21 Thread Kevin Grittner
Josh Berkus j...@agliodbs.com wrote:
 
 First, note the change in topic.
 
 This whole discussion has gone rather far afield from Miroslav's
 original submission, which was for temporal tables, which is NOT
 the same thing as audit logs, although the use cases overlap
 significantly.
 
I don't think the concerns I raised about apparent order of
execution for serializable transactions apply to audit logs.  If
we've moved entirely off the topic of the original subject, it is a
complete non-issue.
 
-Kevin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] multi-master pgbench?

2012-08-21 Thread Tom Lane
Tatsuo Ishii is...@postgresql.org writes:
 Why wouldn't you just fire up several copies of pgbench, one per host?

 Well, it would be more convenient. Aside from the bottleneck discussion
 below, a simple tool to generate load is important IMO.

Well, my concern here is that it's *not* going to be simple.  By the
time we get done adding enough switches to control connection to N
different hosts (possibly with different usernames, passwords, etc),
then adding frammishes to control which scripts get sent to which hosts,
and so on, I don't think it's really going to be simpler to use than
launching N copies of pgbench.

It might be worth doing if we had features that allowed the different
test scripts to interact, so that they could do things like check
replication propagation from one host to another.  But pgbench hasn't
got that, and in multi-job mode really can't have that (at least not
in the Unix separate-processes implementation).  Anyway that's a whole
nother level of complexity that would have to be added on before you
got to a useful feature.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] multi-master pgbench?

2012-08-21 Thread David Fetter
On Wed, Aug 22, 2012 at 06:26:00AM +0900, Tatsuo Ishii wrote:
  I am thinking about implementing a multi-master option for pgbench.
  Suppose we have multiple PostgreSQL servers running on host1 and host2.
  Something like pgbench -c 10 -h host1,host2... will create 5
  connections each to host1 and host2 and send queries to both.
  The point of this functionality is to test some cluster software which
  has the capability to create a multi-master configuration.
  
  Comments?
  
  To distinguish it from simply running separate pgbench tests for each
  host, would this somehow test propagation of the writes?  Such a thing
  would be quite useful, but it seems at first glance like a large
  project.
 
 What does propagation of the writes mean?

I apologize for not being clear.  In a multi-master system, people
frequently wish to know how quickly a write operation has been
duplicated to the other nodes.  In some sense, those write operations
are incomplete until they have happened on all nodes, even in the
asynchronous case.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] multi-master pgbench?

2012-08-21 Thread Tatsuo Ishii
 Well, my concern here is that it's *not* going to be simple.  By the
 time we get done adding enough switches to control connection to N
 different hosts (possibly with different usernames, passwords, etc),
 then adding frammishes to control which scripts get sent to which hosts,
 and so on, I don't think it's really going to be simpler to use than
 launching N copies of pgbench.

 It might be worth doing if we had features that allowed the different
 test scripts to interact, so that they could do things like check
 replication propagation from one host to another.  But pgbench hasn't
 got that, and in multi-job mode really can't have that (at least not
 in the Unix separate-processes implementation).  Anyway that's a whole
 nother level of complexity that would have to be added on before you
 got to a useful feature.

I did not intend to implement such a feature. As I wrote in the
subject line, I intended to enhance pgbench for multi-master
configuration. IMO, any node on multi-master configuration should
accept *any* queries, not only read queries but write queries. So bare
PostgreSQL streaming replication configuration cannot be a
multi-master configuration and will not be a target of the new
pgbench.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] problem when optimizing the window aggregation

2012-08-21 Thread Wang, Chaoyong
Hi,

 

I'm trying to reduce the recomputation in window aggregation, using the
AVG function as an example.

The original window aggregation's transition value (transValue) for AVG
is an ArrayType that contains two main values (sum, count).

Now, I'm using a temporary transition value (tempTransValue), and I need
to copy tempTransValue to transValue.

I used the function datumCopy as follows:

peraggstate->transValue = datumCopy(peraggstate->tempTransValue,
peraggstate->transtypeByVal, peraggstate->transtypeLen);

But when the copied transValue is passed to the invoked function, here
int4_avg_accum, the ArrayType returned by PG_GETARG_ARRAYTYPE_P(0) is
null, which means the copy failed.

Does anybody know why, or can you give me some suggestions? Thanks very
much.

Best Regards

Chaoyong Wang

 



Re: [HACKERS] multi-master pgbench?

2012-08-21 Thread Tatsuo Ishii
 What does propagation of the writes mean?
 
 I apologize for not being clear.  In a multi-master system, people
 frequently wish to know how quickly a write operation has been
 duplicated to the other nodes.  In some sense, those write operations
 are incomplete until they have happened on all nodes, even in the
 asynchronous case.

IMO, that kind of functionality is beyond the scope of benchmark tools.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] problem when optimizing the window aggregation

2012-08-21 Thread Tom Lane
Wang, Chaoyong chaoyong.w...@emc.com writes:
 I used the function datumCopy as follows:

 peraggstate->transValue = datumCopy(peraggstate->tempTransValue,
 peraggstate->transtypeByVal, peraggstate->transtypeLen);

You need to guard that with an is-null check, because datumCopy isn't
designed to cope with null values.  (Which is historical, I guess, but
we're not likely to change it now.)  If you look around you will find
plenty of examples of this coding pattern.
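
For illustration, a minimal sketch of that guard; the is-null flag name
is assumed here, not taken from the actual patch:

    /* datumCopy() must not be applied to a null datum: for a
     * pass-by-reference type it would dereference a NULL pointer. */
    if (!peraggstate->tempTransValueIsNull)   /* assumed flag name */
        peraggstate->transValue = datumCopy(peraggstate->tempTransValue,
                                            peraggstate->transtypeByVal,
                                            peraggstate->transtypeLen);
    else
        peraggstate->transValue = (Datum) 0;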

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] multi-master pgbench?

2012-08-21 Thread Tom Lane
Tatsuo Ishii is...@postgresql.org writes:
 Well, my concern here is that it's *not* going to be simple.  By the
 time we get done adding enough switches to control connection to N
 different hosts (possibly with different usernames, passwords, etc),
 then adding frammishes to control which scripts get sent to which hosts,
 and so on, I don't think it's really going to be simpler to use than
 launching N copies of pgbench.

 I did not intend to implement such a feature. As I wrote in the
 subject line, I intended to enhance pgbench for multi-master
 configuration. IMO, any node on multi-master configuration should
 accept *any* queries, not only read queries but write queries. So bare
 PostgreSQL streaming replication configuration cannot be a
 multi-master configuration and will not be a target of the new
 pgbench.

Well, you're being shortsighted then, because such a feature will barely
have hit the git repository before somebody wants to use it differently.
I can easily imagine wanting to stress a master plus some hot-standby
slaves, for instance; and that would absolutely require being able to
direct different subsets of the test scripts to different hosts.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] multi-master pgbench?

2012-08-21 Thread Tatsuo Ishii
 I did not intend to implement such a feature. As I wrote in the
 subject line, I intended to enhance pgbench for multi-master
 configuration. IMO, any node on multi-master configuration should
 accept *any* queries, not only read queries but write queries. So bare
 PostgreSQL streaming replication configuration cannot be a
 multi-master configuration and will not be a target of the new
 pgbench.
 
 Well, you're being shortsighted then, because such a feature will barely
 have hit the git repository before somebody wants to use it differently.
 I can easily imagine wanting to stress a master plus some hot-standby
 slaves, for instance; and that would absolutely require being able to
 direct different subsets of the test scripts to different hosts.

I don't see any practical way to implement such a tool, because there's
always a chance of trying to retrieve non-existent data from a hot
standby due to replication delay.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] problem when optimizing the window aggregation

2012-08-21 Thread Wang, Chaoyong
Thanks again; the reason has been found.

The following statement frees the entire aggregation context, including
the transValue and tempTransValue:
  MemoryContextResetAndDeleteChildren(winstate->aggcontext);
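
A sketch of one way around that is to perform the copy in a memory
context that survives the reset; the choice of winstate->partcontext as
the longer-lived context, and the is-null flag name, are assumptions
for illustration:

    MemoryContext oldcontext;

    /* allocate the copied datum somewhere that outlives resets of
     * winstate->aggcontext */
    oldcontext = MemoryContextSwitchTo(winstate->partcontext);
    if (!peraggstate->tempTransValueIsNull)   /* assumed flag name */
        peraggstate->transValue = datumCopy(peraggstate->tempTransValue,
                                            peraggstate->transtypeByVal,
                                            peraggstate->transtypeLen);
    MemoryContextSwitchTo(oldcontext);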


-Original Message-
From: Tom Lane [mailto:t...@sss.pgh.pa.us] 
Sent: August 22, 2012 9:47
To: Wang, Chaoyong
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] problem when optimizing the window aggregation

Wang, Chaoyong chaoyong.w...@emc.com writes:
 I used the function datumCopy as follows:

 peraggstate->transValue = datumCopy(peraggstate->tempTransValue,
 peraggstate->transtypeByVal, peraggstate->transtypeLen);

You need to guard that with an is-null check, because datumCopy isn't
designed to cope with null values.  (Which is historical, I guess, but
we're not likely to change it now.)  If you look around you will find
plenty of examples of this coding pattern.

regards, tom lane



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [WIP] Performance Improvement by reducing WAL for Update Operation

2012-08-21 Thread Amit Kapila
From: Jesper Krogh [mailto:jes...@krogh.cc] 
Sent: Wednesday, August 22, 2012 1:13 AM
On 21/08/12 16:57, Amit kapila wrote: 

Test results:

1. The pgbench test ran for 10 min.
2. The test result is for a modified pgbench (such that the total row
size is 1800 and the updated columns are of length 300) tpc-b test case.
The result and the modified pgbench code are attached to the mail.
3. The performance improvement shown on the machine I have tested is
quite good (more than 100% for synchronous_commit = off).


 I cannot comment on the completeness or correctness of the code, but I
 do think a relevant test would be to turn synchronous_commit on as the
 default.

 Even though you aim at improved performance, it would be nice to see
 the reduction in WAL size as an effect of this patch.

Yes, I shall take care of doing both the above tests and send the report.

 

With Regards,

Amit Kapila.



Re: [HACKERS] multi-master pgbench?

2012-08-21 Thread Greg Sabino Mullane


 The point of this functionality is to test some cluster
 software which has the capability to create a multi-master
 configuration.

As the maintainer of software that does multi-master, I'm a little 
confused as to why we would extend pgbench to do this. The software 
in question should be doing the testing itself, ideally via 
its test suite (i.e. make test). Having pgbench do any of this 
would be at best a very poor subset of the tests the software 
should be performing. I suppose if the software *uses* pgbench for 
its tests already, one could argue for a limited test case - but it seems 
difficult to design pgbench options generic and powerful enough to 
handle cases outside of the one piece of software this change is aimed at.

-- 
Greg Sabino Mullane g...@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201208212330
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] multi-master pgbench?

2012-08-21 Thread Tatsuo Ishii
 As the maintainer of software that does multi-master, I'm a little 
 confused as to why we would extend pgbench to do this. The software 
 in question should be doing the testing itself, ideally via 
 its test suite (i.e. make test). Having pgbench do any of this 
 would be at best a very poor subset of the tests the software 
 should be performing. I suppose if the software *uses* pgbench for 
 its tests already, one could argue for a limited test case - but it seems 
 difficult to design pgbench options generic and powerful enough to 
 handle cases outside of the one piece of software this change is aimed at.

Well, my point was made upthread:
 Right. If pgbench had such functionality, we could compare
 those projects by using pgbench. Currently those projects use
 different benchmarking tools. That means the comparison is something
 like apples-to-oranges. With an enhanced pgbench we could do an
 apples-to-apples comparison.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 64-bit API for large object

2012-08-21 Thread Peter Eisentraut
On Wed, 2012-08-22 at 07:27 +0900, Tatsuo Ishii wrote:
 I found this in the TODO list:
 
   Add API for 64-bit large object access 
 
 If this is still a valid TODO item and nobody is working on it, I
 would like to work on it.

Large objects are limited to 2 GB in size, so a 64-bit API doesn't sound
very useful to me at the moment.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 64-bit API for large object

2012-08-21 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 On Wed, 2012-08-22 at 07:27 +0900, Tatsuo Ishii wrote:
 I found this in the TODO list:
 Add API for 64-bit large object access 
 If this is still a valid TODO item and nobody is working on it, I
 would like to work on it.

 Large objects are limited to 2 GB in size, so a 64-bit API doesn't sound
 very useful to me at the moment.

Not entirely.  pg_largeobject.pageno is int32, but that's still 2G pages
not bytes, so there's three or so orders of magnitude that could be
gotten by expanding the client-side API before we'd have to change the
server's on-disk representation.
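
To make the arithmetic concrete, a quick sketch; LOBLKSIZE is BLCKSZ/4
in the server, i.e. 2048 bytes with the default 8K block size:

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        const int64_t loblksize = 8192 / 4;   /* default BLCKSZ / 4 */
        const int64_t max_pages = INT32_MAX;  /* pg_largeobject.pageno */

        /* roughly 4TB addressable with the current on-disk format,
         * versus the 2GB ceiling of the 32-bit client-side API */
        printf("max large object size: %lld bytes\n",
               (long long) (max_pages * loblksize));
        return 0;
    }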

There might well be some local variables in the server's largeobject
code that would need to be widened, but that's the easiest part of the
job.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 64-bit API for large object

2012-08-21 Thread Tatsuo Ishii
 Large objects are limited to 2 GB in size, so a 64-bit API doesn't sound
 very useful to me at the moment.
 
 Not entirely.  pg_largeobject.pageno is int32, but that's still 2G pages
 not bytes, so there's three or so orders of magnitude that could be
 gotten by expanding the client-side API before we'd have to change the
 server's on-disk representation.

Right. You have already explained that in this:
http://archives.postgresql.org/pgsql-hackers/2010-09/msg01888.php

 There might well be some local variables in the server's largeobject
 code that would need to be widened, but that's the easiest part of the
 job.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers