Re: [HACKERS] FSM search modes

2009-09-17 Thread decibel

On Sep 18, 2009, at 1:09 AM, Simon Riggs wrote:

On Fri, 2009-09-18 at 10:47 +0900, Itagaki Takahiro wrote:

Simon Riggs  wrote:


* compact - page selection specifically attempts to find the lowest
numbered blocks, so that the table will naturally shrink over time.


We cannot shrink the table if one tuple remains at the end of the table
and that tuple is always HOT-updated, because we put a new tuple into
the same page where the old tuple is placed, if possible.

In addition to your intelligent FSM search modes,
do we need another algorithm to make the compaction work better?


Perhaps we can have an additional piece of information about a table.
Something like target_size, so that normal updaters that attempt HOT
updates on blocks greater than target_size would know to avoid that
block and to seek a new location for the row lower in the table. Perhaps
such information could be reset and then sent via invalidation
mechanisms.

I'm thinking along the lines of a "fire alarm": an occasional mechanism
by which we can inform users of the need to evacuate certain blocks.



It might be better not to beat around the bush and instead provide a
vacuum mode that explicitly tries to free the last X percent of the
table. That's especially handy for a very large table, because you
probably don't want to be forced to scan the whole thing in vacuum just
to free some space at the end. This mode could also be more aggressive
about trying to acquire the lock that's needed to trim the file on disk.


That said, *any* step that improves dealing with table bloat is  
extremely welcome, as right now you're basically stuck rebuilding the  
table.

--
Decibel!, aka Jim C. Nasby, Database Architect  deci...@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Feedback on getting rid of VACUUM FULL

2009-09-17 Thread Simon Riggs

On Thu, 2009-09-17 at 17:44 -0400, Tom Lane wrote:
> Dimitri Fontaine  writes:
> > I don't see any reason why not breaking the user visible behavior of
> > tuples CTID between any two major releases,
> 
> > Am I completely wet here?
> 
> Completely.  This is a user-visible behavior that we have encouraged
> people to rely on, and for which there is no easy substitute.

Agreed. I investigated that avenue as a possible implementation approach
when designing HOT and I didn't find anything worth taking away.

I'm very much in favour of a higher-level solution to compacting a
table, as has been discussed for the batch update utility. That avoids
most of the low-level yuck that VACUUM FULL imposes upon itself and
everyone around it. If we want to move forward long term we need to keep
the internals as clean as possible. Hot Standby would never have been
possible without that principle having already been applied across the
other subsystems.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Feedback on getting rid of VACUUM FULL

2009-09-17 Thread Albe Laurenz
Tom Lane wrote:
> > I don't see any reason why not breaking the user visible behavior of
> > tuples CTID between any two major releases,
> 
> > Am I completely wet here?
> 
> Completely.  This is a user-visible behavior that we have encouraged
> people to rely on, and for which there is no easy substitute.

I second that: it would hurt to lose this generic technique for
optimistic locking.

Yours,
Laurenz Albe

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby 0.2.1

2009-09-17 Thread Simon Riggs

On Thu, 2009-09-17 at 19:01 -0400, Robert Haas wrote:
> >
> > I'm going to put the index-only scans aside for now to focus on hot
> > standby and streaming replication. Both are big patches, so there's
> > plenty of work in those two alone, and not only for me.
> 
> What is the best way to attack this?  Should I keep reviewing
> index-only scans so that you have feedback for when you get back to
> it, or should I move on to something else?  If something else, does it
> make more sense for me to look at HS since I did a bit of work with a
> previous version, or would it be better for me to just pick one of the
> other patches from the CommitFest and work on that?
> 
> Also, stepping back from me personally, should we try to assign some
> additional reviewers to these patches?  Is there some way we can
> divide up review tasks among multiple people so that we're not
> repeating each other's work?
> 
> Thoughts appreciated, from Heikki, Simon, or others.

I think this is a great opportunity to widen the pool of people
contributing to reviews.

I suggest the creation of a second group of people, performing
round-robin testing of patches. These people would be able to verify
* documentation matches implemented features (does it do what it says on
the tin?)
* usability of explicit features (do the features work well?)
* usability gap of unimplemented features (what else do we need?)
* are there any bugs?

These questions are often quickly answered for smaller patches, but HS's
scope means that such a task, properly executed, could take a full week,
if not longer.

The second group would be just as skilled with Postgres as the code
reviewers, in some cases more so, except that they have less detailed
knowledge of the codebase. We have many such people and it would be good
to encourage them to perform thorough reviews rather than "tire kicking".

I'm not sure that Heikki needs additional reviewers. He now has
significant knowledge of the patch and is good at focusing on key
aspects of the internals. Other code reviewers are welcome, of course.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] pg_ctl init extension

2009-09-17 Thread Zdenek Kotala

Bernd Helmle píše v čt 17. 09. 2009 v 23:26 +0200:
> 
> --On 17. September 2009 23:00:12 +0300 Peter Eisentraut  
> wrote:
> 
> > If the name is a problem, why not change the name?  What you are
> > proposing above is effectively a very elaborate name change, so why not
> > go for a simpler one?
> 
> I don't like the solution of using -o "" to push down 
> command line options to initdb. I always had the opinion that this was (and 
> is) a valid workaround for postgres itself, but it doesn't feel correct to 
> expose that further to initdb and its purposes.

Hmm, I could modify it so that all args after the init keyword are passed
through to initdb, like this:

pg_ctl -D /tmp/db init []
 
  initdb []

and "pg_ctl -h init" would show help on which options are allowed.

Zdenek


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] pg_ctl init extension

2009-09-17 Thread Zdenek Kotala

Peter Eisentraut píše v čt 17. 09. 2009 v 23:00 +0300:
> On tor, 2009-09-17 at 21:43 +0200, Zdenek Kotala wrote:
> > Attached patch extends pg_ctl command with init option. 
> > 
> > pg_ctl -D /var/lib/postgres [-s] init
> > 
> > This should replace usage of the initdb command, which has a problematic name
> > as we already discussed several times. The initdb binary will still be
> > there, but it can be renamed and moved into the execlib dir in the future.
> 
> If the name is a problem, why not change the name?  What you are
> proposing above is effectively a very elaborate name change, so why not
> go for a simpler one?

The idea is to have one command for server control. In my opinion, init
logically belongs to the group of commands like start/stop. It would also
be possible to add a parameter for init+start in one command, and so on.
If you look at ZFS, you have only two commands to manage everything; it
is easy, and you can start to use ZFS very quickly. I think this patch
increases the usability/adoption of PostgreSQL for newbies.

The second big advantage is that it would be possible to easily extend
pg_ctl to cope with different PostgreSQL versions (e.g. "pg_ctl -v 8.2
-D . init"). There is no reason why pg_ctl couldn't start a different
PostgreSQL version depending on PG_VERSION in the data directory.

Zdenek



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] pgbench: new feature allowing to launch shell commands

2009-09-17 Thread Michael Paquier
>
> You really should be returning a value at the point since the function
> signature defines a return type. If not the function should be void,
> which it cannot be in this context since it is used for boolean tests
> elsewhere. The returns in question are all part of error blocks and
> should return false.
>

OK, I see your point; I missed the part dealing with CState, which is tested
after doCustom.
Another version of the patch is attached with this email.

> I must be doing something wrong when I am running this script. It is
> still throwing errors. Would you mind showing me the pgbench command you
> used to run it?

Of course, here is the list of commands I use:
pgbench -i dbname (in case your database is called dbname)
pgbench -c 10 -t 10 -f transaction_file_name.data dbname (client and
transaction counts can also be set as you want).

Regards,

-- 
Michael Paquier

NTT OSSC
--- postgresql-8.4.0.orig/contrib/pgbench/pgbench.c	2009-09-17 09:07:24.0 +0900
+++ postgresql-8.4.0/contrib/pgbench/pgbench.c	2009-09-18 13:24:33.0 +0900
@@ -120,6 +120,7 @@ typedef struct
 } Variable;
 
 #define MAX_FILES		128		/* max number of SQL script files allowed */
+#define SHELL_COMMAND_SIZE	256		/* maximum size allowed for shell command */
 
 /*
  * structures used in custom query mode
@@ -984,7 +985,47 @@ top:
 
 			st->listen = 1;
 		}
+		else if (pg_strcasecmp(argv[0], "shell") == 0)
+		{
+			int		j,
+					retval,
+					retvalglob;
+			char	commandLoc[SHELL_COMMAND_SIZE];
+
+			retval = snprintf(commandLoc, SHELL_COMMAND_SIZE - 1, "%s", argv[1]);
+			if (retval < 0
+				|| retval > SHELL_COMMAND_SIZE - 1)
+			{
+				fprintf(stderr, "Error launching shell command: too many characters\n");
+				st->ecnt++;
+				return;
+			}
+			retvalglob = retval;
+
+			for (j = 2; j < argc; j++)
+			{
+				char *commandLoc2 = strdup(commandLoc);
+
+				retval = snprintf(commandLoc, SHELL_COMMAND_SIZE - 1, "%s %s", commandLoc2, argv[j]);
+				retvalglob += retval;
+				if (retval < 0
+					|| retvalglob > SHELL_COMMAND_SIZE - 1)
+				{
+					fprintf(stderr, "Error launching shell command: too many characters\n");
+					free(commandLoc2);
+					st->ecnt++;
+					return;
+				}
+				free(commandLoc2);
+			}
+			retval = system(commandLoc);
+			if (retval < 0)
+			{
+				fprintf(stderr, "Error launching shell command: command not launched\n");
+				st->ecnt++;
+				return;
+			}
+			st->listen = 1;
+		}
 		goto top;
 	}
 }
@@ -1280,6 +1321,14 @@ process_commands(char *buf)
 				fprintf(stderr, "%s: extra argument \"%s\" ignored\n",
 						my_commands->argv[0], my_commands->argv[j]);
 		}
+		else if (pg_strcasecmp(my_commands->argv[0], "shell") == 0)
+		{
+			if (my_commands->argc < 1)
+			{
+				fprintf(stderr, "%s: missing command\n", my_commands->argv[0]);
+				return NULL;
+			}
+		}
 		else
 		{
 			fprintf(stderr, "Invalid command %s\n", my_commands->argv[0]);

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FSM search modes

2009-09-17 Thread Simon Riggs

On Fri, 2009-09-18 at 10:47 +0900, Itagaki Takahiro wrote:
> Simon Riggs  wrote:
> 
> > * compact - page selection specifically attempts to find the lowest
> > numbered blocks, so that the table will naturally shrink over time.
> 
> We cannot shrink the table if one tuple remains at the end of the table
> and that tuple is always HOT-updated, because we put a new tuple into
> the same page where the old tuple is placed, if possible.
> 
> In addition to your intelligent FSM search modes,
> do we need another algorithm to make the compaction work better?

Perhaps we can have an additional piece of information about a table.
Something like target_size, so that normal updaters that attempt HOT
updates on blocks greater than target_size would know to avoid that
block and to seek a new location for the row lower in the table. Perhaps
such information could be reset and then sent via invalidation
mechanisms.

I'm thinking along the lines of a "fire alarm". An occasional mechanism
by which we can inform users of the need to evacuate certain blocks.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-09-17 Thread Fujii Masao
Hi,

On Thu, Sep 17, 2009 at 8:32 PM, Heikki Linnakangas
 wrote:
> Some random comments:

Thanks for the comments.

> I don't think we need the new PM_SHUTDOWN_3 postmaster state. We can
> treat walsenders the same as the archive process, and kill and wait for
> both of them to die in PM_SHUTDOWN_2 state.

OK, I'll use PM_SHUTDOWN_2 for walsender instead of PM_SHUTDOWN_3.

> I think there's something wrong with the napping in walsender. When I
> perform px_xlog_switch(), it takes surprisingly long for it to trickle
> to the standby. When I put a little proxy program in between the master
> and slave that delays all messages from the slave to the master by one
> second, it got worse, even though I would expect the master to still
> keep sending WAL at full speed. I get logs like this:

Probably this is because XLOG records following XLOG_SWITCH are
sent to the standby, too. Though those records are obviously not used
for recovery, they are sent because walsender doesn't know where
XLOG_SWITCH is.

The difficulty is that there might be many XLOG_SWITCHes in the XLOG
files which are going to be sent by walsender. How should walsender
get to know those locations? One possible solution is to make walsender
parse the XLOG files and search for XLOG_SWITCH. But this is overkill,
I think.

I don't think that an XLOG switch is requested often, or that it is
sensitive to response time in many cases. So it's not worth changing
walsender to skip the XLOG following XLOG_SWITCH, I think. Thoughts?

> 2009-09-17 14:14:09.932 EEST LOG:  xlog send request 0/38000428; send
> 0/3800; write 0/3800
> 2009-09-17 14:14:09.932 EEST LOG:  xlog read request 0/38000428; send
> 0/38000428; write 0/3800
>
> It looks like it's having 100 or 200 ms naps in between. Also, I
> wouldn't expect to see so many "read request" acknowledgments from the
> slave. The master doesn't really need to know how far the slave is,
> except in synchronous replication when it has requested a flush to
> slave. Another reason why master needs to know is so that the master can
> recycle old log files, but for that we'd really only need an
> acknowledgment once per WAL file or even less.

You mean that a new protocol for asking the standby about the completion
location of replication is required? In the synchronous case, the backend
should not wait for one acknowledgement per XLOG file, for performance
reasons.

> Why does XLogSend() care about page boundaries? Perhaps it's a leftover
> from the old approach that read from wal_buffers?

That is to avoid sending a partially-filled XLOG *record*, which simplifies
the logic by which the startup process waits for the next available XLOG
record, i.e., the startup process doesn't need to handle a partially-sent
record.

> Do we really need the support for asynchronous backend libpq commands?
> Could walsender just keep blasting WAL to the slave, and only try to
> read an acknowledgment after it has requested one, by setting
> XLOGSTREAM_FLUSH flag. Or maybe we should be putting the socket into
> non-blocking mode.

Yes, that is required, especially for synchronous replication. Receiving
the acknowledgement should not keep the subsequent XLOG-sending waiting.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch: Review handling of MOVE and FETCH (ToDo)

2009-09-17 Thread Pavel Stehule
OK, thank you. I'll look into these issues soon.


regards
Pavel

2009/9/18 Selena Deckelmann :
> Hi!
>
> John Naylor and I reviewed this patch. John created two test cases to
> demonstrated issues described later in this email.  I've attached
> those for reference.
>
> On Thu, Aug 27, 2009 at 8:04 PM, Pavel Stehule  
> wrote:
>> Hello,
>>
>> this small patch completes MOVE support in plpgsql and equalizes plpgsql
>> syntax with SQL syntax.
>>
>> There are possible new directions:
>>
>> FORWARD expr, FORWARD ALL, BACKWARD expr, BACKWARD ALL.
>>
>> These directions are not allowed for the FETCH statement, because they
>> return more than one row.
>>
>> This patch is related to ToDo issue: Review handling of MOVE and FETCH
>
> == Submission review ==
>
> * Is the patch in the standard form?
>
> Yes, we have a contextual diff!
>
> * Does it apply cleanly to the current CVS HEAD?
>
> Yes!
>
> * Does it include reasonable tests, docs, etc?
>
> Suggestion: change variable 'returns_row' to 'returns_multiple_rows'
> and invert the values of 'returns_row' in the patch.
>
> Example:
>
> if (!fetch->returns_row)
> ereport(ERROR,
> (errcode(ERRCODE_SYNTAX_ERROR),
> errmsg("statement FETCH returns more rows."),
> errhint("Multirows fetch are not allowed in PL/pgSQL.")));
>
> instead do:
>
> if (fetch->returns_multiple_rows)
> ereport(ERROR,
> (errcode(ERRCODE_SYNTAX_ERROR),
> errmsg("statement FETCH returns more than one row."),
> errhint("Multirow FETCH is not allowed in PL/pgSQL.")));
>
> Does that make sense?  In reading the code, we thought this change
> made this variable much easier to understand, and the affected code
> much easier to read.
>
> ===
>
> Need a hard tab here to match the surrounding code: :)
> + %token  K_ALL$
>
> ===
>
> Can you clarify this comment?
>
> + /*
> +  * Read FETCH or MOVE statement direction. For statement for are only
> +  * one row directions allowed. MOVE statement can use FORWARD [(n|ALL)],
> +  * BACKWARD [(n|ALL)] directions too.
> +  */
>
> We think what you mean is:
> By default, cursor will only move one row. To MOVE more than one row
> at a time see complete_direction()
>
> We tested on Mac OS X and Linux (Ubuntu)
> ===
>
> Also, the tests did not test what the standard SQL syntax would
> require. John created a test case that demonstrated that the current
> patch did not work according to the SQL spec.
>
> We used that to find a little bug in complete_direction() (see below!).
>
> == Usability review ==
>
> Read what the patch is supposed to do, and consider:
>
> * Does the patch actually implement that?
>
> No -- we found a bug:
>
> Line 162 of the patch:
> +   expr = read_sql_expression2(K_FROM, K_IN,
>  Should be:
> +   fetch->expr = read_sql_expression2(K_FROM, K_IN,
>
> And you don't need to declare expr earlier in the function.
>
> Once that's changed, the regression test needs to be updated for the
> expected result set.
>
> * Does it follow SQL spec, or the community-agreed behavior?
>
> It conforms with the already implemented SQL syntax once the
> 'fetch->expr' thing is fixed.
>
> * Have all the bases been covered?
>
> We think so, as long as the documentation is fixed and the above changes
> are applied.
>
> Another thing John noted is that additional documentation needs to be
> updated for the SQL standard syntax, so that it no longer says that
> PL/PgSQL doesn't implement the same functionality.
>
> Thanks!
> -selena & John
>
> --
> http://chesnok.com/daily - me
> http://endpoint.com - work
>

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch: Review handling of MOVE and FETCH (ToDo)

2009-09-17 Thread Selena Deckelmann
Hi!

John Naylor and I reviewed this patch. John created two test cases to
demonstrated issues described later in this email.  I've attached
those for reference.

On Thu, Aug 27, 2009 at 8:04 PM, Pavel Stehule  wrote:
> Hello,
>
> this small patch completes MOVE support in plpgsql and equalizes plpgsql
> syntax with SQL syntax.
>
> There are possible new directions:
>
> FORWARD expr, FORWARD ALL, BACKWARD expr, BACKWARD ALL.
>
> These directions are not allowed for the FETCH statement, because they
> return more than one row.
>
> This patch is related to ToDo issue: Review handling of MOVE and FETCH

== Submission review ==

* Is the patch in the standard form?

Yes, we have a contextual diff!

* Does it apply cleanly to the current CVS HEAD?

Yes!

* Does it include reasonable tests, docs, etc?

Suggestion: change variable 'returns_row' to 'returns_multiple_rows'
and invert the values of 'returns_row' in the patch.

Example:

if (!fetch->returns_row)
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("statement FETCH returns more rows."),
errhint("Multirows fetch are not allowed in PL/pgSQL.")));

instead do:

if (fetch->returns_multiple_rows)
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("statement FETCH returns more than one row."),
errhint("Multirow FETCH is not allowed in PL/pgSQL.")));

Does that make sense?  In reading the code, we thought this change
made this variable much easier to understand, and the affected code
much easier to read.

===

Need a hard tab here to match the surrounding code: :)
+ %token  K_ALL$

===

Can you clarify this comment?

+ /*
+  * Read FETCH or MOVE statement direction. For statement for are only
+  * one row directions allowed. MOVE statement can use FORWARD [(n|ALL)],
+  * BACKWARD [(n|ALL)] directions too.
+  */

We think what you mean is:
By default, cursor will only move one row. To MOVE more than one row
at a time see complete_direction()

We tested on Mac OS X and Linux (Ubuntu)
===

Also, the tests did not test what the standard SQL syntax would
require. John created a test case that demonstrated that the current
patch did not work according to the SQL spec.

We used that to find a little bug in complete_direction() (see below!).

== Usability review ==

Read what the patch is supposed to do, and consider:

* Does the patch actually implement that?

No -- we found a bug:

Line 162 of the patch:
+   expr = read_sql_expression2(K_FROM, K_IN,
 Should be:
+   fetch->expr = read_sql_expression2(K_FROM, K_IN,

And you don't need to declare expr earlier in the function.

Once that's changed, the regression test needs to be updated for the
expected result set.

* Does it follow SQL spec, or the community-agreed behavior?

It conforms with the already implemented SQL syntax once the
'fetch->expr' thing is fixed.

* Have all the bases been covered?

We think so, as long as the documentation is fixed and the above changes
are applied.

Another thing John noted is that additional documentation needs to be
updated for the SQL standard syntax, so that it no longer says that
PL/PgSQL doesn't implement the same functionality.

Thanks!
-selena & John

-- 
http://chesnok.com/daily - me
http://endpoint.com - work


MOVE_patch_test_case2
Description: Binary data


MOVE_patch_test_case
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bulk Inserts

2009-09-17 Thread Jeff Janes
2009/9/16 Pierre Frédéric Caillaud 

>
>  OK, that makes sense.  I thought you had hacked either XLogInsert or the
>> heap WAL replay code so that you could just accumulate tuples in the rdata
>> chain and then submit them all under the cover of a single WALInsertLock.
>>
>
> Ah, no, I did not do that.
>
> This would be difficult to do : rdata chain contains buffer pointers, and
> when we are in XLogInsert, we have an exclusive lock on the buffer. If
> XLogInsert accumulated xlog records as you say, then later, when it's time
> to write them to the xlog, it would no longer hold exclusive lock on the
> buffer, so its contents could have changed, and if XLogInsert decides to do
> a full page write, the contents will be wrong.
>

Yes, I didn't mean to make XLogInsert unilaterally accumulate rdata
(Actually I have done that, purely as a proof of concept.  The resulting
database is completely unrecoverable, but as long you don't bring it down,
it runs fine and lets me test the speed of different concepts without going
to the trouble of implementing them correctly).

heap_bulk_insert would do the accumulation.  The hack to XLogInsert would
involve making it insert an extra dummy xlog record header for every tuple
in the rdata chain.  Alternatively, the hack to heap replay would involve
making it accept multiple tuples reported under a single WAL record.  I
don't know which one would be easier.


>
> Besides, the LSN needs to be set in the page at every call to heap_insert
> (or else WAL will not be "Ahead"), so if XLogInsert accumulated stuff before
> writing it, it would need a mechanism to assign a LSN without having written
> the xlog data yet.
>

Right, XLogInsert would only be able to accumulate as long as it knew it was
going to get called again before the buffer exclusive lock was released.
That is why the accumulation is better done in the heap_bulk_insert,
otherwise it would require an unfortunate amount of communication between
the two.



>  If you haven't done that, then of course doing the bulk insert doesn't
>> help much if you still do tuple-by-tuple XLogInsert.
>>
>
> Exactly.
>
>  So in the case that it is
>> under the limit, you first run through the tuples putting them into the
>> block, then run through the tuples again doing the XLogInserts?
>>
>
> Yes, exactly. This isn't really optimal...
> I wonder if I could build one rdata chain containing all updates to the
> tuples and submit it in one go. Would it work ?...


I'm sure it could be made to work.  I haven't checked the replay code, but I
doubt it would work on this massed record right out of the box.  Or we could
insert dummy headers between each tuple's WAL data.

...


> So in order to benchmark the right thing, I have :
> - all the tables in a big RAMDISK
> - xlog on RAID5
> - fsync=fdatasync
>
> I've also found that setting wal_buffers to a large value like 128MB gives
> a significant speed boost when doing COPY or INSERT INTO SELECT, probably
> because it allows the backends to always find space in the buffers even if
> the walwriter is a bit busy.


That seems very large, even for the high-throughput setup you describe.  Is
the WAL background writer set at the default interval?  (On the other hand,
if 128MB is just a rounding error in your machine's total RAM size, why not
be generous?  On the other other hand, I've seen performance start to drop as
wal_buffers gets too large, though I never bothered to chase down the
cause.)

At one point, I think, XLogInsert would synchronously write out the
buffer when it noticed it was over half full, but that got ripped out (if
I'm going to block, I might as well wait until it actually is full to do
so).  Now that there is a background writer, maybe it would be a good idea
to have XLogInserters wake it up if the buffer is half full.
...


>  Another angle of attack would be to make wal-writing more efficient...
>>>
>> If you mean to do this without changing the xlog interfaces, I'm not
>> optimistic.
>>
>
> Agree : that's why I didn't even try ;)
>
>  If you have to call XLogInsert once per row that is copied (or
>> insert...select), then my experiments show that simply taking the
>> WALInsertLock and immediately releasing it, doing absolutely no real work
>> while it is held, is already a substanial multi-core scalibility
>> bottleneck.
>>
>
> Confirmed by context switch issue above...
>
> Having all cores block on the same lock for every row can be OK if it's a
> spinlock protecting just a few lines of code... which is not the present
> case...



Maybe there could be some hybrid approach.  You take the spinlock, and check
that you could get the lwlock if you wanted to.  If you could get the lwlock
and know you have almost no work to do, then just do it while holding the
spinlock.  If you discover you have more work to do (xlog file switch,
nontrivial AdvanceXLInsert, etc.), then actually take the lwlock and drop the
spinlock.  This *really* breaks the encapsulation of lwlock, so it is
proba

Re: [HACKERS] LDAP where DN does not include UID attribute

2009-09-17 Thread Robert Fleming
On Thu, Sep 17, 2009 at 6:24 PM, Robert Fleming  wrote:

> The new patch ...
>

s/printf/ereport(LOG,/


Re: [HACKERS] PGCluster-II Progress

2009-09-17 Thread Marcos Luis Ortiz Valmaseda
Excellent news, Mitani. My personal intention is to help you with this work on
the redesign of PGCluster-II, or on the coding of it.

If we work as a team, we can release this new version of the project more
quickly.
Do you have any source repository for this new version? I work fine with CVS
and Subversion, but if you are working with Git, even better, because it is my
favorite SCM.

Regards, and I'll be awaiting your answer.

"The hurry is enemy of the success: for that reason...Be patient"

Ing. Marcos L. Ortiz Valmaseda
Línea Soporte y Despliegue
Centro de Tecnologías de Almacenamiento y Análisis de Datos (CENTALAD)

Linux User # 418229
PostgreSQL User
http://www.postgresql.org
http://www.planetpostgresql.org/
http://www.postgresql-es.org/


- Mensaje original -
De: mit...@sraw.co.jp
Para: dev...@gunduz.org, "Marc G. Fournier" 
CC: "pgsql-hackers" 
Enviados: Jueves, 17 de Septiembre 2009 15:13:43 GMT -10:00 Hawai
Asunto: Re: [HACKERS] PGCluster-II Progress

I'm sorry for my late response.
I already sent the same contents to Marcos, but I did not cc all of you.
(I'll join the pgsql-hackers ML.)

1. Status of PGCluster-II
PGCluster-II is being redesigned.
I developed and demonstrated PGCluster-II as a full data shared model in 2006.
It worked fine.
However, from a performance point of view, it was not good at all.
Then I improved it in 2007. Even so, it was absurdly slow.

I concluded that it required redesigning to improve performance.
The full data shared model is not difficult to build up (I developed the 1st
PGCluster-II within 2 weeks),
but it is not easy to make it perform well.
So I chose the partial data shared model as my conclusion.
I spent two years deciding this.

2. Prospects of PGCluster-II
Currently, I'm developing the cluster server management tools as a part of the
latest PGCluster.
It will be in an upcoming release.
After that, I will go back to the PGCluster-II project.

On the coming Nov. 19, the International PostgreSQL Cluster Developer's Meeting
(I'm not sure of the proper name) will be held in Tokyo.
I'll discuss the tasks of PGCluster-II as a part of PostgreSQL's cluster
projects at the meeting.
Clustering is getting more important each day.
I'm expecting that PGCluster & PGCluster-II will play a role in the solution.

Best regards,
---
At.Mitani


-- original message --
From: Devrim GÜNDÜZ
To: Marc G. Fournier
Cc: Marcos Luis Ortiz Valmaseda;
 mitani;pgsql-hackers
Sent: Tue, 15 Sep 2009 19:20:37 +0300
Subject: Re: [HACKERS] PGCluster-II Progress

>On Tue, 2009-09-15 at 13:16 -0300, Marc G. Fournier wrote:
>> Odd, I talked to him a couple of weeks ago and he was working on a
>> new 
>> release in preparation for some upcoming talks he was doing ... was 
>> working on bringing it up to support 8.3.x ...
>> 
>> "But, I'm just prepareing new version of the PGCluster..."
>
>Mitani was probably talking about PGCluster, not PGCluster-II. Mitani?
>
>The last time I heard about PGCluster-II was at the Anniversary Summit (2006).
>Oh, and I met him in France in 2007, and there have been no improvements since
>then.
>
>Regards,
>-- 
>Devrim GÜNDÜZ, RHCE
>Command Prompt - http://www.CommandPrompt.com 
>devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
>   http://www.gunduz.org
>


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers



Re: [HACKERS] Patch for 8.5, transformationHook

2009-09-17 Thread Robert Haas
On Sun, Sep 13, 2009 at 10:32 PM, Robert Haas  wrote:
> On Tue, Aug 11, 2009 at 12:09 AM, Pavel Stehule  
> wrote:
>> 2009/8/10 Tom Lane :
>>> Robert Haas  writes:
 On Sun, Jul 26, 2009 at 9:29 AM, Pavel Stehule 
 wrote:
> new patch add new contrib "transformations" with three modules
> anotation, decode and json.
>>>
 These are pretty good examples, but the whole thing still feels a bit
 grotty to me.  The set of syntax transformations that can be performed
 with a hook of this type is extremely limited - in particular, it's
 the set of things where the parser thinks it's valid and that the
 structure is reasonably similar to what you have in mind, but the
 meaning is somewhat different.  The fact that two of your three
 examples require your named and mixed parameters patch seems to me to
 be evidence of that.
>>>
>>> I finally got around to looking at these examples, and I still don't
>>> find them especially compelling.  Both the decode and the json example
>>> could certainly be done with regular function definitions with no need
>>> for this hook.  The => to AS transformation maybe not, but so what?
>>> The reason we don't have that one in core is not technological.
>>>
>>> The really fundamental problem with this hook is that it can't do
>>> anything except create syntactic sugar, and a pretty darn narrow class
>>> of syntactic sugar at that.  Both the raw parse tree and the transformed
>>> tree still have to be valid within the core system's understanding.
>>> What's more, since there's no hook in ruleutils.c, what is going to come
>>> out of the system (when dumping, examining a view, etc) is the
>>> transformed expression --- so you aren't really hiding any complexity
>>> from the user, you're just providing a one-time shorthand that will be
>>> expanded into a notation he also has to be familiar with.
>>>
>>
>> I agree - so this could be a problem
>>
>>> Now you could argue that we've partly created that restriction by
>>> insisting that the hook be in transformFuncCall and not transformExpr.
>>> But that only restricts the subset of raw parse trees that you can play
>>> with; it doesn't change any of the other restrictions.
>>>
>>> Lastly, I don't think the problem of multiple hook users is as easily
>>> solved as Pavel claims.  These contrib modules certainly fail to solve
>>> it.  Try unloading (or re-LOADing) them in a different order than they
>>> were loaded.
>>>
>>
>> There are two possible solutions:
>>
>> a) all modules should be loaded only from the configuration
>> b) modules should be loaded at transformation time - the transformation
>> would substitute some registered function for certain
>> functions. This slightly changes the sense of this patch, but it's enough
>> for use cases like DECODE, JSON, SOAP. It would mean one new column in
>> pg_proc - something like protransformfunc.
>>
>> ???
>> Pavel
>>
>>> So on the whole I still think this is a solution looking for a problem,
>>> and that any problems it could solve are better solved elsewhere.
>
> I am in the process of looking through the patches to be assigned for
> the September CommitFest, and it seems to me that we really haven't
> made any progress here since the last CommitFest.  Jeff Davis provided
> a fairly good summary of the issues:
>
> http://archives.postgresql.org/message-id/1249784508.9256.892.ca...@jdavis
>
> I don't think we really gain much by assigning yet another reviewer to
> this patch.  The patch is simple enough and doesn't really need any
> further code review AFAICS, but nobody except the patch author seems
> confident that this is all that useful.[1]  I'm biased by the fact that
> I reviewed this patch and didn't particularly like it either, but I
> think we need more than that to justify committing this in the face of
> Tom Lane's opinion (which I share, FWIW) that this is of very limited
> usefulness.
>
> ...Robert
>
> [1] Indeed, the few supportive responses were along the lines of "oh -
> this should help with X" to which the response was, in at least two
> cases, "well actually no it won't".

Based on the above reasoning, and hearing no contrary hue and cry from
the masses, I am marking this patch as rejected.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] pg_class_ownercheck() in EnableDisableRule()

2009-09-17 Thread KaiGai Kohei
I found a strange check in the native database privilege stuff.

The EnableDisableRule() calls pg_class_ownercheck() to check
the ownership of the relation which has a rule to be turned
on/off.

Only ATExecEnableDisableRule() calls the EnableDisableRule(),
and only ATExecCmd() calls the ATExecEnableDisableRule()
with AT_EnableRule, AT_EnableAlwaysRule, AT_EnableReplicaRule
and AT_DisableRule.

However, ATPrepCmd() already checks ownership of the target
relation using ATSimplePermissions(), so the pg_class_ownercheck()
in EnableDisableRule() always returns true.

I replaced these checks with the ac_rule_toggle() abstraction
in my patch. However, it may be better simply to remove them.

Thanks,
-- 
OSS Platform Development Division, NEC
KaiGai Kohei 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FSM search modes

2009-09-17 Thread Itagaki Takahiro

Simon Riggs  wrote:

> * compact - page selection specifically attempts to find the lowest
> numbered blocks, so that the table will naturally shrink over time.

We cannot shrink the table if one tuple remains at the end of the table
and that tuple is always HOT-updated, because we put a new tuple into
the same page as the old tuple if possible.

In addition to your intelligent FSM search modes,
do we need another algorithm to make the compaction work better?
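The interaction can be illustrated with a toy model (all names here are invented for illustration; this is not PostgreSQL internals): a HOT update stays on its current page whenever there is room, so a "compact" page-selection policy never gets a chance to move the last tuple.

```python
# Toy model of FSM page selection vs. HOT updates (invented names,
# not PostgreSQL internals).

def choose_page(fsm, needed, mode="normal"):
    """fsm: dict of page number -> free bytes.  Pick a page with room."""
    candidates = [p for p, free in fsm.items() if free >= needed]
    if not candidates:
        return None                  # would extend the relation instead
    if mode == "compact":
        return min(candidates)       # lowest-numbered block first
    return candidates[0]             # any suitable block

def hot_update(fsm, old_page, needed, mode="normal"):
    """HOT update: reuse the same page if it fits, else ask the FSM."""
    if fsm.get(old_page, 0) >= needed:
        return old_page              # same-page update: table cannot shrink
    return choose_page(fsm, needed, mode)

fsm = {0: 4000, 1: 4000, 99: 200}    # page 99 is the last block, nearly full
# A tuple on page 99 needing 100 bytes keeps landing on page 99 forever:
assert hot_update(fsm, 99, 100, mode="compact") == 99
# Only when page 99 cannot hold the new version does compact mode
# pull the tuple toward the front of the table:
assert hot_update(fsm, 99, 500, mode="compact") == 0
```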

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PGCluster-II Progress

2009-09-17 Thread mitani
I'm sorry for my late response.
I already sent the same content to Marcos, but I did not cc all of you.
(I'll join the pgsql-hackers ML.)

1. Status of PGCluster-II
PGCluster-II is being redesigned.
I developed and demonstrated PGCluster-II as a full data shared model in 2006.
It worked fine. 
However, from a performance point of view, it was not good at all.
So I improved it in 2007. Even so, it was absurdly slow.

I concluded that it needed redesigning to improve performance.
The full data shared model is not difficult to build (I developed the first 
PGCluster-II within 2 weeks),
but it is not easy to make it perform well.
So I chose the partial data shared model.
It took me two years to reach that decision.

2. Prospects of PGCluster-II
Currently, I'm developing the cluster server management tools as part of the 
latest PGCluster.
They will be in an upcoming release.
After that, I will go back to the PGCluster-II project.

On the coming Nov. 19, the International PostgreSQL Cluster Developer's Meeting 
(I'm not sure of the proper name) will be held in Tokyo.
I'll discuss the tasks for PGCluster-II as part of PostgreSQL's cluster 
projects at that meeting. 
Clustering is getting more important every day.
I'm expecting PGCluster & PGCluster-II to play a role in the solution.

Best regards,
---
At.Mitani


-- original message --
From: Devrim GÜNDÜZ
To: Marc G. Fournier
Cc: Marcos Luis Ortiz Valmaseda;
 mitani;pgsql-hackers
Sent: Tue, 15 Sep 2009 19:20:37 +0300
Subject: Re: [HACKERS] PGCluster-II Progress

>On Tue, 2009-09-15 at 13:16 -0300, Marc G. Fournier wrote:
>> Odd, I talked to him a couple of weeks ago and he was working on a
>> new 
>> release in preparation for some upcoming talks he was doing ... was 
>> working on bringing it up to support 8.3.x ...
>> 
>> "But, I'm just prepareing new version of the PGCluster..."
>
>Mitani was probably talking about PGCluster, not PGCluster-II. Mitani?
>
>The last time I heard about PGCluster-II was at the Anniversary Summit (2006).
>Oh, and I met him in France in 2007, and there have been no improvements since
>then.
>
>Regards,
>-- 
>Devrim GÜNDÜZ, RHCE
>Command Prompt - http://www.CommandPrompt.com 
>devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
>   http://www.gunduz.org
>


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] LDAP where DN does not include UID attribute

2009-09-17 Thread Robert Fleming
On Thu, Sep 17, 2009 at 11:15 AM, Magnus Hagander wrote:

> On Thu, Sep 17, 2009 at 18:02, Robert Fleming  wrote:
> > Following a discussion on the pgsql-admin list
> > , I
> have
> > created a patch to (optionally) allow PostgreSQL to do a LDAP search to
> > determine the user's DN (as is done in Apache, MediaWiki, Bugzilla, et
> al.)
> > instead of building the DN from a prefix and suffix.
> > This is necessary for schemas where the login attribute is not in the DN,
> > such as is described here
> >  (look
> for
> > "name-based").  This patch is against PostgreSQL 8.4.0 from Debian
> > Lenny-backports.  If this would be a welcome addition, I can port it
> forward
> > to the latest from postgresql.org.
> > Thanks in advance for your feedback.
>
> This sounds like a very useful feature, and one that I can then remove
> from my personal TODO list without having to do much work :-)
>
> A couple of comments:
>
> First of all, please read up on the PostgreSQL coding style, or at
> least look at the code around yours. This doesn't look anything like
> our standards.
>

Sorry, the 8.1 manual said all contributions go through pgindent, so I didn't
spend too much time on that.  I see that the 8.4 manual clarifies that
pgindent won't get run until release time.  In any case, I've re-formatted
the patch using the Emacs settings from the 8.1 manual (why are they gone
from the 8.4 manual?).  It seems like most of the rest of the Coding
Conventions have to do with error reporting.  Do please let me know if
there's something else I can do.

> Second, this appears to require an anonymous bind to the directory,
> which is something we should not encourage people to enable on their
> LDAP servers. I think we need to also take parameters with a DN and a
> password to bind with in order to do the search, and then re-bind as
> the user when found.


The new patch implements the initial bind using the new configuration
parameters "ldapbinddn" and "ldapbindpasswd".  I've also added an
"ldapsearchattribute" parameter just to be complete.
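For illustration only, a pg_hba.conf line using the proposed options might look like the following (the option names come from the patch description above; the server, base DN, bind DN, and password values are invented):

```
# Search+bind: bind as ldapbinddn, search under ldapbasedn for an entry whose
# ldapsearchattribute matches the login name, then re-bind as the DN found.
host all all 0.0.0.0/0 ldap ldapserver=ldap.example.com ldapbasedn="ou=people,dc=example,dc=com" ldapbinddn="cn=pgsearch,dc=example,dc=com" ldapbindpasswd=secret ldapsearchattribute=uid
```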

Robert
diff -ru postgresql-8.4-8.4.0-original/src/backend/libpq/auth.c postgresql-8.4-8.4.0/src/backend/libpq/auth.c
--- postgresql-8.4-8.4.0-original/src/backend/libpq/auth.c	2009-06-25 04:30:08.0 -0700
+++ postgresql-8.4-8.4.0/src/backend/libpq/auth.c	2009-09-17 18:12:57.0 -0700
@@ -2150,10 +2150,110 @@
 		}
 	}
 
-	snprintf(fulluser, sizeof(fulluser), "%s%s%s",
-			 port->hba->ldapprefix ? port->hba->ldapprefix : "",
-			 port->user_name,
-			 port->hba->ldapsuffix ? port->hba->ldapsuffix : "");
+	if (port->hba->ldapbasedn)
+	{
+			char filter[NAMEDATALEN + 10];	/* FIXME: maybe there's a preferred way to pick this size? */
+			LDAPMessage* search_message;
+			char* attributes[2];
+			LDAPMessage* entry;
+			char* dn;
+
+			/* bind for searching */
+			r = ldap_simple_bind_s(ldap,
+   port->hba->ldapbinddn ? port->hba->ldapbinddn : "",
+   port->hba->ldapbindpasswd ? port->hba->ldapbindpasswd : "");
+			if (r != LDAP_SUCCESS)
+			{
+	ereport(LOG,
+			(errmsg("LDAP initial bind failed for ldapbinddn \"%s\" on server \"%s\": error code %d",
+	port->hba->ldapbinddn, port->hba->ldapserver, r)));
+	return STATUS_ERROR;
+			}
+
+			/* fetch just one attribute, else /all/ attributes are returned */
+			attributes[0] = "uid";
+			attributes[1] = NULL;
+
+			snprintf(filter, sizeof(filter), "(%s=%s)",
+	 port->hba->ldapsearchattribute ? port->hba->ldapsearchattribute : "uid",
+	 port->user_name); /* FIXME: sanitize user_name? */
+			filter[sizeof(filter) - 1] = '\0';
+
+			r = ldap_search_s(ldap,
+			  port->hba->ldapbasedn,
+			  LDAP_SCOPE_SUBTREE,
+			  filter,
+			  attributes,
+			  0,
+			  &search_message);
+
+			if (r != LDAP_SUCCESS)
+			{
+	ereport(LOG,
+			(errmsg("LDAP search failed for filter \"%s\" on server \"%s\": error code %d",
+	filter, port->hba->ldapserver, r)));
+	return STATUS_ERROR;
+			}
+
+			if (ldap_count_entries(ldap, search_message) != 1)
+			{
+	if (ldap_count_entries(ldap, search_message) == 0)
+			ereport(LOG,
+	(errmsg("LDAP search failed for filter \"%s\" on server \"%s\": no such user",
+			filter, port->hba->ldapserver)));
+	else
+			ereport(LOG,
+	(errmsg("LDAP search failed for filter \"%s\" on server \"%s\": user is not unique (%d matches)",
+			filter, port->hba->ldapserver, ldap_count_entries(ldap, search_message))));
+
+	ldap_msgfree(search_message);
+	return STATUS_ERROR;
+			}
+
+			entry = ldap_first_entry(ldap, search_message);
+			dn = ldap_get_dn(ldap, entry);
+			if (dn == NULL)
+			{
+	int error;
+	(void)ldap_get_option(ldap, LDAP_OPT_RESULT_CODE, &error);
+	ereport(LOG,
+			(errmsg("LDAP get_dn() for the first entry matching \"%s\" on server \"%s\": %s",
+	filter, po

Re: [HACKERS] generic copy options

2009-09-17 Thread Andrew Dunstan



Robert Haas wrote:

On Thu, Sep 17, 2009 at 8:31 PM, Dan Colish  wrote:
  

Ok, so I ran something like you suggested and did a simple copy from an
empty file just to test the parsing. I had the COPY statement run 3733
times in the transaction block and did the timestamp selects, but I
still saw only a few milliseconds of difference between the two versions.
Maybe a more complex copy statement would be a better test of the parser,
but I do not see a significant difference in parsing speed here.



I find that entirely unsurprising.


  


Me too.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] RfD: more powerful "any" types

2009-09-17 Thread daveg
On Tue, Sep 15, 2009 at 07:38:18AM +0200, Pavel Stehule wrote:
> it isn't fair :) why do you use $$ without single quotes? This
> case is still vulnerable to SQL injection. Maybe you or I know
> what SQL injection means, but beginners know nothing, and these people
> use the following bad code:
> 
> sql := $$SELECT * FROM '${table_name}'$$} and are happy. But this code
> is wrong!

I have an idea you will like less: have multiple interpolation codes that
automagically do the right quoting, perhaps as extra printf-like type codes.
The above then becomes:

  sql := pgprintf($$SELECT * FROM %I;$$, table_name)

where %I evaluates as if it were quote_ident(%s).

This would maybe even encourage users to do the quoting they should by
making it easy.
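As a sketch of the idea (pgprintf and %I are invented names from this proposal, not an existing API), the formatter might behave like this, with %I doing what quote_ident() does: wrapping the identifier in double quotes and doubling any embedded quotes.

```python
# Sketch of the proposed pgprintf() with an identifier-quoting %I code.
# pgprintf and %I are hypothetical names from the mailing list proposal.

def quote_ident(name: str) -> str:
    """Mimic SQL quote_ident(): double-quote, doubling embedded quotes."""
    return '"' + name.replace('"', '""') + '"'

def pgprintf(fmt: str, *args):
    out, it = [], iter(args)
    i = 0
    while i < len(fmt):
        if fmt[i] == '%' and i + 1 < len(fmt):
            code = fmt[i + 1]
            if code == 'I':                      # identifier: auto-quoted
                out.append(quote_ident(str(next(it))))
                i += 2
                continue
            if code == 's':                      # plain string, no quoting
                out.append(str(next(it)))
                i += 2
                continue
        out.append(fmt[i])
        i += 1
    return ''.join(out)

# An injection attempt in table_name comes out as a harmless quoted name:
sql = pgprintf("SELECT * FROM %I;", 'my table"; DROP TABLE x; --')
assert sql == 'SELECT * FROM "my table""; DROP TABLE x; --";'
```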

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] generic copy options

2009-09-17 Thread Robert Haas
On Thu, Sep 17, 2009 at 8:31 PM, Dan Colish  wrote:
> Ok, so I ran something like you suggested and did a simple copy from an
> empty file just to test the parsing. I had the COPY statement run 3733
> times in the transaction block and did the timestamp selects, but I
> still saw only a few milliseconds of difference between the two versions.
> Maybe a more complex copy statement would be a better test of the parser,
> but I do not see a significant difference in parsing speed here.

I find that entirely unsurprising.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] generic copy options

2009-09-17 Thread Dan Colish
On Thu, Sep 17, 2009 at 07:45:45PM -0400, Andrew Dunstan wrote:
>
>
> Dan Colish wrote:
>>  CREATE TABLE
>>  INSERT 0 10
>>  Timing is on.
>>  COPY 10
>>  Time: 83.273 ms
>>  BEGIN
>>  Time: 0.412 ms
>>  TRUNCATE TABLE
>>  Time: 0.357 ms
>>  COPY 10
>>  Time: 140.911 ms
>>  COMMIT
>>  Time: 4.909 ms
>>
>>
>>   
>
> Anything that doesn't have times that are orders of magnitude greater  
> than this is pretty much useless as a measurement of COPY performance,  
> IMNSHO.
>
> In this particular test, to check parsing times, I'd be inclined to  
> do copy repeatedly (i.e. probably quite a few thousand times) from an  
> empty file to test the speed. Something like:
>
>select current_timestamp;
>begin;
>truncate;
>copy;copy;copy; ...
>commit;
>select current_timestamp;
>
>
> (tests like this are really a good case for DO ' something'; - we could  
> put a loop in the DO.)
>
> cheers
>
> andrew
>

Ok, so I ran something like you suggested and did a simple copy from an
empty file just to test the parsing. I had the COPY statement run 3733
times in the transaction block and did the timestamp selects, but I
still saw only a few milliseconds of difference between the two versions.
Maybe a more complex copy statement would be a better test of the parser,
but I do not see a significant difference in parsing speed here.

--
--Dan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] generic copy options

2009-09-17 Thread Andrew Dunstan



Dan Colish wrote:

CREATE TABLE
INSERT 0 10
Timing is on.
COPY 10
Time: 83.273 ms
BEGIN
Time: 0.412 ms
TRUNCATE TABLE
Time: 0.357 ms
COPY 10
Time: 140.911 ms
COMMIT
Time: 4.909 ms


  


Anything that doesn't have times that are orders of magnitude greater 
than this is pretty much useless as a measurement of COPY performance, 
IMNSHO.


In this particular test, to check parsing times, I'd be inclined to 
do copy repeatedly (i.e. probably quite a few thousand times) from an 
empty file to test the speed. Something like:


   select current_timestamp;
   begin;
   truncate;
   copy;copy;copy; ...
   commit;
   select current_timestamp;


(tests like this are really a good case for DO ' something'; - we could 
put a loop in the DO.)
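One way to build such a test script mechanically (the file paths and the repetition count here are arbitrary; pipe the generated file into psql against a database that already has table t):

```shell
# Generate the parse-overhead test described above: one transaction that
# runs COPY from an empty file a few thousand times, bracketed by
# timestamp selects.  Run the result with: psql -f /tmp/copy_parse_test.sql
REPS=3733
EMPTY=/tmp/empty.dat
OUT=/tmp/copy_parse_test.sql
: > "$EMPTY"                       # empty input file: only parsing cost remains
{
  echo "SELECT current_timestamp;"
  echo "BEGIN;"
  echo "TRUNCATE t;"
  for i in $(seq "$REPS"); do
    echo "COPY t FROM '$EMPTY';"
  done
  echo "COMMIT;"
  echo "SELECT current_timestamp;"
} > "$OUT"
wc -l < "$OUT"                     # REPS + 5 statements
```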


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] generic copy options

2009-09-17 Thread Emmanuel Cecchet

On that particular patch, as Robert mentioned, only the parsing has changed.
In the case of \copy, the parsing is much lighter than before in psql 
(it remains the same in the server). The bigger the COPY operation, the less 
you will see the impact of the parsing, since it is done only once for 
the entire operation.


Emmanuel

Dan Colish wrote:

On Thu, Sep 17, 2009 at 07:10:35PM -0400, Andrew Dunstan wrote:
  

Greg Smith wrote:


On Thu, 17 Sep 2009, Dan Colish wrote:

  
- Performance appears to be the same although I don't have a good 
way for

  testing this at the moment


Here's what I do to generate simple COPY performance test cases:

CREATE TABLE t (i integer);
INSERT INTO t SELECT x FROM generate_series(1,10) AS x;
\timing
COPY t TO '/some/file' WITH [options];
BEGIN;
TRUNCATE TABLE t;
COPY t FROM '/some/file' WITH [options];
COMMIT;

You can adjust the size of the generated table based on whether you  
want to minimize (small number) or maximize (big number) the impact of  
the setup overhead relative to actual processing time.  Big numbers  
make sense if there's a per-row change, small ones if it's mainly COPY  
setup that's been changed if you want a small bit of data to test  
against.


An example with one column in it is a good test case for seeing  
whether per-row impact has gone up.  You'd want something with a wider  
row for other types of performance tests.


The reason for the BEGIN/COMMIT there is that form utilizes an  
optimization that lowers WAL volume when doing the COPY insertion,  
which makes it more likely you'll be testing performance of the right  
thing.



  
I usually prefer to test with a table that is more varied than anything  
you can make with generate_series. When I tested my ragged copy patch  
the other day I copied 1,000,000 rows out of a large table with a  
mixture of dates, strings, numbers and nulls.


But then, it has a (tiny) per field overhead so I wanted to make sure  
that was well represented in the test.


You are certainly right about wrapping it in begin/truncate/commit (and  
when you do make sure that archiving is not on).


You probably want to make sure that the file is not on the same disk as  
the database, to avoid disk contention. Or, better, make sure that it is  
in OS file system cache, or on a RAM disk.


cheers

andrew



If someone with a more significant setup can run tests that would ideal.
I only have my laptop which is a single disk and fairly underpowered.

That said, here are my results from running the script above; it looks like
the patch improves performance. I would be really interested to see results
on a larger data set and heavier iron.

--
--Dan

Without Patch:

CREATE TABLE
INSERT 0 10
Timing is on.
COPY 10
Time: 83.273 ms
BEGIN
Time: 0.412 ms
TRUNCATE TABLE
Time: 0.357 ms
COPY 10
Time: 140.911 ms
COMMIT
Time: 4.909 ms

With Patch:

CREATE TABLE
INSERT 0 10
Timing is on.
COPY 10
Time: 80.205 ms
BEGIN
Time: 0.351 ms
TRUNCATE TABLE
Time: 0.346 ms
COPY 10
Time: 124.303 ms
COMMIT
Time: 4.130 ms



  



--
Emmanuel Cecchet
Aster Data Systems
Web: http://www.asterdata.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] generic copy options

2009-09-17 Thread Dan Colish
On Thu, Sep 17, 2009 at 07:10:35PM -0400, Andrew Dunstan wrote:
>
>
> Greg Smith wrote:
>> On Thu, 17 Sep 2009, Dan Colish wrote:
>>
>>> - Performance appears to be the same although I don't have a good 
>>> way for
>>>   testing this at the moment
>>
>> Here's what I do to generate simple COPY performance test cases:
>>
>> CREATE TABLE t (i integer);
>> INSERT INTO t SELECT x FROM generate_series(1,10) AS x;
>> \timing
>> COPY t TO '/some/file' WITH [options];
>> BEGIN;
>> TRUNCATE TABLE t;
>> COPY t FROM '/some/file' WITH [options];
>> COMMIT;
>>
>> You can adjust the size of the generated table based on whether you  
>> want to minimize (small number) or maximize (big number) the impact of  
>> the setup overhead relative to actual processing time.  Big numbers  
>> make sense if there's a per-row change, small ones if it's mainly COPY  
>> setup that's been changed if you want a small bit of data to test  
>> against.
>>
>> An example with one column in it is a good test case for seeing  
>> whether per-row impact has gone up.  You'd want something with a wider  
>> row for other types of performance tests.
>>
>> The reason for the BEGIN/COMMIT there is that form utilizes an  
>> optimization that lowers WAL volume when doing the COPY insertion,  
>> which makes it more likely you'll be testing performance of the right  
>> thing.
>>
>>
>
> I usually prefer to test with a table that is more varied than anything  
> you can make with generate_series. When I tested my ragged copy patch  
> the other day I copied 1,000,000 rows out of a large table with a  
> mixture of dates, strings, numbers and nulls.
>
> But then, it has a (tiny) per field overhead so I wanted to make sure  
> that was well represented in the test.
>
> You are certainly right about wrapping it in begin/truncate/commit (and  
> when you do make sure that archiving is not on).
>
> You probably want to make sure that the file is not on the same disk as  
> the database, to avoid disk contention. Or, better, make sure that it is  
> in OS file system cache, or on a RAM disk.
>
> cheers
>
> andrew

If someone with a more significant setup can run tests that would ideal.
I only have my laptop which is a single disk and fairly underpowered.

That said, here are my results from running the script above; it looks like
the patch improves performance. I would be really interested to see results
on a larger data set and heavier iron.

--
--Dan

Without Patch:

CREATE TABLE
INSERT 0 10
Timing is on.
COPY 10
Time: 83.273 ms
BEGIN
Time: 0.412 ms
TRUNCATE TABLE
Time: 0.357 ms
COPY 10
Time: 140.911 ms
COMMIT
Time: 4.909 ms

With Patch:

CREATE TABLE
INSERT 0 10
Timing is on.
COPY 10
Time: 80.205 ms
BEGIN
Time: 0.351 ms
TRUNCATE TABLE
Time: 0.346 ms
COPY 10
Time: 124.303 ms
COMMIT
Time: 4.130 ms




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] generic copy options

2009-09-17 Thread Andrew Dunstan



Greg Smith wrote:

On Thu, 17 Sep 2009, Dan Colish wrote:

- Performance appears to be the same although I don't have a good 
way for

  testing this at the moment


Here's what I do to generate simple COPY performance test cases:

CREATE TABLE t (i integer);
INSERT INTO t SELECT x FROM generate_series(1,10) AS x;
\timing
COPY t TO '/some/file' WITH [options];
BEGIN;
TRUNCATE TABLE t;
COPY t FROM '/some/file' WITH [options];
COMMIT;

You can adjust the size of the generated table based on whether you 
want to minimize (small number) or maximize (big number) the impact of 
the setup overhead relative to actual processing time.  Big numbers 
make sense if there's a per-row change, small ones if it's mainly COPY 
setup that's been changed if you want a small bit of data to test 
against.


An example with one column in it is a good test case for seeing 
whether per-row impact has gone up.  You'd want something with a wider 
row for other types of performance tests.


The reason for the BEGIN/COMMIT there is that form utilizes an 
optimization that lowers WAL volume when doing the COPY insertion, 
which makes it more likely you'll be testing performance of the right 
thing.





I usually prefer to test with a table that is more varied than anything 
you can make with generate_series. When I tested my ragged copy patch 
the other day I copied 1,000,000 rows out of a large table with a 
mixture of dates, strings, numbers and nulls.


But then, it has a (tiny) per field overhead so I wanted to make sure 
that was well represented in the test.


You are certainly right about wrapping it in begin/truncate/commit (and 
when you do make sure that archiving is not on).


You probably want to make sure that the file is not on the same disk as 
the database, to avoid disk contention. Or, better, make sure that it is 
in OS file system cache, or on a RAM disk.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] generic copy options

2009-09-17 Thread Robert Haas
On Thu, Sep 17, 2009 at 6:54 PM, Greg Smith  wrote:
> On Thu, 17 Sep 2009, Dan Colish wrote:
>
>>        - Performance appears to be the same although I don't have a good
>> way for
>>          testing this at the moment
>
> Here's what I do to generate simple COPY performance test cases:
>
> CREATE TABLE t (i integer);
> INSERT INTO t SELECT x FROM generate_series(1,10) AS x;
> \timing
> COPY t TO '/some/file' WITH [options];
> BEGIN;
> TRUNCATE TABLE t;
> COPY t FROM '/some/file' WITH [options];
> COMMIT;
>
> You can adjust the size of the generated table based on whether you want to
> minimize (small number) or maximize (big number) the impact of the setup
> overhead relative to actual processing time.  Big numbers make sense if
> there's a per-row change, small ones if it's mainly COPY setup that's been
> changed if you want a small bit of data to test against.
>
> An example with one column in it is a good test case for seeing whether
> per-row impact has gone up.  You'd want something with a wider row for other
> types of performance tests.
>
> The reason for the BEGIN/COMMIT there is that form utilizes an optimization
> that lowers WAL volume when doing the COPY insertion, which makes it more
> likely you'll be testing performance of the right thing.

Unless something has changed drastically in the last day or two, this
patch only affects the option-parsing phase of COPY, so the
impact should be barely noticeable, and it should be an
up-front cost, not per-row.  It would be good to verify that, of
course.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby 0.2.1

2009-09-17 Thread Robert Haas
On Thu, Sep 17, 2009 at 2:54 AM, Heikki Linnakangas
 wrote:
> Robert Haas wrote:
>> On Wed, Sep 16, 2009 at 6:05 PM, Josh Berkus  wrote:
>>> Now that Simon has submitted this, can some of the heavy-hitters here
>>> review it?  Heikki?
>>>
>>> Nobody's name is next to it.
>>
>> I don't think anyone is planning to ignore this patch, but it wasn't
>> included in the first round of round-robin reviewing assignments
>> because it wasn't submitted until the following day, after the
>> announced deadline for submissions had already passed.  So most people
>> are probably busy with with some other patch at the moment, but that's
>> a temporary phenomenon.
>
> Right, I've added myself as reviewer now.
>
>>  This is a pretty small CommitFest, so there
>> shouldn't be any shortage of reviewers, though Heikki's time may be
>> stretched a little thin, since Streaming Replication is also in the
>> queue, and he is working on index-only scans.  That's really for him
>> to comment on, though.
>
> I'm going to put the index-only scans aside for now to focus on hot
> standby and streaming replication. Both are big patches, so there's
> plenty of work in those two alone, and not only for me.

What is the best way to attack this?  Should I keep reviewing
index-only scans so that you have feedback for when you get back to
it, or should I move on to something else?  If something else, does it
make more sense for me to look at HS since I did a bit of work with a
previous version, or would it be better for me to just pick one of the
other patches from the CommitFest and work on that?

Also, stepping back from me personally, should we try to assign some
additional reviewers to these patches?  Is there some way we can
divide up review tasks among multiple people so that we're not
repeating each other's work?

Thoughts appreciated, from Heikki, Simon, or others.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] generic copy options

2009-09-17 Thread Greg Smith

On Thu, 17 Sep 2009, Dan Colish wrote:


- Performance appears to be the same, although I don't have a good way of
  testing this at the moment


Here's what I do to generate simple COPY performance test cases:

CREATE TABLE t (i integer);
INSERT INTO t SELECT x FROM generate_series(1,10) AS x;
\timing
COPY t TO '/some/file' WITH [options];
BEGIN;
TRUNCATE TABLE t;
COPY t FROM '/some/file' WITH [options];
COMMIT;

You can adjust the size of the generated table based on whether you want 
to minimize (small number) or maximize (big number) the impact of the 
setup overhead relative to actual processing time.  Big numbers make sense 
if there's a per-row change; small ones make sense if it's mainly the COPY 
setup that has changed and you want a small bit of data to test against.


A table with a single column is a good test case for seeing whether 
per-row overhead has gone up.  You'd want something with a wider row for 
other types of performance tests.


The reason for the BEGIN/COMMIT there is that this form utilizes an 
optimization that lowers WAL volume during the COPY insertion, which 
makes it more likely you'll be testing the performance of the right thing.


--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD



Re: [HACKERS] [PATCH] pgbench: new feature allowing to launch shell commands

2009-09-17 Thread Dan Colish
On Thu, Sep 17, 2009 at 09:56:44AM +0900, Michael Paquier wrote:
> Hi all,
> 
> Sorry for my late reply.
> There is no other update for this patch since the 13th of August, at least
> until today. The new patch is attached
> By the way I worked on the comments that Dan and Gabriel pointed out.
> I added a check on system() so as to prevent an error from that side.
> By the way, I noticed that the way return values are managed in doCustom and
> in process_commands is different. To keep the code homogeneous,
> return values are managed the same way in both functions in the patch,
> which explains why I did not return a specific value when file commands are
> treated in doCustom.

You really should be returning a value at that point, since the function
signature defines a return type. Otherwise the function should be void,
which it cannot be in this context because it is used in boolean tests
elsewhere. The returns in question are all part of error blocks and
should return false.

> 
> Here is also as wanted a simple transaction that permits to use this
> function:
> \set nbranches :scale
> \set naccounts 10 * :scale
> \setrandom aid 1 :naccounts
> \setrandom bid 1 :nbranches
> \setrandom delta -5000 5000
> \setrandom txidrand 0 1
> START TRANSACTION;
> UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
> SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
> UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
> PREPARE TRANSACTION ':txidrand';
> \shell ls -ll $PGDATA/pg_twophase
> COMMIT PREPARED ':txidrand';

I must be doing something wrong when I am running this script. It is
still throwing errors. Would you mind showing me the pgbench command you
used to run it?

--
--Dan



Re: [HACKERS] gcc versus division-by-zero traps

2009-09-17 Thread Bruce Momjian
Tom Lane wrote:
> I hope that the bug will get fixed in due course, but even if they
> respond pretty quickly it will be years before the problem disappears
> from every copy of gcc in the field.  So I'm thinking that it would
> behoove us to install a workaround, now that we've characterized the
> problem sufficiently.  What I am thinking is that in the three
> functions known to exhibit the bug (int24div, int28div, int48div)
> we should do something like this:
> 
> 
>   if (arg2 == 0)
> + {
>   ereport(ERROR,
>   (errcode(ERRCODE_DIVISION_BY_ZERO),
>errmsg("division by zero")));
> + /* ensure compiler realizes we don't reach the division */
> + PG_RETURN_NULL();
> + }

Could we document this a little better, perhaps refer to a file
somewhere or a central comment on its purpose?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] generic copy options

2009-09-17 Thread Dan Colish
Hi,

I have read through the patch a few times and it looks OK. The
additions to the COPY syntax work as expected and as agreed upon
based on the thread. Below are some points from my checklist.

- Patch applies cleanly
- Included new tests and documentation
- Well commented
- Documentation is clearly written
- Produced no error or warning on compile
- When compiled passes all tests
- Syntax works as expected
- Performance appears to be the same although I don't have a good way 
for
  testing this at the moment
- Patch integrates well with current backend copy functions
- Patch cleanly extends the psql \copy feature

Any further thoughts on this patch? I think it's pretty much ready.

--
--Dan



[HACKERS] Re: [COMMITTERS] pgsql: CVS NULL Documentation Clearify documentation of CVS's output of

2009-09-17 Thread Bruce Momjian
Tom Lane wrote:
> momj...@postgresql.org (Bruce Momjian) writes:
> > Log Message:
> > ---
> > CVS NULL Documentation
> 
> > Clearify documentation of CVS's output of NULL values, per suggestion
> > from Magnus.
> 
> This has not "clarified" anything; what it has done is transformed
> correct statements into incorrect ones.  You forgot about the
> possibility that the copy is using a nonempty string as the "null
> string".  Kindly revert.

It has been adjusted per Andrew's suggestion.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Feedback on getting rid of VACUUM FULL

2009-09-17 Thread Dimitri Fontaine
Tom Lane  writes:
> Completely.  This is a user-visible behavior that we have encouraged
> people to rely on, and for which there is no easy substitute.

Excited to have self-healing tables (against bloat), I parse this as an
opening. Previously on this thread you say:

> (Actually, the ctid is only being used for fast access here; the xmin
> is what is really needed to detect that someone else updated the row.
> But the proposed tuple-mover would break the xmin check too.)

So to have the impossible feature, we need a way not to break existing
code relying on ctid and xmin. How much of a stretch would you consider
the idea of taking a (maybe new) table lock as soon as a SELECT output
contains system columns, with this lock preventing the magic utility
from operating?

Regards,
-- 
dim



Re: [HACKERS] Fwd: Copy out wording

2009-09-17 Thread Bruce Momjian
Andrew Dunstan wrote:
> 
> 
> Bruce Momjian wrote:
> > Magnus Hagander wrote:
> >   
> >> On Thu, Sep 3, 2009 at 13:31, Andrew Dunstan wrote:
> >> 
> >>> Magnus Hagander wrote:
> >>>   
>  Oh, hang on, "the NULL string" refers to the copy parameter? Not a
>  part of the data? I read it as "a string being NULL". Maybe something
>  like "the value of the NULL string parameter" to be overly clear for
>  people like me? :-)
> 
>  
> >>> We could change:
> >>>
> >>>   A NULL is output as the NULL string and is not quoted, while a data 
> >>> value
> >>>   matching the NULL string is quoted.
> >>>
> >>>
> >>> to
> >>>
> >>>   A NULL is output as the NULL parameter and is not quoted, while a 
> >>> non-NULL
> >>> data value whose text representation
> >>>   matches the NULL parameter is quoted.
> >>>
> >>>
> >>> or something similar. Would that be better?
> >>>   
> >> Yes, much better IMO.
> >> 
> >
> > I have applied the attached documentation clarification patch, and
> > backpatched it to 8.4.X.
> >
> >   
> >   
> 
> Why didn't you follow the wording I actually suggested, which had the 
> virtue of being accurate?

I thought the problem was the use of the word "null string", which
clearly was confusing.  

I have updated the docs to use your wording, with a little
clarification.  The diff against yesterday's CVS is attached, which is
not smaller.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: copy.sgml
===
RCS file: /cvsroot/pgsql/doc/src/sgml/ref/copy.sgml,v
retrieving revision 1.87
diff -c -r1.87 copy.sgml
*** copy.sgml	5 Sep 2009 23:58:01 -	1.87
--- copy.sgml	17 Sep 2009 21:47:11 -
***
*** 1,5 
  
  
--- 1,5 
  
  
***
*** 552,561 
  NULL value from an empty string.
  PostgreSQL's COPY handles this by
  quoting. A NULL is output as the NULL
! string and is not quoted, while a data value matching the
! NULL string is quoted. Therefore, using the default
  settings, a NULL is written as an unquoted empty
! string, while an empty string is written with double quotes
  (""). Reading values follows similar rules. You can
  use FORCE NOT NULL to prevent NULL input
  comparisons for specific columns.
--- 552,561 
  NULL value from an empty string.
  PostgreSQL's COPY handles this by
  quoting. A NULL is output as the NULL
! parameter and is not quoted, while a non-NULL value matching the
! NULL parameter string is quoted. Therefore, using the default
  settings, a NULL is written as an unquoted empty
! string, while an empty string data value is written with double quotes
  (""). Reading values follows similar rules. You can
  use FORCE NOT NULL to prevent NULL input
  comparisons for specific columns.



Re: [HACKERS] Feedback on getting rid of VACUUM FULL

2009-09-17 Thread Tom Lane
Dimitri Fontaine  writes:
> I don't see any reason why not breaking the user visible behavior of
> tuples CTID between any two major releases,

> Am I completely wet here?

Completely.  This is a user-visible behavior that we have encouraged
people to rely on, and for which there is no easy substitute.

regards, tom lane



Re: [HACKERS] Feedback on getting rid of VACUUM FULL

2009-09-17 Thread Tom Lane
Hannu Krosing  writes:
> But maybe something can be taken from this discussion the other way
> round - maybe we should not be afraid of doing null updates during
> in-place update

The problem for in-place update is that it can't assume that any of the
normal infrastructure (like index insertion or WAL logging) is up.

regards, tom lane



Re: [HACKERS] Logging configuration changes [REVIEW]

2009-09-17 Thread Peter Eisentraut
On ons, 2009-09-16 at 10:54 +0530, Abhijit Menon-Sen wrote:
> I can't help but think that it would be nice to report the default value
> of a parameter that is reset (i.e. "parameter $x reset to default value
> $y"). The first attached patch does this by calling GetConfigOption()
> after the value has been reset by set_config_option(). It also skips
> the report if the earlier value was the same as the default.

I have applied the rest, but the problem with this change is that it
logs the values without units.  I'm not sure that we want to expose
that.





Re: [HACKERS] Fwd: Copy out wording

2009-09-17 Thread Andrew Dunstan



Bruce Momjian wrote:

Magnus Hagander wrote:
  

On Thu, Sep 3, 2009 at 13:31, Andrew Dunstan wrote:


Magnus Hagander wrote:
  

Oh, hang on, "the NULL string" refers to the copy parameter? Not a
part of the data? I read it as "a string being NULL". Maybe something
like "the value of the NULL string parameter" to be overly clear for
people like me? :-)



We could change:

  A NULL is output as the NULL string and is not quoted, while a data value
  matching the NULL string is quoted.


to

  A NULL is output as the NULL parameter and is not quoted, while a non-NULL
data value whose text representation
  matches the NULL parameter is quoted.


or something similar. Would that be better?
  

Yes, much better IMO.



I have applied the attached documentation clarification patch, and
backpatched it to 8.4.X.

  
  


Why didn't you follow the wording I actually suggested, which had the 
virtue of being accurate?


cheers

andrew



Re: [HACKERS] [COMMITTERS] pgsql: CVS NULL Documentation Clearify documentation of CVS's output of

2009-09-17 Thread Robert Haas
On Thu, Sep 17, 2009 at 5:22 PM, Tom Lane  wrote:
> momj...@postgresql.org (Bruce Momjian) writes:
>> Log Message:
>> ---
>> CVS NULL Documentation
>
>> Clearify documentation of CVS's output of NULL values, per suggestion
>> from Magnus.
>
> This has not "clarified" anything; what it has done is transformed
> correct statements into incorrect ones.  You forgot about the
> possibility that the copy is using a nonempty string as the "null
> string".  Kindly revert.

Also, the format is called CSV, not CVS.

...Robert



Re: [HACKERS] [patch] pg_ctl init extension

2009-09-17 Thread Bernd Helmle



--On 17. September 2009 23:00:12 +0300 Peter Eisentraut  
wrote:



If the name is a problem, why not change the name?  What you are
proposing above is effectively a very elaborate name change, so why not
go for a simpler one?


I don't like the solution of using -o "" to push down 
command-line options to initdb. I have always been of the opinion that this 
was (and is) a valid workaround for postgres itself, but it doesn't feel 
correct to expose that further to initdb and its purposes.


My 2 cents...

--
Thanks

Bernd



Re: [HACKERS] [COMMITTERS] pgsql: CVS NULL Documentation Clearify documentation of CVS's output of

2009-09-17 Thread Tom Lane
momj...@postgresql.org (Bruce Momjian) writes:
> Log Message:
> ---
> CVS NULL Documentation

> Clearify documentation of CVS's output of NULL values, per suggestion
> from Magnus.

This has not "clarified" anything; what it has done is transformed
correct statements into incorrect ones.  You forgot about the
possibility that the copy is using a nonempty string as the "null
string".  Kindly revert.

regards, tom lane



Re: [HACKERS] updated join removal patch

2009-09-17 Thread Robert Haas
On Thu, Sep 17, 2009 at 4:59 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Tue, Sep 15, 2009 at 9:29 PM, Tom Lane  wrote:
>>> Robert Haas  writes:
 Here we go again.  Following Tom's advice to not insert crocks in
 add_path(), I've instead introduced a noopjoin path type which ignores
 its inner side.
>
> I've committed this with some revisions.  As to the points discussed
> earlier:

Thanks, it looks nice. I wonder if we shouldn't add some documentation
or regression tests, though - any thoughts?

>>> * I'm not really happy with the "NoopJoinPath" representation. ...
>>> A possible compromise is to use T_SubqueryScan as the pathtype, with
>>> the idea that we're pretending a SubqueryScan is to be inserted and then
>>> immediately optimized away.  But I don't want the inner path in there
>>> in any case.
>
>> I don't have strong feelings about it.  I thought about making just a
>> Noop path type, but I couldn't see any clear reason to prefer it one
>> way or the other.
>
> I ended up using a NoOpPath struct with pathtype = T_Join, which is a
> dummy superclass nodetag that never really gets instantiated.  The
> SubqueryScan idea looked too messy after I remembered that createplan.c
> likes to switch() on the pathtype, so having two different pathtype
> uses of T_SubqueryScan seemed pretty ugly.  But T_Join isn't really used
> anywhere, despite being nominally an executable plan type.  It's pretty
> much a judgment call whether that's really cleaner than using the path's
> own NodeTag...

Yeah.  I haven't been able to wrap my head around why it's good to
have two tags on some of these things.

>>> * I'm quite unimpressed with the refactoring you did to make a single
>>> function deal with merge clauses, hash clauses, *and* join removal
>>> clauses.  I think that makes it close to unmaintainable and is buying
>>> little if anything speedwise.
>
>> You're the committer; I'm not.  But I completely disagree.  There
>> isn't any reason at all to duplicate this logic in two separate
>> places, let alone three.  I'd actually be in favor of merging the
>> existing two cases even if we weren't adding join removal.
>
> No, I still think this was a bad idea.  There are *parts* of those
> tests that are similar, but combining them all into one function is
> just a recipe for bugs.

Having read your commit, it makes more sense to me.  The fact that
we're now looking at innerrel->baserestrictinfo also is a pretty
powerful argument for your way.

>> As for a GUC, I think it would be useful to have for debugging, or in
>> case of bugs.  It's really another join strategy, and we have enable_*
>> flags for all the others.  But if you don't want it for some reason,
>> I'm not in a position to twist your arm.
>
> I left it out for the moment.  If anyone can point to a case where the
> join removal logic slows planning noticeably without gaining anything,
> we can reconsider.

Well, the purpose of such flags is for debugging, not production.  The
only argument I can see for not having one for join removal versus
nestloop, mergejoin, etc. is that it's pretty easy to kill removal of
a particular join by tweaking the output list, whereas it is not
comparably easy to get the planner to pick a different join type
otherwise.

>>> * Not entirely sure where to put the code that does the hard work
>>> (relation_is_distinct_for).
>
>> Yeah, it seemed a little weird to me.  For a while I was reusing some
>> of the support functions that query_is_distinct_for() calls, but the
>> final version doesn't.  I wonder if it should just be moved to
>> joinpath.c; it seems to fit in better with what is going on there.
>
> I ended up putting the index-specific logic in indxpath.c, which aside
> from seeming reasonable made it easier to deal with expression indexes.
> If we ever add another test method that doesn't depend on indexes, we
> can put it in a reasonable place and have joinpath.c call that too.

Sounds fine to me.  I think that the next project in this area should
be to try to support removal of INNER joins.  (Removal of SEMI, ANTI,
and FULL joins seems unlikely ever to be interesting.)  That will
require teaching the planner about foreign key constraints, which
opens up some other interesting optimization
opportunities.  An INNER join that can never match zero rows can
alternatively be implemented as a LEFT join, which can be reordered in
some situations where an inner join can't.  e.g. A LJ (B IJ C ON Pbc)
ON Pab can be implemented as (A LJ B ON Pab) LJ/IJ C ON Pbc if it is
the case that for every row in B, there will be at least one row in C
such that Pbc holds.

Thanks again for fixing this up, and committing it.  I have a TON of
queries that will benefit from this, and it's one of the reasons why I
started reading this mailing list on a regular basis and getting
involved in development.  For me, it's worth an upgrade by itself.

...Robert


Re: [HACKERS] Fwd: Copy out wording

2009-09-17 Thread Bruce Momjian
Magnus Hagander wrote:
> On Thu, Sep 3, 2009 at 13:31, Andrew Dunstan wrote:
> >
> >
> > Magnus Hagander wrote:
> >>
> >> Oh, hang on, "the NULL string" refers to the copy parameter? Not a
> >> part of the data? I read it as "a string being NULL". Maybe something
> >> like "the value of the NULL string parameter" to be overly clear for
> >> people like me? :-)
> >>
> >
> > We could change:
> >
> > ? A NULL is output as the NULL string and is not quoted, while a data value
> > ? matching the NULL string is quoted.
> >
> >
> > to
> >
> > ? A NULL is output as the NULL parameter and is not quoted, while a non-NULL
> > data value whose text representation
> > ? matches the NULL parameter is quoted.
> >
> >
> > or something similar. Would that be better?
> 
> Yes, much better IMO.

I have applied the attached documentation clarification patch, and
backpatched it to 8.4.X.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: doc/src/sgml/ref/copy.sgml
===
RCS file: /cvsroot/pgsql/doc/src/sgml/ref/copy.sgml,v
retrieving revision 1.87
diff -c -c -r1.87 copy.sgml
*** doc/src/sgml/ref/copy.sgml	5 Sep 2009 23:58:01 -	1.87
--- doc/src/sgml/ref/copy.sgml	17 Sep 2009 21:06:59 -
***
*** 550,562 
 
  The CSV format has no standard way to distinguish a
  NULL value from an empty string.
! PostgreSQL's COPY handles this by
! quoting. A NULL is output as the NULL
! string and is not quoted, while a data value matching the
! NULL string is quoted. Therefore, using the default
! settings, a NULL is written as an unquoted empty
! string, while an empty string is written with double quotes
! (""). Reading values follows similar rules. You can
  use FORCE NOT NULL to prevent NULL input
  comparisons for specific columns.
 
--- 550,559 
 
  The CSV format has no standard way to distinguish a
  NULL value from an empty string.
! PostgreSQL's COPY handles this using
! quoting. A NULL is output as an empty string without
! quotes, while an empty string data value is double-quoted
! ("").  Reading values follows similar rules. You can
  use FORCE NOT NULL to prevent NULL input
  comparisons for specific columns.
 



Re: [HACKERS] generic copy options

2009-09-17 Thread Emmanuel Cecchet

Here you go.

Emmanuel

Dan Colish wrote:

On Thu, Sep 17, 2009 at 02:56:07PM -0400, Emmanuel Cecchet wrote:
  

Dan,

My bad, I copy/pasted the hard link in output/copy.source instead of  
@abs_build...@.

Here is a complete version of the patch with the fix on output/copy.source.

Emmanuel



Emmanuel,

Thanks for getting that back so quickly. As I said before, it
applies cleanly and passes regression tests. I'm reading through the
changes now. When you get a moment could you send me the patch as a
context diff?


--
--Dan
  



--
Emmanuel Cecchet
Aster Data Systems
Web: http://www.asterdata.com

Index: src/test/regress/sql/copy2.sql
===
RCS file: /home/manu/cvsrepo/pgsql/src/test/regress/sql/copy2.sql,v
retrieving revision 1.18
diff -c -r1.18 copy2.sql
*** src/test/regress/sql/copy2.sql  25 Jul 2009 00:07:14 -  1.18
--- src/test/regress/sql/copy2.sql  17 Sep 2009 20:59:50 -
***
*** 73,89 
  \.
  
  -- various COPY options: delimiters, oids, NULL string
! COPY x (b, c, d, e) from stdin with oids delimiter ',' null 'x';
  50,x,45,80,90
  51,x,\x,\\x,\\\x
  52,x,\,,\\\,,\\
  \.
  
! COPY x from stdin WITH DELIMITER AS ';' NULL AS '';
  3000;;c;;
  \.
  
! COPY x from stdin WITH DELIMITER AS ':' NULL AS E'\\X';
  4000:\X:C:\X:\X
  4001:1:empty::
  4002:2:null:\X:\X
--- 73,89 
  \.
  
  -- various COPY options: delimiters, oids, NULL string
! COPY x (b, c, d, e) from stdin (oids, delimiter ',', null 'x');
  50,x,45,80,90
  51,x,\x,\\x,\\\x
  52,x,\,,\\\,,\\
  \.
  
! COPY x from stdin (DELIMITER ';', NULL '');
  3000;;c;;
  \.
  
! COPY x from stdin (DELIMITER ':', NULL E'\\X');
  4000:\X:C:\X:\X
  4001:1:empty::
  4002:2:null:\X:\X
***
*** 108,120 
  INSERT INTO no_oids (a, b) VALUES (20, 30);
  
  -- should fail
! COPY no_oids FROM stdin WITH OIDS;
! COPY no_oids TO stdout WITH OIDS;
  
  -- check copy out
  COPY x TO stdout;
  COPY x (c, e) TO stdout;
! COPY x (b, e) TO stdout WITH NULL 'I''m null';
  
  CREATE TEMP TABLE y (
col1 text,
--- 108,120 
  INSERT INTO no_oids (a, b) VALUES (20, 30);
  
  -- should fail
! COPY no_oids FROM stdin (OIDS);
! COPY no_oids TO stdout (OIDS);
  
  -- check copy out
  COPY x TO stdout;
  COPY x (c, e) TO stdout;
! COPY x (b, e) TO stdout (NULL 'I''m null');
  
  CREATE TEMP TABLE y (
col1 text,
***
*** 130,140 
  COPY y TO stdout WITH CSV FORCE QUOTE col2 ESCAPE E'\\';
  COPY y TO stdout WITH CSV FORCE QUOTE *;
  
  --test that we read consecutive LFs properly
  
  CREATE TEMP TABLE testnl (a int, b text, c int);
  
! COPY testnl FROM stdin CSV;
  1,"a field with two LFs
  
  inside",2
--- 130,152 
  COPY y TO stdout WITH CSV FORCE QUOTE col2 ESCAPE E'\\';
  COPY y TO stdout WITH CSV FORCE QUOTE *;
  
+ -- Test new 8.5 syntax
+ 
+ COPY y TO stdout (CSV);
+ COPY y TO stdout (CSV, CSV_QUOTE , DELIMITER '|');
+ COPY y TO stdout (CSV, CSV_FORCE_QUOTE (col2), CSV_ESCAPE E'\\');
+ COPY y TO stdout (CSV, CSV_FORCE_QUOTE *);
+ 
+ \COPY y TO stdout (CSV)
+ \COPY y TO stdout (CSV, CSV_QUOTE , DELIMITER '|')
+ \COPY y TO stdout (CSV, CSV_FORCE_QUOTE (col2), CSV_ESCAPE E'\\')
+ \COPY y TO stdout (CSV, CSV_FORCE_QUOTE *)
+ 
  --test that we read consecutive LFs properly
  
  CREATE TEMP TABLE testnl (a int, b text, c int);
  
! COPY testnl FROM stdin (CSV);
  1,"a field with two LFs
  
  inside",2
***
*** 143,156 
  -- test end of copy marker
  CREATE TEMP TABLE testeoc (a text);
  
! COPY testeoc FROM stdin CSV;
  a\.
  \.b
  c\.d
  "\."
  \.
  
! COPY testeoc TO stdout CSV;
  
  DROP TABLE x, y;
  DROP FUNCTION fn_x_before();
--- 155,168 
  -- test end of copy marker
  CREATE TEMP TABLE testeoc (a text);
  
! COPY testeoc FROM stdin (CSV);
  a\.
  \.b
  c\.d
  "\."
  \.
  
! COPY testeoc TO stdout (CSV);
  
  DROP TABLE x, y;
  DROP FUNCTION fn_x_before();
Index: src/test/regress/sql/aggregates.sql
===
RCS file: /home/manu/cvsrepo/pgsql/src/test/regress/sql/aggregates.sql,v
retrieving revision 1.15
diff -c -r1.15 aggregates.sql
*** src/test/regress/sql/aggregates.sql 25 Apr 2009 16:44:56 -  1.15
--- src/test/regress/sql/aggregates.sql 17 Sep 2009 20:59:50 -
***
*** 104,110 
BIT_OR(i4)  AS "?"
  FROM bitwise_test;
  
! COPY bitwise_test FROM STDIN NULL 'null';
  1 1   1   1   1   B0101
  3 3   3   null2   B0100
  7 7   7   3   4   B1100
--- 104,110 
BIT_OR(i4)  AS "?"
  FROM bitwise_test;
  
! COPY bitwise_test FROM STDIN (NULL 'null');
  1 1   1   1   1   B0101
  3 3   3   null2   B0100
  7 7   7   3   4   B1100
***
*** 171,177 
BOOL_OR(b3)AS "n"
  FROM bool_test;
  
! COPY bool_test FROM STDIN NULL '

Re: [HACKERS] updated join removal patch

2009-09-17 Thread Tom Lane
Robert Haas  writes:
> On Tue, Sep 15, 2009 at 9:29 PM, Tom Lane  wrote:
>> Robert Haas  writes:
>>> Here we go again.  Following Tom's advice to not insert crocks in
>>> add_path(), I've instead introduced a noopjoin path type which ignores
>>> its inner side.

I've committed this with some revisions.  As to the points discussed
earlier:

>> * I'm not really happy with the "NoopJoinPath" representation. ...
>> A possible compromise is to use T_SubqueryScan as the pathtype, with
>> the idea that we're pretending a SubqueryScan is to be inserted and then
>> immediately optimized away.  But I don't want the inner path in there
>> in any case.

> I don't have strong feelings about it.  I thought about making just a
> Noop path type, but I couldn't see any clear reason to prefer it one
> way or the other.

I ended up using a NoOpPath struct with pathtype = T_Join, which is a
dummy superclass nodetag that never really gets instantiated.  The
SubqueryScan idea looked too messy after I remembered that createplan.c
likes to switch() on the pathtype, so having two different pathtype
uses of T_SubqueryScan seemed pretty ugly.  But T_Join isn't really used
anywhere, despite being nominally an executable plan type.  It's pretty
much a judgment call whether that's really cleaner than using the path's
own NodeTag...

>> * I'm quite unimpressed with the refactoring you did to make a single
>> function deal with merge clauses, hash clauses, *and* join removal
>> clauses.  I think that makes it close to unmaintainable and is buying
>> little if anything speedwise.

> You're the committer; I'm not.  But I completely disagree.  There
> isn't any reason at all to duplicate this logic in two separate
> places, let alone three.  I'd actually be in favor of merging the
> existing two cases even if we weren't adding join removal.

No, I still think this was a bad idea.  There are *parts* of those
tests that are similar, but combining them all into one function is
just a recipe for bugs.

> As for a GUC, I think it would be useful to have for debugging, or in
> case of bugs.  It's really another join strategy, and we have enable_*
> flags for all the others.  But if you don't want it for some reason,
> I'm not in a position to twist your arm.

I left it out for the moment.  If anyone can point to a case where the
join removal logic slows planning noticeably without gaining anything,
we can reconsider.

>> * Not entirely sure where to put the code that does the hard work
>> (relation_is_distinct_for).

> Yeah, it seemed a little weird to me.  For a while I was reusing some
> of the support functions that query_is_distinct_for() calls, but the
> final version doesn't.  I wonder if it should just be moved to
> joinpath.c; it seems to fit in better with what is going on there.

I ended up putting the index-specific logic in indxpath.c, which aside
from seeming reasonable made it easier to deal with expression indexes.
If we ever add another test method that doesn't depend on indexes, we
can put it in a reasonable place and have joinpath.c call that too.

regards, tom lane



Re: [HACKERS] Linux LSB init script

2009-09-17 Thread Robert Haas
On Thu, Sep 17, 2009 at 4:18 PM, Peter Eisentraut  wrote:
> On tor, 2009-09-17 at 11:59 -0500, Kevin Grittner wrote:
>> > Well, in such cases it may be useful to add an option such as
>> > --oknodo to select the idempotent behavior.
>>
>> I found that confusing (as did Robert); how about --lsm-conforming?
>
> s/lsm/lsb/
>
> I'm not so sure that I would label it as LSB, because that is too broad,
> and not very descriptive.
>
> I think this option should only control whether start and stop are
> idempotent.  Other LSB issues such as exit codes ought to become the
> default, or possibly a different option if necessary.

Maybe we should just call it --idempotent.

Or, they could be additional actions, like ensure-start/ensure-stop.
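A sketch of what such idempotent actions would mean in practice (illustrative only; pg_ctl has no such logic today): under LSB semantics, asking a running service to start — or a stopped one to stop — is a successful no-op rather than an error.

```python
def ensure_action(action, is_running):
    """Decide what an idempotent ('--oknodo'-style) wrapper should do.
    Returns (command_to_run, exit_code): either delegate to the normal
    start/stop command, or do nothing and report success (exit code 0)."""
    if action == "ensure-start":
        return (None, 0) if is_running else ("start", None)
    if action == "ensure-stop":
        return ("stop", None) if is_running else (None, 0)
    raise ValueError("unknown action: %r" % action)

# Already running: nothing to do, exit 0 (the LSB-conforming behavior).
print(ensure_action("ensure-start", True))   # (None, 0)
# Not running yet: run the ordinary start command.
print(ensure_action("ensure-start", False))  # ('start', None)
```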

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] generic copy options

2009-09-17 Thread Dan Colish
On Thu, Sep 17, 2009 at 02:56:07PM -0400, Emmanuel Cecchet wrote:
> Dan,
>
> My bad, I copy/pasted the hard link in output/copy.source instead of  
> @abs_build...@.
> Here is a complete version of the patch with the fix on output/copy.source.
>
> Emmanuel

Emmanuel, 

Thanks for getting that back so quickly. As I said before, it
applies cleanly and passes regression tests. I'm reading through the
changes now. When you get a moment could you send me the patch as a 
context diff?

--
--Dan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Linux LSB init script

2009-09-17 Thread Peter Eisentraut
On tor, 2009-09-17 at 14:21 -0400, Greg Smith wrote:
> I've implemented exactly the same logic Kevin describes for a past 
> consulting customer and for multiple forms of shutdown scripts at
> Truviso. 
> I would suggest the situation here is that it's so easy to script a
> custom 
> solution that the other people who have done the same don't have any 
> motivation to get that fixed upstream, which would take a bunch of
> arguing 
> on this list to accomplish.  Easier to just work around it and move
> on.

I grant you all that, but this sort of thing is best not slipped in via
a patch tagged LSB init script.  If there is a problem to be fixed,
let's discuss it.  If it can't be fixed, let's at least document it.  Or
if you don't have the motivation, there are other places to host "custom
solutions".


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Linux LSB init script

2009-09-17 Thread Peter Eisentraut
On tor, 2009-09-17 at 11:59 -0500, Kevin Grittner wrote:
> > Well, in such cases it may be useful to add an option such as
> > --oknodo to select the idempotent behavior.
>  
> I found that confusing (as did Robert); how about --lsm-conforming?

s/lsm/lsb/

I'm not so sure that I would label it as LSB, because that is too broad,
and not very descriptive.

I think this option should only control whether start and stop are
idempotent.  Other LSB issues such as exit codes ought to become the
default, or possibly a different option if necessary.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Linux LSB init script

2009-09-17 Thread Peter Eisentraut
On tor, 2009-09-17 at 09:26 -0400, Robert Haas wrote:
> On Thu, Sep 17, 2009 at 1:30 AM, Peter Eisentraut  wrote:
> > Well, in such cases it may be useful to add an option such as --oknodo
> > to select the idempotent behavior.
> 
> It took me about 60 seconds to figure out what I thought you were
> going for there, so I submit that's not a good choice of option name.

This is the name that the start-stop-daemon program in Debian uses, but
I can see how it can be puzzling.  We don't have to use this exact
spelling.

> > Yeah, except that part of the spec is hilariously unrealistic.  And
> 
> But since when do we worry about such things?  :-)

I submit to this quip, but note that there is a difference between an
implementation of a standard and an application using that standard.  I
have done a fair amount of work on the LSB init script support in Debian
over the years, and yes, there I favor going with the standard behavior
if at all possible.  But when it comes to writing an application that is
supposed to work with an LSB or SQL platform, you have to take a more
pragmatic approach.  At least that is my approach.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] pg_ctl init extension

2009-09-17 Thread Peter Eisentraut
On tor, 2009-09-17 at 21:43 +0200, Zdenek Kotala wrote:
> Attached patch extends the pg_ctl command with an init option:
> 
> pg_ctl -D /var/lib/postgres [-s] init
> 
> This should replace usage of the initdb command, which has a problematic
> name, as we already discussed several times. The initdb binary will still
> be there, but it can be renamed and moved into the execlib dir in the future.

If the name is a problem, why not change the name?  What you are
proposing above is effectively a very elaborate name change, so why not
go for a simpler one?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [patch] pg_ctl init extension

2009-09-17 Thread Zdenek Kotala
Attached patch extends the pg_ctl command with an init option:

pg_ctl -D /var/lib/postgres [-s] init

This should replace usage of the initdb command, which has a problematic
name, as we already discussed several times. The initdb binary will still
be there, but it can be renamed and moved into the execlib dir in the future.

The patch does not contain documentation changes. They will depend on the
decision of which database initialization method will be preferred.



Zdenek

*** pgsql_init.8d83e5030d44/src/bin/pg_ctl/pg_ctl.c	2009-09-17 21:42:20.865268360 +0200
--- /export/home/zk200664/work/mercurial/pgsql_init/src/bin/pg_ctl/pg_ctl.c	2009-09-17 21:15:04.630265322 +0200
***
*** 57,62 
--- 57,63 
  typedef enum
  {
  	NO_COMMAND = 0,
+ 	INIT_COMMAND,
  	START_COMMAND,
  	STOP_COMMAND,
  	RESTART_COMMAND,
***
*** 100,105 
--- 101,107 
  static void do_help(void);
  static void set_mode(char *modeopt);
  static void set_sig(char *signame);
+ static void do_init(void);
  static void do_start(void);
  static void do_stop(void);
  static void do_restart(void);
***
*** 615,620 
--- 617,655 
  }
  
  static void
+ do_init(void)
+ {
+ 	char pg_initdb[MAXPGPATH];
+ 	char cmd[MAXPGPATH];
+ 	int ret;
+ 
+ 	if ((ret = find_other_exec(argv0, "initdb", "initdb (PostgreSQL) " PG_VERSION "\n",
+ 			   pg_initdb)) < 0)
+ 
+ 	{
+ 		write_stderr(_("%s: could not find initdb\n"),
+ 	 progname);
+ 		exit(1);
+ 	}
+ 
+ 	if (post_opts == NULL)
+ 		post_opts = "";
+ 
+ 	if (!silent_mode)
+ 		snprintf(cmd, MAXPGPATH, SYSTEMQUOTE "\"%s\" %s%s" SYSTEMQUOTE,
+  pg_initdb, pgdata_opt, post_opts);
+ 	else
+ 		snprintf(cmd, MAXPGPATH, SYSTEMQUOTE "\"%s\" %s%s > \"%s\"" SYSTEMQUOTE,
+  pg_initdb, pgdata_opt, post_opts, DEVNULL);
+ 	
+ 	if ( system(cmd) != 0 )
+ 	{
+ 		write_stderr(_("%s: database initialization failed.\n"), progname);
+ 		exit(1);
+ 	}
+ }
+ 
+ static void
  do_start(void)
  {
  	pgpid_t		pid;
***
*** 694,700 
  	 progname, exitcode);
  		exit(1);
  	}
- 
  	if (old_pid != 0)
  	{
  		pg_usleep(100);
--- 729,734 
***
*** 1535,1540 
--- 1569,1575 
  	printf(_("%s is a utility to start, stop, restart, reload configuration files,\n"
  			 "report the status of a PostgreSQL server, or signal a PostgreSQL process.\n\n"), progname);
  	printf(_("Usage:\n"));
+ 	printf(_("  %s init    [-D DATADIR] [-s] [-o \"OPTIONS\"]\n"), progname);
  	printf(_("  %s start   [-w] [-t SECS] [-D DATADIR] [-s] [-l FILENAME] [-o \"OPTIONS\"]\n"), progname);
  	printf(_("  %s stop[-W] [-t SECS] [-D DATADIR] [-s] [-m SHUTDOWN-MODE]\n"), progname);
  	printf(_("  %s restart [-w] [-t SECS] [-D DATADIR] [-s] [-m SHUTDOWN-MODE]\n"
***
*** 1567,1573 
  #endif
  	printf(_("  -l, --log FILENAME write (or append) server log to FILENAME\n"));
  	printf(_("  -o OPTIONS command line options to pass to postgres\n"
! 			 " (PostgreSQL server executable)\n"));
  	printf(_("  -p PATH-TO-POSTGRESnormally not necessary\n"));
  	printf(_("\nOptions for stop or restart:\n"));
  	printf(_("  -m SHUTDOWN-MODE   can be \"smart\", \"fast\", or \"immediate\"\n"));
--- 1602,1608 
  #endif
  	printf(_("  -l, --log FILENAME write (or append) server log to FILENAME\n"));
  	printf(_("  -o OPTIONS command line options to pass to postgres\n"
! 			 " (PostgreSQL server executable) or initdb\n"));
  	printf(_("  -p PATH-TO-POSTGRESnormally not necessary\n"));
  	printf(_("\nOptions for stop or restart:\n"));
  	printf(_("  -m SHUTDOWN-MODE   can be \"smart\", \"fast\", or \"immediate\"\n"));
***
*** 1824,1830 
  exit(1);
  			}
  
! 			if (strcmp(argv[optind], "start") == 0)
  ctl_command = START_COMMAND;
  			else if (strcmp(argv[optind], "stop") == 0)
  ctl_command = STOP_COMMAND;
--- 1859,1867 
  exit(1);
  			}
  
! 			if (strcmp(argv[optind], "init") == 0)
! ctl_command = INIT_COMMAND;
! 			else if (strcmp(argv[optind], "start") == 0)
  ctl_command = START_COMMAND;
  			else if (strcmp(argv[optind], "stop") == 0)
  ctl_command = STOP_COMMAND;
***
*** 1921,1926 
--- 1958,1966 
  
  	switch (ctl_command)
  	{
+ 		case INIT_COMMAND:
+ 			do_init();
+ 			break;
  		case STATUS_COMMAND:
  			do_status();
  			break;

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Feedback on getting rid of VACUUM FULL

2009-09-17 Thread Dimitri Fontaine
Hi,

Foreword: re-reading this, I hope my English will not make it sound like a
high kick when I'm just struggling to understand what all this is
about. Sending it so I don't regret missing the opportunity I think I'm
seeing...

Tom Lane  writes:
> Hannu Krosing  writes:
>> On Wed, 2009-09-16 at 21:19 -0400, Tom Lane wrote:
>>> VACUUM FULL CONCURRENTLY is a contradiction in terms.  Wishing it were
>>> possible doesn't make it so.
>
>> It depends on what do you mean by "VACUUM FULL"
>
> Anything that moves tuples is not acceptable as a hidden background
> operation, because it will break applications that depend on CTID.

I thought this community had the habit of prioritizing public-interface
backward compatibility, while going as far as requiring a systematic full
dump-and-restore cycle for major version upgrades in order to allow for
internal redesign at any time during development.

And even if it's easy enough to SELECT ctid FROM table, this has always
been an implementation detail in my mind, the same way the catalog layout
is.

I don't see any reason not to break the user-visible behavior of tuple
CTIDs between any two major releases, all the more so when the reason we're
talking about it is automated online physical optimization, which seems to
open the door to a bloat-resistant PostgreSQL.

> The utility Heikki is talking about is something that DBAs would
> invoke explicitly, presumably with an understanding of the side effects.

That's the CLUSTER on seqscan. As far as the table rewriting goes, the
above only states your POV, based on a CTID backward-compatibility need
that I'm not the only one here not sharing, let alone understanding.

Am I completely wet here?
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] generic copy options

2009-09-17 Thread Dan Colish
On Thu, Sep 17, 2009 at 02:56:07PM -0400, Emmanuel Cecchet wrote:
> Dan,
>
> My bad, I copy/pasted the hard link in output/copy.source instead of  
> @abs_build...@.
> Here is a complete version of the patch with the fix on output/copy.source.
>
> Emmanuel
>

Hooray, that works just fine now. I guess I should have seen that...

--
--Dan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] generic copy options

2009-09-17 Thread Emmanuel Cecchet

Dan,

My bad, I copy/pasted the hard link in output/copy.source instead of 
@abs_build...@.

Here is a complete version of the patch with the fix on output/copy.source.

Emmanuel


Re: [HACKERS] generic copy options

2009-09-17 Thread Dan Colish
On Thu, Sep 17, 2009 at 11:07:33AM -0400, Emmanuel Cecchet wrote:
> Tom Lane wrote:
>> I wonder though if we couldn't simplify matters. Offhand it seems to me
>> that psql doesn't need to validate the command's syntax fully.  All it
>> really needs to do is find the target filename and replace it with
>> STDIN/STDOUT.  Could we have it just treat the remainder of the line
>> literally, and not worry about the details of what the options might be?
>> Let the backend worry about throwing an error if they're bad.
>>   
> New version with the simplified psql. Still supports the 7.3 syntax.
> I am going to look into porting the other COPY enhancements (error  
> logging and autopartitioning) on this implementation. We might come up  
> with new ideas for the documentation side of things with more options.
>
> manu
>
> -- 
> Emmanuel Cecchet
> Aster Data Systems
> Web: http://www.asterdata.com
>

Hi,

I've been working on a review of this patch and it's currently
failing regression tests. Here's the regression.diff. I'm going to
spend some time today trying to figure out whether the tests need to
change or there is an actual issue in the patch. Just wanted to give
a heads up.
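As an aside, the simplification Tom suggests in the quote above — psql only locates the target filename, swaps in STDIN/STDOUT, and passes the remaining options through verbatim for the backend to validate — could be sketched like this (a hypothetical helper, not psql's actual implementation):

```python
import re

def rewrite_copy_command(line):
    """Rewrite a psql \\copy line into a server COPY command: replace the
    quoted target filename with STDIN/STDOUT and keep the remaining options
    verbatim, letting the backend validate them."""
    m = re.match(r"(?i)^\\copy\s+(\S+)\s+(to|from)\s+'[^']*'(.*)$", line)
    if not m:
        raise ValueError("unrecognized \\copy syntax")
    table, direction, rest = m.groups()
    stream = "STDOUT" if direction.lower() == "to" else "STDIN"
    return "COPY %s %s %s%s" % (table, direction.upper(), stream, rest)

print(rewrite_copy_command("\\copy copytest to '/tmp/copytest.csv' (csv)"))
# COPY copytest TO STDOUT (csv)
```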

--
--Dan

*** /home/dc0lish/workspace/postgresql/src/test/regress/expected/copy.out   
2009-09-17 11:45:04.041818319 -0700
--- /home/dc0lish/workspace/postgresql/src/test/regress/results/copy.out
2009-09-17 11:45:14.215152558 -0700
***
*** 72,88 
  1,a,1
  2,b,2
  -- Repeat the above tests with the new 8.5 option syntax
! copy copytest to 
'/home/manu/workspace/Postgres8.5-COPY/src/test/regress/results/copytest.csv' 
(csv);
  truncate copytest2;
! copy copytest2 from 
'/home/manu/workspace/Postgres8.5-COPY/src/test/regress/results/copytest.csv' 
(csv);
  select * from copytest except select * from copytest2;
   style | test | filler 
  ---+--+
  (0 rows)
  
  truncate copytest2;
! copy copytest to 
'/home/manu/workspace/Postgres8.5-COPY/src/test/regress/results/copytest.csv' 
(csv, csv_quote , csv_escape E'\\');
! copy copytest2 from 
'/home/manu/workspace/Postgres8.5-COPY/src/test/regress/results/copytest.csv' 
(csv, csv_quote , csv_escape E'\\');
  select * from copytest except select * from copytest2;
   style | test | filler 
  ---+--+
--- 72,88 
  1,a,1
  2,b,2
  -- Repeat the above tests with the new 8.5 option syntax
! copy copytest to 
'/home/dc0lish/workspace/postgresql/src/test/regress/results/copytest.csv' 
(csv);
  truncate copytest2;
! copy copytest2 from 
'/home/dc0lish/workspace/postgresql/src/test/regress/results/copytest.csv' 
(csv);
  select * from copytest except select * from copytest2;
   style | test | filler 
  ---+--+
  (0 rows)
  
  truncate copytest2;
! copy copytest to 
'/home/dc0lish/workspace/postgresql/src/test/regress/results/copytest.csv' 
(csv, csv_quote , csv_escape E'\\');
! copy copytest2 from 
'/home/dc0lish/workspace/postgresql/src/test/regress/results/copytest.csv' 
(csv, csv_quote , csv_escape E'\\');
  select * from copytest except select * from copytest2;
   style | test | filler 
  ---+--+
***
*** 95,111 
  1,a,1
  2,b,2
  -- Repeat the above tests with the new 8.5 option syntax from psql
! \copy copytest to 
'/home/manu/workspace/Postgres8.5-COPY/src/test/regress/results/copytest.csv' 
(csv)
  truncate copytest2;
! \copy copytest2 from 
'/home/manu/workspace/Postgres8.5-COPY/src/test/regress/results/copytest.csv' 
(csv)
  select * from copytest except select * from copytest2;
   style | test | filler 
  ---+--+
  (0 rows)
  
  truncate copytest2;
! \copy copytest to 
'/home/manu/workspace/Postgres8.5-COPY/src/test/regress/results/copytest.csv' 
(csv, csv_quote , csv_escape E'\\')
! \copy copytest2 from 
'/home/manu/workspace/Postgres8.5-COPY/src/test/regress/results/copytest.csv' 
(csv, csv_quote , csv_escape E'\\')
  select * from copytest except select * from copytest2;
   style | test | filler 
  ---+--+
--- 95,111 
  1,a,1
  2,b,2
  -- Repeat the above tests with the new 8.5 option syntax from psql
! \copy copytest to 
'/home/dc0lish/workspace/postgresql/src/test/regress/results/copytest.csv' (csv)
  truncate copytest2;
! \copy copytest2 from 
'/home/dc0lish/workspace/postgresql/src/test/regress/results/copytest.csv' (csv)
  select * from copytest except select * from copytest2;
   style | test | filler 
  ---+--+
  (0 rows)
  
  truncate copytest2;
! \copy copytest to 
'/home/dc0lish/workspace/postgresql/src/test/regress/results/copytest.csv' 
(csv, csv_quote , csv_escape E'\\')
! \copy copytest2 from 
'/home/dc0lish/workspace/postgresql/src/test/regress/results/copytest.csv' 
(csv, csv_quote , csv_escape E'\\')
  select * from copytest except select * from copytest2;
   style | test | filler 
  ---+--+

==

Re: [HACKERS] Feedback on getting rid of VACUUM FULL

2009-09-17 Thread Hannu Krosing
On Thu, 2009-09-17 at 14:33 -0400, Greg Smith wrote:
> On Wed, 16 Sep 2009, Tom Lane wrote:
> 
> >> * Shrink a table in place - when no space available
> > To be addressed by the UPDATE-style tuple-mover (which could be thought
> > of as VACUUM FULL rewritten to not use any special mechanisms).
> 
> Is there any synergy here with the needs of a future in-place upgrade 
> upgrade mechanism that handles page header expansion?  That problem seemed 
> to always get stuck on the issue of how to move tuples around when the 
> pages were full.  Not trying to drag the scope of this job out, just 
> looking for common ground that might be considered when designing the 
> tuple-mover if it could serve both purposes.

I understood that the main difficulty for in-place tuple expansion was
keeping CTIDs unchanged so as not to need to update indexes.

The current tuple-mover discussion does not address that.

But maybe something can be taken from this discussion the other way
round - maybe we should not be afraid of doing null updates during
in-place update.


-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Feedback on getting rid of VACUUM FULL

2009-09-17 Thread Hannu Krosing
On Thu, 2009-09-17 at 12:36 -0400, Tom Lane wrote:
> Hannu Krosing  writes:
> > On Thu, 2009-09-17 at 12:18 -0400, Tom Lane wrote:
> >> Or for an update without having to hold a transaction open.  We have
> >> recommended this type of technique in the past:
> 
> > If you need real locking, then just define a locked (or locked_by or
> > locked_until) column and use that for concurrent edit control
> 
> That's pessimistic locking, and it sucks for any number of reasons,
> most obviously if your client crashes or otherwise forgets to release
> the lock. 

That's the (locked_by, locked_until) case. It is used a) to tell other
potential editors that "this row is being edited" and b) to time out the
lock.

>  The method I was illustrating is specifically meant for
> apps that would prefer optimistic locking.

But surely any reliance on internal implementation details like CTID or
XMIN should be discouraged in ordinary user code, or really anything
except maintenance utilities, which sometimes _have_ to do that.

Still, most people would _not_ want that to fail if someone just opened
the edit window and then clicked "Save" without making any changes.

Telling the user "You can't save your edited record because somebody just
changed the xmin field" seems kind of silly.
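The optimistic-locking pattern being contrasted here can be illustrated with an explicit version column standing in for xmin (sqlite3 is used only to make the sketch runnable; the SQL pattern is the same in PostgreSQL):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE doc (id INTEGER PRIMARY KEY, body TEXT, version INTEGER)")
conn.execute("INSERT INTO doc VALUES (1, 'draft', 0)")

def save(conn, doc_id, new_body, seen_version):
    """Optimistic update: succeeds only if nobody changed the row since we
    read it (the version we saw still matches). Returns True on success."""
    cur = conn.execute(
        "UPDATE doc SET body = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_body, doc_id, seen_version))
    return cur.rowcount == 1

print(save(conn, 1, "edit A", 0))  # True: first writer wins
print(save(conn, 1, "edit B", 0))  # False: stale version, concurrent update
```

Note this is exactly the situation Hannu describes: a writer who made no real changes still bumps the version, so a second editor is told "concurrent update" even when nothing meaningful changed.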

> >> Exactly.  The application is typically going to throw a "concurrent
> >> update" type of error when this happens, and we don't want magic
> >> background operations to cause that.
> 
> > Would'nt current VACUUM FULL or CLUSTER cause much more grief in this
> > situation ?
> 
> Sure, but neither of those are recommended for routine maintenance
> during live database operations.  

If they were, then we would not be having this whole discussion now.


> (What you might do during maintenance windows is a different discussion.)

I aim at 24/7 operations with no maintenance window in sight.

> 
>   regards, tom lane
> 
-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Feedback on getting rid of VACUUM FULL

2009-09-17 Thread Greg Smith

On Wed, 16 Sep 2009, Tom Lane wrote:


>> * Shrink a table in place - when no space available

> To be addressed by the UPDATE-style tuple-mover (which could be thought
> of as VACUUM FULL rewritten to not use any special mechanisms).


Is there any synergy here with the needs of a future in-place upgrade 
upgrade mechanism that handles page header expansion?  That problem seemed 
to always get stuck on the issue of how to move tuples around when the 
pages were full.  Not trying to drag the scope of this job out, just 
looking for common ground that might be considered when designing the 
tuple-mover if it could serve both purposes.


--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Schedule for 8.5 Development

2009-09-17 Thread Bruce Momjian
Josh Berkus wrote:
> Hackers,
> 
> Per discussions on two other threads on this list which have apparently
> reached consensus, we will be going with the following schedule:
> 
> CF1     7/15 to 8/14
> Alpha1  by 8/20
> CF2     9/15 to 10/14
> Alpha2  by 10/20
> CF3     11/15 to 12/14
> Alpha3  by 12/20
> CF4     1/15 to 2/14
> Alpha4  by 2/20
> Beta1   est. 3/1 to 3/7
> Release June, depending on bugs

I think that June release date is realistic.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_hba.conf: samehost and samenet

2009-09-17 Thread Stef Walter
[Thanks for the heads up about the MessageID missing when posting this
previously. Was doing some mail filter development, and accidentally
left it in place... ]

Magnus Hagander wrote:
> 2009/8/25 Alvaro Herrera :
>> Something to keep in mind -- my getifaddrs(3) manpage says that on BSD
>> it can return addresses that have ifa_addr set to NULL, which your code
>> doesn't seem to check.

Thanks for catching that. I've added a check, and attached a new patch.

> Eek. This is not defined by any standard, is it? I wonder how many
> different behaviours we can find there :(

I've checked AIX, Linux, BSD and Mac OS and NULL ifa_addr's are
documented in all of them.

Cheers,

Stef



diff --git a/configure.in b/configure.in
index e545a1f..b77ce2b 100644
*** a/configure.in
--- b/configure.in
*** AC_SUBST(OSSP_UUID_LIBS)
*** 969,975 
  ##
  
  dnl sys/socket.h is required by AC_FUNC_ACCEPT_ARGTYPES
! AC_CHECK_HEADERS([crypt.h dld.h fp_class.h getopt.h ieeefp.h langinfo.h poll.h pwd.h sys/ipc.h sys/poll.h sys/pstat.h sys/resource.h sys/select.h sys/sem.h sys/socket.h sys/shm.h sys/tas.h sys/time.h sys/un.h termios.h ucred.h utime.h wchar.h wctype.h kernel/OS.h kernel/image.h SupportDefs.h])
  
  # At least on IRIX, cpp test for netinet/tcp.h will fail unless
  # netinet/in.h is included first.
--- 969,975 
  ##
  
  dnl sys/socket.h is required by AC_FUNC_ACCEPT_ARGTYPES
! AC_CHECK_HEADERS([crypt.h dld.h fp_class.h getopt.h ieeefp.h langinfo.h poll.h pwd.h sys/ipc.h sys/poll.h sys/pstat.h sys/resource.h sys/select.h sys/sem.h sys/socket.h sys/shm.h sys/tas.h sys/time.h sys/un.h termios.h ucred.h utime.h wchar.h wctype.h kernel/OS.h kernel/image.h SupportDefs.h ifaddrs.h])
  
  # At least on IRIX, cpp test for netinet/tcp.h will fail unless
  # netinet/in.h is included first.
*** PGAC_VAR_INT_TIMEZONE
*** 1148,1154 
  AC_FUNC_ACCEPT_ARGTYPES
  PGAC_FUNC_GETTIMEOFDAY_1ARG
  
! AC_CHECK_FUNCS([cbrt dlopen fcvt fdatasync getpeereid getpeerucred getrlimit memmove poll pstat readlink setproctitle setsid sigprocmask symlink sysconf towlower utime utimes waitpid wcstombs])
  
  # posix_fadvise() is a no-op on Solaris, so don't incur function overhead
  # by calling it, 2009-04-02
--- 1148,1154 
  AC_FUNC_ACCEPT_ARGTYPES
  PGAC_FUNC_GETTIMEOFDAY_1ARG
  
! AC_CHECK_FUNCS([cbrt dlopen fcvt fdatasync getpeereid getpeerucred getrlimit memmove poll pstat readlink setproctitle setsid sigprocmask symlink sysconf towlower utime utimes waitpid wcstombs getifaddrs])
  
  # posix_fadvise() is a no-op on Solaris, so don't incur function overhead
  # by calling it, 2009-04-02
diff --git a/doc/src/sgml/client-auth.sgml b/doc/src/sgml/client-auth.sgml
index ad4d084..e88c796 100644
*** a/doc/src/sgml/client-auth.sgml
--- b/doc/src/sgml/client-auth.sgml
*** hostnossl  database
  
+   Instead of a CIDR-address, you can specify 
+    the values samehost or samenet. To 
+match any address on the subnets connected to the local machine, specify 
+samenet. By specifying samehost, any 
+addresses present on the network interfaces of the local machine will match.
+   
+ 

 This field only applies to host,
 hostssl, and hostnossl records.
diff --git a/src/backend/libpq/hba.c b/src/backend/libpq/hba.c
index e6f7db2..c2da3a0 100644
*** a/src/backend/libpq/hba.c
--- b/src/backend/libpq/hba.c
*** check_db(const char *dbname, const char 
*** 512,517 
--- 512,608 
  	return false;
  }
  
+ /*
+  * Check to see if a connecting IP matches the address and netmask.
+  */
+ static bool
+ check_ip (SockAddr *raddr, struct sockaddr *addr, struct sockaddr *mask)
+ {
+ 	if (raddr->addr.ss_family == addr->sa_family)
+ 	{
+ 		/* Same address family */
+ 		if (!pg_range_sockaddr(&raddr->addr, (struct sockaddr_storage*)addr, 
+ 		   (struct sockaddr_storage*)mask))
+ 			return false;
+ 	}
+ #ifdef HAVE_IPV6
+ 	else if (addr->sa_family == AF_INET &&
+ 			 raddr->addr.ss_family == AF_INET6)
+ 	{
+ 		/*
+ 		 * Wrong address family.  We allow only one case: if the file
+ 		 * has IPv4 and the port is IPv6, promote the file address to
+ 		 * IPv6 and try to match that way.
+ 		 */
+ 		struct sockaddr_storage addrcopy,
+ 	maskcopy;
+ 
+ 		memcpy(&addrcopy, addr, sizeof(addrcopy));
+ 		memcpy(&maskcopy, mask, sizeof(maskcopy));
+ 		pg_promote_v4_to_v6_addr(&addrcopy);
+ 		pg_promote_v4_to_v6_mask(&maskcopy);
+ 
+ 		if (!pg_range_sockaddr(&raddr->addr, &addrcopy, &maskcopy))
+ 			return false;
+ 	}
+ #endif   /* HAVE_IPV6 */
+ 	else
+ 	{
+ 		/* Wrong address family, no IPV6 */
+ 		return false;
+ 	}
+ 
+ 	return true;
+ }
+ 
+ typedef struct CheckNetwork {
+ 	NetMethod method;
+ 	SockAddr *raddr;
+ 	bool result;	
+ } CheckNetwork;
+ 
+ static void
+ callback_check_network (struct sockaddr *addr, struct sockaddr *netmask, void *data)
+ {
+ 	CheckNetwork *cn = data;
+ 	struct sockaddr_storage mask;
+ 
+ 	/* Already found a ma

Re: [HACKERS] Linux LSB init script

2009-09-17 Thread Greg Smith

On Thu, 17 Sep 2009, Peter Eisentraut wrote:


On ons, 2009-09-16 at 12:05 -0500, Kevin Grittner wrote:

Well, with differences in behavior, of course, like attempting a fast
shutdown, and escalating to an immediate shutdown if that doesn't
succeed in the allowed time.


Right, but if you think that this behavior is the right/better one (which
it arguably isn't), why shouldn't it be available from the mainline
tools?  Obviously, most people have felt for the last N years that
pg_ctl provides adequate shutdown options.


I've implemented exactly the same logic Kevin describes for a past 
consulting customer and for multiple forms of shutdown scripts at Truviso. 
I would suggest the situation here is that it's so easy to script a custom 
solution that the other people who have done the same don't have any 
motivation to get that fixed upstream, which would take a bunch of arguing 
on this list to accomplish.  Easier to just work around it and move on.


You're correct that pg_ctl provides "adequate" options here in the respect 
that it's possible to build the behavior many people want from the 
primitives available.  I'm with Kevin that what many people would like to 
see in a system shutdown script is not possible using pg_ctl alone.  Fast 
shutdown, timeout, then immediate shutdown is absolutely the best default 
behavior if the server must go down, but you can afford to wait a bit to 
try and do that more cleanly than going straight to immediate.
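
That escalation is simple enough to sketch in shell. This is a minimal
version, not taken from any posted script: it assumes pg_ctl is on the
PATH with PGDATA exported, and the timeout value is purely illustrative.

```shell
#!/bin/sh
# Minimal sketch of "fast, then immediate" shutdown escalation.
# Assumes pg_ctl is on PATH and PGDATA is exported; the timeout is
# an illustrative choice, not a recommendation.
escalating_stop() {
    timeout_secs="$1"
    if pg_ctl stop -m fast -w -t "$timeout_secs"; then
        echo "fast shutdown succeeded"
    else
        echo "fast shutdown timed out, escalating to immediate"
        pg_ctl stop -m immediate -w
    fi
}
```

An init script's stop action then reduces to something like
"escalating_stop 60" plus whatever exit-code mapping the distribution
expects.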


To answer the next question, "why doesn't fast shutdown work for you?", 
see http://archives.postgresql.org/pgsql-bugs/2009-03/msg00062.php


If that were fixed, maybe fast shutdown would be more useful; as it is, I 
have to assume it will fail, because open COPYs are pretty common for our 
apps.  But I don't want to go straight to immediate for the sake of other 
programs that can shut themselves down more cleanly if the server goes 
through the fast shutdown stage first.


--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] LDAP where DN does not include UID attribute

2009-09-17 Thread Magnus Hagander
On Thu, Sep 17, 2009 at 18:02, Robert Fleming  wrote:
> Following a discussion on the pgsql-admin list, I have
> created a patch to (optionally) allow PostgreSQL to do an LDAP search to
> determine the user's DN (as is done in Apache, MediaWiki, Bugzilla, et al.)
> instead of building the DN from a prefix and suffix.
> This is necessary for schemas where the login attribute is not in the DN,
> such as is described here (look for
> "name-based").  This patch is against PostgreSQL 8.4.0 from Debian
> Lenny-backports.  If this would be a welcome addition, I can port it forward
> to the latest from postgresql.org.
> Thanks in advance for your feedback.

This sounds like a very useful feature, and one that I can then remove
from my personal TODO list without having to do much work :-)

A couple of comments:

First of all, please read up on the PostgreSQL coding style, or at
least look at the code around yours. This doesn't look anything like
our standards.

Second, this appears to require an anonymous bind to the directory,
which is something we should not encourage people to enable on their
LDAP servers. I think we need to also take parameters with a DN and a
password to bind with in order to do the search, and then re-bind as
the user when found.
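
For reference, the search-then-rebind flow Magnus describes can be
sketched with the OpenLDAP command-line tools. Everything here (the
search account's DN, the base DN, and the password variables) is
illustrative, not taken from the patch:

```shell
#!/bin/sh
# Step 1: bind as a dedicated, non-anonymous search account and look up
# the DN whose uid matches the connecting user name.
find_user_dn() {
    pguser="$1"
    ldapsearch -x -LLL \
        -D "cn=pgsearch,dc=example,dc=com" -w "$SEARCH_PASSWORD" \
        -b "ou=people,dc=example,dc=com" "(uid=$pguser)" dn |
        sed -n 's/^dn: //p'
}

# Step 2: re-bind as the DN found above, using the password the client
# supplied; a successful bind is what authenticates the user.
verify_password() {
    ldapwhoami -x -D "$1" -w "$2"
}
```

Authentication then amounts to
verify_password "$(find_user_dn alice)" "$client_password", with no
anonymous access to the directory required.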

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] opportunistic tuple freezing

2009-09-17 Thread Jeff Davis
On Thu, 2009-09-17 at 12:36 -0400, Robert Haas wrote:
> Despite my recent screw-up in this department, it should really be the
> patch author's responsibility to test the patch first.  Then the
> reviewing process can involve additional testing.  So I would say this
> should be moved to Returned With Feedback, and then it can be
> resubmitted later with test results.

Fine with me. I already suspected that this patch wouldn't make it to
the September commitfest:

http://archives.postgresql.org/pgsql-hackers/2009-09/msg00798.php

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] community decision-making & 8.5

2009-09-17 Thread Bruce Momjian
Selena Deckelmann wrote:
> On Thu, Sep 3, 2009 at 9:00 AM, Robert Haas wrote:
> 
> > Yeah, I'm game, though I'm hoping not to become the guy who spends all
> > his time doing release planning, because I like writing code, too.
> > Hopefully Selena won't mind my mentioning that she sent me a private
> > email expressing some interest in this area, too.
> 
> Not at all!  My schedule is largely open this fall and winter, and I
> have a patch or two that I ought to finish soonish as well.

FYI, my schedule in the next few months is not good.  I just returned
from 19 days of traveling (mostly vacation), and will be attending
conferences two weeks in October and two weeks in November.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Feedback on getting rid of VACUUM FULL

2009-09-17 Thread Simon Riggs

On Thu, 2009-09-17 at 12:30 -0400, Tom Lane wrote:
> Simon Riggs  writes:
> > The update utility being discussed is in danger of confusing these two
> > goals
> > * compact the table using minimal workspace
> > * compact the table with minimal interruption to concurrent updaters
> 
> Actually, the update utility is explicitly meant to satisfy both of
> those goals (possibly with different usage styles).  I don't see any
> particular confusion.

It wasn't explicit until now. The confusion was you saying that
"VACUUM FULL CONCURRENTLY" was an impossible dream; that's why I've
restated it the above way so it's clear what we want.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Linux LSB init script

2009-09-17 Thread Kevin Grittner
Peter Eisentraut  wrote:
 
> I think it's important to consider what the "conforming behavior"
> really achieves in practice.  The INFO section is essential nowadays
> for correct functioning on a large portion of distributions.  The
> exit codes are relatively uninteresting but occasionally useful. 
> The init-functions don't make a difference at all, as far as I'm
> aware.
 
I don't claim to be aware of how heavily current and planned
conforming implementations make use of which features, which is why I
felt it best to conform as fully as possible.  As far as I can tell,
full conformance in the script only has the down side of bloating it
to 300 lines, including current comments; it would be much shorter if
we modified pg_ctl as you propose.  I'm not clear on any other
benefits of pseudo-conformance.
 
> Well, you mentioned earlier that you were hesitant about backporting
> the INFO header without adding all the other "conforming" stuff. 
> From a practical point of view, I think an INFO header is nowadays
> essential independent of what the rest of the script does.
 
The only down side to including it in a script which doesn't behave in
an LSB conforming manner is that it could convince a conforming
environment that it is a conforming script.  Lying about that might be
worse than remaining silent on the matter.
 
> Well, in such cases it may be useful to add an option such as
> --oknodo to select the idempotent behavior.
 
I found that confusing (as did Robert); how about --lsb-conforming?
 
>> > And then you proceed in pg_initd_stop to basically reimplement
>> > half of the pg_ctl logic in shell code.
>>  
>> Well, with differences in behavior, of course, like attempting a
>> fast shutdown, and escalating to an immediate shutdown if that
>> doesn't succeed in the allowed time.
> 
> Right, but if you think that this behavior is the right/better one
> (which it arguably isn't), why shouldn't it be available from the
> mainline tools?
 
Well, it's certainly the best behavior for my shop, because the stop
command is used by the OS shutdown.  If the command never returns, the
shutdown hangs indefinitely, which can be very bad in our environment.
If the shutdown returns without actually stopping PostgreSQL, the
system will kill -9 it as a last resort before power-off or reboot.
I like my heuristic better than either of those.
 
> Obviously, most people have felt for the last N years that
> pg_ctl provides adequate shutdown options.
 
Which is why I didn't suggest changing pg_ctl; but perhaps that was
overly timid.
 
>> Not using the functions would cause the script to only work for
>> some undefined subset of conforming implementations.
> 
> Yeah, except that part of the spec is hilariously unrealistic.  And
> there is no evidence to suggest that use of these functions will
> cause messages to be formatted like other services on the machine.
> Debian/Ubuntu for example have given up on those functions because
> they don't do anything useful and provide their own set of functions
> that you ought to use to format messages like on those systems.
 
SuSE also wraps these with their own functions, but I really don't
want to try to learn the implementation details of every distribution
when there is a supported standard.  I think you're wrong about
current support.
 
On my kubuntu system:
 
r...@kgrittn-desktop:~# . /lib/lsb/init-functions
r...@kgrittn-desktop:~# log_success_msg "test: started"
 * test: started
r...@kgrittn-desktop:~# log_warning_msg "test: unexpected state"
 * test: unexpected state
r...@kgrittn-desktop:~# log_failure_msg "test: insufficient
priviledges"
 * test: insufficient priviledges
 
The star is black, yellow, or red, depending on the function used. 
This matches the behavior of init scripts which come with the OS.
 
On my SLES10 systems:
 
CIRDEV:~ # . /lib/lsb/init-functions
CIRDEV:~ # log_success_msg "test: started"
test: started   done
CIRDEV:~ # log_warning_msg "test: unexpected state"
test: unexpected state   warning
CIRDEV:~ # log_failure_msg "test: insufficient priviledges"
test: insufficient priviledges  failed
 
Where the word off to the right lines up near the right edge of
whatever the line width is, and the colors of those words are green,
yellow, or red based on the function used.  This matches the
behavior of init scripts which come with the OS.
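
For what it's worth, a script can use the standard functions where they
exist and still degrade gracefully elsewhere. A hedged sketch follows;
the plain-echo fallback bodies are mine, not from Kevin's script:

```shell
#!/bin/sh
# Plain-echo fallbacks, used only when the LSB functions are missing;
# on a conforming system the sourced implementations replace them below.
log_success_msg() { echo "$*"; }
log_warning_msg() { echo "$*"; }
log_failure_msg() { echo "$*"; }

# Prefer the system's own implementations when available, so messages
# match the distribution's formatting shown above.
if [ -f /lib/lsb/init-functions ]; then
    . /lib/lsb/init-functions
fi
```

Either way the script keeps running, and the distribution's decoration
(stars on kubuntu, right-aligned status words on SLES) appears only
where the real functions are present.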
 
> I would pretty much remove all of that completely and replace it
> with a link to the spec.
 
I'm skeptical.  Robert didn't think those were over the top.  I would
like to hear what others think, but I, personally, appreciate such
comments when I'm reading code.  (Even my own, if it's been a few
months.)
 
> How about making a list of current behavior of pg_ctl vs. required
> behavior under LSB, and then we can judge how much of that could
> become a standard behavior and how much would have to be hidden
> behind an option.
> 
> I think figuring this out 

Re: [HACKERS] opportunistic tuple freezing

2009-09-17 Thread Robert Haas
On Thu, Sep 17, 2009 at 12:24 PM, Tom Lane  wrote:
> Jeff Janes  writes:
>> Does the community think that experimental performance testing is a
>> must-have in order for this patch to be acceptable?
>
> Dunno about others, but I think so.  It's complicating both the
> implementation and the users-eye view of VACUUM, and I want more than
> a hypothesis that we're going to get something useful out of that.
>
> If we can't test it in a reasonable time frame for this commitfest,
> then we should move it to the queue for the next one.

Despite my recent screw-up in this department, it should really be the
patch author's responsibility to test the patch first.  Then the
reviewing process can involve additional testing.  So I would say this
should be moved to Returned With Feedback, and then it can be
resubmitted later with test results.

The problem with bumping things to the next CommitFest is that it then
becomes the CommitFest management team's problem to sort out which
bumped patches still have outstanding to-do items, rather than the
patch author's problem to let us know when those items have been
completed.

So I am in favor of a policy that things should only be moved to the
next CommitFest when they have ALREADY satisfied the requirements for
being reviewed during that CommitFest.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Feedback on getting rid of VACUUM FULL

2009-09-17 Thread Tom Lane
Hannu Krosing  writes:
> On Thu, 2009-09-17 at 12:18 -0400, Tom Lane wrote:
>> Or for an update without having to hold a transaction open.  We have
>> recommended this type of technique in the past:

> If you need real locking, then just define a locked (or locked_by or
> locked_until) column and use that for concurrent edit control

That's pessimistic locking, and it sucks for any number of reasons,
most obviously if your client crashes or otherwise forgets to release
the lock.  The method I was illustrating is specifically meant for
apps that would prefer optimistic locking.

>> Exactly.  The application is typically going to throw a "concurrent
>> update" type of error when this happens, and we don't want magic
>> background operations to cause that.

> Wouldn't the current VACUUM FULL or CLUSTER cause much more grief in this
> situation?

Sure, but neither of those are recommended for routine maintenance
during live database operations.  (What you might do during maintenance
windows is a different discussion.)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Feedback on getting rid of VACUUM FULL

2009-09-17 Thread Robert Haas
On Thu, Sep 17, 2009 at 12:31 PM, Hannu Krosing  wrote:
>> Exactly.  The application is typically going to throw a "concurrent
>> update" type of error when this happens, and we don't want magic
>> background operations to cause that.
>
> Wouldn't the current VACUUM FULL or CLUSTER cause much more grief in this
> situation?

No.  They take an exclusive lock on the table, so this situation can't
occur in those cases, which was Tom's point.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Feedback on getting rid of VACUUM FULL

2009-09-17 Thread Robert Haas
On Thu, Sep 17, 2009 at 12:18 PM, Tom Lane  wrote:
>> It's no different from the situation where another backend UPDATEs the
>> row under your nose, but it's not something you want to do automatically
>> without notice.
>
> Exactly.  The application is typically going to throw a "concurrent
> update" type of error when this happens, and we don't want magic
> background operations to cause that.

OK, that makes sense.  It seems like we more or less have consensus on
what to do here.

- Change VACUUM FULL to be the equivalent of CLUSTER-without-index.
- Add some kind of tuple mover that can be invoked when it's necessary
to incrementally compact a table in place.

This might not cover every possible use case, but it seems that it
can't be any worse than what we have now.  The tuple mover seems like
a workable substitute for the current VACUUM FULL in cases where space
is limited, and by virtue of being incremental it can be used in
situations where the current VACUUM FULL can't.  There could be a loss
of functionality if the tuple mover turns out to be slower than VACUUM
FULL, but the consensus seems to be that such a case is hard to imagine.

The new VACUUM FULL behavior, OTOH, should be faster than the existing
one in cases where space consumption is not an issue.

So nothing gets any worse, and some things get better.

But who is implementing this?

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Feedback on getting rid of VACUUM FULL

2009-09-17 Thread Hannu Krosing
On Thu, 2009-09-17 at 12:18 -0400, Tom Lane wrote:
> Heikki Linnakangas  writes:
> > Robert Haas wrote:
> >> On Thu, Sep 17, 2009 at 10:21 AM, Tom Lane  wrote:
> >>> Anything that moves tuples is not acceptable as a hidden background
> >>> operation, because it will break applications that depend on CTID.
> 
> >> I'm a bit confused.  CTIDs change all the time anyway, whenever you
> >> update the table.  What could someone possibly be using them for?
> 
> > As a unique identifier, while you hold a portal open.
> 
> Or for an update without having to hold a transaction open.  We have
> recommended this type of technique in the past:
> 
>   select ctid, xmin, * from table where id = something;
> 
>   ... allow user to edit the row at his leisure ...
> 
>   update table set ... where id = something and
>   ctid = previous value and xmin = previous value;
>   if rows_updated = 0 then
>   report error ("row was already updated by someone else");
> 
> (Actually, the ctid is only being used for fast access here; the xmin
> is what is really needed to detect that someone else updated the row.
> But the proposed tuple-mover would break the xmin check too.)

I have used mostly duck-typed, interface-not-identity languages lately,
so for me the natural thing to check in a similar situation is whether any
"interesting columns" have changed, by simply preserving the old values in
the user application and using them in the WHERE clause of the update.

Why should anyone care if there has been, say, a null update (set id=id
where id=...)?

If you need real locking, then just define a locked (or locked_by or
locked_until) column and use that for concurrent edit control

> > It's no different from the situation where another backend UPDATEs the
> > row under your nose, but it's not something you want to do automatically
> > without notice.
> 
> Exactly.  The application is typically going to throw a "concurrent
> update" type of error when this happens, and we don't want magic
> background operations to cause that.

Wouldn't the current VACUUM FULL or CLUSTER cause much more grief in this
situation?

-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Feedback on getting rid of VACUUM FULL

2009-09-17 Thread Tom Lane
Simon Riggs  writes:
> The update utility being discussed is in danger of confusing these two
> goals
> * compact the table using minimal workspace
> * compact the table with minimal interruption to concurrent updaters

Actually, the update utility is explicitly meant to satisfy both of
those goals (possibly with different usage styles).  I don't see any
particular confusion.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] opportunistic tuple freezing

2009-09-17 Thread Tom Lane
Jeff Janes  writes:
> Does the community think that experimental performance testing is a
> must-have in order for this patch to be acceptable?

Dunno about others, but I think so.  It's complicating both the
implementation and the users-eye view of VACUUM, and I want more than
a hypothesis that we're going to get something useful out of that.

If we can't test it in a reasonable time frame for this commitfest,
then we should move it to the queue for the next one.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Feedback on getting rid of VACUUM FULL

2009-09-17 Thread Hannu Krosing
On Thu, 2009-09-17 at 10:32 -0400, Tom Lane wrote:
> Hannu Krosing  writes:
> > On Thu, 2009-09-17 at 10:21 -0400, Tom Lane wrote:
> >> because it will break applications that depend on CTID.
> 
> > Do you know of any such applications out in the wild ?
> 
> Yes, they're out there.

How do they deal with concurrent UPDATEs?

>   regards, tom lane
> 
-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Feedback on getting rid of VACUUM FULL

2009-09-17 Thread Tom Lane
Heikki Linnakangas  writes:
> Robert Haas wrote:
>> On Thu, Sep 17, 2009 at 10:21 AM, Tom Lane  wrote:
>>> Anything that moves tuples is not acceptable as a hidden background
>>> operation, because it will break applications that depend on CTID.

>> I'm a bit confused.  CTIDs change all the time anyway, whenever you
>> update the table.  What could someone possibly be using them for?

> As a unique identifier, while you hold a portal open.

Or for an update without having to hold a transaction open.  We have
recommended this type of technique in the past:

select ctid, xmin, * from table where id = something;

... allow user to edit the row at his leisure ...

update table set ... where id = something and
ctid = previous value and xmin = previous value;
if rows_updated = 0 then
report error ("row was already updated by someone else");

(Actually, the ctid is only being used for fast access here; the xmin
is what is really needed to detect that someone else updated the row.
But the proposed tuple-mover would break the xmin check too.)

> It's no different from the situation where another backend UPDATEs the
> row under your nose, but it's not something you want to do automatically
> without notice.

Exactly.  The application is typically going to throw a "concurrent
update" type of error when this happens, and we don't want magic
background operations to cause that.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] opportunistic tuple freezing

2009-09-17 Thread Jeff Janes
On Mon, Sep 14, 2009 at 2:07 AM, Jeff Davis  wrote:

> On Mon, 2009-08-17 at 10:22 -0400, Tom Lane wrote:
> > As always with patches that are meant to improve performance,
> > some experimental evidence would be a good thing.
>
> I haven't had time to performance test this patch yet, and it looks like
> it will take a significant amount of effort to do so. I'm focusing on my
> other work, so I don't know if this one is going to be in shape for the
> September commitfest.
>
> If someone is interested in doing some performance testing for this
> patch, let me know. I still think it has potential.
>
>
Does the community think that experimental performance testing is a
must-have in order for this patch to be acceptable?  If so, it sounds like
this should be marked as rejected or RwF, and no longer considered for this
commit fest, and I should move on to a different patch.

I'll do some work on benchmarking it, but since it takes hundreds of
millions of transactions to make a plausible scenario, this will not be done
any time soon.

Also, I see that the number of frozen tuples is not logged.  I'd like to add
that to the info reported under Log_autovacuum_min_duration, both to help
evaluate this patch and because it seems like something that should be
reported.

Cheers,

Jeff Janes


Re: [HACKERS] Feedback on getting rid of VACUUM FULL

2009-09-17 Thread Heikki Linnakangas
Robert Haas wrote:
> On Thu, Sep 17, 2009 at 10:21 AM, Tom Lane  wrote:
>> Hannu Krosing  writes:
>>> On Wed, 2009-09-16 at 21:19 -0400, Tom Lane wrote:
 VACUUM FULL CONCURRENTLY is a contradiction in terms.  Wishing it were
 possible doesn't make it so.
>>> It depends on what you mean by "VACUUM FULL"
>> Anything that moves tuples is not acceptable as a hidden background
>> operation, because it will break applications that depend on CTID.
> 
> I'm a bit confused.  CTIDs change all the time anyway, whenever you
> update the table.  What could someone possibly be using them for?

As a unique identifier, while you hold a portal open. I recall that last
time this was discussed was wrt. HOT. At least one of the drivers used
it to implement client-side updateable cursors (ODBC if I recall
correctly). We normally guarantee that CTID of a row doesn't change
within the same transaction that you read it, but if we do UPDATEs to
move tuples behind the application's back, the UPDATEs will cause the
CTID of the row to change.

It's no different from the situation where another backend UPDATEs the
row under your nose, but it's not something you want to do automatically
without notice.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] LDAP where DN does not include UID attribute

2009-09-17 Thread Robert Fleming
Following a discussion on the pgsql-admin list <
http://archives.postgresql.org/pgsql-admin/2009-09/msg00075.php>, I have
created a patch to (optionally) allow PostgreSQL to do an LDAP search to
determine the user's DN (as is done in Apache, MediaWiki, Bugzilla, et al.)
instead of building the DN from a prefix and suffix.
This is necessary for schemas where the login attribute is not in the DN,
such as is described here <
http://www.ldapman.org/articles/intro_to_ldap.html#individual> (look for
"name-based").  This patch is against PostgreSQL 8.4.0 from Debian
Lenny-backports.  If this would be a welcome addition, I can port it forward
to the latest from postgresql.org.
Thanks in advance for your feedback.
Robert
diff -ru postgresql-8.4-8.4.0-original/src/backend/libpq/auth.c postgresql-8.4-8.4.0/src/backend/libpq/auth.c
--- postgresql-8.4-8.4.0-original/src/backend/libpq/auth.c	2009-06-25 04:30:08.0 -0700
+++ postgresql-8.4-8.4.0/src/backend/libpq/auth.c	2009-09-16 22:33:46.0 -0700
@@ -2150,10 +2150,75 @@
 		}
 	}
 
-	snprintf(fulluser, sizeof(fulluser), "%s%s%s",
-			 port->hba->ldapprefix ? port->hba->ldapprefix : "",
-			 port->user_name,
-			 port->hba->ldapsuffix ? port->hba->ldapsuffix : "");
+	if (port->hba->ldapbasedn)
+	  {
+	char filter[NAMEDATALEN + 10]; /* FIXME: maybe there's a preferred way to pick this size? */
+	LDAPMessage* search_message;
+	char* attributes[2];
+	LDAPMessage* entry;
+	char* dn;
+
+	/* it seems that you need to search for at least one attribute, else /all/ attributes are returned */
+	attributes[0] = "uid";
+	attributes[1] = NULL;
+
+	snprintf(filter, sizeof(filter), "(uid=%s)", port->user_name);
+	filter[sizeof(filter) - 1] = '\0';
+
+	r = ldap_search_s(ldap,
+			  port->hba->ldapbasedn,
+			  LDAP_SCOPE_SUBTREE,
+			  filter,
+			  attributes,
+			  0,
+			  &search_message);
+
+	if (r != LDAP_SUCCESS)
+	  {
+		ereport(LOG,
+			(errmsg("LDAP search failed for user \"%s\" on server \"%s\": error code %d",
+				port->user_name, port->hba->ldapserver, r)));
+		return STATUS_ERROR;
+	  }
+
+	if (ldap_count_entries(ldap, search_message) != 1)
+	  {
+		if (ldap_count_entries(ldap, search_message) == 0)
+		  ereport(LOG,
+			  (errmsg("LDAP search failed for user \"%s\" on server \"%s\": no such user",
+				  port->user_name, port->hba->ldapserver)));
+		else
+		  ereport(LOG,
+			  (errmsg("LDAP search failed for user \"%s\" on server \"%s\": user is not unique (%d matches)",
+				  port->user_name, port->hba->ldapserver, ldap_count_entries(ldap, search_message))));
+
+		ldap_msgfree(search_message);
+		return STATUS_ERROR;
+	  }
+
+	entry = ldap_first_entry(ldap, search_message);
+	dn = ldap_get_dn(ldap, entry);
+	if (dn == NULL)
+	  {
+		int error;
+		r = ldap_get_option(ldap, LDAP_OPT_RESULT_CODE, &error);
+		ereport(LOG,
+			(errmsg("LDAP search failed for user \"%s\" on server \"%s\": %s",
+				port->user_name, port->hba->ldapserver, ldap_err2string(error))));
+		ldap_msgfree(search_message);
+		return STATUS_ERROR;
+	  }
+	strncpy(fulluser, dn, sizeof(fulluser));
+
+	ldap_memfree(dn);
+	ldap_msgfree(search_message);
+	  }
+	else
+	  snprintf(fulluser, sizeof(fulluser), "%s%s%s",
+		   port->hba->ldapprefix ? port->hba->ldapprefix : "",
+		   port->user_name,
+		   port->hba->ldapsuffix ? port->hba->ldapsuffix : "");
+
 	fulluser[sizeof(fulluser) - 1] = '\0';
 
 	r = ldap_simple_bind_s(ldap, fulluser, passwd);
diff -ru postgresql-8.4-8.4.0-original/src/backend/libpq/hba.c postgresql-8.4-8.4.0/src/backend/libpq/hba.c
--- postgresql-8.4-8.4.0-original/src/backend/libpq/hba.c	2009-06-24 06:39:42.0 -0700
+++ postgresql-8.4-8.4.0/src/backend/libpq/hba.c	2009-09-16 22:19:59.0 -0700
@@ -1032,6 +1032,11 @@
 	return false;
 }
 			}
+			else if (strcmp(token, "ldapbasedn") == 0)
+			{
+REQUIRE_AUTH_OPTION(uaLDAP, "ldapbasedn", "ldap");
+parsedline->ldapbasedn = pstrdup(c);
+			}
 			else if (strcmp(token, "ldapprefix") == 0)
 			{
 REQUIRE_AUTH_OPTION(uaLDAP, "ldapprefix", "ldap");
diff -ru postgresql-8.4-8.4.0-original/src/include/libpq/hba.h postgresql-8.4-8.4.0/src/include/libpq/hba.h
--- postgresql-8.4-8.4.0-original/src/include/libpq/hba.h	2009-06-11 07:49:11.0 -0700
+++ postgresql-8.4-8.4.0/src/include/libpq/hba.h	2009-09-16 22:20:07.0 -0700
@@ -53,6 +53,7 @@
 	bool		ldaptls;
 	char	   *ldapserver;
 	int			ldapport;
+	char	   *ldapbasedn;
 	char	   *ldapprefix;
 	char	   *ldapsuffix;
 	bool		clientcert;

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Feedback on getting rid of VACUUM FULL

2009-09-17 Thread Simon Riggs

On Thu, 2009-09-17 at 11:25 -0400, Robert Haas wrote:
> On Thu, Sep 17, 2009 at 10:21 AM, Tom Lane  wrote:
> > Hannu Krosing  writes:
> >> On Wed, 2009-09-16 at 21:19 -0400, Tom Lane wrote:
> >>> VACUUM FULL CONCURRENTLY is a contradiction in terms.  Wishing it were
> >>> possible doesn't make it so.
> >
> >> It depends on what you mean by "VACUUM FULL"
> >
> > Anything that moves tuples is not acceptable as a hidden background
> > operation, because it will break applications that depend on CTID.
> 
> I'm a bit confused.  CTIDs change all the time anyway, whenever you
> update the table.  What could someone possibly be using them for?

This part of the thread is somewhat strange. I don't think anybody was
suggesting the thing that Tom has assumed was meant, so how that chimera
would work isn't important. So, moving on...

The update utility being discussed is in danger of confusing these two
goals
* compact the table using minimal workspace
* compact the table with minimal interruption to concurrent updaters

We really *need* it to do the first for when emergencies arrive, but
most of the time we'd like it to do the second one. They aren't
necessarily the same thing and I don't want us to forget the "using
minimal workspace" requirement because the other one sounds so juicy.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Feedback on getting rid of VACUUM FULL

2009-09-17 Thread Simon Riggs

On Thu, 2009-09-17 at 10:45 -0400, Tom Lane wrote:
> Hannu Krosing  writes:
> > On Thu, 2009-09-17 at 09:45 -0400, Robert Haas wrote:
> >> Making the code more complicated so that it's easier to tune something
> >> that isn't very hard to tune anyway doesn't seem like a good
> >> trade-off.
> 
> > I think that just making sure that pessimal cases don't happen should be
> > enough, maybe just check for too-much-time-in-transaction after each N
> > pages touched.
> 
> If people think that a runtime limit is the most natural way to control
> this, I don't see a reason not to do it that way.  I would envision
> checking the elapsed time once per page or few pages; shouldn't be a
> huge amount of effort or complication ...

Yes, I think time is the most natural way. Currently, VACUUM provides an
effective maximum impact time, since it locks only one block at any one
time and therefore limits how long users need to wait for it. We need a
way to specify the maximum time we are prepared for an update/delete
transaction to wait when this utility runs (in ms). That way we can
easily assess the impact on transactional systems.

-- 
 Simon Riggs   www.2ndQuadrant.com




Re: [HACKERS] Feedback on getting rid of VACUUM FULL

2009-09-17 Thread Robert Haas
On Thu, Sep 17, 2009 at 10:21 AM, Tom Lane  wrote:
> Hannu Krosing  writes:
>> On Wed, 2009-09-16 at 21:19 -0400, Tom Lane wrote:
>>> VACUUM FULL CONCURRENTLY is a contradiction in terms.  Wishing it were
>>> possible doesn't make it so.
>
>> It depends on what you mean by "VACUUM FULL"
>
> Anything that moves tuples is not acceptable as a hidden background
> operation, because it will break applications that depend on CTID.

I'm a bit confused.  CTIDs change all the time anyway, whenever you
update the table.  What could someone possibly be using them for?

...Robert



[HACKERS] FSM search modes

2009-09-17 Thread Simon Riggs

Just been looking again at the way FSM works. In fsm_search_avail() we
essentially have just a single way for working out how to search the
tree.

Seems like it would be good to abstract this so that we can implement a
number of FSM search strategies:

* (current) randomize - page selection encourages different backends to
access different blocks, thus reducing block contention

* cluster - page selection favors the block with free space nearest the
current block; we prefer keep-in-sequence over avoid-contention

* compact - page selection specifically attempts to find the lowest
numbered blocks, so that the table will naturally shrink over time.

These are not all mutually exclusive; suggested combinations would be:

randomize, randomize | cluster, randomize | compact

So we don't give up the load spreading behaviour, we just apply the
logic at lower levels of the tree only.
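To make the distinction between the three strategies concrete, here is a toy Python model of a search over per-block free space. It is not the real fsm_search_avail() (which walks the FSM binary tree in C); the names FSM_RANDOMIZE, FSM_CLUSTER, and FSM_COMPACT are purely illustrative:

```python
import random

# Hypothetical strategy flags, combinable as described above.
FSM_RANDOMIZE = 1
FSM_CLUSTER = 2
FSM_COMPACT = 4

def search_avail(free_space, needed, strategy, current_block=0):
    """Return a block number with at least `needed` free bytes, or None."""
    candidates = [blk for blk, avail in enumerate(free_space)
                  if avail >= needed]
    if not candidates:
        return None
    if strategy & FSM_COMPACT:
        # Lowest block number wins, so the table can shrink over time.
        return min(candidates)
    if strategy & FSM_CLUSTER:
        # Prefer the block nearest to where we are already writing.
        return min(candidates, key=lambda blk: abs(blk - current_block))
    # Default: spread backends across blocks to reduce contention.
    return random.choice(candidates)

free = [0, 400, 0, 900, 900]          # free bytes per block
compact_pick = search_avail(free, 300, FSM_COMPACT)       # block 1
cluster_pick = search_avail(free, 500, FSM_CLUSTER, current_block=4)
```

In the real FSM the combined modes (e.g. randomize | compact) would keep the randomizing descent at the upper tree levels and apply the biased choice only near the leaves, as noted above.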

VACUUM could set the FSM into FSMstrategy = compact when it notices that
most of the free blocks are at the lower end of the table. Or it could be
set explicitly by the VACUUM FULL replacement utility.

FSMstrategy = cluster would be the default if clustering is enabled on a
table.

FSMstrategy can change via ALTER TABLE ... WITH (fsm_strategy = ...)

-- 
 Simon Riggs   www.2ndQuadrant.com




Re: [HACKERS] generic copy options

2009-09-17 Thread Emmanuel Cecchet

Tom Lane wrote:

I wonder though if we couldn't simplify matters. Offhand it seems to me
that psql doesn't need to validate the command's syntax fully.  All it
really needs to do is find the target filename and replace it with
STDIN/STDOUT.  Could we have it just treat the remainder of the line
literally, and not worry about the details of what the options might be?
Let the backend worry about throwing an error if they're bad.
  

New version with the simplified psql. It still supports the 7.3 syntax.
I am going to look into porting the other COPY enhancements (error
logging and autopartitioning) onto this implementation. We might come up
with new ideas for the documentation side of things with more options.
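Tom's suggested simplification — locate the file name, swap in STDIN/STDOUT, and pass the rest of the line to the server verbatim — can be sketched roughly as follows. This is a toy Python model, not the actual psql C code, and it only handles the quoted-filename form:

```python
import re

def rewrite_copy(cmd):
    """Replace the client-side file name with STDIN/STDOUT and return
    (server_command, filename, direction).  Everything after the file
    name is passed through untouched, so the client never needs to
    validate the option list; the backend reports any bad options."""
    m = re.match(r"(?is)^\s*copy\s+(.*?)\s+(from|to)\s+'([^']*)'(.*)$", cmd)
    if not m:
        return cmd, None, None          # not a form we rewrite
    table, direction, fname, rest = m.groups()
    placeholder = "STDIN" if direction.lower() == "from" else "STDOUT"
    server_cmd = f"COPY {table} {direction.upper()} {placeholder}{rest}"
    return server_cmd, fname, direction.lower()

server_cmd, fname, direction = rewrite_copy(
    "copy x from 'data.csv' (delimiter ',', null 'x')")
```

The point of the design is visible here: the option list after the file name is opaque to the client, so new server-side COPY options need no psql changes.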


manu

--
Emmanuel Cecchet
Aster Data Systems
Web: http://www.asterdata.com

### Eclipse Workspace Patch 1.0
#P Postgres8.5-COPY
Index: src/test/regress/sql/copy2.sql
===
RCS file: /home/manu/cvsrepo/pgsql/src/test/regress/sql/copy2.sql,v
retrieving revision 1.18
diff -u -r1.18 copy2.sql
--- src/test/regress/sql/copy2.sql  25 Jul 2009 00:07:14 -  1.18
+++ src/test/regress/sql/copy2.sql  17 Sep 2009 15:04:06 -
@@ -73,17 +73,17 @@
 \.
 
 -- various COPY options: delimiters, oids, NULL string
-COPY x (b, c, d, e) from stdin with oids delimiter ',' null 'x';
+COPY x (b, c, d, e) from stdin (oids, delimiter ',', null 'x');
 50,x,45,80,90
 51,x,\x,\\x,\\\x
 52,x,\,,\\\,,\\
 \.
 
-COPY x from stdin WITH DELIMITER AS ';' NULL AS '';
+COPY x from stdin (DELIMITER ';', NULL '');
 3000;;c;;
 \.
 
-COPY x from stdin WITH DELIMITER AS ':' NULL AS E'\\X';
+COPY x from stdin (DELIMITER ':', NULL E'\\X');
 4000:\X:C:\X:\X
 4001:1:empty::
 4002:2:null:\X:\X
@@ -108,13 +108,13 @@
 INSERT INTO no_oids (a, b) VALUES (20, 30);
 
 -- should fail
-COPY no_oids FROM stdin WITH OIDS;
-COPY no_oids TO stdout WITH OIDS;
+COPY no_oids FROM stdin (OIDS);
+COPY no_oids TO stdout (OIDS);
 
 -- check copy out
 COPY x TO stdout;
 COPY x (c, e) TO stdout;
-COPY x (b, e) TO stdout WITH NULL 'I''m null';
+COPY x (b, e) TO stdout (NULL 'I''m null');
 
 CREATE TEMP TABLE y (
col1 text,
@@ -130,11 +130,23 @@
 COPY y TO stdout WITH CSV FORCE QUOTE col2 ESCAPE E'\\';
 COPY y TO stdout WITH CSV FORCE QUOTE *;
 
+-- Test new 8.5 syntax
+
+COPY y TO stdout (CSV);
+COPY y TO stdout (CSV, CSV_QUOTE , DELIMITER '|');
+COPY y TO stdout (CSV, CSV_FORCE_QUOTE (col2), CSV_ESCAPE E'\\');
+COPY y TO stdout (CSV, CSV_FORCE_QUOTE *);
+
+\COPY y TO stdout (CSV)
+\COPY y TO stdout (CSV, CSV_QUOTE , DELIMITER '|')
+\COPY y TO stdout (CSV, CSV_FORCE_QUOTE (col2), CSV_ESCAPE E'\\')
+\COPY y TO stdout (CSV, CSV_FORCE_QUOTE *)
+
 --test that we read consecutive LFs properly
 
 CREATE TEMP TABLE testnl (a int, b text, c int);
 
-COPY testnl FROM stdin CSV;
+COPY testnl FROM stdin (CSV);
 1,"a field with two LFs
 
 inside",2
@@ -143,14 +155,14 @@
 -- test end of copy marker
 CREATE TEMP TABLE testeoc (a text);
 
-COPY testeoc FROM stdin CSV;
+COPY testeoc FROM stdin (CSV);
 a\.
 \.b
 c\.d
 "\."
 \.
 
-COPY testeoc TO stdout CSV;
+COPY testeoc TO stdout (CSV);
 
 DROP TABLE x, y;
 DROP FUNCTION fn_x_before();
Index: src/test/regress/sql/aggregates.sql
===
RCS file: /home/manu/cvsrepo/pgsql/src/test/regress/sql/aggregates.sql,v
retrieving revision 1.15
diff -u -r1.15 aggregates.sql
--- src/test/regress/sql/aggregates.sql 25 Apr 2009 16:44:56 -  1.15
+++ src/test/regress/sql/aggregates.sql 17 Sep 2009 15:04:06 -
@@ -104,7 +104,7 @@
   BIT_OR(i4)  AS "?"
 FROM bitwise_test;
 
-COPY bitwise_test FROM STDIN NULL 'null';
+COPY bitwise_test FROM STDIN (NULL 'null');
 1  1   1   1   1   B0101
 3  3   3   null2   B0100
 7  7   7   3   4   B1100
@@ -171,7 +171,7 @@
   BOOL_OR(b3)AS "n"
 FROM bool_test;
 
-COPY bool_test FROM STDIN NULL 'null';
+COPY bool_test FROM STDIN (NULL 'null');
 TRUE   nullFALSE   null
 FALSE  TRUEnullnull
 null   TRUEFALSE   null
Index: src/test/regress/sql/copyselect.sql
===
RCS file: /home/manu/cvsrepo/pgsql/src/test/regress/sql/copyselect.sql,v
retrieving revision 1.2
diff -u -r1.2 copyselect.sql
--- src/test/regress/sql/copyselect.sql 7 Aug 2008 01:11:52 -   1.2
+++ src/test/regress/sql/copyselect.sql 17 Sep 2009 15:04:06 -
@@ -61,7 +61,7 @@
 --
 -- Test headers, CSV and quotes
 --
-copy (select t from test1 where id = 1) to stdout csv header force quote t;
-copy (select t from test1 where id = 1) to stdout csv header force quote t;
+copy (select t from test1 where id = 1) to stdout (csv, csv_header, csv_force_quote (t));
 --
 -- Test psql builtins, plain table
 --
Index: src/test/regress/expected/aggregates.out
===
RCS file: /home/manu/cvsrepo/pgsql/src/test/regress/expected/aggregates.out,v
retrieving revision 1.19
diff 

Re: [HACKERS] generic copy options

2009-09-17 Thread Tom Lane
Emmanuel Cecchet  writes:
> Does that mean that we can drop the 7.3 syntax or should we just keep it?

I wouldn't object to dropping the 7.3 syntax now, especially if we're
about to introduce a new syntax and deprecate the old one ...

regards, tom lane



Re: [HACKERS] Feedback on getting rid of VACUUM FULL

2009-09-17 Thread Tom Lane
Hannu Krosing  writes:
> On Thu, 2009-09-17 at 09:45 -0400, Robert Haas wrote:
>> Making the code more complicated so that it's easier to tune something
>> that isn't very hard to tune anyway doesn't seem like a good
>> trade-off.

> I think that just making sure that pessimal cases don't happen should be
> enough, maybe just check for too-much-time-in-transaction after each N
> pages touched.

If people think that a runtime limit is the most natural way to control
this, I don't see a reason not to do it that way.  I would envision
checking the elapsed time once per page or few pages; shouldn't be a
huge amount of effort or complication ...
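The check Tom envisions — touch pages freely, but glance at the clock only once per page or few pages — can be sketched as a small Python loop. This is a hedged illustration of the bookkeeping, not the proposed backend code; the names and the interval of 8 pages are invented for the example:

```python
import time

CHECK_INTERVAL = 8          # re-check the clock every N pages (illustrative)

def compact_pages(pages, process_page, max_seconds):
    """Process pages until done or the time budget runs out; return the
    number of pages handled.  Checking time.monotonic() only every few
    pages keeps the per-page overhead negligible, and stopping between
    pages means we never abandon a half-moved tuple."""
    start = time.monotonic()
    done = 0
    for page in pages:
        process_page(page)
        done += 1
        if done % CHECK_INTERVAL == 0 and \
                time.monotonic() - start > max_seconds:
            break           # stop cleanly at a page boundary
    return done

handled = compact_pages(range(100), lambda p: None, max_seconds=5.0)
```

Because the clock is consulted only at page boundaries, the limit is approximate: the loop can overrun by at most the cost of CHECK_INTERVAL pages, which is the trade-off being accepted here.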

regards, tom lane



Re: [HACKERS] generic copy options

2009-09-17 Thread Emmanuel Cecchet

Tom Lane wrote:

Emmanuel Cecchet  writes:
  

Tom Lane wrote:


psql has MORE need to support old syntax than the backend does, because
it's supposed to work against old servers.

  

Well, I wonder how many users just upgrade psql vs upgrade the server.



We have established a project policy that psql backslash commands will
support servers at least back to 7.4, and a great deal of work has
already been expended in support of that goal.  It is not within the
charter of this patch to ignore or redefine that policy.
  

Does that mean that we can drop the 7.3 syntax or should we just keep it?
For future references, where can I find the various project policies?

Thanks
Emmanuel

--
Emmanuel Cecchet
Aster Data Systems
Web: http://www.asterdata.com




Re: [HACKERS] Feedback on getting rid of VACUUM FULL

2009-09-17 Thread Tom Lane
Hannu Krosing  writes:
> On Thu, 2009-09-17 at 10:21 -0400, Tom Lane wrote:
>> because it will break applications that depend on CTID.

> Do you know of any such applications out in the wild ?

Yes, they're out there.

regards, tom lane



Re: [HACKERS] Feedback on getting rid of VACUUM FULL

2009-09-17 Thread Hannu Krosing
On Thu, 2009-09-17 at 10:21 -0400, Tom Lane wrote:
> Hannu Krosing  writes:
> > On Wed, 2009-09-16 at 21:19 -0400, Tom Lane wrote:
> >> VACUUM FULL CONCURRENTLY is a contradiction in terms.  Wishing it were
> >> possible doesn't make it so.
> 
> > It depends on what you mean by "VACUUM FULL"
> 
> Anything that moves tuples is not acceptable as a hidden background
> operation, 

I did not mean VACUUM FULL to be run as a hidden background operation,
just as something that does not need everything else to be shut down.

> because it will break applications that depend on CTID.

Do you know of any such applications out in the wild ?

> The utility Heikki is talking about is something that DBAs would
> invoke explicitly, presumably with an understanding of the side effects.

Like VACUUM FULL ?

-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training





Re: [HACKERS] Feedback on getting rid of VACUUM FULL

2009-09-17 Thread Hannu Krosing
On Thu, 2009-09-17 at 09:45 -0400, Robert Haas wrote:
> On Thu, Sep 17, 2009 at 9:35 AM, Joshua Tolley  wrote:
> > On Wed, Sep 16, 2009 at 09:48:20PM -0400, Tom Lane wrote:
> >> Seems like there would
> >> be lots of situations where short exclusive-lock intervals could be
> >> tolerated, even though not long ones.  So that's another argument
> >> for being able to set an upper bound on how many tuples get moved
> >> per call.
> >
> > Presumably this couldn't easily be an upper bound on the time spent moving
> > tuples, rather than an upper bound on the number of tuples moved?
> 
> It's probably not worth it.  There shouldn't be a tremendous amount of
> variability in how long it takes to move N tuples, so it's just a
> matter of finding the right value of N for your system and workload.

If you have already found the free space and the tuples to move, and
both are evenly distributed, then it should take more or less the same
time to move them.

If you have yet to find the tuples one by one, and then place them in
small free slots on pages far apart, it takes significantly longer
than just moving full pages.

Also, the associated index updates can take very different amounts of
time, especially for huge indexes, where you may not only end up doing
lots of page splits but may also need to read in large sets of pages
from disk.

> Making the code more complicated so that it's easier to tune something
> that isn't very hard to tune anyway doesn't seem like a good
> trade-off.

I think that just making sure that pessimal cases don't happen should be
enough, maybe just check for too-much-time-in-transaction after each N
pages touched.

-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training





Re: [HACKERS] Feedback on getting rid of VACUUM FULL

2009-09-17 Thread Tom Lane
Hannu Krosing  writes:
> On Wed, 2009-09-16 at 21:19 -0400, Tom Lane wrote:
>> VACUUM FULL CONCURRENTLY is a contradiction in terms.  Wishing it were
>> possible doesn't make it so.

> It depends on what you mean by "VACUUM FULL"

Anything that moves tuples is not acceptable as a hidden background
operation, because it will break applications that depend on CTID.

The utility Heikki is talking about is something that DBAs would
invoke explicitly, presumably with an understanding of the side effects.

regards, tom lane



Re: [HACKERS] generic copy options

2009-09-17 Thread Tom Lane
Emmanuel Cecchet  writes:
> Tom Lane wrote:
>> psql has MORE need to support old syntax than the backend does, because
>> it's supposed to work against old servers.
>> 
> Well, I wonder how many users just upgrade psql vs upgrade the server.

We have established a project policy that psql backslash commands will
support servers at least back to 7.4, and a great deal of work has
already been expended in support of that goal.  It is not within the
charter of this patch to ignore or redefine that policy.

regards, tom lane



Re: [HACKERS] generic copy options

2009-09-17 Thread Pavel Stehule
2009/9/17 Emmanuel Cecchet :
> Pavel Stehule wrote:
>>>
>>> Well, I wonder how many users just upgrade psql vs upgrade the server. I
>>> was
>>> thinking that when users perform a database upgrade their application
>>> often
>>> remain the same and therefore the server needs to support the old syntax.
>>> Unless you are upgrading a machine where a bunch of psql-based scripts
>>> are
>>> running to update various remote Postgres instances with older versions,
>>> I
>>> would guess that it is unlikely that someone is going to upgrade psql and
>>> keep the old instance of the server on the same machine.
>>> I just wonder how many users are using a single psql to manage multiple
>>> server instances of different older versions.
>>>
>>
>> What about the applications that use the current COPY format for fast
>> data import? I think making incompatible changes to the COPY statement
>> syntax is a very bad idea.
>>
>
> The old syntax is still supported in both psql and the server but I am not
> sure how many applications are relying on psql to perform a copy operation
> (actually a \copy).

Who knows. \copy is a very useful thing, and people who import data
from local files use it. I am sure this feature is often used, mainly
by Unix DBAs.

regards
Pavel

>
> manu
>



Re: [HACKERS] Feedback on getting rid of VACUUM FULL

2009-09-17 Thread Robert Haas
On Thu, Sep 17, 2009 at 9:35 AM, Joshua Tolley  wrote:
> On Wed, Sep 16, 2009 at 09:48:20PM -0400, Tom Lane wrote:
>> Seems like there would
>> be lots of situations where short exclusive-lock intervals could be
>> tolerated, even though not long ones.  So that's another argument
>> for being able to set an upper bound on how many tuples get moved
>> per call.
>
> Presumably this couldn't easily be an upper bound on the time spent moving
> tuples, rather than an upper bound on the number of tuples moved?

It's probably not worth it.  There shouldn't be a tremendous amount of
variability in how long it takes to move N tuples, so it's just a
matter of finding the right value of N for your system and workload.
Making the code more complicated so that it's easier to tune something
that isn't very hard to tune anyway doesn't seem like a good
trade-off.

(Plus, of course, you can't stop in the middle: so you'd end up moving
a few tuples and then trying to estimate whether you had enough time
left to move a few more...  and maybe being wrong... blech.)

...Robert



Re: [HACKERS] generic copy options

2009-09-17 Thread Kevin Grittner
Emmanuel Cecchet  wrote:
 
> I just wonder how many users are using a single psql to manage
> multiple  server instances of different older versions.
 
I do that, but I do try to keep all the active versions on my machine,
so that I can use one which exactly matches any of our 100 servers
when it matters.  (Or I can ssh to the server and use its psql.)
 
-Kevin



Re: [HACKERS] Feedback on getting rid of VACUUM FULL

2009-09-17 Thread Joshua Tolley
On Wed, Sep 16, 2009 at 09:48:20PM -0400, Tom Lane wrote:
> Seems like there would
> be lots of situations where short exclusive-lock intervals could be
> tolerated, even though not long ones.  So that's another argument
> for being able to set an upper bound on how many tuples get moved
> per call.

Presumably this couldn't easily be an upper bound on the time spent moving
tuples, rather than an upper bound on the number of tuples moved?

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com


signature.asc
Description: Digital signature


Re: [HACKERS] Linux LSB init script

2009-09-17 Thread Robert Haas
On Thu, Sep 17, 2009 at 1:30 AM, Peter Eisentraut  wrote:
> Well, in such cases it may be useful to add an option such as --oknodo
> to select the idempotent behavior.

It took me about 60 seconds to figure out what I thought you were
going for there, so I submit that's not a good choice of option name.

> Yeah, except that part of the spec is hilariously unrealistic.  And

But since when do we worry about such things?  :-)

> I would pretty much remove all of that completely and replace it with a
> link to the spec.

FWIW, the comments Kevin quoted seem reasonable to me.

...Robert



Re: [HACKERS] generic copy options

2009-09-17 Thread Emmanuel Cecchet

Pavel Stehule wrote:

Well, I wonder how many users just upgrade psql vs upgrade the server. I was
thinking that when users perform a database upgrade their application often
remain the same and therefore the server needs to support the old syntax.
Unless you are upgrading a machine where a bunch of psql-based scripts are
running to update various remote Postgres instances with older versions, I
would guess that it is unlikely that someone is going to upgrade psql and
keep the old instance of the server on the same machine.
I just wonder how many users are using a single psql to manage multiple
server instances of different older versions.



What about the applications that use the current COPY format for fast
data import? I think making incompatible changes to the COPY statement
syntax is a very bad idea.
The old syntax is still supported in both psql and the server but I am 
not sure how many applications are relying on psql to perform a copy 
operation (actually a \copy).


manu


