Re: [HACKERS] Configuring synchronous replication

2010-09-20 Thread Simon Riggs
On Mon, 2010-09-20 at 22:42 +0100, Thom Brown wrote:
> On 20 September 2010 22:14, Robert Haas  wrote:
> > Well, if you need to talk to "all the other standbys" and see who has
> > the furthest-advanced xlog pointer, it seems like you have to have a
> > list somewhere of who they all are.
> 
> When they connect to the master to get the stream, don't they in
> effect, already talk to the primary with the XLogRecPtr being relayed?
>  Can the connection IP, port, XLogRecPtr and request time of the
> standby be stored from this communication to track the states of each
> standby?  They would in effect be registering upon WAL stream
> request... and no doubt this is a horrifically naive view of how it
> works.

It's not viable to record information at the chunk level in that way.

But the overall idea is fine. We can track who was connected and how to
access their LSNs. They don't need to be registered ahead of time on the
master to do that. They can register and deregister each time they
connect.

This discussion is reminiscent of the discussion we had when Fujii first
suggested that the standby should connect to the master. At first I
thought "don't be stupid, the master needs to connect to the standby!".
It stood everything I had thought about on its head and that hurt, but
there was no logical reason to oppose. We could have used standby
registration on the master to handle that, but we didn't. I'm happy that
we have a more flexible system as a result.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Any reason why the default_with_oids GUC is still there?

2010-09-20 Thread Heikki Linnakangas

On 21/09/10 04:18, Josh Berkus wrote:

... or did we just forget to remove it?


Backwards-compatibility? ;-) There hasn't been any pressing reason to 
remove it.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Pg_upgrade performance

2010-09-20 Thread Mark Kirkwood

On 21/09/10 16:14, Mark Kirkwood wrote:
I've been having a look at this guy, trying to get a handle on how 
much down time it will save.


As a quick check, I tried upgrading a cluster with one non-default db 
containing a scale-100 pgbench schema:


- pg_upgrade : 57 s
- pgdump/pg_restore : 154 s

So, a reasonable saving all up - but I guess still a sizable chunk of 
downtime for a big database, since the user relation files must be copied.


I notice there is a "link" option that would be quicker I guess - 
would it make sense to have a "move" option too? (perhaps with 
pg_upgrade writing an "un-move" script to move them back just in case).


Replying to this - looking more carefully at what the --link option 
does, it is clear that this is in fact covered. Sorry for the (my) 
confusion. For completeness, with this option the upgrade is 
substantially faster:


- pg_upgrade (link):   12 s

regards

Mark





Re: [HACKERS] .gitignore files, take two

2010-09-20 Thread Tom Lane
Robert Haas  writes:
> I suppose you already know my votes, but here they are again just in case.
> ...
> Centralize.
> ...
> All the build products in a normal build.

I don't understand your preference for this together with a centralized
ignore file.  That will be completely unmaintainable IMNSHO.  A
centralized file would work all right if it's limited to the couple
dozen files that are currently listed in .cvsignore's, but I can't see
doing it that way if it has to list every executable and .so built
anywhere in the tree.  You'd get merge conflicts from
completely-unrelated patches, not to mention the fundamental
action-at-a-distance nastiness of a top-level file that knows about
everything going on in every part of the tree.

To put it another way: would you expect anyone to take it seriously
if you proposed moving all the "make clean" rules into the top-level
Makefile?  That's pretty much exactly what this would be.

regards, tom lane



Re: [HACKERS] .gitignore files, take two

2010-09-20 Thread Robert Haas
I suppose you already know my votes, but here they are again just in case.

On Tue, Sep 21, 2010 at 12:00 AM, Tom Lane  wrote:
> 1. Whether to keep the per-subdirectory ignore files (which CVS
> insisted on, but git doesn't) or centralize in a single ignore file.

Centralize.

> 2. Whether to have the ignore files ignore common cruft such as
> editor backup files, or only "expected" build product files.

I don't care too much about this.  A mild preference for just the
expected build product files, but then that's heavily influenced by my
choice of editor, which doesn't leave such cruft around permanently.

> 3. What are the ignore filesets *for*, in particular should they list
> just the derived files expected in a distribution tarball, or all the
> files in the set of build products in a normal build?

All the build products in a normal build.  One of the infelicities of
git is that 'git status' shows the untracked files at the bottom.  So
if you have lots of unignored stuff floating around, the information
about which files you've actually changed or added to the index
scrolls right off the screen.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] libpq changes for synchronous replication

2010-09-20 Thread Fujii Masao
On Mon, Sep 20, 2010 at 11:55 PM, Heikki Linnakangas
 wrote:
> It doesn't feel right to always accept PQputCopyData in COPY OUT mode,
> though. IMHO there should be a new COPY IN+OUT mode.
>
> It should be pretty safe to add a CopyInOutResponse message to the protocol
> without a protocol version bump. Thoughts on that?

Or we could check the "replication" field in PGconn, and accept PQputCopyData
in COPY OUT mode only if it indicates TRUE. This is much simpler, but maybe
less versatile...

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] Shutting down server from a backend process, e.g. walreceiver

2010-09-20 Thread Fujii Masao
On Tue, Sep 21, 2010 at 9:48 AM, fazool mein  wrote:
> Hi,
>
> I want to shut down the server under certain conditions that can be checked
> inside a backend process. For instance, while running symmetric replication,
> if the primary dies, I want the walreceiver to detect that and shut down
> the standby. The reason for shutdown is that I want to execute some other
> stuff before I start the standby as a primary. Creating a trigger file
> doesn't help as it converts the standby into primary at run time.
>
> Using proc_exit() inside walreceiver only terminates the walreceiver
> process, which postgres starts again. The other way I see is using
> ereport(PANIC, ...). Is there some other way to shutdown the main server
> from within a backend process?

Are you going to change the source code? If yes, you might be able to
do that by making walreceiver send the shutdown signal to postmaster.
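For the change-the-source route, the core of the idea is simply "signal the postmaster" (inside the server that would be a kill() on the postmaster's PID). As an external illustration only, here is a minimal Python sketch that does the same from outside the server, assuming the standard postmaster.pid layout (first line holds the PID); the helper names are invented for this example:

```python
import os
import signal

def read_postmaster_pid(pidfile_text):
    # The first line of postmaster.pid is the postmaster's PID.
    return int(pidfile_text.splitlines()[0].strip())

def request_shutdown(data_dir, sig=signal.SIGTERM):
    # SIGTERM requests a "smart" shutdown; SIGINT would request "fast".
    with open(os.path.join(data_dir, "postmaster.pid")) as f:
        pid = read_postmaster_pid(f.read())
    os.kill(pid, sig)
```

This is a sketch of the signaling pattern, not the in-backend mechanism the thread is actually discussing.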

If not, I think a straightforward approach is to use clusterware such as
Pacemaker. That is, you make the clusterware periodically check the master
and cause the standby to shut down when it detects that the master has
crashed.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] Path question

2010-09-20 Thread David Fetter
On Mon, Sep 20, 2010 at 10:57:00PM -0400, Robert Haas wrote:
> 2010/9/3 Hans-Jürgen Schönig :
> > On Sep 2, 2010, at 1:20 AM, Robert Haas wrote:
> >> I agree. Explicit partitioning may open up some additional
> >> optimization possibilities in certain cases, but Merge Append is
> >> more general and extremely valuable in its own right.
> >
> > we have revised greg's wonderful work and ported the entire thing
> > to head.  it solves the problem of merge_append. i did some
> > testing earlier on today and it seems most important cases are
> > working nicely.
> 
> First, thanks for merging this up to HEAD.  I took a look through
> this patch tonight, and the previous reviews thereof that I was able
> to find, most notably Tom's detailed review on 2009-07-26.  I'm not
> sure whether or not it's accidental that this didn't get added to
> the CF,

It's because I missed putting it in, an oversight I've corrected.  If
we need to bounce it on to the next one, them's the breaks.

> [points elided]
> 
> 7. I think there's some basic code cleanup needed here, also: comment
> formatting, variable naming, etc.

Hans-Jürgen,

Will you be able to get to this in the next couple of days?

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



[HACKERS] Pg_upgrade performance

2010-09-20 Thread Mark Kirkwood
I've been having a look at this guy, trying to get a handle on how much 
down time it will save.


As a quick check, I tried upgrading a cluster with one non-default db 
containing a scale-100 pgbench schema:


- pg_upgrade : 57 s
- pgdump/pg_restore : 154 s

So, a reasonable saving all up - but I guess still a sizable chunk of 
downtime for a big database, since the user relation files must be copied.


I notice there is a "link" option that would be quicker I guess - would 
it make sense to have a "move" option too? (perhaps with pg_upgrade 
writing an "un-move" script to move them back just in case).


Regards

Mark



[HACKERS] .gitignore files, take two

2010-09-20 Thread Tom Lane
Back here I asked what we were going to do about .gitignore files:
http://archives.postgresql.org/pgsql-hackers/2010-08/msg01232.php
The thread died off when the first git conversion attempt crashed and
burned; but not before it became apparent that we didn't have much
consensus.  It seemed that there was lack of agreement as to:

1. Whether to keep the per-subdirectory ignore files (which CVS
insisted on, but git doesn't) or centralize in a single ignore file.

2. Whether to have the ignore files ignore common cruft such as
editor backup files, or only "expected" build product files.

It was pointed out that exclusion rules could be configured locally
to one's own repository, so one possible answer to issue #2 is to
have only a minimal ignore-set embodied in .gitignore files, and let
people who prefer to ignore more stuff set that up in local preferences.

Although this point wasn't really brought up during that thread, it's
also the case that the existing implementation is far from consistent
about ignoring build products.  We really only have .cvsignore entries
for files that are not in CVS but are meant to be present in
distribution tarballs.  CVS will, of its own accord, ignore certain
build products such as .o files; but it doesn't ignore executables for
instance.  So unless you do a "make distclean" before "cvs update",
you will get notices about non-ignored files.  That never bothered me
particularly but I believe it annoys some other folks.  So really there
is a third area of disagreement:

3. What are the ignore filesets *for*, in particular should they list
just the derived files expected in a distribution tarball, or all the
files in the set of build products in a normal build?

We need to get some consensus on this now.  Comments?

regards, tom lane



Re: [HACKERS] Git conversion status

2010-09-20 Thread Tom Lane
Magnus Hagander  writes:
> Ok, I've pushed a new repository to both gitmaster and the
> postgresql-migration.git mirror, that has this setting.
> NOTE! Do a complete wipe of your repository before you clone this
> again - it's a completely new repo that will have different SHA1s.

AFAICT this version is good: it passes comparisons against all the
historical tarballs I have, as well as against my checked-out copies of
branch tips.  History looks sane as best I can tell, too.  I'm ready
to sign off on this.

NOTE: Magnus told me earlier that the new repository isn't ready to
accept commits, so committers please hold your fire till he gives
the all-clear.  It looks okay to clone this and start working locally,
though.

For the archives' sake, below are the missing historical tags that
match available tarballs, plus re-instantiation of the Release_2_0
and Release_2_0_0 tags on non-manufactured commits.  I will push
these up to the repo once it's open for pushing.

regards, tom lane


git tag PG95-1_08   bf3473c468b1938f782fdcc208bd62c4b061daa3
# commit bf3473c468b1938f782fdcc208bd62c4b061daa3   refs/heads/Release_1_0_3
# Author: Marc G. Fournier 
# Date:   Fri Oct 4 20:38:49 1996 +

git tag PG95-1_09   1b5e30e615eacae651a3cd12aa6b5c44d398b919
# commit 1b5e30e615eacae651a3cd12aa6b5c44d398b919   refs/heads/Release_1_0_3
# Author: Marc G. Fournier 
# Date:   Thu Oct 31 20:25:56 1996 +

git tag REL6_1  0acf9c9b28433120ca96a3a1c03222bfe45c8932
# commit 0acf9c9b28433120ca96a3a1c03222bfe45c8932   refs/tags/release-6-3
# Author: Bruce Momjian 
# Date:   Fri Jun 13 14:08:48 1997 +

git tag REL6_1_1b6d983559a2d2a6bd0b03b7b7f59a63a4c3f4918
# commit b6d983559a2d2a6bd0b03b7b7f59a63a4c3f4918   refs/tags/release-6-3
# Author: Bruce Momjian 
# Date:   Mon Jul 21 22:29:41 1997 +

git tag REL6_2  d663f1c83944cf8934f549ff879b51364f1a60ad
# commit d663f1c83944cf8934f549ff879b51364f1a60ad   refs/tags/release-6-3
# Author: Bruce Momjian 
# Date:   Thu Oct 2 18:32:58 1997 +

git tag REL6_2_18a1a39c39079ebc26f1bb55ad1ed2a11c2d36045
# commit 8a1a39c39079ebc26f1bb55ad1ed2a11c2d36045   refs/tags/release-6-3
# Author: Bruce Momjian 
# Date:   Sat Oct 18 16:59:06 1997 +

git tag REL6_3  b1c7c31e07b9284843d85bbe71a327a1ca13be63
# commit b1c7c31e07b9284843d85bbe71a327a1ca13be63   refs/tags/release-6-3
# Author: Marc G. Fournier 
# Date:   Mon Mar 2 14:54:59 1998 +

git tag REL6_3_2b542fa1a6e838d3e32857cdfbe8aeff940a91c74
# commit b542fa1a6e838d3e32857cdfbe8aeff940a91c74   refs/tags/REL6_5
# Author: Marc G. Fournier 
# Date:   Sat Apr 18 18:32:44 1998 +

git tag REL6_4_23be6c6eb73922fb872a6251cb45cb89d8822744f
# commit 3be6c6eb73922fb872a6251cb45cb89d8822744f   refs/heads/REL6_4
# Author: Bruce Momjian 
# Date:   Sun Jan 3 06:50:17 1999 +

git tag REL6_5  275a1d054e72b35bfd98c9731e51b2961ab8dbf5
# commit 275a1d054e72b35bfd98c9731e51b2961ab8dbf5   refs/tags/REL6_5
# Author: Tom Lane 
# Date:   Mon Jun 14 17:49:06 1999 +

git tag REL6_5_1c7092a8e8fe67e556f5c7b2f1336453b2ebecbeb
# commit c7092a8e8fe67e556f5c7b2f1336453b2ebecbeb   refs/heads/REL6_5_PATCHES
# Author: Bruce Momjian 
# Date:   Mon Jul 19 05:08:23 1999 +

git tag REL6_5_2d5d33e2ee453656d607ad6b1036f0091d29de25a
# commit d5d33e2ee453656d607ad6b1036f0091d29de25a   refs/heads/REL6_5_PATCHES
# Author: Tom Lane 
# Date:   Tue Sep 14 22:33:35 1999 +

git tag REL6_5_3ef26b944b12ce52b14101512c39cf7a42ca970a6
# commit ef26b944b12ce52b14101512c39cf7a42ca970a6   refs/heads/REL6_5_PATCHES
# Author: Bruce Momjian 
# Date:   Thu Nov 4 16:22:41 1999 +

git tag REL7_0_2e261306b439e8286f8e8d7dcb6871c485df581c8
# commit e261306b439e8286f8e8d7dcb6871c485df581c8   refs/heads/REL7_0_PATCHES
# Author: Bruce Momjian 
# Date:   Mon Jun 5 17:02:27 2000 +

git tag REL7_0_36835ca629877b9470f206cbea36c21aac9cdd493
# commit 6835ca629877b9470f206cbea36c21aac9cdd493   refs/heads/REL7_0_PATCHES
# Author: Marc G. Fournier 
# Date:   Sun Nov 12 07:31:36 2000 +

git tag REL7_1  741604dd84dbbd58368a0206f73de259cb6718f4
# commit 741604dd84dbbd58368a0206f73de259cb6718f4   refs/tags/REL7_2_BETA1
# Author: Marc G. Fournier 
# Date:   Fri Apr 13 21:21:33 2001 +

git tag REL7_1_1ed6586063813cb4c9263254bb60b514cd12427e9
# commit ed6586063813cb4c9263254bb60b514cd12427e9   refs/tags/REL7_1_2
# Author: Marc G. Fournier 
# Date:   Sat May 5 20:23:57 2001 +

git tag REL7_1_20b471cc338777b84f3510b124aeaa7de75572848
# commit 0b471cc338777b84f3510b124aeaa7de75572848   refs/heads/REL7_1_STABLE
# Author: Thomas G. Lockhart 
# Date:   Tue May 22 14:46:46 2001 +

git tag REL7_1_38c78169c4a766376317b2255572820dfcc52470e
# co

Re: [HACKERS] Path question

2010-09-20 Thread Robert Haas
2010/9/3 Hans-Jürgen Schönig :
> On Sep 2, 2010, at 1:20 AM, Robert Haas wrote:
>> I agree. Explicit partitioning may open up some additional optimization 
>> possibilities in certain cases, but Merge Append is more general and 
>> extremely valuable in its own right.
>
> we have revised greg's wonderful work and ported the entire thing to head.
> it solves the problem of merge_append. i did some testing earlier on today 
> and it seems most important cases are working nicely.

First, thanks for merging this up to HEAD.  I took a look through this
patch tonight, and the previous reviews thereof that I was able to
find, most notably Tom's detailed review on 2009-07-26.  I'm not sure
whether or not it's accidental that this didn't get added to the CF,
but here's an attempt to enumerate the things that seem like they need
to be fixed.  The quotes labeled "TGL" are from the aforementioned
review by Tom.

1. The code in set_append_rel_pathlist() that accumulates the pathkeys
of all sub-paths is, as it says, and as previous discussed, O(n^2).
In a previous email on this topic, Tom suggested one possible approach
for this problem: choose the largest child relation and call it the
leader, and consider only the pathkeys for that relation rather than
all of them.  I think it would be nice if we can find a way to be a
bit smarter, though, because that won't necessarily always find the
best path.  One idea I had is to choose some arbitrary limit on how
long the all_pathkeys list is allowed to become and iterate over the
children from largest to smallest, stopping early if you hit that
limit.  But thinking about it a little more, can't we just adjust the
way we do this so that it's not O(n^2)?  It seems we're only concerned
with equality here, so what about using a hash table?  We could hash
the PathKey pointers in each list, but not the lists or listcells
obviously.
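A sketch of that hash-based idea, assuming (as in the planner) that pathkeys are canonicalized so pointer equality suffices; here a Python set stands in for a hash table keyed on PathKey pointers, and the function name is invented:

```python
def collect_pathkeys(children):
    """Accumulate the distinct pathkeys across all child rels in one pass.

    The existing list-membership scan is O(n^2) overall; hashing the
    (canonical) pathkey identities makes each membership test O(1),
    while preserving first-seen order.
    """
    seen = set()          # stand-in for a hash table of PathKey pointers
    all_pathkeys = []
    for child_pathkeys in children:
        for pk in child_pathkeys:
            if pk not in seen:
                seen.add(pk)
                all_pathkeys.append(pk)
    return all_pathkeys
```

The strings below stand in for canonical PathKey pointers shared between child paths.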

2. TGL: "you need an explicit flag to say 'we'll do a merge', not just
rely on whether the path has pathkeys."  This makes sense and doesn't
seem difficult.

3. TGL: "Speaking of sorting, it's not entirely clear to me how the
patch ensures that all the child plans produce the necessary sort keys
as output columns, and especially not how it ensures that they all get
produced in the *same* output columns.  This might accidentally manage
to work because of the "throwaway" call to make_sort_from_pathkeys(),
but at the very least that's a misleading comment."  I'm not sure what
needs to be done about this; I'm going to look at this further.

4. TGL: "In any case, I'm amazed that it's not failing regression
tests all over the place with those critical tests in
make_sort_from_pathkeys lobotomized by random #ifdef FIXMEs.  Perhaps
we need some more regression tests...".  Obviously, we need to remove
that lobotomy and insert the correct fix for whatever problem it was
trying to solve.  Adding some regression tests seems wise, too.

5. TGL: "In the same vein, the hack to 'short circuit' the append
stuff for a single child node is simply wrong, because it doesn't
allow for column number variances.  Please remove it."  This seems
like straightforward cleanup, and maybe another candidate for a
regression test.  (Actually, I notice that the patch has NO regression
tests at all, which surely can't be right for something of this type,
though admittedly since we didn't have EXPLAIN (COSTS OFF) when this
was first written it might have been difficult to write anything
meaningful at the time.)

6. The dummy call to cost_sort() seems like a crock; what that
function does doesn't seem particularly relevant to the cost of the
merge operation.  Just off the top of my head, it looks like the cost
of the merge step will be roughly O(lg n) * the cost of comparing two
tuples * the total number of tuples from all child paths.  In practice
it might be less, because once some of the paths run out of tuples the
number of comparisons will drop, I think.  But the magnitude of that
effect seems difficult to predict, and may be rather small, so perhaps
we should just ignore it.
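As a back-of-the-envelope check on that model, here is a sketch that counts actual element comparisons in a heap-based k-way merge and compares them against the total_tuples * lg(n_children) estimate; the function names are invented for the example:

```python
import heapq
import math

def estimated_merge_comparisons(total_tuples, n_children):
    # Rough model from above: O(lg n) comparisons per output tuple.
    return total_tuples * math.ceil(math.log2(max(n_children, 2)))

def counted_merge(runs):
    """Merge sorted runs with a binary heap, counting comparisons."""
    count = [0]

    class Item:
        __slots__ = ("v",)
        def __init__(self, v):
            self.v = v
        def __lt__(self, other):
            count[0] += 1           # tally each tuple comparison
            return self.v < other.v

    merged = [it.v for it in heapq.merge(*([Item(v) for v in r] for r in runs))]
    return merged, count[0]
```

In practice the observed count drops once the shorter runs are exhausted, which is exactly the hard-to-predict effect mentioned above.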

7. I think there's some basic code cleanup needed here, also: comment
formatting, variable naming, etc.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



[HACKERS] Any reason why the default_with_oids GUC is still there?

2010-09-20 Thread Josh Berkus
... or did we just forget to remove it?

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com



Re: [HACKERS] bg worker: general purpose requirements

2010-09-20 Thread Robert Haas
On Mon, Sep 20, 2010 at 1:45 PM, Markus Wanner  wrote:
> Hm.. I see. So in other words, you are saying
> min_spare_background_workers isn't flexible enough in case one has
> thousands of databases but only uses a few of them frequently.

Yes, I think that is true.

> I understand that reasoning and the wish to keep the number of GUCs as
> low as possible. I'll try to drop the min_spare_background_workers from
> the bgworker patches.

OK.  At least for me, what is important is not only how many GUCs
there are but how likely they are to require tuning and how easy it
will be to know what the appropriate value is.  It seems fairly easy
to tune the maximum number of background workers, and it doesn't seem
hard to tune an idle timeout, either.  Both of those are pretty
straightforward trade-offs between, on the one hand, consuming more
system resources, and on the other hand, better throughput and/or
latency.  On the other hand, the minimum number of workers to keep
around per-database seems hard to tune.  If performance is bad, do I
raise it or lower it?  And it's certainly not really a hard minimum
because it necessarily bumps up against the limit on overall number of
workers if the number of databases grows too large; one or the other
has to give.

I think we need to look for a way to eliminate the maximum number of
workers per database, too.  Your previous point about not wanting one
database to gobble up all the available slots makes sense, but again,
it's not obvious how to set this sensibly.  If 99% of your activity is
in one database, you might want to use all the slots for that
database, at least until there's something to do in some other
database.  I feel like the right thing here is for the number of
workers for any given database to fluctuate in some natural way that
is based on the workload.  If one database has all the activity, it
gets all the slots, at least until somebody else needs them.  Of
course, you need to design the algorithm so as to avoid starvation...

> The rest of the bgworker infrastructure should behave pretty much like
> what you have described. Parallelism in starting bgworkers could be a
nice improvement, especially if we kill the min_spare_background_workers
> mechanism.

Works for me.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



[HACKERS] Shutting down server from a backend process, e.g. walreceiver

2010-09-20 Thread fazool mein
Hi,

I want to shut down the server under certain conditions that can be checked
inside a backend process. For instance, while running symmetric replication,
if the primary dies, I want the walreceiver to detect that and shut down
the standby. The reason for shutdown is that I want to execute some other
stuff before I start the standby as a primary. Creating a trigger file
doesn't help as it converts the standby into primary at run time.

Using proc_exit() inside walreceiver only terminates the walreceiver
process, which postgres starts again. The other way I see is using
ereport(PANIC, ...). Is there some other way to shutdown the main server
from within a backend process?

Thanks.


Re: [HACKERS] Configuring synchronous replication

2010-09-20 Thread Robert Haas
On Mon, Sep 20, 2010 at 5:42 PM, Thom Brown  wrote:
> On 20 September 2010 22:14, Robert Haas  wrote:
>> Well, if you need to talk to "all the other standbys" and see who has
> the furthest-advanced xlog pointer, it seems like you have to have a
>> list somewhere of who they all are.
>
> When they connect to the master to get the stream, don't they in
> effect, already talk to the primary with the XLogRecPtr being relayed?
>  Can the connection IP, port, XLogRecPtr and request time of the
> standby be stored from this communication to track the states of each
> standby?  They would in effect be registering upon WAL stream
> request... and no doubt this is a horrifically naive view of how it
> works.

Sure, but the point is that we can want DISCONNECTED slaves to affect
master behavior in a variety of ways (master retains WAL for when they
reconnect, master waits for them to connect before acking commits,
master shuts down if they're not there, master tries to stream WAL
backwards from them before entering normal running).  I just work
here, but it seems to me that such things will be easier if the master
has an explicit notion of what's out there.  Can we make it all work
without that?  Possibly, but I think it will be harder to understand.
With standby registration, you can DECLARE the behavior you want.  You
can tell the master "replicate synchronously to Bob".  And that's it.
Without standby registration, what's being proposed is basically that
you can tell the master "replicate synchronously to one server" and
you can tell Bob "you are a server to which the master can replicate
synchronously" and you can tell the other servers "you are not a
server to which Bob can replicate synchronously".  That works, but to
me it seems less straightforward.

And that's actually a relatively simple example.  Suppose you want to
tell the master "keep enough WAL for Bob to catch up when he
reconnects, but if he gets more than 1GB behind, forget about him".
I'm sure someone can devise a way of making that work without standby
registration, too, but I'm not too sure off the top of my head what it
will be.  With standby registration, you can just write something like
this in standbys.conf (syntax invented):

[bob]
wal_keep_segments=64

I feel like that's really nice and simple.
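The invented syntax above is plain INI, so a per-standby settings file like that could be read with nothing more than a stock parser. This is only a sketch of the proposal, not an existing PostgreSQL file, and the function name is made up:

```python
import configparser

def load_standby_settings(text):
    # Each section names a standby; keys are its per-standby overrides.
    cp = configparser.ConfigParser()
    cp.read_string(text)
    return {name: dict(cp[name]) for name in cp.sections()}

sample = """\
[bob]
wal_keep_segments = 64
"""
```

With the sample above, load_standby_settings(sample) yields a per-standby dict keyed on "bob".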

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] Serializable snapshot isolation error logging

2010-09-20 Thread Kevin Grittner
Dan S  wrote:
 
> Well I guess one would like some way to find out which statements
> in the involved transactions are the cause of the serialization
> failure and what programs they reside in.
 
Unless we get the conflict list optimization added after the base
patch, you might get anywhere from one to three of the two to three
transactions involved in the serialization failure.  We can also
report the position they have in the "dangerous structure" and
mention that there are other, unidentified, transactions
participating in the conflict.  Once I get through with the issue
I'm working on based on Heikki's observations, I'll take a look at
this.
 
> Also which relations were involved, the sql-statements may contain
> many relations but just one or a few might be involved in the
> failure, right ?
 
The conflicts would have occurred on specific relations, but we
don't store all that -- it would be prohibitively expensive.  What
we track is that transaction T0's read couldn't see the write from
transaction T1.  Once you know that, SSI doesn't require that you
know which or how many relations were involved in that -- you've
established that T0 must logically come before T1.  That in itself
is no problem, of course.  But if you also establish that T1 must
come before TN (where TN might be T0 or a third transaction), you've
got a "pivot" at T1.  You're still not dead in the water yet, but if
that third logical transaction actually *commits* first, you're
probably in trouble.  The only way out is that if T0 is not TN, T0
is read only, and TN did *not* commit before T0 got its snapshot,
you're OK.
 
Where it gets complicated is that in the algorithm in the paper,
which we are following for the initial commit attempt, each
transaction keeps one "conflictIn" and one "conflictOut" pointer
for checking all this.  If you already have a conflict with one
transaction and then detect a conflict of the same type with
another, you change the conflict pointer to a self-reference --
which means you conflict with *all* other concurrent transactions
in that direction.  You also lose the ability to report all
transactions which are involved in the conflict.
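That single-pointer scheme, and the way it loses information, can be sketched as follows; Txn, record_rw_conflict, and is_pivot are invented names for illustration, not the patch's actual structures:

```python
SELF = "SELF"  # self-reference: "conflicts with all concurrent txns"

class Txn:
    def __init__(self, name):
        self.name = name
        self.conflict_in = None   # single field, per the paper's algorithm
        self.conflict_out = None

def record_rw_conflict(reader, writer):
    """reader could not see writer's write: reader logically precedes writer.

    With single-pointer fields, a second distinct conflict in the same
    direction collapses to a self-reference, and the identities of the
    other transactions are lost.
    """
    reader.conflict_out = writer if reader.conflict_out in (None, writer) else SELF
    writer.conflict_in = reader if writer.conflict_in in (None, reader) else SELF

def is_pivot(t):
    # T1 is the pivot of a dangerous structure when it has both an
    # in-conflict and an out-conflict.
    return t.conflict_in is not None and t.conflict_out is not None
```

After T0 -> T1 -> TN, T1 is a pivot; a further conflict T2 -> T1 overwrites T1's in-pointer with the self-reference, which is why per-transaction reporting is incomplete without conflict lists.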
 
> The tuples involved if available.
> 
> I don't know how helpful it would be to know the pages involved
> might be, I certainly wouldn't know what to do with that info.
 
That information would only be available on the *read* side.  We
count on MVCC data on the *write* side, and I'm not aware of any way
for a transaction to list everything it's written.  Since we're not
recording the particular points of conflict between transactions,
there's probably not a lot of point in listing it anyway -- there
might be a conflict on any number of tuples out of a great many read
or written.
 
> All this is of course to be able to guess at which statements to
> modify or change execution order of, take an explicit lock on and
> so on to reduce serialization failure rate.
 
I understand the motivation, but the best this technique is likely
to be able to provide is the transactions involved, and that's not
always going to be complete unless we convert those single-transaction
conflict fields to lists.
 
> If holding a list of the involved transactions turns out to be
> expensive, maybe one should be able to turn it on by a GUC only
> when you have a problem and need the extra information to track it
> down.
 
That might be doable.  If we're going to add such a GUC, though, it
should probably be considered a tuning GUC, with the "list" setting
recommended for debugging problems.  Of course, if you change it
from "field" to "list" the problem might disappear.  Hmmm.  Unless
we also had a "debug" setting which kept track of the list but
ignored it for purposes of detecting the dangerous structures
described above.
 
Of course, you will always know what transaction was canceled.
That does give you something to look at.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Configuring synchronous replication

2010-09-20 Thread Thom Brown
On 20 September 2010 22:14, Robert Haas  wrote:
> Well, if you need to talk to "all the other standbys" and see who has
> the furthest-advanced xlog pointer, it seems like you have to have a
> list somewhere of who they all are.

When they connect to the master to get the stream, don't they, in
effect, already talk to the primary with the XLogRecPtr being relayed?
 Can the connection IP, port, XLogRecPtr and request time of the
standby be stored from this communication to track the states of each
standby?  They would in effect be registering upon WAL stream
request... and no doubt this is a horrifically naive view of how it
works.

-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935



Re: [HACKERS] Configuring synchronous replication

2010-09-20 Thread Robert Haas
On Mon, Sep 20, 2010 at 4:10 PM, Dimitri Fontaine
 wrote:
> Robert Haas  writes:
>>   So the "wait forever" case is, in my opinion,
>> sufficient to demonstrate that we need it, but it's not even my
>> primary reason for wanting to have it.
>
> You're talking about standby registration on the master. You can solve
> this case without it, because when a slave is not connected it's not
> giving any feedback (vote, weight, ack) to the master. All you have to
> do is have the quorum setup in a way that disconnecting your slave means
> you can't reach the quorum any more. Have it SIGHUP and you can even
> choose to fix the setup, rather than fix the standby.

I suppose that could work.

>> The most important reason why I think we should have standby
>> registration is for simplicity of configuration.  Yes, it adds another
>> configuration file, but that configuration file contains ALL of the
>> information about which standbys are synchronous.  Without standby
>> registration, this information will inevitably be split between the
>> master config and the various slave configs and you'll have to look at
>> all the configurations to be certain you understand how it's going to
>> end up working.
>
> So, here, we have two quite different things to be concerned
> about. First is the configuration, and I say that managing a distributed
> setup will be easier for the DBA.

Yeah, I disagree with that, but I suppose it's a question of opinion.

> Then there's how to obtain a nice view about the distributed system,
> which again we can achieve from the master without manually registering
> the standbys. After all, the information you want needs to be there.

I think that without standby registration it will be tricky to display
information like "the last time that standby foo was connected".
Yeah, you could set a standby name on the standby server and just have
the master remember details for every standby name it's ever seen, but
then how do you prune the list?

Heikki mentioned another application for having a list of the current
standbys only (rather than "every standby that has ever existed")
upthread: you can compute the exact amount of WAL you need to keep
around.
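The computation itself is cheap once the list exists; a rough
sketch, with XLogRecPtr modeled as a plain 64-bit byte position and
the default 16 MB segment size (names invented):

```c
#include <stdint.h>

typedef uint64_t XLogRecPtr;            /* modeled as a byte position */

#define XLOG_SEG_SIZE ((XLogRecPtr) 16 * 1024 * 1024)

/* With a known list of standbys, the master only needs WAL back to
 * the furthest-behind standby's flush position; everything older
 * can be recycled. */
static uint64_t
oldest_required_segment(const XLogRecPtr *standby_flush, int nstandbys)
{
    XLogRecPtr min = standby_flush[0];
    for (int i = 1; i < nstandbys; i++)
        if (standby_flush[i] < min)
            min = standby_flush[i];
    return min / XLOG_SEG_SIZE;
}
```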

>>  As a particular manifestation of this, and as
>> previously argued and +1'd upthread, the ability to change the set of
>> standbys to which the master is replicating synchronously without
>> changing the configuration on the master or any of the existing slaves
> seems dangerous.
>
> Well, you still need to open the HBA for the new standby to be able to
> connect, and to somehow take a base backup, right? We're not exactly
> transparent there, yet, are we?

Sure, but you might have that set relatively open on a trusted network.

>> Another reason why I think we should have standby registration is to
> eventually allow the "streaming WAL backwards" configuration
>> which has previously been discussed.  IOW, you could stream the WAL to
>> the slave in advance of fsync-ing it on the master.  After a power
>> failure, the machines in the cluster can talk to each other and figure
>> out which one has the furthest-advanced WAL pointer and stream from
>> that machine to all the others.  This is an appealing configuration
>> for people using sync rep because it would allow the fsyncs to be done
>> in parallel rather than sequentially as is currently necessary - but
>> if you're using it, you're certainly not going to want the master to
>> enter normal running without waiting to hear from the slave.
>
> I love the idea.
>
> Now it seems to me that all you need here is the master sending one more
> piece of information with each WAL "segment", the currently fsync'ed position,
> which pre-9.1 is implied as being the current LSN from the stream,
> right?

I don't see how that would help you.

> Here I'm not sure I follow you in detail, but it seems to me
> registering the standbys is just another way of achieving the same. To
> be honest, I don't understand a bit how it helps implement your idea.

Well, if you need to talk to "all the other standbys" and see who has
the furthest-advanced xlog pointer, it seems like you have to have a
list somewhere of who they all are.  Maybe there's some way to get
this to work without standby registration, but I don't really
understand the resistance to the idea, and I fear it's going to do
nothing good for our reputation for ease of use (or lack thereof).
The idea of making this all work without standby registration strikes
me as akin to the notion of having someone decide whether they're
running a three-legged race by checking whether their leg is currently
tied to someone else's leg.  You can probably make that work by
patching around the various failure cases, but why isn't simpler to
just tell the poor guy "Hi, Joe.  You're running a three-legged race
with Jane today.  Hans and Juanita will be following you across the
field, too, but don't worry about whether they're keeping up."?

-- 
Robert Haas
EnterpriseDB: ht

Re: [HACKERS] Git conversion status

2010-09-20 Thread Magnus Hagander
On Mon, Sep 20, 2010 at 20:05, Magnus Hagander  wrote:
> On Mon, Sep 20, 2010 at 7:57 PM, Magnus Hagander  wrote:
>> On Mon, Sep 20, 2010 at 19:49, Tom Lane  wrote:
>>> Magnus Hagander  writes:
 On Mon, Sep 20, 2010 at 19:34, Tom Lane  wrote:
> Please fix and re-run.
>>>
 Uh, what the heck. I ran the exact same command as last time.. Hmm:
 Stefan rebooted the machine in between, I wonder if that changed
 something.
>>>
>>> I'm not sure we ever checked that.  My comparisons against the tarballs
>>> were done from my own run of the conversion script.  I'm using C locale
>>> here, probably you aren't?
>>
>> Correct, I'm in en_US. I'm trying a "cvs export" in "C" now to see exactly
>> what changes.
>> Hmm
>>
>> Nope, doesn't seem to change. I just set my LANG=C, and ran a "cvs export",
>> but it comes back with "-" in the dates, so it seems to not care about that.
>>
>> ("locale" clearly shows it's changed everything to C though)
>>
>> Is there a cvs setting for this somewhere that you know of?
>
> Think I found it.
>
> debian applies a patch to change it. If I set DateStyle=old in
> CVSROOT/config, cvs export behaves sanely. I'll re-run with that
> setting.

Ok, I've pushed a new repository to both gitmaster and the
postgresql-migration.git mirror, that has this setting.

NOTE! Do a complete wipe of your repository before you clone this
again - it's a completely new repo that will have different SHA1s.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Configuring synchronous replication

2010-09-20 Thread Dimitri Fontaine
Hi,

I'm somewhat sorry to have to play this game, as I sure don't feel
smarter by composing this email. Quite the contrary.

Robert Haas  writes:
>   So the "wait forever" case is, in my opinion,
> sufficient to demonstrate that we need it, but it's not even my
> primary reason for wanting to have it.

You're talking about standby registration on the master. You can solve
this case without it, because when a slave is not connected it's not
giving any feedback (vote, weight, ack) to the master. All you have to
do is have the quorum setup in a way that disconnecting your slave means
you can't reach the quorum any more. Have it SIGHUP and you can even
choose to fix the setup, rather than fix the standby.

So no need for registration here, it's just another way to solve the
problem. Not saying it's better or worse, just another.
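For what it's worth, the quorum test itself is trivial; a sketch
with made-up names (there is no such API today -- the point is just
that a disconnected standby contributes nothing):

```c
/* Hypothetical quorum check: each connected standby that has acked
 * contributes its configured weight; a disconnected standby
 * contributes 0, so losing a standby can make the quorum
 * unreachable, which gives the "wait" behavior described above. */
static int
quorum_reached(const int *ack_weight, int nstandbys, int quorum)
{
    int total = 0;
    for (int i = 0; i < nstandbys; i++)
        total += ack_weight[i];
    return total >= quorum;
}
```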

Now we could have a summary function on the master showing all the known
slaves, their last time of activity, their known current setup, etc, all
from the master, but read-only. Would that be useful enough?

> The most important reason why I think we should have standby
> registration is for simplicity of configuration.  Yes, it adds another
> configuration file, but that configuration file contains ALL of the
> information about which standbys are synchronous.  Without standby
> registration, this information will inevitably be split between the
> master config and the various slave configs and you'll have to look at
> all the configurations to be certain you understand how it's going to
> end up working.

So, here, we have two quite different things to be concerned
about. First is the configuration, and I say that managing a distributed
setup will be easier for the DBA.

Then there's how to obtain a nice view about the distributed system,
which again we can achieve from the master without manually registering
the standbys. After all, the information you want needs to be there.

>  As a particular manifestation of this, and as
> previously argued and +1'd upthread, the ability to change the set of
> standbys to which the master is replicating synchronously without
> changing the configuration on the master or any of the existing slaves
> seems dangerous.

Well, you still need to open the HBA for the new standby to be able to
connect, and to somehow take a base backup, right? We're not exactly
transparent there, yet, are we?

> Another reason why I think we should have standby registration is to
> eventually allow the "streaming WAL backwards" configuration
> which has previously been discussed.  IOW, you could stream the WAL to
> the slave in advance of fsync-ing it on the master.  After a power
> failure, the machines in the cluster can talk to each other and figure
> out which one has the furthest-advanced WAL pointer and stream from
> that machine to all the others.  This is an appealing configuration
> for people using sync rep because it would allow the fsyncs to be done
> in parallel rather than sequentially as is currently necessary - but
> if you're using it, you're certainly not going to want the master to
> enter normal running without waiting to hear from the slave.

I love the idea. 

Now it seems to me that all you need here is the master sending one more
piece of information with each WAL "segment", the currently fsync'ed position,
which pre-9.1 is implied as being the current LSN from the stream,
right?

Here I'm not sure I follow you in detail, but it seems to me
registering the standbys is just another way of achieving the same. To
be honest, I don't understand a bit how it helps implement your idea.

Regards,
-- 
Dimitri Fontaine
PostgreSQL DBA, Architecte



Re: [HACKERS] Git conversion status

2010-09-20 Thread Bruce Momjian
Tom Lane wrote:
> Bruce Momjian  writes:
> > Tom Lane wrote:
> >> This is not even close to matching the tarballs :-(.  Seems to be a
> >> locale problem: the diffs look like
> >> 
> >> 1c1
> >> < /* $PostgreSQL: pgsql/contrib/citext/citext.sql.in,v 1.3 2008/09/05 
> >> 18:25:16 tgl Exp $ */
> >> ---
> > /* $PostgreSQL: pgsql/contrib/citext/citext.sql.in,v 1.3 2008-09-05 
> > 18:25:16 tgl Exp $ */
> 
> > As a curiosity, I do prefer the dashed dates. I have had a number of
> > cases where I have to change dashes to slashes when passing ISO dates as
> > parameters to CVS.  Shame they improve it just as we are leaving CVS.
> 
> Yeah.  It appears that this was prompted by a desire to match ISO style
> going forward.  I wouldn't be against that necessarily if we were
> keeping the keywords and not getting rid of them.  But since we are
> going to get rid of them going forward, I think what we want this
> conversion to do is match what's in the historical tarballs.

Agreed, no question.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] Serializable snapshot isolation error logging

2010-09-20 Thread Dan S
Well I guess one would like some way to find out which statements in the
involved transactions are the cause of the serialization failure and what
programs they reside in.

Also, which relations were involved; the sql-statements may contain many
relations but just one or a few might be involved in the failure, right?

The tuples involved if available.

I don't know how helpful it would be to know the pages involved; I
certainly wouldn't know what to do with that info.

All this is of course to be able to guess at which statements to modify or
change execution order of, take an explicit lock on and so on to reduce
serialization failure rate.

If holding a list of the involved transactions turns out to be expensive,
maybe one should be able to turn it on by a GUC only when you have a problem
and need the extra information to track it down.

Best Regards
Dan S


2010/9/20 Kevin Grittner 

> Dan S  wrote:
>
> > I wonder if the SSI implementation will give some way of detecting
> > the cause of a serialization failure.
> > Something like the deadlock detection maybe where you get the
> > sql-statements involved.
>
> I've been wondering what detail to try to include.  There will often
> be three transactions involved in an SSI serialization failure,
> although the algorithm we're using (based on the referenced papers)
> may only know about one or two of them at the point of failure,
> because conflicts with multiple other transactions get collapsed to
> a self-reference.  (One "optimization" I want to try is to maintain
> a list of conflicts rather than doing the above -- in which case we
> could always show all three transactions; but we may run out of time
> for that, and even if we don't, the decreased rollbacks might not
> pay for the cost of maintaining such a list.)
>
> The other information we would have would be the predicate locks
> held by whatever transactions we know about at the point of
> cancellation, based on what reads they've done; however, we wouldn't
> know about the writes done by those transactions, or which of the
> reads resulted in conflicts.
>
> So, given the above, any thoughts on what we *should* show?
>
> -Kevin
>


Re: [HACKERS] Git conversion status

2010-09-20 Thread Tom Lane
Peter Eisentraut  writes:
> On mån, 2010-09-20 at 15:09 -0400, Tom Lane wrote:
>> I wouldn't be against that necessarily if we were
>> keeping the keywords and not getting rid of them.  But since we are
>> going to get rid of them going forward, I think what we want this
>> conversion to do is match what's in the historical tarballs.

> Stupid question: Why don't you get rid of the key words beforehand?

That *definitely* wouldn't match the tarballs.

One of the base requirements we set at the beginning of the whole SCM
conversion discussion was that we be able to reproduce the historical
release tarballs as nearly as possible.  Now, if there were some reason
that we couldn't match $PostgreSQL$ tags at all, I'd have grumbled and
accepted it.  But we're 99.44% of the way there, and I don't see some
Debian maintainer's idea of how things ought to work as a reason for
not being 100% of the way there.

What I got the last time I did this locally, and expect to see when
we have the final conversion, is an exact match for every tarball
8.0.0 and later.  Earlier than that we have discrepancies because
some files are now in Attic, and/or the cvsroot path moved around,
and/or the project's module name moved around.  That sort of thing
I've resigned myself to just grumbling about.  But if we can have an
exact match for everything from 8.0.0 forward, we should not give that
up for trivial reasons.

regards, tom lane



Re: [HACKERS] Git conversion status

2010-09-20 Thread Stefan Kaltenbrunner

On 09/20/2010 09:06 PM, Tom Lane wrote:

Stefan Kaltenbrunner  writes:

http://lists.nongnu.org/archive/html/info-cvs/2004-07/msg00106.html
is what I'm referring to and what the debian people provided a patch to
work around for (starting with 1:1.12.9-17 in 2005) - not sure why you are
not seeing it...


Hm, that is talking about the output of "cvs log".  It doesn't say
anything one way or the other about what gets put into $Header$ keyword
expansions.  A look into the 1.12.13 source code says that dates in
keywords are always printed with this:

 sprintf (buf, "%04d/%02d/%02d %02d:%02d:%02d", year, mon, mday,
 hour, min, sec);

(see printable_date in src/rcs.c).  So I'm still of the opinion that
debian fixed that which wasn't broken.  I tried searching the nongnu
archives and found this:

http://lists.nongnu.org/archive/html/info-cvs/2004-03/msg00359.html

which leads me to think that the upstream developers considered and
ultimately rejected moving to ISO style in keyword expansion.  Probably
the debian maintainer decided he knew better and changed it anyway;
there seems to be a lot of that going around among debian packagers.


wow - now that I look closer it seems you are right...

The patch in debian against the upstream package (see: 
http://ftp.de.debian.org/debian/pool/main/c/cvs/cvs_1.12.13-12.diff.gz) 
has this hunk:


--- cvs-1.12.13-old/src/rcs.c  2006-02-26 23:03:04.0 +0800
+++ cvs-1.12.13/src/rcs.c  2006-02-26 23:03:05.0 +0800
@@ -33,6 +33,8 @@
 # endif
 #endif
 
+int datesep = '-';
+
 /* The RCS -k options, and a set of enums that must match the array.
    These come first so that we can use enum kflag in function
    prototypes.  */
@@ -3537,8 +3539,8 @@
              &sec);
     if (year < 1900)
         year += 1900;
-    sprintf (buf, "%04d/%02d/%02d %02d:%02d:%02d", year, mon, mday,
-             hour, min, sec);
+    sprintf (buf, "%04d%c%02d%c%02d %02d:%02d:%02d", year, datesep, mon,
+             datesep, mday, hour, min, sec);
     return xstrdup (buf);
 }


so they broke that in early 2006 and nobody noticed so far...

Stefan



Re: [HACKERS] Git conversion status

2010-09-20 Thread Peter Eisentraut
On mån, 2010-09-20 at 15:09 -0400, Tom Lane wrote:
> I wouldn't be against that necessarily if we were
> keeping the keywords and not getting rid of them.  But since we are
> going to get rid of them going forward, I think what we want this
> conversion to do is match what's in the historical tarballs.

Stupid question: Why don't you get rid of the key words beforehand?




Re: [HACKERS] Git conversion status

2010-09-20 Thread Tom Lane
Bruce Momjian  writes:
> Tom Lane wrote:
>> This is not even close to matching the tarballs :-(.  Seems to be a
>> locale problem: the diffs look like
>> 
>> 1c1
>> < /* $PostgreSQL: pgsql/contrib/citext/citext.sql.in,v 1.3 2008/09/05 
>> 18:25:16 tgl Exp $ */
>> ---
> /* $PostgreSQL: pgsql/contrib/citext/citext.sql.in,v 1.3 2008-09-05 18:25:16 
> tgl Exp $ */

> As a curiosity, I do prefer the dashed dates. I have had a number of
> cases where I have to change dashes to slashes when passing ISO dates as
> parameters to CVS.  Shame they improve it just as we are leaving CVS.

Yeah.  It appears that this was prompted by a desire to match ISO style
going forward.  I wouldn't be against that necessarily if we were
keeping the keywords and not getting rid of them.  But since we are
going to get rid of them going forward, I think what we want this
conversion to do is match what's in the historical tarballs.

regards, tom lane



Re: [HACKERS] Git conversion status

2010-09-20 Thread Tom Lane
Stefan Kaltenbrunner  writes:
> http://lists.nongnu.org/archive/html/info-cvs/2004-07/msg00106.html
> is what I'm referring to and what the debian people provided a patch to
> work around for (starting with 1:1.12.9-17 in 2005) - not sure why you are
> not seeing it...

Hm, that is talking about the output of "cvs log".  It doesn't say
anything one way or the other about what gets put into $Header$ keyword
expansions.  A look into the 1.12.13 source code says that dates in
keywords are always printed with this:

sprintf (buf, "%04d/%02d/%02d %02d:%02d:%02d", year, mon, mday,
 hour, min, sec);

(see printable_date in src/rcs.c).  So I'm still of the opinion that
debian fixed that which wasn't broken.  I tried searching the nongnu
archives and found this:

http://lists.nongnu.org/archive/html/info-cvs/2004-03/msg00359.html

which leads me to think that the upstream developers considered and
ultimately rejected moving to ISO style in keyword expansion.  Probably
the debian maintainer decided he knew better and changed it anyway;
there seems to be a lot of that going around among debian packagers.
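For reference, the whole dispute comes down to the separator
character in that one sprintf; a minimal sketch of the two behaviors
(not the actual cvs code paths, just the format strings in question):

```c
#include <stdio.h>

/* Upstream printable_date always uses '/'; the Debian patch threads
 * a separator character through and defaults it to '-'. */
static void
format_keyword_date(char *buf, int sep,
                    int year, int mon, int mday,
                    int hour, int min, int sec)
{
    sprintf(buf, "%04d%c%02d%c%02d %02d:%02d:%02d",
            year, sep, mon, sep, mday, hour, min, sec);
}
```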

regards, tom lane



Re: [HACKERS] Git conversion status

2010-09-20 Thread Bruce Momjian
Tom Lane wrote:
> Magnus Hagander  writes:
> > Since there haven't been any commits in cvs during the day, the test
> > conversoin I created after lunch should be identical to a new one I'd
> > run now, so let's use that one :-)
> 
> This is not even close to matching the tarballs :-(.  Seems to be a
> locale problem: the diffs look like
> 
> 1c1
> < /* $PostgreSQL: pgsql/contrib/citext/citext.sql.in,v 1.3 2008/09/05 
> 18:25:16 tgl Exp $ */
> ---
> > /* $PostgreSQL: pgsql/contrib/citext/citext.sql.in,v 1.3 2008-09-05 
> > 18:25:16 tgl Exp $ */
> 
> Please fix and re-run.

As a curiosity, I do prefer the dashed dates. I have had a number of
cases where I have to change dashes to slashes when passing ISO dates as
parameters to CVS.  Shame they improve it just as we are leaving CVS.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] Git conversion status

2010-09-20 Thread Stefan Kaltenbrunner

On 09/20/2010 08:33 PM, Tom Lane wrote:

Stefan Kaltenbrunner  writes:

On 09/20/2010 08:21 PM, Tom Lane wrote:

Well, I'm testing with an unmodified copy of 1.12.13, and I got output
matching our historical tarballs.  So I'm blaming debian for this one.



As far as I know magnus is using a debian based CVS server for his
testing so that would certainly be 1.12.x - are you too?


No server anywhere: I'm reading from a local repository which is a
tarball copy of the one on cvs.postgresql.org.  1.12.13 is the only
version in question.  (I believe Magnus is not using a server either;
the cvs2git documentation says that it will only work from a local repo,
and even if that's not true I shudder to think how long it would take
over a network.)


http://lists.nongnu.org/archive/html/info-cvs/2004-07/msg00106.html


is what I'm referring to and what the debian people provided a patch to
work around for (starting with 1:1.12.9-17 in 2005) - not sure why you are
not seeing it...




Stefan



Re: [HACKERS] Git conversion status

2010-09-20 Thread Tom Lane
Stefan Kaltenbrunner  writes:
> On 09/20/2010 08:21 PM, Tom Lane wrote:
>> Well, I'm testing with an unmodified copy of 1.12.13, and I got output
>> matching our historical tarballs.  So I'm blaming debian for this one.

> As far as I know magnus is using a debian based CVS server for his 
> testing so that would certainly be 1.12.x - are you too?

No server anywhere: I'm reading from a local repository which is a
tarball copy of the one on cvs.postgresql.org.  1.12.13 is the only
version in question.  (I believe Magnus is not using a server either;
the cvs2git documentation says that it will only work from a local repo,
and even if that's not true I shudder to think how long it would take
over a network.)

regards, tom lane



Re: [HACKERS] Git conversion status

2010-09-20 Thread Stefan Kaltenbrunner

On 09/20/2010 08:21 PM, Tom Lane wrote:

Stefan Kaltenbrunner  writes:

On 09/20/2010 08:05 PM, Magnus Hagander wrote:

debian applies a patch to change it. If I set DateStyle=old in
CVSROOT/config, cvs export behaves sanely. I'll re-run with that
setting.



actually as I understand it the behaviour changed in cvs 1.12.x and
debian applied a patch to provide the old output for backwards
compatibility...


Well, I'm testing with an unmodified copy of 1.12.13, and I got output
matching our historical tarballs.  So I'm blaming debian for this one.


not sure - if I read the CVS changelog, the "new style" output only
triggers if both the server AND the client are > 1.12.x (for some value 
of x on both).
As far as I know magnus is using a debian based CVS server for his 
testing so that would certainly be 1.12.x - are you too?



Stefan



Re: [HACKERS] Git conversion status

2010-09-20 Thread Andres Freund
On Monday 20 September 2010 20:22:55 Tom Lane wrote:
> Andres Freund  writes:
> > On Monday 20 September 2010 20:15:50 Tom Lane wrote:
> >> BTW, while poking around in this morning's attempt I noticed
> >> .git/description, containing
> >> 
> >> Unnamed repository; edit this file 'description' to name the repository.
> >> 
> >> No idea if this is shown anywhere or if there is any practical way to
> >> change it once the repo's been published.  Might be an idea to stick
> >> something in there.
> > 
> > Its mostly used for display in gitweb and can be changed anytime.
> 
> Hm, I might've misinterpreted its semantics.  Is that file copied by
> "git clone", or is it something that's unique to each physical
> repository?
Unique to each "physical repository" (like everything in .git - unless you 
count the cloned 'objects').

Andres



Re: [HACKERS] Git conversion status

2010-09-20 Thread Tom Lane
Magnus Hagander  writes:
> On Mon, Sep 20, 2010 at 20:15, Tom Lane  wrote:
>> BTW, while poking around in this morning's attempt I noticed
>> .git/description, containing
>> 
>> Unnamed repository; edit this file 'description' to name the repository.

> That said, where was it set to that? A locally initialized repo, or on
> the clone?

That's what I found in the result of
git clone ssh://g...@gitmaster.postgresql.org/postgresql.git

If git clone isn't meant to copy it, then this is a non-issue.

regards, tom lane



Re: [HACKERS] Git conversion status

2010-09-20 Thread Tom Lane
Andres Freund  writes:
> On Monday 20 September 2010 20:15:50 Tom Lane wrote:
>> BTW, while poking around in this morning's attempt I noticed
>> .git/description, containing
>> 
>> Unnamed repository; edit this file 'description' to name the repository.
>> 
>> No idea if this is shown anywhere or if there is any practical way to
>> change it once the repo's been published.  Might be an idea to stick
>> something in there.

> Its mostly used for display in gitweb and can be changed anytime.

Hm, I might've misinterpreted its semantics.  Is that file copied by
"git clone", or is it something that's unique to each physical
repository?

regards, tom lane



Re: [HACKERS] Git conversion status

2010-09-20 Thread Tom Lane
Stefan Kaltenbrunner  writes:
> On 09/20/2010 08:05 PM, Magnus Hagander wrote:
>> debian applies a patch to change it. If I set DateStyle=old in
>> CVSROOT/config, cvs export behaves sanely. I'll re-run with that
>> setting.

> actually as I understand it the behaviour changed in cvs 1.12.x and 
> debian applied a patch to provide the old output for backwards 
> compatibility...

Well, I'm testing with an unmodified copy of 1.12.13, and I got output
matching our historical tarballs.  So I'm blaming debian for this one.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Serializable snapshot isolation error logging

2010-09-20 Thread Kevin Grittner
Dan S  wrote:
 
> I wonder if the SSI implementation will give some way of detecting
> the cause of a serialization failure.
> Something like the deadlock detection maybe where you get the
> sql-statements involved.
 
I've been wondering what detail to try to include.  There will often
be three transactions involved in an SSI serialization failure,
although the algorithm we're using (based on the referenced papers)
may only know about one or two of them at the point of failure,
because conflicts with multiple other transactions get collapsed to
a self-reference.  (One "optimization" I want to try is to maintain
a list of conflicts rather than doing the above -- in which case we
could always show all three transactions; but we may run out of time
for that, and even if we don't, the decreased rollbacks might not
pay for the cost of maintaining such a list.)
 
The other information we would have would be the predicate locks
held by whatever transactions we know about at the point of
cancellation, based on what reads they've done; however, we wouldn't
know about the writes done by those transactions, or which of the
reads resulted in conflicts.
 
So, given the above, any thoughts on what we *should* show?
 
-Kevin



Re: [HACKERS] Git conversion status

2010-09-20 Thread Magnus Hagander
On Mon, Sep 20, 2010 at 20:15, Tom Lane  wrote:
> BTW, while poking around in this morning's attempt I noticed
> .git/description, containing
>
> Unnamed repository; edit this file 'description' to name the repository.
>
> No idea if this is shown anywhere or if there is any practical way to
> change it once the repo's been published.  Might be an idea to stick
> something in there.

That's, AFAIK, only used for gitweb.

That said, where was it set to that? A locally initialized repo, or on
the clone? Because I changed it in the repository before I published
it I think (I now deleted the whole repo to make room for the new
conversion, so I can't double-check that :D)


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Git conversion status

2010-09-20 Thread Andres Freund
On Monday 20 September 2010 20:15:50 Tom Lane wrote:
> BTW, while poking around in this morning's attempt I noticed
> .git/description, containing
> 
> Unnamed repository; edit this file 'description' to name the repository.
> 
> No idea if this is shown anywhere or if there is any practical way to
> change it once the repo's been published.  Might be an idea to stick
> something in there.
It's mostly used for display in gitweb and can be changed anytime.


Andres



Re: [HACKERS] Git conversion status

2010-09-20 Thread Tom Lane
BTW, while poking around in this morning's attempt I noticed
.git/description, containing

Unnamed repository; edit this file 'description' to name the repository.

No idea if this is shown anywhere or if there is any practical way to
change it once the repo's been published.  Might be an idea to stick
something in there.

regards, tom lane



Re: [HACKERS] Git conversion status

2010-09-20 Thread Stefan Kaltenbrunner

On 09/20/2010 08:05 PM, Magnus Hagander wrote:

On Mon, Sep 20, 2010 at 7:57 PM, Magnus Hagander  wrote:

On Mon, Sep 20, 2010 at 19:49, Tom Lane  wrote:

Magnus Hagander  writes:

On Mon, Sep 20, 2010 at 19:34, Tom Lane  wrote:

Please fix and re-run.



Uh, what the heck. I ran the exact same command as last time.. Hmm:
Stefan rebooted the machine in between, I wonder if that changed
something.


I'm not sure we ever checked that.  My comparisons against the tarballs
were done from my own run of the conversion script.  I'm using C locale
here, probably you aren't?


Correct, I'm in en_US. I'm trying a "cvs export" in "C" now to see exactly what 
changes.
Hmm

Nope, doesn't seem to change. I just set my LANG=C, and ran a "cvs export". but it comes 
back with "-" in the dates, so it seems to not care about that.

("locale" clearly shows it's changed everything to C though)

Is there a cvs setting for this somewhere that you know of?


Think I found it.

debian applies a patch to change it. If I set DateStyle=old in
CVSROOT/config, cvs export behaves sanely. I'll re-run with that
setting.


actually as I understand it the behaviour changed in cvs 1.12.x and 
debian applied a patch to provide the old output for backwards 
compatibility...



Stefan



Re: [HACKERS] Do we need a ShmList implementation?

2010-09-20 Thread Markus Wanner
On 09/20/2010 08:06 PM, Kevin Grittner wrote:
> Obviously, if there were a dynamic way to add to the entries as
> needed, there would be one less setting (hard-coded or GUC) to worry
> about getting right.  Too low means transactions need to be
> canceled.  Too high means you're wasting space which could otherwise
> go to caching.  And of course, the optimal number could change from
> day to day or hour to hour.

Yeah, same problem as with lots of the other users of shared memory.

It certainly makes sense to decouple the two projects, so you'll have to
pick some number that sounds good to you now.

Regards

Markus




Re: [HACKERS] Git conversion status

2010-09-20 Thread Magnus Hagander
On Mon, Sep 20, 2010 at 20:07, Tom Lane  wrote:
> Magnus Hagander  writes:
>> debian applies a patch to change it.
>
> [ rolls eyes... ]  Thank you, debian.

Indeed.

For the archives, that's DateFormat=old, not DateStyle. Oops.
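For anyone hitting the same thing later: as described in this thread, the fix would be a one-line addition to the repository's config file (assuming a Debian-patched cvs 1.12.x, and using the corrected option name above):

```
# CVSROOT/config
DateFormat=old
```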

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Git conversion status

2010-09-20 Thread Tom Lane
Magnus Hagander  writes:
> debian applies a patch to change it.

[ rolls eyes... ]  Thank you, debian.

regards, tom lane



Re: [HACKERS] Do we need a ShmList implementation?

2010-09-20 Thread Kevin Grittner
Markus Wanner  wrote:
 
> I'm wondering how you want to implement the memory allocation part
 
Based on the feedback I've received, it appears that the only sane
way to do that in the current shared memory environment is to
allocate a fixed size of memory to hold these entries on postmaster
startup.  To minimize the chance that we'll be forced to cancel
running transactions to deal with the limit, it will need to be
sized to some multiple of max_connections.
 
Obviously, if there were a dynamic way to add to the entries as
needed, there would be one less setting (hard-coded or GUC) to worry
about getting right.  Too low means transactions need to be
canceled.  Too high means you're wasting space which could otherwise
go to caching.  And of course, the optimal number could change from
day to day or hour to hour.
 
-Kevin



Re: [HACKERS] Git conversion status

2010-09-20 Thread Magnus Hagander
On Mon, Sep 20, 2010 at 7:57 PM, Magnus Hagander  wrote:
> On Mon, Sep 20, 2010 at 19:49, Tom Lane  wrote:
>> Magnus Hagander  writes:
>>> On Mon, Sep 20, 2010 at 19:34, Tom Lane  wrote:
 Please fix and re-run.
>>
>>> Uh, what the heck. I ran the exact same command as last time.. Hmm:
>>> Stefan rebooted the machine in between, I wonder if that changed
>>> something.
>>
>> I'm not sure we ever checked that.  My comparisons against the tarballs
>> were done from my own run of the conversion script.  I'm using C locale
>> here, probably you aren't?
>
> Correct, I'm in en_US. I'm trying a "cvs export" in "C" now to see exactly 
> what changes.
> Hmm
>
> Nope, doesn't seem to change. I just set my LANG=C, and ran a "cvs export". 
> but it comes back with "-" in the dates, so it seems to not care about that.
>
> ("locale" clearly shows it's changed everything to C though)
>
> Is there a cvs setting for this somewhere that you know of?

Think I found it.

debian applies a patch to change it. If I set DateStyle=old in
CVSROOT/config, cvs export behaves sanely. I'll re-run with that
setting.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Git conversion status

2010-09-20 Thread Tom Lane
Magnus Hagander  writes:
> Correct, I'm in en_US. I'm trying a "cvs export" in "C" now to see
> exactly what changes.
> Hmm

> Nope, doesn't seem to change. I just set my LANG=C, and ran a "cvs
> export". but it comes back with "-" in the dates, so it seems to not
> care about that.

I thought "cvs export" removed keywords entirely ... try a checkout
instead.  Also, are you sure you don't have any LC_xxx variables set?

regards, tom lane



Re: [HACKERS] Git conversion status

2010-09-20 Thread Magnus Hagander
On Mon, Sep 20, 2010 at 19:49, Tom Lane  wrote:
> Magnus Hagander  writes:
>> On Mon, Sep 20, 2010 at 19:34, Tom Lane  wrote:
>>> Please fix and re-run.
>
>> Uh, what the heck. I ran the exact same command as last time.. Hmm:
>> Stefan rebooted the machine in between, I wonder if that changed
>> something.
>
> I'm not sure we ever checked that.  My comparisons against the tarballs
> were done from my own run of the conversion script.  I'm using C locale
> here, probably you aren't?

Correct, I'm in en_US. I'm trying a "cvs export" in "C" now to see
exactly what changes.
Hmm

Nope, doesn't seem to change. I just set my LANG=C, and ran a "cvs
export". but it comes back with "-" in the dates, so it seems to not
care about that.

("locale" clearly shows it's changed everything to C though)

Is there a cvs setting for this somewhere that you know of?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



[HACKERS] Serializable snapshot isolation error logging

2010-09-20 Thread Dan S
Hi !

I wonder if the SSI implementation will give some way of detecting the cause
of a serialization failure.
Something like the deadlock detection maybe where you get the sql-statements
involved.

Best Regards
Dan S


Re: [HACKERS] Do we need a ShmList implementation?

2010-09-20 Thread Markus Wanner
On 09/20/2010 06:09 PM, Kevin Grittner wrote:
> Yeah, I mostly followed that thread.  If such a feature was present,
> it might well make sense to use it for this; however, I've got
> enough trouble selling the SSI technology without making it
> dependent on something else which was clearly quite controversial,
> and which seemed to have some technical hurdles of its own left to
> clear.  :-/

Okay, well understandable. I'm wondering how you want to implement the
memory allocation part, though.

> At the point where there is an implementation which is accepted by
> the community, I'll certainly take another look.

Fair enough, thanks.

Regards

Markus Wanner

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Git conversion status

2010-09-20 Thread Tom Lane
Magnus Hagander  writes:
> On Mon, Sep 20, 2010 at 19:34, Tom Lane  wrote:
>> Please fix and re-run.

> Uh, what the heck. I ran the exact same command as last time.. Hmm:
> Stefan rebooted the machine in between, I wonder if that changed
> something.

I'm not sure we ever checked that.  My comparisons against the tarballs
were done from my own run of the conversion script.  I'm using C locale
here, probably you aren't?

regards, tom lane



Re: [HACKERS] bg worker: general purpose requirements

2010-09-20 Thread Markus Wanner
Robert,

On 09/20/2010 06:57 PM, Robert Haas wrote:
> Gee, that doesn't seem slow enough to worry about to me.  If we
> suppose that you need 2 * CPUs + spindles processes to fully load the
> system, that means you should be able to ramp up from zero to
> consuming every available system resource in under a second; except
> perhaps on a system with a huge RAID array, which might need 2 or 3
> seconds.  If you parallelize the worker startup, as you suggest, I'd
> think you could knock quite a bit more off of this, but why all the
> worry about startup latency?  Once the system is chugging along, none
> of this should matter very much, I would think.  If you need to
> repeatedly kill off some workers bound to one database and start some
> new ones to bind to a different database, that could be sorta painful,
> but if you can actually afford to keep around the workers for all the
> databases you care about, it seems fine.

Hm.. I see. So in other words, you are saying
min_spare_background_workers isn't flexible enough in case one has
thousands of databases but only uses a few of them frequently.

I understand that reasoning and the wish to keep the number of GUCs as
low as possible. I'll try to drop the min_spare_background_workers from
the bgworker patches.

The rest of the bgworker infrastructure should behave pretty much like
what you have described. Parallelism in starting bgworkers could be a
nice improvement, especially if we kill the min_spare_background_workers
mechanism.

> Neat stuff.

Thanks.

Markus Wanner



Re: [HACKERS] Git conversion status

2010-09-20 Thread Magnus Hagander
On Mon, Sep 20, 2010 at 19:34, Tom Lane  wrote:
> Magnus Hagander  writes:
>> Since there haven't been any commits in cvs during the day, the test
>> conversion I created after lunch should be identical to a new one I'd
>> run now, so let's use that one :-)
>
> This is not even close to matching the tarballs :-(.  Seems to be a
> locale problem: the diffs look like
>
> 1c1
> < /* $PostgreSQL: pgsql/contrib/citext/citext.sql.in,v 1.3 2008/09/05 
> 18:25:16 tgl Exp $ */
> ---
>> /* $PostgreSQL: pgsql/contrib/citext/citext.sql.in,v 1.3 2008-09-05 18:25:16 
>> tgl Exp $ */
>
> Please fix and re-run.

Uh, what the heck. I ran the exact same command as last time.. Hmm:
Stefan rebooted the machine in between, I wonder if that changed
something.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] compile/install of git

2010-09-20 Thread Andrew Dunstan



On 09/20/2010 01:16 PM, Mark Wong wrote:

On Mon, Sep 20, 2010 at 9:42 AM, Andrew Dunstan  wrote:


On 09/20/2010 12:24 PM, Mark Wong wrote:

On Sat, Sep 18, 2010 at 7:59 AM, Bruce Momjian wrote:

Well, I can run tests for folks before they apply a patch and "red" the
build farm.  I can also research fixes easier because I am using the OS,
rather than running blind tests.  I am just telling you what people told
me.

I've been slowly trying to rebuild something that was in use at the
OSDL to test patches.  I just proofed something that I think works
with the git repository:

http://207.173.203.223:5000/patch/show/48

If you click on the PASS or FAIL text, it will display the SHA1,
author and commit message that the patch was applied to.  Think this
will be useful?


The issue has always been how much we want to ask people to trust code that
is not committed. My answer is "not at all." Reviewers and committers will
presumably eyeball the code before trying to compile/run it, but any
automated system of code testing for uncommitted code is way too risky,
IMNSHO.

I was hoping this would be more of a reviewing tool, not something
that would be an excuse for someone to not try running with a patch.
For example, if patch doesn't apply, configure, or build the output is
captured and can be referenced.  Also specifically in Bruce's example
if there is enough concern about making the buildfarm red I thought
this could help in these few specific aspects.  But maybe I don't
understand the scope of testing Bruce is referring to. :)


The whole point of the buildfarm is to identify quickly any 
platform-dependent problems. Committers can't be expected to have access 
to the whole range of platforms we support, so as long as they make sure 
that things are working well on their systems they should be able to 
rely on the buildfarm to cover the others. But that also means that the 
buildfarm should contain instances of all the supported platforms. I 
don't think we should be afraid of sending the buildfarm red. If we do 
it's an indication that it's doing its job. If you're a committer and 
you haven't made it go red a few times you're either very lucky or not 
very active. Making it go red isn't a problem. Leaving it red is, but 
we've really been pretty darn good about that.


Having someone act in effect as an informal buildfarm member is less 
than satisfactory, IMNSHO. For one thing, it is likely to be less timely 
about notifying us of problems than the automated system. And it's also 
much less likely to catch problems on the back branches. So if you want 
platform X supported (even BSD/OS, regardless of the fact that it's way 
out of date), the first thing you should do is set up a buildfarm member 
for it.


cheers

andrew



Re: [HACKERS] Git conversion status

2010-09-20 Thread Tom Lane
Magnus Hagander  writes:
> Since there haven't been any commits in cvs during the day, the test
> conversoin I created after lunch should be identical to a new one I'd
> run now, so let's use that one :-)

This is not even close to matching the tarballs :-(.  Seems to be a
locale problem: the diffs look like

1c1
< /* $PostgreSQL: pgsql/contrib/citext/citext.sql.in,v 1.3 2008/09/05 18:25:16 
tgl Exp $ */
---
> /* $PostgreSQL: pgsql/contrib/citext/citext.sql.in,v 1.3 2008-09-05 18:25:16 
> tgl Exp $ */

Please fix and re-run.

regards, tom lane



Re: [HACKERS] Git conversion status

2010-09-20 Thread Bruce Momjian
Magnus Hagander wrote:
> Hi!
> 
> CVS has been frozen, and all commit access locked out.
> 
> Since there haven't been any commits in cvs during the day, the test
> conversion I created after lunch should be identical to a new one I'd
> run now, so let's use that one :-)
> 
> So I've moved it in place. It's on
> http://git.postgresql.org/gitweb?p=postgresql-migration.git. Git
> access available at
> git://git.postgresql.org/git/postgresql-migration.git.
> 
> Committers can (and should! please test!) clone from git clone
> ssh://g...@gitmaster.postgresql.org/postgresql.git.
> 
> Please do *NOT* commit or push anything to this repository yet though:
> The repo is there - all the scripts to manage it are *not*. So don't
> commit until I confirm that it is.
> 
> But please clone and verify the stuff we have now.

Git clone worked just fine.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] compile/install of git

2010-09-20 Thread Mark Wong
On Mon, Sep 20, 2010 at 9:42 AM, Andrew Dunstan  wrote:
>
>
> On 09/20/2010 12:24 PM, Mark Wong wrote:
>>
>> On Sat, Sep 18, 2010 at 7:59 AM, Bruce Momjian  wrote:
>>>
>>> Well, I can run tests for folks before they apply a patch and "red" the
>>> build farm.  I can also research fixes easier because I am using the OS,
>>> rather than running blind tests.  I am just telling you what people told
>>> me.
>>
>> I've been slowly trying to rebuild something that was in use at the
>> OSDL to test patches.  I just proofed something that I think works
>> with the git repository:
>>
>> http://207.173.203.223:5000/patch/show/48
>>
>> If you click on the PASS or FAIL text, it will display the SHA1,
>> author and commit message that the patch was applied to.  Think this
>> will be useful?
>
>
> The issue has always been how much we want to ask people to trust code that
> is not committed. My answer is "not at all." Reviewers and committers will
> presumably eyeball the code before trying to compile/run it, but any
> automated system of code testing for uncommitted code is way too risky,
> IMNSHO.

I was hoping this would be more of a reviewing tool, not something
that would be an excuse for someone to not try running with a patch.
For example, if patch doesn't apply, configure, or build the output is
captured and can be referenced.  Also specifically in Bruce's example
if there is enough concern about making the buildfarm red I thought
this could help in these few specific aspects.  But maybe I don't
understand the scope of testing Bruce is referring to. :)

Regards,
Mark



Re: [HACKERS] bg worker: general purpose requirements

2010-09-20 Thread Robert Haas
On Mon, Sep 20, 2010 at 11:30 AM, Markus Wanner  wrote:
> Well, Apache pre-forks 5 processes in total (by default, that is, for
> high volume webservers a higher MinSpareServers setting is certainly not
> out of question). While bgworkers currently needs to fork
> min_spare_background_workers processes per database.
>
> AIUI, that's the main problem with the current architecture.

Assuming that "the main problem" refers more or less to the words "per
database", I agree.

>>> I haven't measured the actual time it takes, but given the use case of a
>>> connection pool, I so far thought it's obvious that this process takes too
>>> long.
>>
>> Maybe that would be a worthwhile exercise...
>
> On my laptop I'm measuring around 18 bgworker starts per second, i.e.
> roughly 50 ms per bgworker start. That's certainly just a ball-park figure.

Gee, that doesn't seem slow enough to worry about to me.  If we
suppose that you need 2 * CPUs + spindles processes to fully load the
system, that means you should be able to ramp up from zero to
consuming every available system resource in under a second; except
perhaps on a system with a huge RAID array, which might need 2 or 3
seconds.  If you parallelize the worker startup, as you suggest, I'd
think you could knock quite a bit more off of this, but why all the
worry about startup latency?  Once the system is chugging along, none
of this should matter very much, I would think.  If you need to
repeatedly kill off some workers bound to one database and start some
new ones to bind to a different database, that could be sorta painful,
but if you can actually afford to keep around the workers for all the
databases you care about, it seems fine.

>> How do you accumulate the change sets?
>
> Logical changes get collected at the heapam level. They get serialized
> and streamed (via imessages and a group communication system) to all
> nodes. Application of change sets is highly parallelized and should be
> pretty efficient. Commit ordering is decided by the GCS to guarantee
> consistency across all nodes, conflicts get resolved by aborting the
> later transaction.

Neat stuff.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



[HACKERS] work_mem / maintenance_work_mem maximums

2010-09-20 Thread Stephen Frost
Greetings,

  After watching a database import go abysmally slow on a pretty beefy
  box with tons of RAM, I got annoyed and went to hunt down why in the
  world PG wasn't using but a bit of memory.  Turns out to be a well
  known and long-standing issue:

  http://www.mail-archive.com/pgsql-hackers@postgresql.org/msg101139.html

  Now, we could start by fixing guc.c to correctly have the max value
  for these be MaxAllocSize/1024, for starters, then at least our users
  would know when they set a higher value it's not going to be used.
  That, in my mind, is a pretty clear bug fix.  Of course, that doesn't
  help us poor data-warehousing bastards with 64G+ machines.

  Sooo..  I don't know much about what the limit is or why it's there,
  but based on the comments, I'm wondering if we could just move the
  limit to a more 'sane' place than the-function-we-use-to-allocate.  If
  we need a hard limit due to TOAST, let's put it there, but I'm hopeful
  we could work out a way to get rid of this limit in repalloc and that
  we can let sorts and the like (uh, index creation) use what memory the
  user has decided it should be able to.

Thanks,

Stephen




[HACKERS] Git conversion status

2010-09-20 Thread Magnus Hagander
Hi!

CVS has been frozen, and all commit access locked out.

Since there haven't been any commits in cvs during the day, the test
conversion I created after lunch should be identical to a new one I'd
run now, so let's use that one :-)

So I've moved it in place. It's on
http://git.postgresql.org/gitweb?p=postgresql-migration.git. Git
access available at
git://git.postgresql.org/git/postgresql-migration.git.

Committers can (and should! please test!) clone from git clone
ssh://g...@gitmaster.postgresql.org/postgresql.git.

Please do *NOT* commit or push anything to this repository yet though:
The repo is there - all the scripts to manage it are *not*. So don't
commit until I confirm that it is.

But please clone and verify the stuff we have now.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Do we need a ShmList implementation?

2010-09-20 Thread Kevin Grittner
Tom Lane  wrote:
 
> There's nothing vestigial about SHM_QUEUE --- it's used by the
> lock manager.  But it's intended to link together structs whose
> existence is managed by somebody else.
 
Yep, that's exactly my problem.
 
> I'm not excited about inventing an API with just one use-case;
> it's unlikely that you actually end up with anything generally
> useful.  (SHM_QUEUE seems like a case in point...)  Especially
> when there are so many other constraints on what shared memory is
> usable for.  You might as well just do this internally to the
> SERIALIZABLEXACT management code.
 
Fair enough.  I'll probably abstract it within the SSI patch anyway,
just because it will keep the other code cleaner where the logic is
necessarily kinda messy anyway, and I think it'll reduce the chance
of weird memory bugs.  I just won't get quite so formal about the
interface.
 
-Kevin



Re: [HACKERS] Do we need a ShmList implementation?

2010-09-20 Thread Simon Riggs
On Mon, 2010-09-20 at 12:35 -0400, Tom Lane wrote:
> "Kevin Grittner"  writes:
> > Simon Riggs  wrote:
> >> My understanding is that we used to have that and it was removed
> >> for the reasons Heikki states. There are still vestigial bits
> >> still in code.
> 
> There's nothing vestigial about SHM_QUEUE --- it's used by the lock
> manager. 

Yes, I was talking about an implementation that allocated memory as
well. There are sections of IFDEF'd out code there...

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services




Re: [HACKERS] compile/install of git

2010-09-20 Thread Andrew Dunstan



On 09/20/2010 12:24 PM, Mark Wong wrote:

On Sat, Sep 18, 2010 at 7:59 AM, Bruce Momjian  wrote:


Well, I can run tests for folks before they apply a patch and "red" the
build farm.  I can also research fixes easier because I am using the OS,
rather than running blind tests.  I am just telling you what people told
me.

I've been slowly trying to rebuild something that was in use at the
OSDL to test patches.  I just proofed something that I think works
with the git repository:

http://207.173.203.223:5000/patch/show/48

If you click on the PASS or FAIL text, it will display the SHA1,
author and commit message that the patch was applied to.  Think this
will be useful?



The issue has always been how much we want to ask people to trust code 
that is not committed. My answer is "not at all." Reviewers and 
committers will presumably eyeball the code before trying to compile/run 
it, but any automated system of code testing for uncommitted code is way 
too risky, IMNSHO.


cheers

andrew



Re: [HACKERS] compile/install of git

2010-09-20 Thread Robert Haas
On Mon, Sep 20, 2010 at 12:24 PM, Mark Wong  wrote:
> On Sat, Sep 18, 2010 at 7:59 AM, Bruce Momjian  wrote:
>> Andrew Dunstan wrote:
>>>
>>>
>>> On 09/18/2010 10:22 AM, Bruce Momjian wrote:
>>> > Dave Page wrote:
>>> >> On Fri, Sep 17, 2010 at 10:02 PM, Bruce Momjian  wrote:
>>> >>> FYI, I have compiled/installed git 1.7.3.rc2 on my BSD/OS 4.3.1 machine
>>> >>> with the attached minor changes.
>>> >> I thought you were replacing that old thing with pile of hardware that
>>> >> Matthew was putting together?
>>> > Matthew was busy this summer so I am going to try to get some of his
>>> > time by January to switch to Ubuntu.  And some people are complaining we
>>> > will lose a BSD test machine once I switch.
>>> >
>>>
>>> Test machines belong in the buildfarm. And why would they complain about
>>> losing a machine running a totally out of date and unsupported OS? Maybe
>>> you should run BeOS instead.
>>
>> Well, I can run tests for folks before they apply a patch and "red" the
>> build farm.  I can also research fixes easier because I am using the OS,
>> rather than running blind tests.  I am just telling you what people told
>> me.
>
> I've been slowly trying to rebuild something that was in use at the
> OSDL to test patches.  I just proofed something that I think works
> with the git repository:
>
> http://207.173.203.223:5000/patch/show/48
>
> If you click on the PASS or FAIL text, it will display the SHA1,
> author and commit message that the patch was applied to.  Think this
> will be useful?

Seems interesting. You might need to take precautions against someone
uploading a trojan, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] Do we need a ShmList implementation?

2010-09-20 Thread Tom Lane
"Kevin Grittner"  writes:
> Simon Riggs  wrote:
>> My understanding is that we used to have that and it was removed
>> for the reasons Heikki states. There are still vestigial bits
>> in the code.

There's nothing vestigial about SHM_QUEUE --- it's used by the lock
manager.  But it's intended to link together structs whose existence
is managed by somebody else.

>> Not exactly impressed with the SHM_QUEUE stuff though, so I
>> appreciate the sentiment that Kevin expresses.
 
> So, if I just allocated a fixed memory space to provide an API
> similar to my previous post, does that sound reasonable to you?

I'm not excited about inventing an API with just one use-case; it's
unlikely that you actually end up with anything generally useful.
(SHM_QUEUE seems like a case in point...)  Especially when there are so
many other constraints on what shared memory is usable for.  You might
as well just do this internally to the SERIALIZABLEXACT management code.

regards, tom lane



Re: [HACKERS] Do we need a ShmList implementation?

2010-09-20 Thread Heikki Linnakangas

On 20/09/10 19:04, Kevin Grittner wrote:

Heikki Linnakangas  wrote:


In the SSI patch, you'd also need a way to insert an existing
struct into a hash table. You currently work around that by using
a hash element that contains only the hash key, and a pointer to
the SERIALIZABLEXACT struct. It isn't too bad I guess, but I find
it a bit confusing.


Hmmm...  Mucking with the hash table implementation to accommodate
that seems like it's a lot of work and risk for pretty minimal
benefit.  Are you sure it's worth it?


No, I'm not sure at all.


Well, we generally try to avoid dynamic structures in shared
memory, because shared memory can't be resized.


But don't HTAB structures go beyond their estimated sizes as needed?


Yes, but not in a very smart way. The memory allocated for hash table 
elements are never free'd. So if you use up all the "slush fund" shared 
memory for SIREAD locks, it can't be used for anything else anymore, 
even if the SIREAD locks are later released.



Any chance of collapsing together entries of already-committed
transactions in the SSI patch, to put an upper limit on the number
of shmem list entries needed? If you can do that, then a simple
array allocated at postmaster startup will do fine.


I suspect it can be done, but I'm quite sure that any such scheme
would increase the rate of serialization failures.  Right now I'm
trying to see how much I can do to *decrease* the rate of
serialization failures, so I'm not eager to go there.  :-/


I see. It's worth spending some mental power on; an upper limit would 
make life a lot easier. It doesn't matter much if it's 2*max_connections 
or 100*max_connections, as long as it's finite.



If it is
necessary, the most obvious way to manage this is just to force
cancellation of the oldest running serializable transaction and
running ClearOldPredicateLocks(), perhaps iterating, until we free
an entry to service the new request.


Hmm, that's not very appealing either. But perhaps it's still better 
than not letting any new transactions begin. We could say "snapshot 
too old" in the error message :-).


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] libpq changes for synchronous replication

2010-09-20 Thread Simon Riggs
On Fri, 2010-09-17 at 18:22 +0900, Fujii Masao wrote:
> On Fri, Sep 17, 2010 at 5:09 PM, Heikki Linnakangas
>  wrote:
> > That said, there's a few small things that can be progressed regardless of
> > the details of synchronous replication. There's the changes to trigger
> > failover with a signal, and it seems that we'll need some libpq changes to
> > allow acknowledgments to be sent back to the master regardless of the rest
> > of the design. We can discuss those in separate threads in parallel.
> 
> Agreed. The attached patch introduces new function which is used
> to send ACK back from walreceiver. The function sends a message
> to XLOG stream by calling PQputCopyData. Also I allowed PQputCopyData
> to be called even during COPY OUT.

Does this differ from Zoltan's code?

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services




Re: [HACKERS] Do we need a ShmList implementation?

2010-09-20 Thread Kevin Grittner
Simon Riggs  wrote:
 
> My understanding is that we used to have that and it was removed
> for the reasons Heikki states. There are still vestigial bits
> in the code.
> 
> Not exactly impressed with the SHM_QUEUE stuff though, so I
> appreciate the sentiment that Kevin expresses.
 
So, if I just allocated a fixed memory space to provide an API
similar to my previous post, does that sound reasonable to you?  For
the record, my intention would be to hide the SHM_QUEUE structures
in this API -- an entry would be just the structure you're
interested in working with.  If practical, I would prefer for
ShmList to be a pointer to an opaque structure; users of this
shouldn't really be exposed to or depend upon the implementation.
 
-Kevin



Re: [HACKERS] compile/install of git

2010-09-20 Thread Mark Wong
On Sat, Sep 18, 2010 at 7:59 AM, Bruce Momjian  wrote:
> Andrew Dunstan wrote:
>>
>>
>> On 09/18/2010 10:22 AM, Bruce Momjian wrote:
>> > Dave Page wrote:
>> >> On Fri, Sep 17, 2010 at 10:02 PM, Bruce Momjian  wrote:
>> >>> FYI, I have compiled/installed git 1.7.3.rc2 on my BSD/OS 4.3.1 machine
>> >>> with the attached minor changes.
>> >> I thought you were replacing that old thing with a pile of hardware that
>> >> Matthew was putting together?
>> > Matthew was busy this summer so I am going to try to get some of his
>> > time by January to switch to Ubuntu.  And some people are complaining we
>> > will lose a BSD test machine once I switch.
>> >
>>
>> Test machines belong in the buildfarm. And why would they complain about
>> losing a machine running a totally out of date and unsupported OS? Maybe
>> you should run BeOS instead.
>
> Well, I can run tests for folks before they apply a patch and "red" the
> build farm.  I can also research fixes easier because I am using the OS,
> rather than running blind tests.  I am just telling you what people told
> me.

I've been slowly trying to rebuild something that was in use at the
OSDL to test patches.  I just proofed something that I think works
with the git repository:

http://207.173.203.223:5000/patch/show/48

If you click on the PASS or FAIL text, it will display the SHA1,
author and commit message that the patch was applied to.  Think this
will be useful?

Mark



Re: [HACKERS] Do we need a ShmList implementation?

2010-09-20 Thread Simon Riggs
On Mon, 2010-09-20 at 18:37 +0300, Heikki Linnakangas wrote:

> > SHM_QUEUE objects provide the infrastructure for maintaining a
> > shared memory linked list, but they don't do anything about the
> > allocation and release of the space for the objects.  So it occurs
> > to me that I'm using an HTAB for this collection because it provides
> > the infrastructure for managing the memory for the collection,
> > rather than because I need hash lookup.  :-(  It works, but that
> > hardly seems optimal.
> 
> > Have I missed something we already have which could meet that need?
> 
> Well, we generally try to avoid dynamic structures in shared memory, 
> because shared memory can't be resized. So, you'd typically use an array 
> with a fixed number of elements. One could even argue that we 
> specifically *don't* want to have the kind of infrastructure you 
> propose, to discourage people from writing patches that need dynamic 
> shmem structures.

My understanding is that we used to have that and it was removed for the
reasons Heikki states. There are still vestigial bits in the code.

Not exactly impressed with the SHM_QUEUE stuff though, so I appreciate
the sentiment that Kevin expresses.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services




Re: [HACKERS] libpq changes for synchronous replication

2010-09-20 Thread Tom Lane
Heikki Linnakangas  writes:
> It doesn't feel right to always accept PQputCopyData in COPY OUT mode, 
> though. IMHO there should be a new COPY IN+OUT mode.

Yeah, I was going to make the same complaint.  Breaking basic
error-checking functionality in libpq is not very acceptable.

> It should be pretty safe to add a CopyInOutResponse message to the 
> protocol without a protocol version bump. Thoughts on that?

Not if it's something that an existing application might see.  If
it can only happen in replication mode it's OK.

Personally I think this demonstrates that piggybacking replication
data transfer on the COPY protocol was a bad design to start with.
It's probably time to split them apart.

regards, tom lane



Re: [HACKERS] Do we need a ShmList implementation?

2010-09-20 Thread Kevin Grittner
Markus Wanner  wrote:
 
> On 09/20/2010 05:12 PM, Kevin Grittner wrote:
>> SHM_QUEUE objects provide the infrastructure for maintaining a
>> shared memory linked list, but they don't do anything about the
>> allocation and release of the space for the objects.
> 
> Did you have a look at my dynshmem stuff? It tries to solve the
> problem of dynamic allocation from shared memory. Not just for
> lists, but very generally.
 
Yeah, I mostly followed that thread.  If such a feature was present,
it might well make sense to use it for this; however, I've got
enough trouble selling the SSI technology without making it
dependent on something else which was clearly quite controversial,
and which seemed to have some technical hurdles of its own left to
clear.  :-/
 
At the point where there is an implementation which is accepted by
the community, I'll certainly take another look.
 
-Kevin



Re: [HACKERS] Do we need a ShmList implementation?

2010-09-20 Thread Kevin Grittner
Heikki Linnakangas  wrote:
 
> In the SSI patch, you'd also need a way to insert an existing
> struct into a hash table. You currently work around that by using
> a hash element that contains only the hash key, and a pointer to
> the SERIALIZABLEXACT struct. It isn't too bad I guess, but I find
> it a bit confusing.
 
Hmmm...  Mucking with the hash table implementation to accommodate
that seems like it's a lot of work and risk for pretty minimal
benefit.  Are you sure it's worth it?  Perhaps better commenting
around the SERIALIZABLEXID structure to indicate it's effectively
used as a non-primary-key index into the other collection?
 
> Well, we generally try to avoid dynamic structures in shared
> memory, because shared memory can't be resized.
 
But don't HTAB structures go beyond their estimated sizes as needed?
I was trying to accommodate the situation where one collection
might not be anywhere near its limit, but some other collection has
edged past.  Unless I'm misunderstanding things (which is always
possible), the current HTAB implementation takes advantage of the
"slush fund" of unused space to some degree.  I was just trying to
maintain the same flexibility with the list.
 
I was thinking of returning a size based on the *maximum* allowed
allocations from the estimated size function, and actually limiting
it to that size.  So it wasn't so much a matter of grabbing more
than expected, but leaving something for the hash table slush if
possible.  Of course I was also thinking that this would allow one
to be a little bit more generous with the maximum, as it might have
benefit elsewhere...
 
> So, you'd typically use an array with a fixed number of elements.
 
That's certainly a little easier, if you think it's better.
 
> Any chance of collapsing together entries of already-committed 
> transactions in the SSI patch, to put an upper limit on the number
> of shmem list entries needed? If you can do that, then a simple
> array allocated at postmaster startup will do fine.
 
I suspect it can be done, but I'm quite sure that any such scheme
would increase the rate of serialization failures.  Right now I'm
trying to see how much I can do to *decrease* the rate of
serialization failures, so I'm not eager to go there.  :-/  If it is
necessary, the most obvious way to manage this is just to force
cancellation of the oldest running serializable transaction and
running ClearOldPredicateLocks(), perhaps iterating, until we free
an entry to service the new request.
 
-Kevin



Re: [HACKERS] Do we need a ShmList implementation?

2010-09-20 Thread Markus Wanner
Kevin,

On 09/20/2010 05:12 PM, Kevin Grittner wrote:
> SHM_QUEUE objects provide the infrastructure for maintaining a
> shared memory linked list, but they don't do anything about the
> allocation and release of the space for the objects.

Did you have a look at my dynshmem stuff? It tries to solve the problem
of dynamic allocation from shared memory. Not just for lists, but very
generally.

Regards

Markus Wanner





Re: [HACKERS] Do we need a ShmList implementation?

2010-09-20 Thread Heikki Linnakangas

On 20/09/10 18:12, Kevin Grittner wrote:

On the Serializable Snapshot Isolation thread, Heikki pointed out a
collection of objects in an HTAB which didn't really need its key on
VirtualTransactionId, but there isn't really any other useful key,
either.  One of these objects may live and die, seeing use from
multiple processes, without ever getting a TransactionId assigned;
and it needs to be in a collection in shared memory the whole time.
This suggests to me that some sort of list would be better.


In the SSI patch, you'd also need a way to insert an existing struct 
into a hash table. You currently work around that by using a hash 
element that contains only the hash key, and a pointer to the 
SERIALIZABLEXACT struct. It isn't too bad I guess, but I find it a bit 
confusing.



SHM_QUEUE objects provide the infrastructure for maintaining a
shared memory linked list, but they don't do anything about the
allocation and release of the space for the objects.  So it occurs
to me that I'm using an HTAB for this collection because it provides
the infrastructure for managing the memory for the collection,
rather than because I need hash lookup.  :-(  It works, but that
hardly seems optimal.



Have I missed something we already have which could meet that need?


Well, we generally try to avoid dynamic structures in shared memory, 
because shared memory can't be resized. So, you'd typically use an array 
with a fixed number of elements. One could even argue that we 
specifically *don't* want to have the kind of infrastructure you 
propose, to discourage people from writing patches that need dynamic 
shmem structures.


Any chance of collapsing together entries of already-committed 
transactions in the SSI patch, to put an upper limit on the number of 
shmem list entries needed? If you can do that, then a simple array 
allocated at postmaster startup will do fine.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] What happened to the is_ family of functions proposal?

2010-09-20 Thread Colin 't Hart
On 20 September 2010 16:54, Andrew Dunstan  wrote:
>
>
> On 09/20/2010 10:29 AM, Colin 't Hart wrote:
>>
>> Hi,
>>
>> Back in 2002 these were proposed, what happened to them?
>>
>> http://archives.postgresql.org/pgsql-sql/2002-09/msg00406.php
>
>
> 2002 is a long time ago.



> I think to_date is the wrong gadget to use here. You should probably be using 
> the date input routine and trapping any data exception. e.g.:
>
>    test_date := date_in(textout(some_text));
>
> In plpgsql you'd put that inside a begin/exception/end block that traps 
> SQLSTATE '22000' which is the class covering data exceptions.

So it's not possible using pure SQL unless one writes a function?

Are the is_ family of functions still desired?

Also, where are the to_ conversions done?

Thanks,

Colin



Re: [HACKERS] bg worker: general purpose requirements

2010-09-20 Thread Markus Wanner
Hi,

On 09/18/2010 05:21 AM, Robert Haas wrote:
> Wow, 100 processes??! Really?  I guess I don't actually know how large
> modern proctables are, but on my MacOS X machine, for example, there
> are only 75 processes showing up right now in "ps auxww".  My Fedora
> 12 machine has 97.  That's including a PostgreSQL instance in the
> first case and an Apache instance in the second case.  So 100 workers
> seems like a ton to me.

Well, Apache pre-forks 5 processes in total (by default, that is; for
high-volume webservers a higher MinSpareServers setting is certainly not
out of the question). The bgworkers patch, in contrast, currently needs
to fork min_spare_background_workers processes per database.

AIUI, that's the main problem with the current architecture.

>> I haven't measured the actual time it takes, but given the use case of a
>> connection pool, I so far thought it's obvious that this process takes too
>> long.
> 
> Maybe that would be a worthwhile exercise...

On my laptop I'm measuring around 18 bgworker starts per second, i.e.
roughly 50 ms per bgworker start. That's certainly just a ball-park figure.

One could parallelize the communication channel between the coordinator
and postmaster, so as to be able to start multiple bgworkers in
parallel, but the initial latency remains.

It's certainly quick enough for autovacuum. But equally certainly not
acceptable for Postgres-R, where latency is the worst enemy in the first
place.

For autonomous transactions and parallel querying, I'd also say that I'd
rather not like to have such a latency.

> I think the kicker here is the idea of having a certain number of
> extra workers per database.

Agreed, but I don't see any better way. Short of a re-connecting feature.

> So
> if you knew you only had 1 database, keeping around 2 or 3 or 5 or
> even 10 workers might seem reasonable, but since you might have 1
> database or 1000 databases, it doesn't.  Keeping 2 or 3 or 5 or 10
> workers TOTAL around could be reasonable, but not per-database.  As
> Tom said upthread, we don't want to assume that we're the only thing
> running on the box and are therefore entitled to take up all the
> available memory/disk/process slots/whatever.  And even if we DID feel
> so entitled, there could be hundreds of databases, and it certainly
> doesn't seem practical to keep 1000 workers around "just in case".

Agreed. Looks like Postgres-R has a slightly different focus, because if
you need multi-master replication, you probably don't have 1000s of
databases and/or lots of other services on the same machine.

> I don't know whether an idle Apache worker consumes more or less
> memory than an idle PostgreSQL worker, but another difference between
> the Apache case and the PostgreSQL case is that presumably all those
> backend processes have attached shared memory and have ProcArray
> slots.  We know that code doesn't scale terribly well, especially in
> terms of taking snapshots, and that's one reason why high-volume
> PostgreSQL installations pretty much require a connection pooler.  I
> think the sizes of the connection pools I've seen recommended are
> considerably smaller than 100, more like 2 * CPUs + spindles, or
> something like that.  It seems like if you actually used all 100
> workers at the same time performance might be pretty awful.

Sounds reasonable, yes.

> I was taking a look at the Mammoth Replicator code this week
> (parenthetical note: I couldn't figure out where mcp_server was or how
> to set it up) and it apparently has a limitation that only one
> database in the cluster can be replicated.  I'm a little fuzzy on how
> Mammoth works, but apparently this problem of scaling to large numbers
> of databases is not unique to Postgres-R.

Postgres-R is able to replicate multiple databases. Maybe not thousands,
but still designed for it.

> What is the granularity of replication?  Per-database?  Per-table?

Currently per-cluster (i.e. all your databases at once).

> How do you accumulate the change sets?

Logical changes get collected at the heapam level. They get serialized
and streamed (via imessages and a group communication system) to all
nodes. Application of change sets is highly parallelized and should be
pretty efficient. Commit ordering is decided by the GCS to guarantee
consistency across all nodes, conflicts get resolved by aborting the
later transaction.

> Some kind of bespoke hook, WAL scanning, ...?

No hooks, please!  ;-)

Regards

Markus Wanner



[HACKERS] Do we need a ShmList implementation?

2010-09-20 Thread Kevin Grittner
On the Serializable Snapshot Isolation thread, Heikki pointed out a
collection of objects in an HTAB which didn't really need its key on
VirtualTransactionId, but there isn't really any other useful key,
either.  One of these objects may live and die, seeing use from
multiple processes, without ever getting a TransactionId assigned;
and it needs to be in a collection in shared memory the whole time. 
This suggests to me that some sort of list would be better.
 
SHM_QUEUE objects provide the infrastructure for maintaining a
shared memory linked list, but they don't do anything about the
allocation and release of the space for the objects.  So it occurs
to me that I'm using an HTAB for this collection because it provides
the infrastructure for managing the memory for the collection,
rather than because I need hash lookup.  :-(  It works, but that
hardly seems optimal.
 
Have I missed something we already have which could meet that need? 
If not, how would people feel about a ShmList implementation?  A
quick first draft for the API (which can almost certainly be
improved, so don't be shy), is:
 
ShmList ShmInitList(const char *name,
Size entrySize,
int initialEntryAlloc,
int maxExtensions);
Size ShmListEstimateSize(ShmList list);
void *CreateShmListEntry(ShmList list);
void ReleaseShmListEntry(ShmList list, void *entry);
int ShmListSize(ShmList list);
void *ShmListFirst(ShmList list);
void *ShmListNext(ShmList list, void *entry);
 
I see this as grabbing the initial allocation, filling it with
zeros, and then creating a linked list of available entries. 
Internally the entries would be a SHM_QUEUE structure followed by
space for the entrySize passed on init.  A "create entry" call would
remove an entry from the available list, link it into the
collection, and return a pointer to the structure.  Releasing an
entry would remove it from the collection list, zero it, and link it
to the available list.  Hopefully the rest is fairly self-evident --
if not, let me know.
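For what it's worth, the create/release cycle described above can be
sketched in plain user-space C roughly like this. The names are
illustrative stand-ins, not the proposed ShmList API; a real version
would live in shared memory behind a lock, with SHM_QUEUE playing the
role of ListNode:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define POOL_ENTRIES 4          /* stands in for the initial allocation */

/* Stand-in for SHM_QUEUE: a link in a circular doubly-linked list. */
typedef struct ListNode
{
    struct ListNode *prev;
    struct ListNode *next;
} ListNode;

/* An entry: embedded links followed by the caller's payload. */
typedef struct Entry
{
    ListNode    links;
    int         payload;
} Entry;

typedef struct ShmListDemo
{
    ListNode    active;                 /* entries handed out to callers */
    ListNode    freelist;               /* zeroed entries available for reuse */
    Entry       pool[POOL_ENTRIES];     /* fixed pool, sized at "startup" */
    int         nused;
} ShmListDemo;

static void
node_init(ListNode *n)
{
    n->prev = n->next = n;
}

static void
node_insert(ListNode *head, ListNode *n)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

static void
node_remove(ListNode *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->prev = n->next = n;
}

/* "ShmInitList": zero the pool and thread every entry onto the free list. */
static void
demo_init(ShmListDemo *list)
{
    int         i;

    memset(list, 0, sizeof(*list));
    node_init(&list->active);
    node_init(&list->freelist);
    for (i = 0; i < POOL_ENTRIES; i++)
        node_insert(&list->freelist, &list->pool[i].links);
}

/* "CreateShmListEntry": unlink from the free list, link into the collection. */
static Entry *
demo_create(ShmListDemo *list)
{
    ListNode   *n = list->freelist.next;

    if (n == &list->freelist)
        return NULL;            /* pool exhausted */
    node_remove(n);
    node_insert(&list->active, n);
    list->nused++;
    return (Entry *) n;         /* links is the first member, so this is safe */
}

/* "ReleaseShmListEntry": unlink, zero the payload, return to the free list. */
static void
demo_release(ShmListDemo *list, Entry *entry)
{
    node_remove(&entry->links);
    entry->payload = 0;
    node_insert(&list->freelist, &entry->links);
    list->nused--;
}
```

Releasing an entry recycles it through the free list, so the pool never
grows past its fixed initial size.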
 
Thoughts?
 
-Kevin



Re: [HACKERS] bg worker: general purpose requirements

2010-09-20 Thread Markus Wanner
On 09/18/2010 05:43 AM, Tom Lane wrote:
> The part of that that would worry me is open files.  PG backends don't
> have any compunction about holding open hundreds of files.  Apiece.
> You can dial that down but it'll cost you performance-wise.  Last
> I checked, most Unix kernels still had limited-size FD arrays.

Thank you very much, that's a helpful hint.

I did some quick testing and managed to fork up to around 2000 backends,
at which point my (laptop) system got unresponsive. To be honest, that
really surprises me.

(I had to increase the SHM and SEM kernel limits to be able to start
Postgres with that many processes at all. Obviously, Linux doesn't seem
to like that... on a second test I got a kernel panic)

> And as you say, ProcArray manipulations aren't going to be terribly
> happy about large numbers of idle backends, either.

Very understandable, yes.

Regards

Markus Wanner



Re: [HACKERS] libpq changes for synchronous replication

2010-09-20 Thread Heikki Linnakangas

On 17/09/10 12:22, Fujii Masao wrote:

On Fri, Sep 17, 2010 at 5:09 PM, Heikki Linnakangas
  wrote:

That said, there's a few small things that can be progressed regardless of
the details of synchronous replication. There's the changes to trigger
failover with a signal, and it seems that we'll need some libpq changes to
allow acknowledgments to be sent back to the master regardless of the rest
of the design. We can discuss those in separate threads in parallel.


Agreed. The attached patch introduces new function which is used
to send ACK back from walreceiver. The function sends a message
to XLOG stream by calling PQputCopyData. Also I allowed PQputCopyData
to be called even during COPY OUT.


Oh, that's simple.

It doesn't feel right to always accept PQputCopyData in COPY OUT mode, 
though. IMHO there should be a new COPY IN+OUT mode.


It should be pretty safe to add a CopyInOutResponse message to the 
protocol without a protocol version bump. Thoughts on that?


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] What happened to the is_ family of functions proposal?

2010-09-20 Thread Andrew Dunstan



On 09/20/2010 10:29 AM, Colin 't Hart wrote:

Hi,

Back in 2002 these were proposed, what happened to them?

http://archives.postgresql.org/pgsql-sql/2002-09/msg00406.php



2002 is a long time ago.




Also I note:

co...@ruby:~/workspace/eyedb$ psql
psql (8.4.4)
Type "help" for help.

colin=> select to_date('731332', 'YYMMDD');
  to_date
------------
 1974-02-01
(1 row)

colin=>


The fact that this wraps would seem to me to make the implementation 
of is_date() difficult.






I think to_date is the wrong gadget to use here. You should probably be 
using the date input routine and trapping any data exception. e.g.:


test_date := date_in(textout(some_text));

In plpgsql you'd put that inside a begin/exception/end block that traps 
SQLSTATE '22000' which is the class covering data exceptions.
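A minimal sketch of what an is_date() built along those lines might look
like (the function name is hypothetical, not something in core; the cast
s::date stands in for the date_in(textout(...)) call, and data_exception
is the condition name for the SQLSTATE class '22000' mentioned above):

```sql
CREATE OR REPLACE FUNCTION is_date(s text) RETURNS boolean AS $$
BEGIN
    PERFORM s::date;            -- goes through the date input routine
    RETURN true;
EXCEPTION
    WHEN data_exception THEN    -- SQLSTATE class '22000'
        RETURN false;
END;
$$ LANGUAGE plpgsql STABLE;
```

Note it is marked STABLE rather than IMMUTABLE, since what date_in
accepts depends on the DateStyle setting.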


cheers

andrew



[HACKERS] What happened to the is_ family of functions proposal?

2010-09-20 Thread Colin 't Hart
Hi,

Back in 2002 these were proposed, what happened to them?

http://archives.postgresql.org/pgsql-sql/2002-09/msg00406.php


Also I note:

co...@ruby:~/workspace/eyedb$ psql
psql (8.4.4)
Type "help" for help.

colin=> select to_date('731332', 'YYMMDD');
  to_date
------------
 1974-02-01
(1 row)

colin=>


The fact that this wraps would seem to me to make the implementation of
is_date() difficult.


I'm trying to query character strings for valid dates but can't see how to
do this quickly... but for that discussion I will move to pgsql-general :-)

Cheers,

Colin


Re: [HACKERS] Serializable Snapshot Isolation

2010-09-20 Thread Kevin Grittner
I wrote:
> Heikki Linnakangas  wrote:
> 
>> ISTM you never search the SerializableXactHash table using a hash
>> key, except the one call in CheckForSerializableConflictOut, but
>> there you already have a pointer to the SERIALIZABLEXACT struct.
>> You only re-find it to make sure it hasn't gone away while you
>> trade the shared lock for an exclusive one. If we find another
>> way to ensure that, ISTM we don't need SerializableXactHash at
>> all. My first thought was to forget about VirtualTransactionId
>> and use TransactionId directly as the hash key for
>> SERIALIZABLEXACT. The problem is that a transaction doesn't have
>> a transaction ID when RegisterSerializableTransaction is called.
>> We could leave the TransactionId blank and only add the
>> SERIALIZABLEXACT struct to the hash table when an XID is
>> assigned, but there's no provision to insert an existing struct
>> to a hash table in the current hash table API.
>>
>> So, I'm not sure of the details yet, but it seems like it could
>> be made simpler somehow..
> 
> After tossing it around in my head for a bit, the only thing that
> I see (so far) which might work is to maintain a *list* of
> SERIALIZABLEXACT objects in memory rather than a using a hash
> table.  The recheck after releasing the shared lock and acquiring
> an exclusive lock would then go through SerializableXidHash.  I
> think that can work, although I'm not 100% sure that it's an
> improvement.  I'll look it over in more detail.  I'd be happy to
> hear your thoughts on this or any other suggestions.
 
I haven't come up with any better ideas.  Pondering this one, it
seems to me that a list would be better than a hash table if we had
a list which would automatically allocate and link new entries, and
would maintain a list of available entries for (re)use.  I wouldn't
want to sprinkle such an implementation in with predicate locking
and SSI code, but if there is a feeling that such a thing would be
worth having in shmqueue.c or some new file which uses the SHM_QUEUE
structure to provide an API for such functionality, I'd be willing
to write that and use it in the SSI code.  Without something like
that, I have so far been unable to envision an improvement along the
lines Heikki is suggesting here.
 
Thoughts?
 
-Kevin



[HACKERS] Configuring Text Search parser?

2010-09-20 Thread jesper
Hi.

I'm trying to migrate an application off an existing full text search engine
and onto PostgreSQL. One of my main (remaining) headaches is the
fact that PostgreSQL treats _ as a separation character, whereas the existing
behaviour is to "not split". That means:

testdb=# select ts_debug('database_tag_number_999');
   ts_debug
------------------------------------------------------------------------------
 (asciiword,"Word, all ASCII",database,{english_stem},english_stem,{databas})
 (blank,"Space symbols",_,{},,)
 (asciiword,"Word, all ASCII",tag,{english_stem},english_stem,{tag})
 (blank,"Space symbols",_,{},,)
 (asciiword,"Word, all ASCII",number,{english_stem},english_stem,{number})
 (blank,"Space symbols",_,{},,)
 (uint,"Unsigned integer",999,{simple},simple,{999})
(7 rows)

Where the incoming data, by design, contains a set of tags which include _
and are expected to be one "lexeme".

I've tried patching my way around this with the patch below.

$ diff -w -C 5 src/backend/tsearch/wparser_def.c.orig src/backend/tsearch/wparser_def.c
*** src/backend/tsearch/wparser_def.c.orig  2010-09-20 15:58:37.06460 +0200
--- src/backend/tsearch/wparser_def.c   2010-09-20 15:58:41.193335577 +0200
***
*** 967,986 
--- 967,988 

  static const TParserStateActionItem actionTPS_InNumWord[] = {
{p_isEOF, 0, A_BINGO, TPS_Base, NUMWORD, NULL},
{p_isalnum, 0, A_NEXT, TPS_InNumWord, 0, NULL},
{p_isspecial, 0, A_NEXT, TPS_InNumWord, 0, NULL},
+   {p_iseqC, '_', A_NEXT, TPS_InNumWord, 0, NULL},
{p_iseqC, '@', A_PUSH, TPS_InEmail, 0, NULL},
{p_iseqC, '/', A_PUSH, TPS_InFileFirst, 0, NULL},
{p_iseqC, '.', A_PUSH, TPS_InFileNext, 0, NULL},
{p_iseqC, '-', A_PUSH, TPS_InHyphenNumWordFirst, 0, NULL},
{NULL, 0, A_BINGO, TPS_Base, NUMWORD, NULL}
  };

  static const TParserStateActionItem actionTPS_InAsciiWord[] = {
{p_isEOF, 0, A_BINGO, TPS_Base, ASCIIWORD, NULL},
{p_isasclet, 0, A_NEXT, TPS_Null, 0, NULL},
+   {p_iseqC, '_', A_NEXT, TPS_Null, 0, NULL},
{p_iseqC, '.', A_PUSH, TPS_InHostFirstDomain, 0, NULL},
{p_iseqC, '.', A_PUSH, TPS_InFileNext, 0, NULL},
{p_iseqC, '-', A_PUSH, TPS_InHostFirstAN, 0, NULL},
{p_iseqC, '-', A_PUSH, TPS_InHyphenAsciiWordFirst, 0, NULL},
{p_iseqC, '@', A_PUSH, TPS_InEmail, 0, NULL},
***
*** 995,1004 
--- 997,1007 

  static const TParserStateActionItem actionTPS_InWord[] = {
{p_isEOF, 0, A_BINGO, TPS_Base, WORD_T, NULL},
{p_isalpha, 0, A_NEXT, TPS_Null, 0, NULL},
{p_isspecial, 0, A_NEXT, TPS_Null, 0, NULL},
+   {p_iseqC, '_', A_NEXT, TPS_Null, 0, NULL},
{p_isdigit, 0, A_NEXT, TPS_InNumWord, 0, NULL},
{p_iseqC, '-', A_PUSH, TPS_InHyphenWordFirst, 0, NULL},
{NULL, 0, A_BINGO, TPS_Base, WORD_T, NULL}
  };



This will obviously break other people's applications, so my question is:
if this should be made configurable, how should it be done?

As a sidenote... Xapian doesn't split on _ .. Lucene does.

Thanks.

-- 
Jesper




Re: [HACKERS] Configuring synchronous replication

2010-09-20 Thread Robert Haas
On Mon, Sep 20, 2010 at 8:50 AM, Simon Riggs  wrote:
> Please respond to the main point: Following some thought and analysis,
> AFAICS there is no sensible use case that requires standby registration.

I disagree.  You keep analyzing away the cases that require standby
registration, but I believe they are real.  Aidan Van
Dyk's case upthread of wanting to make sure that the standby is up and
replicating synchronously before the master starts processing
transactions seems perfectly legitimate to me.  Sure, it's paranoid,
but so what?  We're all about paranoia, at least as far as data loss
is concerned.  So the "wait forever" case is, in my opinion,
sufficient to demonstrate that we need it, but it's not even my
primary reason for wanting to have it.

The most important reason why I think we should have standby
registration is for simplicity of configuration.  Yes, it adds another
configuration file, but that configuration file contains ALL of the
information about which standbys are synchronous.  Without standby
registration, this information will inevitably be split between the
master config and the various slave configs and you'll have to look at
all the configurations to be certain you understand how it's going to
end up working.  As a particular manifestation of this, and as
previously argued and +1'd upthread, the ability to change the set of
standbys to which the master is replicating synchronously without
changing the configuration on the master or any of the existing slaves
seems dangerous.

Another reason why I think we should have standby registration is to
eventually allow the "streaming WAL backwards" configuration
which has previously been discussed.  IOW, you could stream the WAL to
the slave in advance of fsync-ing it on the master.  After a power
failure, the machines in the cluster can talk to each other and figure
out which one has the furthest-advanced WAL pointer and stream from
that machine to all the others.  This is an appealing configuration
for people using sync rep because it would allow the fsyncs to be done
in parallel rather than sequentially as is currently necessary - but
if you're using it, you're certainly not going to want the master to
enter normal running without waiting to hear from the slave.

Just to be clear, that is a list of three independent reasons any one
of which I think is sufficient for wanting standby registration.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] Configuring synchronous replication

2010-09-20 Thread Heikki Linnakangas

On 20/09/10 15:50, Simon Riggs wrote:

On Mon, 2010-09-20 at 15:16 +0300, Heikki Linnakangas wrote:

On 20/09/10 12:17, Simon Riggs wrote:

err... what is the difference between a timeout and stonith?


STONITH ("Shoot The Other Node In The Head") means that the other node
is somehow disabled so that it won't unexpectedly come back alive. A
timeout means that the slave hasn't been seen for a while, but it might
reconnect just after the timeout has expired.


You've edited my reply to change the meaning of what was a rhetorical
question, as well as completely ignoring the main point of my reply.

Please respond to the main point: Following some thought and analysis,
AFAICS there is no sensible use case that requires standby registration.


Ok, I had completely missed your point then.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Configuring synchronous replication

2010-09-20 Thread Simon Riggs
On Mon, 2010-09-20 at 15:16 +0300, Heikki Linnakangas wrote:
> On 20/09/10 12:17, Simon Riggs wrote:
> > err... what is the difference between a timeout and stonith?
> 
> STONITH ("Shoot The Other Node In The Head") means that the other node 
> is somehow disabled so that it won't unexpectedly come back alive. A 
> timeout means that the slave hasn't been seen for a while, but it might 
> reconnect just after the timeout has expired.

You've edited my reply to change the meaning of what was a rhetorical
question, as well as completely ignoring the main point of my reply.

Please respond to the main point: Following some thought and analysis,
AFAICS there is no sensible use case that requires standby registration.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services




Re: [HACKERS] Configuring synchronous replication

2010-09-20 Thread Heikki Linnakangas

On 20/09/10 12:17, Simon Riggs wrote:

err... what is the difference between a timeout and stonith?


STONITH ("Shoot The Other Node In The Head") means that the other node 
is somehow disabled so that it won't unexpectedly come back alive. A 
timeout means that the slave hasn't been seen for a while, but it might 
reconnect just after the timeout has expired.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] pg_comments

2010-09-20 Thread Robert Haas
On Mon, Sep 20, 2010 at 1:07 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> In view of the foregoing problems, I'd like to propose adding a new
>> system view, tentatively called pg_comments, which lists all of the
>> comments for everything in the system in such a way that it's
>> reasonably possible to do further filtering of the output in ways
>> that you might care about; and which also gives objects the names and
>> types in a format that matches what the COMMENT command will accept as
>> input.  Patch attached.
>
> Unless you propose to break psql's hard-won backwards compatibility,
> this isn't going to accomplish anything towards making describe.c
> simpler or shorter. Also, it seems to me that what you've mostly done
> is to move complexity from describe.c (where the query can be fixed
> easily if it's found to be broken) to system_views.sql (where it cannot
> be changed without an initdb).

Those are legitimate gripes, but...

> How about improving the query in-place in describe.c instead?

...I still don't care much for this option.  It doesn't do anything to
ease the difficulty of ad-hoc queries, which I think is important (and
seems likely to be even more important for security labels - because
people who use that feature at all are going to label the heck out of
everything, whereas comments are never strictly necessary), and it
isn't useful for clients other than psql.  Most of this code hasn't
been touched since 2002, despite numerous, relevant changes since
then.  You could take as support for your position that we need the
ability to fix future bugs without initdb, but my reading of it is
that that code is just too awful to be easily maintained and so no one
has bothered.

(It also supports my previous contention that we need a way to make
minor system catalog updates without forcing initdb, but that's a
problem for another day.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] Report: removing the inconsistencies in our CVS->git conversion

2010-09-20 Thread Magnus Hagander
On Sun, Sep 19, 2010 at 18:52, Tom Lane  wrote:
> Andrew Dunstan  writes:
>> On 09/19/2010 12:25 PM, Tom Lane wrote:
>>> # We don't want to change line numbers, so we simply reduce the keyword
>>> # string to the file pathname part.  For example,
>>> # $PostgreSQL: pgsql/src/port/unsetenv.c,v 1.12 2010/09/07 14:10:30 momjian 
>>> Exp $
>>> # becomes
>>> # $PostgreSQL: pgsql/src/port/unsetenv.c,v 1.12 2010/09/07 14:10:30 momjian 
>>> Exp $
>
>> These before and after lines look identical to me.
>
> Sigh ... obviously didn't finish editing the comment :-(
> Of course the last line should read
>
> # src/port/unsetenv.c

I've applied those to my repo, and am now re-running a final
conversion before we do the "live one".


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Configuring synchronous replication

2010-09-20 Thread Simon Riggs
On Mon, 2010-09-20 at 09:27 +0300, Heikki Linnakangas wrote:
> On 18/09/10 22:59, Robert Haas wrote:
> > On Sat, Sep 18, 2010 at 4:50 AM, Simon Riggs  wrote:
> >> Waiting might sound attractive. In practice, waiting will make all of
> >> your connections lock up and it will look to users as if their master
> >> has stopped working as well. (It has!). I can't imagine why anyone would
> >> ever want an option to select that; its the opposite of high
> >> availability. Just sounds like a serious footgun.
> >
> > Nevertheless, it seems that some people do want exactly that behavior,
> > no matter how crazy it may seem to you.
> 
> Yeah, I agree with both of you. I have a hard time imagining a situation 
> where you would actually want that. It's not high availability, it's 
> high durability. When a transaction is acknowledged as committed, you 
> know it's never ever going to disappear even if a meteor strikes the 
> current master server within the next 10 milliseconds. In practice, 
> people want high availability instead.
> 
> That said, the timeout option also feels a bit wishy-washy to me. With a 
> timeout, acknowledgment of a commit means "your transaction is safely 
> committed in the master and slave. Or not, if there was some glitch with 
> the slave". That doesn't seem like a very useful guarantee; if you're 
> happy with that why not just use async replication?
> 
> However, the "wait forever" behavior becomes useful if you have a 
> monitoring application outside the DB that decides when enough is enough 
> and tells the DB that the slave can be considered dead. So "wait 
> forever" actually means "wait until I tell you that you can give up". 
> The monitoring application can STONITH to ensure that the slave stays 
> down, before letting the master proceed with the commit.

err... what is the difference between a timeout and stonith? None. We
still proceed without the slave in both cases after the decision point. 

In all cases, we would clearly have a user accessible function to stop
particular sessions, or all sessions, from waiting for standby to
return.

You would have 3 choices:
* set automatic timeout
* set wait forever and then wait for manual resolution
* set wait forever and then trust to external clusterware

Many people have asked for timeouts and I agree it's probably the
easiest thing to do if you just have 1 standby.

> With that in mind, we have to make sure that a transaction that's 
> waiting for acknowledgment of the commit from a slave is woken up if the 
> configuration changes.

There's a misunderstanding here of what I've said, and it's a subtle one.

My patch supports a timeout of 0, i.e. wait forever. Which means I agree
that functionality is desired and should be included. This operates by
saying that if a currently-connected-standby goes down we will wait
until the timeout. So I agree all 3 choices should be available to
users.

Discussion has been about what happens to ought-to-have-been-connected
standbys. Heikki had argued we need standby registration because if a
server *ought* to have been there, yet isn't currently there when we
wait for sync rep, we would still wait forever for it to return. To do
this you require standby registration.

But there is a hidden issue there: If you care about high availability
AND sync rep you have two standbys. If one goes down, the other is still
there. In general, if you want high availability on N servers then you
have N+1 standbys. If one goes down, the other standbys provide the
required level of durability and we do not wait.

So the only case where standby registration is required is where you
deliberately choose to *not* have N+1 redundancy and then yet still
require all N standbys to acknowledge. That is a suicidal config and
nobody would sanely choose that. It's not a large or useful use case for
standby reg. (But it does raise the question again of whether we need
quorum commit).

My take is that if the above use case occurs it is because one standby
has just gone down and the standby is, for a hopefully short period, in
a degraded state and that the service responds to that. So in my
proposal, if a standby is not there *now* we don't wait for it. 

Which cuts out a huge bag of code, specification and such like that
isn't required to support sane use cases. More stuff to get wrong and
regret in later releases. The KISS principle, just like we apply in all
other cases.

If we did have standby registration, then I would implement it in a
table, not in an external config file. That way when we performed a
failover the data would be accessible on the new master. But I don't
suggest we have CREATE/ALTER STANDBY syntax. We already have
CREATE/ALTER SERVER if we wanted to do it in SQL. If we did that, ISTM
we should choose functions.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services



Re: [HACKERS] pgxs docdir question

2010-09-20 Thread Dimitri Fontaine
Tom Lane  writes:
> Devrim GÜNDÜZ  writes:
>> Where does PGXS makefile get /usr/share/doc/pgsql/contrib directory
>> from?
>
>> While building 3rd party RPMs using PGXS, even if I specify docdir in
>> Makefile, README.* files are installed to this directory, which breaks
>> parallel installation path as of 9.0+
>
> Maybe you need to fool with MODULEDIR.  See
> http://archives.postgresql.org/pgsql-committers/2010-01/msg00025.php

Well, it's been working fine in Debian without that for a long time
now. I've taken the liberty of CCing Martin Pitt, because I don't have
the time to look at exactly how things are done in his Debian packaging
there.

  http://bazaar.launchpad.net/%7Epitti/postgresql/common/files
  https://code.launchpad.net/postgresql

Regards,
-- 
dim



Re: [HACKERS] Configuring synchronous replication

2010-09-20 Thread Markus Wanner

Hi,

On 09/17/2010 01:56 PM, Fujii Masao wrote:

And standby registration is required when we support "wait forever when
synchronous standby isn't connected at the moment" option that Heikki
explained upthread.


That requirement can be reduced to say that the master only needs to 
know how many synchronous standbys *should* be connected.


IIUC that's pretty much exactly the quorum_commit GUC that Simon 
proposed, because it doesn't make sense to have more synchronous 
standbys connected than quorum_commit (as Simon pointed out downthread).


I'm unsure about what's better, the full list (giving a good overview, 
but more to configure) or the single sum GUC (being very flexible and 
closer to how things work internally). But that seems to be a UI 
question exclusively.



Regarding the "wait forever" option: I don't think continuing is a 
viable alternative, as it silently ignores the requested level of 
persistence. The only alternative I can see is to abort with an error. 
If the comparison is allowed, that's what Postgres-R currently does 
if there's no majority of nodes. It allows emitting an error message and 
helpful hints, as opposed to letting the admin figure out what is hanging 
and where. Not throwing false errors has the same requirements as 
"waiting forever", so that's an orthogonal issue, IMO.


Regards

Markus Wanner



Re: [HACKERS] Postgres Licensing

2010-09-20 Thread Vaibhav Kaushal
You seem to be working for EnterpriseDB, which is a company specializing in
Postgres. So how does EnterpriseDB sell the Advanced Server? By modifying
it, I guess! That is similar to what I want to do. Getting a few
dollars for some hard work is not bad for me. Plus I love finding new
things, so it would be fun as well.

I would surely include the PostgreSQL licence in the product (if I get
that far) and sell it to a few people who are looking for some specific
features and are fed up with MySQL.

Thanks to both DAVE and Heikki :)

-Vaibhav (*_*)



On Sun, Sep 19, 2010 at 11:58 PM, Heikki Linnakangas <
heikki.linnakan...@enterprisedb.com> wrote:

> On 20/09/10 09:48, Vaibhav Kaushal wrote:
>
>> 1. PostgreSQL can be distributed freely according to the license terms.
>> Can
>> it be sold (for a price) without changing anything in the source?
>>
>
> Yes.
>
> You will have a hard time finding anyone to buy it, though, because you can
> download it for free from the PostgreSQL website.
>
>
>  2. Does the license restrict me from adding my closed source additions to
>> the project and then sell the product? I want to add in a few files here
>> and
>> there which would be closed source in nature, while all the changes made
>> to
>> the original files will be open, and then sell the modified database with
>> a
>> dual license. Is this possible?
>>
>
> In general, yes. I don't know what exactly you mean by the dual license,
> but you are free to mix proprietary code with the PostgreSQL sources, and
> sell or distribute for free the combined product with or without sources.
> The only requirement of the PostgreSQL license is that all copies must
> include the copyright notices and the license text.
>
> (Disclaimer: I am not a lawyer)
>
> --
>  Heikki Linnakangas
>  EnterpriseDB   http://www.enterprisedb.com
>


Re: [HACKERS] Postgres Licensing

2010-09-20 Thread Dave Page
On Mon, Sep 20, 2010 at 7:48 AM, Vaibhav Kaushal
 wrote:
> May be this is the wrong place to ask the question. Still, answer me if
> someone can or please redirect me to some place where it can be answered. My
> questions are:
>
> 1. PostgreSQL can be distributed freely according to the license terms. Can
> it be sold (for a price) without changing anything in the source?

Yes.

> 2. Does the license restrict me from adding my closed source additions to
> the project and then sell the product? I want to add in a few files here and
> there which would be closed source in nature, while all the changes made to
> the original files will be open, and then sell the modified database with a
> dual license. Is this possible?

You should check with your own counsel of course (I am not a lawyer),
but essentially the licence allows you to produce derivative
closed-source products and release them under different licences as
long as the terms of the original licence are met (which basically
means you can't sue UC Berkeley, or remove the original
licence/copyright notices).

> May be you guys are hard core OSS enthusiasts and may flame me. I request
> not to and please consider my question.

We like people building cool stuff with our code - and like the
freedom to do so that our licence allows.

-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise Postgres Company
