Re: [HACKERS] 9.4 regression

2013-08-07 Thread Andres Freund
On 2013-08-07 23:55:22 -0500, Jon Nelson wrote:
> > - enable log_checkpoints, post the lines spat out together with the results
> > - run pgbench without reinitializing the cluster/pgbench tables
> >   inbetween?
> 
> When I do this (as I note below), the performance difference (for me)
> disappears.

Ok. In this case the files have been written to and fsync()ed before the
test is run.

> > Basically, I'd like to know whether its the initialization that's slow
> > (measurable because we're regularly creating new files because of a too
> > low checkpoint_segments) or whether it's the actual writes.
> 
> What I've found so far is very confusing.
> I start by using initdb to initialize a temporary cluster, copy in a
> postgresql.conf (with the modifcations from Thom Brown tweaked for my
> small laptop), start the cluster, create a test database, initialize
> it with pgbench, and then run. I'm also only running for two minutes
> at this time.

Hm. In a 2 minutes run there won't be any checkpoints..

> Every time I test, the non-fallocate version is faster. I modifed
> xlog.c to use posix_fallocate (or not) based on an environment
> variable.
> Once the WAL files have been rewritten at least once, then it doesn't
> seem to matter which method is used to allocate them (although
> posix_fallocate seems to have a slight edge). I even went to far as to
> modify the code to posix_fallocate the file *and then zero it out
> anyway*, and it was almost as fast as zeroing alone, and faster than
> using posix_fallocate alone.

Ok, so it seems there's performance issues with allocating the data even
though the allocation supposedly has already been made. Access by
several backends may play a role here.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.4 regression

2013-08-07 Thread Jon Nelson
On Wed, Aug 7, 2013 at 10:05 PM, Andres Freund  wrote:
> On 2013-08-07 20:23:55 +0100, Thom Brown wrote:
>> >>> 269e78 was the commit immediately after 8800d8, so it appears that
>> >>> introduced the regression.
>> >>>
>> >>> "Use posix_fallocate() for new WAL files, where available."
>> >>
>> >> This is curious. Could you either run a longer test before/after the
>> >> commit or reduce checkpoint_timeout to something like 3min?
>> >
>> > Okay, I'll rerun the test for both those commits at 1 hour each with
>> > checkpoint_timeout set at 3mins, but with all other configuration
>> > settings the same
>>
>> Results
>> (checkpoint_timeout = 3min)
>>
>> pgbench -j 80 -c 80 -T 3600
>>
>> 269e78: 606.268013
>> 8800d8: 779.583129
>
> Ok, so the performance difference is lower. But, it seems to be still to
> high to be explaineable by performance differences during
> initialization/fallocate.
> In a very quick test, I don't see the same on my laptop. I am currently
> travelling and I don't have convenient access to anything else.
>
> Could you:
> - run filefrag on a couple of wal segments of both clusters after the
>   test and post the results here?

For me, there is no meaningful difference, but I have a relatively
fresh filesystem (ext4) with lots of free space.

> - enable log_checkpoints, post the lines spat out together with the results
> - run pgbench without reinitializing the cluster/pgbench tables
>   inbetween?

When I do this (as I note below), the performance difference (for me)
disappears.

> Basically, I'd like to know whether its the initialization that's slow
> (measurable because we're regularly creating new files because of a too
> low checkpoint_segments) or whether it's the actual writes.

What I've found so far is very confusing.
I start by using initdb to initialize a temporary cluster, copy in a
postgresql.conf (with the modifcations from Thom Brown tweaked for my
small laptop), start the cluster, create a test database, initialize
it with pgbench, and then run. I'm also only running for two minutes
at this time.

Every time I test, the non-fallocate version is faster. I modifed
xlog.c to use posix_fallocate (or not) based on an environment
variable.
Once the WAL files have been rewritten at least once, then it doesn't
seem to matter which method is used to allocate them (although
posix_fallocate seems to have a slight edge). I even went to far as to
modify the code to posix_fallocate the file *and then zero it out
anyway*, and it was almost as fast as zeroing alone, and faster than
using posix_fallocate alone.

>> Jon, here are the test results you asked for:
>>
>> $ for i in 1 2 5 10 100; do ./test_fallocate foo $i 1; done
>> method: classic. 1 open/close iterations, 1 rewrite in 0.9157s
>> method: posix_fallocate. 1 open/close iterations, 1 rewrite in 0.5314s
>> method: glibc emulation. 1 open/close iterations, 1 rewrite in 0.6018s
>> method: classic. 2 open/close iterations, 1 rewrite in 1.1417s
>> method: posix_fallocate. 2 open/close iterations, 1 rewrite in 0.6505s
>> method: glibc emulation. 2 open/close iterations, 1 rewrite in 1.8900s
>> method: classic. 5 open/close iterations, 1 rewrite in 3.6490s
>> method: posix_fallocate. 5 open/close iterations, 1 rewrite in 1.9841s
>> method: glibc emulation. 5 open/close iterations, 1 rewrite in 3.1053s
>> method: classic. 10 open/close iterations, 1 rewrite in 5.7562s
>> method: posix_fallocate. 10 open/close iterations, 1 rewrite in 3.2015s
>> method: glibc emulation. 10 open/close iterations, 1 rewrite in 7.1426s
>> method: classic. 100 open/close iterations, 1 rewrite in 64.9483s
>> method: posix_fallocate. 100 open/close iterations, 1 rewrite in 36.3748s
>> method: glibc emulation. 100 open/close iterations, 1 rewrite in 64.8386s
>
> Ok, this seems to indicate that fallocate works nicely. Jon, wasn't
> there a version of the test that rewrote the file afterwards?

Yes. If you use a different number besides '1' as the third argument
in the command line above, it will rewrite the file that many times.

-- 
Jon


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] StrategyGetBuffer optimization, take 2

2013-08-07 Thread Amit Kapila


> -Original Message-
> From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-
> ow...@postgresql.org] On Behalf Of Merlin Moncure
> Sent: Thursday, August 08, 2013 12:09 AM
> To: Andres Freund
> Cc: PostgreSQL-development; Jeff Janes
> Subject: Re: [HACKERS] StrategyGetBuffer optimization, take 2
> 
> On Wed, Aug 7, 2013 at 12:07 PM, Andres Freund 
> wrote:
> > On 2013-08-07 09:40:24 -0500, Merlin Moncure wrote:
> >> > I don't think the unlocked increment of nextVictimBuffer is a good
> idea
> >> > though. nextVictimBuffer jumping over NBuffers under concurrency
> seems
> >> > like a recipe for disaster to me. At the very, very least it will
> need a
> >> > good wad of comments explaining what it means and how you're
> allowed to
> >> > use it. The current way will lead to at least bgwriter accessing a
> >> > nonexistant/out of bounds buffer via StrategySyncStart().
> >> > Possibly it won't even save that much, it might just increase the
> >> > contention on the buffer header spinlock's cacheline.
> >>
> >> I agree; at least then it's not unambiguously better. if you (in
> >> effect) swap all contention on allocation from a lwlock to a
> spinlock
> >> it's not clear if you're improving things; it would have to be
> proven
> >> and I'm trying to keep things simple.
> >
> > I think converting it to a spinlock actually is a good idea, you just
> > need to expand the scope a bit.
> 
> all right: well, I'll work up another version doing full spinlock and
> maybe see things shake out in performance.
> 
> > FWIW, I am not convinced this is the trigger for the problems you're
> > seing. It's a good idea nonetheless though.
> 
> I have some very strong evidence that the problem is coming out of the
> buffer allocator.  Exhibit A is that vlad's presentation of the
> problem was on a read only load (if not allocator lock, then what?).
> Exhibit B is that lowering shared buffers to 2gb seems to have (so
> far, 5 days in) fixed the issue.  This problem shows up on fast
> machines with fast storage and lots of cores.  So what I think is
> happening is that usage_count starts creeping up faster than it gets
> cleared by the sweep with very large buffer settings which in turn
> causes the 'problem' buffers to be analyzed for eviction more often.

  Yes one idea which was discussed previously is to not increase usage
count, every time buffer is pinned.
  I am also working on some of the optimizations on similar area, which you
can refer here:
 
http://www.postgresql.org/message-id/006e01ce926c$c7768680$56639380$@kapila@
huawei.com


> What is not as clear is if the proposed optimizations will fix the
> problem -- I'd have to get approval to test and confirm them in
> production which seems unlikely at this juncture; that's why I'm
> trying to keep things 'win-win' so as to not have to have them be
> accepted on that basis.

With Regards,
Amit Kapila.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] confusing error message

2013-08-07 Thread Tom Lane
Peter Eisentraut  writes:
> I'm having trouble parsing this:
> ERROR:  aggregate functions are not allowed in FROM clause of their own query 
> level

> The example in the regression tests is:

> -- LATERAL can be used to put an aggregate into the FROM clause of its query
> select 1 from tenk1 a, lateral (select max(a.unique1) from int4_tbl b) ss;
> ERROR:  aggregate functions are not allowed in FROM clause of their own query 
> level
> LINE 1: select 1 from tenk1 a, lateral (select max(a.unique1) from i...
>^
> I think the "own query level" of the max aggregate function in this case
> is the subquery "ss", and so it's not in the FROM clause of its own
> query level.

No.  The max() aggregate function is on a column of table "a", so it
belongs to the outer query level, and it is within the FROM clause of
that level.  (Don't blame me, this is per SQL spec ...)

> It's understandable why this is not allowed, but I don't think the error
> message explains it.  Could we come up with a better wording?

I'm not sure either.  The main point here is that an aggregate's semantic
level is normally the natural level of its argument expression, not the
textual location of the call.  The "natural level" is the level of the
lowest-level variable used in the argument.  But what if the argument is
variable-free?  Then you *do* take the textual location as the semantic
level.  This is all sufficiently bizarre that I don't know if there's an
easy explanation.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Should we remove "not fast" promotion at all?

2013-08-07 Thread Michael Paquier
On Thu, Aug 8, 2013 at 12:24 PM, Andres Freund  wrote:
> On 2013-08-08 06:40:00 +0900, Michael Paquier wrote:
>> On Tue, Aug 6, 2013 at 8:05 PM, Tomonari Katsumata
>>  wrote:
>> > Hi,
>> >
>> > 2013/8/6 Tom Lane 
>> >>
>> >> Fujii Masao  writes:
>> >> > On Tue, Aug 6, 2013 at 11:40 AM, Andres Freund 
>> >> > wrote:
>> >> >> FWIW I'd rather keep plain promotion for a release or two. TBH, I have
>> >> >> a
>> >> >> bit of trust issues regarding the new method, and I'd like to be able
>> >> >> to
>> >> >> test potential issues against a stock postgres by doing a normal
>> >> >> instead
>> >> >> of a fast promotion.
>> >>
>> >> > So we should add new option specifying the promotion mode, into pg_ctl?
>> >> > Currently pg_ctl cannot trigger the normal promotion.
>> >>
>> >> It would be silly to add such an option if we want to remove the old mode
>> >> in a release or two.
>> >>
>> >> I think what Andres is suggesting is to leave it as-is for 9.4 and then
>> >> remove the old code in 9.5 or 9.6.  Which seems prudent to me.
>> >>
>> > How about giving trigger_file an ability to chose "fast promote" and 
>> > "normal
>> > promote"
>> > like the triggerfile of pg_standby.
>> > It means that if the contents of the trigger_file is empty or 'smart' then
>> > do "normal promote",
>> > and it's 'fast' then do "fast promote".
>> > I think this change would be smaller than change to pg_ctl.
>> > And this would allow us to treat ${PGDATA}/promote and trigger_file only.
>> > (because ${PGDATA}/fast_promote is not created automatically)
>> Indeed, this would be the way to go to have an extensible format for
>> other promotion modes or other actions that could be triggered by a
>> standby. So why not taking the approach suggested by Katsumata-san
>> now? One single file to rule them all, in this case called promote,
>> including a keyword indicating the promotion action to take. This
>> could be controlled by pg_ctl entirely, and opens the door to extra
>> possible modes.
>
> Why are we suddenly trying to make this even more complicated? It's too
> late to redesign stuff without very good evidence that it's
> needed. Renaming trigger files and changing their format certainly
> doesn't seem appropriate post-beta.
>
> Let's just leave this as is, and remove the code in 9.4/9.5.
Sorry. I should have been clearer. I meant that for 9.4~ only. For 9.3
yes it's too late.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Kudos for Reviewers -- wrapping it up

2013-08-07 Thread Bruce Momjian
On Wed, Aug  7, 2013 at 04:39:49PM -0700, Josh Berkus wrote:
> > OK, so the process is independent of commit activity. You realize that
> > if someone significantly modifies a patch we already have them in the
> > commit message as an author and on the release note item, right?  So you
> > are really looking for reviews that modify the patch but not enough for
> > a committer to include their name in the commit message as an author.
> 
> Oh, good point, I can look at the commit messages for where I don't need
> to bother.   However, you pointed out that *during* CF1, the committers
> *didn't know* that they were supposed to include reviewers who did
> "major work" in the commit message.  And they might miss them in the future.

Well, the clear way to do this would be to ask the committers if they
can reliably take on this job.  You are right for CF1 they treated
reviewer patch modifications just like anyone else, but of course, the
larger question is whether you _should_ treat the reviewers different,
because other people are reviewers just not recorded as CF reviewers. 
Another question is whether committers are going to recognize CF
reviewers vs. ordinary patch modifiers.

> > Anyone can commit patches to the release notes.  I am unlikely to do it,
> > as I lack confidence in the process, for reasons already outlined.
> 
> Bruce, you are steadfastly resistant to change of any kind.

Perhaps I am pessimistic, but I need to have confidence in the process,
and at this point, I don't, and considering how long it took for me to
get an explanation of the process, this seems prudent.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] updatable/deletable terminology

2013-08-07 Thread Karl O. Pinc
On 08/07/2013 08:19:03 PM, Peter Eisentraut wrote:
> We have these two error messages:
> 
> To make the view updatable, provide an unconditional ON UPDATE DO
> INSTEAD rule or an INSTEAD OF UPDATE trigger.
> 
> and
> 
> To make the view updatable, provide an unconditional ON DELETE DO
> INSTEAD rule or an INSTEAD OF DELETE trigger.
> 
> I think it's a bit strange to claim that adding a DELETE rule/trigger
> makes a view *updatable*.  I suspect someone thought they would apply
> the term "updatable" in an SQL standard sense, but that seems
> backwards,
> because you get to these error conditions exactly because the view as
> defined was not Updatable(tm).

Isn't the problem here that you need a word, instead of "updateable",
to indicate that table content may be changed (insert/update/
delete) through the view?

So...   "to allow the view to influence table content"


Karl 
Free Software:  "You don't pay back, you pay forward."
 -- Robert A. Heinlein


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Should we remove "not fast" promotion at all?

2013-08-07 Thread Andres Freund
On 2013-08-08 06:40:00 +0900, Michael Paquier wrote:
> On Tue, Aug 6, 2013 at 8:05 PM, Tomonari Katsumata
>  wrote:
> > Hi,
> >
> > 2013/8/6 Tom Lane 
> >>
> >> Fujii Masao  writes:
> >> > On Tue, Aug 6, 2013 at 11:40 AM, Andres Freund 
> >> > wrote:
> >> >> FWIW I'd rather keep plain promotion for a release or two. TBH, I have
> >> >> a
> >> >> bit of trust issues regarding the new method, and I'd like to be able
> >> >> to
> >> >> test potential issues against a stock postgres by doing a normal
> >> >> instead
> >> >> of a fast promotion.
> >>
> >> > So we should add new option specifying the promotion mode, into pg_ctl?
> >> > Currently pg_ctl cannot trigger the normal promotion.
> >>
> >> It would be silly to add such an option if we want to remove the old mode
> >> in a release or two.
> >>
> >> I think what Andres is suggesting is to leave it as-is for 9.4 and then
> >> remove the old code in 9.5 or 9.6.  Which seems prudent to me.
> >>
> > How about giving trigger_file an ability to chose "fast promote" and "normal
> > promote"
> > like the triggerfile of pg_standby.
> > It means that if the contents of the trigger_file is empty or 'smart' then
> > do "normal promote",
> > and it's 'fast' then do "fast promote".
> > I think this change would be smaller than change to pg_ctl.
> > And this would allow us to treat ${PGDATA}/promote and trigger_file only.
> > (because ${PGDATA}/fast_promote is not created automatically)
> Indeed, this would be the way to go to have an extensible format for
> other promotion modes or other actions that could be triggered by a
> standby. So why not taking the approach suggested by Katsumata-san
> now? One single file to rule them all, in this case called promote,
> including a keyword indicating the promotion action to take. This
> could be controlled by pg_ctl entirely, and opens the door to extra
> possible modes.

Why are we suddenly trying to make this even more complicated? It's too
late to redesign stuff without very good evidence that it's
needed. Renaming trigger files and changing their format certainly
doesn't seem appropriate post-beta.

Let's just leave this as is, and remove the code in 9.4/9.5.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.4 regression

2013-08-07 Thread Andres Freund
On 2013-08-07 20:23:55 +0100, Thom Brown wrote:
> >>> 269e78 was the commit immediately after 8800d8, so it appears that
> >>> introduced the regression.
> >>>
> >>> "Use posix_fallocate() for new WAL files, where available."
> >>
> >> This is curious. Could you either run a longer test before/after the
> >> commit or reduce checkpoint_timeout to something like 3min?
> >
> > Okay, I'll rerun the test for both those commits at 1 hour each with
> > checkpoint_timeout set at 3mins, but with all other configuration
> > settings the same
> 
> Results
> (checkpoint_timeout = 3min)
> 
> pgbench -j 80 -c 80 -T 3600
> 
> 269e78: 606.268013
> 8800d8: 779.583129

Ok, so the performance difference is lower. But, it seems to be still to
high to be explaineable by performance differences during
initialization/fallocate.
In a very quick test, I don't see the same on my laptop. I am currently
travelling and I don't have convenient access to anything else.

Could you:
- run filefrag on a couple of wal segments of both clusters after the
  test and post the results here?
- enable log_checkpoints, post the lines spat out together with the results
- run pgbench without reinitializing the cluster/pgbench tables
  inbetween?

Basically, I'd like to know whether its the initialization that's slow
(measurable because we're regularly creating new files because of a too
low checkpoint_segments) or whether it's the actual writes.

> Jon, here are the test results you asked for:
> 
> $ for i in 1 2 5 10 100; do ./test_fallocate foo $i 1; done
> method: classic. 1 open/close iterations, 1 rewrite in 0.9157s
> method: posix_fallocate. 1 open/close iterations, 1 rewrite in 0.5314s
> method: glibc emulation. 1 open/close iterations, 1 rewrite in 0.6018s
> method: classic. 2 open/close iterations, 1 rewrite in 1.1417s
> method: posix_fallocate. 2 open/close iterations, 1 rewrite in 0.6505s
> method: glibc emulation. 2 open/close iterations, 1 rewrite in 1.8900s
> method: classic. 5 open/close iterations, 1 rewrite in 3.6490s
> method: posix_fallocate. 5 open/close iterations, 1 rewrite in 1.9841s
> method: glibc emulation. 5 open/close iterations, 1 rewrite in 3.1053s
> method: classic. 10 open/close iterations, 1 rewrite in 5.7562s
> method: posix_fallocate. 10 open/close iterations, 1 rewrite in 3.2015s
> method: glibc emulation. 10 open/close iterations, 1 rewrite in 7.1426s
> method: classic. 100 open/close iterations, 1 rewrite in 64.9483s
> method: posix_fallocate. 100 open/close iterations, 1 rewrite in 36.3748s
> method: glibc emulation. 100 open/close iterations, 1 rewrite in 64.8386s

Ok, this seems to indicate that fallocate works nicely. Jon, wasn't
there a version of the test that rewrote the file afterwards?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MultiXactId concept underdocumented

2013-08-07 Thread Andres Freund
Hi,


On 2013-08-07 21:25:06 -0400, Peter Eisentraut wrote:
> The new MultiXactId concept appears in a number of user-facing error
> messages, including in the scary context of transaction ID wraparound,
> so it seems kind of important, but it doesn't appear to be documented in
> any user-facing places.  Should this be rectified?  Maybe some of the
> material from README.tuplock should be moved to the user manual.  Maybe
> a less jargony term could be found as well.

Multixacts aren't really new, they've been there since 8.1 or so. It's
just that their use has been a bit expanded and that their internal
format changed a bit.

WRT user facing messages: I think it'd suffice to explain the
anti-wraparound part of the docs that the same holds true for
multixacts. Doing a more detailed explanation seems to require a good
amount of low level explanations and thus seems unlikely to be actually
read.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] timeline signedness

2013-08-07 Thread Peter Eisentraut
WAL timelines are unsigned 32-bit integers everywhere, except the
replication parser (replication/repl_gram.y and
replication/repl_scanner.l) treats them as signed 32-bit integers.  It's
obviously a corner case, but it would be prudent to be correct about
this.  It should be easy to fix in those grammar files.




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] confusing error message

2013-08-07 Thread Peter Eisentraut
I'm having trouble parsing this:

ERROR:  aggregate functions are not allowed in FROM clause of their own query 
level

The example in the regression tests is:

-- LATERAL can be used to put an aggregate into the FROM clause of its query
select 1 from tenk1 a, lateral (select max(a.unique1) from int4_tbl b) ss;
ERROR:  aggregate functions are not allowed in FROM clause of their own query 
level
LINE 1: select 1 from tenk1 a, lateral (select max(a.unique1) from i...
   ^
I think the "own query level" of the max aggregate function in this case
is the subquery "ss", and so it's not in the FROM clause of its own
query level.

It's understandable why this is not allowed, but I don't think the error
message explains it.  Could we come up with a better wording?




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] MultiXactId concept underdocumented

2013-08-07 Thread Peter Eisentraut
The new MultiXactId concept appears in a number of user-facing error
messages, including in the scary context of transaction ID wraparound,
so it seems kind of important, but it doesn't appear to be documented in
any user-facing places.  Should this be rectified?  Maybe some of the
material from README.tuplock should be moved to the user manual.  Maybe
a less jargony term could be found as well.




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] updatable/deletable terminology

2013-08-07 Thread Peter Eisentraut
We have these two error messages:

To make the view updatable, provide an unconditional ON UPDATE DO INSTEAD rule 
or an INSTEAD OF UPDATE trigger.

and

To make the view updatable, provide an unconditional ON DELETE DO INSTEAD rule 
or an INSTEAD OF DELETE trigger.

I think it's a bit strange to claim that adding a DELETE rule/trigger
makes a view *updatable*.  I suspect someone thought they would apply
the term "updatable" in an SQL standard sense, but that seems backwards,
because you get to these error conditions exactly because the view as
defined was not Updatable(tm).

Or perhaps "deletable" isn't such a good word here?

Maybe "To enable updates/deletions in the view, ..."?



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.4 regression

2013-08-07 Thread Josh Berkus
On 08/07/2013 05:40 PM, Thom Brown wrote:
> On 8 August 2013 00:04, Thom Brown  wrote:
>> On 7 August 2013 23:40, Greg Stark  wrote:
>>> Did you report information about the system affected? What filesystem is it
>>> on? If it's ext4 does it have extents enabled?
>>
>> Yes, ext4.  It's using whatever the default options are, but running
>> lsattr on my data dir shows that extents are being used.

Would it be worth testing XFS?  If so, any particular options I should use?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.4 regression

2013-08-07 Thread Thom Brown
On 8 August 2013 00:04, Thom Brown  wrote:
> On 7 August 2013 23:40, Greg Stark  wrote:
>> Did you report information about the system affected? What filesystem is it
>> on? If it's ext4 does it have extents enabled?
>
> Yes, ext4.  It's using whatever the default options are, but running
> lsattr on my data dir shows that extents are being used.

I've just created an ext2 partition (so no extents), and did some
quick benchmarks of those 2 commits.  This is on the original system
(not my laptop), with the same config settings as before.  All tests
initialised with -s 20 and tested with -j 80 -c 80.

30 seconds:

8800d8: 1327.983018 / 1317.848514
269e78: 1388.009819 / 1285.929100

5 minutes:

8800d8: 1744.446592 / 1729.602706
269e78: 1677.561277 / 1677.967873

10 minutes:

8800d8: 1753.147609 / 1806.600335
269e78: 1815.862842 / 1804.906769

So this appears to be okay.

-- 
Thom


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Has anybody been able to install Pg 9.3 beta .debs on Ubuntu 12.04

2013-08-07 Thread Ryan Kelly
On Wed, Aug 08/07/13, 2013 at 07:29:32PM -0400, Hannu Krosing wrote:
> Has anybody been able to install Pg 9.3 beta .debs on Ubuntu 12.04
Yes.

> when I try to select postgresql-9.3 (9,3~beta2-2.pgdg12.4+1) for
> installation I get error saying
> 
> ...
> The following packages have unmet dependencies:
>  postgresql-9.3 : Depends: postgresql-client-9.3 but it is not going to
> be installed
> E: Unable to correct problems, you have held broken packages.
I seem to recall not adding "9.3" explicitly to the sources.list line as
causing this problem. Is this you sources.list line:

deb http://apt.postgresql.org/pub/repos/apt/ precise-pgdg main 9.3

This is recommended by:
https://wiki.postgresql.org/wiki/Apt/FAQ#I_want_to_try_the_beta_version_of_the_next_PostgreSQL_release

-Ryan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Kudos for Reviewers -- wrapping it up

2013-08-07 Thread Josh Berkus

> Michael who?

Blackwell, asssistant CFM for this CF.

> 9.4 CF1?  Where are you recording the names?  In the commitfest app?

Right now in a googledoc.  The CF app has no such capability now,
although Magnus' new app might in the future.

> OK, so the process is independent of commit activity. You realize that
> if someone significantly modifies a patch we already have them in the
> commit message as an author and on the release note item, right?  So you
> are really looking for reviews that modify the patch but not enough for
> a committer to include their name in the commit message as an author.

Oh, good point, I can look at the commit messages for where I don't need
to bother.   However, you pointed out that *during* CF1, the committers
*didn't know* that they were supposed to include reviewers who did
"major work" in the commit message.  And they might miss them in the future.

> Anyone can commit patches to the release notes.  I am unlikely to do it,
> as I lack confidence in the process, for reasons already outlined.

Bruce, you are steadfastly resistant to change of any kind.

>> That's why it'll be a social process.  Next week I'll be posting a list
>> of patches and reviewers from CF1 to this list for other people to
>> correct and expand.  Just like the code itself, if everybody reviews the
>> list of reviewers, it'll be as good as we can get it.
> 
> OK, at least that is a plan.  Is that your plan for the future too?

Sure.  It's the open source way.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Has anybody been able to install Pg 9.3 beta .debs on Ubuntu 12.04

2013-08-07 Thread Hannu Krosing
Hi

Has anybody been able to install Pg 9.3 beta .debs on Ubuntu 12.04

when I try to select postgresql-9.3 (9,3~beta2-2.pgdg12.4+1) for
installation I get error saying

...
The following packages have unmet dependencies:
 postgresql-9.3 : Depends: postgresql-client-9.3 but it is not going to
be installed
E: Unable to correct problems, you have held broken packages.



I have tried it earlier on a system with multiple extra repos defined and
so thought that this is my setup specific.

Today I tried it on fresh install of 12.04 with only pgdg PPA added and
still I can't add
any packages from 9.3 series.

I did successfully install versions 8.4, 9.0, 9.1 amd 9,2 side by side

-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Kudos for Reviewers -- wrapping it up

2013-08-07 Thread Bruce Momjian
On Wed, Aug  7, 2013 at 01:48:06PM -0700, Josh Berkus wrote:
> 
> > Well, reviewers on the bottom was just for 9.3 or 9.4, but the final
> > goal was to get reviewers who modified patches credited with the release
> > note items.  I asked how that was to be accomplished, and suggested that
> > the only practical way would be for every committer to check the patch
> > chain to see who else had modified the patch.  
> 
> Actually, it's not that hard.  Someone who modified the patch is going
> to post it to -hackers, and we have that all in the archives.  Michael
> and I are combing through the threads from CF1 now to get that list;
> you're talking a few hours effort at most.

Michael who?

9.4 CF1?  Where are you recording the names?  In the commitfest app?

> Of course, the actual patch author/committers need to verify that these
> people actually did a lot of *useful* work, but that isn't much effort
> from them.

OK, so the process is independent of commit activity. You realize that
if someone significantly modifies a patch we already have them in the
commit message as an author and on the release note item, right?  So you
are really looking for reviews that modify the patch but not enough for
a committer to include their name in the commit message as an author.

> > You suggested something about the commit-fest-manager doing it, and I
> > couldn't see how that would help because it has to be in the commit
> > message at the time the release notes are being written.  
> 
> Why?  You didn't provide *any* justification as to why the release notes
> could not include input *in addition to* commit messages.  In fact, the
> release notes *do* incorporate additional input, every year.

Yes, we can always add --- the problem is getting the content to add.

> > You said our
> > release note writing process was not written stone, and that we had to
> > do whatever it takes to get those names on the items in the release
> > notes.  At that point I pointed out that there was no consideration of
> > the effort necessary to accomplish this, and that's how we got here
> > today.
> 
> The only additional effort I'm asking of you, Bruce, is to accept
> patches on the release notes.  That really doesn't seem like an
> unreasonable request.

Anyone can commit patches to the release notes.  I am unlikely to do it,
as I lack confidence in the process, for reasons already outlined.

> > Yes, that is somewhat easy in that we can get the names from the
> > commit-fest app, but it doesn't include reviewers who replied via email
> > but did not record their names on the commit-fest app.  I can tell you
> > from my release note writing experience that a partial job in this area
> > is likely to get lots of negative feedback from people who are
> > excluded.
> 
> That's why it'll be a social process.  Next week I'll be posting a list
> of patches and reviewers from CF1 to this list for other people to
> correct and expand.  Just like the code itself, if everybody reviews the
> list of reviewers, it'll be as good as we can get it.

OK, at least that is a plan.  Is that your plan for the future too?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.4 regression

2013-08-07 Thread Thom Brown
On 7 August 2013 23:40, Greg Stark  wrote:
> Did you report information about the system affected? What filesystem is it
> on? If it's ext4 does it have extents enabled?

Yes, ext4.  It's using whatever the default options are, but running
lsattr on my data dir shows that extents are being used.
-- 
Thom


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.4 regression

2013-08-07 Thread Jon Nelson
On Wed, Aug 7, 2013 at 4:18 PM, Kevin Grittner  wrote:
> Thom Brown  wrote:
>
>> pgbench -j 80 -c 80 -T 3600
>>
>> 269e78: 606.268013
>> 8800d8: 779.583129

I have also been running some tests and - as yet - they are
inconclusive. What I can say about them so far is that - at times -
they agree with these results and do show a performance hit.

I took the settings as posted and adjusted them slightly for my much
lower-powered personal laptop, changing checkpoint_completion_target
to 1.0 and checkpoint_timeout to 1min.

I am testing with the latest git HEAD, turning fallocate support on
and off by editing xlog.c directly. Furthermore, before each run I
would try to reduce the number of existing WAL files by making a "pre"
run with checkpoint_segments = 3 before changing it to 32.

For reasons I don't entirely understand, when WAL files are being
created (rather than recycled) I found a performance hit, but when
they were being recycled I got a slight performance gain (this may be
due to reduced fragmentation) but the gain was never more than 10% and
frequently less than that.

I can't explain - yet (if ever!) - why my initial tests (and many of
those done subsequently) showed improvement - it may very well have
something to do with how the code interacts with these settings.



-- 
Jon


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.4 regression

2013-08-07 Thread Greg Stark
Did you report information about the system affected? What filesystem is it
on? If it's ext4 does it have extents enabled?

I think on ext3 or ext4 without extents it won't have any benefit but it
shouldn't really be any slower either since the libc implementation is very
similar to what we used to do.

I wouldn't expect the benefit from it to have a huge effect either. The
main benefit is actually the reduced fragmentation so it would be pretty
dependent on the filesystem it's on anyways. Avoiding stalls when creating
new wal files is also nice but shouldn't happen often enough n your
benchmarks to cause an effect on the average numbers.


Re: [HACKERS] Should we remove "not fast" promotion at all?

2013-08-07 Thread Michael Paquier
On Tue, Aug 6, 2013 at 8:05 PM, Tomonari Katsumata
 wrote:
> Hi,
>
> 2013/8/6 Tom Lane 
>>
>> Fujii Masao  writes:
>> > On Tue, Aug 6, 2013 at 11:40 AM, Andres Freund 
>> > wrote:
>> >> FWIW I'd rather keep plain promotion for a release or two. TBH, I have
>> >> a
>> >> bit of trust issues regarding the new method, and I'd like to be able
>> >> to
>> >> test potential issues against a stock postgres by doing a normal
>> >> instead
>> >> of a fast promotion.
>>
>> > So we should add new option specifying the promotion mode, into pg_ctl?
>> > Currently pg_ctl cannot trigger the normal promotion.
>>
>> It would be silly to add such an option if we want to remove the old mode
>> in a release or two.
>>
>> I think what Andres is suggesting is to leave it as-is for 9.4 and then
>> remove the old code in 9.5 or 9.6.  Which seems prudent to me.
>>
> How about giving trigger_file an ability to chose "fast promote" and "normal
> promote"
> like the triggerfile of pg_standby.
> It means that if the contents of the trigger_file is empty or 'smart' then
> do "normal promote",
> and it's 'fast' then do "fast promote".
> I think this change would be smaller than change to pg_ctl.
> And this would allow us to treat ${PGDATA}/promote and trigger_file only.
> (because ${PGDATA}/fast_promote is not created automatically)
Indeed, this would be the way to go to have an extensible format for
other promotion modes or other actions that could be triggered by a
standby. So why not taking the approach suggested by Katsumata-san
now? One single file to rule them all, in this case called promote,
including a keyword indicating the promotion action to take. This
could be controlled by pg_ctl entirely, and opens the door to extra
possible modes.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.4 regression

2013-08-07 Thread Kevin Grittner
Thom Brown  wrote:

> pgbench -j 80 -c 80 -T 3600
>
> 269e78: 606.268013
> 8800d8: 779.583129

As another data point I duplicated Thom's original tests:

max_connections = 500
shared_buffers = 4GB
effective_cache_size = 12GB
random_page_cost = 2.0
cpu_tuple_cost = 0.03
wal_buffers = 32MB
work_mem = 100MB
maintenance_work_mem = 512MB
checkpoint_segments = 32
checkpoint_timeout = 15min
checkpoint_completion_target = 0.8
commit_delay = 50
commit_siblings = 15

pgbench -i -s 20 pgbench
pgbench -j 80 -c 80 -T 1800 pgbench
(note the high contention pgbench setup)

On this system:

Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
  (hyperthreading enabled)
2 x 15k RPS drives in software RAID 1, ext4, write cache disabled
3.5.0-37-generic #58-Ubuntu SMP Mon Jul 8 22:07:55 UTC 2013 x86_64 x86_64 
x86_64 GNU/Linux

commit 8800d8061dd151d6556f5f8d58f8211fd830169f
number of transactions actually processed: 1298864
tps = 721.384612 (including connections establishing)
tps = 721.388938 (excluding connections establishing)

commit 269e780822abb2e44189afaccd6b0ee7aefa7ddd
number of transactions actually processed: 875900
tps = 486.524741 (including connections establishing)
tps = 486.527891 (excluding connections establishing)

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Kudos for Reviewers -- wrapping it up

2013-08-07 Thread Josh Berkus

> Well, reviewers on the bottom was just for 9.3 or 9.4, but the final
> goal was to get reviewers who modified patches credited with the release
> note items.  I asked how that was to be accomplished, and suggested that
> the only practical way would be for every committer to check the patch
> chain to see who else had modified the patch.  

Actually, it's not that hard.  Someone who modified the patch is going
to post it to -hackers, and we have that all in the archives.  Michael
and I are combing through the threads from CF1 now to get that list;
you're talking a few hours effort at most.

Of course, the actual patch author/committers need to verify that these
people actually did a lot of *useful* work, but that isn't much effort
from them.

> You suggested something about the commit-fest-manager doing it, and I
> couldn't see how that would help because it has to be in the commit
> message at the time the release notes are being written.  

Why?  You didn't provide *any* justification as to why the release notes
could not include input *in addition to* commit messages.  In fact, the
release notes *do* incorporate additional input, every year.

> You said our
> release note writing process was not written stone, and that we had to
> do whatever it takes to get those names on the items in the release
> notes.  At that point I pointed out that there was no consideration of
> the effort necessary to accomplish this, and that's how we got here
> today.

The only additional effort I'm asking of you, Bruce, is to accept
patches on the release notes.  That really doesn't seem like an
unreasonable request.

> Yes, that is somewhat easy in that we can get the names from the
> commit-fest app, but it doesn't include reviewers who replied via email
> but did not record their names on the commit-fest app.  I can tell you
> from my release note writing experience that a partial job in this area
> is likely to get lots of negative feedback from people who are
> excluded.

That's why it'll be a social process.  Next week I'll be posting a list
of patches and reviewers from CF1 to this list for other people to
correct and expand.  Just like the code itself, if everybody reviews the
list of reviewers, it'll be as good as we can get it.

> My point is this has to be done accurately.

It will be just as accurate as the current process, which, as you've
just pointed out, is not 100% accurate either.

> You have to distinguish between names at the end of the release notes,
> and names on release note items, and you have to tell us how this going
> to happen with reasonable effort.

You're making a mountain out of a molehill here.   I've already spent
more time arguing with you than it will take me to do the work.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: How to configer the pg_hba record which the database name with "\n" ?

2013-08-07 Thread Andrew Dunstan


On 08/07/2013 04:12 PM, Bruce Momjian wrote:

On Thu, Aug  1, 2013 at 07:26:38AM -0700, David Johnston wrote:

huxm wrote

  where there is a
newline(\n) in the name.

I can't imagine why you would want to use non-printing characters in a name,
especially a database name.  Even if the hba.conf file was able to interpret
it (which it probably cannot but I do not know for certain) client
interfaces are likely to have problems as well.  Most of these would not
think of interpolating a database identifier in that manner but instead
treat the name as a literal value.  Even when line-continuations are allowed
they are often cosmetic in nature and the resultant newline is discarded
during the pre-execution phase of the command interpreter.

Arguably having a check constraint on the catalog to prohibit such a name
would be more useful than trying to make such a construct functional.

I'd guess in the immediate term the users accessing this database would need
to have "all" as their target and then you use role-based authorization to
limit which specific databases are accessible.

I suppose the cleanest solution would be to allow a \n or a backslash
for line continuation, but I don't think pg_hba.conf supports those.




It doesn't. I really think this comes into the category of "don't do 
that!" The most we should do is document the pain that names with 
embedded newlines can cause.


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: How to configer the pg_hba record which the database name with "\n" ?

2013-08-07 Thread Bruce Momjian
On Thu, Aug  1, 2013 at 07:26:38AM -0700, David Johnston wrote:
> huxm wrote
> >  where there is a
> > newline(\n) in the name.
> 
> I can't imagine why you would want to use non-printing characters in a name,
> especially a database name.  Even if the hba.conf file was able to interpret
> it (which it probably cannot but I do not know for certain) client
> interfaces are likely to have problems as well.  Most of these would not
> think of interpolating a database identifier in that manner but instead
> treat the name as a literal value.  Even when line-continuations are allowed
> they are often cosmetic in nature and the resultant newline is discarded
> during the pre-execution phase of the command interpreter.
> 
> Arguably having a check constraint on the catalog to prohibit such a name
> would be more useful than trying to make such a construct functional.
> 
> I'd guess in the immediate term the users accessing this database would need
> to have "all" as their target and then you use role-based authorization to
> limit which specific databases are accessible.

I suppose the cleanest solution would be to allow a \n or a backslash
for line continuation, but I don't think pg_hba.conf supports those.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Kudos for Reviewers -- wrapping it up

2013-08-07 Thread Bruce Momjian
On Wed, Aug  7, 2013 at 12:39:01PM -0700, Josh Berkus wrote:
> Bruce,
> 
> > You are getting into some kind of loop where not wanting to expend
> > unlimited effort on something means, to you, that the person doesn't
> > think the goal is important.   Effort has to be balanced.  This is not
> > the first time I have seen such loops.  And why do you even care about
> > my opinion?
> 
> Aha, OK.  So you're talking about all the different things we might do
> to get more reviewers.  I'm only talking about adding reviewers to the
> bottom of the release notes, which is certainly a bounded activity of
> *very* limited effort.  Which is why I was confused and aghast at your
> talk of "unbounded work".

Well, reviewers on the bottom was just for 9.3 or 9.4, but the final
goal was to get reviewers who modified patches credited with the release
note items.  I asked how that was to be accomplished, and suggested that
the only practical way would be for every committer to check the patch
chain to see who else had modified the patch.  

You suggested something about the commit-fest-manager doing it, and I
couldn't see how that would help because it has to be in the commit
message at the time the release notes are being written.  You said our
release note writing process was not written stone, and that we had to
do whatever it takes to get those names on the items in the release
notes.  At that point I pointed out that there was no consideration of
the effort necessary to accomplish this, and that's how we got here
today.

> To be completely clear: I am talking only about the compromise discussed
> on this thread, namely:
> 
> a) listing reviewers who did "extensive work" as co-authors on the
> patch, and

See above --- I need to know how that is going to get to the release
note items _with_ reasonable effort.

> b) listing other reviewers at the bottom of the release notes.

Yes, that is somewhat easy in that we can get the names from the
commit-fest app, but it doesn't include reviewers who replied via email
but did not record their names on the commit-fest app.  I can tell you
from my release note writing experience that a partial job in this area
is likely to get lots of negative feedback from people who are
excluded.

As an example, I got a pg_upgrade patch fix for 9.2, thought it was
ugly, tried other methods, ended up doing the same thing the original
patch author did, but didn't mention the patch author because I had
forgotten I ended up with the same fix.  When the minor release notes
came out, the person complained on the hackers list, and Simon and Tom
had to apologize as I was away:


http://www.postgresql.org/message-id/CABRT9RAe3zxdins=BBysjqEPA8=nqnxwta4a0xm01pbdvbm...@mail.gmail.com

My point is this has to be done accurately.

> Per earlier discussions.  You started this thread by claiming that
> adding the reviewers to 9.4 would be too hard, and I argued that it
> would not and in fact I'm already working on it.  Nothing I've talked
> about in this thread has been about anything else.

You have to distinguish between names at the end of the release notes,
and names on release note items, and you have to tell us how this going
to happen with reasonable effort.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] StrategyGetBuffer optimization, take 2

2013-08-07 Thread Atri Sharma
On Wed, Aug 7, 2013 at 10:37 PM, Andres Freund  wrote:
> On 2013-08-07 09:40:24 -0500, Merlin Moncure wrote:
>> > I don't think the unlocked increment of nextVictimBuffer is a good idea
>> > though. nextVictimBuffer jumping over NBuffers under concurrency seems
>> > like a recipe for disaster to me. At the very, very least it will need a
>> > good wad of comments explaining what it means and how you're allowed to
>> > use it. The current way will lead to at least bgwriter accessing a
>> > nonexistant/out of bounds buffer via StrategySyncStart().
>> > Possibly it won't even save that much, it might just increase the
>> > contention on the buffer header spinlock's cacheline.
>>
>> I agree; at least then it's not unambiguously better. if you (in
>> effect) swap all contention on allocation from a lwlock to a spinlock
>> it's not clear if you're improving things; it would have to be proven
>> and I'm trying to keep things simple.
>
> I think converting it to a spinlock actually is a good idea, you just
> need to expand the scope a bit.

Umm, sorry if I am being naive, but dont spinlocks perform bad when a
lot of contention is present on that node?

I feel that we may hit on that case here. A preliminary check before
the actual spinlock may be good,though,since spinlocks are cheap until
the contention remains low.

Regards,

Atri

-- 
Regards,

Atri
l'apprenant


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Kudos for Reviewers -- wrapping it up

2013-08-07 Thread Joshua D. Drake


On 08/07/2013 10:35 AM, Bruce Momjian wrote:


Actually, for me, motiving reviewers seems like the Lemon-Soaked Paper
Napkins, as it requires unbounded effort and its importance is not being
balanced with other priorities.



Ignoring the non-productive part of this thread, I would like to mention 
that motivating reviewers is not necessarily complicated. We just have 
to ask ourselves what motivates a person:


The feeling that their work is worthwhile, productive, will be 
appreciated and that they will receive recognition for the effort.


Right now, we do not publicly outside of the dome that is -hackers 
provide those incentives. Give reviewers the just recognition they 
deserve and I believe we will see more reviewing effort.


Sincerely,

Joshua D. Drake

--
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
For my dreams of your image that blossoms
   a rose in the deeps of my heart. - W.B. Yeats


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Kudos for Reviewers -- wrapping it up

2013-08-07 Thread Josh Berkus
Bruce,

> You are getting into some kind of loop where not wanting to expend
> unlimited effort on something means, to you, that the person doesn't
> think the goal is important.   Effort has to be balanced.  This is not
> the first time I have seen such loops.  And why do you even care about
> my opinion?

Aha, OK.  So you're talking about all the different things we might do
to get more reviewers.  I'm only talking about adding reviewers to the
bottom of the release notes, which is certainly a bounded activity of
*very* limited effort.  Which is why I was confused and aghast at your
talk of "unbounded work".

To be completely clear: I am talking only about the compromise discussed
on this thread, namely:

a) listing reviewers who did "extensive work" as co-authors on the
patch, and
b) listing other reviewers at the bottom of the release notes.

Per earlier discussions.  You started this thread by claiming that
adding the reviewers to 9.4 would be too hard, and I argued that it
would not and in fact I'm already working on it.  Nothing I've talked
about in this thread has been about anything else.
-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Kudos for Reviewers -- wrapping it up

2013-08-07 Thread Bruce Momjian
On Wed, Aug  7, 2013 at 12:07:32PM -0700, Josh Berkus wrote:
> Bruce,
> 
> > Actually, for me, motiving reviewers seems like the Lemon-Soaked Paper
> > Napkins, as it requires unbounded effort and its importance is not being
> > balanced with other priorities.
> 
> Let me be absolutely clear here: You do not think that the work
> reviewers do is important at all, and you think that our project has
> more than enough reviewers?  I want to be crystal-clear on your opinion.
>
> I will point out that you did exactly zero reviews in the last
> commitfest, which makes me wonder what your opinion of "we have enough
> reviewers" is based on.

Only Lemon-Soaked Paper Napkins users think that everything is binary
--- every effort has to be balanced against the work involved, which was
#2 on my list.  

You are getting into some kind of loop where not wanting to expend
unlimited effort on something means, to you, that the person doesn't
think the goal is important.   Effort has to be balanced.  This is not
the first time I have seen such loops.  And why do you even care about
my opinion?

And I have never said "we have enough reviewers".  Is this where you say
I should be doing more?  I thought we dealt with that already.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.4 regression

2013-08-07 Thread Thom Brown
On 7 August 2013 17:54, Thom Brown  wrote:
> On 7 August 2013 17:49, Andres Freund  wrote:
>> On 2013-08-07 17:21:01 +0100, Thom Brown wrote:
>>> Only build option used was --enable-depend.  I did have
>>> --enable-cassert for the shorter 5 min benchmarks, but was removed for
>>> the 30 min tests.
>>
>>> pgbench -j 80 -c 80 -T 300:
>>>
>>> 8.4 - 535.990042
>>> 9.2 - 820.798141
>>> 9.3 - 828.395498
>>> 9.4 - 197.851720
>> e>
>>> pgbench -j 80 -c 80 -T 1800:
>>>
>>> 8.4: 812.482108
>>> 9.4 HEAD: 355.397658
>>> 9.4 e5592c (9th July): 356.485625
>>> 9.4 537227 (7th July): 365.992518
>>> 9.4 9b2543 (7th July): 362.587339
>>> 9.4 269e78 (5th July): 359.439143
>>> 9.4 8800d8 (5th July): 821.933082
>>> 9.4 568d41 (2nd July): 822.991160
>>
>> The differences between those runs look to small for enable/disable
>> cassert to me. Are you you properly rebuilt for that?
>>
>>> 269e78 was the commit immediately after 8800d8, so it appears that
>>> introduced the regression.
>>>
>>> "Use posix_fallocate() for new WAL files, where available."
>>
>> This is curious. Could you either run a longer test before/after the
>> commit or reduce checkpoint_timeout to something like 3min?
>
> Okay, I'll rerun the test for both those commits at 1 hour each with
> checkpoint_timeout set at 3mins, but with all other configuration
> settings the same

Results
(checkpoint_timeout = 3min)

pgbench -j 80 -c 80 -T 3600

269e78: 606.268013
8800d8: 779.583129


Jon, here are the test results you asked for:

$ for i in 1 2 5 10 100; do ./test_fallocate foo $i 1; done
method: classic. 1 open/close iterations, 1 rewrite in 0.9157s
method: posix_fallocate. 1 open/close iterations, 1 rewrite in 0.5314s
method: glibc emulation. 1 open/close iterations, 1 rewrite in 0.6018s
method: classic. 2 open/close iterations, 1 rewrite in 1.1417s
method: posix_fallocate. 2 open/close iterations, 1 rewrite in 0.6505s
method: glibc emulation. 2 open/close iterations, 1 rewrite in 1.8900s
method: classic. 5 open/close iterations, 1 rewrite in 3.6490s
method: posix_fallocate. 5 open/close iterations, 1 rewrite in 1.9841s
method: glibc emulation. 5 open/close iterations, 1 rewrite in 3.1053s
method: classic. 10 open/close iterations, 1 rewrite in 5.7562s
method: posix_fallocate. 10 open/close iterations, 1 rewrite in 3.2015s
method: glibc emulation. 10 open/close iterations, 1 rewrite in 7.1426s
method: classic. 100 open/close iterations, 1 rewrite in 64.9483s
method: posix_fallocate. 100 open/close iterations, 1 rewrite in 36.3748s
method: glibc emulation. 100 open/close iterations, 1 rewrite in 64.8386s

-- 
Thom


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Kudos for Reviewers -- wrapping it up

2013-08-07 Thread Stephen Frost
Josh,

* Josh Berkus (j...@agliodbs.com) wrote:
> > Actually, for me, motiving reviewers seems like the Lemon-Soaked Paper
> > Napkins, as it requires unbounded effort and its importance is not being
> > balanced with other priorities.
> 
> Let me be absolutely clear here: You do not think that the work
> reviewers do is important at all, and you think that our project has
> more than enough reviewers?  I want to be crystal-clear on your opinion.

Bruce certainly didn't say that and it's rather disingenuous to claim
that he did.  What I read is that he simply pointed out that we have
multiple priorities and need to consider work on acquiring new
reviewers in balance with the rest.

Also mentioned is that it's unclear how one might bound the work of
getting new reviewers- you can't say "it'll take X hours to get enough
reviewers" or even "it'll take X hours to get 5 new reviewers".

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Kudos for Reviewers -- wrapping it up

2013-08-07 Thread Josh Berkus
Bruce,

> Actually, for me, motiving reviewers seems like the Lemon-Soaked Paper
> Napkins, as it requires unbounded effort and its importance is not being
> balanced with other priorities.

Let me be absolutely clear here: You do not think that the work
reviewers do is important at all, and you think that our project has
more than enough reviewers?  I want to be crystal-clear on your opinion.

I will point out that you did exactly zero reviews in the last
commitfest, which makes me wonder what your opinion of "we have enough
reviewers" is based on.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] StrategyGetBuffer optimization, take 2

2013-08-07 Thread Merlin Moncure
On Wed, Aug 7, 2013 at 12:07 PM, Andres Freund  wrote:
> On 2013-08-07 09:40:24 -0500, Merlin Moncure wrote:
>> > I don't think the unlocked increment of nextVictimBuffer is a good idea
>> > though. nextVictimBuffer jumping over NBuffers under concurrency seems
>> > like a recipe for disaster to me. At the very, very least it will need a
>> > good wad of comments explaining what it means and how you're allowed to
>> > use it. The current way will lead to at least bgwriter accessing a
>> > nonexistant/out of bounds buffer via StrategySyncStart().
>> > Possibly it won't even save that much, it might just increase the
>> > contention on the buffer header spinlock's cacheline.
>>
>> I agree; at least then it's not unambiguously better. if you (in
>> effect) swap all contention on allocation from a lwlock to a spinlock
>> it's not clear if you're improving things; it would have to be proven
>> and I'm trying to keep things simple.
>
> I think converting it to a spinlock actually is a good idea, you just
> need to expand the scope a bit.

all right: well, I'll work up another version doing full spinlock and
maybe see things shake out in performance.

> FWIW, I am not convinced this is the trigger for the problems you're
> seing. It's a good idea nonetheless though.

I have some very strong evidence that the problem is coming out of the
buffer allocator.  Exhibit A is that vlad's presentation of the
problem was on a read only load (if not allocator lock, then what?).
Exhibit B is that lowering shared buffers to 2gb seems to have (so
far, 5 days in) fixed the issue.  This problem shows up on fast
machines with fast storage and lots of cores.  So what I think is
happening is that usage_count starts creeping up faster than it gets
cleared by the sweep with very large buffer settings which in turn
causes the 'problem' buffers to be analyzed for eviction more often.
What is not as clear is if the proposed optimizations will fix the
problem -- I'd have to get approval to test and confirm them in
production which seems unlikely at this juncture; that's why I'm
trying to keep things 'win-win' so as to not have to have them be
accepted on that basis.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Condition to become the standby mode.

2013-08-07 Thread Fujii Masao
On Sat, Jul 27, 2013 at 12:51 AM, Fujii Masao  wrote:
> On Fri, Jul 26, 2013 at 11:55 PM, Andres Freund  
> wrote:
>> On 2013-07-26 23:47:59 +0900, Fujii Masao wrote:
>>> >> If this problem is solved, there is possible of that we can failback
>>> >> by removing the all WAL record which is in pg_xlog before server
>>> >> starts as the slave server.
>>> >> ( And we also use "synchronous_transfer" which I'm proposing, I think
>>> >> we can fail-back without taking full backup surely)
>>> >
>>> > I still have *massive* doubts about the concept. But anyway, if you want
>>> > to do so, you should generate a backup label that specifies the startup
>>> > location.
>>>
>>> Generating a backup label doesn't seem to be enough because there is
>>> no backup-end WAL record and we cannot know the consistency point.
>>
>> Since 9.2 we allow generation of base backups from standbys, the
>> infrastructure built for should be sufficient to pass the lsn at which
>> consistency is achieved.
>
> Yeah, right. I'd forgotten about that.

On the second thought, using such an infrastructure seems not enough for
this issue. When we start new standby from the backup taken from another
standby, we use pg_control's min recovery point as the consistency point.

When we use this technique for the issue that Sawada raised, we will get
one problem that pg_control's min recovery point is zero because it's
pg_control under the master. The master doesn't update min recovery
point in pg_control. Only the standby updates it.

Therefore, we would need to find another way for the issue.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.4 regression

2013-08-07 Thread Thom Brown
On 7 August 2013 18:49, Jon Nelson  wrote:
> On Wed, Aug 7, 2013 at 12:42 PM, Thom Brown  wrote:
>> for i in 1 2 5 10 100; do ./test_fallocate foo $i 1; done
>> method: classic. 1 open/close iterations, 1 rewrite in 0.6380s
>> method: posix_fallocate. 1 open/close iterations, 1 rewrite in 0.3204s
>> method: glibc emulation. 1 open/close iterations, 1 rewrite in 0.6274s
>> method: classic. 2 open/close iterations, 1 rewrite in 1.2908s
>> method: posix_fallocate. 2 open/close iterations, 1 rewrite in 0.6596s
>> method: glibc emulation. 2 open/close iterations, 1 rewrite in 1.2666s
>> method: classic. 5 open/close iterations, 1 rewrite in 3.1419s
>> method: posix_fallocate. 5 open/close iterations, 1 rewrite in 1.5930s
>> method: glibc emulation. 5 open/close iterations, 1 rewrite in 3.1516s
>> method: classic. 10 open/close iterations, 1 rewrite in 6.2912s
>> method: posix_fallocate. 10 open/close iterations, 1 rewrite in 3.2626s
>> method: glibc emulation. 10 open/close iterations, 1 rewrite in 6.3667s
>> method: classic. 100 open/close iterations, 1 rewrite in 67.4174s
>> method: posix_fallocate. 100 open/close iterations, 1 rewrite in 37.8788s
>> method: glibc emulation. 100 open/close iterations, 1 rewrite in 55.0714s
>
> OK. That's interesting, and a good data point.
>
> One thing you could try manually disabling the use of posix_fallocate in 
> 269e78.
> After running ./configure --stuff-here
> edit src/include/pg_config.h and comment out the following line (on or
> around line 374)
> #define HAVE_POSIX_FALLOCATE 1
>
> *then* build postgresql and see if the performance hit is still there.

Okay, done that.  The TPS increases again:

2308.807568 / 2554.264572 / 2563.190204

And I did run ./configure... before removing the line, and built it
after the change.

-- 
Thom


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Immediate shutdown causes the assertion failure in 9.4dev

2013-08-07 Thread Alvaro Herrera
Fujii Masao escribió:
> On Thu, Aug 1, 2013 at 2:26 AM, Fujii Masao  wrote:

> > TRAP: FailedAssertion("!(CheckpointerPID == 0)", File: "postmaster.c",
> > Line: 3440)
> >
> > The cause of this problem seems to be that PostmasterStateMachine()
> > may fail to wait for the checkpointer to exit. Attached patch fixes this.
> 
> Committed.

Thanks.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Immediate shutdown causes the assertion failure in 9.4dev

2013-08-07 Thread Fujii Masao
On Thu, Aug 1, 2013 at 2:26 AM, Fujii Masao  wrote:
> Hi,
>
> I encountered the following assertion failure when I executed
> an immediate shutdown.
>
> LOG:  received immediate shutdown request
> WARNING:  terminating connection because of crash of another server process
> DETAIL:  The postmaster has commanded this server process to roll back
> the current transaction and exit, because another server process
> exited abnormally and possibly corrupted shared memory.
> HINT:  In a moment you should be able to reconnect to the database and
> repeat your command.
> TRAP: FailedAssertion("!(CheckpointerPID == 0)", File: "postmaster.c",
> Line: 3440)
>
> The cause of this problem seems to be that PostmasterStateMachine()
> may fail to wait for the checkpointer to exit. Attached patch fixes this.

Committed.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.4 regression

2013-08-07 Thread Jon Nelson
On Wed, Aug 7, 2013 at 12:42 PM, Thom Brown  wrote:
> On 7 August 2013 17:54, Jon Nelson  wrote:
>> On Wed, Aug 7, 2013 at 11:21 AM, Thom Brown  wrote:
>>> Hi all,
>>>
>>> I recently tried a simple benchmark to see how far 9.4 had come since
>>> 8.4, but I discovered that I couldn't get 9.4 to even touch 8.4 for
>>> performance.  After checking 9.2 and 9.3 (as per Kevin Grittner's
>>> suggestion), I found that those were fine, so the issue must be in
>>> 9.4devel.  I used identical configurations for each test, and used
>>> 9.1's pgbench to ensure changes in pgbench didn't affect the outcome.
>>>  The common config changes were:
>>
>> ...
>>
>>> 8.4: 812.482108
>>> 9.4 HEAD: 355.397658
>>> 9.4 e5592c (9th July): 356.485625
>>> 9.4 537227 (7th July): 365.992518
>>> 9.4 9b2543 (7th July): 362.587339
>>> 9.4 269e78 (5th July): 359.439143
>>> 9.4 8800d8 (5th July): 821.933082
>>> 9.4 568d41 (2nd July): 822.991160
>>>
>>> 269e78 was the commit immediately after 8800d8, so it appears that
>>> introduced the regression.
>>>
>>> "Use posix_fallocate() for new WAL files, where available."
>>>
>>> Ironically, that was intended to be a performance improvement.
>>
>> Would it be possible for you to download, compile, and run the test
>> program as described and located in this email:
>>
>> http://www.postgresql.org/message-id/cakuk5j1acml-1cgbhnrzed-vh4og+8hkmfhy2eca-8jbj-6...@mail.gmail.com
>
> I shall do after the 2 x 1 hour benchmarks are complete.
>
>> I also wonder if there is a problem with the 3.8.0 kernel specifically.
>
> Well my laptop has the same kernel (and also 64-bit Linux Mint), so
> took 3 quick sample benchmarks on those two commits, and I get the
> following (all without --enable-cassert):
>
> 269e78: 1162.593665 / 1158.644302 / 1153.955611
> 8800d8: 2446.087618 / 2443.797252 / 2321.876431
>
> And running your test program gives the following (again, just on my laptop):
>
> for i in 1 2 5 10 100; do ./test_fallocate foo $i 1; done
> method: classic. 1 open/close iterations, 1 rewrite in 0.6380s
> method: posix_fallocate. 1 open/close iterations, 1 rewrite in 0.3204s
> method: glibc emulation. 1 open/close iterations, 1 rewrite in 0.6274s
> method: classic. 2 open/close iterations, 1 rewrite in 1.2908s
> method: posix_fallocate. 2 open/close iterations, 1 rewrite in 0.6596s
> method: glibc emulation. 2 open/close iterations, 1 rewrite in 1.2666s
> method: classic. 5 open/close iterations, 1 rewrite in 3.1419s
> method: posix_fallocate. 5 open/close iterations, 1 rewrite in 1.5930s
> method: glibc emulation. 5 open/close iterations, 1 rewrite in 3.1516s
> method: classic. 10 open/close iterations, 1 rewrite in 6.2912s
> method: posix_fallocate. 10 open/close iterations, 1 rewrite in 3.2626s
> method: glibc emulation. 10 open/close iterations, 1 rewrite in 6.3667s
> method: classic. 100 open/close iterations, 1 rewrite in 67.4174s
> method: posix_fallocate. 100 open/close iterations, 1 rewrite in 37.8788s
> method: glibc emulation. 100 open/close iterations, 1 rewrite in 55.0714s

OK. That's interesting, and a good data point.

One thing you could try manually disabling the use of posix_fallocate in 269e78.
After running ./configure --stuff-here
edit src/include/pg_config.h and comment out the following line (on or
around line 374)
#define HAVE_POSIX_FALLOCATE 1

*then* build postgresql and see if the performance hit is still there.

-- 
Jon


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.4 regression

2013-08-07 Thread Thom Brown
On 7 August 2013 17:54, Jon Nelson  wrote:
> On Wed, Aug 7, 2013 at 11:21 AM, Thom Brown  wrote:
>> Hi all,
>>
>> I recently tried a simple benchmark to see how far 9.4 had come since
>> 8.4, but I discovered that I couldn't get 9.4 to even touch 8.4 for
>> performance.  After checking 9.2 and 9.3 (as per Kevin Grittner's
>> suggestion), I found that those were fine, so the issue must be in
>> 9.4devel.  I used identical configurations for each test, and used
>> 9.1's pgbench to ensure changes in pgbench didn't affect the outcome.
>>  The common config changes were:
>
> ...
>
>> 8.4: 812.482108
>> 9.4 HEAD: 355.397658
>> 9.4 e5592c (9th July): 356.485625
>> 9.4 537227 (7th July): 365.992518
>> 9.4 9b2543 (7th July): 362.587339
>> 9.4 269e78 (5th July): 359.439143
>> 9.4 8800d8 (5th July): 821.933082
>> 9.4 568d41 (2nd July): 822.991160
>>
>> 269e78 was the commit immediately after 8800d8, so it appears that
>> introduced the regression.
>>
>> "Use posix_fallocate() for new WAL files, where available."
>>
>> Ironically, that was intended to be a performance improvement.
>
> Would it be possible for you to download, compile, and run the test
> program as described and located in this email:
>
> http://www.postgresql.org/message-id/cakuk5j1acml-1cgbhnrzed-vh4og+8hkmfhy2eca-8jbj-6...@mail.gmail.com

I shall do after the 2 x 1 hour benchmarks are complete.

> I also wonder if there is a problem with the 3.8.0 kernel specifically.

Well my laptop has the same kernel (and also 64-bit Linux Mint), so
took 3 quick sample benchmarks on those two commits, and I get the
following (all without --enable-cassert):

269e78: 1162.593665 / 1158.644302 / 1153.955611
8800d8: 2446.087618 / 2443.797252 / 2321.876431

And running your test program gives the following (again, just on my laptop):

for i in 1 2 5 10 100; do ./test_fallocate foo $i 1; done
method: classic. 1 open/close iterations, 1 rewrite in 0.6380s
method: posix_fallocate. 1 open/close iterations, 1 rewrite in 0.3204s
method: glibc emulation. 1 open/close iterations, 1 rewrite in 0.6274s
method: classic. 2 open/close iterations, 1 rewrite in 1.2908s
method: posix_fallocate. 2 open/close iterations, 1 rewrite in 0.6596s
method: glibc emulation. 2 open/close iterations, 1 rewrite in 1.2666s
method: classic. 5 open/close iterations, 1 rewrite in 3.1419s
method: posix_fallocate. 5 open/close iterations, 1 rewrite in 1.5930s
method: glibc emulation. 5 open/close iterations, 1 rewrite in 3.1516s
method: classic. 10 open/close iterations, 1 rewrite in 6.2912s
method: posix_fallocate. 10 open/close iterations, 1 rewrite in 3.2626s
method: glibc emulation. 10 open/close iterations, 1 rewrite in 6.3667s
method: classic. 100 open/close iterations, 1 rewrite in 67.4174s
method: posix_fallocate. 100 open/close iterations, 1 rewrite in 37.8788s
method: glibc emulation. 100 open/close iterations, 1 rewrite in 55.0714s

-- 
Thom


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Kudos for Reviewers -- wrapping it up

2013-08-07 Thread Josh Berkus
On 08/07/2013 10:10 AM, Andres Freund wrote:
> On 2013-08-07 10:04:08 -0700, Josh Berkus wrote:
>> In the novels The Hitchhiker's Guide To The Galaxy, there's a spaceship
>> which has been waiting 1000 years to take off because it's waiting for a
>> load of lemon-soaked paper napkins to be loaded.  You are being Mr.
>> Lemon-Soaked Paper Napkin.
> 
> Holy crap Josh. I agree that we should give reviews space in the release
> notes, but could you please cut the ad-hominem bullshit? You're
> preventing your own success here.

Huh?  I was comparing Bruce's *argument* to a scene in "Hitchhiker's
Guide".  How is that ad hominem?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Kudos for Reviewers -- wrapping it up

2013-08-07 Thread Bruce Momjian
On Wed, Aug  7, 2013 at 10:04:08AM -0700, Josh Berkus wrote:
> Bruce,
> 
> > There are three issues here:
> > 
> > 1.  What will best motive reviewers?
> > 2.  What is a reasonable effort to accomplish #1?
> > 3.  What is acceptable for release note readers?
> > 
> > You seem to be only focused on #1, and you don't want to address the
> > other items --- that's fine --- I will still be around if people lose
> > interest or the system becomes unworkable.
> 
> Both I and other people have already addressed #2 and #3.  You're also
> having a huge failure of perspective here.  Motivating reviewers allows
> our project to continue developing PostgreSQL.  Items 2 and 3 are so
> insignificant in comparison to that as to not be worth discussing.

OK, my analysis was accurate then --- for you, #1 overshadows numbers 2
and 3.  I don't share those priorities, and the work required is
unbounded (no #2), so I will not perform additional work to help with
#1.

> In the novels The Hitchhiker's Guide To The Galaxy, there's a spaceship
> which has been waiting 1000 years to take off because it's waiting for a
> load of lemon-soaked paper napkins to be loaded.  You are being Mr.
> Lemon-Soaked Paper Napkin.

Actually, for me, motiving reviewers seems like the Lemon-Soaked Paper
Napkins, as it requires unbounded effort and its importance is not being
balanced with other priorities.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Kudos for Reviewers -- wrapping it up

2013-08-07 Thread Andres Freund
On 2013-08-07 10:04:08 -0700, Josh Berkus wrote:
> Bruce,
> 
> > There are three issues here:
> > 
> > 1.  What will best motive reviewers?
> > 2.  What is a reasonable effort to accomplish #1?
> > 3.  What is acceptable for release note readers?
> > 
> > You seem to be only focused on #1, and you don't want to address the
> > other items --- that's fine --- I will still be around if people lose
> > interest or the system becomes unworkable.
> 
> Both I and other people have already addressed #2 and #3.  You're also
> having a huge failure of perspective here.  Motivating reviewers allows
> our project to continue developing PostgreSQL.  Items 2 and 3 are so
> insignificant in comparison to that as to not be worth discussing.
> 
> In the novels The Hitchhiker's Guide To The Galaxy, there's a spaceship
> which has been waiting 1000 years to take off because it's waiting for a
> load of lemon-soaked paper napkins to be loaded.  You are being Mr.
> Lemon-Soaked Paper Napkin.

Holy crap Josh. I agree that we should give reviews space in the release
notes, but could you please cut the ad-hominem bullshit? You're
preventing your own success here.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] refactor heap_deform_tuple guts

2013-08-07 Thread Alvaro Herrera
Andres Freund escribió:
> On 2013-08-07 10:36:52 -0400, Alvaro Herrera wrote:

> > Yeah, I guess in isolation this doesn't make that much sense.  I am
> > hesitant to create a third copy in the minmax patch, but I will do that
> > for now and propose the refactoring later.
> 
> Well, you didn't mention upthread that you want to do this because
> you're going to need another variant of the same code. Imo that's
> sufficient reasoning.

Well, I would need to tweak the new code again, moving it from
heaptuple.c to htup_details.h as a STATIC_IF_INLINE function; and
furthermore I have noticed that if I modify the code in minmax, I can
have it do one more thing that I would need to do outside of it anyway.
(These tuples have two null bitmaps; I would have to decode the second
one separately.  Doing it inside the new "deform" variant is much
easier.)

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] StrategyGetBuffer optimization, take 2

2013-08-07 Thread Andres Freund
On 2013-08-07 09:40:24 -0500, Merlin Moncure wrote:
> > I don't think the unlocked increment of nextVictimBuffer is a good idea
> > though. nextVictimBuffer jumping over NBuffers under concurrency seems
> > like a recipe for disaster to me. At the very, very least it will need a
> > good wad of comments explaining what it means and how you're allowed to
> > use it. The current way will lead to at least bgwriter accessing a
> > nonexistant/out of bounds buffer via StrategySyncStart().
> > Possibly it won't even save that much, it might just increase the
> > contention on the buffer header spinlock's cacheline.
> 
> I agree; at least then it's not unambiguously better. if you (in
> effect) swap all contention on allocation from a lwlock to a spinlock
> it's not clear if you're improving things; it would have to be proven
> and I'm trying to keep things simple.

I think converting it to a spinlock actually is a good idea, you just
need to expand the scope a bit.

> Attached is a scaled down version of the patch that keeps the freelist
> lock but still removes the spinlock during the clock sweep.  This
> still hits the major objectives of reducing the chance of scheduling
> out while holding the BufFreelistLock and mitigating the worst case
> impact of doing so if it does happen.

FWIW, I am not convinced this is the trigger for the problems you're
seing. It's a good idea nonetheless though.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Kudos for Reviewers -- wrapping it up

2013-08-07 Thread Josh Berkus
Bruce,

> There are three issues here:
> 
> 1.  What will best motive reviewers?
> 2.  What is a reasonable effort to accomplish #1?
> 3.  What is acceptable for release note readers?
> 
> You seem to be only focused on #1, and you don't want to address the
> other items --- that's fine --- I will still be around if people lose
> interest or the system becomes unworkable.

Both I and other people have already addressed #2 and #3.  You're also
having a huge failure of perspective here.  Motivating reviewers allows
our project to continue developing PostgreSQL.  Items 2 and 3 are so
insignificant in comparison to that as to not be worth discussing.

In the novels The Hitchhiker's Guide To The Galaxy, there's a spaceship
which has been waiting 1000 years to take off because it's waiting for a
load of lemon-soaked paper napkins to be loaded.  You are being Mr.
Lemon-Soaked Paper Napkin.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.4 regression

2013-08-07 Thread Thom Brown
On 7 August 2013 17:49, Andres Freund  wrote:
> On 2013-08-07 17:21:01 +0100, Thom Brown wrote:
>> Only build option used was --enable-depend.  I did have
>> --enable-cassert for the shorter 5 min benchmarks, but was removed for
>> the 30 min tests.
>
>> pgbench -j 80 -c 80 -T 300:
>>
>> 8.4 - 535.990042
>> 9.2 - 820.798141
>> 9.3 - 828.395498
>> 9.4 - 197.851720
> e>
>> pgbench -j 80 -c 80 -T 1800:
>>
>> 8.4: 812.482108
>> 9.4 HEAD: 355.397658
>> 9.4 e5592c (9th July): 356.485625
>> 9.4 537227 (7th July): 365.992518
>> 9.4 9b2543 (7th July): 362.587339
>> 9.4 269e78 (5th July): 359.439143
>> 9.4 8800d8 (5th July): 821.933082
>> 9.4 568d41 (2nd July): 822.991160
>
> The differences between those runs look to small for enable/disable
> cassert to me. Are you you properly rebuilt for that?
>
>> 269e78 was the commit immediately after 8800d8, so it appears that
>> introduced the regression.
>>
>> "Use posix_fallocate() for new WAL files, where available."
>
> This is curious. Could you either run a longer test before/after the
> commit or reduce checkpoint_timeout to something like 3min?

Okay, I'll rerun the test for both those commits at 1 hour each with
checkpoint_timeout set at 3mins, but with all other configuration
settings the same

-- 
Thom


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.4 regression

2013-08-07 Thread Jon Nelson
On Wed, Aug 7, 2013 at 11:21 AM, Thom Brown  wrote:
> Hi all,
>
> I recently tried a simple benchmark to see how far 9.4 had come since
> 8.4, but I discovered that I couldn't get 9.4 to even touch 8.4 for
> performance.  After checking 9.2 and 9.3 (as per Kevin Grittner's
> suggestion), I found that those were fine, so the issue must be in
> 9.4devel.  I used identical configurations for each test, and used
> 9.1's pgbench to ensure changes in pgbench didn't affect the outcome.
>  The common config changes were:

...

> 8.4: 812.482108
> 9.4 HEAD: 355.397658
> 9.4 e5592c (9th July): 356.485625
> 9.4 537227 (7th July): 365.992518
> 9.4 9b2543 (7th July): 362.587339
> 9.4 269e78 (5th July): 359.439143
> 9.4 8800d8 (5th July): 821.933082
> 9.4 568d41 (2nd July): 822.991160
>
> 269e78 was the commit immediately after 8800d8, so it appears that
> introduced the regression.
>
> "Use posix_fallocate() for new WAL files, where available."
>
> Ironically, that was intended to be a performance improvement.

Would it be possible for you to download, compile, and run the test
program as described and located in this email:

http://www.postgresql.org/message-id/cakuk5j1acml-1cgbhnrzed-vh4og+8hkmfhy2eca-8jbj-6...@mail.gmail.com

I also wonder if there is a problem with the 3.8.0 kernel specifically.

-- 
Jon


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Should we remove "not fast" promotion at all?

2013-08-07 Thread Andres Freund
On 2013-08-07 22:26:53 +0900, Fujii Masao wrote:
> On Tue, Aug 6, 2013 at 1:07 PM, Tom Lane  wrote:
> > Fujii Masao  writes:
> >> On Tue, Aug 6, 2013 at 11:40 AM, Andres Freund  
> >> wrote:
> >>> FWIW I'd rather keep plain promotion for a release or two. TBH, I have a
> >>> bit of trust issues regarding the new method, and I'd like to be able to
> >>> test potential issues against a stock postgres by doing a normal instead
> >>> of a fast promotion.
> >
> >> So we should add new option specifying the promotion mode, into pg_ctl?
> >> Currently pg_ctl cannot trigger the normal promotion.
> >
> > It would be silly to add such an option if we want to remove the old mode
> > in a release or two.
> 
> Without such an option, a user cannot easily trigger the "normal" promotion
> when we find some problems in fast promotion. In this case, a user needs to
> create the "promote" file and send the SIGUSR1 signal to postmaster by hand.
> Or needs to execute pg_ctl promote by using old version (e.g., 9.2) of pg_ctl.
> Seems confusing.

Seems fine for debugging to me.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.4 regression

2013-08-07 Thread Andres Freund
On 2013-08-07 17:21:01 +0100, Thom Brown wrote:
> Only build option used was --enable-depend.  I did have
> --enable-cassert for the shorter 5 min benchmarks, but was removed for
> the 30 min tests.

> pgbench -j 80 -c 80 -T 300:
> 
> 8.4 - 535.990042
> 9.2 - 820.798141
> 9.3 - 828.395498
> 9.4 - 197.851720
e> 
> pgbench -j 80 -c 80 -T 1800:
> 
> 8.4: 812.482108
> 9.4 HEAD: 355.397658
> 9.4 e5592c (9th July): 356.485625
> 9.4 537227 (7th July): 365.992518
> 9.4 9b2543 (7th July): 362.587339
> 9.4 269e78 (5th July): 359.439143
> 9.4 8800d8 (5th July): 821.933082
> 9.4 568d41 (2nd July): 822.991160

The differences between those runs look to small for enable/disable
cassert to me. Are you you properly rebuilt for that?

> 269e78 was the commit immediately after 8800d8, so it appears that
> introduced the regression.
> 
> "Use posix_fallocate() for new WAL files, where available."

This is curious. Could you either run a longer test before/after the
commit or reduce checkpoint_timeout to something like 3min?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] refactor heap_deform_tuple guts

2013-08-07 Thread Andres Freund
On 2013-08-07 10:36:52 -0400, Alvaro Herrera wrote:
> Robert Haas escribió:
> > On Tue, Aug 6, 2013 at 6:32 PM, Alvaro Herrera  
> > wrote:
> > > heap_deform_tuple and slot_deform_tuple contain duplicated code.  This
> > > patch refactors them so that the guts are in a single place.
> > >
> > > I have checked the resulting assembly code for heap_deform_tuple, and
> > > with the "inline" declaration, the gcc version I have (4.7.2) generates
> > > almost identical output both after the patch than before, thus there
> > > shouldn't be any slowdown.
> > 
> > Although I'm generally in favor of eliminating duplicated code, I have
> > to admit that in this case I'm not sure I see the point.
> 
> Yeah, I guess in isolation this doesn't make that much sense.  I am
> hesitant to create a third copy in the minmax patch, but I will do that
> for now and propose the refactoring later.

Well, you didn't mention upthread that you want to do this because
you're going to need another variant of the same code. Imo that's
sufficient reasoning.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [9.3 bug] disk space in pg_xlog increases during archive recovery

2013-08-07 Thread Jeff Janes
On Wed, Aug 7, 2013 at 7:03 AM, Fujii Masao  wrote:
> On Fri, Aug 2, 2013 at 12:24 AM, MauMau  wrote:
>> From: "Fujii Masao" 
>>
 However, isn't StandbyRequested true (= standby_mode set to on) to enable
 warm standby?
>>>
>>>
>>> We can set up warm-standby by using pg_standby even if standby_mode = off.
>>
>>
>> I see.  However, I understand that pg_standby is a legacy feature, and the
>> current way to setup a warm standby is to set standby=on and restore_command
>> in recovery.conf.  So I think it might not be necessary to support cascading
>> replication with pg_standby, if supporting it would prevent better
>> implementation of new features.
>>
>>
>>
  I'm afraid people set max_wal_senders>0 and hot_standby=on
 even on the primary server to make the contents of postgresql.conf
 identical
 on both the primary and the standby for easier configuration.  If so,
 normal
 archive recovery (PITR, not the standby recovery) would face the original
 problem -- unnecessary WAL accumulation in pg_xlog/.  So I'm wonder if
 AllowCascadeReplication() is enough.
>>>
>>>
>>> One idea is to add new GUC parameter like enable_cascade_replication
>>> and then prevent WAL file from being kept in pg_xlog if that parameter is
>>> off.
>>> But we cannot apply such change into the already-released version 9.2.
>>
>>
>> Yes, I thought the same, too.  Adding a new GUC to enable cascading
>> replication now would be a usability degradation.  But I have no idea of any
>> better way.  It seems we need to take either v1 or v2 of the patch I sent.
>> If we consider that we don't have to support cascading replication for a
>> legacy pg_standby, v1 patch is definitely better because the standby server
>> would not keep restored archive WAL in pg_xlog when cascading replication is
>> not used.  Otherwise, we have to take v2 patch.
>>
>> Could you choose either and commit it?
>
> On second thought, probably we cannot remove the restored WAL files early
> because they might be required for fast promotion which is new feature in 9.3.
> In fast promotion, an end-of-recovery checkpoint is not executed. After the 
> end
> of recovery, normal online checkpoint starts. If the server crashes before 
> such
> an online checkpoint completes, the server needs to replay again all the WAL
> files which it replayed before the promotion. Since this is the crash 
> recovery,
> all those WAL files need to exist in pg_xlog directory. So if we remove the
> restored WAL file from pg_xlog early, such a crash recovery might fail.
>
> So even if cascade replication is disabled, if standby_mode = on, i.e., fast
> promotion can be performed, we cannot remove the restored WAL files early.

But we should still be able to remove files older than the latest
restart point, right?

Cheers,

Jeff


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] 9.4 regression

2013-08-07 Thread Thom Brown
Hi all,

I recently tried a simple benchmark to see how far 9.4 had come since
8.4, but I discovered that I couldn't get 9.4 to even touch 8.4 for
performance.  After checking 9.2 and 9.3 (as per Kevin Grittner's
suggestion), I found that those were fine, so the issue must be in
9.4devel.  I used identical configurations for each test, and used
9.1's pgbench to ensure changes in pgbench didn't affect the outcome.
 The common config changes were:

max_connections = 500
shared_buffers = 4GB
effective_cache_size = 12GB
random_page_cost = 2.0
cpu_tuple_cost = 0.03
wal_buffers = 32MB
work_mem = 100MB
maintenance_work_mem = 512MB
checkpoint_segments = 32
checkpoint_timeout = 15min
checkpoint_completion_target = 0.8
commit_delay = 50
commit_siblings = 15

System info:
8GB DDR3 RAM (yes, I know the config isn't optimal here)
64-bit Linux Mint 15 (3.8.0 kernel)
ext4

Only build option used was --enable-depend.  I did have
--enable-cassert for the shorter 5 min benchmarks, but was removed for
the 30 min tests.


Here are the results:

pgbench -j 80 -c 80 -T 300:

8.4 - 535.990042
9.2 - 820.798141
9.3 - 828.395498
9.4 - 197.851720


pgbench -j 20 -c 20 -T 300:

8.4 - 496.529343
9.2 - 569.626729
9.3 - 575.831264
9.4 - 385.658893


pgbench -j 20 -c 400 -T 300:

8.4 - 580.186868
9.2 - 824.590252
9.3 - 784.638848
9.4 - 524.493308

These were only run for 5 minutes each, but subsequent ones were run
for 30 mins.  All tests were run with -s 20.

Given a few key commits Robert Haas directed me to, I narrowed down
the regression to a time period, so benchmarked a few select commits.

pgbench -j 80 -c 80 -T 1800:

8.4: 812.482108
9.4 HEAD: 355.397658
9.4 e5592c (9th July): 356.485625
9.4 537227 (7th July): 365.992518
9.4 9b2543 (7th July): 362.587339
9.4 269e78 (5th July): 359.439143
9.4 8800d8 (5th July): 821.933082
9.4 568d41 (2nd July): 822.991160

269e78 was the commit immediately after 8800d8, so it appears that
introduced the regression.

"Use posix_fallocate() for new WAL files, where available."

Ironically, that was intended to be a performance improvement.

Regards

Thom


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unsafe GUCs and ALTER SYSTEM WAS: Re: ALTER SYSTEM SET

2013-08-07 Thread Bruce Momjian
On Wed, Aug  7, 2013 at 10:59:05AM -0400, Alvaro Herrera wrote:
> > If we do this, perhaps we should unconditionally just print the file
> > name they have to delete to undo the operation in case the server
> > doesn't start;
> 
> However, bear in mind that if the DBA is administering a server through
> ALTER SYSTEM and they don't have shell access, they might just be
> screwed if they bollix the system and they lose access.  Knowing what
> file you have to delete does you no good if you can't actually delete it.

My point was that if we tell them what file they have to edit to undo a
problem, then at least then know that there is chance this will require
shell access, and they can undo the change right there (with SET var =
DEFAULT?) e.g.:

WARNING:  If this change prevents the server from starting, you will
  need to edit postgresql.conf.auto to undo the change

> > I am unclear we can clearly identify all the GUC
> > settings that could cause a server not to start.  Also, I think we need
> > a SHOW SYSTEM command so users can see their settings via SQL.
> 
> Not sure about this.  SHOW normally just displays the current value,
> nothing more.  If you want more details, there's the pg_settings view
> with complete information.

The issue is that if we give users the ability to set values via SQL, we
should at least document how they can view the things they set via SQL,
and pg_settings doesn't do that because it just shows the _active_
value, which might have overridden the SQL-set value.

Right now we can see settings in postgresql.conf by viewing the file,
and ALTER USER/DATABASE via system tables.  Having no API to view
SQL-set values except viewing a flat file seems counter-productive.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unsafe GUCs and ALTER SYSTEM WAS: Re: ALTER SYSTEM SET

2013-08-07 Thread Alvaro Herrera
Bruce Momjian escribió:

> We already have six levels of GUC settings:
> 
>   postgresql.conf
>   user
>   database
>   session
>   function
>   subtransaction
> 
> If we add ALTER SYSTEM SET and config.d, we would then have eight. 

Actually, conf.d (not config.d) would just be reported as a file, and
the "file" column in the pg_settings view would show which file it is.
That's how it's done for postgresql.conf and any other file it includes,
so that seems a pretty reasonable thing to me.

> ALTER SYSTEM SET seems to add an entirely new set of behaviors and
> complexity.  Is that really what we want?

I had imagined that ALTER SYSTEM SET would be represented in the same
way as above, i.e. just be a different source file.  But thinking more
about it now, that doesn't make sense, because those files might be in a
completely different base directory, and the file names shouldn't even
be exposed to the user in the first place; so clearly ALTER SYSTEM
should show up differently in pg_settings.  Displaying each option's
full path seems useful for troubleshooting, as you say:

> If we do this, perhaps we should unconditionally just print the file
> name they have to delete to undo the operation in case the server
> doesn't start;

However, bear in mind that if the DBA is administering a server through
ALTER SYSTEM and they don't have shell access, they might just be
screwed if they bollix the system and they lose access.  Knowing what
file you have to delete does you no good if you can't actually delete it.

> I am unclear we can clearly identify all the GUC
> settings that could cause a server not to start.  Also, I think we need
> a SHOW SYSTEM command so users can see their settings via SQL.

Not sure about this.  SHOW normally just displays the current value,
nothing more.  If you want more details, there's the pg_settings view
with complete information.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] StrategyGetBuffer optimization, take 2

2013-08-07 Thread Merlin Moncure
On Mon, Aug 5, 2013 at 11:40 AM, Andres Freund  wrote:
> On 2013-08-05 10:49:08 -0500, Merlin Moncure wrote:
>> optimization 4: remove free list lock (via Jeff Janes).  This is the
>> other optimization: one backend will no longer be able to shut down
>> buffer allocation
>
> I think splitting off the actual freelist checking into a spinlock makes
> quite a bit of sense. And I think a separate one should be used for the
> actual search for the clock sweep.


> I don't think the unlocked increment of nextVictimBuffer is a good idea
> though. nextVictimBuffer jumping over NBuffers under concurrency seems
> like a recipe for disaster to me. At the very, very least it will need a
> good wad of comments explaining what it means and how you're allowed to
> use it. The current way will lead to at least bgwriter accessing a
> nonexistant/out of bounds buffer via StrategySyncStart().
> Possibly it won't even save that much, it might just increase the
> contention on the buffer header spinlock's cacheline.

I agree; at least then it's not unambiguously better. if you (in
effect) swap all contention on allocation from a lwlock to a spinlock
it's not clear if you're improving things; it would have to be proven
and I'm trying to keep things simple.

Attached is a scaled down version of the patch that keeps the freelist
lock but still removes the spinlock during the clock sweep.  This
still hits the major objectives of reducing the chance of scheduling
out while holding the BufFreelistLock and mitigating the worst case
impact of doing so if it does happen.  An even more scaled down
version would keep the current logic exactly as is except for
replacing buffer lock in the clock sweep with a trylock (which is
IMNSHO a no-brainer).

merlin


buffer3.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] refactor heap_deform_tuple guts

2013-08-07 Thread Alvaro Herrera
Robert Haas escribió:
> On Tue, Aug 6, 2013 at 6:32 PM, Alvaro Herrera  
> wrote:
> > heap_deform_tuple and slot_deform_tuple contain duplicated code.  This
> > patch refactors them so that the guts are in a single place.
> >
> > I have checked the resulting assembly code for heap_deform_tuple, and
> > with the "inline" declaration, the gcc version I have (4.7.2) generates
> > almost identical output both after the patch than before, thus there
> > shouldn't be any slowdown.
> 
> Although I'm generally in favor of eliminating duplicated code, I have
> to admit that in this case I'm not sure I see the point.

Yeah, I guess in isolation this doesn't make that much sense.  I am
hesitant to create a third copy in the minmax patch, but I will do that
for now and propose the refactoring later.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [9.3 bug] disk space in pg_xlog increases during archive recovery

2013-08-07 Thread Fujii Masao
On Fri, Aug 2, 2013 at 12:24 AM, MauMau  wrote:
> From: "Fujii Masao" 
>
>>> However, isn't StandbyRequested true (= standby_mode set to on) to enable
>>> warm standby?
>>
>>
>> We can set up warm-standby by using pg_standby even if standby_mode = off.
>
>
> I see.  However, I understand that pg_standby is a legacy feature, and the
> current way to setup a warm standby is to set standby=on and restore_command
> in recovery.conf.  So I think it might not be necessary to support cascading
> replication with pg_standby, if supporting it would prevent better
> implementation of new features.
>
>
>
>>>  I'm afraid people set max_wal_senders>0 and hot_standby=on
>>> even on the primary server to make the contents of postgresql.conf
>>> identical
>>> on both the primary and the standby for easier configuration.  If so,
>>> normal
>>> archive recovery (PITR, not the standby recovery) would face the original
>>> problem -- unnecessary WAL accumulation in pg_xlog/.  So I'm wonder if
>>> AllowCascadeReplication() is enough.
>>
>>
>> One idea is to add new GUC parameter like enable_cascade_replication
>> and then prevent WAL file from being kept in pg_xlog if that parameter is
>> off.
>> But we cannot apply such change into the already-released version 9.2.
>
>
> Yes, I thought the same, too.  Adding a new GUC to enable cascading
> replication now would be a usability degradation.  But I have no idea of any
> better way.  It seems we need to take either v1 or v2 of the patch I sent.
> If we consider that we don't have to support cascading replication for a
> legacy pg_standby, v1 patch is definitely better because the standby server
> would not keep restored archive WAL in pg_xlog when cascading replication is
> not used.  Otherwise, we have to take v2 patch.
>
> Could you choose either and commit it?

On second thought, probably we cannot remove the restored WAL files early
because they might be required for fast promotion which is new feature in 9.3.
In fast promotion, an end-of-recovery checkpoint is not executed. After the end
of recovery, normal online checkpoint starts. If the server crashes before such
an online checkpoint completes, the server needs to replay again all the WAL
files which it replayed before the promotion. Since this is the crash recovery,
all those WAL files need to exist in pg_xlog directory. So if we remove the
restored WAL file from pg_xlog early, such a crash recovery might fail.

So even if cascade replication is disabled, if standby_mode = on, i.e., fast
promotion can be performed, we cannot remove the restored WAL files early.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [9.3 bug] disk space in pg_xlog increases during archive recovery

2013-08-07 Thread Fujii Masao
On Fri, Aug 2, 2013 at 10:52 AM, Michael Paquier
 wrote:
>
>
>
> On Fri, Aug 2, 2013 at 12:24 AM, MauMau  wrote:
>>
>> From: "Fujii Masao" 
>>
 However, isn't StandbyRequested true (= standby_mode set to on) to
 enable
 warm standby?
>>>
>>>
>>> We can set up warm-standby by using pg_standby even if standby_mode =
>>> off.
>>
>>
>> I see.  However, I understand that pg_standby is a legacy feature, and the
>> current way to setup a warm standby is to set standby=on and restore_command
>> in recovery.conf.  So I think it might not be necessary to support cascading
>> replication with pg_standby, if supporting it would prevent better
>> implementation of new features.
>
> You are right about that, you should stick with the core features as much as
> you can and limit the use of wrapper utilities. Since 9.1 and the apparition
> of a built-in standby mode inside Postgres (with the apparition of the GUC
> parameter hot_standby), it seems better to use that instead of pg_standby,
> which would likely (personal opinion, feel free to scream at me) be removed
> in a future release.

I'm sure that there are some users who use pg_standby for warm-standby
for some reasons, for example, the document (*1) explains such a configuration,
a user can specify the file-check-interval (by using -s option), and so on.
I agree that using pg_standby + cascade replication would be very rare.
But I'm not confident that *no one* uses pg_standby + cascade replication.

(*1) http://www.postgresql.org/docs/devel/static/log-shipping-alternative.html

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Should we remove "not fast" promotion at all?

2013-08-07 Thread Fujii Masao
On Tue, Aug 6, 2013 at 1:07 PM, Tom Lane  wrote:
> Fujii Masao  writes:
>> On Tue, Aug 6, 2013 at 11:40 AM, Andres Freund  
>> wrote:
>>> FWIW I'd rather keep plain promotion for a release or two. TBH, I have a
>>> bit of trust issues regarding the new method, and I'd like to be able to
>>> test potential issues against a stock postgres by doing a normal instead
>>> of a fast promotion.
>
>> So we should add new option specifying the promotion mode, into pg_ctl?
>> Currently pg_ctl cannot trigger the normal promotion.
>
> It would be silly to add such an option if we want to remove the old mode
> in a release or two.

Without such an option, a user cannot easily trigger the "normal" promotion
when we find some problems in fast promotion. In this case, a user needs to
create the "promote" file and send the SIGUSR1 signal to postmaster by hand.
Or needs to execute pg_ctl promote by using old version (e.g., 9.2) of pg_ctl.
Seems confusing.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] updated emacs configuration

2013-08-07 Thread Dimitri Fontaine
Peter Eisentraut  writes:
> Did anyone have any outstanding concerns about this latest version?  I
> thought it looked ready to commit.

I wanted to have another round at trying it here and get a better
understanding at my failures of the previous time, but that's not a
blocker at all. I had another look at it and it looks ready to me too.

Minor detail: you seem to have forgotten to apply the postgres(ql)?
regexp to the GNU Make mode in the emacs.samples file.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] updated emacs configuration

2013-08-07 Thread Peter Eisentraut
On 7/2/13 8:42 PM, Peter Eisentraut wrote:
> Updated files with changes:
> 
> - adjusted fill-column to 78, per Noah
> - added c-file-style, per Andrew
> - support both "postgresql" and "postgres" directory names
> - use defun instead of lambda, per Dimitri
> - put Perl configuration back into emacs.samples, for Tom
> 
> I also added configuration of c-auto-align-backslashes as well as label
> and statement-case-open to c-offsets-alist.  With those changes, the
> result of indent-region is now very very close to pgindent, with the
> main exception of the end-of-line de-indenting that pgindent does, which
> nobody likes anyway.

Did anyone have any outstanding concerns about this latest version?  I
thought it looked ready to commit.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] how to pass data (tuples) to worker processes?

2013-08-07 Thread Jeremy Harris

On 06/08/13 13:59, Robert Haas wrote:

  My thought is to create a
queue abstraction that sits on top of the dynamic shared memory
infrastructure, so that you can set aside a portion of your dynamic
shared memory segment to use as a ring buffer and send messages back
and forth with using some kind of API along these lines:


You may find http://quatermass.co.uk/toolsmith/mplib1/ of use here.
--
Cheers,
   Jeremy


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers