Re: [HACKERS] standby, pg_basebackup and last xlog file

2013-02-07 Thread Миша Тюрин


Hello all, and Heikki in particular. Thank you for your answer.

I have some new points:


21.01.2013, 10:08 +02:00, Heikki Linnakangas wrote:
>On 21.01.2013 09:14, Миша Тюрин wrote:
>>Is there any reason why pg_basebackup has limitation in an online backup 
>> from the standby: "The backup history file is not created in the database 
>> cluster backed up." ?
>
>WAL archiving isn't active in a standby, so even if it created a backup 
>history file, it wouldn't go anywhere. Also, the way the backup history 
>files are named, if you take a backup on the master at the same time (or 
>another backup at the same time in the standby), you would end up with 
>multiple backup history files with the same name.
>
>>So i can't get last xlog file needed to restore :(
>
>Good point. That was an oversight in the patch that allowed base backups 
>from a standby.
>
>>Also maybe i can use something like ( pg_last_xlog_replay_location() + 1 
>> ) after pg_basebackup finished.
>
>Yeah, that should work.
>
>>Does anybody know true way to getting last xlog file in case of applying 
>> pg_basebackup to standby?
>>How pg_basebackup gets last xlog file in case of standby and -x option?
>
>The server returns the begin and end WAL locations to pg_basebackup, 
>pg_basebackup just doesn't normally print them out to the user. In 
>verbose mode, it does actually print them out, but only with -x, so that 
>doesn't help you either. If you can compile from source, you could 
>modify pg_basebackup.c to print the locations without -x and --verbose, 
>search lines that print out "transaction log start point / end position".


1) We can get the last needed xlog file by using the control data's "Minimum
recovery ending location".


>
>
>As a workaround, without modifying the source, you can do this after 
>pg_basebackup has finished, to get the first WAL file you need:
>
>$ pg_controldata backupdir | grep "REDO WAL file"
>Latest checkpoint's REDO WAL file:    000000030000000000000009


And I would like to correct your suggestion about the first WAL file:
2.1)  we should use backup_label to determine the first needed WAL file;
2.2)  we must not use the REDO location from the checkpoint, because there might
be more than one checkpoint during the base backup.
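Point 2.1 can be sketched in code. The following is a minimal, standalone illustration (not PostgreSQL source; the backup_label text and the helper name are made up for this example) of pulling the first required WAL segment name out of a backup_label file's START WAL LOCATION line:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Extract the WAL segment name from a backup_label buffer, e.g. from
 *   START WAL LOCATION: 0/9000028 (file 000000030000000000000009)
 * Returns 1 on success and copies the name into fname, 0 otherwise.
 */
static int
first_needed_wal(const char *backup_label, char *fname, size_t len)
{
    const char *p = strstr(backup_label, "START WAL LOCATION:");

    if (p == NULL)
        return 0;
    p = strstr(p, "(file ");
    if (p == NULL)
        return 0;
    p += strlen("(file ");

    /* copy up to the closing paren */
    size_t i = 0;
    while (i + 1 < len && p[i] != ')' && p[i] != '\0')
    {
        fname[i] = p[i];
        i++;
    }
    fname[i] = '\0';
    return i > 0;
}
```

Reading the name from backup_label avoids the problem in 2.2: the REDO pointer of the latest checkpoint can move past the backup's start point if more than one checkpoint happens during the backup.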


>
>And as you suggested, pg_last_xlog_replay_location() for the last WAL 
>file you need.
>
>- Heikki

- Misha

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Comment typo

2013-02-07 Thread Etsuro Fujita
I found a comment typo.  Please find attached a patch.

Best regards,
Etsuro Fujita


comment_typo.patch
Description: Binary data



Re: [HACKERS] proposal: ANSI SQL 2011 syntax for named parameters

2013-02-07 Thread Tom Lane
Robert Haas  writes:
> On Thu, Feb 7, 2013 at 6:42 AM, Simon Riggs  wrote:
>> IMO the way to resolve that conflict is with a behaviour parameter to
>> allow people to choose, rather than be forced to wait a year because
>> some people still run an old version of an add-on package. A good way
>> to do that would be to have a sql_standard = postgres | 2011 etc so we
>> can tick the box in having a sql standard flagger as well.

> The undesirability of syntax-altering GUCs has been discussed here on
> many occasions.

Note that a GUC to change the behavior of the lexer or grammar is
particularly undesirable, for reasons noted at the top of gram.y as
well as others having to do with the behavior of plancache.c.
(Hint: it caches grammar output, not raw source text.)

We've put up with that for standard_conforming_strings because we
pretty much had to, but that doesn't mean that introducing more
such GUCs would be wise.

But regardless of those particular implementation artifacts, I think
most of us have come to the conclusion that GUCs that alter query
semantics are far more dangerous and unpleasant-to-use than they
might look.

regards, tom lane




Re: [HACKERS] [JDBC] JPA + enum == Exception

2013-02-07 Thread Tom Lane
Tom Dunstan  writes:
> ... That works ok, but when attempting to use a prepared statement:

> ps = con.prepareStatement("insert into enumcast values (?)");
> ps.setString(1, "meh");
> ps.executeUpdate();

> we get a

> org.postgresql.util.PSQLException: ERROR: column "current_mood" is of
> type mood but expression is of type character varying
>   Hint: You will need to rewrite or cast the expression.

AFAIK this is just business as usual with JDBC: setString() implies that
the parameter is of a string type.  It'll fall over if the type actually
required is anything but a string.  (I'm no Java expert, but I seem to
recall that using setObject instead is the standard workaround.)

Enums are not suffering any special hardship here, and I'd be against
weakening the type system to give them a special pass.

regards, tom lane




Re: [HACKERS] Vacuum/visibility is busted

2013-02-07 Thread Tom Lane
Alvaro Herrera  writes:
>   xid = HeapTupleHeaderGetRawXmax(tuple);
> ! if (((tuple->t_infomask & HEAP_XMAX_IS_MULTI) &&
> !  MultiXactIdIsValid(xid) &&
> !  MultiXactIdPrecedes(xid, cutoff_multi)) ||
> ! ((!(tuple->t_infomask & HEAP_XMAX_IS_MULTI)) &&
> !  TransactionIdIsNormal(xid) &&
> !  TransactionIdPrecedes(xid, cutoff_xid)))
>   {

Would this be clearer as a ternary expression?  That is,

if ((tuple->t_infomask & HEAP_XMAX_IS_MULTI) ?
(MultiXactIdIsValid(xid) &&
 MultiXactIdPrecedes(xid, cutoff_multi)) :
(TransactionIdIsNormal(xid) &&
 TransactionIdPrecedes(xid, cutoff_xid)))
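For what it's worth, the two spellings can be checked against each other in isolation. This sketch stubs the predicates out as plain booleans, so it only exercises the boolean structure of the patch's condition versus the ternary form, nothing about real tuples:

```c
#include <assert.h>
#include <stdbool.h>

/* The patch's condition: an OR of two guarded conjuncts. */
static bool
freeze_xmax_or_form(bool is_multi, bool multi_valid, bool precedes_multi,
                    bool xid_normal, bool precedes_xid)
{
    return (is_multi && multi_valid && precedes_multi) ||
           (!is_multi && xid_normal && precedes_xid);
}

/* The suggested ternary rewrite. */
static bool
freeze_xmax_ternary_form(bool is_multi, bool multi_valid, bool precedes_multi,
                         bool xid_normal, bool precedes_xid)
{
    return is_multi ? (multi_valid && precedes_multi)
                    : (xid_normal && precedes_xid);
}

/* Exhaustively verify the two forms agree on all 32 input combinations. */
static bool
forms_agree(void)
{
    for (int bits = 0; bits < 32; bits++)
    {
        bool a = bits & 1, b = bits & 2, c = bits & 4,
             d = bits & 8, e = bits & 16;

        if (freeze_xmax_or_form(a, b, c, d, e) !=
            freeze_xmax_ternary_form(a, b, c, d, e))
            return false;
    }
    return true;
}
```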

regards, tom lane




Re: [HACKERS] [JDBC] JPA + enum == Exception

2013-02-07 Thread Tom Dunstan
> -Original Message-
> From: pgsql-jdbc-ow...@postgresql.org 
> [mailto:pgsql-jdbc-ow...@postgresql.org] On Behalf Of Marc G. Fournier
> I'm trying to use enum's in a database, but the java guys are telling me that 
> they are having problems with inserts ...
> reading from the database isn't a problem, but there appears to be an issue 
> with converting from string -> enum when saving it back again ...

This is interesting; it seems to be a difference between executing the
SQL directly and using a prepared statement:

tomtest=# create type mood as enum ('happy', 'meh', 'sad');
CREATE TYPE
tomtest=# create table enumcast  (current_mood mood);
CREATE TABLE
tomtest=# insert into enumcast values ('sad');
INSERT 0 1
tomtest=# select * from enumcast ;
 current_mood
--------------
 sad
(1 row)


That works ok, but when attempting to use a prepared statement:

ps = con.prepareStatement("insert into enumcast values (?)");
ps.setString(1, "meh");
ps.executeUpdate();

we get a

org.postgresql.util.PSQLException: ERROR: column "current_mood" is of
type mood but expression is of type character varying
  Hint: You will need to rewrite or cast the expression.

Cue sad trombone. You can fix this with implicit casts using CREATE
CAST, or an explicit cast in the query, but this shouldn't really be
necessary for what is a basic use case for enums. In any case ORMs
won't know how to do that without writing custom converters, which
makes me sad. I had intended that ORMs could just treat enum fields as
text fields basically and not have to care about the underlying
implementation.

Cc'ing hackers - why the difference here? I presume that the input
function is getting triggered when the value is inline in the SQL, but
not so when the statement is prepared. Should we consider creating an
implicit cast from text to enums when we create an enum? Or is there
some other way to get the expected behaviour here?

Cheers

Tom




Re: [HACKERS] Vacuum/visibility is busted

2013-02-07 Thread Alvaro Herrera
Alvaro Herrera escribió:
> Alvaro Herrera escribió:
> 
> > Hm, if the foreign key patch is to blame, this sounds like these tuples
> > had a different set of XMAX hint bits and a different Xmax, and they
> > were clobbered by something like vacuum or page pruning.
> 
> Hm, I think heap_freeze_tuple is busted, yes.

This patch seems to fix the problem for me.  Please confirm.

-- 
Álvaro Herrera      http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
*** a/src/backend/access/heap/heapam.c
--- b/src/backend/access/heap/heapam.c
***
*** 5113,5122  heap_freeze_tuple(HeapTupleHeader tuple, TransactionId cutoff_xid,
  	 * cutoff; it doesn't remove dead members of a very old multixact.
  	 */
  	xid = HeapTupleHeaderGetRawXmax(tuple);
! 	if (TransactionIdIsNormal(xid) &&
! 		(((!(tuple->t_infomask & HEAP_XMAX_IS_MULTI) &&
! 		   TransactionIdPrecedes(xid, cutoff_xid))) ||
! 		 MultiXactIdPrecedes(xid, cutoff_multi)))
  	{
  		HeapTupleHeaderSetXmax(tuple, InvalidTransactionId);
  
--- 5113,5124 
  	 * cutoff; it doesn't remove dead members of a very old multixact.
  	 */
  	xid = HeapTupleHeaderGetRawXmax(tuple);
! 	if (((tuple->t_infomask & HEAP_XMAX_IS_MULTI) &&
! 		 MultiXactIdIsValid(xid) &&
! 		 MultiXactIdPrecedes(xid, cutoff_multi)) ||
! 		((!(tuple->t_infomask & HEAP_XMAX_IS_MULTI)) &&
! 		 TransactionIdIsNormal(xid) &&
! 		 TransactionIdPrecedes(xid, cutoff_xid)))
  	{
  		HeapTupleHeaderSetXmax(tuple, InvalidTransactionId);
  



Re: [HACKERS] issues with range types, btree_gist and constraints

2013-02-07 Thread Tom Lane
Tomas Vondra  writes:
> I can't reproduce any of the issues anymore - I've tested both 9.2 and
> 9.3 branches (both containing the already committed fixes). And not only
> that the issues seem to be gone, but I'm actually getting significantly
> better performance.

Cool.  I noticed that it seemed faster too, but hadn't tried to quantify
that.

> I've also tested both branches (9.2 and 9.3) with the provided patch,
> and I can't reproduce any of the errors, but the performance seems to be
> equal to the old code. So it seems that the performance boost is due to
> the changes to the penalty function. Nice!

Yeah, the old penalty function was pretty badly broken with any non-C
sort order.

>> your report that the behavior is unstable, because I get the same result
>> each time if I start from an empty (truncated) table, with or without
>> the patch.  You may be seeing some further bug(s).  Could you test this
>> fix against your own test cases?

> This is what I meant by instability:
> 
> Notice the numbers change all the time.

[ scratches head... ]  In HEAD, that's not so surprising because of
commit ba1cc6501, which added some randomness to choices about which
GiST page to insert new entries to (which could then affect whether the
union-calculation bugs got exercised).  If you saw that in older
branches, though, it might merit some further investigation.

regards, tom lane




Re: [HACKERS] performance bug in instrument.c

2013-02-07 Thread Tom Lane
Tomas Vondra  writes:
> Tom Lane  wrote:
>> A bigger question is why this is elog(DEBUG2) and not elog(ERROR).
>> Had it been the latter, we'd have noticed the mistake immediately.
>> The current choice might be masking any caller-logic errors that
>> exist, too.

> Not sure why it's DEBUG2, but if it really is an error then it should be
> logged as ERROR I guess.

Perhaps there was a reason for that once, but AFAICT the logic is clean
now.  I changed these elogs to ERROR and ran all the regression tests
under auto_explain, and nothing failed.  That's hardly conclusive of
course, but it's enough evidence to persuade me to make the change in
HEAD.

regards, tom lane




Re: [HACKERS] performance bug in instrument.c

2013-02-07 Thread Tomas Vondra
Tom Lane  wrote:
> Tomas Vondra  writes:
> There's this piece of code in InstrStartNode:
>
>> if (instr->need_timer && INSTR_TIME_IS_ZERO(instr->starttime))
>> INSTR_TIME_SET_CURRENT(instr->starttime);
>> else
>> elog(DEBUG2, "InstrStartNode called twice in a row");
>
>> but it should actually be like this
>
>> if (instr->need_timer)
>> {
>> if (INSTR_TIME_IS_ZERO(instr->starttime))
>>   INSTR_TIME_SET_CURRENT(instr->starttime);
>> else
>> elog(DEBUG2, "InstrStartNode called twice in a row");
>> }
>
> Hm.  It's a bit annoying that we can't detect the "called twice"
> condition when !need_timer, but I suppose that any such bug would be a
> caller logic error that would probably not be sensitive to need_timer,
> so it's likely not worth adding overhead to handle that.

Yes, that's annoying. But if we need / want to detect that, wouldn't it
be cleaner to add a separate "already executed" flag, instead of
misusing starttime for that?
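A rough sketch of that idea (the field and function names here are hypothetical, not the real instrument.h API): with an explicit flag, a double call is detectable whether or not the timer is in use:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified instrumentation state; the real struct
 * in instrument.h has more fields. */
typedef struct SketchInstr
{
    bool    need_timer;
    bool    running;        /* explicit "start was called" flag */
    long    start_calls;
    long    double_starts;
} SketchInstr;

/* Returns false (standing in for elog) when called twice in a row,
 * regardless of need_timer. */
static bool
sketch_start_node(SketchInstr *instr)
{
    instr->start_calls++;
    if (instr->running)
    {
        instr->double_starts++;     /* would be elog(ERROR, ...) */
        return false;
    }
    instr->running = true;
    /* the timer would be read here, but only if instr->need_timer */
    return true;
}

static void
sketch_stop_node(SketchInstr *instr)
{
    instr->running = false;
}
```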

>
> A bigger question is why this is elog(DEBUG2) and not elog(ERROR).
> Had it been the latter, we'd have noticed the mistake immediately.
> The current choice might be masking any caller-logic errors that
> exist, too.

Not sure why it's DEBUG2, but if it really is an error then it should be
logged as ERROR I guess.

Tomas




Re: [HACKERS] issues with range types, btree_gist and constraints

2013-02-07 Thread Tomas Vondra
On 7.2.2013 01:10, Tom Lane wrote:
> The attached patch fixes these things, but not the buggy penalty
> function, because we ought to try to verify that these changes are
> enough to prevent creation of an incorrect index.  I can't reproduce any
> misbehavior anymore with this patch applied.  However, I'm troubled by

I can't reproduce any of the issues anymore - I've tested both 9.2 and
9.3 branches (both containing the already committed fixes). And not only
that the issues seem to be gone, but I'm actually getting significantly
better performance. For example this is the old code:

test=# copy test(id) from '/home/tomas/sample-1.csv';
COPY 20001
Time: 17802,500 ms
test=# analyze;
ANALYZE
Time: 122,192 ms
test=# copy test(id) from '/home/tomas/sample-2.csv';
COPY 18590
Time: 81253,104 ms
test=# copy test(id) from '/home/tomas/sample-1.csv';
COPY 7
Time: 11008,737 ms
test=# copy test(id) from '/home/tomas/sample-2.csv';
COPY 1
Time: 21710,315 ms
test=#

and this is the new code:

test=# copy test(id) from '/home/tomas/sample-1.csv';
COPY 20001
Time: 2196,802 ms
test=# analyze;
ANALYZE
Time: 113,598 ms
test=# copy test(id) from '/home/tomas/sample-2.csv';
COPY 18590
Time: 2369,028 ms
test=# copy test(id) from '/home/tomas/sample-1.csv';
COPY 0
Time: 832,910 ms
test=# copy test(id) from '/home/tomas/sample-2.csv';
COPY 0
Time: 778,241 ms

That's more than 10x faster in all cases (and about 30x faster in some
of them).

I've also tested both branches (9.2 and 9.3) with the provided patch,
and I can't reproduce any of the errors, but the performance seems to be
equal to the old code. So it seems that the performance boost is due to
the changes to the penalty function. Nice!

> your report that the behavior is unstable, because I get the same result
> each time if I start from an empty (truncated) table, with or without
> the patch.  You may be seeing some further bug(s).  Could you test this
> fix against your own test cases?

This is what I meant by instability:

test=# truncate test;
TRUNCATE TABLE
test=# copy test(id) from '/home/tomas/sample-1.csv';
COPY 20001
test=# copy test(id) from '/home/tomas/sample-2.csv';
COPY 18590
test=# copy test(id) from '/home/tomas/sample-1.csv';
COPY 1
test=# copy test(id) from '/home/tomas/sample-2.csv';
COPY 0
test=# copy test(id) from '/home/tomas/sample-1.csv';
COPY 0
test=# truncate test;
TRUNCATE TABLE
test=# copy test(id) from '/home/tomas/sample-1.csv';
COPY 20001
test=# copy test(id) from '/home/tomas/sample-2.csv';
COPY 18590
test=# copy test(id) from '/home/tomas/sample-1.csv';
COPY 4
test=# copy test(id) from '/home/tomas/sample-2.csv';
COPY 0
test=# truncate test;
TRUNCATE TABLE
test=# copy test(id) from '/home/tomas/sample-1.csv';
COPY 20001
test=# copy test(id) from '/home/tomas/sample-2.csv';
COPY 18590
test=# copy test(id) from '/home/tomas/sample-1.csv';
COPY 3
test=# copy test(id) from '/home/tomas/sample-2.csv';
COPY 0

Notice the numbers change all the time. But I can't really reproduce
these with the current HEAD (nor with the patch), so I assume these were
caused by switching plans at different times.

Seems fixed to me.


Tomas




Re: [HACKERS] Considering Gerrit for CFs

2013-02-07 Thread Josh Berkus
Folks,

First, thanks for the serious discussion of this.

>> There are obvious tooling gaps (aren't there always?), but I don't
>> really see the model as broken, and I don't think I've been around
>> pgsql-hackers exclusively or extensively enough to have developed
>> Stockholm syndrome.

I don't see the model as broken either.  Just the tooling, which is why
I'm looking at tooling.  As in, I'm looking for better tooling in order
to solve two problems:

1. maximize the efficiency of existing reviewer time

2. make tooling not be an obstacle to getting new reviewers

Of these two, (2) is actually the more critical. We have been losing,
not gaining, active committers and reviewers for the last couple of years.
Clearly "do more of what we've been doing" is a losing strategy.   We
need to be successfully moving people up the contributor chain if we're
ever going to get out of this "not enough reviewers" hole.

I agree that tooling is a minority of this, but tooling is also the
easiest thing to change (compared with project organization), so that's
what I'm tackling first.  Expect a discussion on the people aspects at
the developer meeting.

> Personally, I find the most annoying thing with the current process
> being when reviewers post their reviews as a completely separate
> thread, instead of replying in the original thread. This causes
> context switches when parsing things, because now you have to jump
> back and forth between the CF app and your mail reader. But it's still
> only on the annoyance side, I think the process in general is not
> broken. (That said, I *have* been on the inside a long time, *and* I
> live in Stockholm, so I might well have that syndrome)

So, look at this from the perspective of a casual reviewer, say at a PUG
reviewfest.  Instructions to new reviewer:

1. find the feature you want to review on the CF app.

2. Click the link to the mailing list archives.

3. Click all the way through the list thread to make sure there isn't a
later version of the patch.

4. Download the patch.   Hopefully it's not mangled by the archives
(this has gotten much better than it was last year).

5. Apply the patch to HEAD clone.

6. Do actual reviewing/testing.

7. Write an email review.

8. Send it to pgsql-hackers
8.a. this requires you to be subscribed to pgsql-hackers.

9. wait for the email to show up in the archives.

10. create a review comment in the CF app.
10.a. this requires you to be signed up for a community account
10.b. sign up for one now
10.c. wait for it to be approved

11. link the review comment to the messageID

12. change status of the patch

This is a few too many steps, and certainly appears completely broken to
any newcomer.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] performance bug in instrument.c

2013-02-07 Thread Tom Lane
Tomas Vondra  writes:
> There's this piece of code in InstrStartNode:

> if (instr->need_timer && INSTR_TIME_IS_ZERO(instr->starttime))
> INSTR_TIME_SET_CURRENT(instr->starttime);
> else
> elog(DEBUG2, "InstrStartNode called twice in a row");

> but it should actually be like this

> if (instr->need_timer)
> {
> if (INSTR_TIME_IS_ZERO(instr->starttime))
>   INSTR_TIME_SET_CURRENT(instr->starttime);
> else
> elog(DEBUG2, "InstrStartNode called twice in a row");
> }

Hm.  It's a bit annoying that we can't detect the "called twice"
condition when !need_timer, but I suppose that any such bug would be a
caller logic error that would probably not be sensitive to need_timer,
so it's likely not worth adding overhead to handle that.

A bigger question is why this is elog(DEBUG2) and not elog(ERROR).
Had it been the latter, we'd have noticed the mistake immediately.
The current choice might be masking any caller-logic errors that
exist, too.

regards, tom lane




Re: [HACKERS] sepgsql and materialized views

2013-02-07 Thread Kevin Grittner
Kohei KaiGai  wrote:

> So, I'd like to review two options.
> 1) we uses db_table object class for materialized-views for
> a while, until selinux-side become ready. Probably, v9.3 will
> use db_table class then switched at v9.4.
> 2) we uses db_materialized_view object class from the
> beginning, but its permission checks are ignored because
> installed security policy does not support this class yet.
>
> My preference is 2), even though we cannot apply label
> based permission checks until selinux support it, because
> 1) makes troubles when selinux-side become ready to
> support new db_materialized_view class. Even though
> policy support MV class, working v9.3 will ignore the policy.
>
> Let me ask selinux folks about this topic also.

To make sure I understand, the current patch is consistent with
option 1?  It sounds like I have code from a prior version of the
patch pretty close to what you describe for option 2, so that can
be put back in place if you confirm that as the preferred option.
From what you describe, it sounds like the only thing it doesn't
have is a new hook for REFRESH, but that doesn't sound like it
would take that much to implement.

Thanks for looking at this!

-Kevin




[HACKERS] performance bug in instrument.c

2013-02-07 Thread Tomas Vondra
Hi,

it seems there's an issue in instrument.c that may have a significant
performance impact for some plans when running explain analyze with
"TIMING OFF".

There's this piece of code in InstrStartNode:

if (instr->need_timer && INSTR_TIME_IS_ZERO(instr->starttime))
INSTR_TIME_SET_CURRENT(instr->starttime);
else
elog(DEBUG2, "InstrStartNode called twice in a row");

but it should actually be like this

if (instr->need_timer && INSTR_TIME_IS_ZERO(instr->starttime))
INSTR_TIME_SET_CURRENT(instr->starttime);
else if (instr->need_timer)
elog(DEBUG2, "InstrStartNode called twice in a row");

or maybe

if (instr->need_timer)
{
if (INSTR_TIME_IS_ZERO(instr->starttime))
INSTR_TIME_SET_CURRENT(instr->starttime);
else
elog(DEBUG2, "InstrStartNode called twice in a row");
}

The current code takes the "else" branch every time "TIMING OFF" is
used, and this may lead to serious performance degradation; see for
example this:

  CREATE TABLE x AS SELECT id FROM generate_series(1,15000) s(id);
  ANALYZE x;
  EXPLAIN ANALYZE SELECT a.id FROM x a, x b;

Current code:

   QUERY PLAN
---
 Nested Loop  (cost=0.00..2812971.50 rows=22500 width=4) (actual
time=0.013..29611.859 rows=22500 loops=1)
   ->  Seq Scan on x a  (cost=0.00..217.00 rows=15000 width=4) (actual
time=0.006..2.847 rows=15000 loops=1)
   ->  Materialize  (cost=0.00..292.00 rows=15000 width=0) (actual
time=0.000..0.740 rows=15000 loops=15000)
 ->  Seq Scan on x b  (cost=0.00..217.00 rows=15000 width=0)
(actual time=0.003..1.186 rows=15000 loops=1)
 Total runtime: 36054.079 ms
(5 rows)

and with the fix

   QUERY PLAN
---
 Nested Loop  (cost=0.00..1458333.00 rows=11664 width=4) (actual
time=0.021..13158.399 rows=1 loops=1)
   ->  Seq Scan on x a  (cost=0.00..153.00 rows=10800 width=4) (actual
time=0.014..1.960 rows=1 loops=1)
   ->  Materialize  (cost=0.00..207.00 rows=10800 width=0) (actual
time=0.000..0.499 rows=1 loops=1)
 ->  Seq Scan on x b  (cost=0.00..153.00 rows=10800 width=0)
(actual time=0.003..1.037 rows=1 loops=1)
 Total runtime: 16017.953 ms
(5 rows)

Obviously this is a quite extreme example; most plans won't be hit this
hard.
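The shape of the bug is easy to demonstrate outside the executor. In this stripped-down sketch (elog replaced by a counter; none of this is the real instrument.c code), the buggy form falls into the else branch on every call once need_timer is false, while the fixed form never does:

```c
#include <assert.h>
#include <stdbool.h>

static long elog_calls;

/* Buggy shape: the else belongs to the whole &&-condition, so it also
 * fires whenever need_timer is false. */
static void
buggy_start(bool need_timer, bool start_is_zero)
{
    if (need_timer && start_is_zero)
        ;                       /* INSTR_TIME_SET_CURRENT(...) */
    else
        elog_calls++;           /* elog(DEBUG2, "...called twice...") */
}

/* Fixed shape: the double-call check is only made when the timer is in use. */
static void
fixed_start(bool need_timer, bool start_is_zero)
{
    if (need_timer)
    {
        if (start_is_zero)
            ;                   /* INSTR_TIME_SET_CURRENT(...) */
        else
            elog_calls++;
    }
}

/* Count how often the elog path runs over ncalls normal starts
 * (starttime is zero each time, i.e. no actual double call). */
static long
count_elogs(void (*fn)(bool, bool), bool need_timer, long ncalls)
{
    elog_calls = 0;
    for (long i = 0; i < ncalls; i++)
        fn(need_timer, true);
    return elog_calls;
}
```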

This was discovered by Pavel Stehule when backporting my "TIMING OFF"
patch (which introduced the bug).

regards
Tomas




Re: [HACKERS] Vacuum/visibility is busted

2013-02-07 Thread Alvaro Herrera
Jeff Janes escribió:
> On Thu, Feb 7, 2013 at 10:09 AM, Pavan Deolasee
>  wrote:
> >
> > Right. I don't have the database handy at this moment, but earlier in
> > the day I ran some queries against it and found that most of the
> > duplicates which are not accessible via indexes have xmin very close
> > to 2100345903. In fact, many of them are from a consecutive range.
> 
> Does anyone have suggestions on how to hack the system to make it
> fast-forward the current transaction id? It would certainly make
> testing this kind of thing faster if I could make transaction id
> increment by 100 each time a new one is generated.  Then wrap-around
> could be approached in minutes rather than hours.

I can reproduce the problem in a few minutes with the attached.

-- 
Álvaro Herrera      http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 64537d0..9513faf 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -159,16 +159,25 @@ GetNewTransactionId(bool isSubXact)
 	 *
 	 * Extend pg_subtrans too.
 	 */
-	ExtendCLOG(xid);
-	ExtendSUBTRANS(xid);
+	{
+		int		incr = pg_lrand48() & 0x7FF;
 
-	/*
-	 * Now advance the nextXid counter.  This must not happen until after we
-	 * have successfully completed ExtendCLOG() --- if that routine fails, we
-	 * want the next incoming transaction to try it again.	We cannot assign
-	 * more XIDs until there is CLOG space for them.
-	 */
-	TransactionIdAdvance(ShmemVariableCache->nextXid);
+		for (; incr > 0; incr--)
+		{
+			xid = ShmemVariableCache->nextXid;
+
+			ExtendCLOG(xid);
+			ExtendSUBTRANS(xid);
+
+			/*
+			 * Now advance the nextXid counter.  This must not happen until after we
+			 * have successfully completed ExtendCLOG() --- if that routine fails, we
+			 * want the next incoming transaction to try it again.	We cannot assign
+			 * more XIDs until there is CLOG space for them.
+			 */
+			TransactionIdAdvance(ShmemVariableCache->nextXid);
+		}
+	}
 
 	/*
 	 * We must store the new XID into the shared ProcArray before releasing



Re: [HACKERS] Vacuum/visibility is busted

2013-02-07 Thread Simon Riggs
On 7 February 2013 19:15, Jeff Janes  wrote:
> On Thu, Feb 7, 2013 at 10:09 AM, Pavan Deolasee
>  wrote:
>>
>> Right. I don't have the database handy at this moment, but earlier in
>> the day I ran some queries against it and found that most of the
>> duplicates which are not accessible via indexes have xmin very close
>> to 2100345903. In fact, many of them are from a consecutive range.
>
> Does anyone have suggestions on how to hack the system to make it
> fast-forward the current transaction id? It would certainly make
> testing this kind of thing faster if I could make transaction id
> increment by 100 each time a new one is generated.  Then wrap-around
> could be approached in minutes rather than hours.

This is a variation of one of the regression tests...

CREATE OR REPLACE FUNCTION burn_xids (n integer)
RETURNS void
LANGUAGE plpgsql
AS $$
BEGIN
IF n <= 0 THEN RETURN; END IF;
PERFORM burn_xids(n - 1);
RETURN;
EXCEPTION WHEN raise_exception THEN NULL; END;
$$;

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Vacuum/visibility is busted

2013-02-07 Thread Tom Lane
Jeff Janes  writes:
> Does anyone have suggestions on how to hack the system to make it
> fast-forward the current transaction id?

What I've generally done is to stop the server then use pg_resetxlog
to put the XID counter where I want it.  I believe you'll need to
manually create a pg_clog file corresponding to the new XID counter
value, and maybe pg_subtrans too.  (Someday pg_resetxlog oughta be
improved to create those files itself, probably.)

regards, tom lane




Re: [HACKERS] Vacuum/visibility is busted

2013-02-07 Thread Andres Freund
On 2013-02-07 11:15:46 -0800, Jeff Janes wrote:
> On Thu, Feb 7, 2013 at 10:09 AM, Pavan Deolasee
>  wrote:
> >
> > Right. I don't have the database handy at this moment, but earlier in
> > the day I ran some queries against it and found that most of the
> > duplicates which are not accessible via indexes have xmin very close
> > to 2100345903. In fact, many of them are from a consecutive range.
>
> Does anyone have suggestions on how to hack the system to make it
> fast-forward the current transaction id? It would certainly make
> testing this kind of thing faster if I could make transaction id
> increment by 100 each time a new one is generated.  Then wrap-around
> could be approached in minutes rather than hours.

I had various plpgsql functions to do that, but those still took quite
some time. As I needed it before I just spent some minutes hacking up a
contrib module to do the job.

I doubt it really makes sense as a contrib module on its own,
though?

postgres=# select * FROM burnxids(50);select * FROM
burnxids(50);
 burnxids
--
  5380767
(1 row)

Time: 859.807 ms
 burnxids
--
  5880767
(1 row)

Time: 717.700 ms

It doesn't really work in a nice way:
if (GetTopTransactionIdIfAny() != InvalidTransactionId)
elog(ERROR, "can't burn xids in a transaction with xid");

for (i = 0; i < nxids; i++)
{
last = GetNewTransactionId(false);
}

/* don't keep xid as assigned */
MyPgXact->xid = InvalidTransactionId;

but it seems to work ;)

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
diff --git a/contrib/Makefile b/contrib/Makefile
index 36e6bfe..aca12a7 100644
--- a/contrib/Makefile
+++ b/contrib/Makefile
@@ -54,7 +54,8 @@ SUBDIRS = \
 		tsearch2	\
 		unaccent	\
 		vacuumlo	\
-		worker_spi
+		worker_spi	\
+		xidfuncs
 
 ifeq ($(with_openssl),yes)
 SUBDIRS += sslinfo
diff --git a/contrib/xidfuncs/Makefile b/contrib/xidfuncs/Makefile
new file mode 100644
index 000..6977a3b
--- /dev/null
+++ b/contrib/xidfuncs/Makefile
@@ -0,0 +1,18 @@
+# contrib/xidfuncs/Makefile
+
+MODULE_big	= xidfuncs
+OBJS		= xidfuncs.o
+
+EXTENSION = xidfuncs
+DATA = xidfuncs--1.0.sql
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = contrib/xidfuncs
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/contrib/xidfuncs/xidfuncs.c b/contrib/xidfuncs/xidfuncs.c
new file mode 100644
index 000..e11b4fb
--- /dev/null
+++ b/contrib/xidfuncs/xidfuncs.c
@@ -0,0 +1,50 @@
+/*-
+ * contrib/xidfuncs/xidfuncs.c
+ *
+ * Copyright (c) 2013, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/xidfuncs/xidfuncs.c
+ *
+ *-
+ */
+
+#include "postgres.h"
+
+#include "access/transam.h"
+#include "access/xact.h"
+#include "funcapi.h"
+#include "storage/proc.h"
+
+PG_MODULE_MAGIC;
+
+PG_FUNCTION_INFO_V1(burnxids);
+
+extern Datum burnxids(PG_FUNCTION_ARGS);
+
+Datum
+burnxids(PG_FUNCTION_ARGS)
+{
+	int nxids = PG_GETARG_INT32(0);
+	int i;
+	TransactionId last;
+
+	if (nxids < 1)
+		elog(ERROR, "can't burn a non-positive number of xids");
+
+	if (nxids > 50)
+		elog(ERROR, "burning too many xids is dangerous");
+
+	if (GetTopTransactionIdIfAny() != InvalidTransactionId)
+		elog(ERROR, "can't burn xids in a transaction with xid");
+
+	for (i = 0; i < nxids; i++)
+	{
+		last = GetNewTransactionId(false);
+	}
+
+	/* don't keep xid as assigned */
+	MyPgXact->xid = InvalidTransactionId;
+
+	return Int64GetDatum((int64)last);
+}
diff --git a/contrib/xidfuncs/xidfuncs.control b/contrib/xidfuncs/xidfuncs.control
new file mode 100644
index 000..bca7194
--- /dev/null
+++ b/contrib/xidfuncs/xidfuncs.control
@@ -0,0 +1,5 @@
+# xidfuncs extension
+comment = 'xid debugging functions'
+default_version = '1.0'
+module_pathname = '$libdir/xidfuncs'
+relocatable = true



Re: [HACKERS] Vacuum/visibility is busted

2013-02-07 Thread Jeff Janes
On Thu, Feb 7, 2013 at 10:09 AM, Pavan Deolasee
 wrote:
>
> Right. I don't have the database handy at this moment, but earlier in
> the day I ran some queries against it and found that most of the
> duplicates which are not accessible via indexes have xmin very close
> to 2100345903. In fact, many of them are from a consecutive range.

Does anyone have suggestions on how to hack the system to make it
fast-forward the current transaction id? It would certainly make
testing this kind of thing faster if I could make transaction id
increment by 100 each time a new one is generated.  Then wrap-around
could be approached in minutes rather than hours.

Thanks,

Jeff




Re: [HACKERS] [COMMITTERS] pgsql: Fast promote mode skips checkpoint at end of recovery.

2013-02-07 Thread Simon Riggs
On 6 February 2013 17:43, Simon Riggs  wrote:

>> Here's what I think should be done:
>>
>> 1. Remove the check that prev checkpoint record exists.
>
> Agreed

Done

>> 2. Always do fast promotion if in standby mode. Remove the pg_ctl option.
>
> Disagreed, other viewpoints welcome.

Waiting for further comments.

>> 3. Improve docs.
>
> Agreed

Pending.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Identity projection

2013-02-07 Thread Tom Lane
Amit Kapila  writes:
> There can be 2 ways to remove result node
> a. Remove the Result plan node in case it is not required - This is same as
> currently it does for SubqueryScan. 
>We can check if the result plan is trivial (with logic similar to
> trivial_subqueryscan()), then remove result node.

> b. to avoid adding it to Plan node in case it is not required - 
>For this, in grouping_planner() currently it checks if the plan is
> projection capable (is_projection_capable_plan()),
>we can enhance this check such that it also check projection is really
> required depending if the targetlist contains
>any non Var element.

> Please suggest which way is more preferable and if one of above 2 seems to
> be okay,

Adding a result node only to remove it again seems a bit expensive.
It'd be better not to generate the node in the first place.  (There's
a technical reason to generate a temporary SubqueryScan, which is to
keep Var numbering in the subplan separate from that in the upper plan;
but AFAICS that doesn't apply to Result.)

An advantage of removing useless Results at setrefs.c time is that we
can be sure it will be applied to all Result nodes.  However, we might
be able to do it the other way with only one point-of-change if we hack
make_result itself to check whether the proposed tlist is an identity.

Note that "contains non Var element" is the wrong test for this anyway
--- the question is does the tlist have the same expressions in the same
order as the tlist of the Result's child node.

regards, tom lane




Re: [HACKERS] Vacuum/visibility is busted

2013-02-07 Thread Pavan Deolasee
On Thu, Feb 7, 2013 at 10:48 PM, Jeff Janes  wrote:
> On Thu, Feb 7, 2013 at 1:44 AM, Pavan Deolasee  
> wrote:

>>
>> jjanes=# select *, xmin, xmax, ctid from foo where index IN (select
>> index from foo group by index having count(*) > 1 ORDER by index)
>> ORDER by index LIMIT 3;
>>  index | count |xmin| xmax |   ctid
>> ---+---++--+---
>>219 |   353 | 2100345903 |0 | (150,98)
>>219 |   354 | 2100346051 |0 | (150,101)
>>219 |   464 | 2101601086 |0 | (150,126)
>> (3 rows)
>
> The one where count=464 should be the correct one to be visible, and
> the other two are old tuples that were updated away.  (The test driver
> increases the count column monotonically for any given value of the
> index column.)
>

Right. I don't have the database handy at this moment, but earlier in
the day I ran some queries against it and found that most of the
duplicates which are not accessible via indexes have xmin very close
to 2100345903. In fact, many of them are from a consecutive range.

Thanks,
Pavan
-- 
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee




Re: [HACKERS] Vacuum/visibility is busted

2013-02-07 Thread Alvaro Herrera
Jeff Janes escribió:
> On Thu, Feb 7, 2013 at 12:55 AM, Pavan Deolasee
>  wrote:

> I don't see the assertion failure myself.  If I do REINDEX INDEX it
> gives a duplicate key violation, and if I do REINDEX TABLE or REINDEX
> DATABASE I get claimed success.
> 
> This is using either current head (ab0f7b6) or 168d315 as binaries to
> start up the cluster.

Note that the visibility tests are correct: those tuples should all be
visible.  The problem is not the binary that's running the cluster now;
the problem is the binary that created the cluster in the first place
(or rather the binary that was running when tuple freezing took place).
That is, assuming my theory about tuple freezing being involved is correct.

-- 
Álvaro Herrera   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Vacuum/visibility is busted

2013-02-07 Thread Jeff Janes
On Thu, Feb 7, 2013 at 9:32 AM, Jeff Janes  wrote:
> On Thu, Feb 7, 2013 at 12:55 AM, Pavan Deolasee
>  wrote:
>>
>> Index scans do not return any duplicates and you need to force a seq
>> scan to see them. Assuming that the page level VM bit might be
>> corrupted, I tried to REINDEX the table to see if it complains of
>> unique key violations, but that crashes the server with
>>
>> TRAP: FailedAssertion("!(((bool) ((root_offsets[offnum - 1] !=
>> ((OffsetNumber) 0)) && (root_offsets[offnum - 1] <= ((OffsetNumber)
>> (8192 / sizeof(ItemIdData)))", File: "index.c", Line: 2482)
>
> I don't see the assertion failure myself.  If I do REINDEX INDEX it
> gives a duplicate key violation, and if I do REINDEX TABLE or REINDEX
> DATABASE I get claimed success.

Ah, ignore that claim.  Of course one does not get assertion failures
if one neglected to turn them on!

Cheers,

Jeff




Re: [HACKERS] split rm_name and rm_desc out of rmgr.c

2013-02-07 Thread Tom Lane
Peter Eisentraut  writes:
> On 2/5/13 3:47 PM, Alvaro Herrera wrote:
>> Patch attached.

> This has broken cpluspluscheck:

> ./src/include/access/rmgrlist.h:28:8: error: expected constructor, 
> destructor, or type conversion before '(' token

cpluspluscheck has an explicit exclusion for kwlist.h, looks like it
needs one for rmgrlist.h as well, since neither of them are meant to
compile standalone.

regards, tom lane




Re: [HACKERS] Vacuum/visibility is busted

2013-02-07 Thread Jeff Janes
On Thu, Feb 7, 2013 at 12:55 AM, Pavan Deolasee
 wrote:
> On Thu, Feb 7, 2013 at 11:09 AM, Jeff Janes  wrote:
>> While stress testing Pavan's 2nd pass vacuum visibility patch, I realized
>> that vacuum/visibility was busted.  But it wasn't his patch that busted it.
>> As far as I can tell, the bad commit was in the range
>> 692079e5dcb331..168d3157032879
>>
>> Since a run takes 12 to 24 hours, it will take a while to refine that
>> interval.
>>
>> I was testing using the framework explained here:
>>
>> http://www.postgresql.org/message-id/CAMkU=1xoa6fdyoj_4fmlqpiczr1v9gp7clnxjdhu+iggqb6...@mail.gmail.com
>>
>> Except that I increased  JJ_torn_page to 8000, so that autovacuum has a
>> chance to run to completion before each crash; and I turned off archive_mode
>> as it was not relevant and caused annoying noise.  As far as I know,
>> crashing is entirely irrelevant to the current problem, but I just used and
>> adapted the framework I had at hand.
>>
>> A tarball  of the data directory is available below, for those who would
>> like to do a forensic inspection.  The table jjanes.public.foo is clearly in
>> violation of its unique index.
>
> The xmins of all the duplicate tuples look dangerously close to 2^31.
> I wonder if XID wrap around has anything to do with it.
>
> Index scans do not return any duplicates and you need to force a seq
> scan to see them. Assuming that the page level VM bit might be
> corrupted, I tried to REINDEX the table to see if it complains of
> unique key violations, but that crashes the server with
>
> TRAP: FailedAssertion("!(((bool) ((root_offsets[offnum - 1] !=
> ((OffsetNumber) 0)) && (root_offsets[offnum - 1] <= ((OffsetNumber)
> (8192 / sizeof(ItemIdData)))", File: "index.c", Line: 2482)

I don't see the assertion failure myself.  If I do REINDEX INDEX it
gives a duplicate key violation, and if I do REINDEX TABLE or REINDEX
DATABASE I get claimed success.

This is using either current head (ab0f7b6) or 168d315 as binaries to
start up the cluster.

Cheers,

Jeff




Re: [HACKERS] proposal: ANSI SQL 2011 syntax for named parameters

2013-02-07 Thread Dimitri Fontaine
Tom Lane  writes:
> If you're suggesting that we should back-patch hstore 1.1 into 9.1,
> there might not be a technical reason why we couldn't do it, but there
> are certainly project-policy reasons.  Removing operators, or indeed
> changing any SQL interface at all, is exactly the kind of change we do
> not make in back branches.

For core itself, it makes perfect sense. For extensions, I wonder about
the upgrade path, and if we shouldn't leave some level of choice to the
user. Shipping the ability to upgrade to hstore 1.1 into back branches
is not the same thing as upgrading our users.

>> To make that easier to maintain, there's a patch in the queue
>> implementing default_major_version so that we can ship hstore--1.0.sql
>> and hstore--1.0--1.1.sql and still have that command just work:
>>   CREATE EXTENSION hstore VERSION '1.1';
>
> If the argument for this patch is only to support doing something like
> the above, I'd vote for rejecting it entirely.

This patch allows us to ship bug and security fixes in back branches
without having to maintain both the 1.1 and the 1.2 full scripts, as
PostgreSQL will now be able to install 1.1 and upgrade to 1.2 at CREATE
EXTENSION time.

So no, this patch is not made for something like forcing incompatible
changes down the throat of our users, it's made to make the life of
extension maintainers (core included) easier.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] Vacuum/visibility is busted

2013-02-07 Thread Jeff Janes
On Thu, Feb 7, 2013 at 1:44 AM, Pavan Deolasee  wrote:
> On Thu, Feb 7, 2013 at 2:25 PM, Pavan Deolasee  
> wrote:
>
>>
>> Will look more into it, but thought this might be useful for others to
>> spot the problem.
>>
>
> And here is some more forensic info about one of the pages having
> duplicate tuples.
>
> jjanes=# select *, xmin, xmax, ctid from foo where index IN (select
> index from foo group by index having count(*) > 1 ORDER by index)
> ORDER by index LIMIT 3;
>  index | count |xmin| xmax |   ctid
> ---+---++--+---
>219 |   353 | 2100345903 |0 | (150,98)
>219 |   354 | 2100346051 |0 | (150,101)
>219 |   464 | 2101601086 |0 | (150,126)
> (3 rows)

The one where count=464 should be the correct one to be visible, and
the other two are old tuples that were updated away.  (The test driver
increases the count column monotonically for any given value of the
index column.)

Cheers,

Jeff




Re: [HACKERS] split rm_name and rm_desc out of rmgr.c

2013-02-07 Thread Peter Eisentraut
On 2/5/13 3:47 PM, Alvaro Herrera wrote:
> Tom Lane wrote:
>> Alvaro Herrera  writes:
> 
>>> The
>>> approach in the second patch is to turn these into "extern const RmgrId"
>>> instead, and use a second inclusion of rmgrlist.h in rmgr.c that assigns
>>> them the values as consts.
>>
>> ... but I don't especially like that implementation, as it will result
>> in nonzero code bloat and runtime cost due to replacing all those
>> constants with global-variable references.  Couldn't you instead set it
>> up as an enum definition?
> 
> That seems to work.  I would like to have some way of specifying that
> the enum members should be of type RmgrId, but I don't think there's any
> way to do that.
> 
> Patch attached.

This has broken cpluspluscheck:

./src/include/access/rmgrlist.h:28:8: error: expected constructor, destructor, 
or type conversion before '(' token





Re: [HACKERS] proposal: ANSI SQL 2011 syntax for named parameters

2013-02-07 Thread Tom Lane
Dimitri Fontaine  writes:
> Robert Haas  writes:
>> I don't know what to add to that.

> There's no technical reason that I'm aware of for hstore 1.1 not to
> support all our maintained releases at the same time. That's exactly how
> we do it with non-core extensions, by the way.

If you're suggesting that we should back-patch hstore 1.1 into 9.1,
there might not be a technical reason why we couldn't do it, but there
are certainly project-policy reasons.  Removing operators, or indeed
changing any SQL interface at all, is exactly the kind of change we do
not make in back branches.

> To make that easier to maintain, there's a patch in the queue
> implementing default_major_version so that we can ship hstore--1.0.sql
> and hstore--1.0--1.1.sql and still have that command just work:
>   CREATE EXTENSION hstore VERSION '1.1';

If the argument for this patch is only to support doing something like
the above, I'd vote for rejecting it entirely.

regards, tom lane




Re: [HACKERS] proposal: ANSI SQL 2011 syntax for named parameters

2013-02-07 Thread Dimitri Fontaine
Robert Haas  writes:
> $ Right now there is one and only one release in
> $ the field that contains hstore 1.1.  If we go ahead and prohibit => as
> $ an operator name now, we're going to require everyone who is on 9.1
> $ and uses hstore and wants to get to 9.3 to either (a) first upgrade to
> $ 9.2, then update hstore, then upgrade to 9.3; or (b) dig the
> $ hstore-1.1 update out of a future release, apply it to an earlier
> $ release on which it did not ship, and then upgrade.
>
> I don't know what to add to that.

There's no technical reason that I'm aware of for hstore 1.1 not to
support all our maintained releases at the same time. That's exactly how
we do it with non-core extensions, by the way.

To make that easier to maintain, there's a patch in the queue
implementing default_major_version so that we can ship hstore--1.0.sql
and hstore--1.0--1.1.sql and still have that command just work:

  CREATE EXTENSION hstore VERSION '1.1';

That support is also going to greatly ease the dump and restore of
extensions installed from a template, so much so that I would really like to
get some review of that before sending the full patch.

We've been talking about "in-core extensions" as opposed to contribs for
a while now; I think this is another angle to see things through. We
could actually maintain proper extensions the proper way.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




[HACKERS] Re: Record previous TLI in end-of-recovery record (was Re: [COMMITTERS] pgsql: Fast promote mode skips checkpoint at end of recovery.)

2013-02-07 Thread Simon Riggs
On 7 February 2013 16:07, Heikki Linnakangas  wrote:

> It just occurred to me that it would be really nice if the end-of-recovery
> record, and the timeline-switching shutdown checkpoint record too for that
> matter, would include the previous timeline's ID that we forked from, in
> addition to the new TLI. Although it's not required for anything at the
> moment, it would be useful debugging information. It would allow
> reconstructing timeline history files from the WAL; that might come handy.
>
> Barring objections, I'll add that.

Good idea, please do.

That means a shutdown checkpoint becomes its own record type, but
my understanding of our other conversations was that you want to never
use shutdown checkpoints for end of recovery again, so that seems
unnecessary. Sorry to mix things up.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] Record previous TLI in end-of-recovery record (was Re: [COMMITTERS] pgsql: Fast promote mode skips checkpoint at end of recovery.)

2013-02-07 Thread Heikki Linnakangas

(this is unrelated to the other discussion about this patch)

On 29.01.2013 02:07, Simon Riggs wrote:

Fast promote mode skips checkpoint at end of recovery.
pg_ctl promote -m fast will skip the checkpoint at end of recovery so that we
can achieve very fast failover when the apply delay is low. Write new WAL record
XLOG_END_OF_RECOVERY to allow us to switch timeline correctly for downstream log
readers. If we skip synchronous end of recovery checkpoint we request a normal
spread checkpoint so that the window of re-recovery is low.


It just occurred to me that it would be really nice if the 
end-of-recovery record, and the timeline-switching shutdown checkpoint 
record too for that matter, would include the previous timeline's ID 
that we forked from, in addition to the new TLI. Although it's not 
required for anything at the moment, it would be useful debugging 
information. It would allow reconstructing timeline history files from 
the WAL; that might come handy.


Barring objections, I'll add that.

- Heikki




Re: [HACKERS] [COMMITTERS] pgsql: Fast promote mode skips checkpoint at end of recovery.

2013-02-07 Thread Kevin Grittner
Heikki Linnakangas  wrote:
> On 07.02.2013 10:41, Simon Riggs wrote:


>>  Why would someone want to turn off safe-promote mode, assuming it was
>>  fast enough?

> Because faster is nicer, even if the slow mode would be "fast enough".

http://www.youtube.com/watch?v=H3R-rtWPyJY

-Kevin





Re: [HACKERS] Vacuum/visibility is busted

2013-02-07 Thread Alvaro Herrera
Alvaro Herrera escribió:

> Hm, if the foreign key patch is to blame, this sounds like these tuples
> had a different set of XMAX hint bits and a different Xmax, and they
> were clobbered by something like vacuum or page pruning.

Hm, I think heap_freeze_tuple is busted, yes.

-- 
Álvaro Herrera   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-02-07 Thread MauMau

Hello, Tom-san, folks,

From: "Tom Lane" 

I think if we want to make it bulletproof we'd have to do what the
OP suggested and switch to SIGKILL.  I'm not enamored of that for the
reasons I mentioned --- but one idea that might dodge the disadvantages
is to have the postmaster wait a few seconds and then SIGKILL any
backends that hadn't exited.

I think if we want something that's actually trustworthy, the
wait-and-then-kill logic has to be in the postmaster.



I'm sorry to be late in submitting a patch.  I made a fix according to the above 
comment.  Unfortunately, it was not in time for the 9.1.8 release...  I hope 
this patch will be included in 9.1.9.


Could you review the patch?  The summary of the change is:
1. postmaster waits for children to terminate when it gets an immediate 
shutdown request, instead of exiting.


2. postmaster sends SIGKILL to remaining children if all of the child 
processes do not terminate within 10 seconds since the start of immediate 
shutdown or FatalError condition.


3. On Windows, kill(SIGKILL) calls TerminateProcess().

4. Documentation change to describe the behavior of immediate shutdown.


Regards
MauMau


reliable_immediate_shutdown.patch
Description: Binary data



Re: [HACKERS] Vacuum/visibility is busted

2013-02-07 Thread Alvaro Herrera
Pavan Deolasee escribió:
> On Thu, Feb 7, 2013 at 2:25 PM, Pavan Deolasee  
> wrote:
> 
> >
> > Will look more into it, but thought this might be useful for others to
> > spot the problem.
> >
> 
> And here is some more forensic info about one of the pages having
> duplicate tuples.
> 
> jjanes=# select *, xmin, xmax, ctid from foo where index IN (select
> index from foo group by index having count(*) > 1 ORDER by index)
> ORDER by index LIMIT 3;
>  index | count |xmin| xmax |   ctid
> ---+---++--+---
>219 |   353 | 2100345903 |0 | (150,98)
>219 |   354 | 2100346051 |0 | (150,101)
>219 |   464 | 2101601086 |0 | (150,126)
> (3 rows)

Hm, if the foreign key patch is to blame, this sounds like these tuples
had a different set of XMAX hint bits and a different Xmax, and they
were clobbered by something like vacuum or page pruning.

-- 
Álvaro Herrera   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] Re[2]: [HACKERS] standby, pg_basebackup and last xlog file

2013-02-07 Thread Миша Тюрин


Hello all and Heikki personally

Thank you for your answer

I have some new points:


Monday, 21 January 2013, 10:08 +02:00 from Heikki Linnakangas:
>On 21.01.2013 09:14, Миша Тюрин wrote:
>>Is there any reason why pg_basebackup has limitation in an online backup 
>> from the standby: "The backup history file is not created in the database 
>> cluster backed up." ?
>
>WAL archiving isn't active in a standby, so even if it created a backup 
>history file, it wouldn't go anywhere. Also, the way the backup history 
>files are named, if you take a backup on the master at the same time (or 
>another backup at the same time in the standby), you would end up with 
>multiple backup history files with the same name.
>
>>So i can't get last xlog file needed to restore :(
>
>Good point. That was an oversight in the patch that allowed base backups 
>from a standby.
>
>>Also maybe i can use something like ( pg_last_xlog_replay_location() + 1 
>> ) after pg_basebackup finished.
>
>Yeah, that should work.
>
>>Does anybody know true way to getting last xlog file in case of applying 
>> pg_basebackup to standby?
>>How pg_basebackup gets last xlog file in case of standby and -x option?
>
>The server returns the begin and end WAL locations to pg_basebackup, 
>pg_basebackup just doesn't normally print them out to the user. In 
>verbose mode, it does actually print them out, but only with -x, so that 
>doesn't help you either. If you can compile from source, you could 
>modify pg_basebackup.c to print the locations without -x and --verbose, 
>search lines that print out "transaction log start point / end position".

1) We can get the last xlog file by using the control data's "Minimum 
recovery ending location".


>
>
>As a workaround, without modifying the source, you can do this after 
>pg_basebackup has finished, to get the first WAL file you need:
>
>$ pg_controldata backupdir | grep "REDO WAL file"
>Latest checkpoint's REDO WAL file:00030009

And I would like to correct your suggestion about the first wal file:
2.1)  We should use backup_label to determine the first needed wal file.
2.2)  We must not use the redo pointer from the checkpoint, because there 
might be more than one checkpoint during the base backup.


>
>And as you suggested, pg_last_xlog_replay_location() for the last WAL 
>file you need.
>
>- Heikki
>
>

- Misha



Re: [HACKERS] Identity projection

2013-02-07 Thread Amit Kapila
On Friday, December 14, 2012 5:11 PM Heikki Linnakangas wrote:
> On 12.11.2012 12:07, Kyotaro HORIGUCHI wrote:
> > Hello, This is new version of identity projection patch.
> >
> > Reverted projectionInfo and ExecBuildProjectionInfo. Identity
> > projection is recognized directly in ExecGroup, ExecResult, and
> > ExecWindowAgg. nodeAgg is reverted because I couldn't make it
> > sane..
> >
> > The following is the result of performance test posted before in
> > order to show the source of the gain.
> 
> Hmm, this reminds me of the discussion on removing useless Limit nodes:
> http://archives.postgresql.org/pgsql-performance/2012-12/msg00127.php.
> 
> The optimization on Group, WindowAgg and Agg nodes doesn't seem that
> important, the cost of doing the aggregation/grouping is likely
> overwhelming the projection cost, and usually you do projection in
> grouping/aggregation anyway. But makes sense for Result.
> 
> For Result, I think you should aim to remove the useless Result node
> from the plan altogether. 

I was reviewing this patch and found that it can be moved forward as follows:

There can be 2 ways to remove the result node:
a. Remove the Result plan node when it is not required - this is the same as
what is currently done for SubqueryScan. 
   We can check whether the result plan is trivial (with logic similar to
trivial_subqueryscan()) and, if so, remove the result node.

b. Avoid adding it to the plan in the first place when it is not required - 
   For this, grouping_planner() currently checks whether the plan is
projection capable (is_projection_capable_plan());
   we can enhance this check so that it also checks whether projection is
really required, depending on whether the targetlist contains
   any non-Var element.

Please suggest which way is preferable; if one of the above 2 seems
okay, I can update the patch on behalf of Kyotaro-san in case he doesn't have
time currently.

> And do the same for useless Limit nodes.
Is it okay, if this can be done as part of separate patch?

With Regards,
Amit Kapila.






[HACKERS] A question about the psql \copy command

2013-02-07 Thread Etsuro Fujita
While working on the patch [1], I came across a question about the psql \copy
command.  We permit 1) but not 2):
1) \copy foo from stdin ;
2) \copy foo from stdin;
Is this intentional?  I think it would be better to allow 2) as well.  Attached
is a patch.

Thanks,

Best regards,
Etsuro Fujita

[1]
http://www.postgresql.org/message-id/002e01cdff64$a53663b0$efa32b10$@kapila@huawei.com



psql_copy.patch
Description: Binary data



Re: [HACKERS] LDAP: bugfix and deprecated OpenLDAP API

2013-02-07 Thread Robert Haas
On Tue, Feb 5, 2013 at 4:39 AM, Albe Laurenz  wrote:
> I guess it's too late for something like that to go into 9.3.
> Should I add it to the next commitfest?

Bug fixes can go in pretty much whenever, but adding it to the next
CommitFest is a good way of backstopping it against the possibility
that it might be forgotten.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] proposal: ANSI SQL 2011 syntax for named parameters

2013-02-07 Thread Robert Haas
On Thu, Feb 7, 2013 at 6:42 AM, Simon Riggs  wrote:
> IMO the way to resolve that conflict is with a behaviour parameter to
> allow people to choose, rather than be forced to wait a year because
> some people still run an old version of an add-on package. A good way
> to do that would be to have a sql_standard = postgres | 2011 etc so we
> can tick the box in having a sql standard flagger as well.

The undesirability of syntax-altering GUCs has been discussed here on
many occasions.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [HACKERS] [COMMITTERS] pgsql: Fast promote mode skips checkpoint at end of recovery.

2013-02-07 Thread Robert Haas
On Thu, Feb 7, 2013 at 4:04 AM, Heikki Linnakangas
 wrote:
> It makes me uncomfortable that we're adding switches to pg_ctl promote just
> because we're worried there might be bugs in our code. If we don't trust the
> code as it is, it needs more testing. We can analyze the code more
> thoroughly, to make an educated guess on what's likely to happen if it's
> broken, and consider adding some sanity checks etc. to make the consequences
> less severe. We should not put the burden on our users to decide if the code
> is trustworthy enough to use.

+1

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [HACKERS] proposal: ANSI SQL 2011 syntax for named parameters

2013-02-07 Thread Simon Riggs
On 6 February 2013 20:31, Robert Haas  wrote:
> On Wed, Feb 6, 2013 at 1:06 PM, Simon Riggs  wrote:
>> On 6 February 2013 17:43, Robert Haas  wrote:
>>> On Mon, Feb 4, 2013 at 3:32 PM, Simon Riggs  wrote:
 On 4 February 2013 19:53, Robert Haas  wrote:
> This seems pretty close to an accusation of bad faith, which I don't
> believe to be present.

 Robert, this is not an accusation of bad faith, just an observation
 that we can move forwards more quickly.
>>>
>>> It's your opinion, to which you are certainly entitled, but it is not
>>> an observation of an objective fact.
>>
>> And what? You expressed an opinion, as did I.
>>
>> I repeat: I don't see why waiting a year changes anything here. Can
>> you please explain why the situation is improved by waiting a year?
>
> What was unclear or incomplete about the last two times I explained
> it?  Here's what I wrote the first time:
>
> $ Keep in mind that, as recently as PostgreSQL 9.1, we shipped hstore
> $ with a =>(text, text) operator.  That operator was deprecated in 9.0,
> $ but it wasn't actually removed until PostgreSQL 9.2.  Whenever we do
> $ this, it's going to break things for anyone who hasn't yet upgraded
> $ from hstore v1.0 to hstore v1.1.  So I would prefer to wait one more
> $ release.  That way, anyone who does an upgrade, say, every other major
> $ release cycle should have a reasonably clean upgrade path.
>
> And here's what I wrote the second time:
>
> $ Right now there is one and only one release in
> $ the field that contains hstore 1.1.  If we go ahead and prohibit => as
> $ an operator name now, we're going to require everyone who is on 9.1
> $ and uses hstore and wants to get to 9.3 to either (a) first upgrade to
> $ 9.2, then update hstore, then upgrade to 9.3; or (b) dig the
> $ hstore-1.1 update out of a future release, apply it to an earlier
> $ release on which it did not ship, and then upgrade.
>
> I don't know what to add to that.

I don't see a problem with requiring that, but there are other ways also.

hstore, as well as other code, might contain a definition of the =>
operator. So the hstore situation isn't that relevant in itself.

There is potentially code out there that currently runs on PostgreSQL
and uses =>. There is also potentially code out there that could run
on Postgres if we allowed the standard => syntax. And there is a
conflict in that we keep encouraging the development of non-standard
code because we don't support the standard yet.

IMO the way to resolve that conflict is with a behaviour parameter to
allow people to choose, rather than be forced to wait a year because
some people still run an old version of an add-on package. A good way
to do that would be to have a sql_standard = postgres | 2011 etc so we
can tick the box in having a sql standard flagger as well.

I believe the same issue exists with the -> operator, which is also
part of the SQL standard on "reference types".

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Considering Gerrit for CFs

2013-02-07 Thread Magnus Hagander
On Thu, Feb 7, 2013 at 8:29 AM, Brendan Jurd  wrote:
> On 7 February 2013 08:07, Josh Berkus  wrote:
>> The existing Gerrit community would be keen to have the PostgreSQL
>> project as a major user, though, and would theoretically help with
>> modification needs.  Current major users are OpenStack, Mediawiki,
>> LibreOffice and QT.
>
> Do we actually have any requirements that are known to be uncatered
> for in gerrit's standard feature set?

Email being the primary interaction method has long been a stated
requirement, and we've just been told that's uncatered for in gerrit.

Turning the same question around, do we have any requirements on top
of the current CF app that actually *are* catered for in gerrit? That
is, what problem are we actually trying to solve? (On a technical
level - "bring in more reviewers" doesn't count, you have to explain
*how* that's going to happen)

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Considering Gerrit for CFs

2013-02-07 Thread Magnus Hagander
On Thu, Feb 7, 2013 at 8:20 AM, Daniel Farina  wrote:
> On Wed, Feb 6, 2013 at 3:00 PM, Joshua D. Drake  
> wrote:
>>
>> On 02/06/2013 01:53 PM, Tom Lane wrote:
>>
>>> ... if it's going to try to coerce us out of our email-centric habits,
>>> then I for one am very much against it.  To me, the problems with the
>>> existing CF app are precisely that it's not well enough integrated with
>>> the email discussions.  The way to fix that is not to abandon email (or
>>> at least, it's not a way I wish to pursue).
>>
>>
>> The email centric habits are by far the biggest limiting factor we have.
>> Email was never designed for integral collaboration. That said, I see no way
>> around it. I have brought up this idea before but, Redmine has by far served
>> the purposes (with a little effort) of CMD and it also integrates with GIT.
>> It also does not break the email work flow. It does not currently allow
>> commands via email but that could be easily (when I say easily, I mean it)
>> added.
>>
>> Just another thought.
>
> I don't think controlling things by email is the issue.  I have used
> the usual litany of bug trackers and find the
> correspondence-driven model that -hackers and -bugs use pretty
> pleasant.  If nothing else, the uniform, well-developed, addressable,
> and indexed nature of the archives definitely provides me a lot of
> value that I haven't really seen duplicated in other projects that use
> structured bug or patch tracking.  The individual communications tend
> to be of higher quality -- particularly for the purpose of later
> reference -- maybe a byproduct of the fact that prose is on the
> pedestal.
>
> There are obvious tooling gaps (aren't there always?), but I don't
> really see the model as broken, and I don't think I've been around
> pgsql-hackers exclusively or extensively enough to have developed
> Stockholm syndrome.
>
> I also happen to feel that the commitfest application works rather
> well for me in general.  Sure, I wish that it might slurp up all
> submitted patches automatically or something like that with default
> titles or something or identify new versions when they appear, but
> I've always felt that skimming the commitfest detail page for a patch
> was useful.  It was perhaps harder to know if the commitfest page I
> was looking at was complete or up to date or not, although it often
> is.
>
> Here's what I find most gut-wrenching in the whole submit-review-commit 
> process:
>
> * When a reviewer shows up a couple of weeks later and says "this
> patch doesn't apply" or "make check crash" or "fails to compile".
> It's especially sad because the reviewer has presumably cut out a
> chunk of time -- now probably aborted -- to review the patch.
> Machines can know these things automatically.

If the report is *just* "this patch doesn't apply", I agree. If the
reviewer is more "this patch doesn't apply anymore. Here's an adjusted
version that does" it has a value in itself - because somebody else
just got more familiar with that part of the code.

> * When on occasion patches are submitted with non-C/sh/perl suites
> that may not really be something that anyone wants to be a
> build/source tree, but seem like they might do a better job testing
> the patch.  The inevitable conclusion is that the automated test
> harness is tossed, or never written because it is known it will have
> no place to live after a possible commit.  Somehow I wish those could
> live somewhere and run against all submitted patches.
>
> I've toyed a very, very tiny bit with running build farm clients on
> Heroku with both of these in mind, but haven't revisited it in a
> while.

It's certainly been loosely discussed before as a possible enhancement
- but the main thing is that it basically *has* to be run in a
virtualized environment that's thrown away, or you're going to open up
all sorts of issues with running arbitrary code on people's machines.
Of course, virtualization is kind of what you guys do :)


Personally, I find the most annoying thing with the current process
being when reviewers post their reviews as a completely separate
thread, instead of replying in the original thread. This causes
context switches when parsing things, because now you have to jump
back and forth between the CF app and your mail reader. But it's still
only on the annoyance side, I think the process in general is not
broken. (That said, I *have* been on the inside a long time, *and* I
live in Stockholm, so I might well have that syndrome)

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] sepgsql and materialized views

2013-02-07 Thread Kohei KaiGai
Thanks for your introduction. It made things almost clear to me.

From the selinux perspective, it does not switch the user's privilege
even when a query scans underlying tables, because selinux has no
ownership concept. The current implementation does not cause a problem
because all MV refreshes are done in a particular user session, and thus
are also associated with a particular security label if sepgsql is enabled.
So, as you explained, all we need to do is identical to SELECT ...
INTO and CREATE TABLE AS from the selinux perspective as well.

> I think these are issues that require vigilance.  I hadn't really
> thought through all of this, but since the creation and refresh
> work off of the existing VIEW code, and the querying works off of
> the existing TABLE code, the existing security mechanisms tend to
> come into play by default.
>
If we try to support refreshing MVs outside a user session, like what
autovacuum does for manual vacuum, a reasonable design is probably
to apply a pseudo session user-id derived from the ownership of
the MV. I think a similar idea can be applied to assign a
temporary pseudo security label, as long as extensions can get
control around these processes.
Anyway, I'm inclined to be optimistic about this possible issue
as long as extensions can get the necessary support, such as new
hooks.

I think it is a reasonable alternative to apply the db_table object
class to materialized views right now, because they perform as
if we were operating on regular tables.
However, there is one exception: how to control the privilege to
refresh the MV. Of course, the db_table class does not have a
"refresh" operation. The default permission check tests ownership
(or superuser) at ExecRefreshMatView, but nothing is checked on
the extension side.
(Of course, it will need a new hook around here, too.)

So, an ideal solution is to add a db_materialized_view
(or db_matview for short) object class in the selinux world, and
then check the "refresh" permission on the MV.
However, that takes more time: it needs discussion in the selinux
community and then has to be shipped to distributions.

So, I'd like to review two options.
1) We use the db_table object class for materialized views for
a while, until the selinux side becomes ready. Probably, v9.3 would
use the db_table class and then switch in v9.4.
2) We use the db_materialized_view object class from the
beginning, but its permission checks are ignored because the
installed security policy does not support this class yet.

My preference is 2), even though we cannot apply label-based
permission checks until selinux supports them, because 1) causes
trouble once the selinux side becomes ready to support the new
db_materialized_view class: even if the policy supports the MV
class, a working v9.3 would ignore that policy.

Let me ask selinux folks about this topic also.

Thanks,

2013/2/5 Kevin Grittner :
> Kohei KaiGai  wrote:
>
>> Let me confirm a significant point. Do you never plan a feature
>> that allows to update/refresh materialized-views out of user
>> session?
>
> Currently only the owner of the MV or a database superuser can
> refresh it, and the refresh is run with the permissions of the MV
> owner.  The main change to that I see is that the owner could
> establish a policy of automatic updates to the MV based on changes
> to the underlying tables, with a timing established by the MV owner
> or a database superuser.
>
>> I had an impression on asynchronous update of MV something like
>> a feature that moves data from regular tables to MV with
>> batch-jobs in mid-night, but under the privilege that bypass
>> regular permission checks.
>
> I would expect it to be more a matter of being based on the
> authority of the MV owner.  That raises interesting questions about
> what happens if a permission which allowed the MV to be defined is
> revoked from the owner, or if the MV is altered to have an owner
> without permission to access the underlying data.  With the current
> patch, if the owner is changed to a user who does not have rights
> to access the underlying table, a "permission denied" error is
> generated when that new owner tries to refresh the MV.
>
>> It it is never planned, my concern was pointless.
>
> I think these are issues that require vigilance.  I hadn't really
> thought through all of this, but since the creation and refresh
> work off of the existing VIEW code, and the querying works off of
> the existing TABLE code, the existing security mechanisms tend to
> come into play by default.
>
>> My concern is future development that allows to update/refresh MV
>> asynchronously, out of privilege control.
>
> While it has not yet been defined, my first reaction is that it
> should happen under privileges of the MV owner.
>
>> As long as all the update/refresh operation is under privilege
>> control with user-id/security label of the current session, here
>> is no difference from regular writer operation of tables with
>> contents read from other tables.
>
> Again, it's just my first impression, but I think that the
> permissions of the current sessi

Re: [HACKERS] [COMMITTERS] pgsql: Fast promote mode skips checkpoint at end of recovery.

2013-02-07 Thread Simon Riggs
On 7 February 2013 09:04, Heikki Linnakangas  wrote:

> It makes me uncomfortable that we're adding switches to pg_ctl promote just
> because we're worried there might be bugs in our code. If we don't trust the
> code as it is, it needs more testing. We can analyze the code more
> thoroughly, to make an educated guess on what's likely to happen if it's
> broken, and consider adding some sanity checks etc. to make the consequences
> less severe. We should not put the burden on our users to decide if the code
> is trustworthy enough to use.

I don't think I said I was worried about bugs in code, did I? The
point is that this has been a proven mechanism for many years and
we're now discussing turning that off completely with no user option
to put it back, which has considerable risk with it.

Acknowledging risks and taking risk mitigating actions is a normal
part of any IT project. If we start getting unexplained errors it
could take a long time to trace that back to the lack of a shutdown
checkpoint.

I don't mind saying openly that this worries me, and it's why I took
months to commit it. If there was no risk here and it's all so easy, why
didn't we commit this last year, or why didn't you override me and
commit this earlier in this cycle?

I have to say I care very little for the beauty or lack of command
switches, in such a case. The "cost" there is low.

Tell me you understand the risk I am discussing, tell me in your
opinion we're safe and I'm being unnecessarily cautious, maybe even
foolishly so, and I'll relent. I'll stand by that and take the flak.
But saying you don't like a switch is like telling me you don't like
the colour of my car safety belt.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Vacuum/visibility is busted

2013-02-07 Thread Pavan Deolasee
On Thu, Feb 7, 2013 at 2:25 PM, Pavan Deolasee  wrote:

>
> Will look more into it, but thought this might be useful for others to
> spot the problem.
>

And here is some more forensic info about one of the pages having
duplicate tuples.

jjanes=# select *, xmin, xmax, ctid from foo where index IN (select
index from foo group by index having count(*) > 1 ORDER by index)
ORDER by index LIMIT 3;
 index | count |    xmin    | xmax |   ctid
-------+-------+------------+------+-----------
   219 |   353 | 2100345903 |    0 | (150,98)
   219 |   354 | 2100346051 |    0 | (150,101)
   219 |   464 | 2101601086 |    0 | (150,126)
(3 rows)

jjanes=# select * from page_header(get_raw_page('foo',150));
     lsn     | tli | flags | lower | upper | special | pagesize | version | prune_xid
-------------+-----+-------+-------+-------+---------+----------+---------+-----------
 4C/52081968 |   1 |     5 |  1016 |  6304 |    8192 |     8192 |       4 |         0
(1 row)

jjanes=# select * from heap_page_items(get_raw_page('foo',150)) WHERE
lp IN (98, 101, 126);
 lp  | lp_off | lp_flags | lp_len |   t_xmin   | t_xmax | t_field3 |  t_ctid   | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid
-----+--------+----------+--------+------------+--------+----------+-----------+-------------+------------+--------+--------+-------
  98 |   7968 |        1 |     32 | 2100345903 |      0 |        0 | (150,101) |       32770 |      10496 |     24 |        |
 101 |   7904 |        1 |     32 | 2100346051 |      0 |        0 | (150,101) |       32770 |      10496 |     24 |        |
 126 |   7040 |        1 |     32 | 2101601086 |      0 |        0 | (150,126) |       32770 |      10496 |     24 |        |
(3 rows)

So every duplicate tuple has the same flags set:

HEAP_XMAX_INVALID
HEAP_XMIN_COMMITTED
HEAP_UPDATED
HEAP_ONLY_TUPLE
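
As a cross-check, the t_infomask/t_infomask2 values in the page dump decode to
exactly those four flags. A small sketch; the bit values are copied from the
backend's src/include/access/htup_details.h, so treat the exact header location
as an assumption for the branch you're on:

```python
# Decode heap tuple infomask bits. Bit values per htup_details.h:
HEAP_XMIN_COMMITTED = 0x0100  # t_infomask
HEAP_XMAX_INVALID   = 0x0800  # t_infomask
HEAP_UPDATED        = 0x2000  # t_infomask
HEAP_ONLY_TUPLE     = 0x8000  # lives in t_infomask2

INFOMASK_FLAGS = {
    "HEAP_XMIN_COMMITTED": HEAP_XMIN_COMMITTED,
    "HEAP_XMAX_INVALID": HEAP_XMAX_INVALID,
    "HEAP_UPDATED": HEAP_UPDATED,
}

def decode(t_infomask, t_infomask2):
    names = [n for n, bit in INFOMASK_FLAGS.items() if t_infomask & bit]
    if t_infomask2 & HEAP_ONLY_TUPLE:
        names.append("HEAP_ONLY_TUPLE")
    return names

# Values shared by all three duplicate tuples above:
print(decode(10496, 32770))
# ['HEAP_XMIN_COMMITTED', 'HEAP_XMAX_INVALID', 'HEAP_UPDATED', 'HEAP_ONLY_TUPLE']
```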

The first two duplicates are chained by the ctid chain, but the last
one looks independent. More later.

Thanks,
Pavan
-- 
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee


Re: [HACKERS] [COMMITTERS] pgsql: Fast promote mode skips checkpoint at end of recovery.

2013-02-07 Thread Heikki Linnakangas

On 07.02.2013 10:41, Simon Riggs wrote:

On 6 February 2013 18:02, Robert Haas  wrote:


So I would ask this question: why would someone want to turn off
fast-promote mode, assuming for the sake of argument that it isn't
buggy?


You can write a question many ways, and lead people towards a
conclusion as a result.

Why would someone want to turn off safe-promote mode, assuming it was
fast enough?


Okay, I'll bite..

Because in some of your servers, the safe/slow promotion is not fast 
enough, and you want to use the same promotion script in both scenarios, 
to keep things simple.


Because you're not sure if it's fast enough, and want to play it safe.

Because faster is nicer, even if the slow mode would be "fast enough".


It makes me uncomfortable that we're adding switches to pg_ctl promote 
just because we're worried there might be bugs in our code. If we don't 
trust the code as it is, it needs more testing. We can analyze the code 
more thoroughly, to make an educated guess on what's likely to happen if 
it's broken, and consider adding some sanity checks etc. to make the 
consequences less severe. We should not put the burden on our users to 
decide if the code is trustworthy enough to use.


Note that we still wouldn't do fast promotion in crash recovery, so 
there's that escape hatch if there is indeed a bug in our code and fast 
promotion fails.


- Heikki


Re: [HACKERS] Vacuum/visibility is busted

2013-02-07 Thread Pavan Deolasee
On Thu, Feb 7, 2013 at 11:09 AM, Jeff Janes  wrote:
> While stress testing Pavan's 2nd pass vacuum visibility patch, I realized
> that vacuum/visibility was busted.  But it wasn't his patch that busted it.
> As far as I can tell, the bad commit was in the range
> 692079e5dcb331..168d3157032879
>
> Since a run takes 12 to 24 hours, it will take a while to refine that
> interval.
>
> I was testing using the framework explained here:
>
> http://www.postgresql.org/message-id/CAMkU=1xoa6fdyoj_4fmlqpiczr1v9gp7clnxjdhu+iggqb6...@mail.gmail.com
>
> Except that I increased  JJ_torn_page to 8000, so that autovacuum has a
> chance to run to completion before each crash; and I turned off archive_mode
> as it was not relevant and caused annoying noise.  As far as I know,
> crashing is entirely irrelevant to the current problem, but I just used and
> adapted the framework I had at hand.
>
> A tarball  of the data directory is available below, for those who would
> like to do a forensic inspection.  The table jjanes.public.foo is clearly in
> violation of its unique index.

The xmins of all the duplicate tuples look dangerously close to 2^31.
I wonder if XID wrap around has anything to do with it.
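
To put a number on "dangerously close": here is a sketch of the circular,
modulo-2^32 XID ordering (modelled on TransactionIdPrecedes in
src/backend/access/transam/transam.c; the Python is an illustration, not the
backend code), applied to the xmins above:

```python
# Sketch of PostgreSQL's circular XID comparison: xid a precedes xid b
# if the 32-bit *signed* difference (a - b) is negative, which is what
# makes comparisons wrap around at 2^32.
def xid_precedes(a: int, b: int) -> bool:
    diff = (a - b) & 0xFFFFFFFF
    return diff >= 0x80000000  # sign bit set => negative signed diff

xmin = 2100345903                  # xmin of the first duplicate tuple above
print(2**31 - xmin)                # headroom to the 2^31 boundary: 47137745
print(xid_precedes(xmin, 2**31))   # True: still on the "old" side of 2^31
```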

Index scans do not return any duplicates and you need to force a seq
scan to see them. Assuming that the page level VM bit might be
corrupted, I tried to REINDEX the table to see if it complains of
unique key violations, but that crashes the server with

TRAP: FailedAssertion("!(((bool) ((root_offsets[offnum - 1] !=
((OffsetNumber) 0)) && (root_offsets[offnum - 1] <= ((OffsetNumber)
(8192 / sizeof(ItemIdData)))", File: "index.c", Line: 2482)

Will look more into it, but thought this might be useful for others to
spot the problem.

Thanks,
Pavan

P.S BTW, you would need to connect as user "jjanes" to a database
"jjanes" to see the offending table.

-- 
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee


Re: [HACKERS] [COMMITTERS] pgsql: Fast promote mode skips checkpoint at end of recovery.

2013-02-07 Thread Simon Riggs
On 6 February 2013 18:02, Robert Haas  wrote:

> So I would ask this question: why would someone want to turn off
> fast-promote mode, assuming for the sake of argument that it isn't
> buggy?

You can write a question many ways, and lead people towards a
conclusion as a result.

Why would someone want to turn off safe-promote mode, assuming it was
fast enough?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Vacuum/visibility is busted

2013-02-07 Thread Simon Riggs
On 7 February 2013 05:39, Jeff Janes  wrote:

> While stress testing Pavan's 2nd pass vacuum visibility patch, I realized
> that vacuum/visibility was busted.  But it wasn't his patch that busted it.
> As far as I can tell, the bad commit was in the range
> 692079e5dcb331..168d3157032879

Please define "busted" so we can all help track this down.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Support for REINDEX CONCURRENTLY

2013-02-07 Thread Michael Paquier
On Thu, Feb 7, 2013 at 5:15 PM, Andres Freund wrote:

> On 2013-02-07 03:01:36 -0500, Tom Lane wrote:
> > Andres Freund  writes:
> > > What about
> >
> > > 3) Use reltoastidxid if != InvalidOid and manually build the list
> (using
> > > RelationGetIndexList) otherwise?
> >
> > Do we actually need reltoastidxid at all?  I always thought having that
> > field was a case of premature optimization.
>
> I am a bit doubtful its really measurable as well. Really supporting a
> dynamic number of indexes might be noticeable because we would need to
> allocate memory et al for each toasted Datum, but only supporting one or
> two seems easy enough.
>
> The only advantage besides the dubious performance advantage of my
> proposed solution is that less code needs to change as only
> toast_save_datum() would need to change.
>
> > There might be some case
> > for keeping it to avoid breaking any client-side code that might be
> > looking at it ... but if you're proposing changing the field contents
> > anyway, that argument goes right out the window.
>
> Well, it would only be 0/InvalidOid while being reindexed concurrently,
> but yea.
>
Removing reltoastidxid is more appealing for at least 2 reasons regarding
the current implementation of REINDEX CONCURRENTLY:
1) if reltoastidxid is set to InvalidOid during a concurrent reindex and
reindex fails, how would it be possible to set it back to the correct
value? This would need more special code, which could become a maintenance
burden for sure.
2) There is already some special code in my patch to update reltoastidxid
to the new Oid value when swapping indexes. Removing that would honestly
make the index swapping cleaner.

Btw, I think that if this optimization for toast relations is done, it
should be a separate patch. Also, as I am not a specialist in toast
indexes, any opinion about potential performance impact (if any) is welcome
if we remove reltoastidxid and use RelationGetIndexList instead.
-- 
Michael


Re: [HACKERS] Support for REINDEX CONCURRENTLY

2013-02-07 Thread Michael Paquier
On Thu, Feb 7, 2013 at 5:01 PM, Tom Lane  wrote:

> Andres Freund  writes:
> > What about
>
> > 3) Use reltoastidxid if != InvalidOid and manually build the list (using
> > RelationGetIndexList) otherwise?
>
> Do we actually need reltoastidxid at all?  I always thought having that
> field was a case of premature optimization.  There might be some case
> for keeping it to avoid breaking any client-side code that might be
> looking at it ... but if you're proposing changing the field contents
> anyway, that argument goes right out the window.
>
Here is an interesting idea. Could there be some performance impact if we
removed this field and replaced it with RelationGetIndexList to fetch the
list of indexes that need to be inserted into?
-- 
Michael


Re: [HACKERS] Support for REINDEX CONCURRENTLY

2013-02-07 Thread Andres Freund
On 2013-02-07 03:01:36 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > What about
> 
> > 3) Use reltoastidxid if != InvalidOid and manually build the list (using
> > RelationGetIndexList) otherwise?
> 
> Do we actually need reltoastidxid at all?  I always thought having that
> field was a case of premature optimization.

I am a bit doubtful its really measurable as well. Really supporting a
dynamic number of indexes might be noticeable because we would need to
allocate memory et al for each toasted Datum, but only supporting one or
two seems easy enough.

The only advantage besides the dubious performance advantage of my
proposed solution is that less code needs to change as only
toast_save_datum() would need to change.

> There might be some case
> for keeping it to avoid breaking any client-side code that might be
> looking at it ... but if you're proposing changing the field contents
> anyway, that argument goes right out the window.

Well, it would only be 0/InvalidOid while being reindexed concurrently,
but yea.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Support for REINDEX CONCURRENTLY

2013-02-07 Thread Tom Lane
Andres Freund  writes:
> What about

> 3) Use reltoastidxid if != InvalidOid and manually build the list (using
> RelationGetIndexList) otherwise?

Do we actually need reltoastidxid at all?  I always thought having that
field was a case of premature optimization.  There might be some case
for keeping it to avoid breaking any client-side code that might be
looking at it ... but if you're proposing changing the field contents
anyway, that argument goes right out the window.

regards, tom lane

